BM25 is the workhorse of search; vectors are its visionary cousin (transistor.fm)
1 point by nicolay-ai on Nov 20, 2024 | 1 comment


Vector search is more effective at capturing semantic similarity, but its operational costs and memory requirements make it prohibitive for massive datasets such as GitHub's more than 100 billion documents.

BM25's scaling challenges (e.g., reliance on disk IOPS) are manageable compared to the memory-bound nature of vector indexes such as HNSW and IVF.
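For a rough sense of why memory becomes the bottleneck at that scale, here is a back-of-envelope sketch in Python. Every parameter is an assumption for illustration, not something stated in the episode: one 768-dimensional float32 embedding per document, and an HNSW graph with M = 16 that stores up to 2*M four-byte neighbor ids per node on its base layer. The function name `vector_index_bytes` is hypothetical.

```python
def vector_index_bytes(num_vectors, dim, bytes_per_component=4,
                       hnsw_m=16, bytes_per_link=4):
    """Estimate bytes for raw vectors and for HNSW base-layer links.

    Assumptions (illustrative only): dense vectors with no compression,
    and up to 2 * M neighbor ids per node on the base layer. Upper
    layers and allocator overhead are ignored.
    """
    raw = num_vectors * dim * bytes_per_component
    links = num_vectors * 2 * hnsw_m * bytes_per_link
    return raw, links


NUM_DOCS = 100_000_000_000  # the "over 100 billion documents" figure
TB = 10**12                 # decimal terabytes

raw, links = vector_index_bytes(NUM_DOCS, dim=768)
print(f"raw vectors:  {raw / TB:.1f} TB")
print(f"graph links:  {links / TB:.1f} TB")
print(f"total:        {(raw + links) / TB:.1f} TB")
```

Under these assumptions the raw vectors alone come to roughly 307 TB, before any graph overhead. Quantization or smaller embeddings would shrink that considerably, so treat the figure as an order-of-magnitude illustration rather than a measurement of any real system.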



