
I don't know if this has been done already, but it should be possible to design a compression format that also aids in searching the archive, in a Bloom-filter-ish kind of way.

Then you get four goals to balance: compression ratio, compression speed, decompression speed, and search (which could be split further).
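To make the idea concrete, here is a rough sketch of what I have in mind, not an existing format: split the input into blocks, compress each block with zlib, and store a small Bloom filter of the block's words next to it. A search then only decompresses blocks whose filter says "maybe". The filter size, hash count, and the names `build_archive` and `search` are all made up for illustration.

```python
import hashlib
import zlib

BLOOM_BITS = 1024  # hypothetical filter size per block
NUM_HASHES = 3     # hypothetical number of bit positions per token


def _positions(token: str):
    # Derive NUM_HASHES bit positions from a single SHA-256 digest.
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    for i in range(NUM_HASHES):
        yield int.from_bytes(digest[4 * i:4 * i + 4], "big") % BLOOM_BITS


def make_bloom(text: str) -> int:
    # The filter is stored as one Python int used as a bit set.
    bloom = 0
    for token in set(text.lower().split()):
        for pos in _positions(token):
            bloom |= 1 << pos
    return bloom


def might_contain(bloom: int, token: str) -> bool:
    # False means "definitely absent"; True means "possibly present".
    return all((bloom >> pos) & 1 for pos in _positions(token.lower()))


def build_archive(blocks):
    # Each entry pairs a block's filter with its compressed payload.
    return [(make_bloom(b), zlib.compress(b.encode("utf-8"))) for b in blocks]


def search(archive, token):
    # Returns the indices of matching blocks and how many blocks
    # actually had to be decompressed to confirm the matches.
    hits, decompressed = [], 0
    for index, (bloom, payload) in enumerate(archive):
        if not might_contain(bloom, token):
            continue  # skipped without touching the compressed data
        decompressed += 1
        words = zlib.decompress(payload).decode("utf-8").lower().split()
        if token.lower() in words:
            hits.append(index)
    return hits, decompressed
```

The filters cost extra space, so this trades some compression ratio for search speed. False positives only cause an unnecessary decompression; they never produce a wrong hit, because every candidate block is verified.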



As pointed out: it's been done. Look for algorithmics on grammar-based compression. Querying and searching are among the operations that can be done directly on compressed data.

See Algorithmics on SLP-compressed strings: A survey (Markus Lohrey) (https://www.degruyter.com/document/doi/10.1515/gcc-2012-0016...)
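For a feel of what "working on the compressed data" means, here is a toy sketch (mine, not taken from the survey). An SLP (straight-line program) is a grammar that produces exactly one string; each rule is either a single character or the concatenation of two other rules. Once you know the length each rule expands to, you can read the character at any position by walking down the grammar, without ever expanding the string. The `RULES` table and function names are made up for illustration.

```python
from functools import lru_cache

# Toy grammar: each rule is a terminal character or a (left, right) pair.
RULES = {
    "A": "a",
    "B": "b",
    "C": ("A", "B"),  # ab
    "D": ("C", "A"),  # aba
    "E": ("D", "C"),  # abaab
    "F": ("E", "D"),  # abaababa
}


@lru_cache(maxsize=None)
def length(symbol: str) -> int:
    # Expanded length of a rule, computed once per rule.
    rule = RULES[symbol]
    if isinstance(rule, str):
        return 1
    left, right = rule
    return length(left) + length(right)


def char_at(symbol: str, index: int) -> str:
    # Random access: descend left or right depending on where the
    # index falls, so the cost is the grammar depth, not the text length.
    if not 0 <= index < length(symbol):
        raise IndexError(index)
    rule = RULES[symbol]
    while not isinstance(rule, str):
        left, right = rule
        left_len = length(left)
        if index < left_len:
            symbol = left
        else:
            index -= left_len
            symbol = right
        rule = RULES[symbol]
    return rule


def expand(symbol: str) -> str:
    # Full decompression, only used here to check char_at.
    rule = RULES[symbol]
    if isinstance(rule, str):
        return rule
    return expand(rule[0]) + expand(rule[1])
```

Random access is the simplest case; pattern matching on grammars builds on the same kind of per-rule bookkeeping, but is considerably more involved than this.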

Implementation-wise, you probably lose on the first goal and gain on the second and third (simpler, faster implementations) if you make the fourth one easy to implement.



