Some recent developments with Elasticsearch have made writing a search engine a more pressing task to revisit:
- It changed its license; using nonfree dependencies in an AGPL project is in poor taste at best, even if Philomena does not link a nonfree version of Elasticsearch
- Its terrible performance has started to cause trouble when scaling Philomena down to save on costs in practical scenarios
- Its high memory usage causes contention even on servers with plenty to go around
To address this, I have decided to revisit the idea of implementing the search engine as a bitmap tree. But instead of building it on a custom database, I will probably take the following shortcuts:
- Still implemented in Rust
- SQLite or LMDB as the storage backend
- Bitmaps are rewritten fully when they are modified, just like in Elasticsearch
These choices allow the following simplifications, which drag this project into relative feasibility as a minimum viable product:
- All transaction and serialization support is delegated to the storage backend
- Bitmap operations and writing can be delegated to an optimized third-party implementation like CRoaring
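To make the shape of this concrete, here is a minimal sketch of the "rewrite the whole bitmap on every modification" approach, under some loud assumptions. It uses only the standard library so that it stands alone: a `HashMap` stands in for the SQLite or LMDB backend, and a `BTreeSet<u32>` stands in for a compressed bitmap from something like CRoaring. The names `BlobStore`, `MemoryStore`, `index_document` and `search_and` are hypothetical and invented for this illustration, and transactions are not shown at all, since those would come from the real backend.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-in for the storage backend (SQLite or LMDB in the plan above).
/// Whole-value get/put is all that is needed, because bitmaps are rewritten in full.
trait BlobStore {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
    fn put(&mut self, key: &str, value: Vec<u8>);
}

/// In-memory store, used only to keep this sketch self-contained.
#[derive(Default)]
struct MemoryStore {
    blobs: HashMap<String, Vec<u8>>,
}

impl BlobStore for MemoryStore {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.blobs.get(key).cloned()
    }

    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.blobs.insert(key.to_string(), value);
    }
}

/// Stand-in for a compressed bitmap: a sorted set of document IDs.
type Bitmap = BTreeSet<u32>;

/// Serialize as little-endian u32 values in ascending order.
/// (An assumption for this sketch; a real bitmap library has its own format.)
fn serialize(bitmap: &Bitmap) -> Vec<u8> {
    bitmap.iter().flat_map(|id| id.to_le_bytes()).collect()
}

/// Inverse of `serialize`; trailing bytes that do not form a full u32 are ignored.
fn deserialize(bytes: &[u8]) -> Bitmap {
    bytes
        .chunks_exact(4)
        .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}

/// Load the bitmap for a term, or an empty bitmap if the term is unknown.
fn load(store: &impl BlobStore, term: &str) -> Bitmap {
    store.get(term).map(|b| deserialize(&b)).unwrap_or_default()
}

/// Add a document to a term's bitmap by reading, modifying and
/// rewriting the entire stored value.
fn index_document(store: &mut impl BlobStore, term: &str, doc_id: u32) {
    let mut bitmap = load(&*store, term);
    bitmap.insert(doc_id);
    store.put(term, serialize(&bitmap));
}

/// Documents that match both terms (a bitmap AND).
fn search_and(store: &impl BlobStore, a: &str, b: &str) -> Vec<u32> {
    let (left, right) = (load(store, a), load(store, b));
    left.intersection(&right).copied().collect()
}
```

Indexing documents 1 and 3 under one term and 3 and 7 under another, then calling `search_and` on the two terms, yields only document 3. The point of the trait boundary is that the search code never needs to know how values are persisted, which is what lets transactions and on-disk layout be somebody else's problem.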
Hopefully, the next time you see me write on this topic, I will have at least some working serialization implemented. Right now, the bitmap tree limits me to memory-only benchmarks, which, while very fast, are not necessarily a fair fight against Elasticsearch.