Liam White
Making a search engine?

Some new developments with Elasticsearch have made writing a search engine a more pressing task to revisit:

  1. It changed its license; using nonfree dependencies in an AGPL project is in poor taste at best, even if Philomena does not link a nonfree version of Elasticsearch
  2. Its terrible performance has started causing trouble when scaling Philomena down to save costs in practical deployments
  3. Its high memory usage causes contention even on servers with plenty to go around

To address this, I have decided to revisit the idea of implementing the search engine as a bitmap tree. But instead of building it on a custom database, I will probably take the following shortcuts:

  1. Still implemented in Rust
  2. SQLite or LMDB as the storage backend
  3. Bitmaps are rewritten fully when they are modified, just like in Elasticsearch
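
To illustrate the third shortcut, here is a minimal sketch of rewrite-on-modify under some loud assumptions: a HashMap stands in for SQLite or LMDB, the bitmap is a plain uncompressed vector of 64-bit words rather than a roaring bitmap, and the names BlobStore, add_document, and contains are made up for this example. Every modification reads the whole blob, changes it in memory, and writes the whole blob back.

```rust
use std::collections::HashMap;

/// Stand-in for the storage backend (assumption: a real one would be SQLite
/// or LMDB and would wrap each read-modify-write in a transaction).
#[derive(Default)]
pub struct BlobStore {
    blobs: HashMap<String, Vec<u8>>,
}

impl BlobStore {
    pub fn get(&self, key: &str) -> Option<&[u8]> {
        self.blobs.get(key).map(|v| v.as_slice())
    }

    pub fn put(&mut self, key: &str, value: Vec<u8>) {
        self.blobs.insert(key.to_string(), value);
    }
}

/// Decode a blob of little-endian u64 words into an uncompressed bitmap.
fn decode(blob: &[u8]) -> Vec<u64> {
    blob.chunks_exact(8)
        .map(|chunk| {
            let mut bytes = [0u8; 8];
            bytes.copy_from_slice(chunk);
            u64::from_le_bytes(bytes)
        })
        .collect()
}

/// Encode the bitmap words back into one contiguous blob.
fn encode(words: &[u64]) -> Vec<u8> {
    words.iter().flat_map(|w| w.to_le_bytes()).collect()
}

/// Set bit `doc_id` in the bitmap stored under `term`.
/// The old blob is never patched in place: it is replaced wholesale.
pub fn add_document(store: &mut BlobStore, term: &str, doc_id: u32) {
    let mut words = store.get(term).map(decode).unwrap_or_default();
    let word = (doc_id / 64) as usize;
    if words.len() <= word {
        words.resize(word + 1, 0);
    }
    words[word] |= 1u64 << (doc_id % 64);
    store.put(term, encode(&words));
}

/// Check whether bit `doc_id` is set in the bitmap stored under `term`.
pub fn contains(store: &BlobStore, term: &str, doc_id: u32) -> bool {
    let words = store.get(term).map(decode).unwrap_or_default();
    words
        .get((doc_id / 64) as usize)
        .map_or(false, |w| w & (1u64 << (doc_id % 64)) != 0)
}
```

The appeal of this approach is that a stored bitmap never needs in-place patching, so the storage layer only has to handle whole-value reads and writes. The cost is that each update to a large bitmap rewrites the entire blob.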

This allows the following simplifications, which drag the project into relative feasibility as a minimum viable product:

  1. All transaction and serialization support is delegated to the storage backend
  2. Bitmap operations and writing can be delegated to an optimized third-party implementation like CRoaring
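
One way to picture this division of labor is as two trait boundaries with a thin layer of glue between them. The sketch below is hypothetical throughout: the Backend and Bitmap traits, the index and search_all functions, and the in-memory MemBackend and SetBitmap stand-ins are invented for illustration and are not the API of SQLite, LMDB, or CRoaring. A real implementation would put the storage engine behind the first trait and a roaring bitmap behind the second.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical storage boundary: durability and transactions live behind it.
pub trait Backend {
    fn load(&self, key: &str) -> Option<Vec<u8>>;
    fn save(&mut self, key: &str, blob: Vec<u8>);
}

/// Hypothetical bitmap boundary: set operations and serialization live behind it.
pub trait Bitmap: Default {
    fn from_bytes(bytes: &[u8]) -> Self;
    fn to_bytes(&self) -> Vec<u8>;
    fn insert(&mut self, id: u32);
    fn and(&self, other: &Self) -> Self;
    fn ids(&self) -> Vec<u32>;
}

/// Record that document `id` matches `term`.
pub fn index<S: Backend, B: Bitmap>(store: &mut S, term: &str, id: u32) {
    let mut bitmap: B = store
        .load(term)
        .map(|blob| B::from_bytes(&blob))
        .unwrap_or_default();
    bitmap.insert(id);
    store.save(term, bitmap.to_bytes());
}

/// Return the ids of documents matching every term, in ascending order.
/// A term with no stored bitmap is treated as matching nothing.
pub fn search_all<S: Backend, B: Bitmap>(store: &S, terms: &[&str]) -> Vec<u32> {
    let mut result: Option<B> = None;
    for &term in terms {
        let bitmap: B = store
            .load(term)
            .map(|blob| B::from_bytes(&blob))
            .unwrap_or_default();
        result = Some(match result {
            Some(acc) => acc.and(&bitmap),
            None => bitmap,
        });
    }
    result.map(|b| b.ids()).unwrap_or_default()
}

/// In-memory stand-in for the storage backend.
#[derive(Default)]
pub struct MemBackend(HashMap<String, Vec<u8>>);

impl Backend for MemBackend {
    fn load(&self, key: &str) -> Option<Vec<u8>> {
        self.0.get(key).cloned()
    }
    fn save(&mut self, key: &str, blob: Vec<u8>) {
        self.0.insert(key.to_string(), blob);
    }
}

/// Sorted-set stand-in for an optimized bitmap, stored as little-endian u32 ids.
#[derive(Default)]
pub struct SetBitmap(BTreeSet<u32>);

impl Bitmap for SetBitmap {
    fn from_bytes(bytes: &[u8]) -> Self {
        SetBitmap(
            bytes
                .chunks_exact(4)
                .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
                .collect(),
        )
    }
    fn to_bytes(&self) -> Vec<u8> {
        self.0.iter().flat_map(|id| id.to_le_bytes()).collect()
    }
    fn insert(&mut self, id: u32) {
        self.0.insert(id);
    }
    fn and(&self, other: &Self) -> Self {
        SetBitmap(self.0.intersection(&other.0).copied().collect())
    }
    fn ids(&self) -> Vec<u32> {
        self.0.iter().copied().collect()
    }
}
```

With the example tags "safe" and "cute", calling index::<_, SetBitmap>(&mut store, "safe", 2) and then search_all::<_, SetBitmap>(&store, &["safe", "cute"]) exercises both boundaries. The glue functions contain no storage or bitmap logic of their own.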

Hopefully the next time you see me write on this topic, I will have at least some working serialization implemented. Right now, with the bitmap tree, I am limited to memory-only benchmarks, which, while very fast, are not necessarily a fair fight against Elasticsearch.