Liam White
The MongoDB Zombie

Those involved in the history of Derpibooru may know that the site did not originally use PostgreSQL as its primary datastore. In the past, we used MongoDB. When the site was in its infancy, MongoDB enjoyed considerable popularity due to its ability to automatically split large amounts of data between multiple server instances (shard routing). It has excellent ORM binding support in many languages, including Ruby, JavaScript, and Python. The database server is fully scriptable with JavaScript. It has built-in support for an important data structure called an inverted index, can manipulate JSON at the binary storage level (with BSON), and even has some simple document-level ACID guarantees (that is to say, certain operations on a document can lock and operate in a transactional manner).

The reasons we decided to switch when we did are varied, but overall we felt that the site did not have a good future with MongoDB. A few months before the switch, a routine upgrade to the database server caused it to start leaking large amounts of memory and sporadically bring down the system, which we had to deal with on an ad-hoc basis. Each time it crashed or ran the server out of memory, it caused downtime, sometimes for hours when no admin was available to restart the services after they died.

Clover and I were also growing increasingly hesitant about the reliability and maturity of the Ruby adapter for MongoDB. A few months prior to our switch, it was discovered that there had been a long-standing code injection vulnerability in Mongoid, caused by the developers misunderstanding regular expression metacharacters in Ruby (where ^ and $ match line boundaries rather than the start and end of the string, so validation patterns anchored with them can be bypassed with embedded newlines). This wasn't a deal-breaker in and of itself: the issue, while easily exploitable, was also well understood and easily patched, and we patched it in our code as soon as we found out about it. But it was part of a consistent pattern of insecure behavior around MongoDB.

MongoDB has always been easy to ridicule for its lack of referential integrity and strong consistency guarantees, and its strongest proponents easy to ridicule for being somewhat oblivious to the value of data integrity. One could argue that complaining about those shortcomings while continuing to rely on it is trying to have your cake and eat it too, and that perhaps we should have known better. But I would argue that MongoDB unfairly advertised itself as being far more capable, performant, and reliable than its competition, and ended up cornering us in a bad situation.

2015-10-05 16:21:56     +byte[] >not just burning mongo
2015-10-05 16:22:26     @CloverTheClever        yeah, I'd rather bite the bullet and migrate to postgres :p
2015-10-05 16:23:05     +byte[] that's gonna take a while
2015-10-05 16:23:10     @CloverTheClever        yeah.

There were two straws that broke the camel's back. First, as mentioned, the crashing after the database upgrade. We had no idea what was going on, and the sparse logging information that MongoDB generated was no help to us. Second, and more important to us, was a severe and nasty performance regression introduced in Mongoid 5 during a routine gem upgrade to pull in updates and security fixes. Before this upgrade, MongoDB was not exceptionally fast in our workload, accounting for an average of 30ms of any given application page's response time. This seemed to be a floor we could not get below: we had already created every index we thought appropriate, and had even added query hints to help MongoDB out when its query planner went awry. But after upgrading to Mongoid 5, we saw that 30ms figure roughly triple to 90ms on average, and with the relatively limited processing power of our server at the time, this started to cause minor availability issues with the site. In the course of rolling back the upgrade, we saw that Mongoid 5 had introduced a large amount of query logging which could not be disabled without monkey-patching the code. This was when I started taking the idea of using a different database seriously.

2015-10-08 16:30:15     +byte[] new relic says that perf is considerably worse
2015-10-08 16:30:29     +byte[] vast majority of time spent in controller now
2015-10-08 16:31:41     @CloverTheClever        wonder if it's doing the log spam in production
2015-10-08 16:31:55     @CloverTheClever        ... yep
2015-10-08 16:32:13     @CloverTheClever        App 10470 stdout: D, [2015-10-08T22:31:51.759741 #10470] DEBUG -- : MONGODB | localhost:27017 | derpibooru.find | SUCCEEDED | 0.000849998s
2015-10-08 16:32:17     @CloverTheClever        every single fucking query
2015-10-08 16:32:32     @CloverTheClever        why the hell was this released
2015-10-08 16:37:47     @CloverTheClever        think I figured out your performance hit, too
2015-10-08 16:43:38     @CloverTheClever        reverted
2015-10-08 16:43:40     @CloverTheClever        deploying
2015-10-08 16:44:07     @CloverTheClever        mongoid 5 needs a query cache middleware to avoid duplication of calls to the db when hitting a criteria
2015-10-08 16:44:23     @CloverTheClever        the aasm breakage is actually a shitload of stuff
2015-10-08 16:44:48     @CloverTheClever        and the logging thing is totally unfixable as far as I can tell
2015-10-08 16:45:07     +byte[] so I guess that means we're stuck on 4 for the foreseeable future?
2015-10-08 16:45:13     @CloverTheClever        there's not even a jira ticket for "database lib writes two big fat rows of logs for every single query made in a production environment"
2015-10-08 16:45:26     @CloverTheClever        we're stuck on 4 till aasm gets updated
2015-10-08 16:45:34     @CloverTheClever        but 5 looks like a really shitty release
2015-10-08 16:45:40     @CloverTheClever        I smell politics
2015-10-08 16:45:50     @CloverTheClever        (it's moved to mongodb's main repos, their issue tracker, etc)
2015-10-08 16:46:24     @CloverTheClever        I honestly can't summon up enough fucks to bother opening an issue because I have to log into their JIRA
2015-10-08 16:46:39     @CloverTheClever        and I'd rather shoot myself
2015-10-08 16:46:47     +byte[] ...than keep using mongodb
2015-10-08 16:47:52     @CloverTheClever        that too
2015-10-08 16:56:02     +byte[] remember to lock mongoid at 4
2015-10-08 17:03:15     @CloverTheClever        there
2015-10-08 17:03:18     @CloverTheClever        knock yourself out
2015-10-08 17:03:37     +byte[] hooray
2015-10-08 17:04:29     +byte[] mongodb as a whole makes me angry
2015-10-08 17:04:58     @CloverTheClever        feel free to start the postgres migration fork

So I started the branch. I shifted my sleep cycle to wake up at approximately 04:00 every day for a month to coordinate with Clover, who was in the UK, and worked morning and night on rewriting the application to use ActiveRecord with the standard pg adapter. For me, it was essentially an all-out race to get MongoDB out of the code. I took as many shortcuts as I possibly could to get the application code running with the new schema. It helped immensely that our schemas were already essentially fixed, which in itself was a strong sign to me at the time that MongoDB was a poor fit for us.

To make migrating the tables easier, I chose a schema in Postgres that essentially matched the layout already prescribed by our collections in MongoDB. This is also why I decided to use Postgres in the first place, and not e.g. MySQL with InnoDB. Postgres has an excellent query planner, many different types of indices, and support for a special denormalized datatype: arrays. Postgres's array type made migrating data out of MongoDB much easier, because we have many "has and belongs to many" associations that are trivially modeled as an array on the owning document (like watched tags and recent filters for users).
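
As a sketch of what that looks like (the watched_tag_ids column and its index here are illustrative assumptions, not necessarily our exact schema), a "has and belongs to many" association stored as an array column needs no join table at all, and a GIN index keeps membership queries cheap:

-- hypothetical array-backed association on the users table
alter table users add column watched_tag_ids integer[] not null default '{}';

-- a GIN index lets Postgres answer array containment queries efficiently
create index index_users_on_watched_tag_ids on users using gin (watched_tag_ids);

-- "which users watch tag 42?" without a join table
select id from users where watched_tag_ids @> array[42];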

Once I completed the rewrite, which was mostly just a process of finding Mongoid queries and replacing them with their ActiveRecord equivalents, we started the migration. We ended up completing the entire migration in three days. After it was done, the site came back up in good health, and it quickly got a fair bit faster once I added some indices I had missed on the initial round of importing. So we were already a little faster than before.

At this point, it didn't take long for an issue with poor table architecture to appear. Transaction traces in New Relic indicated that some users had hundreds of thousands of votes recorded in an auxiliary statistics table. Even with jsonb, this was extremely slow for those users: the entire statistics row, sometimes over a megabyte of JSON, had to be fetched from the database, parsed by Ruby, modified to add or remove a single piece of information, and then sent back. Fortunately, this was quite easy to split out into a properly normalized table, and a few weeks later I took the site down for an hour to perform the migration. After it came back up, New Relic reported quite a happy application: database response time was cut in half in most cases, and our average appserver response time was under 80ms in off-peak hours.

That was the part requiring the most downtime. But it turned out not to be the hardest part, and I thought things were done long before they really were. Which is how I ended up here now.

Like many reasonably complicated Rails applications, our own Postgres-based schema has its fair share of oddities and idiosyncrasies. But unlike most similar applications, our database is not properly normalized. There are a couple of reasons for this. While migrating the app code to the new Postgres schema, I made the executive decision not to add any foreign key or check constraints, because my testing with dumps of production data showed a LOT of inconsistency. The times MongoDB crashed after running out of memory, or when a runaway logger filled up all the space on the disk, definitely didn't help, but the main cause was likely just poor coding on our part. Which, again, is why it's so important to have a database that can properly enforce referential integrity. There is also a table using a broken single-table inheritance model (channels), which likely contains lots of invalid data, and several tables using Rails's terrible polymorphic associations (notifications, unread_notifications, reports). Given all this, it shouldn't be much of a shocker to learn that the data quality of these tables is rather poor, with many integrity violations.
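
To illustrate the polymorphic problem (the column layout here is hypothetical, not our exact schema): a Rails polymorphic association stores the target's class name and id as a plain pair of columns, and because the referenced table changes from row to row, there is no single table for Postgres to point a foreign key at, so the database cannot enforce integrity on it.

-- hypothetical polymorphic layout: a row may point at an image, a comment, a forum post, ...
create table notifications (
    id         bigserial primary key,
    actor_type character varying not null,  -- e.g. 'Image' or 'Comment'
    actor_id   integer not null             -- an id within whichever table actor_type names
);

-- there is no single referenced table, so a constraint like this cannot be written:
-- alter table notifications add constraint fk_actor foreign key (actor_id) references ???(id);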

To my surprise, it actually turned out to be easy and extremely fast to fix all of the offending tables. For example, here is a SQL snippet that does just that (with Users' recent_filters association):

-- create the table
create table users_recent_filters (
    user_id   integer not null,
    filter_id integer not null
);

-- copy data over from the array on the users table
insert into users_recent_filters (user_id, filter_id)
    select id as user_id, unnest(recent_filter_ids) as filter_id
    from users
    group by user_id, filter_id;

-- delete rows which could have foreign key violations, since we don't care if one gets lost
delete from users_recent_filters
    where not exists (select null from filters where id = users_recent_filters.filter_id);

-- add constraints and indices
alter table only users_recent_filters add constraint users_recent_filters_user_key   foreign key (user_id)   references users(id);
alter table only users_recent_filters add constraint users_recent_filters_filter_key foreign key (filter_id) references filters(id);
create unique index index_users_recent_filters_on_user_id_and_filter_id on users_recent_filters (user_id, filter_id);
create index index_users_recent_filters_on_filter_id on users_recent_filters (filter_id);

Most tables can be normalized this way in a matter of milliseconds; the one that takes the longest to create is the join table between images and tags, and a quick bit of benchmarking on my part determined that it takes only about two minutes to fully normalize every table in the database against the production data set. You might be asking yourself now, "Well, if it's so easy, byte[], why did you make me read through close to 2000 words of your long-ass blog post to say that?"

The answer is that it is most definitely not that easy. This is actually due first to limitations in the way ActiveRecord operates, and second to extremely bad coding in our app.

For the former, we have a few tables, like images, that are rather large, and one, user_interactions, that is enormous. We also have operations, like merging and aliasing tags, that can update thousands, if not hundreds of thousands or even millions, of tuples at once. Unfortunately, these batch updates can take quite a bit of time, and if misused can result in extended downtime waiting for an update to finish. This is because in Postgres's MVCC model, an update never modifies an existing tuple in place; it instead writes a new version of the tuple somewhere else in the table and then propagates the change to the indices. This process of writing new tuples and updating the associated indices requires a significant amount of locking and has taken down the site repeatedly before, so we have been careful to avoid it.

What we do instead in this case is use ActiveRecord's handy in_batches feature. This uses the table's primary key to our advantage, looping over the affected rows in small batches and avoiding expensive, long-held locks, at the cost of not preserving any isolation between the changes. That trade-off is fine for our application, because all we require is that the changes end up consistent and durable; it doesn't matter whether they are atomic or isolated (they're idempotent).
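
At the SQL level, the pattern in_batches drives amounts to something like the following (the specific update shown is made up purely for illustration):

-- 1. fetch the next batch of primary keys, starting after the last id already processed
select id from images where id > 1000000 order by id limit 1000;

-- 2. update only those rows, in a short transaction of its own
update images set updated_at = now() where id in (1000001, 1000002, 1000005);  -- ids from step 1

-- 3. repeat from the largest id seen, until step 1 returns no rows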

This can't work with a typical join table, because join tables don't have an integer primary key, and the indexing on them may not be appropriate as they grow larger. What I will need to model in most cases is a has_many :through relationship on a new model that does have a primary key. However, even if I do this, I still run up against another roadblock: since my data conversion adds a composite unique constraint to the join tables to prevent accidental duplication, it also makes it exceedingly likely that a naive insert will trip a constraint violation.
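
A sketch of the shape this takes, using a hypothetical image_taggings model for the join table between images and tags (the real names may differ):

-- hypothetical join model backing a has_many :through association;
-- the surrogate id column gives in_batches a primary key to paginate on
create table image_taggings (
    id       bigserial primary key,
    image_id integer not null references images(id),
    tag_id   integer not null references tags(id)
);

-- the composite unique index prevents duplicate pairs, but it also means a
-- plain insert of an already-present pair raises a unique constraint violation
create unique index index_image_taggings_on_image_id_and_tag_id on image_taggings (image_id, tag_id);
create index index_image_taggings_on_tag_id on image_taggings (tag_id);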

Despite Rails approaching 15 years in age, the classical advice to ignore database constraints and rely on ActiveRecord (sure...) seems to have been so heavily ingrained into the community that most have felt no real urgency for the ability to do complex inserts and updates. I have no trouble writing the SQL for this by hand (it's an easy query), but I'd prefer not to, and as it turns out, Rails has only just gotten around to adding handling for bulk inserts and updates ("ON CONFLICT ...") in Rails 6. So my intent is to write it by hand for now and then change it to use the upsert_all syntax when we do move to Rails 6.
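
Using the users_recent_filters table from the earlier snippet as an example, the hand-written version is roughly this (the values shown are just an example); as I understand it, Rails 6's insert_all/upsert_all family can emit an equivalent ON CONFLICT clause, which is what I plan to switch to later:

-- insert a (user, filter) pair, skipping it silently if it already exists;
-- the composite unique index added earlier is what ON CONFLICT targets
insert into users_recent_filters (user_id, filter_id)
    values (1234, 56)
    on conflict (user_id, filter_id) do nothing;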

For the latter, I can't really stress enough how much stress this code has put me under. A lot of the code written for interacting with the database is truly awful to work with, to the point where just fixing a longstanding normalization issue causes the pile of hacks originally built on top of a bad denormalized schema to instantly collapse, failing almost every single integration test. This has led me to reevaluate whether the thing I need to address first is the schema or the bad code written atop it.

We are still haunted by the zombie of MongoDB, years after removing it from our code.