Liam White
Upgrading to object storage

The finishing touches are being placed on Philomena's support for object storage. This was a long time coming, so I thought I'd write a bit more about how we got here.

Derpibooru has over 30 million files: more than 2 million image uploads (plus various scaled versions of each), symbolic links to larger versions, and miscellaneous files like user avatars, advertisements, and tag spoiler images. These have been stored on a local filesystem since the early days of the site software, because we found that self-hosted replicated storage solutions at the time were too complicated to be practical.

Storing and serving this many files from a local filesystem is challenging. On a single-server configuration, it requires an impractical SSD-based hardware configuration for the approximately 7 TB of content this represents. With typical replicated configurations on SAS hard drive arrays, responding to more than 300 HTTP image requests per second swamps the entire array with IO requests and slows CDN performance, and we only have so much RAM to cache files. The change to these SSDs was an expensive one, and one that won't scale in the future.

Fortunately, local storage solutions like MinIO exist that reliably replicate and store data on commodity hardware, and they can be used to run object storage on self-hosted machines. But storage is the kind of problem that is worth paying someone else to look after. Scaleway has recently revamped its hosted object storage service to handle millions of files per bucket and, when combined with a proxy server hosted on the same network, it has zero egress or API fees: we just pay for storage. This makes it a really compelling option for scaling up into the future.

With our storage problem taken care of, we can drop the inordinately expensive SSDs and move the server to more commodity-grade hardware once again. It also makes hardware upgrades a breeze, because we just have to change the server, and the files don't need to be painstakingly rsynced every time.

Of course, there is more to worry about, like a catastrophic failure of the datacenters used to host the content, or being cut off from the account due to circumstances outside of our control, so the linked pull request also includes a method for mirroring the changes to an alternate object storage host. Even paying twice to store the data costs less than half as much as the SSD server.
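
To make the mirroring idea a bit more concrete, here is a minimal Python sketch of applying every change to a primary store and then to one or more mirrors. This is not the code from the pull request, and it does not talk to any real storage service: `InMemoryStore` and `MirroredStore` are hypothetical names invented for illustration, with a dict standing in for each bucket.

```python
from typing import Dict, List


class InMemoryStore:
    """Hypothetical stand-in for a bucket: a dict keyed by object name."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data

    def delete(self, key: str) -> None:
        # Deleting a missing key is treated as a no-op.
        self.objects.pop(key, None)


class MirroredStore:
    """Applies each change to the primary first, then to every mirror."""

    def __init__(self, primary: InMemoryStore, mirrors: List[InMemoryStore]) -> None:
        self.primary = primary
        self.mirrors = mirrors

    def put(self, key: str, data: bytes) -> None:
        self.primary.put(key, data)
        for mirror in self.mirrors:
            mirror.put(key, data)

    def delete(self, key: str) -> None:
        self.primary.delete(key)
        for mirror in self.mirrors:
            mirror.delete(key)


primary = InMemoryStore()
backup = InMemoryStore()
store = MirroredStore(primary, [backup])

# An upload lands in both stores; a delete removes it from both.
store.put("images/example.png", b"example bytes")
store.delete("images/example.png")
```

A real setup would also have to decide what happens when the mirror write fails after the primary write succeeds (retry, queue, or reconcile later); this sketch ignores that entirely.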