Deploying Mastodon at Scale

The recent controversy over Twitter’s ownership and evolving policies has driven massive interest and growth for the open-source Mastodon software for running self-hosted social networks.

Mastodon is unique in that it is decentralized and distributed; no single entity owns or controls it. Anyone can stand up a Mastodon server and create an isolated community – or they can become part of the Fediverse for a more global and connected Mastodon experience. In fact, as of this writing, there are almost 6 million accounts and over 9,500 Fediverse-connected instances. Using data from this recent ZDNet article as a benchmark, that represents 125% growth in connected instances over the last three weeks.

The intent with Mastodon – or any social network, for that matter – is to connect users and communities. If you’re successful, whether connected to the Fediverse or not, that success may mean two things: 1. The potential for explosive growth, and 2. An appeal to a geographically dispersed user base. Together, these factors mean you need to plan ahead for scale.

Mastodon consists of a Ruby on Rails backend, a JavaScript frontend, Sidekiq job processing and a PostgreSQL relational database. You can learn more about the underlying systems from the Mastodon documentation, or check out our tutorial on how to deploy Mastodon.

While providers like Section are making it easier than ever to deploy a Mastodon server with just a few clicks, developers might also wonder about the process and benefits of deploying Mastodon at scale. The simple answer is that without the right provider, you’re likely stuck. Let’s take a look at why.

User Experience

A primary consideration with any online community is user experience – specifically, how responsive Mastodon instances are to user queries and load. A key factor impacting responsiveness is latency, which reflects the proximity between users and server instances. The farther those Mastodon instances are from users, the slower the response.
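As a rough back-of-the-envelope sketch (the distances and figures here are illustrative, not measurements), signals in fiber travel at roughly two-thirds the speed of light, so physical distance alone puts a hard floor on round-trip time:

```python
# Back-of-the-envelope round-trip-time floor imposed by distance alone.
# Real-world latency is higher (routing, queuing, TLS handshakes), but it
# can never drop below this physical bound.

SPEED_OF_LIGHT_KM_S = 299_792   # km/s in a vacuum
FIBER_FACTOR = 2 / 3            # signals in fiber travel at roughly 2/3 c

def min_rtt_ms(distance_km: float) -> float:
    """Minimum round-trip time, in milliseconds, over the given distance."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000

# A user in Sydney reaching a server in US East (~15,700 km by rough
# great-circle distance) versus a server hosted nearby.
print(f"Sydney -> US East: {min_rtt_ms(15_700):.0f} ms minimum RTT")
print(f"Sydney -> local:   {min_rtt_ms(50):.1f} ms minimum RTT")
```

Over 150 ms of round-trip time before a single byte of application work is done; distributing instances close to users is the only way to remove that floor.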

That raises the question: do you know where your users are located, and are they all fairly close together? Because a typical cloud is not – despite common misconceptions – everywhere, all at once. With most providers (including hyperscalers like AWS, Azure and GCP), you’ll need to choose one or more hosting regions. If you choose one (say, US East), then every user outside that region gets a progressively poorer experience based on distance. If you choose more than one region, congratulations: you’re now responsible for managing a distributed network.

If you’re trying to reach a global audience with Mastodon, this is a problem. The answer is to distribute your instances globally. That puts Mastodon as close as possible to users for a better experience. Section makes this simple.

Database Load

As mentioned above, another issue impacting responsiveness is the load on backend systems, specifically the PostgreSQL database. As a Mastodon instance experiences user growth, that increased scale can slow down how quickly PostgreSQL can process and respond to all those new database calls. This can actually be exacerbated by global distribution, as you’ve now got distributed servers “phoning home” across a distributed network to a centralized database (this can also impact cost; see below).

The answer is to distribute the data out to where the users and servers are located. Section has teamed with PolyScale, whose intelligent serverless caching sits between the Mastodon application front-end and the back-end PostgreSQL database.

The beauty of this architecture is that it still uses a single database backend, creating one source of truth for Mastodon. But the distributed PolyScale data caches sit physically close to the distributed Mastodon instances on Section – and closer to your users – dramatically improving latency, responsiveness and load-handling ability. Routing between these data caches and Mastodon instances is dynamic, optimizing performance by ensuring that database calls go to the nearest cache.
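The core idea can be illustrated with a toy read-through cache (a minimal sketch only – PolyScale’s actual engine does far more, including automatic invalidation and TTL tuning): repeated reads are answered from a store near the application instead of crossing the network to PostgreSQL.

```python
import time
from typing import Any, Callable

class ReadThroughCache:
    """Toy sketch of edge-side query caching. Results are keyed by the SQL
    text and expire after a TTL, so repeated reads are served locally
    instead of making the round trip to the central database."""

    def __init__(self, run_query: Callable[[str], Any], ttl_s: float = 30.0):
        self._run_query = run_query        # the "real" database round trip
        self._ttl_s = ttl_s
        self._store: dict[str, tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def query(self, sql: str) -> Any:
        entry = self._store.get(sql)
        if entry and time.monotonic() - entry[0] < self._ttl_s:
            self.hits += 1
            return entry[1]                # served from the nearby cache
        self.misses += 1
        result = self._run_query(sql)      # slow trip to the central database
        self._store[sql] = (time.monotonic(), result)
        return result

# Stand-in for a remote PostgreSQL call (a canned response for illustration).
cache = ReadThroughCache(lambda sql: [("toot", 42)])
cache.query("SELECT * FROM statuses LIMIT 1")   # miss: goes to the database
cache.query("SELECT * FROM statuses LIMIT 1")   # hit: served locally
print(cache.hits, cache.misses)  # -> 1 1
```

Only the first query pays the cross-network cost; every subsequent read within the TTL is answered locally, which is exactly why the central PostgreSQL database can stay small while responsiveness improves.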

This architecture also allows developers to keep PostgreSQL databases small and efficient, lowering overall hosting costs.

“Global distribution is a critical consideration for modern applications, but it can create a significant challenge for DevOps teams when it comes to efficiently managing the underlying compute hosting and backend systems,” notes Ben Hagan, CEO at PolyScale. “Section and PolyScale solve these twin problems with a few clicks, creating an ideal platform for deploying Mastodon at scale.”

Reliability, Scalability and Resilience

Do you remember the Twitter Fail Whale? It’s a notorious graphic that indicated that Twitter was experiencing technical difficulties. Unless you’re considering reliability, scalability and resilience for your Mastodon instance(s), you might want to think about what your comparable Mastodon graphic might be.

There are two important considerations for maintaining availability of Mastodon – or any cloud workload, for that matter. The first is the provider network. If your provider goes down, so does Mastodon. The way to avoid this single-provider point of failure is to use a global, redundant, federated network of providers, as we do here at Section. If any one provider goes down, we’ll route traffic to other providers to keep your Mastodon instances running.

The second consideration is single versus multi-cluster deployment. Simply put, a single cluster not only represents another single point of failure for Mastodon, it also limits the ability to scale Mastodon in response to load. The solution is a multi-cluster deployment, but running a multi-cluster containerized environment is hard – unless you’re using Section. We use a clusterless deployment model (basically, a cluster of clusters) that makes it as easy to run on Section as it is to deploy a single cluster, yet you enjoy all the benefits of a multi-cluster environment.

Optimal Cost

Deploying workloads such as Mastodon at global scale can quickly get expensive and create major management and operations headaches. Considerations include the number of instances; when, where and how frequently those instances are running; how “chatty” those instances are with backend systems; and so on. Managing and optimizing all of these factors manually is difficult, if not impossible.

Section’s distributed global network allows organizations to control workload placement while our dynamic orchestration engine optimizes resource utilization to always deliver the best performance for cost outcome.

Section adapts to developer policies, such as “run containers only in Europe and where there are at least 20 HTTP requests per second” – allowing the distributed compute platform to limit the target deploy field and continuously adjust within that field accordingly. And our partnership with PolyScale delivers that scalability without overloading backend systems.
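A policy like the one above can be pictured as a simple predicate over candidate locations (a hypothetical sketch only – the location names, fields and thresholds below are illustrative, not Section’s actual API):

```python
# Hypothetical sketch of evaluating a placement policy such as "run
# containers only in Europe and where there are at least 20 HTTP requests
# per second". All names and numbers here are made up for illustration.

from dataclasses import dataclass

@dataclass
class Location:
    name: str
    region: str
    requests_per_second: float

POLICY_REGION = "europe"
POLICY_MIN_RPS = 20.0

def should_run(loc: Location) -> bool:
    """True when a location satisfies both clauses of the policy."""
    return (loc.region == POLICY_REGION
            and loc.requests_per_second >= POLICY_MIN_RPS)

candidates = [
    Location("frankfurt", "europe", 85.0),
    Location("london", "europe", 12.0),   # in region, but too little traffic
    Location("tokyo", "asia", 140.0),     # busy, but outside the region
]

targets = [loc.name for loc in candidates if should_run(loc)]
print(targets)  # -> ['frankfurt']
```

In practice an orchestration engine would re-evaluate such predicates continuously as traffic shifts, spinning workloads up and down within the allowed field – which is the behavior described above.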

Start Now

If you’re ready to take your Mastodon deployment to the next level, check out our tutorial to see how easy it can be. And if you’re just getting started with Mastodon, we encourage you to start off right with Section. We make it simple to deploy Mastodon globally with just a few clicks.
