If you’re just joining this column, it is one aspect of a response to the gap between how development and how operations view technology and measure their success – it is wholly possible for development and operations to be individually successful, but for the organization to fail.
So, what can we do to better align development and operations so that they can speak the same language and work towards the success of the organization as a whole? This article series attempts to address a portion of this problem by presenting operation teams insight into how specific architecture and development decisions affect the day-to-day operational requirements of an application.
The current article series is reviewing Big Data and the various solutions that have been built to capture, manage, and analyze very large amounts of data. Unlike the relational databases of the past, Big Data is not one-size-fits-all, but rather individual solutions have been built that address specific problem domains.
- It contains documents that will ultimately be presented in a JSON format
- Those documents can be located by specifying JSON matching criteria
- Data in MongoDB can be indexed for rapid access
- MongoDB does not impose strict constraints on the format of the data (you are free to store some fields in one document and omit them in others and you can change your “schema” on-the-fly.)
Furthermore, as more web applications move into the single page application (SPA) model, it is advantageous to store your data in a format that can be easily and quickly translated to JSON because that is the native format upon which most SPA frameworks operate.
As a matter of fact, one of the most popular SPA technology stacks is called MEAN: MongoDB, Express, Angular, and Node.js, where MongoDB maintains the data, Node.js provides services that expose the data in MongoDB, Express is a web framework that runs inside Node.js, and AngularJS is the web client library used to construct your SPA. This is all to say that developers like MongoDB!
MongoDB provides some powerful capabilities, so the next logical question is how do we optimally configure and deploy MongoDB for high performance access to its “web-scale storage” capabilities (measured in petabytes)? There are several things to keep in mind when configuring and deploying MongoDB:
- Deployment Topology
This article reviews each of these topics and provides recommendations about how you can address each. Because of the brevity of this article, I recommend that you review Manning’s MongoDB in Action book for more details.
Replication involves distributing and maintaining a live database server across multiple machines. MongoDB supports two flavors of replication:
- Master-slave replication
- Replica sets
Both solutions define a single primary node that receives all write requests and then secondary nodes read those changes from the primary node and apply them asynchronously to their replicated storage.
Replica sets provide automatic failover, meaning that if the primary node goes down then one of the secondary nodes is automatically promoted to become the primary and processing continues as usual; master-slave replication requires manual failover if the master node fails. Because of automatic failover and some more sophisticated deployment topologies (that we’ll see shortly) replica sets are the preferred replication strategy.
Replication is important in a MongoDB deployment for several reasons:
- Recovery from failure: hardware ultimately fails; you might experience a network issue, a server crash, or a hard drive failure, and if you do not have multiple copies of your data then your data will be lost. Replication addresses failure by maintaining multiple copies of the data so that if a server does fail, there is another copy.
- Support for planned down time: if you need to take servers out of your deployment for any reason (such as OS patching or updates) then the replicated copies of data can service live requests while those servers are out of your deployment.
- Durability: MongoDB can run with or without journaling, where journaling captures the actions performed on the database so that those changes can be replayed in the event of failure. If you are running without journaling enabled and a server fails or a shutdown is not clean, MongoDB’s files are not guarantee to be free from corruption. Maintaining replicated data helps you recover from a corrupt data file because the data is maintained in the replicas.
- High-Availability and Failover: if a server needs to be temporarily removed from your deployment or a server crashes and you’re not replicating data then that data managed by the server will not be available. By replicating data you not only can recover from planned and unplanned outages but you can quickly failover from one machine to another and protect your users from your outages.
- Load balancing: if you have data replicated across multiple machines then you can distribute requests across those machines to improve concurrent performance.
It is important to note that because replication is asynchronous you are not guaranteed that the data on the secondary node will always be correct, or more specifically, “consistent".
We call this eventual consistency: the data will eventually reflect the changes made to the master, but at any point in time it might not be consistent with the master. Furthermore, because secondary servers are asynchronously processing writes made to the primary server, if you are writing more than you are reading, these secondary servers might be too busy handling those writes to rapidly service your read requests.
In short, load balancing is a great way to improve performance, but you need to be aware of how you are using your database to fully understand if you will benefit from it.
MongoDB implements replica sets with the following components:
- Primary node: contains the master data for the MongoDB instance
- One or more secondary nodes: contains replicated copies of the data
- Arbiter node: this server observes the status of the primary and secondary nodes and if the primary goes down, it is the arbiter elects a new primary node
The minimum configuration for a replica set, therefore, is three nodes: one primary, one secondary, and one arbiter.
It is recommended that you define a replication strategy that meets your business needs. You should opt for creating a replica set rather than using a master-slave model unless absolutely necessary. You need at least one backup copy of all of your data, but depending on the importance of your data, you may want to opt for more.
We already talked about replication, which involves maintaining multiple copies of the data hosted on a node, but if replication was the only tool at your disposal then you would need to maintain all of the data in your database on a single node (albeit multiple copies of that database.) MongoDB recommends that, for optimal performance, you maintain your “working set” of data and your indexes in RAM, where the working set is the set of active data or the data upon which you are regularly operating.
This presents a problem: what happens when your database is managing more data than your available RAM? I don’t know of too many machines with petabytes of RAM!
The answer to that dilemma is sharding. Sharding allows you to evenly distribute your data across multiple machines and scale up by adding additional shards, as your needs dictate. In the past we have implemented manual sharding to meet increased data and processing demands on top of non-shard-able databases like MySQL.
To manually shard a huge table, for example a table that maintains purchase orders, we would insert ranges of purchase orders in multiple MySQL instances and then maintain a specific MySQL instance that knows the location of all of purchase orders. To locate a specific purchase order, we would first query for the location of the purchase order we are searching for (the MySQL instance that contains that purchase order) and then query that specific instance for the purchase order itself.
This is very manual and requires quite a bit of logic to be embedded in your application itself. Furthermore, say that you built two shards and you reach the point where your data needs require three shards. You could start up another and use it fresh, but it would be far better if you were to redistribute the data from the two existing shards across all three shards so that your load and your data will be evenly distributed across all three machines.
This is a non-trivial problem and, in our manual process, would involve a lot of queries to extract data from the existing shards, insert them into the new shard, and update the location. Not something you want to do while trying to maintain a live production database!
Figure 1 shows an example of how you might implement manual sharding.
When the application needs to find data, it first queries the index database, which tells the application where the requested data is located. Then the application must make a query to the correct database to retrieve the data.
Sharding was one of the primary use cases that inspired the creation of MongoDB and since version 1.6 it has natively supported sharding. MongoDB abstracts the fact that data is sharded from the application, which means that the application does not need to know where the data is located in order to access it.
Figure 2 shows a common sharded MongoDB deployment.
In a sharded deployment we introduce a new component that routes our requests to the appropriate shard: mongos.
- Shards: shards are MongoDB instances that contain a subset of the entire data. Note that a shard is a replica set and not just a single server – this is how we combine replication with sharding. You will want to have at least two nodes in your replica set and more depending on your needs
- Config Server: there are three config servers in a MongoDB sharded deployment and these servers know the location of all data spread across the shard. Config servers do not follow the same replication strategy that we saw in the previous section because these servers are required to stay consistent and asynchronous replication does not guarantee consistency. Therefore these servers are updated using a two-phase commit strategy
- Mongos: the mongos process sits relatively close to your application (possibly collocated on the same hardware) and it maintains an in-memory copy of the metadata contained in the config server. It is responsible for routing reads and writes to the correct shard in the deployment.
This infrastructure allows you to run a local mongos process through which you application can seamlessly retrieve data from any of the shards without ever needing to know from where its data is coming.
Defining shards is a somewhat cumbersome process of MongoDB configuration, but it is well worth it. The important thing to keep in mind is defining how you choose to shard your data. You can allow MongoDB to shard on the ObjectId, which is built into each document across your collections, but this may not be the best solution.
For example, if you frequently load data for a web application about a particular user, it would be helpful if all of that user’s data was located on the same machine so your application does not need to bounce back-and-forth between different databases.
Therefore a common solution to this problem is to shard on both the user ID and the ObjectId. This is referred to as defining a “shard key” and a good shard key will depend on the nature of the data you are managing and what data you want to collocate for performance reasons.
As you can surmise at this point, replication and sharding work hand-in-hand. At a minimum you should define a cluster with at least two shards and those shards should be composed of three replica set members: one primary, one secondary, and one arbiter. And, of course, in order to support sharding you will need to define a cluster of exactly three config servers and run mongos processes that connect to those config servers to allow your application to read and write data across the cluster.
Figure 3 tries to put this minimal configuration together into a diagram.
The application executes its queries directly against the mongos instance, but under-the-hood, mongos queries its in-memory copy of the metadata it loaded from the config server cluster to identify where the requested data should be loaded from (or where the new data should be written to) and then routes the request to the correct shard instance.
Each shard is running as a replica set for high-availability; if the primary node goes down, the arbiter will identify the outage and automatically promote the secondary node to be the new primary.
In terms of the number of shards, you need to keep in mind that MongoDB recommends that you have machines with enough RAM to host the working set (live data) and indexes. If you have 50GB of memory and you have 200GB of working set plus indexes then you’ll need four shards.
Furthermore, you need to examine the nature of operations your application is performing: if it is writing excessively then you might need far more shards than the RAM-to-working set ratio in order to optimize the load; if your application is primarily reading a small subset of data then can get by with a higher RAM-to-working set ratio.
This is a lot to keep track of, but I hope this brief introduction has helped you understand where you’re going to have to dig deeper to better configure your MongoDB cluster.
This article provided a brief and superficial overview of how to deploy a MongoDB cluster to a production environment. It reviewed replication strategies (replica set vs master-slave) and the requirements for a replica set (one primary node, one or more secondary nodes, and one arbiter) and it reviewed the problem that sharding solves and summarized how sharding works (config servers, sharded replica sets, and mongos processes).
Finally, this article showed the minimum deployment for a sharded and replicated MongoDB environment and suggested some sizing recommendations.
This is a complicated topic and I highly recommend you dive deeper with a more robust resource like Manning’s MongoDB in Action.