Many people run their MongoDB servers in a sharded cluster. In such a setup, a mongos sits between the user’s application and their sharded data. Clients connect to the mongos and send it queries, and mongos routes those queries to one or more shards to be fulfilled.
In most cases, mongos can pinpoint a single shard for each given query. However, some queries require “scatter gather” routing; in other words, mongos has to send the query to all shards, wait for their responses, and assemble them into a single master response. We could fan these requests out to shards serially, but then one slow connection would block mongos’ entire system. To do this efficiently, we needed a way to run requests concurrently.
Given the structure of the networking code in MongoDB 3.0, the only way to run requests concurrently was to run them in different threads. Some clusters have hundreds of shards—that’s a lot of requests to fan out. You can imagine what might occur in a mongos handling many requests: thread explosion!! Having too many threads can bog down a system, causing contention over hardware resources.
In 3.2, we wrote an alternate solution: asynchronous outbound networking for mongos. This new networking layer eliminates our thread explosion problem, but this new implementation brought with it difficult memory management challenges. It took a lot of experimentation, failure, iteration, and above all, obsessive testing to implement a new callback driven, asynchronous system.