The MongoDB Engineering Journal

A tech blog for builders, by builders

Latest Posts
  • When Switching Projects, Check your Assumptions or Risk Disaster

    On January 10, I released a badly broken version of the MongoDB C Driver, libmongoc 1.5.2. For most users, that version could not connect to a server at all! Luckily, in under 24 hours a developer reported the bug, I reverted the mistake and released a fix. Although it was resolved before it did any damage, this is among the most dramatic mistakes I’ve made since I switched from the PyMongo team to libmongoc almost two years ago. My error stemmed from three mistaken assumptions I’ve had ever since I changed projects. What were they?

    Inception

    Here’s how the story began. In December, a libmongoc user named Alexey pointed out a longstanding limitation: it would only resolve hostnames to IPv4 addresses. Even if IPv6 address records existed for a hostname, the driver would not look them up – when it called getaddrinfo on the hostname to do the DNS resolution, it passed AF_INET as the address family, precluding anything but IPv4. So if you passed the URI mongodb://example.com, libmongoc resolved “example.com” to an IPv4 address like 93.184.216.34 and tried to connect to it. If the connection timed out, the driver gave up.
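
The effect of the address-family argument can be seen in a small sketch. It uses Python's socket module instead of libmongoc's C code, and it passes numeric address literals with AI_NUMERICHOST so that no DNS query is made. The helper name numeric_lookup and port 27017 are assumptions chosen for illustration.

```python
import socket


def numeric_lookup(host, port, family):
    """Call getaddrinfo restricted to `family`; return (family, address) pairs."""
    try:
        infos = socket.getaddrinfo(
            host, port, family, socket.SOCK_STREAM,
            flags=socket.AI_NUMERICHOST,  # parse literals only, no DNS query
        )
    except socket.gaierror:
        # No address of the requested family exists for this host.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr).
    return [(info[0], info[4][0]) for info in infos]


if __name__ == "__main__":
    # Restricting the family to AF_INET hides the IPv6 address.
    print(numeric_lookup("::1", 27017, socket.AF_INET))
    # AF_UNSPEC lets getaddrinfo return whichever families apply.
    print(numeric_lookup("::1", 27017, socket.AF_UNSPEC))
```

With AF_INET the IPv6 literal yields no results, which mirrors how a hostname with only IPv6 records would have been unreachable; AF_UNSPEC is the usual way to ask for both families.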

    Read More
  • Testing Linearizability with Jepsen and Evergreen: “Call Me Continuously!”

    Intro

    What do you do with a third-party tool that proves your application lacks a feature? Add that tool to your continuous integration system (after adding the feature, of course)! In our case, we added linearizable reads to MongoDB 3.4 and use Jepsen to test them.

    Evergreen logo plus Jepsen

    What is Linearizability?

    Linearizability is a property of distributed systems first introduced by Herlihy & Wing in their July 1990 article “Linearizability: a correctness condition for concurrent objects” (ACM Transactions on Programming Languages and Systems Journal). Peter Bailis probably provides the most accessible explanation of linearizability: “writes should appear to be instantaneous. Imprecisely, once a write completes, all later reads (where “later” is defined by wall-clock start time) should return the value of that write or the value of a later write. Once a read returns a particular value, all later reads should return that value or the value of a later write.”

    In MongoDB 3.4, linearizable reads are now supported on single documents, using a new read concern called “linearizable”. Previously, linearizable reads were possible only by using a findAndModify operation on a single document and updating an extraneous field in the document, with a writeConcern of “majority”. Keep in mind that linearizable reads carry a performance penalty: their performance profile is similar to that of majority writes, because each linearizable read makes use of a no-op majority write to the replica set to ensure the data being read is durable and not stale.
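
The two approaches can be compared as raw command documents. This is a minimal sketch that only builds the documents as Python dicts; the function names, the dummy field name, and the time limit are hypothetical, and a real application would hand the result to its driver's command runner.

```python
def linearizable_find(collection, query_filter, max_time_ms=10000):
    """Build a find command that uses the "linearizable" read concern (3.4+)."""
    return {
        "find": collection,  # the command name must be the first key
        "filter": query_filter,
        "limit": 1,  # linearizable reads apply to a single document
        "readConcern": {"level": "linearizable"},
        # Assumed time limit, so the read does not wait indefinitely
        # if a majority of the replica set is unreachable.
        "maxTimeMS": max_time_ms,
    }


def legacy_linearizable_read(collection, query_filter,
                             dummy_field="_linearizable_touch"):
    """Build the older workaround: findAndModify on an extraneous field."""
    return {
        "findAndModify": collection,
        "query": query_filter,
        # Touch a field the application never reads (hypothetical name).
        "update": {"$inc": {dummy_field: 1}},
        "writeConcern": {"w": "majority"},
    }


if __name__ == "__main__":
    print(linearizable_find("accounts", {"_id": 1}))
    print(legacy_linearizable_read("accounts", {"_id": 1}))
```

The first form expresses the intent directly as a read, whereas the workaround performs a real write to the document on every read.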

    Read More
  • BSD YouTubers Honor Us With Dramatic Readings

    From the desk(top) of the editor-in-chief:

    Recently we published a piece by A. Jesse Jiryu Davis about his undertaking to prove that getaddrinfo was thread-safe on OS X, thus enabling Python to do away with an unnecessary and troublesome mutex around hostname resolution. The convolutions of tracking down that evidence, and the shroud of secrecy involved in all correspondence with Apple, inspired us to render the piece in a whimsical, high-fantasy style. It was called “The Saga of Concurrent DNS in Python, and the Defeat of the Wicked Mutex Troll.”

    It seems the unconventional style has inspired a couple of dramatic readings, which we’re just thrilled about. We’d love to share them with you.

    On BSD Now, host Allan Jude and his guest Kris Moore read some excerpts in episode 172, “A tale of yore”. He really gets the tone across and his delight is palpable. At minute 17, they throw down a challenge: “Someone should do a dramatic reading of this whole story.”

    That challenge was taken up by Mason Egger, who created his YouTube channel BSD Synergy to serve those not yet ready for the firehose of insider expertise that is BSD Now. He was so tickled with what he heard on BSD Now that he read the entire thing for BSD Synergy episode 20, against a backdrop of images that set the scene beautifully.

    Allan, Mason, thank you both for bringing our work to life!

    We cannot confirm any rumors that we are in talks with Peter Jackson to do a motion picture adaptation of this story.

    creaky old wooden chest with medieval-style engravings and a scroll depicting the history of getaddrinfo on macOS

    Read More
  • D3 Round Two: How to Blend HTML5 Canvas with SVG to Speed Up Rendering

    Soon after the publication of “Digging Into D3 Internals to Eliminate Jank,” I was pleased to see that it had sparked a discussion on Twitter, with D3 community members, notably Noah Veltman and Mike Bostock, sharing suggestions for improving our rendering solution.

    A suggestion we received both in this discussion and on lobste.rs was to use canvas to render the data points. We had originally avoided canvas because of time constraints, lack of team familiarity with canvas, and the complications it introduced with regard to mouse interactions. However, Noah proposed a combination of SVG and canvas that strikes a balance between canvas’ performance and SVG’s convenience, complete with a demo.

    It piqued my interest, and so I decided to explore it in some more detail here.

    Read More
  • The Saga of Concurrent DNS in Python, and the Defeat of the Wicked Mutex Troll

    creaky old wooden chest with medieval-style engravings and a scroll depicting the history of getaddrinfo on macOS

    Tell us about the time you made DNS resolution concurrent in Python on Mac and BSD.

    No, no, you do not want to hear that story, my friends. It is nothing but old lore and #ifdefs.

    But you made Python more scalable. The saga of Steve Jobs was sung to you by a mysterious wizard with a fanciful nickname! Tell us!

    Gather round, then. I will tell you how I unearthed a lost secret, unbound Python from old shackles, and banished an ancient and horrible Mutex Troll.

    Let us begin at the beginning.


    A long time ago, in the 1980s, a coven of Berkeley sorcerers crafted an operating system. They named it after themselves: the Berkeley Software Distribution, or BSD. For generations they nurtured it, growing it and adding features. One night, they conjured a powerful function that could resolve hostnames to IPv4 or IPv6 addresses. It was called getaddrinfo. The function was mighty, but in years to come it would grow dangerous, for the sorcerers had not made getaddrinfo thread-safe.

    As ages passed, BSD spawned many offspring. There were FreeBSD, OpenBSD, NetBSD, and in time, Mac OS X. Each made its copy of getaddrinfo thread-safe, at different times and in different ways. Some operating systems retained scribes who recorded these events in the annals. Some did not.

    Because getaddrinfo is ringed round with mystery, the artisans who make cross-platform network libraries have mistrusted it. Is it thread-safe or not? Often, they hired a Mutex Troll to stand guard and prevent more than one thread from using getaddrinfo concurrently. The most widespread such library is Python’s own socket module, distributed with Python’s standard library. On Mac and other BSDs, the Python interpreter hires a Mutex Troll, who demands that each Python thread hold a special lock while calling getaddrinfo.
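
In plainer terms, the Mutex Troll is a lock held around every getaddrinfo call. The sketch below shows that pattern at the Python level, as an illustration only: the interpreter's own guard lives inside its socket module, and the names guarded_getaddrinfo and resolve_in_threads are hypothetical. Numeric addresses are used so that no DNS query is needed.

```python
import socket
import threading

# The "Mutex Troll": only one thread at a time may hold this lock.
_getaddrinfo_lock = threading.Lock()


def guarded_getaddrinfo(*args, **kwargs):
    """Call socket.getaddrinfo while holding a process-wide lock."""
    with _getaddrinfo_lock:
        return socket.getaddrinfo(*args, **kwargs)


def resolve_in_threads(hosts, port):
    """Resolve each host on its own thread; lookups are serialized by the lock."""
    results = {}

    def worker(host):
        results[host] = guarded_getaddrinfo(
            host, port, flags=socket.AI_NUMERICHOST
        )

    threads = [threading.Thread(target=worker, args=(h,)) for h in hosts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for host, infos in resolve_in_threads(["127.0.0.1", "127.0.0.2"], 80).items():
        print(host, infos[0][4])
```

The cost of this design is that one slow lookup blocks every other thread that needs a lookup, even when the platform's getaddrinfo could safely run them concurrently.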

    Read More
  • MongoDB’s JavaScript Fuzzer: Harnessing the Havoc (2/2)

    Fuzz testing is a method for subjecting a codebase to a tide of hostile input to supplement the test cases engineers create on their own. In part one of this pair, we looked at the hybrid nature of our fuzzer – how it combines “smart” and “dumb” fuzzing to produce input random enough to provoke bugs, but structured enough to pass input validation and test meaningful codepaths. To wrap up, I’ll discuss how we isolate signal from the noise a fuzzer intrinsically produces, and the tooling that augments the root cause analyses we do when the fuzzer finds an issue.

    An unbridled fuzzer creates too much noise

    Fuzz testing is a game of random numbers. That randomness makes the fuzzer powerful… too powerful. Without some careful harnessing, it would just blow itself up all the time by creating errors within the testing code itself. Take the following block of code, which is something you would see in one of MongoDB’s JavaScript tests:

    while(coll.count() < 654321)
        assert(coll.update({a:1}, {$set: {...}}))
    

    This code does a large number of updates to a document stored in MongoDB. If we were to put it through the fuzzer, a possible test-case that the fuzzer could produce is this:

    while(true) 
        assert(coll.update({}, {$set: {"a.654321" : 1}}))
    

    The new code now tests something completely different. It tries to set the element at index 654321 of an array stored in all documents in some MongoDB collection.

    Now, this is an interesting test-case. Using the $set operator with such a large array may not be something we thought of testing explicitly and could trigger a bug (in fact it does). But the interaction between the fuzzed true condition and the residual while loop is going to hang the test! Unless, that is, the assert call in the while loop fails, which could happen if the line defining coll in the original test (not shown here) is mutated or deleted by the fuzzer, leaving coll undefined. If the assert call fails, it would be caught by the Mongo shell and cause it to terminate.

    But neither the hang nor the assertion failure is caused by bugs in MongoDB. They are just byproducts of a randomly generated test-case, and they represent the two classes of noise we have to filter out of our fuzz testing: branch logic and assertion failures.
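
To make the two classes of noise concrete, here is a rough sketch of a harness that tells them apart. It is not how MongoDB's fuzzer filters noise; it assumes each generated test runs as a separate process, uses tiny Python programs as stand-ins for fuzzed JavaScript tests, and the function name classify and its labels are hypothetical.

```python
import subprocess
import sys


def classify(test_source, timeout_seconds=5.0):
    """Run a generated test in a child process and label the outcome."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", test_source],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        # Likely noise from mutated branch logic, e.g. a loop that never exits.
        return "hang"
    if proc.returncode != 0 and "AssertionError" in proc.stderr:
        # Likely noise from an assertion in the test code itself.
        return "assertion failure"
    return "ok" if proc.returncode == 0 else "other failure"


if __name__ == "__main__":
    print(classify("while True: pass", timeout_seconds=1.0))
    print(classify("assert False"))
    print(classify("x = 1"))
```

A harness like this can only label the symptoms. Deciding whether a hang or a failed assertion points to a real bug still requires looking at what the generated test was doing.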

    Read More
  • MongoDB’s JavaScript Fuzzer: Creating Chaos (1/2)

    As MongoDB becomes more feature rich and complex with time, our need for more sophisticated bug-finding methods grows as well. We recently added a homegrown JavaScript fuzzer to our toolkit, and it is now our most prolific bug finding tool, responsible for finding almost 200 bugs over the course of two release cycles. These bugs span a range of MongoDB components from sharding to the storage engine, with symptoms ranging from deadlocks to data inconsistency. We run the fuzzer as part of our continuous integration system, Evergreen, where it frequently catches bugs in newly committed code.

    In part one of two, we examine how our fuzzer hybridizes the two main types of fuzzing to achieve greater coverage than either method alone could accomplish. Part two will focus on the pragmatics of running the fuzzer in a production setting and distilling a root cause from the complex output fuzz tests often produce.

    What’s a fuzzer?

    Fuzzing, or fuzz testing, is a technique of generating randomized, unexpected, and invalid inputs to a program to trigger untested code paths. Fuzzing was originally developed in the 1980s and has since proven to be effective at ensuring the stability of a wide range of systems, from filesystems to distributed clusters to browsers. As people attempt to make fuzzing more effective, two philosophies have emerged: smart and dumb fuzzing. And as the state of the art evolves, the techniques that are used to implement fuzzers are being partitioned into categories, chief among them being “generational” and “mutational.” In many popular fuzzing tools, smart fuzzing corresponds to generational techniques, and dumb fuzzing to mutational techniques, but as we will see, this is not an intrinsic relationship. Indeed, in our case, the situation is precisely reversed.
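
The difference between the two technique families is easy to see in miniature. The sketch below is a toy illustration, not MongoDB's fuzzer: the grammar, the depth limit, and the function names are all assumptions. A generational fuzzer builds inputs from scratch using a description of their structure, while a mutational fuzzer starts from an existing input and alters it.

```python
import random

# A toy grammar: each symbol maps to a list of possible expansions.
GRAMMAR = {
    "expr": [["term", " + ", "expr"], ["term"]],
    "term": [["0"], ["1"], ["x"]],
}


def generate(symbol, rng, depth=0):
    """Generational: build a new input from the grammar."""
    if symbol not in GRAMMAR:
        return symbol  # a literal token
    options = GRAMMAR[symbol]
    # Past a depth limit, take the last (non-recursive) option so we terminate.
    option = options[-1] if depth > 5 else rng.choice(options)
    return "".join(generate(part, rng, depth + 1) for part in option)


def mutate(seed_input, rng):
    """Mutational: overwrite one character of an existing input."""
    index = rng.randrange(len(seed_input))
    replacement = chr(rng.randrange(32, 127))  # any printable ASCII character
    return seed_input[:index] + replacement + seed_input[index + 1:]


if __name__ == "__main__":
    rng = random.Random(0)
    print(generate("expr", rng))
    print(mutate("x + 1", rng))
```

Here the generated input is always well formed, while the mutated one may not be. Which technique counts as smart or dumb depends on how much knowledge of the input format each one encodes.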

    Read More
  • Investing In CS4All: Training Teachers and Helping Them Build Curricula

    Until last year, Jeremy Mellema was a history teacher. Now, he’s teaching computer programming. When I visited his class in the Bronx this month, he had 30 students with 30 MacBooks, completing exercises in Python. They had just finished a lesson on data types, and now they were tackling variables. In Jeremy’s class, the first variable assignment is:

    tupac = "Greatest of All Time!!"
    

    Computer Science for All

    A year ago, New York City mayor Bill de Blasio announced Computer Science for All, an $80 million public-private partnership. The goal is to teach computer science to every student at every public school. But first, the schools need curricula and 5000 teachers need training.

    Here at MongoDB, our VP of Education Shannon Bradshaw oversees MongoDB University, which trains IT professionals to use MongoDB. When he heard about CS4All, he wanted us to contribute. He proposed that we set aside budget for two paid fellowships, and recruit public school teachers to spend the summer with us. We would develop them as teachers, and help build curricula they could take back into schools this fall. MongoDB staff would share our expertise, our office space, our equipment, and the MongoDB software itself.

    Shannon pitched his proposal to the company like this: “As many of us know, it’s still unusual for students to encounter computer science, let alone databases, in their classrooms before entering college. I believe this absence directly contributes to the gender and racial disparity we see today across our industry.” The CS4All project improves access to these subjects for many more students in our city, and MongoDB could be part of it from the beginning.

    Read More

Copyright © 2016 MongoDB, Inc.
Mongo, MongoDB, and the MongoDB leaf logo are registered trademarks of MongoDB, Inc.
