The MongoDB Engineering Journal

A tech blog for builders, by builders

All Posts tagged in "Testing"
  • Testing Linearizability with Jepsen and Evergreen: “Call Me Continuously!”


    What do you do with a third-party tool that proves your application lacks a feature? Add that tool to your continuous integration system (after adding the feature, of course)! In our case we have added linearizable reads to MongoDB 3.4 and use Jepsen to test it.

    Evergeeen logo plus Jepsen

    What is Linearizability?

    Linearizability is a property of distributed systems first introduced by Herlihy & Wing in their July 1990 article “Linearizability: a correctness condition for concurrent objects” (ACM Transactions on Programming Languages and Systems Journal). Peter Bailis probably provides the most accessible explanation of linearizability: “writes should appear to be instantaneous. Imprecisely, once a write completes, all later reads (where “later” is defined by wall-clock start time) should return the value of that write or the value of a later write. Once a read returns a particular value, all later reads should return that value or the value of a later write.”

    In MongoDB 3.4, linearizable reads are now supported on single documents, using a new read concern called “linearizable”. Previously, linearizable reads were possible only by using a findAndModify operation on a single document and updating an extraneous field in the document, with a writeConcern of “majority”. Keep in mind there is a performance penalty for this. Linearizable reads have a performance profile similar to majority writes, as each linearizable read makes use of a no-op majority write to the replica set to ensure the data being read is durable and not stale.

    Read More
  • MongoDB’s JavaScript Fuzzer: Harnessing the Havoc (2/2)

    Fuzz testing is a method for subjecting a codebase to a tide of hostile input to supplement the test cases engineers create on their own. In part one of this pair, we looked at the hybrid nature of our fuzzer – how it combines “smart” and “dumb” fuzzing to produce input random enough to provoke bugs, but structured enough to pass input validation and test meaningful codepaths. To wrap up, I’ll discuss how we isolate signal from the noise a fuzzer intrinsically produces, and the tooling that augments the root cause analyses we do when the fuzzer finds an issue.

    An unbridled fuzzer creates too much noise

    Fuzz testing is a game of random numbers. That randomness makes the fuzzer powerful… too powerful. Without some careful harnessing, it would just blow itself up all the time by creating errors within the testing code itself. Take the following block of code, which is something you would see in one of MongoDB’s JavaScript tests:

    while(coll.count() < 654321)
        assert(coll.update({a:1}, {$set: {...}}))

    This code does a large number of updates to a document stored in MongoDB. If we were to put it through the fuzzer, a possible test-case that the fuzzer could produce is this:

        assert(coll.update({}, {$set: {"a.654321" : 1}}))

    The new code now tests something completely different. It tries to set the 654321th element in an array stored in all documents in some MongoDB collection.

    Now, this is an interesting test-case. Using the $set operator with such a large array may not be something we thought of testing explicitly and could trigger a bug (in fact it does). But the interaction between the fuzzed true condition and the residual while loop is going to hang the test! Unless, that is, the assert call in the while loop fails, which could happen if the line defining coll in the original test (not shown here) is mutated or deleted by the fuzzer, leaving coll undefined. If the assert call fails, it would be caught by the Mongo shell and cause it to terminate.

    But neither the hang nor the assertion failure are caused by bugs in MongoDB. They are just byproducts of a randomly generated test-case, and they represent the two classes of noise we have to filter out of our fuzz testing: branch logic and assertion failures.

    Read More
  • MongoDB’s JavaScript Fuzzer: Creating Chaos (1/2)

    As MongoDB becomes more feature rich and complex with time, our need for more sophisticated bug-finding methods grows as well. We recently added a homegrown JavaScript fuzzer to our toolkit, and it is now our most prolific bug finding tool, responsible for finding almost 200 bugs over the course of two release cycles. These bugs span a range of MongoDB components from sharding to the storage engine, with symptoms ranging from deadlocks to data inconsistency. We run the fuzzer as part of our continuous integration system, Evergreen, where it frequently catches bugs in newly committed code.

    In part one of two, we examine how our fuzzer hybridizes the two main types of fuzzing to achieve greater coverage than either method alone could accomplish. Part two will focus on the pragmatics of running the fuzzer in a production setting and distilling a root cause from the complex output fuzz tests often produce.

    What’s a fuzzer?

    Fuzzing, or fuzz testing, is a technique of generating randomized, unexpected, and invalid inputs to a program to trigger untested code paths. Fuzzing was originally developed in the 1980s and has since proven to be effective at ensuring the stability of a wide range of systems, from filesystems to distributed clusters to browsers. As people attempt to make fuzzing more effective, two philosophies have emerged: smart, and dumb fuzzing. And as the state of the art evolves, the techniques that are used to implement fuzzers are being partitioned into categories, chief among them being “generational” and “mutational.” In many popular fuzzing tools, smart fuzzing corresponds to generational techniques, and dumb fuzzing to mutational techniques, but as we will see, this is not an intrinsic relationship. Indeed, in our case, the situation is precisely reversed.

    Read More
  • Cat-Herd's Crook: How YAML test specs improve driver conformance

    Cat illustrations by Samantha Ritter

    Image of Samantha and Jesse as cats

    At MongoDB we write open source database drivers in ten programming languages. We also help developers in our community replicate our drivers’ behavior in even more (and more exotic) languages. Ideally, all drivers behave alike; or, where they differ, the differences are written down and justified. How can we herd all these cats along the same path?

    For years we failed. Each false start at standardization left us more discouraged. But we’ve recently gained momentum on standardizing our drivers. Human-readable, machine-testable specs, coded in YAML, prove which code conforms and which does not. These YAML tests are the Cat-Herd’s Crook: a tool to guide us all in the same direction.

    Read More

Copyright © 2016 MongoDB, Inc.
Mongo, MongoDB, and the MongoDB leaf logo are registered trademarks of MongoDB, Inc.

Powered by Hugo