The MongoDB Engineering Journal

A tech blog for builders, by builders

All Posts tagged in "Open Source"
  • Testing Linearizability with Jepsen and Evergreen: “Call Me Continuously!”

    Intro

    What do you do with a third-party tool that proves your application lacks a feature? Add that tool to your continuous integration system (after adding the feature, of course)! In our case we have added linearizable reads to MongoDB 3.4 and use Jepsen to test it.

    Evergeeen logo plus Jepsen

    What is Linearizability?

    Linearizability is a property of distributed systems first introduced by Herlihy & Wing in their July 1990 article “Linearizability: a correctness condition for concurrent objects” (ACM Transactions on Programming Languages and Systems Journal). Peter Bailis probably provides the most accessible explanation of linearizability: “writes should appear to be instantaneous. Imprecisely, once a write completes, all later reads (where “later” is defined by wall-clock start time) should return the value of that write or the value of a later write. Once a read returns a particular value, all later reads should return that value or the value of a later write.”

    In MongoDB 3.4, linearizable reads are now supported on single documents, using a new read concern called “linearizable”. Previously, linearizable reads were possible only by using a findAndModify operation on a single document and updating an extraneous field in the document, with a writeConcern of “majority”. Keep in mind there is a performance penalty for this. Linearizable reads have a performance profile similar to majority writes, as each linearizable read makes use of a no-op majority write to the replica set to ensure the data being read is durable and not stale.

    Read More
  • The Saga of Concurrent DNS in Python, and the Defeat of the Wicked Mutex Troll

    creaky old wooden chest with medieval-style engravings and a scroll depicting the history of getaddrinfo on macOS

    Tell us about the time you made DNS resolution concurrent in Python on Mac and BSD.

    No, no, you do not want to hear that story, my friends. It is nothing but old lore and #ifdefs.

    But you made Python more scalable. The saga of Steve Jobs was sung to you by a mysterious wizard with a fanciful nickname! Tell us!

    Gather round, then. I will tell you how I unearthed a lost secret, unbound Python from old shackles, and banished an ancient and horrible Mutex Troll.

    Let us begin at the beginning.


    A long time ago, in the 1980s, a coven of Berkeley sorcerers crafted an operating system. They named it after themselves: the Berkeley Software Distribution, or BSD. For generations they nurtured it, growing it and adding features. One night, they conjured a powerful function that could resolve hostnames to IPv4 or IPv6 addresses. It was called getaddrinfo. The function was mighty, but in years to come it would grow dangerous, for the sorcerers had not made getaddrinfo thread-safe.

    As ages passed, BSD spawned many offspring. There were FreeBSD, OpenBSD, NetBSD, and in time, Mac OS X. Each made its copy of getaddrinfo thread safe, at different times and different ways. Some operating systems retained scribes who recorded these events in the annals. Some did not.

    Because getaddrinfo is ringed round with mystery, the artisans who make cross-platform network libraries have mistrusted it. Is it thread safe or not? Often, they hired a Mutex Troll to stand guard and prevent more than one thread from using getaddrinfo concurrently. The most widespread such library is Python’s own socket module, distributed with Python’s standard library. On Mac and other BSDs, the Python interpreter hires a Mutex Troll, who demands that each Python thread hold a special lock while calling getaddrinfo.

    Read More

Copyright © 2016 MongoDB, Inc.
Mongo, MongoDB, and the MongoDB leaf logo are registered trademarks of MongoDB, Inc.

Powered by Hugo