On January 10, I released a badly broken version of the MongoDB C Driver, libmongoc 1.5.2. For most users, that version could not connect to a server at all! Luckily, in under 24 hours a developer reported the bug, I reverted the mistake and released a fix. Although it was resolved before it did any damage, this is among the most dramatic mistakes I’ve made since I switched from the PyMongo team to libmongoc almost two years ago. My error stemmed from three mistaken assumptions I’ve had ever since I changed projects. What were they?
Here’s how the story began. In December, a libmongoc user named Alexey pointed out a longstanding limitation: it would only resolve hostnames to IPv4 addresses. Even if IPv6 address records existed for a hostname, the driver would not look them up – when it called getaddrinfo on the hostname to do the DNS resolution, it passed
AF_INET as the address family, precluding anything but IPv4. So if you passed the URI
mongodb://example.com, libmongoc resolved “example.com” to an IPv4 address like
220.127.116.11 and tried to connect to it. If the connection timed out, the driver gave up.
Tell us about the time you made DNS resolution concurrent in Python on Mac and BSD.
No, no, you do not want to hear that story, my friends. It is nothing but old lore and
But you made Python more scalable. The saga of Steve Jobs was sung to you by a mysterious wizard with a fanciful nickname! Tell us!
Gather round, then. I will tell you how I unearthed a lost secret, unbound Python from old shackles, and banished an ancient and horrible Mutex Troll.
Let us begin at the beginning.
A long time ago, in the 1980s, a coven of Berkeley sorcerers crafted an operating system. They named it after themselves: the Berkeley Software Distribution, or BSD. For generations they nurtured it, growing it and adding features. One night, they conjured a powerful function that could resolve hostnames to IPv4 or IPv6 addresses. It was called getaddrinfo. The function was mighty, but in years to come it would grow dangerous, for the sorcerers had not made getaddrinfo thread-safe.
As ages passed, BSD spawned many offspring. There were FreeBSD, OpenBSD, NetBSD, and in time, Mac OS X. Each made its copy of getaddrinfo thread safe, at different times and different ways. Some operating systems retained scribes who recorded these events in the annals. Some did not.
Because getaddrinfo is ringed round with mystery, the artisans who make cross-platform network libraries have mistrusted it. Is it thread safe or not? Often, they hired a Mutex Troll to stand guard and prevent more than one thread from using getaddrinfo concurrently. The most widespread such library is Python’s own socket module, distributed with Python’s standard library. On Mac and other BSDs, the Python interpreter hires a Mutex Troll, who demands that each Python thread hold a special lock while calling getaddrinfo.