Filtered by Tag: open source

Pruning Dynamic Rebuilds With libabigail

| | c++ open source optimization

Complex C++ projects frequently struggle with lengthy build times. Splitting a project into multiple dynamically-linked components can give developers faster incremental rebuilds and shorter edit-compile-test cycles than relying on static linking, especially when there are a large number of test binaries. However, build systems usually do not realize all of the possible gains in dynamic incremental rebuilds due to how they handle transitive library dependencies. Red Hat's ABI introspection library libabigail offers one possible path to eliminating unnecessary transitive re-linking for some classes of source modifications.

Considering the Community Effects of Introducing an Official MongoDB Go Driver

| | golang drivers open source

What do you do when an open-source project you rely on no longer meets your needs?  When your choice affects not just you, but a larger community, what principles guide your decision?

Submitting patches is often the first option, but you're at the mercy of the maintainer to accept them.  If the changes you need are sweeping, substantial alterations, the odds of acceptance are low.  Eventually, only a few realistic options remain: find an alternative, fork the project, or write your own replacement.  Everyone who depends on open source faces this conundrum at one time or another.

After relying for years on the community-developed mgo Go driver for MongoDB, MongoDB has begun work on a brand-new, internally-developed, open-source Go driver.  We know that releasing a company-sponsored alternative to a successful, community-developed project creates tension and uncertainty for users, so we did not make this decision lightly. We carefully considered how our choice would affect current and future Go users of MongoDB.

Presenting WinKerberos

| | open source python kerberos

WinKerberos is a Python module providing Kerberos facilities to Python applications on Windows, where PyKerberos does not work. WinKerberos can be used as a drop-in replacement on Windows for client applications using PyKerberos.

Why write a new Kerberos module for Python?

The MongoDB Enterprise Edition 2.4 supported a new authentication mechanism: Kerberos V5 using the Generic Security Services API (GSSAPI). PyMongo, the Python driver for MongoDB, needed to support this new authentication mechanism.

There wasn't a lot of information available on using Kerberos in any Python application, to say nothing of one running on Windows. Nick Coghlan's 2011 article "Using the Python Kerberos Module" was the best information available at the time. Nick's article was about using PyKerberos to implement HTTP "Negotiate" authentication, whereas MongoDB uses a custom TCP wire protocol, but using the article's code examples, a few other sources, and a careful reading of section 3.1 of RFC 4752, I got Kerberos authentication in PyMongo working everywhere but Windows in a few days.

PyKerberos was written by Apple as part of their open source Calendar and Contacts Server project. It is a pure C Python extension module that builds against MIT Kerberos V5 or Heimdal and is well tested on macOS and Linux. Though there were rumors of people getting PyKerberos to work on Windows, we could never figure out how they did it.

When we released MongoDB Enterprise 2.4 on March 19, 2013, the number of users that needed support for PyMongo on Windows with Kerberos authentication appeared to be exactly zero. On May 22, 2013, we released PyMongo 2.5 with support for Kerberos authentication on the platforms PyKerberos supported. By 2016, after multiple requests for Kerberos support on Windows and an aborted attempt to implement support using kerberos-sspi, we decided to write a new Python module. This module would support Kerberos authentication on Windows for PyMongo and any other Python project that needed it.

Enter WinKerberos

WinKerberos is a pure C Python extension module that supports Python 2.6, 2.7, and 3.3+. It provides most of the client API of PyKerberos, but using Microsoft's Security Support Provider Interface (SSPI) under the covers. PyMongo, Requests, and a few other projects use WinKerberos as the Kerberos provider on Windows. It is available on pypi as prebuilt binary wheels and can be installed with pip without a C compiler:

 python -m pip install winkerberos

To add Windows support to an existing application that uses PyKerberos for client authentication, change:

import kerberos

to:

try:
    import winkerberos as kerberos
except ImportError:
    import kerberos

If you need to implement Kerberos authentication from scratch in your application, the README provides an example implementation to use with WinKerberos or PyKerberos.

Help us improve WinKerberos

WinKerberos has implemented all of the features of PyKerberos that PyMongo needed since version 0.1. Since then, we have shipped six more releases adding support for features requested by the community, and patches from users adding support for SPNEGO and RFC 5929 Channel Bindings. As a reimplementation of PyKerberos for Windows, WinKerberos is still incomplete. It lacks some of PyKerberos' client-side functions, like changePassword and getServerPrincipalDetails, and doesn't implement any of the server API. If you would like to see these features in WinKerberos, or you are adding a new feature to PyKerberos that should also exist in WinKerberos, we happily accept patches from the community. If you find a bug in WinKerberos, or want to request a new feature, please file a ticket in the Github project.

Testing Linearizability with Jepsen and Evergreen: “Call Me Continuously!”

| | testing ci buildtogether open source

What do you do with a third-party tool that proves your application lacks a feature? Add that tool to your continuous integration system (after adding the feature, of course)! In our case we have added linearizable reads to MongoDB 3.4 and use Jepsen to test it.

What is Linearizability?

Linearizability is a property of distributed systems first introduced by Herlihy & Wing in their July 1990 article "Linearizability: a correctness condition for concurrent objects" (ACM Transactions on Programming Languages and Systems Journal). Peter Bailis probably provides the most accessible explanation of linearizability: "writes should appear to be instantaneous. Imprecisely, once a write completes, all later reads (where “later” is defined by wall-clock start time) should return the value of that write or the value of a later write. Once a read returns a particular value, all later reads should return that value or the value of a later write."

The Saga of Concurrent DNS in Python, and the Defeat of the Wicked Mutex Troll

| | python open source macos bsd

Tell us about the time you made DNS resolution concurrent in Python on Mac and BSD.

No, no, you do not want to hear that story, my friends. It is nothing but old lore and #ifdefs.

But you made Python more scalable. The saga of Steve Jobs was sung to you by a mysterious wizard with a fanciful nickname! Tell us!

Gather round, then. I will tell you how I unearthed a lost secret, unbound Python from old shackles, and banished an ancient and horrible Mutex Troll.

Let us begin at the beginning...

 

A long time ago, in the 1980s, a coven of Berkeley sorcerers crafted an operating system. They named it after themselves: the Berkeley Software Distribution, or BSD. For generations they nurtured it, growing it and adding features. One night, they conjured a powerful function that could resolve hostnames to IPv4 or IPv6 addresses. It was called getaddrinfo. The function was mighty, but in years to come it would grow dangerous, for the sorcerers had not made getaddrinfo thread-safe.

As ages passed, BSD spawned many offspring. There were FreeBSD, OpenBSD, NetBSD, and in time, Mac OS X. Each made its copy of getaddrinfo thread safe, at different times and different ways. Some operating systems retained scribes who recorded these events in the annals. Some did not.

Because getaddrinfo is ringed round with mystery, the artisans who make cross-platform network libraries have mistrusted it. Is it thread safe or not? Often, they hired a Mutex Troll to  stand guard and prevent more than one thread from using getaddrinfo concurrently. The most widespread such library is Python's own socket module, distributed with Python's standard library. On Mac and other BSDs, the Python interpreter hires a Mutex Troll, who demands that each Python thread hold a special lock while calling getaddrinfo.