Python Collections Abstract Base Classes

I love thinking about data structures, and how to organise them most efficiently for a specific task. In the normal course of programming in Python, we don't have to think about it very much - the choice between list and dict is obvious, and that's usually as far as things go.

When things get more complex, though, the Collections Abstract Base Classes (found in the collections.abc module) can be extremely useful. In my experience they aren't widely known, so in this post I'll show a couple of interesting uses for them.

List-Based Set

Using a set requires that the items held within are all hashable (that is, they implement the __hash__ method).

This isn't always the case, though. For example, Django models that don't have a PK yet are unhashable, as are dicts. In these situations, it can be useful to have a data structure which acts like a set, but which is backed by a list to sidestep that requirement. Performance will be worse, but in some cases this is acceptable.

>>> s = ListBasedSet([
...     {
...         'id': 1,
...     },
...     {
...         'id': 2,
...     },
... ])
>>> len(s)
2

This can be easily achieved using the MutableSet Abstract Base Class:
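A minimal sketch of such a class follows. It assumes items can be compared with ==, keeps them in a private list (the _items name is my own choice for illustration), and implements only the five abstract methods that MutableSet requires. The rest of the interface comes from MutableSet's mixin methods.

```python
from collections.abc import MutableSet


class ListBasedSet(MutableSet):
    """Set-like container backed by a list, so items need not be hashable."""

    def __init__(self, iterable=()):
        self._items = []
        for item in iterable:
            self.add(item)  # add() drops duplicates

    def __contains__(self, item):
        # Linear scan using ==, so O(n) rather than a hash lookup.
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def add(self, item):
        if item not in self._items:
            self._items.append(item)

    def discard(self, item):
        # Like set.discard, silently ignore items that are not present.
        try:
            self._items.remove(item)
        except ValueError:
            pass
```

Every membership test walks the whole list, which is where the performance cost mentioned earlier comes from.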

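This exposes much the same API as a built-in set. MutableSet supplies mixin methods such as clear(), pop() and remove(), along with the set operators (|, &, -, ^), on top of the abstract methods you implement. Named methods such as union() are not included.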

>>> s.add({
...     'id': 3,
... })
>>> len(s)
3
>>> s.clear()
>>> len(s)
0

Lazy-Loading and Pagination

If you have an API that paginates results, but you'd like to expose it as a simple list that can be iterated over, the Collections Abstract Base Classes are a good way to do that.

As an example, APIs often return a response with a list of objects and the total number of objects available:

    {
        "objects": [
            {"id": 1},
            {"id": 2}
        ],
        "total": 2
    }

In such a case, a class like the following could be used to load the data lazily, when an item in the list is accessed:
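As a rough sketch of the idea, the class below subclasses Sequence and fetches a page only the first time an index on that page is requested. The fetch_page helper, the offset and limit query parameters, and the page_size default are all assumptions made for illustration, not part of any particular API. The response is assumed to have the "objects" and "total" keys shown above.

```python
import json
from collections.abc import Sequence
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_page(url, offset, limit):
    """Hypothetical fetcher: assumes the API takes offset/limit query parameters."""
    query = urlencode({"offset": offset, "limit": limit})
    with urlopen(f"{url}?{query}") as response:
        return json.load(response)


class LazyLoadedList(Sequence):
    """Read-only list view over a paginated API; pages load on first access."""

    def __init__(self, url, fetch=fetch_page, page_size=20):
        self._url = url
        self._fetch = fetch
        self._page_size = page_size
        self._pages = {}    # page number -> list of objects already fetched
        self._total = None  # unknown until the first response arrives

    def _load_page(self, page):
        if page not in self._pages:
            data = self._fetch(self._url, page * self._page_size, self._page_size)
            self._pages[page] = data["objects"]
            self._total = data["total"]
        return self._pages[page]

    def __len__(self):
        if self._total is None:
            self._load_page(0)  # the first page also tells us the total
        return self._total

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("this sketch supports integer indexes only")
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError(index)
        page, offset = divmod(index, self._page_size)
        return self._load_page(page)[offset]
```

Sequence only needs __getitem__ and __len__; iteration, the in operator, index() and count() are derived from those two. Passing a different fetch callable makes it possible to try the class against canned data without a network connection.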

With this implementation, you can simply iterate over the list as normal and have the paginated data loaded automatically:

>>> l = LazyLoadedList('')
>>> for item in l:
...     process_item(item)

At Zapier, we use something very similar to this to wrap Elasticsearch responses.

I hope these examples show some of the things that can be achieved with Python's Collections Abstract Base Classes!