I love thinking about data structures, and how to organise them most
efficiently for a specific task. In the normal course of programming in Python,
we don't have to think about it very much - the choice between list and dict
is obvious, and that's usually as far as things go.
When things get more complex, though, the Collections Abstract Base Classes (found in the collections.abc module) can be extremely useful. In my experience, they aren't widely known, so in this post I'll show a couple of interesting uses for them.
Using a set requires that the items held within are all hashable (that is,
they implement the __hash__ method).
This isn't always the case, though. For example, Django models that don't have
a PK yet are unhashable, as are dicts. In these situations, it can be useful
to have a data structure which acts like a set, but which is backed by
a list to sidestep that requirement. Performance will be worse, because
membership checks become a linear scan rather than a hash lookup, but in some
cases this is acceptable.
>>> s = ListBasedSet([
...     {
...         'id': 1,
...     },
...     {
...         'id': 2,
...     },
... ])
>>> len(s)
2
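This can be easily achieved using the MutableSet Abstract Base Class, which only
requires five methods to be implemented: __contains__, __iter__, __len__, add
and discard.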
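Here is a minimal sketch of what such a ListBasedSet could look like. The _items attribute name and the __repr__ are illustrative choices of mine, and equality (==) stands in for hashing when checking whether an item is already present:

```python
from collections.abc import MutableSet


class ListBasedSet(MutableSet):
    """A set-like container backed by a list, so items need not be hashable."""

    def __init__(self, iterable=()):
        self._items = []  # hypothetical attribute name, chosen for this sketch
        for item in iterable:
            self.add(item)

    def __contains__(self, item):
        # Linear scan using ==, instead of a hash lookup.
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def add(self, item):
        # Only store the item if an equal one is not already present.
        if item not in self._items:
            self._items.append(item)

    def discard(self, item):
        # Like set.discard, removing a missing item is not an error.
        try:
            self._items.remove(item)
        except ValueError:
            pass

    def __repr__(self):
        return f"{type(self).__name__}({self._items!r})"
```

Everything else (remove, pop, clear and the set operators) comes from the MutableSet mixin, built on top of these five methods.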
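This exposes most of the same API as a builtin set: the mixin supplies remove,
pop, clear and the set operators (|, &, -, ^ and their in-place forms), although
not the named methods such as union.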
>>> s.add({
...     'id': 3,
... })
>>> len(s)
3
>>> s.clear()
>>> len(s)
0
If you have an API that paginates results, but you'd like to expose it as
a simple list that can be iterated over, the Collections Abstract Base
Classes are a good way to do that.
As an example, APIs often return a response with a list of objects and the total number of objects available:
{
    "objects": [
        {
            "id": 1
        },
        {
            "id": 2
        }
    ],
    "total": 2
}
In such a case, a list-like class can load the data lazily, requesting a page from the API only when an item on that page is accessed.
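As a rough sketch of such a class, built on the Sequence Abstract Base Class: the offset and limit query parameters, the page_size default, and the fetch_json helper are all assumptions made for illustration, since a real API will define its own pagination scheme. Only __getitem__ and __len__ need to be written; Sequence supplies iteration, containment checks, index and count.

```python
import json
from collections.abc import Sequence
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_json(url, offset, limit):
    """Fetch one page. The offset/limit parameters are assumed, not standard."""
    query = urlencode({"offset": offset, "limit": limit})
    with urlopen(f"{url}?{query}") as response:
        return json.load(response)


class LazyLoadedList(Sequence):
    """A read-only, list-like view over a paginated API."""

    def __init__(self, url, page_size=20, fetch=fetch_json):
        self._url = url
        self._page_size = page_size
        self._fetch = fetch   # injectable, so the class can be tested offline
        self._pages = {}      # page number -> list of objects already loaded
        self._total = None    # unknown until the first response arrives

    def _load_page(self, page):
        # Request a page at most once, and remember the reported total.
        if page not in self._pages:
            offset = page * self._page_size
            data = self._fetch(self._url, offset, self._page_size)
            self._pages[page] = data["objects"]
            self._total = data["total"]
        return self._pages[page]

    def __len__(self):
        if self._total is None:
            self._load_page(0)
        return self._total

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [self[i] for i in range(*index.indices(len(self)))]
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            # Sequence's default __iter__ stops when it sees IndexError.
            raise IndexError("LazyLoadedList index out of range")
        page, position = divmod(index, self._page_size)
        return self._load_page(page)[position]
```

Passing the fetch function in as an argument is a design choice of this sketch: it keeps the pagination logic separate from the HTTP call, so a fake function returning canned pages can stand in for the network.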
With such a class, you can simply iterate over the list as normal and have the paginated data loaded automatically:
>>> l = LazyLoadedList('http://api.example.com/items')
>>> for item in l:
...     process_item(item)
At Zapier, we use something very similar to this to wrap ElasticSearch responses.
I hope these examples show some of the things that can be achieved with Python's Collections Abstract Base Classes!