When should I use a Python tuple?

Posted on Fri 23 December 2016 in Python

Python has two different built-in data types that represent an ordered sequence of items, list and tuple.

Now you might know that the difference between a tuple and a list is that a tuple is immutable. So you can do this to a list:

# You can mutate a list
a = [1, 2, 3]
a.append(4)
print(a)  # this displays [1, 2, 3, 4]
a[1] = 5
print(a)  # this now displays [1, 5, 3, 4]

# But you can't do it for tuples
b = (1, 2, 3)
b[1] = 5  # this raises a TypeError

So you can change elements of a list and also add new elements to it. But you can't do it for a tuple.

But have you ever wondered why this is useful? Can't we just use lists everywhere instead of tuples? Why do we need tuples?

Understanding hash tables

To understand this let's talk about a data structure called a hash table.

The abstraction of a hash table is a key-value pair. It is a data structure that associates a key with a value.

Does that sound like a Python dict? That's because Python dicts are implemented using hash tables.

There are many possible implementations of a key-value pair (such as binary trees), but a hash table is particularly efficient, in that given a key, you can theoretically access its associated value in O(1) (constant) time.

Let's talk about how a hash table is implemented in a lower level language like C.

Are you familiar with C arrays? An array is a contiguous sequence of variables. You can think of them like Python lists:

int array[5] = {1, 2, 3, 4, 5};
printf("%d", array[3]); // This prints "4" (C arrays are zero-indexed just like Python)

One way to think of hash tables is a gigantic array. Notice how we can access elements of an array using an index (i.e, array[index])? If we can map a key to a unique index, given a key you can access its value in constant time (O(1)).

For example, you can map the string 'ladida' to the index 2334. So given the key 'ladida' you can simply access index 2334 of the array to get the associated value for the key 'ladida'.

The mapping of a value to an index is done via a computation called a hash function.

Basically, you can conceptually think of hash functions as functions that compute an integer given an object supplied as a key. This allows us to put an associated value in the gigantic array. That integer number hopefully turns out to uniquely map to an index in the array, but sometimes collisions can occur (depending on the array size we use to implement the hash table).

But anyway, that's a topic for another blog post. Hash functions and collision handling are too deep to cover in a single post (I recently talked to someone who did a PhD on hash functions!).

Hashable data types

Okay, so why are we talking about this?

In Python, mutable data types like lists are unhashable. You can try doing this quickly in the shell:

>>> a = [1, 2, 3]
>>> d = {a: 'hello'}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

And then if you try doing it with a tuple, it works fine:

>>> a = (1, 2, 3)
>>> d = {a: 'hello'}
>>> d[(1, 2, 3)]
'hello'

The reason this happens is that a hash function needs to differentiate between objects (e.g. (1, 2, 3) and (1, 2, 3, 4) to be useful. When an object is mutable, you can't reliably compute a hash function because the object the dict holds a reference to might change.

That's why we use immutable data types like tuples and frozensets, and strings as keys in Python dictionaries, so that we can reliably compute a hash function for these data types.

However, not all tuples are hashable. If a tuple contains a mutable element, it is no longer hashable:

>>> a = (1, 2, [3, 4])
>>> {a: 'hello'}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

Tuples as stylistic convention for records

Some people also consider tuples to be used differently, as a stylistic convention. Tuples are used as a logical grouping for variables that have meaning when they're together, but not necessarily when they're not.

Consider the following:

# These variables in the tuple have logical meaning when grouped
employee1 = ('John Doe', '123 Baker Street, 342-123 Rocknroll City', '+6041112334')
employee2 = ('Karen Smith', '45 Queen Street, 342-123 Rocknroll City', '+6043323123')

# This is a collection of those things
employees = [employee1, employee2]

If you're familiar with other languages, you can probably analogously think of tuples as what other languages call structs. For example in Ruby:

Struct.new('Employee', :name, :address, :phone)
employee1 = Struct::Employee.new('John Doe', '123 Baker Street, 342-123 Rocknroll City', '+6041112334')
puts employee1.name  # This prints out 'John Doe'

The difference is that tuple elements aren't named. But, there is a namedtuple type in Python from the collections module in the standard library:

from collections import namedtuple

Employee = namedtuple('Employee', ['name', 'address', 'phone'])
employee1 = Employee('John Doe', '123 Baker Street, 342-123 Rocknroll City', '+6041112334')
print(employee1.name)  # This prints out 'John Doe'

So it is clear that the intent of a tuple is to provide a way to logically group values that have meaning when grouped, similar to structs in other languages.

Summary

Now you know how immutable data types like tuples and frozensets can be useful as keys in Python dictionaries.

When you want to use a collection of objects as a key to a Python dictionary, using a hashable data type is necessary.

All built-in immutable data types in Python are hashable, whereas built-in mutable data types are unhashable.

Another fun fact: elements of Python sets also need to be hashable.

Tuples are also used as a stylistic convention to logically group variables together.

For more detailed information, refer to the Python documentation on the topic 1.

Hope this helps!

Subscribe to receive free Python tips

Sometimes I keep my best content for my email subscribers. Subscribe now so that you don't miss out! You can unsubscribe at any time, and I will never spam you.
* indicates required

Comments