How To Sort Python Collections

By Adam McQuistan in Python  03/24/2020 Comment

Introduction

In this How To article I discuss and demonstrate techniques for sorting Python collections (lists, dictionaries, tuples) in simple to moderate use cases. Sorting is a common requirement in many software application and in the most simple use cases is refreshingly easy to do in Python. However, once you get beyond simple flat data structures it can be a little challenging at until you get the hang of it. I aim to provide several examples to help Python developers better acqaint themselves with these techniques.

Contents

Simple Sorting of Lists with sorted() and list.sort()

The most commonly sorted data collection-like data structure is the list. Lists can be sorted in one of two approaches using either:

(i) the sorted(collection) function which takes a list as an argument and returns a sorted copy of the list

>>> letters_to_sort = ['D', 'a', 'B', 'E', 'd', 'A', 'I']
>>> sorted_copy = sorted(letters_to_sort)
>>> sorted_copy
['A', 'B', 'D', 'E', 'I', 'a', 'd']

(ii) list.sort() method on the list object itself which sorts the list in place.

>>> from copy import copy
>>> copy_to_sort = copy(letters_to_sort)
>>> copy_to_sort.sort()
>>> copy_to_sort
['A', 'B', 'D', 'E', 'I', 'a', 'd']

As you likely noted the above lists of letters we sorted lexographically (aka, alphabetically) with capital letters given higher precendence.

One important thing worth noting with sorting is that the collection being sorted must be homogenous (ie, all the same data type such as all string or all ints) or the sort function will throw and error.

>>> sorted(['a', 1, 3, 'd'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'int' and 'str'

Simple Sorting of Lists in Reverse Order


Sorting in reverse order looks very similar using both the sorted() function and the list.sort() method and is accomplished by settings a keyword parameter to True as shown below.

First I show with the sorted() function which, again, creates a new copy of the list to be sorted in reverse this time.

>>> rev_sorted_copy = sorted(letters_to_sort, reverse=True)
>>> rev_sorted_copy
['d', 'a', 'I', 'E', 'D', 'B', 'A']

Then follow with list.sort() which, again, sorts the list in place.

>>> copy_to_sort.sort(reverse=True)
>>> copy_to_sort
['d', 'a', 'I', 'E', 'D', 'B', 'A']

Sorting Tuples

Sorting tupes is actually less complex due to the fact that they are immutable data structures thus their is less to consider when you want to sort them. Your only choice is to use the sorted() function and create a sorted copy of the input tuple.

>>> tup = (8, 3, 4, 1, 5)
>>> sorted_tup = sorted(tup)
>>> sorted_tup
[1, 3, 4, 5, 8]

Sorting Dictionaries

A little more thought and decision goes into sorting of dictionaries due to the key / values nature of the data structure but, unlike the list data structure dictionaries don't have a built in sort() function so you are limited to sorting only the keys or the values of a dictionary with the later needing explicit syntax.

By simply calling the sorted() function on a dictionary a sorted list of the dictionaries keys are returned which can then be used to iterate over and access the original unchanged dictionary. 

>>> d = {'name': 'Adam', 'age': 32, 'country': 'US'}
>>> sorted_keys = sorted(d)
>>> sorted_keys
['age', 'country', 'name']
>>> for k in sorted_keys:
...     print(k, d[k])
... 
age 32
country US
name Adam

Of course, you could call the values() method on a dictionary object and pass the returned list a sorted() function call if you were only interested in having the sorted values but, as stated earlier all the returned values need to be of the same type or sort will raise a TypeError:.

>>> country_populations = { 'China': 1_433_783_686, 'India': 1_366_417_754, 'US': 329_064_917 }
>>> sorted(country_populations.values())
[329064917, 1366417754, 1433783686]

Sorting List of Dictionaries and Other Objects by a Single Field

Thus fare I've demonstrated what I've referred to as simple sorting where the items being sorted are simple types like strings and integers. However, many software applications utilize more complex data structures within collections such as dictionaries and other objects (either built in or user defined). To sort these items you need to tell Python's sorted() and list.sort() constructs which fields within the item objects to sort by using the key keyword parameters and giving it a lambda to access each item with.

For example, say I have a list of dictionaries representing people with keys for name and age then sorting by name would look as follows.

>>> people = [{'name':'Kristin', 'age':32}, {'name':'Kallin', 'age':3}, {'name': 'Adam', 'age': 32}]
>>> sorted_people = sorted(people, key=lambda p: p['name'])
>>> sorted_people
[{'name': 'Adam', 'age': 32}, {'name': 'Kallin', 'age': 3}, {'name': 'Kristin', 'age': 32}]

Sorting by objects other than dictionaries is very similar but you use the dot accessor syntax like so.

>>> from collections import namedtuple
>>> Person = namedtuple('Person', ['name', 'age'])
>>> people_tups = [Person('Kristin', 32), Person('Kallin', 3), Person('Adam', 32)]
>>> sorted_people_tups = sorted(people_tups, key=lambda p: p.name)
>>> sorted_people_tups
[Person(name='Adam', age=32), Person(name='Kallin', age=3), Person(name='Kristin', age=32)]

Sorting Lists of Dictionaries and Other Objects by Multiple Fields

Another common but, only slightly more complex, sorting requirement is to sort by multiple fields of each object within a collection. This looks very similar to what was seen in the last section using the key keyword parameter of sorted() or list.sort() and specifying a lambda to access the fields to be sorted by except you return a tuple of fields. 

As always, an example works wonders for describing this so lets say I want to sort my people collections from the last section by both age and name in reverse order.

>>> people = [{'name':'Kristin', 'age':32}, {'name':'Kallin', 'age':3}, {'name': 'Adam', 'age': 32}]
>>> rev_age_name_sorted_people = sorted(people, key=lambda p: (p['age'], p['name']), reverse=True)
>>> rev_age_name_sorted_people
[{'name': 'Kristin', 'age': 32}, {'name': 'Adam', 'age': 32}, {'name': 'Kallin', 'age': 3}]

And for completeness I show the namedtuple example also.

>>> people_tups = [Person('Kristin', 32), Person('Kallin', 3), Person('Adam', 32)]
>>> rev_age_name_sorted_people_tups = sorted(people_tups, key=lambda p: (p.name, p.age), reverse=True)
>>> rev_age_name_sorted_people_tups
[Person(name='Kristin', age=32), Person(name='Kallin', age=3), Person(name='Adam', age=32)]

Resources for Learning More

thecodinginterface.com earns commision from sales of linked products such as the books above. This enables providing continued free tutorials and content so, thank you for supporting the authors of these resources as well as thecodinginterface.com

Conclusion

In this How To article I discussed sorting of collection-like Python data structures such as tuples, dictionaries, and lists and provided several concrete examples of both simple and moderately complex sorting use cases you are likely to run into as a Python developer. 

As always, I thank you for reading and feel free to commend or critique in the comments section below.

Share with friends and colleagues

[[ likes ]] likes

Navigation

Community favorites for Python

theCodingInterface