3.1. Lists#
Jesse London, Will Trimble, and Amanda R. Kube Jotte
Modern computing would be inconceivable without the ability to interact with a collection of multiple values at once, and more specifically the ordered sequence of values. In Python, the most common of these data types is called the list.
Lists are defined using commas to separate their elements, and enclosed by square brackets.
primes_abridged = [2, 3, 5, 7, 11, 13, 17, 19, 23]
primes_abridged
[2, 3, 5, 7, 11, 13, 17, 19, 23]
Identifying elements#
Lists not only enable us to address a sequence of values at once. They also permit us to refer to their individual elements:
primes_abridged[0]
2
primes_abridged[5]
13
Above, we retrieved individual elements from our list by writing the index of the element we wanted after the name assigned to the list, with the index itself enclosed in square brackets.
Remember we did something similar in Section 4 of Chapter 3 when referring to individual characters in a string.
Note
We referred to the first element of our list using the index 0, and the sixth element with the index 5.
Python is a 0-indexed language, meaning that counting starts at 0 rather than 1.
In programming, you can think of an index not as the “number” of the element, but as an offset from the beginning of the list.
0 means “no offset” (the very first element).
5 means “five steps away from the beginning” (the sixth element).
Even negative indices, or offsets, are valid:
primes_abridged[-1]
23
The index -1 always refers to the last element of a list, -2 to the second-to-last, and so on.
The table below summarizes how indices relate to elements of a list using the example from above where primes_abridged = [2,3,5,7,11,13,17,19,23].
| List element | 2 | 3 | 5 | 7 | 11 | 13 | 17 | 19 | 23 |
|---|---|---|---|---|---|---|---|---|---|
| Position in list | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th | 8th | 9th |
| Offset from first (forward) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| Offset from first (backward) | 0 | -8 | -7 | -6 | -5 | -4 | -3 | -2 | -1 |
| Python Index | 0 | 1 or -8 | 2 or -7 | 3 or -6 | 4 or -5 | 5 or -4 | 6 or -3 | 7 or -2 | 8 or -1 |
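The relationship between forward and negative indices can be checked directly. The short sketch below is illustrative only; it borrows the built-in len(), which is introduced later in this section, and the names n, last_forward, last_backward and first_backward are made up for the example.

```python
primes_abridged = [2, 3, 5, 7, 11, 13, 17, 19, 23]
n = len(primes_abridged)  # number of elements in the list (9)

# The last element can be reached from either direction.
last_forward = primes_abridged[n - 1]
last_backward = primes_abridged[-1]
print(last_forward, last_backward)  # 23 23

# Counting back a full list length also lands on the first element.
first_backward = primes_abridged[-n]
print(first_backward)  # 2
```

In general, for a list of length n, the negative index -k refers to the same element as the forward index n - k.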
Slices#
We can also retrieve a subset of the elements of our list using a slice.
primes_really_abridged = primes_abridged[2:5]
primes_really_abridged
[5, 7, 11]
In the syntax of the slice, we indicated that we would like to construct a new list, consisting of the elements at indices 2 through 4, ending before index 5.
That is, slices are constructed from a generic range. This range may be bounded or unbounded. The lower bound, or start, is always inclusive. The upper bound, or end, is always exclusive.
Warning
There are two common mistakes that students make when indexing or slicing lists. The first is to forget that Python is 0-indexed. The second is to forget that the lower bound of a slice is inclusive but the upper bound is exclusive.
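One way to keep the inclusive start and exclusive end straight is to notice two consequences of that rule. The sketch below is only an illustration, and the names start, end, middle, head and tail are invented for it.

```python
primes_abridged = [2, 3, 5, 7, 11, 13, 17, 19, 23]

start, end = 2, 5
middle = primes_abridged[start:end]
print(middle)  # [5, 7, 11]

# Because the end is exclusive, the slice holds exactly end - start elements.
print(len(middle) == end - start)  # True

# Splitting at the same index loses nothing and duplicates nothing.
head = primes_abridged[:3]
tail = primes_abridged[3:]
print(head + tail == primes_abridged)  # True
```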
The full signature of the slice includes a start, an end and a step. All of these arguments are optional; by default the slice starts with the first element of the list, ends after the last element of the list, and “steps” through the elements one by one. As such, the following three expressions evaluate to the same list.
primes_abridged[0:9:1]
[2, 3, 5, 7, 11, 13, 17, 19, 23]
primes_abridged[:]
[2, 3, 5, 7, 11, 13, 17, 19, 23]
primes_abridged[::]
[2, 3, 5, 7, 11, 13, 17, 19, 23]
We can of course supply some arguments and omit others.
Let’s slice our list starting with its second element – at index 1.
primes_abridged[1:]
[3, 5, 7, 11, 13, 17, 19, 23]
…Or ending before index 3.
primes_abridged[:3]
[2, 3, 5]
We can even specify negative indices.
Let’s start with the last element.
primes_abridged[-1:]
[23]
…Or, construct a list consisting of only the third-to-last and second-to-last elements.
primes_abridged[-3:-1]
[17, 19]
A slice can also indicate a step other than 1, such that some elements are stepped over, or skipped.
every_other_prime_abridged = primes_abridged[::2]
every_other_prime_abridged
[2, 5, 11, 17, 23]
We can even step through the list backwards.
primes_abridged[-1::-1]
[23, 19, 17, 13, 11, 7, 5, 3, 2]
Above, we started with the index of the last element, and told the slice to decrement this index by one for each subsequent element. (And, given this negative step, the list knew by default that the last element should be the first.)
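Since the start may be omitted as well, a negative step can be written more briefly. The following sketch, with illustrative variable names, also shows a larger backward step and confirms that slicing leaves the original list alone.

```python
primes_abridged = [2, 3, 5, 7, 11, 13, 17, 19, 23]

# Omitting the start gives the same result as starting at index -1.
backwards = primes_abridged[::-1]
print(backwards == primes_abridged[-1::-1])  # True

# A step of -2 walks backwards and skips every other element.
every_other_backwards = primes_abridged[::-2]
print(every_other_backwards)  # [23, 17, 11, 5, 2]

# Slicing builds a new list; the original is unchanged.
print(primes_abridged)  # [2, 3, 5, 7, 11, 13, 17, 19, 23]
```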
Aggregation#
Most important, lists enable us to instruct the computer to apply an operation to each element of the list in sequence.
And, most simply, we can count the number of elements in our list.
len(primes_abridged)
9
Better yet, we can sum the elements.
sum(primes_abridged)
100
Combining the two of these, we can compute an average or mean.
sum(primes_abridged) / len(primes_abridged)
11.11111111111111
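As a cross-check on the hand-computed mean, Python's standard library also ships a statistics module with a mean() function. The comparison below is a sketch rather than part of the original text, and the names manual_mean and library_mean are illustrative.

```python
import math
import statistics

primes_abridged = [2, 3, 5, 7, 11, 13, 17, 19, 23]

manual_mean = sum(primes_abridged) / len(primes_abridged)
library_mean = statistics.mean(primes_abridged)

# Compare with a tolerance rather than ==, since both results are floats.
print(math.isclose(manual_mean, library_mean))  # True
```

Note that both approaches fail on an empty list: the manual version divides by zero, and statistics.mean() raises its own error.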
The elements of multiple lists may be combined – or concatenated – forming a new list, with the addition operator.
primes_abridged + [29, 31, 37, 41]
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
And a list may even be trivially combined with itself – with the multiplication operator!
3 * primes_really_abridged
[5, 7, 11, 5, 7, 11, 5, 7, 11]
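Both operators build a new list and leave their operands untouched, which is easy to confirm. In this small sketch the names longer and tripled are made up for illustration.

```python
primes_really_abridged = [5, 7, 11]

longer = primes_really_abridged + [13]   # concatenation creates a new list
tripled = 3 * primes_really_abridged     # repetition creates a new list

print(longer)                  # [5, 7, 11, 13]
print(len(tripled))            # 9
print(primes_really_abridged)  # [5, 7, 11], the original is unchanged
```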
Mutation & Methods#
A list may be mutated – that is, manipulated without creating a new list – thanks in part to list methods (methods that work on list objects).
Note
Mutation is also known as manipulation “in place.”
Making changes to a list of data without creating a new list and assigning a new name is less straightforward, and may be prone to error.
However, mutation can also be integral to the process of constructing or building a list, and such methods can be useful when computing performance becomes important.
We might extend our list primes_really_abridged with the append() method.
primes_really_abridged.append(13)
…What happened? The above expression had no output – (literally, it evaluated to None, which is hidden by default).
Many of Python’s built-in mutational methods have no output, because none (or None!) is necessary.
Indeed, append() had the desired effect. The prime 13 has been added to the end of our list.
primes_really_abridged
[5, 7, 11, 13]
Note
In Python, None is a special value that represents the absence of a value. (It is the only value of its own data type, NoneType.)
It often appears when a function or method performs an action (like changing a list) but does not need to return anything. In those cases, Python returns None by default.
For example, append() changes the list directly and therefore has no return value — just None.
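To see the None for ourselves, we can capture what append() returns and print it. This sketch uses a throwaway list named primes and a variable named result, both chosen only for illustration.

```python
primes = [5, 7, 11]

result = primes.append(13)  # append() mutates primes and returns None
print(result)  # None
print(primes)  # [5, 7, 11, 13]

# A common slip is to write primes = primes.append(17).
# That would replace the list with None, because the name would be
# reassigned to append()'s return value.
```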
Let’s continue to make our list a little less abridged with the insert() method.
primes_really_abridged.insert(0, 3)
primes_really_abridged
[3, 5, 7, 11, 13]
Again, our mutation expression had no output.
But this time, we’ve added an element, 3, to the beginning of our list, by inserting it before the element that was at index 0.
Let’s add some more.
primes_really_abridged.append(19)
primes_really_abridged.append(23)
primes_really_abridged
[3, 5, 7, 11, 13, 19, 23]
Whoops – we forgot the prime 17.
Luckily, elements may be inserted into any position of a list.
But where do we need to insert 17? Let’s ask the list which index is 19 – appropriately, with the method index().
primes_really_abridged.index(19)
5
Unlike insert and append, index does not mutate the list. It only returns the requested value.
Now we can put 17 in its proper place.
missing_index = primes_really_abridged.index(19)
primes_really_abridged.insert(missing_index, 17)
primes_really_abridged
[3, 5, 7, 11, 13, 17, 19, 23]
Note that the insert() method accepts any index.
This includes negative indices.
primes_really_abridged.append(37)
primes_really_abridged
[3, 5, 7, 11, 13, 17, 19, 23, 37]
Huh, we skipped a prime again.
primes_really_abridged.insert(-1, 31)
primes_really_abridged
[3, 5, 7, 11, 13, 17, 19, 23, 31, 37]
Almost there….
primes_really_abridged.insert(-2, 29)
primes_really_abridged
[3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
That’s better.
insert() can even get a little silly….
primes_really_abridged.insert(1_000_000, 41)
primes_really_abridged
[3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
Above, we asked insert() to place the element 41 before the element at index 1,000,000, an index far beyond the end of our list.
What it did was place this value at the end of the list, that is, at index 11.
primes_really_abridged[11]
41
In other words, insert() did what we asked it to do.
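Put differently, insert() treats an index beyond either end of the list as “the end” or “the beginning”. The sketch below uses three small illustrative lists, a, b and c, to show both directions.

```python
a = [3, 5, 7]
a.insert(1_000_000, 11)  # index past the end: behaves like append()

b = [3, 5, 7]
b.append(11)
print(a == b)  # True

c = [3, 5, 7]
c.insert(-1_000_000, 2)  # index before the start: inserted at the front
print(c)  # [2, 3, 5, 7]
```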
Also note from the above, list elements must be contiguous – that is, each element is stored right after the previous one, and every index from 0 up to the last index refers to some element.
This means lists cannot be sparse. You cannot, for example, create a list that skips directly from index 0 to index 5 without filling in the elements in between.
numbers = [1, 2, 3]
numbers[5] = 99 # This won't work because there would be no index 3 or 4
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[32], line 2
1 numbers = [1, 2, 3]
----> 2 numbers[5] = 99 # This won't work because there would be no index 3 or 4
IndexError: list assignment index out of range
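If a value really must live at index 5, the gap has to be filled first. One possible approach, sketched here with None as a placeholder (a choice made for this example, as are the names target_index and padding_needed), reuses the + and * operators from earlier in this section.

```python
numbers = [1, 2, 3]
target_index = 5

# How many placeholder elements are needed so that index 5 exists?
padding_needed = target_index + 1 - len(numbers)  # 3

numbers = numbers + [None] * padding_needed
numbers[target_index] = 99
print(numbers)  # [1, 2, 3, None, None, 99]
```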
We can also mutate the order in which list elements appear. For example, reverse() will reverse the order of the elements.
primes_really_abridged.reverse()
primes_really_abridged
[41, 37, 31, 29, 23, 19, 17, 13, 11, 7, 5, 3]
If we want to put the elements back in order, we can reverse it again.
primes_really_abridged.reverse()
primes_really_abridged
[3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41]
We can also change the order of list elements using sort(). By default, sort() will sort list elements in ascending order.
unordered_primes = [3,5,1,7,2,11]
unordered_primes.sort()
unordered_primes
[1, 2, 3, 5, 7, 11]
If you want to sort in descending order, you can set reverse equal to True.
unordered_primes = [3,5,1,7,2,11]
unordered_primes.sort(reverse = True)
unordered_primes
[11, 7, 5, 3, 2, 1]
This also works with lists containing strings.
words = ["bears","sox","cubs"]
words.sort()
words
['bears', 'cubs', 'sox']
The sort method also has an argument named key, which can be used to apply a function to each list element and use the result to sort the elements. You can give it a built-in function or write your own. For example, this code will sort in increasing order of string length.
words = ["bears","sox","cubs"]
words.sort(key=len)
words
['sox', 'cubs', 'bears']
words = ["hello","apple","zebra"]
words.sort()
words
['apple', 'hello', 'zebra']
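A key function you write yourself works the same way as a built-in one. The helper last_letter below is hypothetical, written only for this sketch; it sorts the same words by their final character instead of their first.

```python
def last_letter(word):
    """Return the final character of a word (illustrative helper)."""
    return word[-1]

words = ["hello", "apple", "zebra"]
words.sort(key=last_letter)  # compares 'o', 'e' and 'a'
print(words)  # ['zebra', 'apple', 'hello']
```

Only the order changes. The key function's results are used for comparison, but the list still holds the original elements.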
By default, you cannot sort a list that mixes numbers and strings, because Python doesn’t know how to compare them.
prime_words = ["hello",7,2,"apple",11]
prime_words.sort()
prime_words
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[14], line 2
1 prime_words = ["hello",7,2,"apple",11]
----> 2 prime_words.sort()
3 prime_words
TypeError: '<' not supported between instances of 'int' and 'str'
However, if you choose a key function that returns comparable elements, you can sort a mixed list.
prime_words = ["hello",7,2,"apple",11]
prime_words.sort(key=str)
prime_words
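It is worth looking closely at what key=str does to the ordering. In the sketch below, every element is compared by its text form, so the numbers are ordered character by character rather than by numeric value.

```python
prime_words = ["hello", 7, 2, "apple", 11]
prime_words.sort(key=str)

# "11" sorts before "2" because the character "1" comes before "2",
# and digits sort before lowercase letters.
print(prime_words)  # [11, 2, 7, 'apple', 'hello']
```

The elements themselves are unchanged: 11, 2 and 7 are still integers after sorting.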
One list method which both mutates the list and returns a value is pop().
The pop() method removes an element from the list and returns this value.
This can be useful in moving elements from one list to another, or in otherwise making use of removed elements.
By default, pop() removes the last element from the list.
primes_really_abridged.pop()
41
primes_really_abridged
[3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
pop() may optionally be given the index of an element to remove.
primes_really_abridged.pop(0)
3
primes_really_abridged
[5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
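Because pop() hands back the element it removes, it combines naturally with append() to move elements from one list to another. This is a minimal sketch with two illustrative lists, waiting and processed.

```python
waiting = [5, 7, 11]
processed = []

# Each call removes the first waiting element and adds it to processed.
processed.append(waiting.pop(0))
processed.append(waiting.pop(0))

print(waiting)    # [11]
print(processed)  # [5, 7]
```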
Because lists are contiguous, and because the index of their first element is always 0, each element removed shifts the indices of the elements that followed it.
As such, when we remove the element at index 3, the element that followed it takes on that index, etc.
primes_really_abridged.pop(3)
primes_really_abridged.pop(3)
primes_really_abridged.pop(3)
primes_really_abridged.pop(3)
primes_really_abridged.pop(3)
primes_really_abridged.pop(3)
primes_really_abridged.pop(3)
primes_really_abridged
[5, 7, 11]
Of course, the above became fairly verbose.
We’ll learn more about concisely manipulating collections of data in subsequent chapters.
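In the meantime, the slice syntax from earlier in this section already offers one shorter route to the same three elements. Treat this as a sketch of an alternative, not a replacement for pop(): it builds a new list and reassigns the name rather than mutating the list in place.

```python
primes_really_abridged = [5, 7, 11, 13, 17, 19, 23, 29, 31, 37]

# Keep only the elements before index 3.
primes_really_abridged = primes_really_abridged[:3]
print(primes_really_abridged)  # [5, 7, 11]
```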
Mutability and Assignment#
Recall, we made a distinction between mutable and immutable objects in Section 3.3. Lists are the first mutable objects we’ve worked with in this book.
For immutable objects (numbers, strings, etc.), = is the assignment operator: the right-hand side of the expression is evaluated, and the name on the left-hand side of the = sign then refers to the result. Because such values can never be changed in place, this behaves just as if the value had been copied.
For mutable objects, = gives a new name to the same object. The data is not copied, and exists in only one place in the computer’s memory. The same data now has two names, both of which can be used to change it. This may have unintended effects:
list1 = [1, 2, 3]
list2 = list1  # this does not create a new list
list1[2] = 6   # there is only one list, so changing list1 also changes list2
print(list1)   # [1, 2, 6]
print(list2)   # [1, 2, 6]
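Python can tell us whether two names refer to the same object. The sketch below uses the is operator, which has not been introduced in this section; it tests identity (same object) rather than equality (same contents).

```python
list1 = [1, 2, 3]
list2 = list1  # a second name for the same list

print(list2 is list1)  # True: one object, two names

# A change made through either name is visible through the other.
list2.append(4)
print(list1)  # [1, 2, 3, 4]
```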
If you need a second copy of a mutable object, you must copy it explicitly. Lists, like Python’s other built-in mutable collections, have a copy() method that will create a new copy that can be altered separately from the original:
list3 = [1, 2, 3]
list4 = list3.copy()  # an independent copy of the list
list3[2] = 8
print(list3)  # [1, 2, 8]
print(list4)  # [1, 2, 3]
# Success! list3 can finally have different values from list4
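The copy() method is not the only way to get an independent list. As a sketch, a full slice and the built-in list() constructor (the latter is not covered in this section) produce new lists too. The variable names below are illustrative.

```python
original = [1, 2, 3]

via_method = original.copy()
via_slice = original[:]      # a slice always builds a new list
via_list = list(original)    # list() builds a new list from the elements

original[0] = 100  # mutate the original only

print(via_method)  # [1, 2, 3]
print(via_slice)   # [1, 2, 3]
print(via_list)    # [1, 2, 3]
print(via_slice is original)  # False
```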
Other Lists#
So far we’ve mostly considered lists of numbers, but in fact list elements may refer to values of any type – and even of multiple types at once.
planets = [
'Mercury',
'Venus',
'Earth',
'Mars',
'Jupiter',
'Saturn',
'Uranus',
'Neptune',
]
In the above, we’ve constructed a list of the planets, whose elements are, of course, strings.
Depending on when you attended grade school, you may prefer the following planetary listing:
planets + ['Pluto']
['Mercury',
'Venus',
'Earth',
'Mars',
'Jupiter',
'Saturn',
'Uranus',
'Neptune',
'Pluto']
But this needn’t worry the International Astronomical Union! We didn’t use append(), so the planets are safe.
planets
['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Uranus', 'Neptune']
Lists can contain multiple data types at the same time.
planets.append(True)
planets.append(8)
planets
['Mercury',
'Venus',
'Earth',
'Mars',
'Jupiter',
'Saturn',
'Uranus',
'Neptune',
True,
8]
Lists can even contain other lists of data.
# distances from the sun, millions of miles:
planetary_distances = [
['Mercury', 36],
['Venus', 67.2],
['Earth', 93],
['Mars', 141.6],
['Jupiter', 483.6],
['Saturn', 886.7],
['Uranus', 1_784.0],
['Neptune', 2_794.4],
]
planetary_distances
[['Mercury', 36],
['Venus', 67.2],
['Earth', 93],
['Mars', 141.6],
['Jupiter', 483.6],
['Saturn', 886.7],
['Uranus', 1784.0],
['Neptune', 2794.4]]
In the above collection, we’ve combined lists, strings, integers and floats!
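Elements of a nested list are reached by indexing twice: once for the inner list, and once for the element within it. The sketch below uses a shortened copy of the planetary data so that it can be run on its own. The name earth is illustrative.

```python
# distances from the sun, millions of miles (first three planets only)
planetary_distances = [
    ['Mercury', 36],
    ['Venus', 67.2],
    ['Earth', 93],
]

earth = planetary_distances[2]      # the inner list at index 2
print(earth)                        # ['Earth', 93]

print(planetary_distances[2][0])    # Earth
print(planetary_distances[2][1])    # 93
print(planetary_distances[-1][1])   # 93, negative indices work here too
```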
Note
Earlier we learned that we need to use the copy() method to make duplicates of mutable objects. However, copy() makes only a shallow copy, meaning it duplicates the outer object but not the nested objects inside it. If the object contains other mutable objects (like lists of lists), changes to those inner objects will still affect both copies.
To avoid this, you can make a deep copy using the deepcopy() function from the copy module in Python’s standard library. Try the code below on your own computer:
import copy
nested = [[1, 2], [3, 4]]
shallow = nested.copy()
deep = copy.deepcopy(nested)
nested[0][0] = 99
print(shallow) # [[99, 2], [3, 4]]
print(deep) # [[1, 2], [3, 4]]
This kind of nested structure, if constructed arbitrarily, can be difficult to use; but, constructed consistently, this is the basis of computational data and data science.
But lists are not the only kind of useful collection available to us in Python, and we’ll explore a few!