Iteration and Simulation

When we want to repeat a block of statements or expressions multiple times, we can use the for control statement. Before we continue with the example from the previous section, we’ll focus on the foundations of this iteration tool.

A for statement can iterate through a sequence to perform some action on each of its elements. This sequence can really be any iterable object, including a list, a string, or a range of numbers, to name a few. The general form of a for statement is below:

for item in sequence:
    action    

Notice we specify a name to assign to the value of each of the sequence’s items – here item. This name is assigned to each of the values in the sequence, sequentially. And for each of these assignments – or “loops” – the indented body of the for statement is executed (here “action”).

For example, to print out each element in a given list we might use the following code.

adjectives = ["red", "rotten", "tasty"]

for word in adjectives:
    print(word)
red
rotten
tasty

The above dictates that for each element word in the list adjectives, we execute the indented body: print(word).

The name word was assigned the value of each element in the list in turn. The choice of name does not matter, so we could just as well use a different one. The for statement below also prints out each element in the list adjectives.

for item in adjectives:
    print(item)
red
rotten
tasty
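The same pattern applies to the other iterables mentioned earlier. As a minimal sketch, the loop below iterates over a string, so the loop variable takes the value of each character in turn (the names text and letter are chosen only for illustration).

```python
# Iterating over a string visits its characters one at a time.
text = "red"

for letter in text:
    print(letter)  # prints r, then e, then d, each on its own line
```

After the loop finishes, letter still holds the last value it was assigned, here "d".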

Note that what we iterate over does not need to be directly related to the body of the for statement. In fact, for statements are useful to simply execute the body or action a given number of times.

In the first example below, the value of the loop variable i is used in the body of the for statement; in the second example, the loop variable is ignored, and the for statement merely repeats the body a certain number of times.

for i in range(3):
    print(i)
0
1
2

for i in range(3):
    print('potato')
potato
potato
potato

Instead of printing, we might use a loop to count how often something occurs. Below we create a list of colors, and we want to know how many of them contain the letter ‘u’. For each element in the list we want to

  1. check: ‘Is there a u?’

  2. if a ‘u’ exists, count that color.

To do this, we need to update a variable that is defined outside the loop. That is, we define a count variable that we can increase each time the condition is met. Since no colors with a ‘u’ have been found before the loop starts, we set count = 0. Each time a ‘u’ is found in the loop, we overwrite count with its previous value plus one.

color_list = ['red','orange', 'green', 'blue', 'yellow', 'purple', 'brown']
count = 0

for color in color_list:
    if 'u' in color:
        count = count + 1

Printing the value of count after we run the loop gives us how many colors in our list have a ‘u’.

print(count)
2
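The counting pattern above could also be wrapped in a function so that it works for any letter. This is only a sketch: count_containing is a hypothetical name introduced here, not something defined elsewhere in this text.

```python
def count_containing(words, letter):
    """Return how many strings in words contain letter."""
    count = 0                  # nothing has been found before the loop starts
    for word in words:
        if letter in word:
            count = count + 1  # one more word contains the letter
    return count

color_list = ['red', 'orange', 'green', 'blue', 'yellow', 'purple', 'brown']

print(count_containing(color_list, 'u'))  # 2 (blue, purple)
print(count_containing(color_list, 'e'))  # 6 (every color except brown)
```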

Nested for loops

Suppose we want to repeat a process within each iteration of another repeated process. We can accomplish this by nesting a for statement within a for statement.

For example, suppose we have two lists, and we want to pair every element in list_1 with every element in list_2. We could write out by hand all the possible combinations pairing elements of list_1 with list_2 … or, we could use nested for statements to systematically consider each element in list_2 for each element in list_1.

This takes the following form:

for item_1 in list_1:
    for item_2 in list_2:
        print(item_1, item_2)

For a more concrete example, consider the list of adjectives above, and a new list of fruits. We’ll use a nested for statement to print all possible adjective-fruit combinations.

fruits = ["apple", "banana", "cherry"]
for adjective in adjectives:
    for fruit in fruits:
        print(adjective, fruit)
red apple
red banana
red cherry
rotten apple
rotten banana
rotten cherry
tasty apple
tasty banana
tasty cherry
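Rather than printing each combination, we might want to keep the pairs for later use. The sketch below collects them in a list (pairs is a name chosen for illustration) and checks that the nested loop visits every combination: 3 adjectives times 3 fruits gives 9 pairs.

```python
adjectives = ["red", "rotten", "tasty"]
fruits = ["apple", "banana", "cherry"]

pairs = []
for adjective in adjectives:
    for fruit in fruits:
        # the inner loop runs in full for each adjective
        pairs.append((adjective, fruit))

print(len(pairs))                                   # 9
print(len(pairs) == len(adjectives) * len(fruits))  # True
```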

Alternatively, we could pick random combinations, though in doing so we cannot guarantee that each combination will be distinct.

With combos = 5 below, the for statement prints a random adjective paired with a random fruit once for each number in np.arange(combos); in effect, it repeats the selection combos (here 5) times.

Feel free to experiment with the number of combinations to print below.

import numpy as np

combos = 5 

for i in np.arange(combos):
    print(np.random.choice(adjectives), np.random.choice(fruits))
rotten cherry
red cherry
tasty cherry
red apple
rotten cherry
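Because the selections are random, the output above will differ from run to run. If repeatable output is wanted, one option is a seeded random number generator. The sketch below uses np.random.default_rng instead of np.random.choice; the seed value 42 and the name pairs are arbitrary choices made for this illustration.

```python
import numpy as np

adjectives = ["red", "rotten", "tasty"]
fruits = ["apple", "banana", "cherry"]

rng = np.random.default_rng(seed=42)  # same seed, same sequence of choices
combos = 5

pairs = []
for i in np.arange(combos):
    pairs.append((rng.choice(adjectives), rng.choice(fruits)))

for adjective, fruit in pairs:
    print(adjective, fruit)
```

Running this cell again with the same seed should reproduce the same five pairs; changing or removing the seed gives different ones.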

If we want to iterate through a list or other object but only perform an action on specific values in that list, we can use for statements in combination with if statements. For example, to print out all items in the list adjectives with 5 or fewer characters, we use a for statement to loop through each individual item in the list and check whether its length satisfies the inequality.

for adjective in adjectives:
    if len(adjective) <= 5:
        print(adjective)
red
tasty

List and Dictionary Comprehension

In Section 4.1: Lists we created the following list of prime numbers:

primes_abridged = [2, 3, 5, 7, 11, 13, 17, 19, 23]
primes_abridged
[2, 3, 5, 7, 11, 13, 17, 19, 23]

We can process this data to make a new list, or mapping.

Consider the following expression, which creates a (partial) mapping of primes_abridged, with its elements doubled:

primes_doubled = [
    2 * primes_abridged[0],
    2 * primes_abridged[1],
]

primes_doubled
[4, 6]

The above form works, strictly speaking, but it would require a lot of typing to process all the elements of primes_abridged! Moreover, an expression of this form assumes a fixed number of elements: it will fail if there are too few elements, and it will ignore any additional ones.
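To make both failure modes concrete, here is a small sketch using two hypothetical lists, short_list and long_list. It relies on try and except to catch the error, which is used here only so that the cell keeps running.

```python
short_list = [2]            # hypothetical list with too few elements
long_list = [2, 3, 5, 7]    # hypothetical list with extra elements

try:
    doubled_short = [2 * short_list[0], 2 * short_list[1]]
except IndexError:
    # short_list has no element at index 1, so the expression fails
    doubled_short = None
    print("too few elements")

# The same expression runs on long_list but silently ignores 5 and 7.
doubled_long = [2 * long_list[0], 2 * long_list[1]]
print(doubled_long)  # [4, 6]
```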

We can instead write the following:

primes_doubled = []

for prime in primes_abridged:
    primes_doubled.append(2 * prime)
    
primes_doubled
[4, 6, 10, 14, 22, 26, 34, 38, 46]

In the above cell:

  1. First, we initialized an empty list and assigned it the name primes_doubled.

  2. Then, using the control structure of the “for loop,” we considered the elements of primes_abridged sequentially, in each “loop” assigning one of its elements the name prime.

  3. Finally, in each loop, we multiplied the value of the element by two and added the result to the end of the list primes_doubled, using the list method append.

This is a fundamental pattern in most programming languages, and one that applies to more than just lists.

That said, there are other ways to instruct the computer to do the same, which you may find more succinct (linguistically), more efficient (computationally), or both.

The most “Pythonic” way of producing primes_doubled involves the list comprehension.

primes_doubled = [2 * prime for prime in primes_abridged]

primes_doubled
[4, 6, 10, 14, 22, 26, 34, 38, 46]

In the list comprehension, we again used the keywords for and in, in order to consider the elements of primes_abridged sequentially, under the name prime. But in this case, we were able to construct primes_doubled with a single, clear expression of our mapping from one set of data to another.

We can use dictionary comprehensions to create dictionaries in much the same way as list comprehensions. Instead of the [] used in a list comprehension, a dictionary comprehension uses {} and gives a key: value pair for each element.

{prime:2 * prime for prime in primes_abridged}
{2: 4, 3: 6, 5: 10, 7: 14, 11: 22, 13: 26, 17: 34, 19: 38, 23: 46}
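Comprehensions can also include an if clause, which plays the same role as the if statement inside the earlier loop over adjectives. The sketch below repeats that filter as a list comprehension and builds a dictionary of word lengths; the names short_adjectives and adjective_lengths are introduced here for illustration.

```python
adjectives = ["red", "rotten", "tasty"]

# Keep only the adjectives with 5 or fewer characters.
short_adjectives = [adjective for adjective in adjectives if len(adjective) <= 5]
print(short_adjectives)   # ['red', 'tasty']

# Map each adjective to its length.
adjective_lengths = {adjective: len(adjective) for adjective in adjectives}
print(adjective_lengths)  # {'red': 3, 'rotten': 6, 'tasty': 5}
```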

Simulating the six-sided die

In the last section we experimented with finding the number of even dice rolls when rolling a six-sided die 100 times. Now we can simulate repeating this experiment many times using the process of iteration.

Below we’ll redefine the six-sided die and other relevant items from the last section.

die = np.arange(1, 7)

def parity(input_integer): 
    '''Assumes integer input
    Returns even or odd'''
    if (input_integer % 2) == 0:
        return "even"
    else:
        return "odd"
    
vec_parity = np.vectorize(parity) 
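As a quick check of these definitions, the sketch below applies vec_parity to each face of the die and counts the even labels. The definitions are repeated so that the cell runs on its own; the name labels is chosen for illustration.

```python
import numpy as np

def parity(input_integer):
    '''Assumes integer input
    Returns even or odd'''
    if (input_integer % 2) == 0:
        return "even"
    else:
        return "odd"

vec_parity = np.vectorize(parity)
die = np.arange(1, 7)

labels = vec_parity(die)      # applies parity to every element of die
print(labels)                 # ['odd' 'even' 'odd' 'even' 'odd' 'even']
print(sum(labels == 'even'))  # 3
```

Comparing labels to 'even' produces an array of True and False values, and summing it counts the True entries. This is the same counting step used in the simulation loop.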

Now we’ll set the desired number of iterations for this experiment and create a for loop to execute the experiment of rolling a die 100 times, checking whether the rolls are even, and then appending the number of evens to an array.

After the experiments are simulated, we’ll use the array of results to summarize our experiment. Finding the minimum and maximum number of evens rolled out of the total 100 rolls as well as the average of evens gives us useful and interesting information.

In the code below, we use np.empty(0) to create an empty array. As we iterate through the loop, we add the results of our experiment to this array.

Experiment below with different values for num_experiments, but be careful as it is easy to set up a long run time!

import numpy as np
np.arange(4)
array([0, 1, 2, 3])

num_experiments = 10_000

total_evens = np.empty(0)

# Below we iterate through the array np.arange(num_experiments)
# alternatively we could use range(num_experiments)

for i in np.arange(num_experiments): 
    choices = np.random.choice(die, 100)
    labels = vec_parity(choices)    
    total_evens = np.append(total_evens, sum(labels == 'even'))


# Since these aren't indented, they are outside of the "for" loop, and executed after it's done:

print('Number of experiments: {:,}'.format(len(total_evens)))
print('Min Evens (out of 100):', min(total_evens))
print('Max Evens (out of 100):', max(total_evens))
print('Mean Evens (out of 100):', round(np.mean(total_evens)))
Number of experiments: 10,000
Min Evens (out of 100): 31.0
Max Evens (out of 100): 69.0
Mean Evens (out of 100): 50

In later chapters, we will learn to visualize and explore this data in more detail. For now, if you want to see the contents of the result array you can run the following line:

total_evens
array([52., 45., 55., ..., 49., 53., 59.])

The process of simulating an experiment is much faster than performing the task and recording the results by hand. The for loop is essential in this action! It allows us to quickly get data that mimics a real world situation.
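The long run times mentioned earlier come partly from np.append, which builds a new array every time it is called. One possible alternative, sketched here under assumptions that are not part of the code above, is to create the results array at its final size and fill it in by index. The seeded generator, the smaller num_experiments, the name rolls_per_experiment, and the direct use of % 2 in place of vec_parity are all choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, for repeatable output
die = np.arange(1, 7)

num_experiments = 1_000
rolls_per_experiment = 100

# Preallocate one integer slot per experiment instead of appending.
total_evens = np.zeros(num_experiments, dtype=int)

for i in range(num_experiments):
    rolls = rng.choice(die, rolls_per_experiment)
    total_evens[i] = np.sum(rolls % 2 == 0)  # count the even rolls

print('Number of experiments: {:,}'.format(len(total_evens)))
print('Min Evens (out of 100):', total_evens.min())
print('Max Evens (out of 100):', total_evens.max())
print('Mean Evens (out of 100):', round(total_evens.mean()))
```

Here the loop variable i is used as the position in which to store each result, so the for loop both repeats the experiment and records where its outcome belongs.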