2.4. Strings

Amanda R. Kube Jotte

A string is a data type that represents a sequence of characters, such as letters, numbers, or punctuation. Strings in Python are written inside quotation marks — single (' '), double (" "), or triple (''' ''' or """ """).

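For example, I can create a string by placing the text that I want between single, double, or triple quotes, as shown below. All three quoting styles produce the same type of value: a string. Notice that Python displays each result in single quotes, no matter which style I typed.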

'This is a sentence.'
'This is a sentence.'
"This is also a sentence."
'This is also a sentence.'
'''This is a third sentence.'''
'This is a third sentence.'

The code cells above show output printed below them because they contain an expression. Recall from the previous section that expressions are pieces of code that are not assignment statements (that is, they do not use an =).

Note

If an expression is the only line in a code cell, or if it is the last line in a cell, its result will automatically display below the cell. When this happens, we say that the code is printing to the console. You can think of this like your calculator showing you the result after you type something in.

Creating and Displaying Strings

Usually, we wouldn’t just write a string in a code cell like this. If we want to print something to the console, we can use the print() function. You’ve seen this function before, but we have not properly introduced it. The print() function converts its input(s) to a string, and prints it. It does not need to be the only line or the last line to print to the console. It can be anywhere in your code.

var = 3 + 2
print('Today is a sunny day in Chicago.')
var
Today is a sunny day in Chicago.
5

The cell above creates a variable named var and prints the string 'Today is a sunny day in Chicago.'. Then, because the last line of the cell is the expression var, its value (in this case 5) is also displayed. So we see two outputs above.

You will see later in this book that printing things to the console can be very useful. For now, it is helpful to start learning how the print() function works and how strings can be used in Python.

When creating strings in Python, inside or outside of print(), it is recommended to use double quotes instead of single quotes, because double quotes let you use an apostrophe inside the string. In the example below, the first cell works, but we get an error message when we try to use an apostrophe inside single quotes.

print("Today's the day!")
Today's the day!
print('Today's the day!')
  Cell In[6], line 1
    print('Today's the day!')
                           ^
SyntaxError: unterminated string literal (detected at line 1)

While the above error can be fixed by wrapping the string in double quotes in place of the single quotes, it can also be fixed by an escape sequence. Escape sequences are string modifiers that allow for the use of certain characters that would otherwise be misinterpreted by Python. Because strings are created by the use of quotes, the escape sequences \' and \" allow for the use of quotes as part of a string:

print('Today\'s the day!')
Today's the day!
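The same idea applies to double quotes. As a small sketch (the variable names quote_escaped and quote_mixed are made up for this example), a double quote inside a double-quoted string can be escaped with \", or the whole string can be wrapped in single quotes instead:

```python
# Escape the inner double quotes with a backslash
quote_escaped = "She said, \"Today's the day!\""

# Or use single quotes on the outside; now the apostrophe needs the escape
quote_mixed = 'She said, "Today\'s the day!"'

print(quote_escaped)
print(quote_mixed)
print(quote_escaped == quote_mixed)  # both spellings create the same string
```

Which version to use is mostly a matter of which one is easier to read.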

Other useful escape sequences include \n and \t. These allow for a new line and tab spacing to be added to a string, respectively.

print("This is the first sentence. \nThis sentence is on a new line! \tThis sentence comes after a tab.")
This is the first sentence. 
This sentence is on a new line! 	This sentence comes after a tab.
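Because the backslash starts an escape sequence, a backslash that should appear in the text itself needs its own escape sequence, \\. The sketch below uses a made-up folder name to show why this matters:

```python
# "\\" is the escape sequence for one backslash character
folder = "C:\\new_folder\\table.csv"  # hypothetical file location
print(folder)

# With single backslashes, Python would read "\n" and "\t" in this text
# as a newline and a tab instead of as part of the folder name.
```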

Triple quotes are useful if you want to create a string that spans multiple lines.

my_par = '''This is line 1.
This is line 2.
This is line 3.
'''
print(my_par) # display the text using the print() function
my_par # display the text without the print function
This is line 1.
This is line 2.
This is line 3.
'This is line 1.\nThis is line 2.\nThis is line 3.\n'

Above, I created a variable named my_par that stores three sentences, each separated by a newline. I then used the print() function to display the text, and also included my_par on the last line of the cell to show how Python displays it directly.

Note

The two outputs are formatted differently.

  • Without print(), Python shows the text literally—including the special newline character (\n) instead of starting a new line, and with quotation marks around the text.

  • With print(), Python formats the text in a more natural way: it starts new lines where the \n appears, and it removes the quotation marks.
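One way to see both forms side by side is the built-in repr() function, which returns the quoted form of a string with its escape sequences visible. This sketch assumes that the automatic display in a notebook matches repr() for strings, which is the usual behavior. The name short_par is made up for this example.

```python
short_par = "Line 1.\nLine 2."

shown_form = repr(short_par)  # the form with quotes and a visible \n
print(shown_form)             # displays 'Line 1.\nLine 2.' on one line
print(short_par)              # displays two separate lines, no quotes
```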

Formatting text nicely isn’t the only benefit of using the print() function. Recall, the print() function converts its input(s) (also called arguments, see the next section for more information) to a string, and prints it. So far, we’ve given the function only one input – meaning we have put one string inside the parentheses. We could give it multiple strings:

print("This is the first input.", "This is another input.") # We separate inputs by commas

# The function separates inputs with a space and makes them one large string
print("There can", "also be", "three or more inputs.") 
This is the first input. This is another input.
There can also be three or more inputs.

The inputs do not even have to be strings. Recall, we did this in Section 3.2.

# The second input here is a Boolean. Python makes it a string and combines the strings
print("1.5 == 0.8:", 1.5 == 0.8)
1.5 == 0.8: False
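To make the conversion step visible, here is a small sketch that does by hand what print() does for us. It uses the str() function to turn the Boolean into text first. The variable names are made up for this example.

```python
comparison = 1.5 == 0.8    # a Boolean value
as_text = str(comparison)  # the text that print() displays for it

print("1.5 == 0.8:", comparison)   # print() converts and adds the space
print("1.5 == 0.8: " + as_text)    # the same line built by hand
```

Both lines display the same text, but the first one is shorter to write.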

String Operations

Joining together strings is also called concatenation. The print() function is not the only way to join together strings. We can concatenate strings using the mathematical operators: + and *.

str1 = "Moses supposes"
str1
'Moses supposes'
str2 = "his toeses are roses."
str2
'his toeses are roses.'
str1 + str2
'Moses supposeshis toeses are roses.'

When we concatenate strings using +, it is done very literally. As there is no space at the end of str1 or at the beginning of str2, when we join them together, there is no space between supposes and his. We can fix this by either changing the strings or concatenating a space in the middle.

str1 + " " + str2
'Moses supposes his toeses are roses.'

You can think of + as squishing together the strings on either side of the operator.
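It can also help to know that + builds a new string and leaves the original strings alone. A minimal sketch, using the made-up names first, second, and combined:

```python
first = "Moses supposes"
second = "his toeses are roses."

combined = first + " " + second  # a new string is created here

print(combined)
print(first)   # the original strings are unchanged
print(second)
```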

The + operator can be used with numeric data or with strings but NOT with a combination of both.

3 + 2.75 # This will work
5.75
"Hello " + "world!" # This will also work
'Hello world!'
9 + " lives" # This will produce a TypeError
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[39], line 1
----> 1 9 + " lives" # This will produce a TypeError

TypeError: unsupported operand type(s) for +: 'int' and 'str'

If you want to concatenate numbers and text, you first have to turn the numeric data into a string. You can do this by placing it in quotation marks like below:

"9" + " lives"
'9 lives'

Or you can use the str() function to create a string.

str(9) + " lives"
'9 lives'

Keep in mind that when a numerical value is converted to a string, it can no longer be used to perform certain mathematical calculations, such as division, subtraction, or exponentiation.

"2" ** 2 # this will produce a TypeError
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[42], line 1
----> 1 "2" ** 2 # this will produce a TypeError

TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'
# This is concatenation not addition
"2" + "2"
'22'
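If the calculation is what you want, you can convert the string back to a number first with int() or float(). A short sketch, where text_number is a made-up variable name:

```python
text_number = "2"

squared = int(text_number) ** 2   # int("2") is the integer 2, so this is 4
halved = float(text_number) / 2   # float("2") is 2.0, so this is 1.0

print(squared)
print(halved)
```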

We can also use the * operator on strings. This will concatenate the string with copies of itself. For example, the code below concatenates two copies of the string.

"waka" * 2
'wakawaka'

We can “multiply” strings by any integer.

"waka" * 7
'wakawakawakawakawakawakawaka'
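A few more cases, sketched with made-up variable names: repetition is handy for drawing a divider line, it can be combined with +, and multiplying by zero or by a negative integer gives an empty string.

```python
divider = "-" * 20            # a line of 20 dashes
chant = ("waka" + " ") * 3    # add a space first, then repeat
empty = "waka" * 0            # zero copies: the empty string ''
also_empty = "waka" * -3      # negative numbers also give ''

print(divider)
print(chant)
print(empty == "", also_empty == "")
```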

But, we cannot use floats:

"waka" * 1.5
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[46], line 1
----> 1 "waka" * 1.5

TypeError: can't multiply sequence by non-int of type 'float'

Formatted Strings

When we want to combine text with the value of a variable, it can be awkward to use multiple print() statements or string concatenation (+). Python provides a very convenient feature called an f-string (short for formatted string).

To make an f-string, put the letter f right before the quotation marks. Inside the string, you can include variables or even expressions inside curly braces {}. Python will replace the braces with the value.

For example:

name = "Katie"
age = 20

print(f"My name is {name} and I am {age} years old.")
My name is Katie and I am 20 years old.

This is easier to read and write than doing:

print("My name is " + name + " and I am " + str(age) + " years old.")
My name is Katie and I am 20 years old.

With f-strings, you can also put expressions directly inside the curly braces.

For example:

print(f"Next year I will be {age + 1} years old.")
Next year I will be 21 years old.

You can also format numbers directly in an f-string. For instance, if you want to show only two decimal places:

pi = 3.14159
print(f"Pi rounded to two decimals is {pi:.2f}")
Pi rounded to two decimals is 3.14

Or if you want to show 4 decimal places:

print(f"Pi rounded to four decimals is {pi:.4f}")
Pi rounded to four decimals is 3.1416
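The part after the colon can do more than set the number of decimal places. As a sketch, here are two other format codes that Python supports: a comma to group thousands and % to display a proportion as a percentage. The values and variable names are made up for this example.

```python
price = 1234.5
share = 0.256

price_text = f"{price:,.2f}"  # comma separator and two decimals
share_text = f"{share:.1%}"   # multiplied by 100, one decimal, % sign

print(f"The price is ${price_text}.")
print(f"The share is {share_text}.")
```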

Warning

A common mistake that students make is forgetting the f at the beginning.

name = "Alex"
print("Hello {name}")   # Forgot the f

You will know you made this mistake if you see the curly braces and the variable name (i.e., {name}) printed in your string instead of the value of the variable.
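Here is a side-by-side sketch of the mistake and the fix. The names without_f and with_f are made up for this example.

```python
name = "Alex"

without_f = "Hello {name}"   # a regular string: the braces are kept as text
with_f = f"Hello {name}"     # an f-string: {name} is replaced by its value

print(without_f)  # Hello {name}
print(with_f)     # Hello Alex
```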

Checking and Changing Data Types

If you need to confirm the data types of the values you are using, you can use the type() function that we learned in Section 3.1. The code below will print the data types of "waka", 1.5, and 7 which we should see are string (or str), float, and integer (or int).

print(type("waka"))
print(type(1.5))
print(type(7))
<class 'str'>
<class 'float'>
<class 'int'>

Often, this is more useful if we’ve created variables to store these values and have forgotten what type of data we stored.

my_var = 4 < 7
my_other_var = "It's raining cats and dogs."
type(my_var) # we don't have to print the result if it's the last/only line
bool
type(my_other_var)
str

We’ve seen the int(), float(), and str() functions which can change the data type of their input. We can also use these to convert numerical values inside strings to integers and floats.

print(int('45'))
print(float('45'))
45
45.0

Remember, the int() and float() functions can only convert recognized numerical values. A string of letters cannot be converted to a float or integer.

int('Sorry')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[57], line 1
----> 1 int('Sorry')

ValueError: invalid literal for int() with base 10: 'Sorry'
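A less obvious case is a string that holds a decimal number. The sketch below (with made-up variable names) shows that float() accepts it, while int() needs the value to be converted to a float first.

```python
decimal_text = "45.7"

as_float = float(decimal_text)  # works: 45.7
as_int = int(as_float)          # works: 45, the decimal part is dropped

print(as_float)
print(as_int)

# int(decimal_text) would raise a ValueError, because "45.7"
# is not written as a whole number.
```

Note that int() cuts off the decimal part instead of rounding.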

Indexing Strings

Just as sentences are made up of letters, strings are made up of characters. Each of these characters is a letter, number, space, or symbol inside the quotation marks. For example, the string below is a sequence of 12 characters. The period and the space count as characters.

chars = "Soft Kitten."

If we want to check how many characters are in the string, we can use the function len() which gives the length of the string.

len(chars)
12

We can access particular characters inside of the string by doing something called indexing which we will discuss in much more detail in the next chapter. For now, notice that I can use square brackets ([]) with a number inside to indicate a particular character in the sequence.

chars[0] # In Python we count starting with 0
'S'
chars[1]
'o'
chars[8] # S is at index 0, so index 8 is the second t in Kitten
't'
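Because counting starts at 0, the last character sits at the position one less than the length of the string. A minimal sketch, where last_position and last_char are made-up names:

```python
chars = "Soft Kitten."

last_position = len(chars) - 1    # 12 characters, so the last index is 11
last_char = chars[last_position]  # the period

print(last_position)
print(last_char)

# chars[12] would raise an IndexError, because there is no
# character at position 12.
```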

String Comparisons

In Section 3.2 we showed examples of comparisons of integers and floats, but strings can also be used with comparison operators.

a = 'Dan'
b = 'Mike'

print("a == b:", a == b)
print("a != b:", a != b)
print("a < b:", a < b)
print("a <= b:", a <= b)
print("a > b:", a > b)
print("a >= b:", a >= b)
a == b: False
a != b: True
a < b: True
a <= b: True
a > b: False
a >= b: False

As you can see, Python can compare two strings and check if they are equal or not. It can also use inequality operators (<, >, <=, >=) to compare them. The order is based on the numeric code that underlies each letter, digit, and symbol. Python uses Unicode codes, which match the ASCII values for English letters, digits, and common symbols. Strings are compared one character at a time, from left to right, until a difference is found. This is called lexicographic order, and it works like a “dictionary order” based on those codes. For more information on ASCII, see Chapter 29 Section 6.

Letter case matters in comparisons: uppercase letters (like "A") come before lowercase letters (like "a") because their ASCII values are smaller. For example, ‘Dan’ and ‘dan’ are not considered the same strings as the capitalization is different.

'Dan' == 'dan'
False

In addition, ALL uppercase letters come before ALL lowercase letters. So ‘M’ comes before ‘d’, even though d comes before m in the alphabet.

'dan' < 'Mike'
False
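If you are curious about the codes behind these results, the built-in ord() function returns the numeric code of a single character. The sketch below also shows a common surprise: digits inside strings are compared character by character, not as numbers. The variable names are made up for this example.

```python
code_D = ord("D")  # 68
code_M = ord("M")  # 77
code_d = ord("d")  # 100

print(code_D, code_M, code_d)

text_compare = "10" < "9"   # True: the character "1" comes before "9"
number_compare = 10 < 9     # False: compared as numbers

print(text_compare, number_compare)
```

This is one reason to convert numeric text with int() or float() before comparing it.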

Python gives us lots of ways to work with strings. But typing out all the steps ourselves can get tedious. What if we want to quickly make all the letters in a string lowercase? Or replace every space with a period?

Python makes these tasks easier by giving us functions and methods — reusable “mini-programs” that do a specific job for us. Functions and methods aren’t just for strings. For example, they let us round numbers, find the mean of a dataset, and perform countless other operations with just one line of code.

In the next chapter, we’ll start exploring functions and methods and see how they can save us time, make our code cleaner, and unlock a whole new level of power in Python.