Glossary of Code

Glossary of Code#

Index#

A | B | C | D | E | F | H | I | K | L | M | N | O | P | R | S | T | V | W

A#

abs(…)#

Returns the absolute value of the given number.

abs(-7)   # 7

Learn more in Chapter 2 | Back to Top

Arithmetic Operators#

+ : Addition
- : Subtraction
* : Multiplication
/ : Division (always returns a float)
// : Floor Division (rounded down to nearest whole number)
% : Modulus (remainder after division)
** : Exponentiation (power)
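
As a quick illustration, each operator can be tried on a pair of numbers. The values 7 and 2 below are arbitrary and chosen only for this sketch.

```python
a, b = 7, 2  # arbitrary example values

total = a + b        # 9
difference = a - b   # 5
product = a * b      # 14
quotient = a / b     # 3.5 (division always returns a float)
floor_q = a // b     # 3
remainder = a % b    # 1
power = a ** b       # 49
```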

Learn more in Chapter 2 | Back to Top

array.astype()#

Converts the elements of an array to a different data type.

arr = np.array([1.5, 2.8, 3.2])
print(arr.astype(int))  # [1 2 3]

Learn more in Chapter 3 | Back to Top

array.flatten()#

Returns a copy of the array collapsed into one dimension. Useful for turning a multi-dimensional array into a simple list-like array.

arr = np.array([[1, 2], [3, 4]])
print(arr.flatten())  # [1 2 3 4]

Learn more in Chapter 3 | Back to Top

array.shape#

An attribute of a numpy array showing its dimensions.

arr = np.array([[1, 2], [3, 4]])
print(arr.shape)  # (2, 2)

Learn more in Chapter 3 | Back to Top

Assignment Operator#

= : Assigns a value to a variable

Learn more in Chapter 2 | Back to Top

ax.bar()#

Bar plot for a specific subplot (axis object).
Docs

Learn more in Chapter 7 | Back to Top

ax.bar_label()#

Add numeric labels to bars in a bar chart.
Docs

Learn more in Chapter 7 | Back to Top

axis methods#

ax.set_title()
ax.set_xlabel()
ax.set_ylabel()
ax.set_ylim()
ax.set_xticks()

Set labels, titles, x-axis ticks, or y-axis range for a specific subplot.
Docs

Learn more in Chapter 7 | Back to Top

C#

Comment Character#

# : Signals to Python that anything written after it should be ignored

# This is a comment

Learn more in Chapter 2 | Back to Top

Comparison Operators#

== : Equal
!= : Not equal
< : Less than
<= : Less than or equal
> : Greater than
>= : Greater than or equal

Learn more in Chapter 2 | Back to Top

Counter()#

Counts the occurrences of elements in a collection and returns a dictionary-like object mapping each element to its frequency.
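
A minimal sketch, assuming Counter is imported from the standard collections module; the list of letters is hypothetical data made up for illustration.

```python
from collections import Counter

letters = ["a", "b", "a", "c", "a", "b"]  # hypothetical sample data
counts = Counter(letters)

print(counts)        # Counter({'a': 3, 'b': 2, 'c': 1})
print(counts["a"])   # 3
print(counts["z"])   # 0 (elements that never appear count as zero)
```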

Learn more in Chapter 9 | Back to Top

Counter(iterable).most_common()#

This is a method of a Counter object that returns a list of elements and their counts, sorted from the most frequent to the least frequent.
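
A short sketch of the method in use; the word list is hypothetical. Passing an integer, as in most_common(1), limits the result to that many of the most frequent elements.

```python
from collections import Counter

words = ["red", "blue", "red", "green", "blue", "red"]  # hypothetical data
word_counts = Counter(words)

print(word_counts.most_common())   # [('red', 3), ('blue', 2), ('green', 1)]
print(word_counts.most_common(1))  # [('red', 3)]
```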

Learn more in Chapter 9 | Back to Top

D#

del#

Deletes a variable, list element, or dictionary entry.

nums = [1, 2, 3]
del nums[1]
print(nums)  # [1, 3]

Learn more in Chapter 3 | Back to Top

df.apply(given_function)#

The .apply method executes a function along the rows or columns of a DataFrame, or of a subset of a DataFrame. The given_function can be a built-in or a user-defined function.
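
The sketch below shows one possible use with a built-in function and with a user-defined one. The DataFrame scores and the helper score_range are hypothetical names introduced only for this example.

```python
import pandas as pd

# Hypothetical quiz scores for three students
scores = pd.DataFrame({"Quiz1": [8, 6, 9], "Quiz2": [7, 10, 5]})

# Built-in function applied to each column (the default, axis=0)
column_max = scores.apply(max)

# User-defined function applied to each row (axis=1)
def score_range(row):
    """Return the gap between a student's best and worst quiz."""
    return row.max() - row.min()

row_range = scores.apply(score_range, axis=1)

print(column_max)  # Quiz1 -> 9, Quiz2 -> 10
print(row_range)   # 1, 4, 4
```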

Learn more in Chapter 6 | Back to Top

df.applymap(given_function)#

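The .applymap method executes a function on every element of a DataFrame. Here df is a DataFrame and given_function can be a built-in or a user-defined function. In pandas 2.1 and later, .applymap is deprecated in favor of df.map(given_function), which behaves the same way.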

Learn more in Chapter 6 | Back to Top

df.assign(new_column = expression)#

Adds a new column to a DataFrame or modifies existing columns by evaluating the given expression, returning a new DataFrame without modifying the original.

df.assign(NewAge = df['Age'] + 5)  # creates a new column 'NewAge' with values 5 more than 'Age'

Learn more in Chapter 5 | Back to Top

df.columns#

Returns the column labels of a DataFrame as an Index object.

import pandas as pd

df = pd.DataFrame({'Name': ["Alice", "Bob"], 'Age': [24, 22]})
print(df.columns)  # Output: Index(['Name', 'Age'], dtype='object')

Learn more in Chapter 5 | Back to Top

df.copy()#

Creates and returns a copy of the DataFrame (or Series).
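
A minimal sketch showing why a copy is useful: changes made to the copy leave the original untouched. The names original and duplicate are illustrative.

```python
import pandas as pd

original = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [24, 22]})
duplicate = original.copy()

# Modify the copy only
duplicate.loc[0, "Age"] = 30

print(original.loc[0, "Age"])   # 24
print(duplicate.loc[0, "Age"])  # 30
```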

Learn more in Chapter 5 | Back to Top

df.drop()#

The .drop() method is used to delete specific rows or columns from a DataFrame, df.

Learn more in Chapter 6 | Back to Top

df.drop(labels, axis)#

Removes specified rows or columns from a DataFrame, returning a new DataFrame by default.

import pandas as pd

df = pd.DataFrame({'Name': ["Alice", "Bob"], 'Age': [24, 22]})
print(df)

# Drop a column
print(df.drop('Age', axis=1))

# Drop a row
print(df.drop(0, axis=0))

Learn more in Chapter 5 | Back to Top

df.groupby()#

The .groupby() method is applied to a DataFrame, df, to group data based on one or more criteria. An aggregate function is then typically applied and calculated for each group.

Learn more in Chapter 6 | Back to Top

df.groupby().first#

The .first() method returns the first row of each group, if it exists.
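
A small sketch with hypothetical data (the pets DataFrame is made up for illustration): grouping by Species and calling .first() keeps the first entry found for each species.

```python
import pandas as pd

pets = pd.DataFrame({
    "Species": ["cat", "dog", "cat", "dog"],
    "Name": ["Milo", "Rex", "Luna", "Fido"],
})

first_pets = pets.groupby("Species").first()
print(first_pets)  # cat -> Milo, dog -> Rex
```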

Learn more in Chapter 6 | Back to Top

df.head(n)#

Returns the first n rows of the DataFrame (default is 5).

import pandas as pd

df = pd.DataFrame({'Name': ["Alice", "Bob", "Charlie", "David"], 'Age': [24, 22, 30, 28]})
print(df.head(2))  # first 2 rows

Learn more in Chapter 5 | Back to Top

df.hist()#

A pandas method for quick histograms.
Docs

Learn more in Chapter 7 | Back to Top

df.iloc[row_index, column_index]#

Selects rows and columns from a DataFrame by integer position (0-based indexing).

import pandas as pd

df = pd.DataFrame({'Name': ["Alice", "Bob", "Charlie"], 'Age': [24, 22, 30]})

print(df.iloc[0, 0])   # element at first row, first column ('Alice')
print(df.iloc[0:2, 1]) # 'Age' values of the first two rows

Learn more in Chapter 5 | Back to Top

df.index#

Returns the index (row labels) of a DataFrame or Series.

import pandas as pd

df = pd.DataFrame({'Name': ["Alice", "Bob"], 'Age': [24, 22]})
df.index  # shows the row labels (default RangeIndex in this case)

Learn more in Chapter 5 | Back to Top

df.loc[row_labels, column_labels]#

Selects rows and columns from a DataFrame by label (row/column names) rather than integer position.

import pandas as pd

df = pd.DataFrame({'Name': ["Alice", "Bob"], 'Age': [24, 22]}, index=['a', 'b'])

print(df.loc['a', 'Name'])         # Outputs the value at row 'a' and column 'Name' ('Alice')
print(df.loc['a':'b', 'Age'])      # Outputs the 'Age' column for rows 'a' through 'b'

Learn more in Chapter 5 | Back to Top

df.pivot_table()#

There are two general formats for pivot_table:

df.pivot_table(values='data', index='group_1', columns='group_2', aggfunc='function')

or alternatively, depending on preference, the format below, which calls the pandas function directly:

pd.pivot_table(df, values='data', index='group_1', columns='group_2', aggfunc='function')

In both formats, df is the given DataFrame, each unique value in index gets its own row, each unique value in columns gets its own column, and data specifies the column of the DataFrame to which we want to apply aggfunc.
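
The two formats can be compared side by side. This is a sketch on a hypothetical sales DataFrame; the column names Region, Product, and Units are assumptions made for the example.

```python
import pandas as pd

sales = pd.DataFrame({
    "Region": ["East", "East", "West", "West"],
    "Product": ["A", "B", "A", "B"],
    "Units": [10, 4, 7, 12],
})

# Method form, called on the DataFrame
method_form = sales.pivot_table(values="Units", index="Region",
                                columns="Product", aggfunc="sum")

# Function form, called from pandas with the DataFrame as first argument
function_form = pd.pivot_table(sales, values="Units", index="Region",
                               columns="Product", aggfunc="sum")

print(method_form)  # one row per Region, one column per Product
```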

Learn more in Chapter 6 | Back to Top

df.plot()#

A pandas method for quick plots using matplotlib underneath.

df["column"].plot(kind="line")

Docs

Learn more in Chapter 7 | Back to Top

df.plot.box()#

A pandas method for quick box-and-whisker plots.
Docs

Learn more in Chapter 7 | Back to Top

df.reset_index()#

Resets the index of the DataFrame to the default integer index, moving the old index to a column unless drop=True is specified.

import pandas as pd

df = pd.DataFrame({'Name': ["Alice", "Bob"]}, index=['a', 'b'])
print(df)

print(df.reset_index())        # Outputs a DataFrame with the old index as a new column
print(df.reset_index(drop=True)) # Outputs a DataFrame with a default integer index, old index removed

Learn more in Chapter 5 | Back to Top

df.sample()#

Samples one or more rows from a DataFrame. .sample() can take multiple arguments, including n (the number of rows to sample; defaults to 1), random_state (the initial state of the pseudo-random generator; defaults to None), and replace (whether rows are “replaced” so that they can be sampled again; defaults to False).

DataFrame.sample(n = 1, random_state = None, replace = False)
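
A brief sketch on a hypothetical DataFrame. The random_state value 0 is arbitrary; fixing it only makes the draw repeatable. Sampling more rows than the DataFrame holds is possible only with replace=True.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie", "David"],
                   "Age": [24, 22, 30, 28]})

# Two distinct rows, drawn without replacement
two_rows = df.sample(n=2, random_state=0)

# Six rows from a four-row DataFrame requires sampling with replacement
resampled = df.sample(n=6, replace=True, random_state=0)

print(two_rows)
print(resampled)
```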

Learn more in Chapter 10 | Back to Top

df.set_index(column_name)#

Sets one or more columns of a DataFrame as the new index, replacing the existing index.

import pandas as pd

df = pd.DataFrame({'Name': ["Alice", "Bob"], 'Age': [24, 22]})
df = df.set_index('Name')  # sets the 'Name' column as the index

Learn more in Chapter 5 | Back to Top

df.sort_values(by, ascending=True)#

Sorts the DataFrame by the values of one or more columns. By default, sorts in ascending order.

import pandas as pd

df = pd.DataFrame({'Name': ["Alice", "Bob", "Charlie"], 'Age': [24, 22, 30]})

print(df.sort_values(by='Age'))           # Outputs the DataFrame sorted by 'Age' in ascending order
print(df.sort_values(by='Age', ascending=False))  # Outputs the DataFrame sorted by 'Age' in descending order

Learn more in Chapter 5 | Back to Top

df.tail(n)#

Returns the last n rows of the DataFrame (default is 5).

import pandas as pd

df = pd.DataFrame({'Name': ["Alice", "Bob", "Charlie", "David"], 'Age': [24, 22, 30, 28]})
print(df.tail(2))  # last 2 rows

Learn more in Chapter 5 | Back to Top

df.transpose()#

Swaps rows and columns of a DataFrame.
Docs

Learn more in Chapter 7 | Back to Top

df['column_name']#

Selects a single column from a DataFrame as a Series, or multiple columns as a DataFrame when given a list of column names.

import pandas as pd

df = pd.DataFrame({'Name': ["Alice", "Bob"], 'Age': [24, 22], 'Interest': ['Music', 'Dance']})

# Single column as Series
df['Name'] # Outputs the Name column as a pandas Series.

# Multiple columns as DataFrame
df[['Name', 'Age']] # Outputs the Name and Age columns from the DataFrame.

Learn more in Chapter 5 | Back to Top

df['column_name'].count()#

Counts the number of non-NA values in a Series.

Learn more in Chapter 5 | Back to Top

df['column_name'].cumprod()#

Computes the cumulative product of a Series.

Learn more in Chapter 5 | Back to Top

df['column_name'].cumsum()#

Computes the cumulative sum of a Series.

Learn more in Chapter 5 | Back to Top

df['column_name'].diff()#

Calculates the difference between consecutive values in a Series, useful for computing changes over time.

import pandas as pd

df = pd.DataFrame({'Item': ['A', 'B', 'C'], 'Value': [10, 15, 20]})
df['Value'].diff() # Outputs a Series showing differences between consecutive values: [NaN, 5.0, 5.0]

Learn more in Chapter 5 | Back to Top

df['column_name'].max()#

Finds the maximum value in a Series.

Learn more in Chapter 5 | Back to Top

df['column_name'].mean()#

Calculates the mean of a Series (a particular column of the DataFrame).

Learn more in Chapter 5 | Back to Top

df['column_name'].median()#

Calculates the median of a Series.

Learn more in Chapter 5 | Back to Top

df['column_name'].min()#

Finds the minimum value in a Series.

Learn more in Chapter 5 | Back to Top

df['column_name'].mode()#

Calculates the mode of a Series.

Learn more in Chapter 5 | Back to Top

df['column_name'].pct_change()#

Calculates the percentage change between consecutive values in a Series, useful for analyzing relative changes over time.
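
A minimal sketch with hypothetical prices. The first entry has no previous value to compare against, so its result is NaN; the changes are returned as fractions (0.10 means a 10% increase).

```python
import pandas as pd

prices = pd.DataFrame({"Price": [100.0, 110.0, 99.0]})  # hypothetical data
change = prices["Price"].pct_change()

print(change)  # NaN, then roughly 0.10, then roughly -0.10
```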

Learn more in Chapter 5 | Back to Top

df['column_name'].quantile(q)#

Returns the value at the q-th quantile of a Series.
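
For illustration, with q given as a fraction between 0 and 1 (so q=0.5 is the median). The Score column is hypothetical data.

```python
import pandas as pd

df = pd.DataFrame({"Score": [1, 2, 3, 4, 5]})  # hypothetical data

median_score = df["Score"].quantile(0.5)
lower_quartile = df["Score"].quantile(0.25)

print(median_score)    # 3.0
print(lower_quartile)  # 2.0
```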

Learn more in Chapter 5 | Back to Top

df['column_name'].rename(new_name)#

Renames a Series or DataFrame column to the specified new name.

Learn more in Chapter 5 | Back to Top

df['column_name'].std()#

Calculates the standard deviation of a Series.

Learn more in Chapter 5 | Back to Top

df['column_name'].sum()#

Calculates the sum of all values in a Series.

Learn more in Chapter 5 | Back to Top

df['column_name'].var()#

Calculates the variance of a Series.

Learn more in Chapter 5 | Back to Top

df[n:m]#

Selects rows from index n up to (but not including) index m of the DataFrame, similar to Python list slicing.

import pandas as pd

df = pd.DataFrame({'Name': ["Alice", "Bob", "Charlie"], 'Age': [24, 22, 30]})
print(df[0:2])  # selects the first two rows (indices 0 and 1)

Learn more in Chapter 5 | Back to Top

dict()#

Creates a dictionary object.

d = dict(name="Alice", age=20)
print(d)  # {'name': 'Alice', 'age': 20}

Learn more in Chapter 3 | Back to Top

dict.items()#

A method that returns a view object of the dictionary’s key–value pairs as tuples.

student = {"name": "Alice", "age": 20}
print(student.items())  # dict_items([('name', 'Alice'), ('age', 20)])

Learn more in Chapter 3 | Back to Top

dict.keys()#

A method that returns a view object containing all the keys in a dictionary.

student = {"name": "Alice", "age": 20}
print(student.keys())  # dict_keys(['name', 'age'])

Learn more in Chapter 3 | Back to Top

dict.update()#

A method that updates a dictionary with key–value pairs from another dictionary (or from keyword arguments). Existing keys are overwritten.

student = {"name": "Alice", "age": 20}
student.update({"age": 21, "major": "Data Science"})
print(student)  # {'name': 'Alice', 'age': 21, 'major': 'Data Science'}

Learn more in Chapter 3 | Back to Top

dict.values()#

A method that returns a view object containing all the values in a dictionary.

student = {"name": "Alice", "age": 20}
print(student.values())  # dict_values(['Alice', 20])

Learn more in Chapter 3 | Back to Top

E#

Escape Sequences#

\' : Inserts a single quote in a string
\" : Inserts a double quote in a string
\n : Newline (starts a new line)
\t : Tab (inserts horizontal spacing)

print("Line 1\nLine 2")
# Line 1
# Line 2

Learn more in Chapter 2 | Back to Top

F#

fig.suptitle()#

Adds a title to the entire figure (all subplots).
Docs

Learn more in Chapter 7 | Back to Top

float()#

Converts a value into a float (decimal number).

float(7)   # 7.0

Learn more in Chapter 2 | Back to Top

For Statement#

A control statement which allows for iteration through a sequence to perform some action on each of its elements. This sequence can be any iterable object, including a list, a string, or a range of numbers, to name a few. The general form of a for statement is below:

for item in sequence:
    action    
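
A short sketch of the general form in use, looping first over a list and then over a string. The variable names are illustrative.

```python
# Add up the numbers in a list
total = 0
for number in [1, 2, 3]:
    total = total + number
print(total)  # 6

# Visit each character of a string
letters = []
for ch in "abc":
    letters.append(ch.upper())
print(letters)  # ['A', 'B', 'C']
```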

Learn more in Chapter 4 | Back to Top

Function Definition#

def function_name(input_arguments):
    """ Docstring """
    
    # body of function
    return output
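
As a concrete instance of the template above, here is a sketch of a small function. The name rectangle_area and its arguments are hypothetical, chosen only for illustration.

```python
def rectangle_area(width, height):
    """Return the area of a rectangle with the given width and height."""

    # body of function
    area = width * height
    return area

print(rectangle_area(3, 4))  # 12
```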

Learn more in Chapter 2 | Back to Top

H#

help()#

Displays documentation about functions, objects, or modules.

help(len)

Learn more in Chapter 2 | Back to Top

I#

If Statement#

A control statement of the “if-then” form: if statement P (the hypothesis) holds, then statement Q (the conclusion) is carried out. This conditional statement follows the general format:

if hypothesis_1:
    conclusion_1
elif hypothesis_2:
    conclusion_2
... 
elif hypothesis_n:
    conclusion_n
else:
    conclusion    
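
The general format can be sketched with a concrete case. The function describe_temperature and its cutoffs of 85 and 60 are assumptions made up for this example. Only the first hypothesis that is true has its conclusion carried out.

```python
def describe_temperature(temp_f):
    """Label a temperature in degrees Fahrenheit (cutoffs are arbitrary)."""
    if temp_f >= 85:
        label = "hot"
    elif temp_f >= 60:
        label = "mild"
    else:
        label = "cold"
    return label

print(describe_temperature(90))  # hot
print(describe_temperature(70))  # mild
print(describe_temperature(40))  # cold
```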

Learn more in Chapter 4 | Back to Top

int()#

Converts a value into an integer (whole number).

int(7.9)   # 7

Learn more in Chapter 2 | Back to Top

K#

key:value pair#

A pair used in dictionaries where a key is mapped to a value.

student = {"name": "Alice", "age": 20}
# "name" is the key, "Alice" is the value

Learn more in Chapter 3 | Back to Top

L#

len(…)#

Returns the length (number of items) of the given input.

len("hello")   # 5

Learn more in Chapter 2 | Back to Top

list()#

Creates a list object, often by converting another sequence.

chars = list("hello")
print(chars)  # ['h', 'e', 'l', 'l', 'o']

Learn more in Chapter 3 | Back to Top

list.append()#

Adds an element to the end of a list.

nums = [1, 2, 3]
nums.append(4)
print(nums)  # [1, 2, 3, 4]

Learn more in Chapter 3 | Back to Top

list.copy()#

Creates a shallow copy of a list.

a = [1, 2, 3]
b = a.copy()
b.append(4)
print(a)  # [1, 2, 3]
print(b)  # [1, 2, 3, 4]

Learn more in Chapter 3 | Back to Top

list.index()#

Returns the index of the first occurrence of a value in a list.

fruits = ["apple", "banana", "cherry"]
print(fruits.index("banana"))  # 1

Learn more in Chapter 3 | Back to Top

list.insert()#

Inserts an element at a specified position in a list.

nums = [1, 2, 3]
nums.insert(1, 10)  # insert 10 at index 1
print(nums)  # [1, 10, 2, 3]

Learn more in Chapter 3 | Back to Top

list.pop()#

Removes and returns an element from a list (last by default, or a specified index).

nums = [1, 2, 3]
nums.pop()      # removes 3
nums.pop(0)     # removes 1

Learn more in Chapter 3 | Back to Top

list.reverse()#

Reverses the elements of a list in place.

nums = [1, 2, 3]
nums.reverse()
print(nums)  # [3, 2, 1]

Learn more in Chapter 3 | Back to Top

list.sort()#

Sorts a list in place.

nums = [3, 1, 2]
nums.sort()
print(nums)  # [1, 2, 3]

Learn more in Chapter 3 | Back to Top

Logical (Boolean) Operators#

and : True if both are true
or : True if at least one is true
not : True if input is false
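
A brief sketch combining the three operators with comparisons. The variables age and has_ticket are hypothetical.

```python
age = 20
has_ticket = False

both = age >= 18 and has_ticket   # False: only one condition is true
either = age >= 18 or has_ticket  # True: at least one condition is true
negated = not has_ticket          # True: has_ticket is False

print(both, either, negated)  # False True True
```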

Learn more in Chapter 2 | Back to Top

M#

math library#

A standard Python library that provides mathematical functions and constants.

import math

Learn more in Chapter 2 | Back to Top

math.ceil(…)#

Rounds the input up to the nearest whole number.

import math
math.ceil(3.1)   # 4

Learn more in Chapter 2 | Back to Top

math.e#

The constant \(e\) (Euler’s number ≈ 2.71828).

import math
math.e   # 2.71828...

Learn more in Chapter 2 | Back to Top

math.exp(…)#

Returns e to the power of the given number.

import math
math.exp(2)   # 7.389...

Learn more in Chapter 2 | Back to Top

math.factorial(…)#

Returns the factorial of the input.

import math
math.factorial(5)   # 120

Learn more in Chapter 2 | Back to Top

math.floor(…)#

Rounds the input down to the nearest whole number.

import math
math.floor(3.9)   # 3

Learn more in Chapter 2 | Back to Top

math.log(…)#

Returns the logarithm of a number. Uses base e (natural log) if no base is specified.

import math
math.log(8, 2)   # 3.0

Learn more in Chapter 2 | Back to Top

math.pi#

The constant \(\pi\) (pi ≈ 3.14159).

import math
math.pi   # 3.14159...

Learn more in Chapter 2 | Back to Top

math.sqrt(…)#

Returns the square root of the input.

import math
math.sqrt(16)   # 4.0

Learn more in Chapter 2 | Back to Top

matplotlib#

A popular Python library for creating static, animated, and interactive visualizations.
Documentation

Learn more in Chapter 7 | Back to Top

matplotlib.pyplot#

A submodule of matplotlib that provides a MATLAB-like interface for plotting.
Usually imported as:

import matplotlib.pyplot as plt

Documentation

Learn more in Chapter 7 | Back to Top

max(…)#

Returns the maximum (largest) value from the given inputs.

max(3, 9, 5)   # 9

Learn more in Chapter 2 | Back to Top

Membership Operators#

Check whether an element is in a sequence.

in: Returns True if an element exists.
not in: Returns True if an element does not exist.

fruits = ["apple", "banana"]
print("apple" in fruits)     # True
print("cherry" not in fruits)  # True

Learn more in Chapter 3 | Back to Top

min(…)#

Returns the minimum (smallest) value from the given inputs.

min(3, 9, 5)   # 3

Learn more in Chapter 2 | Back to Top

N#

NameError#

Error raised when you try to use a variable or function name that has not been defined.

print(x)   # NameError: name 'x' is not defined

Learn more in Chapter 2 | Back to Top

NoneType#

The data type of the special value None, which represents “nothing” or “no value.”
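
One common place to meet None is as the return value of a function that does not return anything, such as print. A minimal sketch:

```python
result = print("hello")  # print displays text but returns None

print(result)            # None
print(type(result))      # <class 'NoneType'>
print(result is None)    # True
```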

Learn more in Chapter 3 | Back to Top

norm.ppf()#

A scipy function that calculates percentiles of the normal distribution. Its arguments include q (the lower-tail probability), loc (the mean), and scale (the standard deviation) of the normal distribution.

from scipy.stats import norm

# percentiles needed for a 90% confidence interval
L0 = norm.ppf(0.05, loc=0, scale=1/2)
U0 = norm.ppf(0.95, loc=0, scale=1/2)

Learn more in Chapter 12 | Back to Top

np.append(array, value)#

A numpy function that creates a new array from an existing one (array) by attaching value to the end.
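
A minimal sketch; note that the original array is left unchanged, so the result needs to be stored in a variable.

```python
import numpy as np

arr = np.array([1, 2, 3])
longer = np.append(arr, 4)

print(longer)  # [1 2 3 4]
print(arr)     # [1 2 3]  (the original array is unchanged)
```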

Learn more in Chapter 4 | Back to Top

np.arange()#

Creates an array with evenly spaced values, similar to range().

np.arange(0, 10, 2)  # array([0, 2, 4, 6, 8])

Learn more in Chapter 3 | Back to Top

np.array()#

Creates a numpy array from a list (or other sequence).

import numpy as np
arr = np.array([1, 2, 3])

Learn more in Chapter 3 | Back to Top

np.average()#

Returns the average (mean) of array elements.

arr = np.array([1, 2, 3])
print(np.average(arr))  # 2.0

Learn more in Chapter 3 | Back to Top

np.column_stack()#

Stacks arrays as columns into a 2D array.

a = np.array([1, 2])
b = np.array([3, 4])
print(np.column_stack((a, b)))

Learn more in Chapter 3 | Back to Top

np.concatenate()#

Concatenates (combines) a sequence of arrays into a single array. The arrays are passed together as one tuple or list.

np.concatenate((array1, array2))
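
A runnable sketch with two small hypothetical arrays. The arrays go inside one tuple, which is why the call has double parentheses.

```python
import numpy as np

array1 = np.array([1, 2])
array2 = np.array([3, 4, 5])

combined = np.concatenate((array1, array2))
print(combined)  # [1 2 3 4 5]
```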

Learn more in Chapter 11 | Back to Top

np.empty(0)#

Creates an empty array (an array with zero elements).

Learn more in Chapter 4 | Back to Top

np.log()#

Computes the natural logarithm (log with base e) of array elements.

arr = np.array([1, np.e, np.e**2])
print(np.log(arr))  # [0. 1. 2.]

Learn more in Chapter 3 | Back to Top

np.max()#

Returns the maximum value in an array.

arr = np.array([1, 5, 3])
print(np.max(arr))  # 5

Learn more in Chapter 3 | Back to Top

np.mean()#

Calculates the mean of elements in an array.

a = [10, 20, 30, 40, 50]
np.mean(a)   # 30.0

Learn more in Chapter 10 | Back to Top

np.min()#

Returns the minimum value in an array.

arr = np.array([1, 5, 3])
print(np.min(arr))  # 1

Learn more in Chapter 3 | Back to Top

np.ones()#

Creates an array filled with ones.

print(np.ones((2, 3)))

Learn more in Chapter 3 | Back to Top

np.percentile()#

Takes in an array of data and computes its q-th percentile, where q is a number between 0 and 100.

np.percentile(array, q=97.5)
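
A small sketch on hypothetical data, the integers 1 through 100. By default numpy interpolates linearly between neighboring data points, which is why the results are not whole numbers.

```python
import numpy as np

data = np.arange(1, 101)  # the integers 1, 2, ..., 100

median = np.percentile(data, q=50)
upper = np.percentile(data, q=97.5)

print(median)  # 50.5
print(upper)   # approximately 97.525
```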

Learn more in Chapter 12 | Back to Top

np.power()#

Raises array elements to a power.

arr = np.array([1, 2, 3])
print(np.power(arr, 2))  # [1 4 9]

Learn more in Chapter 3 | Back to Top

np.random.binomial()#

Randomly draws samples from a binomial distribution, where n is the number of trials, p is the probability of success on each trial, and size is the number of samples to draw.

np.random.binomial(n, p, size)
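
As a sketch, the call below simulates five repetitions of an experiment in which a fair coin is flipped 10 times and the heads are counted. The seed value 0 is arbitrary and is set only so that the draw is repeatable.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, for reproducibility only

# Each entry is the number of heads in 10 flips of a fair coin
heads = np.random.binomial(n=10, p=0.5, size=5)
print(heads)  # five integers, each between 0 and 10
```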

Learn more in Chapter 11 | Back to Top

np.random.choice()#

Allows for random sampling from a given array of items. Can also be used to take in an integer argument, a, and randomly sample integers from within the range given by np.arange(a). By default, the sampling is with replacement. But this can be altered by setting the parameter replace to False.

import numpy as np

letters = ['a', 'b', 'c', 'd', 'e']
random_letters_with_replacement = np.random.choice(letters, size=3)
print(f"Randomly chosen letters with replacement: {random_letters_with_replacement}") # Randomly chosen letters with replacement: ['a' 'c' 'c']

random_letters_without_replacement = np.random.choice(letters, size=3, replace=False)
print(f"Randomly chosen letters without replacement: {random_letters_without_replacement}") # Randomly chosen letters without replacement: ['e' 'd' 'b']

# Choose 5 random integers from 0 up to (but not including) 5
random_integers = np.random.choice(5, size=5)
print(f"Random integers up to 5: {random_integers}") # Random integers up to 5: [1 4 2 0 1]

Learn more in Chapter 8 | Back to Top

np.random.choice(…)#

Outputs exactly one item from the input sequence when no size is given, selecting from it randomly and uniformly, that is, with every item equally likely.

Learn more in Chapter 4 | Back to Top

np.random.permutation()#

Returns a random permutation (i.e., a shuffled copy) of a provided sequence x, or of the range np.arange(x) when x is an integer.

np.random.permutation(x)
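
A sketch of both uses, with an integer and with an array. The seed value 1 and the names array are arbitrary choices for illustration. The input array itself is not modified.

```python
import numpy as np

np.random.seed(1)  # arbitrary seed, for reproducibility only

# An integer x shuffles the range 0, 1, ..., x - 1
shuffled_range = np.random.permutation(5)

# An array returns a shuffled copy of its elements
names = np.array(["Alice", "Bob", "Eve"])
shuffled_names = np.random.permutation(names)

print(shuffled_range)
print(shuffled_names)
```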

Learn more in Chapter 11 | Back to Top

np.random.seed()#

To get reproducible results across different executions from functions that use the np.random module, it is advisable to set a seed. The seed initializes the pseudo-random number generator, so that the ‘randomized’ numbers produced by numpy’s random functions are the same whenever the same seed value is used.

# Set a specific seed value
np.random.seed(42)

# Generates the same random numbers between 0 to 5 in different runs
print(f"Five random numbers between 0 to 5: {np.random.choice(5, size=5)}") # Five random numbers between 0 to 5: [3 4 2 4 4]

Learn more in Chapter 8 | Back to Top

np.random.uniform()#

A function that randomly generates continuous values from a uniform distribution. It takes values for low and high as well as size (the number of samples to draw) and draws from the continuous uniform probability distribution on the half-open interval [low, high).

np.random.uniform(low = 1, high = 6, size = 10)

Learn more in Chapter 10 | Back to Top

np.reshape()#

Reshapes an array without changing its data.

arr = np.arange(6)
print(np.reshape(arr, (2, 3)))

Learn more in Chapter 3 | Back to Top

np.row_stack()#

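Stacks arrays vertically (row by row). np.row_stack is an alias of np.vstack, and newer NumPy releases deprecate the alias in favor of np.vstack.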

a = np.array([1, 2])
b = np.array([3, 4])
print(np.row_stack((a, b)))

Learn more in Chapter 3 | Back to Top

np.sort()#

Returns a sorted copy of an array. By default, it sorts in ascending order along the last axis. For descending order, reverse the result with slicing ([::-1]).

arr = np.array([3, 1, 2])
print(np.sort(arr))        # [1 2 3]
print(np.sort(arr)[::-1])  # [3 2 1]

Learn more in Chapter 3 | Back to Top

np.sqrt()#

Returns the square root of array elements.

arr = np.array([1, 4, 9])
print(np.sqrt(arr))  # [1. 2. 3.]

Learn more in Chapter 3 | Back to Top

np.std()#

Calculates the standard deviation of elements in an array.

a = [10, 20, 30, 40, 50]
np.std(a)   # 14.14...

Learn more in Chapter 10 | Back to Top

np.sum()#

Returns the sum of array elements.

arr = np.array([1, 2, 3])
print(np.sum(arr))  # 6

Learn more in Chapter 3 | Back to Top

np.zeros()#

Creates an array filled with zeros.

print(np.zeros((2, 3)))

Learn more in Chapter 3 | Back to Top

numpy library#

A popular Python library for numerical computing. Provides arrays, mathematical functions, and tools for linear algebra, statistics, and more. Often imported as import numpy as np.

Learn more in Chapter 3 | Back to Top

O#

obj.tolist()#

Converts a pandas Series, Index, or a NumPy array into a regular Python list.

import pandas as pd

df = pd.DataFrame({'Name': ["Alice", "Bob"], 'Age': [24, 22]})
obj = df.columns
obj.tolist()  # ['Name', 'Age']

Learn more in Chapter 5 | Back to Top

P#

pandas library#

The pandas library is a Python package built on top of numpy for data analysis and manipulation.

import pandas as pd

Learn more in Chapter 5 | Back to Top

pandas.DataFrame.groupby()#

Supports groupby operations on DataFrames, which involve grouping records based on one or more criteria so that different aggregate functions can be applied to the groups.

import pandas as pd
# Create a sample dataframe
data = {
    'Course Name': ['CS120', 'DATA118', 'CS221', 'DATA211', 'DATA259', 'CS340'],
    'Level': ['I', 'I', 'II', 'II', 'II', 'III'],
    'Size' : [54, 29, 35, 55, 118, 67]
}
course_df = pd.DataFrame(data)

# Group courses by level
level_groups = course_df.groupby('Level')

# Print the average size for Level I courses
print(level_groups.get_group('I')[['Size']].mean()) #Size  41.5

Learn more in Chapter 8 | Back to Top

pd.DataFrame(…)#

Creates a 2D tabular data structure in pandas from data like arrays, lists, or dictionaries, with labeled rows and columns.

import pandas as pd

Students = pd.DataFrame({'Name': ["Alice", "Bob", "Eve"], 'Major': ["Math", "Music", "History"]})  # creates a DataFrame with two columns holding the students' names and majors

Learn more in Chapter 5 | Back to Top

pd.Index(…)#

Creates an immutable, ordered collection of labels used as the index of a pandas Series or DataFrame.

import pandas as pd

index = pd.Index([10, 20, 30])  # creates an Index with labels 10, 20, 30

Learn more in Chapter 5 | Back to Top

pd.merge(df_left, df_right)#

Joins two DataFrames, df_left and df_right, on the column names they share, and makes a new DataFrame from the rows whose values in those columns appear in both (an inner join by default). The row order of df_left is preserved.
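
A minimal sketch with two hypothetical DataFrames that share the column Name. Eve and Dan each appear in only one of them, so they are left out of the result.

```python
import pandas as pd

students = pd.DataFrame({"Name": ["Alice", "Bob", "Eve"],
                         "Major": ["Math", "Music", "History"]})
ages = pd.DataFrame({"Name": ["Bob", "Alice", "Dan"],
                     "Age": [22, 24, 31]})

merged = pd.merge(students, ages)
print(merged)
#     Name  Major  Age
# 0  Alice   Math   24
# 1    Bob  Music   22
```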

Learn more in Chapter 6 | Back to Top

pd.RangeIndex(start=0, stop, step=1)#

Creates a sequence of integers to be used as a default index in a pandas DataFrame or Series.

import pandas as pd

index = pd.RangeIndex(start=5, stop=10, step=2)  # creates an index: 5, 7, 9

Learn more in Chapter 5 | Back to Top

pd.read_csv(file_path)#

Reads data from a CSV file, either stored on your computer or accessible via a URL, into a pandas DataFrame.

import pandas as pd

planets_df = pd.read_csv("planets.csv")  # reads the CSV file 'planets.csv' into a DataFrame.

Learn more in Chapter 5 | Back to Top

pd.Series(…)#

Creates a one-dimensional labeled array in pandas. It can be made from a list, array, or dictionary.

import pandas as pd

Even_numbers = pd.Series([2,4,6,8]) # creates a Series object consisting of the integers 2, 4, 6, 8.

Learn more in Chapter 5 | Back to Top

plt.bar()#

Creates a vertical bar chart.

plt.bar(["A","B","C"], [3,5,2])
plt.show()

Docs

Learn more in Chapter 7 | Back to Top

plt.barh()#

Creates a horizontal bar chart.
Docs

Learn more in Chapter 7 | Back to Top

plt.boxplot()#

Box-and-whisker plots for distributions.
Docs

Learn more in Chapter 7 | Back to Top

plt.colorbar()#

Adds a color scale bar to a plot.
Docs

Learn more in Chapter 7 | Back to Top

plt.figure()#

Creates a new plotting figure.
Docs

Learn more in Chapter 7 | Back to Top

plt.hist()#

Creates a histogram to show a variable’s distribution.

plt.hist([1,1,2,3,3,3,4])
plt.show()

Docs

Learn more in Chapter 7 | Back to Top

plt.legend()#

Adds a legend to a plot.
Docs

Learn more in Chapter 7 | Back to Top

plt.margins()#

Adjusts plot margins/padding.
Docs

Learn more in Chapter 7 | Back to Top

plt.pie()#

Creates a pie chart.
Docs

Learn more in Chapter 7 | Back to Top

plt.plot()#

Creates a line plot.

plt.plot([0,1,2], [0,1,4])
plt.show()

Docs

Learn more in Chapter 7 | Back to Top

plt.scatter()#

Creates a scatter plot of two variables.

plt.scatter([1,2,3], [4,5,6])
plt.show()

Docs

Learn more in Chapter 7 | Back to Top

plt.show()#

Displays all active plots.
Docs

Learn more in Chapter 7 | Back to Top

plt.stackplot()#

Stacked area plot.
Docs

Learn more in Chapter 7 | Back to Top

plt.subplots()#

Creates a figure with one or more subplots, returning (fig, ax).
Docs

Learn more in Chapter 7 | Back to Top

plt.tight_layout()#

Automatically adjusts subplot spacing to prevent overlap.
Docs

Learn more in Chapter 7 | Back to Top

plt.title()#

Adds a title to a plot.
Docs

Learn more in Chapter 7 | Back to Top

plt.xlabel() / plt.ylabel()#

Add a label to the x-axis (plt.xlabel()) or to the y-axis (plt.ylabel()).
Docs

Learn more in Chapter 7 | Back to Top

plt.xticks()#

Sets tick positions or labels on the x-axis.
Docs

Learn more in Chapter 7 | Back to Top

plt.ylabel()#

Add y-axis labels.
Docs

Learn more in Chapter 7 | Back to Top

print(…)#

Displays the input on the console.

print("hello")   # hello

Learn more in Chapter 2 | Back to Top

R#

range()#

Creates a sequence of numbers, typically used in loops.

for i in range(3):
    print(i)  # 0, 1, 2

Learn more in Chapter 3 | Back to Top

round(…)#

Rounds a number to the nearest integer (or to a specified number of decimal places if given).

round(3.14159, 2)   # 3.14

Learn more in Chapter 2 | Back to Top

S#

seaborn#

A statistical visualization library built on top of matplotlib.
It provides high-level plotting functions that make complex plots easier to create.

import seaborn as sns
sns.scatterplot(x="col1", y="col2", data=df)

Documentation

Learn more in Chapter 7 | Back to Top

Series.isin()#

Checks whether each element in a Series is contained in a given list, set, or other sequence.
Returns a Boolean Series with True for elements that are present and False otherwise.

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5])
s.isin([2, 4, 6])

Learn more in Chapter 14 | Back to Top

Series.str.lower()#

Converts all string values in a Series to lowercase.
This method works element-wise across the entire Series without needing .apply().

s = pd.Series(["Apple", "BANANA", "Cherry"])
s.str.lower()

Learn more in Chapter 14 | Back to Top

stats.norm()#

Creates a continuous normal distribution with mean loc and standard deviation scale.

stats.norm(loc, scale)

Learn more in Chapter 10 | Back to Top

str()#

Converts a value into a string.

str(123)   # '123'

Learn more in Chapter 2 | Back to Top

string.lower()#

Returns the string in all lowercase letters.

"HELLO".lower()   # 'hello'

Learn more in Chapter 2 | Back to Top

string.replace('old', 'new')#

Returns a copy of the string with all 'old' substrings replaced by 'new'.

"hello world".replace("world", "Python")
# 'hello Python'

Learn more in Chapter 2 | Back to Top

string.strip()#

Returns the string with whitespace removed from the beginning and end.

"  hello  ".strip()   # 'hello'

Learn more in Chapter 2 | Back to Top

string.upper()#

Returns the string in all uppercase letters.

"hello".upper()   # 'HELLO'

Learn more in Chapter 2 | Back to Top

sum(…)#

Returns the sum of all values in a given iterable (like a list).

sum([1, 2, 3])   # 6

Learn more in Chapter 2 | Back to Top

SyntaxError#

Error raised when Python code is written incorrectly and does not follow Python’s rules.

if True print("hi")   # SyntaxError

Learn more in Chapter 2 | Back to Top

T#

type(…)#

Returns the data type of the input.

type(3.14)   # <class 'float'>

Learn more in Chapter 2 | Back to Top

TypeError#

Error raised when an operation or function is applied to a value of the wrong type.

"5" + 2   # TypeError

Learn more in Chapter 2 | Back to Top

V#

ValueError#

Error raised when a function receives an argument of the right type but an inappropriate value.

int("hello")   # ValueError

Learn more in Chapter 2 | Back to Top

View Object#

A special object returned by certain dictionary methods such as .keys(), .values(), and .items().
A view object acts like a dynamic “window” into the dictionary’s contents:

It automatically updates if the dictionary is changed.
It is iterable, meaning you can loop over it just like a list.
If you need a fixed snapshot of the contents, you can convert it into a list with list().

student = {"name": "Alice", "age": 20}
keys_view = student.keys()
print(keys_view)  # dict_keys(['name', 'age'])

student["major"] = "Data Science"
print(keys_view)  # dict_keys(['name', 'age', 'major'])  # updated automatically

print(list(keys_view))  # ['name', 'age', 'major']

Learn more in Chapter 3 | Back to Top

W#

While Statement#

A control statement which allows for repeated execution of an action as long as (that is, while) a specific condition is true.
The general format of a while loop is below.

while condition:
    action  
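
A short sketch of the general format as a countdown. The variable count is illustrative. The action must eventually make the condition false, or the loop never ends.

```python
count = 3
while count > 0:
    print(count)       # 3, then 2, then 1
    count = count - 1  # moves the condition toward False

print("Liftoff!")
```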

Learn more in Chapter 4 | Back to Top
