Glossary of Code#
Index#
A#
abs(…)#
Returns the absolute value of the given number.
abs(-7) # 7
Learn more in Chapter 2 | Back to Top
Arithmetic Operators#
+
: Addition
-
: Subtraction
*
: Multiplication
/
: Division (always returns a float)
//
: Floor Division (rounded down to nearest whole number)
%
: Modulus (remainder after division)
**
: Exponentiation (power)
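As a quick illustration, the sketch below applies each arithmetic operator to the same pair of numbers. The variable names are chosen only for this example.

```python
# Apply every arithmetic operator to the same two operands.
a, b = 7, 2

total = a + b        # 9
difference = a - b   # 5
product = a * b      # 14
quotient = a / b     # 3.5 (true division returns a float)
floored = a // b     # 3
remainder = a % b    # 1
power = a ** b       # 49

print(total, difference, product, quotient, floored, remainder, power)
```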
Learn more in Chapter 2 | Back to Top
array.astype()#
Converts the elements of an array to a different data type.
arr = np.array([1.5, 2.8, 3.2])
print(arr.astype(int)) # [1 2 3]
Learn more in Chapter 3 | Back to Top
array.flatten()#
Returns a copy of the array collapsed into one dimension. Useful for turning a multi-dimensional array into a simple list-like array.
arr = np.array([[1, 2], [3, 4]])
print(arr.flatten()) # [1 2 3 4]
Learn more in Chapter 3 | Back to Top
array.shape#
An attribute of a numpy array showing its dimensions.
arr = np.array([[1, 2], [3, 4]])
print(arr.shape) # (2, 2)
Learn more in Chapter 3 | Back to Top
Assignment Operator#
=
: Assigns a value to a variable
Learn more in Chapter 2 | Back to Top
ax.bar()#
Bar plot for a specific subplot (axis object).
Docs
Learn more in Chapter 7 | Back to Top
ax.bar_label()#
Add numeric labels to bars in a bar chart.
Docs
Learn more in Chapter 7 | Back to Top
axis methods#
ax.set_title()
ax.set_xlabel()
ax.set_ylabel()
ax.set_ylim()
ax.set_xticks()
Set labels, titles, x-axis ticks, or y-axis range for a specific subplot.
Docs
Learn more in Chapter 7 | Back to Top
B#
bar.set_color()#
Change the color of a bar object.
Docs
Learn more in Chapter 7 | Back to Top
bool()#
Converts a value into a boolean (True or False).
bool(0) # False
bool(5) # True
Learn more in Chapter 2 | Back to Top
C#
Comparison Operators#
==
: Equal
!=
: Not equal
<
: Less than
<=
: Less than or equal
>
: Greater than
>=
: Greater than or equal
Learn more in Chapter 2 | Back to Top
Counter()#
Counts the occurrences of elements in a collection and returns a dictionary-like object mapping each element to its frequency.
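A minimal sketch of counting items in a list, assuming Counter is imported from the standard collections module. The votes list is made up for illustration.

```python
from collections import Counter

# Hypothetical data: one entry per vote cast.
votes = ["red", "blue", "red", "green", "red", "blue"]
counts = Counter(votes)

print(counts["red"])     # 3
print(counts["purple"])  # 0 (elements never seen have a count of zero)
```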
Learn more in Chapter 9 | Back to Top
Counter(iterable).most_common()#
This is a method of a Counter object that returns a list of elements and their counts, sorted from the most frequent to the least frequent.
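For example, the sketch below counts the letters of a word and asks for the two most frequent ones. The word is arbitrary; passing an integer to most_common() limits how many pairs are returned.

```python
from collections import Counter

letter_counts = Counter("banana")

# Each entry is an (element, count) tuple, most frequent first.
top_two = letter_counts.most_common(2)
print(top_two)  # [('a', 3), ('n', 2)]
```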
Learn more in Chapter 9 | Back to Top
D#
del#
Deletes a variable, list element, or dictionary entry.
nums = [1, 2, 3]
del nums[1]
print(nums) # [1, 3]
Learn more in Chapter 3 | Back to Top
df.apply(given_function)#
The .apply() method applies a function along an axis of a DataFrame: to each column by default, or to each row with axis=1. The given_function can be a built-in or user-defined function.
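A small sketch of .apply() with a user-defined function. The DataFrame and the helper value_range are hypothetical and exist only for this example.

```python
import pandas as pd

df = pd.DataFrame({"Quiz1": [7, 9, 10], "Quiz2": [8, 6, 10]})

def value_range(values):
    """Hypothetical helper: largest value minus smallest value."""
    return values.max() - values.min()

col_ranges = df.apply(value_range)          # one result per column
row_ranges = df.apply(value_range, axis=1)  # one result per row

print(col_ranges)
print(row_ranges)
```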
Learn more in Chapter 6 | Back to Top
df.applymap(given_function)#
The .applymap() method applies a function to every element of a DataFrame, df. The given_function can be a built-in or user-defined function. In pandas 2.1 and later, .applymap() is deprecated in favor of the equivalent df.map().
Learn more in Chapter 6 | Back to Top
df.assign(new_column = expression)#
Adds a new column to a DataFrame or modifies existing columns by evaluating the given expression, returning a new DataFrame without modifying the original.
df.assign(NewAge = df['Age'] + 5) # creates a new column 'NewAge' with values 5 more than 'Age'
Learn more in Chapter 5 | Back to Top
df.columns#
Returns the column labels of a DataFrame as an Index object.
import pandas as pd
df = pd.DataFrame({'Name': ["Alice", "Bob"], 'Age': [24, 22]})
print(df.columns) # Output: Index(['Name', 'Age'], dtype='object')
Learn more in Chapter 5 | Back to Top
df.copy()#
Creates and returns a copy of the DataFrame (or Series).
Learn more in Chapter 5 | Back to Top
df.drop()#
The .drop() method is used to delete specific rows or columns from a DataFrame, df.
Learn more in Chapter 6 | Back to Top
df.drop(labels, axis)#
Removes specified rows or columns from a DataFrame, returning a new DataFrame by default.
import pandas as pd
df = pd.DataFrame({'Name': ["Alice", "Bob"], 'Age': [24, 22]})
print(df)
# Drop a column
print(df.drop('Age', axis=1))
# Drop a row
print(df.drop(0, axis=0))
Learn more in Chapter 5 | Back to Top
df.groupby()#
The .groupby() method is applied to a DataFrame, df, to group data based on one or more criteria. An aggregate function is then typically applied to, and calculated for, each group.
Learn more in Chapter 6 | Back to Top
df.groupby().first()#
The .first() method returns the first row, if it exists, of each group.
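A minimal sketch, using a made-up course table, of taking the first row from each group:

```python
import pandas as pd

# Hypothetical data for illustration only.
courses = pd.DataFrame({
    "Level": ["I", "I", "II"],
    "Course": ["CS120", "DATA118", "CS221"],
})

# One row per level, holding the first course listed for that level.
first_rows = courses.groupby("Level").first()
print(first_rows)
```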
Learn more in Chapter 6 | Back to Top
df.head(n)#
Returns the first n rows of the DataFrame (default is 5).
import pandas as pd
df = pd.DataFrame({'Name': ["Alice", "Bob", "Charlie", "David"], 'Age': [24, 22, 30, 28]})
print(df.head(2)) # first 2 rows
Learn more in Chapter 5 | Back to Top
df.hist()#
A pandas method for quick histograms.
Docs
Learn more in Chapter 7 | Back to Top
df.iloc[row_index, column_index]#
Selects rows and columns from a DataFrame by integer position (0-based indexing).
import pandas as pd
df = pd.DataFrame({'Name': ["Alice", "Bob", "Charlie"], 'Age': [24, 22, 30]})
print(df.iloc[0, 0]) # element at first row, first column ('Alice')
print(df.iloc[0:2, 1]) # 'Age' values of the first two rows
Learn more in Chapter 5 | Back to Top
df.index#
Returns the index (row labels) of a DataFrame or Series.
import pandas as pd
df = pd.DataFrame({'Name': ["Alice", "Bob"], 'Age': [24, 22]})
df.index # shows the row labels (default RangeIndex in this case)
Learn more in Chapter 5 | Back to Top
df.loc[row_labels, column_labels]#
Selects rows and columns from a DataFrame by label (row/column names) rather than integer position.
import pandas as pd
df = pd.DataFrame({'Name': ["Alice", "Bob"], 'Age': [24, 22]}, index=['a', 'b'])
print(df.loc['a', 'Name']) # Outputs the value at row 'a' and column 'Name' ('Alice')
print(df.loc['a':'b', 'Age']) # Outputs the 'Age' column for rows 'a' through 'b'
Learn more in Chapter 5 | Back to Top
df.pivot_table()#
There are two general formats for pivot_table:
df.pivot_table(values='data', index='group_1', columns='group_2', aggfunc='function')
or alternatively, depending on preference, the top-level pandas function can be used:
pd.pivot_table(df, values='data', index='group_1', columns='group_2', aggfunc='function')
In both formats, df is the given DataFrame, each unique value in index gets its own row, each unique value in columns gets its own column, and data specifies the column of the DataFrame to which we want to apply aggfunc.
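To make the roles of index, columns, and aggfunc concrete, here is a small sketch on an invented sales table. The column names Region, Product, and Units are assumptions for this example.

```python
import pandas as pd

# Hypothetical data for illustration only.
sales = pd.DataFrame({
    "Region": ["East", "East", "West", "West"],
    "Product": ["A", "B", "A", "B"],
    "Units": [10, 20, 30, 40],
})

# One row per Region, one column per Product, summing Units in each cell.
table = sales.pivot_table(values="Units", index="Region",
                          columns="Product", aggfunc="sum")
print(table)
```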
Learn more in Chapter 6 | Back to Top
df.plot()#
A pandas method for quick plots using matplotlib underneath.
df["column"].plot(kind="line")
Learn more in Chapter 7 | Back to Top
df.plot.box()#
A pandas method for quick box-and-whisker plots.
Docs
Learn more in Chapter 7 | Back to Top
df.reset_index()#
Resets the index of the DataFrame to the default integer index, moving the old index to a column unless drop=True is specified.
import pandas as pd
df = pd.DataFrame({'Name': ["Alice", "Bob"]}, index=['a', 'b'])
print(df)
print(df.reset_index()) # Outputs a DataFrame with the old index as a new column
print(df.reset_index(drop=True)) # Outputs a DataFrame with a default integer index, old index removed
Learn more in Chapter 5 | Back to Top
df.sample()#
Samples one or more rows from a DataFrame. .sample() can take multiple arguments, including n (number of rows to sample; defaults to 1), random_state (the seed for the pseudo-random generator; defaults to None), and replace (whether rows are sampled with replacement, so the same row can be drawn more than once; defaults to False).
DataFrame.sample(n = 1, random_state = None, replace = False)
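A short sketch on a made-up DataFrame showing that fixing random_state makes the sample repeatable:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie", "David"],
                   "Age": [24, 22, 30, 28]})

# Two rows drawn without replacement; the same seed gives the same rows.
picked = df.sample(n=2, random_state=0)
picked_again = df.sample(n=2, random_state=0)

print(picked)
print(picked.equals(picked_again))  # True
```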
Learn more in Chapter 10 | Back to Top
df.set_index(column_name)#
Sets one or more columns of a DataFrame as the new index, replacing the existing index.
import pandas as pd
df = pd.DataFrame({'Name': ["Alice", "Bob"], 'Age': [24, 22]})
df = df.set_index('Name') # sets the 'Name' column as the index
Learn more in Chapter 5 | Back to Top
df.sort_values(by, ascending=True)#
Sorts the DataFrame by the values of one or more columns. By default, sorts in ascending order.
import pandas as pd
df = pd.DataFrame({'Name': ["Alice", "Bob", "Charlie"], 'Age': [24, 22, 30]})
print(df.sort_values(by='Age')) # Outputs the DataFrame sorted by 'Age' in ascending order
print(df.sort_values(by='Age', ascending=False)) # Outputs the DataFrame sorted by 'Age' in descending order
Learn more in Chapter 5 | Back to Top
df.tail(n)#
Returns the last n rows of the DataFrame (default is 5).
import pandas as pd
df = pd.DataFrame({'Name': ["Alice", "Bob", "Charlie", "David"], 'Age': [24, 22, 30, 28]})
print(df.tail(2)) # last 2 rows
Learn more in Chapter 5 | Back to Top
df.transpose()#
Swaps rows and columns of a DataFrame.
Docs
Learn more in Chapter 7 | Back to Top
df['column_name']#
Selects a single column from a DataFrame as a Series, or multiple columns as a DataFrame when given a list of column names.
import pandas as pd
df = pd.DataFrame({'Name': ["Alice", "Bob"], 'Age': [24, 22], 'Interest': ['Music', 'Dance']})
# Single column as Series
df['Name'] # Outputs the Name column as a pandas Series.
# Multiple columns as DataFrame
df[['Name', 'Age']] # Outputs the Name and Age columns from the DataFrame.
Learn more in Chapter 5 | Back to Top
df['column_name'].count()#
Counts the number of non-NA values in a Series.
Learn more in Chapter 5 | Back to Top
df['column_name'].cumprod()#
Computes the cumulative product of a Series.
Learn more in Chapter 5 | Back to Top
df['column_name'].cumsum()#
Computes the cumulative sum of a Series.
Learn more in Chapter 5 | Back to Top
df['column_name'].diff()#
Calculates the difference between consecutive values in a Series, useful for computing changes over time.
import pandas as pd
df = pd.DataFrame({'Item': ['A', 'B', 'C'], 'Value': [10, 15, 20]})
df['Value'].diff() # Outputs a Series showing differences between consecutive values: [NaN, 5.0, 5.0]
Learn more in Chapter 5 | Back to Top
df['column_name'].max()#
Finds the maximum value in a Series.
Learn more in Chapter 5 | Back to Top
df['column_name'].mean()#
Calculates the mean of a Series (a particular column of the DataFrame).
Learn more in Chapter 5 | Back to Top
df['column_name'].median()#
Calculates the median of a Series.
Learn more in Chapter 5 | Back to Top
df['column_name'].min()#
Finds the minimum value in a Series.
Learn more in Chapter 5 | Back to Top
df['column_name'].mode()#
Calculates the mode of a Series.
Learn more in Chapter 5 | Back to Top
df['column_name'].pct_change()#
Calculates the percentage change between consecutive values in a Series, useful for analyzing relative changes over time.
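As a small worked example with invented prices: each result is (current - previous) / previous, expressed as a fraction rather than a percent, and the first entry has no previous value to compare against.

```python
import pandas as pd

# Hypothetical prices for illustration only.
prices = pd.Series([100.0, 110.0, 99.0])

changes = prices.pct_change()
print(changes)  # NaN, then roughly 0.10 (up 10%), then roughly -0.10 (down 10%)
```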
Learn more in Chapter 5 | Back to Top
df['column_name'].quantile(q)#
Returns the value at the q-th quantile of a Series.
Learn more in Chapter 5 | Back to Top
df['column_name'].rename(new_name)#
Renames a Series or DataFrame column to the specified new name.
Learn more in Chapter 5 | Back to Top
df['column_name'].std()#
Calculates the standard deviation of a Series.
Learn more in Chapter 5 | Back to Top
df['column_name'].sum()#
Calculates the sum of all values in a Series.
Learn more in Chapter 5 | Back to Top
df['column_name'].var()#
Calculates the variance of a Series.
Learn more in Chapter 5 | Back to Top
df[n:m]#
Selects rows from index n up to (but not including) index m of the DataFrame, similar to Python list slicing.
import pandas as pd
df = pd.DataFrame({'Name': ["Alice", "Bob", "Charlie"], 'Age': [24, 22, 30]})
print(df[0:2]) # selects the first two rows (indices 0 and 1)
Learn more in Chapter 5 | Back to Top
dict()#
Creates a dictionary object.
d = dict(name="Alice", age=20)
print(d) # {'name': 'Alice', 'age': 20}
Learn more in Chapter 3 | Back to Top
dict.items()#
A method that returns a view object of the dictionary’s key–value pairs as tuples.
student = {"name": "Alice", "age": 20}
print(student.items()) # dict_items([('name', 'Alice'), ('age', 20)])
Learn more in Chapter 3 | Back to Top
dict.keys()#
A method that returns a view object containing all the keys in a dictionary.
student = {"name": "Alice", "age": 20}
print(student.keys()) # dict_keys(['name', 'age'])
Learn more in Chapter 3 | Back to Top
dict.update()#
A method that updates a dictionary with key–value pairs from another dictionary (or from keyword arguments). Existing keys are overwritten.
student = {"name": "Alice", "age": 20}
student.update({"age": 21, "major": "Data Science"})
print(student) # {'name': 'Alice', 'age': 21, 'major': 'Data Science'}
Learn more in Chapter 3 | Back to Top
dict.values()#
A method that returns a view object containing all the values in a dictionary.
student = {"name": "Alice", "age": 20}
print(student.values()) # dict_values(['Alice', 20])
Learn more in Chapter 3 | Back to Top
E#
Escape Sequences#
\'
: Inserts a single quote in a string
\"
: Inserts a double quote in a string
\n
: Newline (starts a new line)
\t
: Tab (inserts horizontal spacing)
print("Line 1\nLine 2")
# Line 1
# Line 2
Learn more in Chapter 2 | Back to Top
F#
fig.suptitle()#
Adds a title to the entire figure (all subplots).
Docs
Learn more in Chapter 7 | Back to Top
float()#
Converts a value into a float (decimal number).
float(7) # 7.0
Learn more in Chapter 2 | Back to Top
For Statement#
A control statement that iterates through a sequence and performs some action on each of its elements. The sequence can be any iterable object, such as a list, a string, or a range of numbers. The general form of a for statement is below:
for item in sequence:
    action
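A concrete instance of this general form, with names chosen only for illustration, squares each number in a list:

```python
squares = []
for n in [1, 2, 3]:
    # The indented line is the action, run once per element.
    squares.append(n ** 2)

print(squares)  # [1, 4, 9]
```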
Learn more in Chapter 4 | Back to Top
Function Definition#
def function_name(input_arguments):
    """ Docstring """
    # body of function
    return output
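Filling in that template with a hypothetical function, rectangle_area, might look like this:

```python
def rectangle_area(width, height):
    """Return the area of a rectangle with the given width and height."""
    area = width * height
    return area

print(rectangle_area(3, 4))  # 12
```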
Learn more in Chapter 2 | Back to Top
H#
help()#
Displays documentation about functions, objects, or modules.
help(len)
Learn more in Chapter 2 | Back to Top
I#
If Statement#
A control statement of the form “if-then”: if statement P, the hypothesis, holds, then statement Q, the conclusion, is carried out. This conditional statement follows the general format:
if hypothesis_1:
    conclusion_1
elif hypothesis_2:
    conclusion_2
...
elif hypothesis_n:
    conclusion_n
else:
    conclusion
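As an illustrative sketch of this format, the hypothetical function below checks each hypothesis in order and runs only the first conclusion whose hypothesis is true. The temperature thresholds are arbitrary.

```python
def describe_temperature(temp_f):
    """Hypothetical helper: label a temperature given in degrees Fahrenheit."""
    if temp_f < 32:
        return "freezing"
    elif temp_f < 70:
        return "mild"
    else:
        return "warm"

print(describe_temperature(20))  # freezing
print(describe_temperature(50))  # mild
print(describe_temperature(85))  # warm
```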
Learn more in Chapter 4 | Back to Top
int()#
Converts a value into an integer (whole number).
int(7.9) # 7
Learn more in Chapter 2 | Back to Top
K#
key:value pair#
A pair used in dictionaries where a key is mapped to a value.
student = {"name": "Alice", "age": 20}
# "name" is the key, "Alice" is the value
Learn more in Chapter 3 | Back to Top
L#
len(…)#
Returns the length (number of items) of the given input.
len("hello") # 5
Learn more in Chapter 2 | Back to Top
list()#
Creates a list object, often by converting another sequence.
chars = list("hello")
print(chars) # ['h', 'e', 'l', 'l', 'o']
Learn more in Chapter 3 | Back to Top
list.append()#
Adds an element to the end of a list.
nums = [1, 2, 3]
nums.append(4)
print(nums) # [1, 2, 3, 4]
Learn more in Chapter 3 | Back to Top
list.copy()#
Creates a shallow copy of a list.
a = [1, 2, 3]
b = a.copy()
b.append(4)
print(a) # [1, 2, 3]
print(b) # [1, 2, 3, 4]
Learn more in Chapter 3 | Back to Top
list.index()#
Returns the index of the first occurrence of a value in a list.
fruits = ["apple", "banana", "cherry"]
print(fruits.index("banana")) # 1
Learn more in Chapter 3 | Back to Top
list.insert()#
Inserts an element at a specified position in a list.
nums = [1, 2, 3]
nums.insert(1, 10) # insert 10 at index 1
print(nums) # [1, 10, 2, 3]
Learn more in Chapter 3 | Back to Top
list.pop()#
Removes and returns an element from a list (last by default, or a specified index).
nums = [1, 2, 3]
nums.pop() # removes 3
nums.pop(0) # removes 1
Learn more in Chapter 3 | Back to Top
list.reverse()#
Reverses the elements of a list in place.
nums = [1, 2, 3]
nums.reverse()
print(nums) # [3, 2, 1]
Learn more in Chapter 3 | Back to Top
list.sort()#
Sorts a list in place.
nums = [3, 1, 2]
nums.sort()
print(nums) # [1, 2, 3]
Learn more in Chapter 3 | Back to Top
Logical (Boolean) Operators#
and
: True if both are true
or
: True if at least one is true
not
: True if input is false
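A brief sketch combining comparisons with each logical operator. The variables age and has_id are invented for this example.

```python
age = 20
has_id = True

can_enter = age >= 18 and has_id   # both conditions must hold
flagged = age < 18 or not has_id   # at least one condition must hold
missing_id = not has_id            # flips True to False

print(can_enter, flagged, missing_id)  # True False False
```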
Learn more in Chapter 2 | Back to Top
M#
math library#
A standard Python library that provides mathematical functions and constants.
import math
Learn more in Chapter 2 | Back to Top
math.ceil(…)#
Rounds the input up to the nearest whole number.
import math
math.ceil(3.1) # 4
Learn more in Chapter 2 | Back to Top
math.e#
The constant \(e\) (Euler’s number ≈ 2.71828).
import math
math.e # 2.71828...
Learn more in Chapter 2 | Back to Top
math.exp(…)#
Returns e to the power of the given number.
import math
math.exp(2) # 7.389...
Learn more in Chapter 2 | Back to Top
math.factorial(…)#
Returns the factorial of the input.
import math
math.factorial(5) # 120
Learn more in Chapter 2 | Back to Top
math.floor(…)#
Rounds the input down to the nearest whole number.
import math
math.floor(3.9) # 3
Learn more in Chapter 2 | Back to Top
math.log(…)#
Returns the logarithm of a number. Uses base e (natural log) if no base is specified.
import math
math.log(8, 2) # 3.0
Learn more in Chapter 2 | Back to Top
math.pi#
The constant \(\pi\) (pi ≈ 3.14159).
import math
math.pi # 3.14159...
Learn more in Chapter 2 | Back to Top
math.sqrt(…)#
Returns the square root of the input.
import math
math.sqrt(16) # 4.0
Learn more in Chapter 2 | Back to Top
matplotlib#
A popular Python library for creating static, animated, and interactive visualizations.
Documentation
Learn more in Chapter 7 | Back to Top
matplotlib.pyplot#
A submodule of matplotlib that provides a MATLAB-like interface for plotting.
Usually imported as:
import matplotlib.pyplot as plt
Learn more in Chapter 7 | Back to Top
max(…)#
Returns the maximum (largest) value from the given inputs.
max(3, 9, 5) # 9
Learn more in Chapter 2 | Back to Top
Membership Operators#
Check whether an element is in a sequence.
in
: Returns True if an element exists.
not in
: Returns True if an element does not exist.
fruits = ["apple", "banana"]
print("apple" in fruits) # True
print("cherry" not in fruits) # True
Learn more in Chapter 3 | Back to Top
min(…)#
Returns the minimum (smallest) value from the given inputs.
min(3, 9, 5) # 3
Learn more in Chapter 2 | Back to Top
N#
NameError#
Error raised when you try to use a variable or function name that has not been defined.
print(x) # NameError: name 'x' is not defined
Learn more in Chapter 2 | Back to Top
NoneType#
The data type of the special value None, which represents “nothing” or “no value.”
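One common place None shows up is as the return value of a function that does not explicitly return anything, such as print(). A minimal sketch:

```python
# print() displays its input but returns None.
result = print("hello")

print(type(result))    # <class 'NoneType'>
print(result is None)  # True
```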
Learn more in Chapter 3 | Back to Top
norm.ppf()#
A scipy library function that calculates percentiles of the normal distribution. Arguments of the function include q, the lower tail probability; loc, the mean; and scale, the standard deviation of the normal distribution.
from scipy.stats import norm
# percentiles needed for a 90% confidence interval
L0 = norm.ppf(0.05, loc=0, scale=1/2)
U0 = norm.ppf(0.95, loc=0, scale=1/2)
Learn more in Chapter 12 | Back to Top
np.append(array, value)#
A numpy function that creates a new array from an existing one (array), attaching value to the end. The original array is left unchanged.
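A minimal sketch showing that np.append returns a new array rather than changing the one passed in:

```python
import numpy as np

arr = np.array([1, 2, 3])
longer = np.append(arr, 4)

print(longer)  # [1 2 3 4]
print(arr)     # [1 2 3] (the original array is not modified)
```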
Learn more in Chapter 4 | Back to Top
np.arange()#
Creates an array with evenly spaced values, similar to range().
np.arange(0, 10, 2) # array([0, 2, 4, 6, 8])
Learn more in Chapter 3 | Back to Top
np.array()#
Creates a numpy array from a list (or other sequence).
import numpy as np
arr = np.array([1, 2, 3])
Learn more in Chapter 3 | Back to Top
np.average()#
Returns the average (mean) of array elements.
arr = np.array([1, 2, 3])
print(np.average(arr)) # 2.0
Learn more in Chapter 3 | Back to Top
np.column_stack()#
Stacks arrays as columns into a 2D array.
a = np.array([1, 2])
b = np.array([3, 4])
print(np.column_stack((a, b)))
Learn more in Chapter 3 | Back to Top
np.concatenate()#
Concatenates (combines) a sequence of arrays into one array. The arrays are passed together as a single tuple or list.
np.concatenate((array1, array2))
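A short sketch with two made-up arrays. Note that the arrays to join are wrapped together in one tuple, which is the first argument of np.concatenate.

```python
import numpy as np

first = np.array([1, 2])
second = np.array([3, 4, 5])

# The arrays are supplied as a single tuple.
combined = np.concatenate((first, second))
print(combined)  # [1 2 3 4 5]
```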
Learn more in Chapter 11 | Back to Top
np.empty(0)#
Creates an empty array (an array with zero elements).
Learn more in Chapter 4 | Back to Top
np.log()#
Computes the natural logarithm (log with base e) of array elements.
arr = np.array([1, np.e, np.e**2])
print(np.log(arr)) # [0. 1. 2.]
Learn more in Chapter 3 | Back to Top
np.max()#
Returns the maximum value in an array.
arr = np.array([1, 5, 3])
print(np.max(arr)) # 5
Learn more in Chapter 3 | Back to Top
np.mean()#
Calculates the mean of elements in an array.
a = [10, 20, 30, 40, 50]
np.mean(a) # 30.0
Learn more in Chapter 10 | Back to Top
np.min()#
Returns the minimum value in an array.
arr = np.array([1, 5, 3])
print(np.min(arr)) # 1
Learn more in Chapter 3 | Back to Top
np.ones()#
Creates an array filled with ones.
print(np.ones((2, 3)))
Learn more in Chapter 3 | Back to Top
np.percentile()#
Takes in an array of data and computes its q-th percentile.
np.percentile(array, q=97.5)
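A small worked sketch on invented data, relying on np.percentile's default linear interpolation between data points:

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])

median = np.percentile(data, 50)          # 3.0
lower_quartile = np.percentile(data, 25)  # 2.0

print(median, lower_quartile)
```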
Learn more in Chapter 12 | Back to Top
np.power()#
Raises array elements to a power.
arr = np.array([1, 2, 3])
print(np.power(arr, 2)) # [1 4 9]
Learn more in Chapter 3 | Back to Top
np.random.binomial()#
Randomly draws samples from a binomial distribution, where n is the number of trials, p is the probability of success on each trial, and size is the number of samples to draw.
np.random.binomial(n, p, size)
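As a sketch, the call below simulates 1000 repetitions of flipping a fair coin 10 times and records the number of heads each time. The seed value is arbitrary and only makes the run repeatable.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, chosen only for reproducibility

# Each of the 1000 entries is a count of successes out of 10 trials.
heads_counts = np.random.binomial(n=10, p=0.5, size=1000)

print(heads_counts[:5])
print(heads_counts.mean())  # close to n * p = 5
```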
Learn more in Chapter 11 | Back to Top
np.random.choice()#
Allows for random sampling from a given array of items. It can also take an integer argument, a, and randomly sample integers from the range given by np.arange(a). By default, the sampling is with replacement, but this can be changed by setting the parameter replace to False.
import numpy as np
letters = ['a', 'b', 'c', 'd', 'e']
random_letters_with_replacement = np.random.choice(letters, size=3)
print(f"Randomly chosen letters with replacement: {random_letters_with_replacement}") # Randomly chosen letters with replacement: ['a' 'c' 'c']
random_letters_without_replacement = np.random.choice(letters, size=3, replace=False)
print(f"Randomly chosen letters without replacement: {random_letters_without_replacement}") # Randomly chosen letters without replacement: ['e' 'd' 'b']
# Choose 5 random integers from 0 up to (but not including) 5
random_integers = np.random.choice(5, size=5)
print(f"Random integers up to 5: {random_integers}") # Random integers up to 5: [1 4 2 0 1]
Learn more in Chapter 8 | Back to Top
np.random.choice(…)#
With no size argument, np.random.choice outputs exactly one item from the input sequence, selecting it uniformly at random, so every item has an equal chance of being chosen.
Learn more in Chapter 4 | Back to Top
np.random.permutation()#
Returns a random permutation (i.e., a shuffle) of a provided sequence or range (x).
np.random.permutation(x)
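A minimal sketch: passing an integer shuffles np.arange of that integer, while passing a sequence returns a shuffled copy and leaves the original alone. The names and seed are chosen only for this example.

```python
import numpy as np

np.random.seed(1)  # arbitrary seed, chosen only for reproducibility

shuffled_range = np.random.permutation(5)       # some ordering of 0, 1, 2, 3, 4
names = ["Alice", "Bob", "Eve"]
shuffled_names = np.random.permutation(names)   # shuffled copy as an array

print(shuffled_range)
print(shuffled_names)
print(names)  # the original list keeps its order
```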
Learn more in Chapter 11 | Back to Top
np.random.seed()#
To get reproducible results from functions that use the np.random module across different executions, it is advisable to set a seed. The seed initializes the pseudo-random number generator, so the ‘randomized’ numbers produced by numpy’s random functions are reproducible whenever the same seed value is used.
# Set a specific seed value
np.random.seed(42)
# Generates the same random numbers between 0 to 5 in different runs
print(f"Five random numbers between 0 to 5: {np.random.choice(5, size=5)}") # Five random numbers between 0 to 5: [3 4 2 4 4]
Learn more in Chapter 8 | Back to Top
np.random.uniform()#
A function that randomly generates continuous values from a uniform distribution. The function takes in values for low and high, as well as size (the number of samples to draw), and draws from the continuous uniform probability distribution on the half-open interval [low, high).
np.random.uniform(low = 1, high = 6, size = 10)
Learn more in Chapter 10 | Back to Top
np.reshape()#
Reshapes an array without changing its data.
arr = np.arange(6)
print(np.reshape(arr, (2, 3)))
Learn more in Chapter 3 | Back to Top
np.row_stack()#
Stacks arrays vertically (row by row). In NumPy 2.0 and later, np.row_stack is deprecated in favor of the equivalent np.vstack.
a = np.array([1, 2])
b = np.array([3, 4])
print(np.row_stack((a, b)))
Learn more in Chapter 3 | Back to Top
np.sort()#
Returns a sorted copy of an array. By default, it sorts in ascending order along the last axis. For descending order, reverse the result with slicing ([::-1]).
arr = np.array([3, 1, 2])
print(np.sort(arr)) # [1 2 3]
print(np.sort(arr)[::-1]) # [3 2 1]
Learn more in Chapter 3 | Back to Top
np.sqrt()#
Returns the square root of array elements.
arr = np.array([1, 4, 9])
print(np.sqrt(arr)) # [1. 2. 3.]
Learn more in Chapter 3 | Back to Top
np.std()#
Calculates the standard deviation of elements in an array.
a = [10, 20, 30, 40, 50]
np.std(a) # 14.14...
Learn more in Chapter 10 | Back to Top
np.sum()#
Returns the sum of array elements.
arr = np.array([1, 2, 3])
print(np.sum(arr)) # 6
Learn more in Chapter 3 | Back to Top
np.zeros()#
Creates an array filled with zeros.
print(np.zeros((2, 3)))
Learn more in Chapter 3 | Back to Top
numpy library#
A popular Python library for numerical computing. Provides arrays, mathematical functions, and tools for linear algebra, statistics, and more. Often imported as import numpy as np.
Learn more in Chapter 3 | Back to Top
O#
obj.tolist()#
Converts a pandas Series, Index, or a NumPy array into a regular Python list.
import pandas as pd
df = pd.DataFrame({'Name': ["Alice", "Bob"], 'Age': [24, 22]})
obj = df.columns
obj.tolist() # ['Name', 'Age']
Learn more in Chapter 5 | Back to Top
P#
pandas library#
The pandas library is a Python package built on top of numpy for data analysis and manipulation.
import pandas as pd
Learn more in Chapter 5 | Back to Top
pandas.DataFrame.groupby()#
Groups the records of a DataFrame based on one or more criteria so that different aggregate functions can be applied to each group.
import pandas as pd
# Create a sample dataframe
data = {
    'Course Name': ['CS120', 'DATA118', 'CS221', 'DATA211', 'DATA259', 'CS340'],
    'Level': ['I', 'I', 'II', 'II', 'II', 'III'],
    'Size': [54, 29, 35, 55, 118, 67]
}
course_df = pd.DataFrame(data)
# Group courses by level
level_groups = course_df.groupby('Level')
# Print the average size for Level I courses
print(level_groups.get_group('I')[['Size']].mean()) #Size 41.5
Learn more in Chapter 8 | Back to Top
pd.DataFrame(…)#
Creates a 2D tabular data structure in pandas from data like arrays, lists, or dictionaries, with labeled rows and columns.
import pandas as pd
Students = pd.DataFrame({'Name':["Alice", "Bob", "Eve"], 'Major':["Math", "Music", "History"]}) # creates a DataFrame with two columns containing information about student's names and majors.
Learn more in Chapter 5 | Back to Top
pd.Index(…)#
Creates an immutable, ordered collection of labels used as the index of a pandas Series or DataFrame.
import pandas as pd
index = pd.Index([10, 20, 30]) # creates an Index with labels 10, 20, 30
Learn more in Chapter 5 | Back to Top
pd.merge(df_left, df_right)#
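Combines two DataFrames, df_left and df_right, into a new DataFrame by matching rows on the column names they have in common. By default, only rows whose values in those columns appear in both DataFrames are kept, and the row order of df_left is preserved.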
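A small sketch with two invented tables that share a Name column. With the default settings, only names present in both tables survive the merge.

```python
import pandas as pd

# Hypothetical data for illustration only.
students = pd.DataFrame({"Name": ["Alice", "Bob", "Eve"],
                         "Major": ["Math", "Music", "History"]})
grades = pd.DataFrame({"Name": ["Bob", "Alice", "Dan"],
                       "Grade": [88, 95, 70]})

# Rows are matched on the shared 'Name' column; Eve and Dan are dropped.
merged = pd.merge(students, grades)
print(merged)
```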
Learn more in Chapter 6 | Back to Top
pd.RangeIndex(start=0, stop, step=1)#
Creates a sequence of integers to be used as a default index in a pandas DataFrame or Series.
import pandas as pd
index = pd.RangeIndex(start=5, stop=10, step=2) # creates an index: 5, 7, 9
Learn more in Chapter 5 | Back to Top
pd.read_csv(file_path)#
Reads data from a CSV file, either stored on your computer or accessible via a URL, into a pandas DataFrame.
import pandas as pd
planets_df = pd.read_csv("planets.csv") # reads the CSV file 'planets.csv' into a DataFrame.
Learn more in Chapter 5 | Back to Top
pd.Series(…)#
Creates a one-dimensional labeled array in pandas. It can be made from a list, array, or dictionary.
import pandas as pd
Even_numbers = pd.Series([2,4,6,8]) # creates a Series object consisting of the integers 2, 4, 6, 8.
Learn more in Chapter 5 | Back to Top
plt.bar()#
Creates a vertical bar chart.
plt.bar(["A","B","C"], [3,5,2])
plt.show()
Learn more in Chapter 7 | Back to Top
plt.barh()#
Creates a horizontal bar chart.
Docs
Learn more in Chapter 7 | Back to Top
plt.boxplot()#
Box-and-whisker plots for distributions.
Docs
Learn more in Chapter 7 | Back to Top
plt.colorbar()#
Adds a color scale bar to a plot.
Docs
Learn more in Chapter 7 | Back to Top
plt.figure()#
Creates a new plotting figure.
Docs
Learn more in Chapter 7 | Back to Top
plt.hist()#
Creates a histogram to show a variable’s distribution.
plt.hist([1,1,2,3,3,3,4])
plt.show()
Learn more in Chapter 7 | Back to Top
plt.legend()#
Adds a legend to a plot.
Docs
Learn more in Chapter 7 | Back to Top
plt.margins()#
Adjusts plot margins/padding.
Docs
Learn more in Chapter 7 | Back to Top
plt.pie()#
Creates a pie chart.
Docs
Learn more in Chapter 7 | Back to Top
plt.plot()#
Creates a line plot.
plt.plot([0,1,2], [0,1,4])
plt.show()
Learn more in Chapter 7 | Back to Top
plt.scatter()#
Creates a scatter plot of two variables.
plt.scatter([1,2,3], [4,5,6])
plt.show()
Learn more in Chapter 7 | Back to Top
plt.show()#
Displays all active plots.
Docs
Learn more in Chapter 7 | Back to Top
plt.stackplot()#
Stacked area plot.
Docs
Learn more in Chapter 7 | Back to Top
plt.subplots()#
Creates a figure with one or more subplots, returning (fig, ax).
Docs
Learn more in Chapter 7 | Back to Top
plt.tight_layout()#
Automatically adjusts subplot spacing to prevent overlap.
Docs
Learn more in Chapter 7 | Back to Top
plt.title()#
Adds a title to a plot.
Docs
Learn more in Chapter 7 | Back to Top
plt.xlabel() / plt.ylabel()#
Add a label to the x-axis (plt.xlabel()) or the y-axis (plt.ylabel()).
Docs
Learn more in Chapter 7 | Back to Top
plt.xticks()#
Sets tick positions or labels on the x-axis.
Docs
Learn more in Chapter 7 | Back to Top
plt.ylabel()#
Add y-axis labels.
Docs
Learn more in Chapter 7 | Back to Top
print(…)#
Displays the input on the console.
print("hello") # hello
Learn more in Chapter 2 | Back to Top
R#
range()#
Creates a sequence of numbers, typically used in loops.
for i in range(3):
    print(i) # 0, 1, 2
Learn more in Chapter 3 | Back to Top
round(…)#
Rounds a number to the nearest integer (or to a specified number of decimal places if given).
round(3.14159, 2) # 3.14
Learn more in Chapter 2 | Back to Top
S#
seaborn#
A statistical visualization library built on top of matplotlib.
It provides high-level plotting functions that make complex plots easier to create.
import seaborn as sns
sns.scatterplot(x="col1", y="col2", data=df)
Learn more in Chapter 7 | Back to Top
Series.isin()#
Checks whether each element in a Series is contained in a given list, set, or other sequence.
Returns a Boolean Series with True for elements that are present and False otherwise.
import pandas as pd
s = pd.Series([1, 2, 3, 4, 5])
s.isin([2, 4, 6])
Learn more in Chapter 14 | Back to Top
Series.str.lower()#
Converts all string values in a Series to lowercase.
This method works element-wise across the entire Series without needing .apply().
s = pd.Series(["Apple", "BANANA", "Cherry"])
s.str.lower()
Learn more in Chapter 14 | Back to Top
stats.norm()#
Creates a normal continuous distribution object with mean (loc) and standard deviation (scale).
stats.norm(loc, scale)
Learn more in Chapter 10 | Back to Top
str()#
Converts a value into a string.
str(123) # '123'
Learn more in Chapter 2 | Back to Top
string.lower()#
Returns the string in all lowercase letters.
"HELLO".lower() # 'hello'
Learn more in Chapter 2 | Back to Top
string.replace('old', 'new')#
Returns a copy of the string with all 'old' substrings replaced by 'new'.
"hello world".replace("world", "Python")
# 'hello Python'
Learn more in Chapter 2 | Back to Top
string.strip()#
Returns the string with whitespace removed from the beginning and end.
" hello ".strip() # 'hello'
Learn more in Chapter 2 | Back to Top
string.upper()#
Returns the string in all uppercase letters.
"hello".upper() # 'HELLO'
Learn more in Chapter 2 | Back to Top
sum(…)#
Returns the sum of all values in a given iterable (like a list).
sum([1, 2, 3]) # 6
Learn more in Chapter 2 | Back to Top
SyntaxError#
Error raised when Python code is written incorrectly and does not follow Python’s rules.
if True print("hi") # SyntaxError
Learn more in Chapter 2 | Back to Top
T#
type(…)#
Returns the data type of the input.
type(3.14) # <class 'float'>
Learn more in Chapter 2 | Back to Top
TypeError#
Error raised when an operation or function is applied to a value of the wrong type.
"5" + 2 # TypeError
Learn more in Chapter 2 | Back to Top
V#
ValueError#
Error raised when a function receives an argument of the right type but an inappropriate value.
int("hello") # ValueError
Learn more in Chapter 2 | Back to Top
View Object#
A special object returned by certain dictionary methods such as .keys(), .values(), and .items().
A view object acts like a dynamic “window” into the dictionary’s contents:
It automatically updates if the dictionary is changed.
It is iterable, meaning you can loop over it just like a list.
If you need a fixed snapshot of the contents, you can convert it into a list with list().
student = {"name": "Alice", "age": 20}
keys_view = student.keys()
print(keys_view) # dict_keys(['name', 'age'])
student["major"] = "Data Science"
print(keys_view) # dict_keys(['name', 'age', 'major']) # updated automatically
print(list(keys_view)) # ['name', 'age', 'major']
Learn more in Chapter 3 | Back to Top
W#
While Statement#
A control statement that allows for repeated execution of an action as long as (that is, while) a specific condition is true.
The general format of a while loop is below.
while condition:
    action
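A concrete instance of this format, with names chosen only for illustration, counts how many times a number can be halved before it is no longer greater than 1:

```python
value = 16
halvings = 0

# The condition is re-checked before every pass through the loop.
while value > 1:
    value = value / 2
    halvings += 1

print(halvings)  # 4
```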
Learn more in Chapter 4 | Back to Top
Comment Character#
#
: Signals to Python that anything written after it on that line should be ignored
# This is a comment
Learn more in Chapter 2 | Back to Top