Pandas/NumPy

Connect to Teradata using Pandas

pip install --upgrade 'sqlalchemy<2.0'
pip install teradatasqlalchemy
pip install python-dotenv

Note: I’m pinning sqlalchemy<2.0 to avoid the dreaded AttributeError: 'OptionEngine' object has no attribute 'execute' error, which occurs because pandas 1.x does not support SQLAlchemy 2.x. Once pandas 2.x is released, this should no longer be an issue.

import pandas as pd
from dotenv import dotenv_values
from sqlalchemy import create_engine

# Load the Teradata credentials from a local .env file
config = dotenv_values(".env")

engine = create_engine(f'teradatasql://{config["teradata-user"]}:{config["teradata-password"]}@{config["teradata-host"]}')
pd.read_sql("SELECT * FROM my_teradata_table", engine)

– sqlalchemy tip via Giorgos Myrianthous
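
One thing that can bite with the f-string URL above: if the password contains characters like "@", ":" or "/", the URL may be parsed incorrectly. Below is a minimal sketch of percent-encoding the credentials first. The user, password and host values are made up for illustration, and I'm assuming the Teradata dialect goes through SQLAlchemy's standard URL parsing.

```python
from urllib.parse import quote_plus

# Hypothetical credentials; in the snippet above they come from the .env file.
user = "analyst"
password = "p@ss:word/1"
host = "teradata.example.com"

# Percent-encode user and password so that "@", ":" and "/" are not
# mistaken for URL delimiters when the connection string is parsed.
url = f"teradatasql://{quote_plus(user)}:{quote_plus(password)}@{host}"
```

The resulting string can be passed to create_engine in place of the raw f-string.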

Slice by second (or third) level of a MultiIndex

df.loc[pd.IndexSlice[:, 't'], :]

– via this great StackOverflow answer
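
To see what that slice does, here is a small sketch on a toy frame. The level names ("group", "tag") and the values are made up for illustration.

```python
import pandas as pd

# Hypothetical two-level index; names and values are made up.
index = pd.MultiIndex.from_product(
    [["a", "b"], ["s", "t"]], names=["group", "tag"]
)
df = pd.DataFrame({"value": [1, 2, 3, 4]}, index=index)

# Keep every first-level key, but only rows whose second level equals "t".
second_level_t = df.loc[pd.IndexSlice[:, "t"], :]

# The same rows selected by level name, keeping that level in the result.
same_rows = df.xs("t", level="tag", drop_level=False)
```

For a third level, add another leading colon: pd.IndexSlice[:, :, 't'].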

Get duplicated index values - useful when debugging stuff like “ValueError: cannot reindex from a duplicate axis”

df[df.index.duplicated()]
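
Note that duplicated() defaults to keep="first", so the expression above only returns the second and later occurrences of each label. A quick sketch on a toy frame (the labels and values are made up) showing the difference with keep=False:

```python
import pandas as pd

# Toy frame with a repeated index label "b"; values are made up.
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=["a", "b", "b", "c"])

# Default keep="first": only the later repeats are flagged.
later_repeats = df[df.index.duplicated()]

# keep=False flags every row that shares its label with another row.
all_repeats = df[df.index.duplicated(keep=False)]
```

When debugging, keep=False is usually the more useful view, since it shows all the conflicting rows side by side.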

Connect to Teradata using Pandas

Slice by second (or third) level of a MultiIndex

Get duplicated index values - useful when debugging stuff like “ValueError: cannot reindex from a duplicate axis”

Drop duplicated index values

Drop/filter out rows based on a list of values

Display a correlation matrix using pandas

Return value counts for NumPy array

Avoid “TypeError: float() argument must be a string or a number, not ‘Period’” errors when plotting with pandas

Flatten hierarchical index (MultiIndex) in columns

Drop NaNs from specific columns

Trying to jsonify a numpy array and getting “TypeError: Object of type ndarray is not JSON serializable”

Setting a value on a slice

Set all dtypes for a DataFrame

Replace values in a column for rows identified by a .loc condition

Type annotations in for loops

Truncate floats to a fixed number of decimals, without having to mess with formatting and the like

Date/Time

Calculate the difference in months between two dates

Create a datetime Series from year/month numeric columns

Easily convert a datetime to its month representation

Detecting missing entries in a time series

Convert a DateTimeIndex to isoformat-like strings

Adding months, years to a datetime object

Group DateTimeIndex by year/month/etc, and optionally get the values

Replace year/month/etc in a DateTimeIndex with a new value

Excel via XlsxWriter

Write two frames in different sheets of the same Excel document, using XlsxWriter

Set column width when using XlsxWriter

Prevent XlsxWriter from automatically converting strings that start with the equals sign (=) to formulas

Other tips for not getting corrupt Excel files when using XlsxWriter

Set a temp dir

Use the in_memory option

Apply an async function to a dataframe
