pd_extras.optimize package
Submodules
pd_extras.optimize.df_ops module
Optimize dataframe operations
- pd_extras.optimize.df_ops.get_rows(data: DataFrame, columns: list) → ndarray
Get iterable rows for the selected columns.
- Parameters:
data (pd.DataFrame) – Pandas dataframe to select columns from.
columns (list) – List of columns.
- Returns:
Numpy array containing iterable rows.
- Return type:
np.ndarray
>>> from pd_extras.optimize.df_ops import get_rows
>>> rows = get_rows(data=data, columns=list(columns))
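The example above leaves data and columns undefined. The following is a minimal, self-contained sketch of the documented behaviour. It does not call the library: get_rows_sketch is a hypothetical stand-in that assumes get_rows selects the given columns and returns them as a NumPy array, and the sample data is invented for illustration. The actual implementation may differ.

```python
import numpy as np
import pandas as pd


def get_rows_sketch(data: pd.DataFrame, columns: list) -> np.ndarray:
    """Hypothetical stand-in for get_rows (assumed behaviour, not the library's code)."""
    # Select the requested columns, then convert them to a 2-D NumPy array
    # in which each element along the first axis is one row.
    return data[columns].to_numpy()


# Invented sample data for illustration.
data = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})
rows = get_rows_sketch(data=data, columns=["a", "c"])

# Iterating over a 2-D array yields one row at a time.
pairs = [(int(a), int(c)) for a, c in rows]
```

Iterating over a plain array avoids building a Series per row, which is usually the motivation for converting before looping.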
- pd_extras.optimize.df_ops.select_columns_from_dataframe(data: DataFrame, columns: list) → DataFrame
Get a dataframe consisting of the selected columns of a dataframe. This was the fastest approach in some tests conducted;
df.iloc[indices, column_indices] was a close second.
- Parameters:
data (pd.DataFrame) – Dataframe to take columns from.
columns (list) – List of columns.
- Returns:
Dataframe with the selected columns.
- Return type:
pd.DataFrame
>>> from pd_extras.optimize.df_ops import select_columns_from_dataframe
>>> res = select_columns_from_dataframe(data=data, columns=list(columns))
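This page does not state which technique select_columns_from_dataframe uses internally. As a rough sketch in plain pandas, the snippet below shows label-based selection alongside the positional df.iloc alternative mentioned in the description, and checks that both produce the same dataframe. The sample data and variable names are assumptions for illustration, and no timing claim is made.

```python
import pandas as pd

# Invented sample data for illustration.
data = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})
columns = ["a", "c"]

# Label-based selection: pass the list of column names directly.
by_label = data[columns]

# Positional selection: translate names to integer positions, then use iloc.
column_indices = data.columns.get_indexer(columns)
by_position = data.iloc[:, column_indices]

# Both approaches should yield the same columns and values.
same = by_label.equals(by_position)
```

To compare speed on your own data, wrap each selection in timeit; results are likely to vary with dataframe shape and dtypes.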