pd_extras.write package¶
Submodules¶
pd_extras.write.common module¶
Common variables for dataframe to database module
pd_extras.write.driver module¶
pd_extras.write.nosql_writer module¶
Write a pandas dataframe to a NoSQL database collection
- class pd_extras.write.nosql_writer.NoSQLDatabaseWriter(dbtype: str, host: str, dbname: str, user: str, password: str, port: int, dns_seed_list: bool = False)¶
Bases:
objectWriter class for NoSQL Database
- close_connection()¶
Close the current connection.
- delete_collection(collection_name: str)¶
Delete collection collection_name.
- Parameters:
collection_name (str) – Name of the collection.
- delete_database()¶
Drop the current database.
- get_document_count(collection_name: str)¶
Get number of documents in collection collection_name.
- Parameters:
collection_name (str) – Name of the collection.
- Returns:
Document count.
- Return type:
int
- get_list_of_collections()¶
List names of collections in the current database.
- Returns:
Collection names.
- Return type:
list[str]
- get_list_of_databases()¶
List names of databses in this connection.
- Returns:
Database names.
- Return type:
list[str]
- get_or_create_collection(collection_name: str)¶
Get object for the collection collection_name.
- Parameters:
collection_name (str) – Name of the collection.
- Returns:
Collection object.
- Return type:
pymongo.collection.Collection
- write_data_to_collection(collection_name: str, data: DataFrame)¶
Write dataframe data to the collection collection_name.
- Parameters:
collection_name (str) – Name of the collection.
data (pd.DataFrame) – Dataframe to write.
- Returns:
Object with ids of inserted documents.
- Return type:
pymongo.results.InsertManyResult
pd_extras.write.sql_writer module¶
Write a pandas dataframe to a SQL database table
- class pd_extras.write.sql_writer.SQLDatabaseWriter(dbtype: SQLDatabaseType, host: str, dbname: str, user: str, password: str, port: int)¶
Bases:
objectDatabase connection object for SQL databases Only database name dbname is stored. Rest of the credentials are used only to retrieved the connection. Two connections are created: one for the specific database dbname and another generic connection with no database selected. Be sure to call connobj.close_connection() after you are done.
- close_connection()¶
Close the current connection to the database
- delete_table(table_name: str)¶
Drop table table_name from the current database if it exists.
- Parameters:
table (Table) – Table to delete.
- get_column_info(table_name: str)¶
Get table schema from database.
- Parameters:
table_name (
str) – Name of the table in database.- Returns:
Pandas dataframe of table schema information.
- Return type:
pd.DataFrame
- has_table(table_name: str)¶
Check if the current database has table table_name.
- Parameters:
table_name (str) – Name of the table to check.
- Returns:
True if table_name exists in current database.
- Return type:
bool
- write_df_to_db(data: DataFrame, table_name: str, id_col: str = 'id', drop_first: bool = False, clean_columns: bool = True, max_length: int = 100)¶
Write data to Table table_name
- Parameters:
data (pd.DataFrame) – Pandas dataframe containing data to write.
dbname (str) – Name of the database.
table_name (str) – Name of table in the database.
id_col (str, optional) – Id column of table if exists, defaults to “id”. Should be set to None if not present in data.
drop_first (bool) – If True, table table_name in database will be attempted to drop first.
clean_columns (bool) – If True, trailing/leading whitespaces and “ will be stripped off column names, defaults to “True”.
max_length (int) – Maximum length of VARCHAR type columns, defaults to 100.
- Returns:
Cursor with result of query execution.
- Return type:
sqlalchemy.engine.cursor.CursorResult