rasgo.publish.df()
rasgo.publish.df(df=None, name=None, description=None, dataset_table_name=None, verbose=False)
Pushes a local pandas DataFrame into a Data Warehouse table, and registers the table as a Rasgo Dataset.
name | type | description |
---|---|---|
df | pandas DataFrame | DataFrame to publish as a Dataset |
name | Optional[str] | A name for the resulting Dataset (if not provided, a random string will be used) |
description | Optional[str] | A description for the resulting Dataset |
parents | Optional[List[Dataset]] | Parent Datasets that the resulting Dataset depends on, passed as a list of Dataset primitive objects |
tags | Optional[list] | List of custom strings to categorize the Dataset |
attributes | Optional[dict] | Dictionary with metadata about the Dataset |
fqtn | Optional[str] | Optionally specify the Fully Qualified Table Name here |
owners | Optional[list] | List of Rasgo user email addresses who are designated data owners |
verbose | Optional[bool] | If set to True, status statements will be printed to stdout during execution |
if_exists | Optional[str] | One of 'fail', 'append', or 'overwrite'. Controls what happens when the FQTN passed in fqtn refers to an existing Dataset. Default is 'overwrite' |
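
The three if_exists values can be pictured with a small local sketch. This is only an illustration of the semantics described in the table, written against in-memory pandas DataFrames; `apply_if_exists` is a hypothetical helper, not part of Rasgo, and the real behavior happens inside the Data Warehouse.

```python
import pandas as pd

VALID_IF_EXISTS = ("fail", "append", "overwrite")


def apply_if_exists(existing, new, if_exists="overwrite"):
    """Hypothetical helper: mimic fail / append / overwrite on local DataFrames.

    `existing` stands in for a table that may already exist (None if it does not).
    """
    if if_exists not in VALID_IF_EXISTS:
        raise ValueError(f"if_exists must be one of {VALID_IF_EXISTS}")
    if existing is None:
        # Nothing to collide with, so the new data simply becomes the table
        return new.copy()
    if if_exists == "fail":
        raise ValueError("Table already exists and if_exists='fail'")
    if if_exists == "append":
        # Add the new rows after the existing ones
        return pd.concat([existing, new], ignore_index=True)
    # 'overwrite': discard the existing rows and keep only the new data
    return new.copy()


existing_table = pd.DataFrame({"ID": [1, 2]})
new_rows = pd.DataFrame({"ID": [3]})

print(len(apply_if_exists(existing_table, new_rows, "append")))
print(len(apply_if_exists(existing_table, new_rows, "overwrite")))
```

Because the default is 'overwrite', passing an fqtn that points at an existing table replaces its contents unless you choose another value explicitly.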
Returns a Rasgo Dataset object.
import pandas as pd

# Assumes rasgo is an authenticated Rasgo connection created earlier in your session
p_df = pd.read_csv("some_file_path")

# Dealing with dates in a way that lets Snowflake automatically
# create the correct column types:
# If passing in a DATE
# p_df['DATE'] = pd.to_datetime(p_df['DATE']).dt.date
# If passing in a TIMESTAMP
# p_df['DATE'] = pd.to_datetime(p_df['DATE']).dt.tz_localize('UTC')

d1 = rasgo.publish.df(
    df=p_df,
    name="My-CSV-Dataset",
    description="I created a Dataset from the CSV some_file_path",
    fqtn="RASGO.PUBLIC.MY_TABLE_NAME",
    verbose=True,
    tags=['New', 'Two'],
    owners=['owner.one@example.com', 'owner.two@example.com'],  # placeholder addresses
    if_exists='overwrite'
)
print(d1)
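
The date hints in the sample above can be tried locally before publishing. The following is a minimal sketch using only pandas; the column names ORDER_DATE and CREATED_AT and the sample values are made up for illustration, and how Snowflake ultimately types the columns depends on the upload step, which is not shown here.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV used in the example
p_df = pd.DataFrame({
    "ORDER_DATE": ["2022-01-15", "2022-02-20"],
    "CREATED_AT": ["2022-01-15 08:30:00", "2022-02-20 17:45:00"],
})

# DATE column: parse the strings, then keep only the calendar date
# (each value becomes a Python datetime.date object)
p_df["ORDER_DATE"] = pd.to_datetime(p_df["ORDER_DATE"]).dt.date

# TIMESTAMP column: parse the strings, then mark the naive timestamps as UTC
p_df["CREATED_AT"] = pd.to_datetime(p_df["CREATED_AT"]).dt.tz_localize("UTC")

# Inspect the resulting column types before publishing
print(p_df.dtypes)
```

Note that tz_localize attaches a time zone to naive timestamps without shifting them; values that already carry a time zone would need tz_convert instead.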
Rasgo pushes your DataFrame to a table in Snowflake, so you can expect runtime to scale with the size of your DataFrame. You will receive a Dataset object once the registration process is complete.

Column names from your DataFrame may be modified to match Snowflake table naming best practices. Be sure to explore your new Dataset once it is registered.
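
To get a feel for which column names are likely to change, you can preview a cleaned-up version locally. The sketch below is an assumption, not Rasgo's actual renaming logic (which is not documented on this page): the hypothetical helper `preview_snowflake_names` uppercases each name, replaces runs of characters other than letters, digits, and underscores with a single underscore, and prefixes names that start with a digit.

```python
import re

import pandas as pd


def preview_snowflake_names(df):
    """Hypothetical helper: guess at Snowflake-friendly versions of column names.

    Returns a dict mapping each original column name to its cleaned form.
    The rules here are illustrative assumptions, not Rasgo's implementation.
    """
    renamed = {}
    for col in df.columns:
        # Replace each run of unsupported characters with one underscore
        cleaned = re.sub(r"[^0-9A-Za-z_]+", "_", str(col).strip()).upper()
        # Unquoted identifiers should not start with a digit
        if cleaned and cleaned[0].isdigit():
            cleaned = "_" + cleaned
        renamed[col] = cleaned
    return renamed


sample = pd.DataFrame(columns=["order id", "Total ($)", "2021_sales"])
for original, cleaned in preview_snowflake_names(sample).items():
    print(f"{original!r} -> {cleaned!r}")
```

Comparing this preview with the columns of the published Dataset is a quick way to spot names that were changed during registration.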