
rasgo.publish.df()

rasgo.publish.df(df=None, name=None, description=None, parents=None, tags=None, attributes=None, fqtn=None, owners=None, verbose=False, if_exists='overwrite')
Pushes a local pandas DataFrame into a Data Warehouse table, and registers the table as a Rasgo Dataset.

Parameters:

| name | type | description |
| --- | --- | --- |
| df | pandas DataFrame | DataFrame to publish as a Dataset |
| name | Optional[str] | A name for the resulting Dataset (if not provided, a random string is used) |
| description | Optional[str] | A description for the resulting Dataset |
| parents | Optional[List[Dataset()]] | Parent Dataset dependencies for the resulting Dataset, given as a list of Dataset primitive objects |
| tags | Optional[list] | List of custom strings to categorize the Dataset |
| attributes | Optional[dict] | Dictionary with metadata about the Dataset |
| fqtn | Optional[str] | Fully Qualified Table Name of the target table |
| owners | Optional[list] | List of email addresses of the Rasgo users designated as data owners |
| verbose | Optional[bool] | If set to True, status statements are printed to stdout during execution |
| if_exists | Optional[str] | One of 'fail', 'append', or 'overwrite'. Tells the function what to do when the FQTN passed in refers to an existing Dataset. Default is 'overwrite' |

Returns:

A Rasgo Dataset

Sample usage:

import pandas as pd

# Assumes `rasgo` is an already-connected Rasgo client object
p_df = pd.read_csv("some_file_path")

# Handle dates so that Snowflake automatically creates the correct column types.
# If passing in a DATE:
#   p_df['DATE'] = pd.to_datetime(p_df['DATE']).dt.date
# If passing in a TIMESTAMP:
#   p_df['DATE'] = pd.to_datetime(p_df['DATE']).dt.tz_localize('UTC')

d1 = rasgo.publish.df(
    df=p_df,
    name="My-CSV-Dataset",
    description="I created a Dataset from the CSV some_file_path",
    fqtn="RASGO.PUBLIC.MY_TABLE_NAME",
    verbose=True,
    tags=['New', 'Two'],
    if_exists='overwrite'
)
print(d1)
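
The date handling mentioned in the comments above can be tried on its own before publishing. The following is a minimal sketch using only pandas; the column names ORDER_DATE and CREATED_AT and the sample values are hypothetical, and it does not call Rasgo at all.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
p_df = pd.DataFrame({
    "ORDER_DATE": ["2023-01-15", "2023-02-01"],
    "CREATED_AT": ["2023-01-15 08:30:00", "2023-02-01 17:45:00"],
})

# DATE: parse the strings, then keep only the calendar date
# (the column then holds Python datetime.date objects).
p_df["ORDER_DATE"] = pd.to_datetime(p_df["ORDER_DATE"]).dt.date

# TIMESTAMP: parse the strings, then mark the naive timestamps as UTC.
p_df["CREATED_AT"] = pd.to_datetime(p_df["CREATED_AT"]).dt.tz_localize("UTC")

print(p_df.dtypes)
```

Note that tz_localize attaches a time zone to naive timestamps without shifting them, so it is appropriate only if the source values are already in UTC.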

Tips:

Rasgo pushes your DataFrame to a table in Snowflake, so expect runtime to scale with the size of your DataFrame. You will receive a Dataset object once the registration process completes.
Column names from your DataFrame may be modified to match Snowflake table naming best practices. Be sure to explore your new Dataset once it's registered.
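
If you would rather control column names yourself than be surprised by the changes, one option is to normalize them before publishing. The sketch below is an assumption, not Rasgo's actual renaming rule: the helper to_snowflake_style is hypothetical and simply produces names that are valid unquoted Snowflake identifiers (letters, digits, and underscores, not starting with a digit).

```python
import re

import pandas as pd


def to_snowflake_style(name: str) -> str:
    """Hypothetical helper: upper-case a name and replace other characters with '_'."""
    # Collapse each run of characters outside [0-9A-Za-z_] into one underscore.
    cleaned = re.sub(r"[^0-9A-Za-z_]+", "_", name.strip())
    cleaned = cleaned.strip("_").upper()
    # Unquoted Snowflake identifiers must start with a letter or underscore.
    if cleaned and cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


# Illustrative DataFrame with awkward column names.
p_df = pd.DataFrame({"order id": [1], "Total ($)": [9.5], "2023 sales": [3]})

# rename() accepts a function and applies it to every column label.
renamed = p_df.rename(columns=to_snowflake_style)
print(list(renamed.columns))
```

Different input names can collapse to the same output name under this scheme, so check for duplicates in the renamed columns before publishing.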