publish.source_data()
Creates a Rasgo DataSource from a CSV file, a pandas DataFrame, or an existing Snowflake table.
Parameters
source_type
:str:
Valid values are "csv", "dataframe", or "table"
file_path
:str: (Optional)
Full path to the csv file to upload. Only pass when source_type="csv"
df
:pandas DataFrame: (Optional)
DataFrame to upload. Only pass when source_type="dataframe"
table
:str: (Optional)
Name of existing Snowflake table. Only pass when source_type="table"
data_source_name
:str: (Optional)
Name for this DataSource
data_source_domain
:str: (Optional)
Domain for this DataSource
data_source_table_name
:str: (Optional)
Name to give to the uploaded table in Snowflake. Only pass when source_type is "csv" or "dataframe"
parent_data_source_id
:str: (Optional)
ID of the parent DataSource for this DataSource
if_exists
:str: (Optional)
How to proceed when a DataSource already exists for this table. Valid values are "fail", "replace", or "append".
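The "Only pass when source_type=..." notes above can be checked before calling the function. The following is a minimal sketch of such a pre-flight check. check_source_args is a hypothetical helper, not part of the Rasgo SDK, and it assumes that the argument matching each source_type is required, which the parameter list implies but does not state.

```python
from typing import Any, Optional

# Hypothetical helper, not part of the Rasgo SDK. It mirrors the
# "Only pass when source_type=..." notes in the parameter list and ASSUMES
# the matching argument is required for its source_type.
ARG_FOR_SOURCE_TYPE = {"csv": "file_path", "dataframe": "df", "table": "table"}
VALID_IF_EXISTS = ("fail", "replace", "append")


def check_source_args(source_type: str, if_exists: Optional[str] = None, **kwargs: Any) -> None:
    """Raise ValueError when the arguments conflict with the documented rules."""
    if source_type not in ARG_FOR_SOURCE_TYPE:
        raise ValueError(f"source_type must be one of {sorted(ARG_FOR_SOURCE_TYPE)}")
    if if_exists is not None and if_exists not in VALID_IF_EXISTS:
        raise ValueError(f"if_exists must be one of {VALID_IF_EXISTS}")

    expected = ARG_FOR_SOURCE_TYPE[source_type]
    if kwargs.get(expected) is None:
        raise ValueError(f"{expected} must be passed when source_type={source_type!r}")

    # file_path, df and table each belong to exactly one source_type.
    for name in ARG_FOR_SOURCE_TYPE.values():
        if name != expected and kwargs.get(name) is not None:
            raise ValueError(f"{name} must not be passed when source_type={source_type!r}")

    # data_source_table_name only names an uploaded table (csv or dataframe).
    if source_type == "table" and kwargs.get("data_source_table_name") is not None:
        raise ValueError("data_source_table_name applies only to csv and dataframe uploads")
```

A call such as check_source_args("csv", file_path="/Users/me/Downloads/myfile.csv", if_exists="replace") returns silently, while passing both file_path and df raises a ValueError.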
Return Object
Rasgo DataSource
Sample Usage
Upload a csv:
source_type = "csv"
file_path = "/Users/me/Downloads/myfile.csv"
data_source_name = "My CSV Test"
data_source_table_name = "CSV_MYFILE_TEST_ONE"
# Pass arguments by keyword so each value reaches the intended parameter
datasource = rasgo.publish.source_data(
    source_type=source_type,
    file_path=file_path,
    data_source_name=data_source_name,
    data_source_table_name=data_source_table_name,
    if_exists="replace",
)
print('DataSource:', datasource)
Upload a DataFrame:
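source_type = "dataframe"
df = myPandasDf  # an existing pandas DataFrame
data_source_name = "My DF Test"
data_source_table_name = "DF_MYPANDAS_TEST_ONE"
# Pass arguments by keyword; positionally, df would land in file_path
datasource = rasgo.publish.source_data(
    source_type=source_type,
    df=df,
    data_source_name=data_source_name,
    data_source_table_name=data_source_table_name,
    if_exists="append",
)
print('DataSource:', datasource)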
Register a table:
source_type = "table"
table = "SFDATABASE.SFSCHEMA.EXISTING_SF_TBL"
data_source_name = "My Table Test"
# Pass arguments by keyword; positionally, table would land in file_path
datasource = rasgo.publish.source_data(
    source_type=source_type,
    table=table,
    data_source_name=data_source_name,
    if_exists="fail",
)
print('DataSource:', datasource)
Best Practices / Tips
TIP: Operating on existing sources
When re-loading data from a csv or DataFrame into an existing source, the data_source_table_name parameter must be included so that this function can associate the data with the correct existing table. Omitting this parameter will create a new DataSource.
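One way to avoid omitting the table name by accident is to route re-loads through a small wrapper that requires it. The following is a sketch only. reload_dataframe is a hypothetical helper, not part of the Rasgo SDK; its publish argument is assumed to be rasgo.publish, and the keyword names are taken from the Parameters section.

```python
from typing import Any


def reload_dataframe(publish: Any, df: Any, table_name: str, if_exists: str = "replace") -> Any:
    """Hypothetical wrapper (not part of the Rasgo SDK) for re-loading a DataFrame.

    `publish` is expected to be `rasgo.publish`; the keyword names are the ones
    listed in the Parameters section.
    """
    # Refuse to run without a table name, since omitting it creates a new DataSource.
    if not table_name:
        raise ValueError("table_name is required to target the existing table")
    return publish.source_data(
        source_type="dataframe",
        df=df,
        data_source_table_name=table_name,
        if_exists=if_exists,
    )
```

For example, reload_dataframe(rasgo.publish, df, "DF_MYPANDAS_TEST_ONE", if_exists="append") would append to the table created in the DataFrame sample above.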