🐧 dbt Core
These instructions will help you set up an integration that syncs metadata from dbt Core to Rasgo via PyRasgo. If you just want to know what will be synced from dbt Core, scroll down to the bottom of this page.
Every time you call `dbt run`, dbt generates a manifest file with the details of your run. Rasgo then parses this file to ingest datasets, metrics, and associated metadata for you. Adapt this Python script to pick up the manifest file generated after each dbt run and push it to Rasgo. You can schedule it using your preferred orchestration system (Airflow, AWS Lambda, GitHub Actions, etc.).

```python
import json

import pyrasgo

rasgo = pyrasgo.connect(PYRASGO_API_KEY)

manifest_filepath = '/dbt/manifest_test.json'  # <-- location of your manifest.json file
with open(manifest_filepath) as manifest:
    rasgo.publish.dbt_manifest(manifest=json.load(manifest))
```
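Before pushing the manifest, it can help to sanity-check that the file exists and looks like a dbt manifest, so a scheduled job fails loudly instead of submitting a bad payload. A minimal sketch, assuming the standard dbt manifest layout; the `load_manifest` helper is illustrative and not part of PyRasgo:

```python
import json
from pathlib import Path


def load_manifest(manifest_filepath):
    """Load a dbt manifest.json and verify it has the sections Rasgo ingests."""
    path = Path(manifest_filepath)
    if not path.exists():
        # dbt writes the manifest at the end of a run; a missing file
        # usually means the run has not completed yet.
        raise FileNotFoundError(f"No manifest at {path} -- did `dbt run` complete?")
    manifest = json.loads(path.read_text())
    # dbt manifests keep models under "nodes"; sources have their own
    # top-level key. ("metrics" also appears in newer dbt versions.)
    for key in ("nodes", "sources"):
        if key not in manifest:
            raise ValueError(f"manifest is missing its '{key}' section")
    return manifest


# Usage sketch: pass the validated dict straight to PyRasgo, e.g.
#   rasgo.publish.dbt_manifest(manifest=load_manifest('/dbt/manifest_test.json'))
```

Wrapping the load step this way also gives your scheduler (Airflow, Lambda, etc.) a clear exception to retry or alert on.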
The import process runs asynchronously after the manifest file is submitted, so you do not need to keep the Python environment open. Imports can take a while, depending on how many dbt sources and models you have.
Once the Python script is running on a schedule, the integration is set up and good to go! Rasgo will ingest your dbt metadata every time the script runs, keeping it up to date in Rasgo.
Here is the metadata that Rasgo will ingest from dbt Core:
- dbt sources -> Rasgo datasets
  - Lineage
  - Columns
  - Column descriptions
- dbt models -> Rasgo datasets
  - Description
  - Lineage
  - Columns
  - Column descriptions
  - SQL
- dbt metrics -> Rasgo metrics
  - Metric definition
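If you want to preview what the integration will pick up, you can inspect those same sections of the manifest yourself. A sketch assuming the standard dbt manifest layout (top-level `nodes`, `sources`, and `metrics` keys, with column metadata under each entry's `columns` dict); `summarize_manifest` is a hypothetical helper, not a PyRasgo function:

```python
def summarize_manifest(manifest):
    """Count the dbt objects Rasgo would ingest from a manifest dict."""
    # Models live under "nodes" alongside tests/seeds, so filter by type.
    models = {
        name: node
        for name, node in manifest.get("nodes", {}).items()
        if node.get("resource_type") == "model"
    }
    return {
        "sources": len(manifest.get("sources", {})),
        "models": len(models),
        "metrics": len(manifest.get("metrics", {})),
        # Column descriptions sit under each node's "columns" dict.
        "documented_columns": sum(
            1
            for node in list(models.values()) + list(manifest.get("sources", {}).values())
            for col in node.get("columns", {}).values()
            if col.get("description")
        ),
    }


# Example with a toy manifest fragment:
toy = {
    "sources": {"source.proj.raw.orders": {"columns": {"id": {"description": "PK"}}}},
    "nodes": {"model.proj.orders": {"resource_type": "model", "columns": {"id": {}}}},
    "metrics": {"metric.proj.revenue": {}},
}
print(summarize_manifest(toy))
# → {'sources': 1, 'models': 1, 'metrics': 1, 'documented_columns': 1}
```

This is only a local preview; the actual mapping to Rasgo datasets and metrics happens server-side after `rasgo.publish.dbt_manifest` is called.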