version 0.3

Release date: September 2021

What's Changing

In September, we will release version 0.3 of pyrasgo. This version will eliminate the concept of FeatureSets from pyrasgo: all references to FeatureSet will be removed or replaced with DataSource.

Can you be more specific?

The following objects will be completely removed from pyrasgo:

  • The FeatureSet primitive class

  • All "feature_set" methods

  • All "feature_set" parameters in functions

See below for a full list of changes and migration path.

Why are you making this change?

Rasgo v0 started with two primitives: Features and Collections. Features were representations of columns in a table and Collections were joined Features. We quickly realized a need to link Features in a conceptual wrapper. This was convenient for a few reasons, but chief among them was: features are most often built in batches as columns in the same table and share table-level metadata. FeatureSets - i.e. Features that are built in the same table, using the same code, having the same grain - were born.

As Rasgo evolved to support more use cases, the DataSource primitive was born to store information about data transformation and feature lineage. The DataSource primitive has grown into such a core part of the Rasgo data model that it has usurped all the value FeatureSets originally provided. The two primitives are currently redundant and will block some exciting product features we have planned for Q4. For this reason, we will retire the FeatureSet primitive in pyrasgo with a 21 beer salute.

Features and Dimensions will now be built directly on DataSources. The DataSource primitive will expand to include 3 new attributes: columns, features, and dimensions. We find this direct access pattern much easier to use and hope our customers will agree. All existing features and dimensions have already been migrated to support this relationship and all new ones will build the relationship more naturally.

Excited to see for yourself? Run: rasgo.get.data_sources()
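As a rough illustration of the direct access pattern described above, the sketch below summarizes each DataSource by its three new attributes. It assumes `rasgo` is an already-authenticated pyrasgo connection and that `name`, `columns`, `features`, and `dimensions` are readable as plain Python attributes holding lists (or None); the helper name `summarize_data_sources` is hypothetical and not part of pyrasgo.

```python
def summarize_data_sources(rasgo):
    """Return one summary dict per DataSource visible to this connection.

    Assumption: each DataSource exposes `name`, `columns`, `features`
    and `dimensions` as attributes, and the last three are lists or None.
    """
    summaries = []
    for source in rasgo.get.data_sources():
        summaries.append({
            "name": source.name,
            # columns, features and dimensions are the three attributes
            # added to DataSource in version 0.3
            "columns": len(getattr(source, "columns", None) or []),
            "features": len(getattr(source, "features", None) or []),
            "dimensions": len(getattr(source, "dimensions", None) or []),
        })
    return summaries
```

Calling `summarize_data_sources(rasgo)` right after upgrading is one way to confirm that existing features and dimensions show up under the DataSources you expect.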

What can I do to prepare?

The first thing to note is that Rasgo will support version 0.2.5 for a few months, so there is no immediate need to upgrade if you have production code dependent on FeatureSets. We recommend you upgrade as soon as possible to benefit from the efficiencies this release will provide.

See below for a full list of changes and a migration path for each. If for any reason this path will not work in your codebase, please contact Rasgo as soon as possible for support.

Migration Path

FeatureSet Primitive:

| Current | Migrate To | Notes |
|---|---|---|
| FeatureSet() | DataSource() | The FeatureSet class will no longer be available. All publish functions that used to return data in a FeatureSet class will now return data in a DataSource class. See the Attribute Mapping table below to understand the changes. |

FeatureSet Functions:

| Current | Migrate To |
|---|---|
| get.feature_set() | get.data_source() |
| get.feature_sets() | get.data_sources(with_features_only=True) |
| get.features_by_featureset(feature_set_id) | data_source = get.data_source(id); data_source.features |
| get.columns_by_featureset(feature_set_id) | data_source = get.data_source(id); data_source.dimensions |
| get.feature_set_yml(feature_set_id) | get.features_yml(data_source_id) |
| prepare_feature_set_dict(feature_set_id) | data_source = get.data_source(id); data_source.to_dict() |
| prepare_feature_set_yml(feature_set_id, file_name, directory) | data_source = get.data_source(id); data_source.to_yml(file_name, directory) |
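The two-step replacements listed above (fetch the DataSource, then read an attribute) can be wrapped in small helpers while a codebase is being migrated. This is only a sketch: `rasgo` is assumed to be an authenticated pyrasgo connection, the helper names are hypothetical, and the attributes are assumed to be lists or None.

```python
def features_by_data_source(rasgo, data_source_id):
    """Stand-in for the removed get.features_by_featureset(feature_set_id)."""
    data_source = rasgo.get.data_source(data_source_id)
    return list(data_source.features or [])


def dimensions_by_data_source(rasgo, data_source_id):
    """Stand-in for the removed get.columns_by_featureset(feature_set_id)."""
    data_source = rasgo.get.data_source(data_source_id)
    return list(data_source.dimensions or [])
```

Note that both helpers take a DataSource id. Because a DataSource id is not a direct equivalent of the old FeatureSet id, existing FeatureSet ids cannot simply be passed through.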

Functions calling FeatureSet Parameters:

| Current | Migrate To |
|---|---|
| publish.features_from_source(feature_set_name) | None (parameter deprecated) |
| publish.features_from_source_code(feature_set_name, feature_set_table_name) | derivative_source_name, sql_view_name |
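If existing code builds keyword arguments for these publish calls in one place, the parameter changes above can be applied with a small translation step. The sketch below is illustrative only: the helper `migrate_publish_kwargs` is hypothetical, and it assumes the old and new parameters correspond in the order listed (feature_set_name to derivative_source_name, feature_set_table_name to sql_view_name).

```python
# Old parameter -> new parameter, per publish function.
# None means the parameter is dropped with no replacement.
# Assumption: the renames pair up in the order they are listed above.
PARAMETER_CHANGES = {
    "features_from_source": {
        "feature_set_name": None,
    },
    "features_from_source_code": {
        "feature_set_name": "derivative_source_name",
        "feature_set_table_name": "sql_view_name",
    },
}


def migrate_publish_kwargs(function_name, kwargs):
    """Return a copy of kwargs with FeatureSet parameters renamed or dropped."""
    rules = PARAMETER_CHANGES.get(function_name, {})
    migrated = {}
    for key, value in kwargs.items():
        if key not in rules:
            migrated[key] = value          # unrelated parameter, keep as-is
        elif rules[key] is not None:
            migrated[rules[key]] = value   # renamed parameter
        # else: deprecated parameter, silently dropped
    return migrated
```

Unknown function names pass through unchanged, so the helper is safe to apply to every publish call during migration.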

Functions returning FeatureSet responses:

| Current | Now Returns |
|---|---|
| publish.features_from_source() | DataSource |
| publish.features_from_source_code() | DataSource |
| publish.features() | DataSource |

Attribute Mapping

| Feature Set Attribute | Data Source Attribute |
|---|---|
| id | id (not a direct equivalent) |
| name | name |
| sourceTable | dataTable.tableName, dataTable.tableDatabase, dataTable.tableSchema, dataTable.fqtn |
| sourceCode | sourceCode |
| features | features (all attributes same) |
| dimensions | dimensions (all attributes same) |
| granularities | granularities (all attributes same) |
| dataSource | N/A |
| N/A | columns |
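For code that reads FeatureSet attributes by name, the attribute mapping above can be captured as a lookup table. This is a sketch under assumptions: the helper names are hypothetical, and `resolve` assumes the DataSource has been serialized to a nested dict (for example the output of `data_source.to_dict()`) whose keys match the attribute names in the mapping.

```python
# FeatureSet attribute -> dotted DataSource attribute path(s).
# An empty list means there is no DataSource equivalent.
ATTRIBUTE_MAP = {
    "id": ["id"],  # same name, but not a direct equivalent value
    "name": ["name"],
    "sourceTable": [
        "dataTable.tableName",
        "dataTable.tableDatabase",
        "dataTable.tableSchema",
        "dataTable.fqtn",
    ],
    "sourceCode": ["sourceCode"],
    "features": ["features"],
    "dimensions": ["dimensions"],
    "granularities": ["granularities"],
    "dataSource": [],
}


def data_source_paths(feature_set_attribute):
    """Return the DataSource path(s) that replace a FeatureSet attribute."""
    if feature_set_attribute not in ATTRIBUTE_MAP:
        raise KeyError(f"unknown FeatureSet attribute: {feature_set_attribute!r}")
    return ATTRIBUTE_MAP[feature_set_attribute]


def resolve(record, dotted_path):
    """Walk a nested dict along a dotted path such as 'dataTable.fqtn'."""
    value = record
    for part in dotted_path.split("."):
        value = value[part]
    return value
```

The `columns` attribute is new on DataSource and has no FeatureSet counterpart, so it does not appear as a key in the map.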
