
feature.get_stats()

Rasgo lets you retrieve the statistics for any feature stored in Rasgo directly from Python, without using the GUI. Once you have a feature object, call its get_stats() method.

feature = rasgo.get.feature(id)

fstats = feature.get_stats()

You can access an individual statistic directly as an attribute:

fstats.meanVal

Or convert the stats to a dictionary and look them up by key:

fstats_dict = fstats.dict()
fstats_dict['meanVal']
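The two access styles read the same value. The sketch below runs without a Rasgo connection by using `FakeStats`, a hypothetical stand-in for the object that feature.get_stats() returns; only the `meanVal` and `recCt` field names and the `.dict()` method come from this page.

```python
from dataclasses import dataclass, asdict


@dataclass
class FakeStats:
    """Hypothetical stand-in for the object returned by feature.get_stats()."""
    recCt: int
    meanVal: float

    def dict(self):
        # Mirrors the .dict() call shown above
        return asdict(self)


fstats = FakeStats(recCt=4, meanVal=2.5)
fstats_dict = fstats.dict()

# Attribute access and dictionary access read the same value
print(fstats.meanVal)
print(fstats_dict["meanVal"])
```

The dictionary form is convenient when you want to loop over every statistic or pass the stats to pandas.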

The available statistics are:

| Field Name | Statistic |
| --- | --- |
| recCt | Record count |
| distinctCt | Number of distinct values |
| nullRecCt | Number of null records |
| zeroValRecCt | Number of records with a zero value |
| meanVal | Mean |
| medianVal | Median |
| maxVal | Maximum |
| minVal | Minimum |
| sumVal | Sum |
| stdDevVal | Standard deviation |
| varianceVal | Variance |
| rangeVal | Range |
| skewVal | Skewness |
| kurtosisVal | Kurtosis |
| q1Val | 25th percentile |
| q3Val | 75th percentile |
| IQRVal | Interquartile range (IQR) |
| pct5Val | 5th percentile |
| pct95Val | 95th percentile |
| outlierCt | Total number of outliers |
| lowOutlier | Value below which a record is an outlier |
| highOutlier | Value above which a record is an outlier |
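To make the field definitions concrete, here is a rough local approximation of the same statistics computed with pandas. It is a sketch under stated assumptions, not Rasgo's implementation: the helper name `describe_series` is made up for illustration, the outlier thresholds use Tukey's 1.5 × IQR rule, the record count includes nulls, and the standard deviation and variance are pandas' sample estimates. Rasgo's own calculations may differ on any of these points.

```python
import pandas as pd


def describe_series(s: pd.Series) -> dict:
    """Approximate the statistics listed above for one numeric column."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    # Assumption: Tukey's 1.5 * IQR fences; Rasgo's actual rule is not documented here
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "recCt": int(len(s)),  # assumption: nulls are counted as records
        "distinctCt": int(s.nunique()),  # nulls are not counted as a distinct value
        "nullRecCt": int(s.isna().sum()),
        "zeroValRecCt": int((s == 0).sum()),
        "meanVal": s.mean(),
        "medianVal": s.median(),
        "maxVal": s.max(),
        "minVal": s.min(),
        "sumVal": s.sum(),
        "stdDevVal": s.std(),  # sample standard deviation (ddof=1)
        "varianceVal": s.var(),  # sample variance (ddof=1)
        "rangeVal": s.max() - s.min(),
        "skewVal": s.skew(),
        "kurtosisVal": s.kurt(),
        "q1Val": q1,
        "q3Val": q3,
        "IQRVal": iqr,
        "pct5Val": s.quantile(0.05),
        "pct95Val": s.quantile(0.95),
        "outlierCt": int(((s < low) | (s > high)).sum()),
        "lowOutlier": low,
        "highOutlier": high,
    }


values = pd.Series([0, 1, 2, 3, 4, 100, None])
stats = describe_series(values)
print(stats["outlierCt"], stats["lowOutlier"], stats["highOutlier"])
```

Comparing these local numbers with what get_stats() returns for the same column is a quick way to check which conventions Rasgo follows.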

If you have recently uploaded a pandas dataframe with publish.features_from_df, you can get these statistics for each column you published as a feature. publish.features_from_df returns a featureset:

featureset = rasgo.publish.features_from_df(df, dimensions, features, granularity, tags)

You can get the list of features contained in this featureset using get.features_by_featureset:

features = rasgo.get.features_by_featureset(featureset.id)

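You can then build a dataframe of stats for this list of features (or any other list of features you have created):

import pandas as pd

fstatlist = []
for f in features:
    # Convert each feature's stats to a dict and tag it with the feature name
    statdict = f.get_stats().dict()
    statdict['featureName'] = f.name
    fstatlist.append(statdict)

# One row per feature, one column per statistic
stats_df = pd.DataFrame(fstatlist)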
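The loop above can be tried without a Rasgo account using stand-in objects. In this sketch, `FakeStats` and `FakeFeature` are hypothetical classes that imitate only what the loop relies on (a `name` attribute, a `get_stats()` method, and a `.dict()` method on the result); the feature names and values are invented for illustration. It also indexes the result by feature name, which makes single lookups easier.

```python
import pandas as pd


class FakeStats:
    """Hypothetical stand-in for the object returned by feature.get_stats()."""

    def __init__(self, **values):
        self._values = values

    def dict(self):
        # Return a copy so callers can add keys without mutating the stats
        return self._values.copy()


class FakeFeature:
    """Hypothetical stand-in for a Rasgo feature with a name and get_stats()."""

    def __init__(self, name, **stats):
        self.name = name
        self._stats = FakeStats(**stats)

    def get_stats(self):
        return self._stats


features = [
    FakeFeature("price", recCt=100, meanVal=12.5, nullRecCt=0),
    FakeFeature("quantity", recCt=100, meanVal=3.2, nullRecCt=4),
]

rows = []
for f in features:
    row = f.get_stats().dict()
    row["featureName"] = f.name
    rows.append(row)

# One row per feature, indexed by name so individual stats are easy to look up
stats_df = pd.DataFrame(rows).set_index("featureName")
print(stats_df.loc["quantity", "nullRecCt"])
```

With real features, replacing the `features` list with the result of get.features_by_featureset should leave the rest of the loop unchanged.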


Last updated 3 years ago
