feature.get_stats()
Rasgo makes it easy to get the stats for any feature stored in Rasgo from Python, without needing to use the GUI. Once you have a feature, you can get its stats by calling get_stats.
feature = rasgo.get.feature(id)
fstats = feature.get_stats()
You can access an individual stat directly as an attribute:
fstats.meanVal
Or convert the stats to a dictionary and look values up by field name:
fstats_dict = fstats.dict()
fstats_dict['meanVal']
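Once the stats are in a dictionary, derived measures are easy to compute. The sketch below is a minimal illustration, not part of the Rasgo API: the helper name completeness and the example values are hypothetical, and a plain dict stands in for the result of fstats.dict().

```python
def completeness(stats: dict) -> float:
    """Return the fraction of records that are non-null, given a stats dictionary."""
    rec_ct = stats.get("recCt") or 0
    null_ct = stats.get("nullRecCt") or 0
    if rec_ct == 0:
        # No records: report zero rather than divide by zero
        return 0.0
    return 1 - null_ct / rec_ct


# Hypothetical values standing in for feature.get_stats().dict()
example_stats = {"recCt": 200, "nullRecCt": 50, "meanVal": 3.2}
example_completeness = completeness(example_stats)
print(f"Non-null fraction: {example_completeness:.2f}")
```

Using .get with a fallback keeps the helper from raising if a field is missing or empty for a particular feature.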
The available statistics are:
Field Name | Statistic |
recCt | Record Count |
distinctCt | Number of Distinct Values |
nullRecCt | Number of Null Records |
zeroValRecCt | Number of Records with a Zero Value |
meanVal | Mean |
medianVal | Median |
maxVal | Maximum |
minVal | Minimum |
sumVal | Sum |
stdDevVal | Standard Deviation |
varianceVal | Variance |
rangeVal | Range |
skewVal | Skewness |
kurtosisVal | Kurtosis |
q1Val | 25th Percentile |
q3Val | 75th Percentile |
IQRVal | IQR |
pct5Val | 5th Percentile |
pct95Val | 95th Percentile |
outlierCt | Total number of outliers |
lowOutlier | Value below which a record is an outlier |
highOutlier | Value above which a record is an outlier |
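To make the field names concrete, the sketch below computes local analogues of several of these statistics for a pandas Series. It is an illustration only and not a description of how Rasgo computes them: the function name describe_column is hypothetical, the outlier bounds assume the common 1.5 x IQR rule, and the standard deviation, variance, skewness and kurtosis use pandas defaults, which may differ from Rasgo's definitions.

```python
import pandas as pd


def describe_column(values: pd.Series) -> dict:
    """Compute local analogues of the statistics in the table (illustration only)."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    # Assumption: outlier bounds follow the 1.5 * IQR rule
    low = q1 - 1.5 * iqr
    high = q3 + 1.5 * iqr
    return {
        "recCt": int(len(values)),
        "distinctCt": int(values.nunique()),  # nulls are not counted as a value
        "nullRecCt": int(values.isna().sum()),
        "zeroValRecCt": int((values == 0).sum()),
        "meanVal": values.mean(),
        "medianVal": values.median(),
        "maxVal": values.max(),
        "minVal": values.min(),
        "sumVal": values.sum(),
        "stdDevVal": values.std(),
        "varianceVal": values.var(),
        "rangeVal": values.max() - values.min(),
        "skewVal": values.skew(),
        "kurtosisVal": values.kurt(),
        "q1Val": q1,
        "q3Val": q3,
        "IQRVal": iqr,
        "pct5Val": values.quantile(0.05),
        "pct95Val": values.quantile(0.95),
        "outlierCt": int(((values < low) | (values > high)).sum()),
        "lowOutlier": low,
        "highOutlier": high,
    }


# Small made-up sample with one null and one extreme value
sample = pd.Series([0, 1, 2, 3, 4, None, 100], dtype="float64")
local_stats = describe_column(sample)
```

Comparing a local result like this with the values returned by get_stats can be a quick sanity check after uploading data, keeping in mind that the definitions may not match exactly.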
If you have recently uploaded a dataframe with publish.features_from_df, you can easily get these statistics for each column in a pandas dataframe. publish.features_from_df returns a featureset.
featureset = rasgo.publish.features_from_df(df, dimensions, features, granularity, tags)
You can get a list of features contained in this featureset using get.features_by_featureset.
features = rasgo.get.features_by_featureset(featureset.id)
You can then build a stats dataframe for this list of features (or any other list of features you’ve created) by collecting each feature’s stats dictionary:
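import pandas as pd

fstatlist = []
for f in features:
    # Convert each feature's stats to a dict and tag it with the feature name
    statdict = f.get_stats().dict()
    statdict['featureName'] = f.name
    fstatlist.append(statdict)
# One row per feature, one column per statistic
stats_df = pd.DataFrame(fstatlist)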
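The loop above needs a live Rasgo connection, so the sketch below exercises the same pattern offline with stand-in objects. StubStats, StubFeature and stats_frame are hypothetical names, not Rasgo classes, and the numbers are made up; the only behavior assumed from Rasgo is what the page already shows (a name attribute and get_stats().dict()).

```python
from dataclasses import dataclass, asdict

import pandas as pd


@dataclass
class StubStats:
    """Stand-in for the object returned by feature.get_stats()."""
    recCt: int
    nullRecCt: int
    meanVal: float

    def dict(self):
        return asdict(self)


@dataclass
class StubFeature:
    """Stand-in for a Rasgo feature: exposes a name and get_stats()."""
    name: str
    stats: StubStats

    def get_stats(self):
        return self.stats


def stats_frame(features) -> pd.DataFrame:
    """Build one row of statistics per feature, indexed by feature name."""
    rows = []
    for f in features:
        row = f.get_stats().dict()
        row["featureName"] = f.name
        rows.append(row)
    return pd.DataFrame(rows).set_index("featureName")


# Made-up features and values, for illustration only
stub_features = [
    StubFeature("temperature", StubStats(recCt=100, nullRecCt=5, meanVal=21.5)),
    StubFeature("humidity", StubStats(recCt=100, nullRecCt=40, meanVal=0.63)),
]
stats_by_feature = stats_frame(stub_features)

# Rank features by how many null records they contain
most_nulls_first = stats_by_feature.sort_values("nullRecCt", ascending=False)
```

Indexing the dataframe by feature name makes lookups such as stats_by_feature.loc["humidity", "meanVal"] straightforward, and sorting on a count column is a quick way to spot features that need cleaning.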