# Features yml file

A feature is a reference to a column in a SQL table. Rasgo stores metadata about features to help users discover and consume them.

Feature metadata can be defined using a yml file or a dict.

#### Structure of a yml file

Features that reside in the same table belong to a DataSource. A yml file describes a DataSource (table) and the Features and Dimensions (columns) in it. Each yml file should describe a single DataSource.

#### Attributes

| Attribute Name | Description                                                                                                            | Value constraints                                                                                                                            |
| -------------- | ---------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| name           | Name of the DataSource.                                                                                                | (Optional) will default to the sourceTable name if not supplied                                                                              |
| sourceTable    | The Snowflake table these features are stored in                                                                       | Mandatory                                                                                                                                    |
| sourceType     | The type of data used to import this DataSource                                                                        | Mandatory: restricted value in list \[table, dataframe, csv]                                                                                 |
| sourceCode     | The SQL or Python code used to create this feature, if it is worth storing                                             | (Optional) free-form text field                                                                                                              |
| tags           | Free-form text tags to apply to all features                                                                           | (Optional) List of strings                                                                                                                   |
| attributes     | Free-form k:v dicts to apply to all features                                                                           | (Optional) List of dicts                                                                                                                     |
| dimensions:    | --                                                                                                                     | --                                                                                                                                           |
| columnName     | SQL column name of the dimension                                                                                       | <p>Mandatory: Standard SQL column rules: no spaces or special characters.</p><p>Best practice to CAPITALIZE all letters</p>                  |
| dataType       | SQL datatype of the column                                                                                             | <p>Mandatory: Standard SQL datatypes allowed:</p><p>string, int, float, date, bool</p>                                                       |
| granularity    | String describing the grain of this column. This will determine what other features can be joined with these features. | <p>Mandatory: any string.</p><p>Common datetime values: hour, day, week, month, quarter, year</p>                                            |
| features:      | --                                                                                                                     | --                                                                                                                                           |
| columnName     | SQL column name of the feature                                                                                         | <p>Mandatory: Standard SQL column rules: no spaces or special characters.</p><p>Best practice to CAPITALIZE all letters</p>                  |
| dataType       | SQL data type of the feature                                                                                           | <p>Mandatory: Standard SQL datatypes allowed:</p><p>string, int, float, date, bool</p>                                                       |
| displayName    | The name that will display in the Rasgo UI                                                                             | <p>(Optional) Any string value. Spaces and special characters allowed.</p><p>Best practice to avoid double quotes (") and semicolons (;)</p> |
| description    | A short description of the feature that will display in the Rasgo UI                                                   | <p>(Optional) Any string value. Spaces and special characters allowed.</p><p>Best practice to avoid double quotes (") and semicolons (;)</p> |
| status         | Status of the feature: Sandboxed or Productionized                                                                     | (Optional) restricted value in list: \[Productionized, Sandboxed]                                                                            |
| tags           | Free-form text tags to apply to this feature only                                                                      | (Optional) List of strings                                                                                                                   |
| attributes     | Free-form k:v dicts to apply to this feature only                                                                      | (Optional) List of dicts                                                                                                                     |
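The constraints in the table can be checked before a definition is submitted. The following is a minimal sketch, assuming the definition has already been loaded into a Python dict; `validate_datasource` is a hypothetical helper written for illustration and is not part of Rasgo or pyrasgo.

```python
# Allowed values copied from the attribute table above.
ALLOWED_SOURCE_TYPES = {"table", "dataframe", "csv"}
ALLOWED_DATA_TYPES = {"string", "int", "float", "date", "bool"}
ALLOWED_STATUSES = {"Productionized", "Sandboxed"}


def validate_datasource(spec: dict) -> list:
    """Hypothetical helper: return a list of problems found in a DataSource dict."""
    problems = []
    if not spec.get("sourceTable"):
        problems.append("sourceTable is mandatory")
    if spec.get("sourceType") not in ALLOWED_SOURCE_TYPES:
        problems.append("sourceType must be one of table, dataframe, csv")

    for section in ("dimensions", "features"):
        for column in spec.get(section, []):
            name = column.get("columnName", "<missing columnName>")
            if "columnName" not in column:
                problems.append(f"{section}: columnName is mandatory")
            if column.get("dataType") not in ALLOWED_DATA_TYPES:
                problems.append(
                    f"{section}.{name}: unsupported dataType {column.get('dataType')!r}"
                )
            # granularity is mandatory for dimensions only
            if section == "dimensions" and not column.get("granularity"):
                problems.append(f"{section}.{name}: granularity is mandatory")
            # status is optional, but restricted when present
            status = column.get("status")
            if section == "features" and status is not None and status not in ALLOWED_STATUSES:
                problems.append(f"{section}.{name}: unsupported status {status!r}")
    return problems
```

An empty list means the dict satisfies the constraints listed here; it does not guarantee that Rasgo will accept the definition.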

#### Sample file:

```yaml
name: "Customer Transactions"
sourceType: table
sourceTable: CUSTOMER_TRANSACTIONS
tags:
- apply_to_all_features
features:
- columnName: TRANS_AMT
  displayName: "Transaction Amount"
  dataType: float
  description: "Total of transaction in USD"
  status: Productionized
  tags:
  - USD
- columnName: ITEM_CT
  displayName: "Item Count"
  dataType: integer
  description: "Number of items in cart"
  status: Productionized
- columnName: STORE_NAME
  displayName: "Store Name"
  dataType: string
  description: "Name of store"
  status: Productionized
- columnName: COUPONS_USED
  displayName: "Coupons Used"
  dataType: bool
  description: "Were any coupons used"
  status: Productionized
dimensions:
- columnName: TRANS_DATE
  dataType: date
  granularity: day
- columnName: CUSTOMER_ID
  dataType: int
  granularity: customer
```
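Because feature metadata can also be defined as a dict, the same structure can be written directly in Python. The sketch below mirrors the sample file, abbreviated to two features; the variable name `datasource` is illustrative, and how the dict is handed to Rasgo is not covered on this page.

```python
# Dict form of the sample file, abbreviated to two features.
datasource = {
    "name": "Customer Transactions",
    "sourceType": "table",
    "sourceTable": "CUSTOMER_TRANSACTIONS",
    "tags": ["apply_to_all_features"],
    "features": [
        {
            "columnName": "TRANS_AMT",
            "displayName": "Transaction Amount",
            "dataType": "float",
            "description": "Total of transaction in USD",
            "status": "Productionized",
            "tags": ["USD"],
        },
        {
            "columnName": "COUPONS_USED",
            "displayName": "Coupons Used",
            "dataType": "bool",
            "description": "Were any coupons used",
            "status": "Productionized",
        },
    ],
    "dimensions": [
        {"columnName": "TRANS_DATE", "dataType": "date", "granularity": "day"},
        {"columnName": "CUSTOMER_ID", "dataType": "int", "granularity": "customer"},
    ],
}

# Column names grouped by role, in the order they were declared.
feature_columns = [f["columnName"] for f in datasource["features"]]
dimension_columns = [d["columnName"] for d in datasource["dimensions"]]
```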

{% hint style="info" %}
"dimensions" are index fields that will be used to join features to other features
{% endhint %}

{% hint style="info" %}
"granularity" can be any string that helps uniquely describe a dimension. Granularity is used to determine when dimensions across FeatureSets are of the same "grain" and can be joined to each other.

It is often helpful to think of granularity as a way to tag your features with taxonomy metadata. Consider:

Granularity for datetime fields may be logged as: year, quarter, month, day, second - to define the grain of a date or datetime column.

Granularity for geolocation data may be logged as: Country, State, CBG, FIPS, zipcode, latlong

Granularity for healthcare data may be logged as: patient, payer, provider, encounter
{% endhint %}
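To illustrate how granularity decides which dimensions line up, here is a minimal sketch that compares the granularity strings of two DataSource dicts. `shared_granularities` is a hypothetical helper, and the exact, case-sensitive string comparison is an assumption; Rasgo's own matching rules may differ.

```python
def shared_granularities(source_a: dict, source_b: dict) -> set:
    """Hypothetical helper: grains that appear in the dimensions of both sources."""
    grains_a = {d["granularity"] for d in source_a.get("dimensions", [])}
    grains_b = {d["granularity"] for d in source_b.get("dimensions", [])}
    # Assumption: two dimensions share a grain only when the strings match exactly.
    return grains_a & grains_b


transactions = {
    "dimensions": [
        {"columnName": "TRANS_DATE", "dataType": "date", "granularity": "day"},
        {"columnName": "CUSTOMER_ID", "dataType": "int", "granularity": "customer"},
    ]
}
# Illustrative second source, not taken from the documentation.
weather = {
    "dimensions": [
        {"columnName": "OBS_DATE", "dataType": "date", "granularity": "day"},
        {"columnName": "ZIP", "dataType": "string", "granularity": "zipcode"},
    ]
}

common = shared_granularities(transactions, weather)
```

Here only the `day` grain is shared, so under this sketch the two sources would line up on their date dimensions and nothing else.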

{% hint style="info" %}
The "sourceTable" param accepts either a bare table name or a fully qualified table name (DB.SCHEMA.TABLE). If the database and schema are not supplied, Rasgo uses your account's defaults. For most accounts these are Database = RASGO and Schema = PUBLIC.
{% endhint %}
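The fallback described above can be sketched as a small function. `qualify_table` is a hypothetical helper for illustration only; the defaults are the RASGO and PUBLIC values mentioned in the hint, and rejecting two-part names is an assumption of this sketch rather than documented Rasgo behavior.

```python
def qualify_table(source_table: str,
                  default_db: str = "RASGO",
                  default_schema: str = "PUBLIC") -> str:
    """Hypothetical helper: expand a bare table name to DB.SCHEMA.TABLE."""
    parts = source_table.split(".")
    if len(parts) == 1:
        # Bare table name: fill in the assumed account defaults.
        return f"{default_db}.{default_schema}.{parts[0]}"
    if len(parts) == 3:
        # Already fully qualified: return unchanged.
        return source_table
    # Assumption: anything else (e.g. SCHEMA.TABLE) is treated as an error here.
    raise ValueError(f"expected TABLE or DB.SCHEMA.TABLE, got {source_table!r}")


bare = qualify_table("CUSTOMER_TRANSACTIONS")
full = qualify_table("SALES.RETAIL.CUSTOMER_TRANSACTIONS")
```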

