# Modeling your Data

## Best Table Formats for AI

We've found that Rasgo AI understands some table structures better than others. Examples:

* Aggregated Summary Tables
  * AI does best with wide, pre-aggregated tables with all possible dimension slicers and metrics included, at a standard date granularity (i.e. Daily, Weekly, Monthly, etc...)
* Dimension Tables
  * For dimension tables a standard star schema approach works best
  * It's important that the join columns are named the same in both the fact tables and dimension tables (i.e. PRODUCT\_ID in the fact\_sales table and PRODUCT\_ID in the dim\_product table). This removes guesswork when writing the joins.
* Fact Tables
  * For fact tables that should be aggregated when queried, ensure that aggregations to accurately calculate the metrics over time are as straightforward as possible (i.e. if you want to count unique transactions, include transaction ID).
* Header & Details Tables
  * AI can navigate between a header table and line item detail table (i.e. invoices) well, as long as the granularities of both tables are accurate in the notes

## Table Formats to Avoid

For certain table formats, AI needs extra guidance to query them well. This is fine if you are an expert on how the table is structured, but be careful sharing these tables with others that are not familiar with them.

1. Entity - Attribute - Value (EAV) tables
   * It's best to Unpivot these tables into wide columnar tables
2. Mapping tables
   * It's best to pre-join mapping tables into pre-aggregated summary tables to reduce the number of joins required for each analysis query
3. Hierarchy tables
   * It's best to flatten hierarchy tables into a single dimensional lookup table, indexed at the lowest granularity
4. Partitioned tables
   * For scenarios where you have N number of tables that are identical but partitioned by one ID (such as a Customer ID or a date), it's best to create a VIEW that UNIONS or JOINS all of these tables into a single object, and adds in the partition by column so it's easily queryable
   * This reduces duplicated documentation and enables additional notes on a single dataset to help AI understand the contents of all of the partitioned tables

## Important Notes for Certain Table Types

For certain types of tables, there are important notes that will help Rasgo AI understand your data better.

* Aggregated Summary tables
  * AI does best with these tables when there is a note describing the grain of the table that makes each row unique (for example, a 'grain' could be 'Daily per Product ID per Customer ID')
* As-Of tables
  * An "As-Of" metric table is a type of database table designed to store historical data points or snapshots of metrics as they were at specific points in time. This type of table is useful for tracking how metrics change over time and for performing historical analysis.
  * For this table type, it's important to identify the table as an as-of with a note, instructing the AI to always filter by date when querying the table unless it's specifically analyzing a change over a discrete time period
* Entity - Attribute - Value (EAV) tables
  * In this model, instead of having a separate column for each attribute, the attributes are stored as rows in a single table.
  * For this table type, it's important to describe how the tabel is modeled with a note
  * it's also important to have all distinct values of the 'Entity' column described in the column description so AI knows how to filter on this column's values
* Lookup tables
  * Lookup tables contain a set of predefined, often static, values that other tables can reference
  * For this table type, it's important to identify the table appropriately and describe how it should be joined with to accurately lookup reference values

## Additional Considerations

* If you have a fiscal calendar table and want AI to use it instead of the regular calendar, this should be mentioned in your Agent Instructions (including the fully qualified table name)
* Adding notes describing the refresh schedule and ETAs of the data can be helpful to the AI when it's answering questions about the data
* Pre-calculating important KPIs in summary tables can reduce compute time and increase accuracy
* For date columns that will be queried frequently and only have a specific time period available, ask AI to add a note describing the date range of that column


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.rasgoml.com/rasgo-docs/using-rasgo/modeling-your-data.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
