Data Onboarding Addendum

_Please see the accompanying demonstration video, linked here._

To import your data from a CSV file into Rasgo, you must first upload it as a Source. Click Sources on the left-hand side of the app to make sure you are on the Sources page at http://app.rasgoml.com/features.

Next, click “Create Source” in the upper right corner.

Then, under CSV Upload, click “Select File” and select the CSV you want to upload.

Once the data has been loaded into Rasgo, you can click “View Source Data” to view the data, set data types, and handle any data quality issues found.

You will then see any data quality resolutions Rasgo has identified.

Each row shows one column of data from the CSV file: the column name and the percentages of entries in that column that are distinct values, contain only blanks, or are missing.
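The per-column percentages described above can be approximated outside Rasgo with a short script. This is only a rough sketch: the function name `profile_columns` is hypothetical, and the definitions used here (missing means an empty or absent field, blank means whitespace only, distinct counts unique non-empty values) are assumptions, not Rasgo's documented rules.

```python
import csv
import io


def profile_columns(csv_text):
    """Return, per column, the percentage of distinct, blank, and missing entries.

    Assumed definitions (not taken from Rasgo):
      missing  = empty or absent field
      blank    = field made up only of whitespace
      distinct = unique values among the remaining entries
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    total = len(rows)
    profile = {}
    for name in reader.fieldnames or []:
        values = [row[name] for row in rows]
        missing = sum(1 for v in values if v is None or v == "")
        blank = sum(1 for v in values if v and v.strip() == "")
        distinct = len({v for v in values if v and v.strip() != ""})
        profile[name] = {
            "distinct_pct": 100.0 * distinct / total if total else 0.0,
            "blank_pct": 100.0 * blank / total if total else 0.0,
            "missing_pct": 100.0 * missing / total if total else 0.0,
        }
    return profile


sample = "city,temp\nBoston,50\nBoston,\nAustin,   \nDenver,61\n"
for column, stats in profile_columns(sample).items():
    print(column, stats)
```

Running a check like this on a CSV before uploading can help you anticipate which columns Rasgo is likely to flag.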

In addition, the first icon in the row shows whether Rasgo identified a data quality resolution for you to check or found no issues.

Clicking on the row allows you to see the details for that column.

First, Rasgo identifies the possible data types contained in this column (from left to right: string, binary, integer, float and date) and shows what percentage of the values in the column are compatible with each type.

If 100% of the values in the column are compatible with binary, float or string, Rasgo will automatically pick that type.

In cases where no data type has a 100% match, Rasgo will default to string, but will suggest another data type if it looks like a better fit.
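The selection rule above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the parsing checks, the order in which types are tried, and the names `compatibility` and `choose_type` are hypothetical and do not describe Rasgo's actual implementation.

```python
from datetime import datetime


def _parses(convert, value):
    """Return True if convert(value) succeeds without a ValueError."""
    try:
        convert(value)
        return True
    except ValueError:
        return False


# Assumed compatibility checks; Rasgo's real rules may differ.
CHECKS = {
    "binary": lambda v: v.strip().lower() in {"0", "1", "true", "false"},
    "integer": lambda v: _parses(int, v),
    "float": lambda v: _parses(float, v),
    "date": lambda v: _parses(lambda s: datetime.strptime(s, "%Y-%m-%d"), v),
}


def compatibility(values):
    """Percentage of values compatible with each candidate type."""
    total = len(values)
    pct = {"string": 100.0 if total else 0.0}  # any value can be read as text
    for name, check in CHECKS.items():
        count = sum(1 for v in values if check(v))
        pct[name] = 100.0 * count / total if total else 0.0
    return pct


def choose_type(values):
    """Return (chosen_type, suggestion).

    A non-string type is chosen only on a 100% match (tried in an assumed
    order). Otherwise the column defaults to string, with the best partial
    match offered as a suggestion.
    """
    pct = compatibility(values)
    for name in CHECKS:
        if pct[name] == 100.0:
            return name, None
    best = max(CHECKS, key=lambda name: pct[name])
    return "string", best if pct[best] > 0 else None


print(compatibility(["1", "2", "3.5", "x"]))
print(choose_type(["1", "2", "3.5", "x"]))
```

In this sketch a column holding `1`, `2`, `3.5` and `x` stays a string, with float offered as the suggestion because three of its four values parse as floats.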

Click any of the other data types to change the data type for that column.

Next, the four most common values are shown, along with any data quality issues for that column.

You can “Ignore” the error, or select the option “Replace With…” and enter a value before clicking “Save Changes”.
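Conceptually, the two options behave like the sketch below. The helper `resolve_issues` and its `is_issue` predicate are hypothetical names used only to illustrate the idea. This is not how Rasgo applies resolutions internally.

```python
def resolve_issues(values, is_issue, action="ignore", replacement=None):
    """Apply an "ignore" or "replace" resolution to a column's values.

    is_issue is a caller-supplied predicate marking the flagged values
    (an assumption made for this example).
    """
    if action == "ignore":
        # Ignoring leaves every value untouched.
        return list(values)
    if action == "replace":
        # Only the flagged values are swapped for the replacement.
        return [replacement if is_issue(v) else v for v in values]
    raise ValueError(f"unknown action: {action!r}")


column = ["1", "n/a", "3"]
print(resolve_issues(column, lambda v: not v.isdigit(), "replace", "0"))
```

Replacing only the flagged entries leaves the rest of the column unchanged, which is the behavior the “Replace With…” option suggests.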

You can accept all of the suggested resolutions by clicking the confirm all button on the right-hand side.

Once you have handled or decided to ignore the data quality issues, you are ready to create features from this source. To do this, click “Publish Features” in the upper right.

At the top of the box, Rasgo shows the hashtags that will be applied to the features you create from this source. By default, Rasgo will create a hashtag based on the original filename along with the date-time the source was created.

You can create additional hashtags by clicking the plus symbol on the right-hand side. Then type in any additional hashtags and click “Save Changes”.

As in the Source Profile, each row shows one column of data, with the column name, the name you want it to have in Rasgo, and any description.

The three boxes to the right of the column name specify how this column should be handled within Rasgo. The left box marks the column as a feature, so that it shows up on the Features page.

The right box tells Rasgo to ignore this column and not import it.

The middle box tells Rasgo that this column should be treated as a dimension. Dimensions are columns that Rasgo uses to join the features in this dataset with other datasets. Rasgo requires dimensions to have the same name in each feature set to allow this join to happen automatically. You can specify more than one dimension on a CSV to allow multiple different joins.
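To see why the dimension name has to match across feature sets, consider this minimal sketch of a join keyed on a shared column name. The function `join_on_dimension` and the sample data are hypothetical, and an inner join is assumed. Rasgo's own join logic is not documented here.

```python
def join_on_dimension(left_rows, right_rows, dimension):
    """Inner-join two lists of row dicts on a shared dimension name.

    Raises KeyError if either side lacks a column with that exact name,
    which is why the name must match in both feature sets.
    """
    index = {}
    for row in right_rows:
        index.setdefault(row[dimension], []).append(row)
    joined = []
    for row in left_rows:
        for match in index.get(row[dimension], []):
            joined.append({**row, **match})
    return joined


sales = [{"zip": "02139", "sales": 10}, {"zip": "73301", "sales": 7}]
weather = [{"zip": "02139", "temp": 50}]
print(join_on_dimension(sales, weather, "zip"))
```

If one file called the column `zip` and the other `zipcode`, the lookup would fail, so no automatic join could happen.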

If you specify that a column is a dimension, you will need to specify the type of dimension and, if it is neither a date/time nor a geospatial dimension, the name of the dimension within Rasgo.

Click “Import <number> Features” to import the features into Rasgo.

You can find your imported features under the hashtags you created.
