Summarize Islands
Given a dataset with a date column, summarizes the data in terms of islands, which are periods of time where data exists. This is often useful for determining whether your data has gaps, where data does not exist or exists only under certain conditions.
You must set a buffer, such as 7 DAYS, which determines the grain of time at which one island stops and another begins.
The result is a summarized table.
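To make the idea of an island concrete, here is a minimal sketch in plain Python. It is not the transform's own implementation: the function name `find_islands`, the sample dates, and the choice to treat a gap exactly equal to the buffer as part of the same island are all assumptions made for illustration.

```python
from datetime import date, timedelta


def find_islands(dates, buffer=timedelta(days=7)):
    """Collapse dates into (start, end, row_count) islands.

    A date joins the current island when it falls within `buffer` of that
    island's latest date (assumed inclusive); otherwise it starts a new one.
    """
    islands = []
    for d in sorted(dates):
        if islands and d - islands[-1][1] <= buffer:
            start, _, count = islands[-1]
            islands[-1] = (start, d, count + 1)   # extend the current island
        else:
            islands.append((d, d, 1))             # gap too wide: new island
    return islands


# Hypothetical sample data: three January dates, then a gap, then two in February.
dates = [date(2023, 1, 1), date(2023, 1, 3), date(2023, 1, 5),
         date(2023, 2, 1), date(2023, 2, 4)]

for start, end, count in find_islands(dates):
    print(start, end, count)
# 2023-01-01 2023-01-05 3
# 2023-02-01 2023-02-04 2
```

Each output row corresponds to one island in the summarized table: its first date, its last date, and how many records it contains.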
Parameters
group_cols (column_list, optional): The column(s) used to partition your data into groups. Islands are searched for within each group.
conditions (math_list, optional): A list of conditions to apply to the data before searching for islands. For example, ["COL1 > 0", "COL1 IS NOT NULL"].
date_col (column): The column used to search for islands. This must be a date or datetime column.
buffer_date_part (date_part): The date part in which the buffer is measured, such as DAYS.
buffer_size (int): The number of date_parts within which records are considered part of the same island. Larger numbers cause more overlaps and therefore fewer islands; smaller numbers cause fewer overlaps and therefore more islands.
Example
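As an illustration of how the parameters interact, the following is a minimal sketch in plain Python rather than a call to the transform itself. The rows, the helper `count_islands`, and the interpretation of `buffer_size` as a number of days are hypothetical; the grouping key stands in for `group_cols` and the `condition` callable stands in for `conditions`.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical rows: (store, sale_date, amount).
rows = [
    ("A", date(2023, 1, 1), 10),
    ("A", date(2023, 1, 4), 0),
    ("A", date(2023, 1, 9), 5),
    ("B", date(2023, 1, 2), 7),
    ("B", date(2023, 1, 20), 3),
]


def count_islands(rows, buffer_size, condition=lambda row: True):
    """Return {group: island count}, treating buffer_size as a number of days."""
    by_group = defaultdict(list)
    for row in rows:
        if condition(row):                   # filter first, like `conditions`
            by_group[row[0]].append(row[1])  # partition, like `group_cols`
    counts = {}
    for group, dates in by_group.items():
        dates.sort()
        # One island for the first date, plus one for every gap wider than the buffer.
        counts[group] = 1 + sum(
            later - earlier > timedelta(days=buffer_size)
            for earlier, later in zip(dates, dates[1:])
        )
    return counts


print(count_islands(rows, buffer_size=7))   # {'A': 1, 'B': 2}
print(count_islands(rows, buffer_size=2))   # {'A': 3, 'B': 2}
print(count_islands(rows, buffer_size=7, condition=lambda r: r[2] > 0))  # {'A': 2, 'B': 2}
```

Shrinking the buffer from 7 to 2 splits store A into three islands, and filtering out the zero-amount row opens an 8-day gap that splits store A even with the 7-day buffer.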