# Event store

The UDP Event store is the permanent archive for all behavioral data the UDP has captured. The UDP Event data pipeline streams [enriched](https://resources.unizin.org/products/data-and-analytics/unizin-data-platform/system-overview/event-data-pipeline/udp-event-enricher) events into the UDP Event store in real time.

## Implementation in Google BigQuery <a href="#eventstore-implementationingooglebigquery" id="eventstore-implementationingooglebigquery"></a>

The UDP Event store is implemented in Google BigQuery as a date-partitioned table called `expanded`, located in the `event_store` dataset.

<figure><img src="https://3709019308-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FKYwtHNGgdPXS3PWAlZUr%2Fuploads%2Fgit-blob-ee3098bb864a1cae46cc440fce5be168ce058151%2F64159756.png?alt=media" alt=""><figcaption><p><em>The UDP Event store in Google BigQuery.</em></p></figcaption></figure>

The `expanded` table is date-partitioned on the Caliper event's `event_time` field, which holds the event timestamp provided in the Caliper event payload itself. Consequently, the Event store partitions events by the time at which a learning tool reports that a behavior occurred, not by the time at which the event was written to the Event store. Every query on the `expanded` table must filter on `event_time` in its WHERE clause.

Unizin enforces a 20 TiB daily byte-scan limit in BigQuery for every consortium member. BigQuery charges very little for data storage, which makes it well suited to *storing* the multi-terabyte `expanded` table; however, it charges more heavily for the bytes that queries scan. The current rate for BigQuery byte scans is $5 per TiB scanned by a query. The 20 TiB daily limit per school lets BigQuery remain a powerful, useful tool while staying within our financial requirements as a consortium. Because the `expanded` table is so large, we require the `event_time` partition filter in queries so that a query cannot scan terabytes of data by accident.
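To make the arithmetic concrete, the sketch below estimates how much of the daily limit a project has used so far, using BigQuery's `INFORMATION_SCHEMA.JOBS_BY_PROJECT` view. Several details are assumptions rather than documented Unizin behavior: the `region-us` qualifier, measuring the day from midnight UTC, and measuring usage with `total_bytes_processed`. Treat the result as a rough indicator, not as the figure Unizin uses to enforce the limit.

```sql
-- Sketch only: `region-us` is an assumed location; use the region where your jobs run.
SELECT
  COUNT(*) AS query_jobs,
  -- Convert bytes to TiB (1 TiB = 1024^4 bytes).
  ROUND(SUM(total_bytes_processed) / POW(1024, 4), 3) AS tib_scanned_today,
  -- Approximate cost at the $5 per TiB rate quoted above.
  ROUND(SUM(total_bytes_processed) / POW(1024, 4) * 5, 2) AS approx_cost_usd,
  -- Share of the 20 TiB daily limit consumed so far.
  ROUND(SUM(total_bytes_processed) / POW(1024, 4) / 20 * 100, 1) AS pct_of_daily_limit
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= TIMESTAMP_TRUNC(CURRENT_TIMESTAMP(), DAY)  -- since midnight UTC (assumption)
  AND job_type = 'QUERY';
```

At $5 per TiB, a full 20 TiB day corresponds to roughly $100 of scan charges. Running this view typically requires permission to list all jobs in the project.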

Here is an example framework for a query:

**Query Framework Example - Expanded Table**

```sql
SELECT
	<FIELDS IN SCOPE>
FROM event_store.expanded
-- Partition filter on event_time (assumed to be a TIMESTAMP).
-- The half-open range pulls all events from 6/1 through the end of 6/10.
WHERE event_time >= '2022-06-01' AND event_time < '2022-06-11'
	<FURTHER AND / OR CONDITIONS TO FILTER RESULTS>
```
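As one possible way to fill in the framework, the following sketch counts events per day over the same ten-day window. It references only `event_time`, the one column named on this page, and it assumes that `event_time` is a TIMESTAMP. The half-open range (`>=` the first day, `<` the day after the last day) ensures that events occurring at any time on 6/10 are included.

```sql
-- Daily event counts for 2022-06-01 through 2022-06-10 (dates computed in UTC).
SELECT
  DATE(event_time) AS event_date,
  COUNT(*) AS event_count
FROM event_store.expanded
WHERE event_time >= TIMESTAMP('2022-06-01')   -- partition filter, lower bound (inclusive)
  AND event_time < TIMESTAMP('2022-06-11')    -- partition filter, upper bound (exclusive)
GROUP BY event_date
ORDER BY event_date;
```

Keeping the date window narrow limits the partitions BigQuery reads, which in turn limits the bytes counted against the daily scan limit.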

The UDP installation process will automatically create the `event_store` BigQuery dataset and `expanded` table. The UDP will configure the `expanded` table to be partitioned by `event_time`. The UDP will also automatically create the service accounts needed by the UDP Event enricher to stream insert event data into the `expanded` table.
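If you want to confirm the partitioning configuration after installation, a query along these lines against BigQuery's `INFORMATION_SCHEMA.COLUMNS` view is one option. This is a sketch that assumes you have metadata access to the `event_store` dataset; given the configuration described above, it would be expected to list `event_time` as the partitioning column.

```sql
-- List the partitioning column(s) of the expanded table.
SELECT
  column_name,
  data_type,
  is_partitioning_column
FROM event_store.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'expanded'
  AND is_partitioning_column = 'YES';
```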

For a full explanation of the Event store data schema, [please see our dedicated docs](https://resources.unizin.org/products/data-and-analytics/unizin-data-platform/data-stores/data-lake/udp-event-store/expanded-table).
