The UDP Event store's expanded table explodes the nested attributes of IMS Global Caliper events into a tabular format. By decomposing and flattening an event's hierarchical data, the expanded table makes querying event data more performant and efficient.
The UDP Event store schema explodes the nested attributes and values of a Caliper event into a single, tabular schema. The UDP Event store also computes a small number of variables that make downstream querying and reporting more convenient.
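To make the idea of "exploding" a nested event concrete, here is a minimal Python sketch that flattens a nested dictionary into dotted column names. It is an illustration only, not the UDP's actual ETL code; the `flatten` helper and the trimmed sample event (including its identifiers) are assumptions made for this example.

```python
from typing import Any


def flatten(node: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into a single level keyed by dotted paths."""
    flat: dict[str, Any] = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-nodes so their attributes become columns too.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Hypothetical, heavily trimmed event; real Caliper events carry more fields.
event = {
    "type": "GradeEvent",
    "actor": {"id": "urn:instructure:canvas:user:1", "type": "Person"},
    "object": {
        "id": "urn:example:attempt:7",
        "assignable": {"type": "AssignableDigitalResource"},
    },
}

row = flatten(event)
```

The dotted keys in `row` (for example `actor.id`) mirror how values are addressed in queries; they are not a claim about how the table is physically stored.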
The expanded table presents many identifiers related to an event. The identifiers in the expanded table fall into one of two categories.
- Given identifiers are the values in the id fields of a Caliper event. They are almost always IRIs that use a qualified namespace to communicate a native identifier. These identifiers are represented, unchanged, in id columns in the expanded table. For example, the group.id column in the expanded table contains the IRI value found in the group[id] location of the Caliper event.
- Learning tool identifiers. The UDP is able to use event data to unambiguously identify, generate, or look up a tool's native identifier. When this is possible, the UDP stores the learning tool's identifiers in columns whose names communicate which tool an identifier belongs to. For example, the sis_id columns represent identifiers from Unizin Engage and an SIS, respectively. The values for these identifiers are represented in their native form (not using IRIs). At present, learning tool identifiers are provided for Persons and Course offerings.
To illustrate the different categories of identifiers captured in the UDP Event store, consider a single event from Instructure Canvas. The actor.id column value for this event will be the given, fully qualified IRI value in the actor[id] location (e.g., "urn:instructure:canvas:user:1"). By contrast, the person.canvas_id value will be the native, unqualified identifier (e.g., "1").
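The relationship between the two kinds of identifier can be sketched in a few lines of Python. This assumes that the native identifier is simply the final colon-separated segment of a URN-style IRI, which holds for the example above but is not necessarily how the UDP derives it for every tool; `native_id` is a hypothetical helper.

```python
def native_id(iri: str) -> str:
    """Return the text after the last colon of a URN-style IRI."""
    # Assumption: the native identifier is the final colon-separated segment.
    return iri.rsplit(":", 1)[-1]


actor_id = "urn:instructure:canvas:user:1"  # given identifier (IRI)
canvas_id = native_id(actor_id)             # native, unqualified identifier
```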
The UDP Event store's expanded table is implemented in Google BigQuery, which supports the STRUCT data type. A STRUCT data type is a container of ordered fields, each of which is defined by its own data type and a name. The advantage of a STRUCT data type is that it enables you to store multiple attributes related to a single object in a single row.
The UDP Event store's expanded table makes liberal use of the STRUCT data type to represent data points in Caliper events that contain one or more other data points. For example, the actor attribute in a Caliper event will typically look like this:
In the UDP Event store's expanded table, this data is represented as an actor STRUCT with two fields: id and type. To query a field in a STRUCT, you use the dot (.) operator, like so:
SELECT
  actor.id AS actor_id
FROM
  ...
WHERE
  event_time >= '2021-01-01'
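If you are more familiar with a general-purpose language than with BigQuery, a STRUCT behaves much like a small record type. The Python sketch below models this with `NamedTuple`; the `Actor` and `EventRow` classes and the sample values are assumptions for illustration, not part of the UDP schema definition.

```python
from typing import NamedTuple


class Actor(NamedTuple):
    # Fields are ordered, named, and typed, like the fields of a STRUCT.
    id: str
    type: str


class EventRow(NamedTuple):
    actor: Actor
    event_time: str


row = EventRow(
    actor=Actor(id="urn:instructure:canvas:user:1", type="Person"),
    event_time="2021-03-01T12:00:00Z",
)

actor_id = row.actor.id  # same dot notation as actor.id in a query
```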
As a JSON object, a Caliper event will contain a number of root attributes, many of which are nodes for nested attributes and sub-nodes. For example, the following event describes that an instructor has graded a particular student assignment:
In this event, the object node contains a mix of attributes and sub-nodes. For every Caliper event, the UDP Event store will represent a certain number of root nodes as STRUCTs. Each of these STRUCTs will contain the following attributes:
Given the example above, here is how you might query the Event store for the attributes of the object node in Caliper events:
SELECT
  object.id AS object_id
  , object.extensions AS object_extensions
FROM
  ...
WHERE
  event_time >= '2021-01-01'
The following root nodes of Caliper events are represented as STRUCTs:
In a Caliper event, the object root nodes often contain sub-nodes called attempt, which usually refer to an assigned learning activity (assignable) and the person who completed or created it.
The UDP Event store will extract these sub-nodes and store them as attributes of their parent nodes' STRUCTs, where they are represented as nested STRUCTs.
Given the example "graded" event above, the following SQL can be used to query the type attribute of a nested sub-node:
SELECT
  object.assignable.type AS object_assignable_type
FROM
  ...
WHERE
  event_time >= '2021-01-01'
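When working with raw event payloads outside the Event store, the same dotted path can be followed through the nested JSON. The sketch below is one possible way to do that in Python; `get_path`, the sample fragment, and the `object.assignee.type` path are hypothetical and chosen only to show what happens when a sub-node is absent.

```python
from typing import Any, Optional


def get_path(event: dict[str, Any], path: str) -> Optional[Any]:
    """Follow a dotted path through nested dicts; return None if a step is missing."""
    current: Any = event
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Hypothetical fragment of a "graded" event.
event = {
    "object": {
        "type": "Attempt",
        "assignable": {"type": "AssignableDigitalResource"},
    }
}

assignable_type = get_path(event, "object.assignable.type")
missing = get_path(event, "object.assignee.type")  # sub-node not present
```

Returning `None` for an absent sub-node is a design choice of this sketch, so that callers can handle events of different shapes without catching exceptions.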
The UDP Event store records when the event was written to the UDP Event store (store_time), along with when the behavior occurred (event_time) and the date and hour when the behavior occurred.
The UDP Event store table is a partitioned table. A partitioned table is divided into segments (called partitions) along a particular column in the table schema. Queries on a partitioned table are more performant and cost-effective when they filter on the partitioning column.
The UDP Event store table is partitioned on the event_time column. Consequently, queries against the UDP Event store that select fixed periods of time (using event_time) will generally be more performant and cost less.
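The benefit of filtering on the partitioning column can be illustrated with a small Python model. It assumes one partition per calendar day of event_time, which is an assumption of this sketch rather than a documented property of the table; the events and the `scan` helper are likewise hypothetical.

```python
from collections import defaultdict
from datetime import date, datetime

# Hypothetical events: (event_time, actor_id)
events = [
    (datetime(2020, 12, 31, 23, 15), "urn:instructure:canvas:user:1"),
    (datetime(2021, 1, 1, 8, 30), "urn:instructure:canvas:user:1"),
    (datetime(2021, 1, 1, 9, 45), "urn:instructure:canvas:user:2"),
    (datetime(2021, 1, 2, 14, 0), "urn:instructure:canvas:user:2"),
]

# Assumption: one partition per calendar day of event_time.
partitions: dict[date, list[tuple[datetime, str]]] = defaultdict(list)
for event_time, actor_id in events:
    partitions[event_time.date()].append((event_time, actor_id))


def scan(since: date) -> tuple[int, list[tuple[datetime, str]]]:
    """Read only partitions on or after `since`; report how many were read."""
    wanted = [day for day in sorted(partitions) if day >= since]
    rows = [row for day in wanted for row in partitions[day]]
    return len(wanted), rows


partitions_read, rows = scan(date(2021, 1, 1))
```

Because the filter is expressed on the partitioning column, the scan skips the 2020-12-31 partition entirely instead of reading every row and discarding the ones that do not match.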
Each record in the UDP Event store table represents a single, timestamped event.
The UDP Event store table schema enables common event query patterns over sets of Caliper events.
Beyond storing the entirety of a Caliper event payload itself, the UDP Event store schema defines a set of columns for values that are common to query patterns. The values are extracted from the Caliper event payload during the event ETL process. The values include the UDP identifiers that correspond to an event’s native identifiers if the event qualifies for enrichment. For example, values in the actor_id column of the Event store correspond to the id value in the actor node of the Caliper event.
The following table describes the Event record schema for the UDP Event store.