Expanded table
The UDP Event store's expanded table explodes the nested attributes of IMS Global Caliper events into a tabular format. By decomposing and flattening an event's hierarchical data, the expanded table makes event data simpler and more efficient to query.
Features of the "expanded" table
The UDP Event store schema explodes the nested attributes and values of a Caliper event into a single, tabular schema. The UDP Event store also computes a small number of variables that make downstream querying and reporting more convenient.
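As a rough illustration of what "exploding" nested attributes means, the sketch below flattens a nested event into dotted paths. It is not the UDP's ETL code: the function name flatten_event and the trimmed sample event are hypothetical, and the real expanded table keeps nested values in STRUCT columns rather than in literally dotted column names.

```python
from typing import Any, Dict


def flatten_event(node: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten a nested event dict into {"dotted.path": value} pairs."""
    flat: Dict[str, Any] = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested nodes such as actor or object.
            flat.update(flatten_event(value, path))
        else:
            flat[path] = value
    return flat


# Hypothetical, heavily trimmed event used only for illustration.
sample_event = {
    "type": "GradeEvent",
    "actor": {"id": "urn:instructure:canvas:user:1", "type": "Person"},
}

flat_event = flatten_event(sample_event)
```

Each key of flat_event (for example "actor.id") corresponds to the kind of path you would use to address a value in the expanded table.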
Identifiers
The expanded table presents many identifiers related to an event. The identifiers in the expanded table fall into one of two categories.
Given identifiers. These are the values in the id fields of a Caliper event. They are almost always IRIs that use a qualified namespace to communicate a native identifier. These identifiers are represented, unchanged, in id columns in the expanded table. For example, the group.id column in the expanded table contains the IRI value found in the group[id] location of the Caliper event.
Learning tool identifiers. When the UDP can use event data to unambiguously identify, generate, or look up a tool's native identifier, it stores that identifier in a column whose name communicates which tool the identifier belongs to. For example, the engage_id and sis_id columns represent identifiers from Unizin Engage and an SIS, respectively. The values for these identifiers are represented in their native form (not as IRIs). At present, learning tool identifiers are provided for Persons and Course offerings.
To illustrate the different categories of identifiers captured in the UDP Event store, consider a single event from Instructure Canvas. The actor.id column value for this event will be the given, fully qualified IRI value in the actor[id] location (e.g., "urn:instructure:canvas:user:1"). By contrast, the person.canvas_id value will be the native, unqualified identifier (e.g., "1").
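The relationship between the two kinds of identifier in the Canvas example can be sketched in a few lines. This is an illustration only: it assumes the native identifier is the final colon-delimited segment of the IRI, which holds for the example above but is not guaranteed for every tool, and the UDP itself may rely on lookups rather than string parsing. The function name native_id_from_iri is hypothetical.

```python
def native_id_from_iri(iri: str) -> str:
    """Return the text after the last ':' of a namespaced identifier."""
    if ":" not in iri:
        raise ValueError(f"not a namespaced identifier: {iri!r}")
    # Assumption: the native identifier is the final colon-delimited segment.
    return iri.rsplit(":", 1)[1]


given_id = "urn:instructure:canvas:user:1"  # value of actor.id in the example
canvas_id = native_id_from_iri(given_id)    # value of person.canvas_id
```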
Use of the STRUCT data type
The UDP Event store's expanded table is implemented in Google BigQuery, which supports the STRUCT data type. A STRUCT data type is a container of ordered fields, each of which is defined by its own data type and a name. The advantage of a STRUCT data type is that it enables you to store multiple attributes related to a single object in a single row.
The UDP Event store's expanded table makes liberal use of the STRUCT data type to represent data points in Caliper events that contain one or more other data points. For example, the actor attribute in a Caliper event typically carries an id and a type.
In the UDP Event store's expanded table, this data is represented as an actor STRUCT with two fields: id and type. To query a field in a STRUCT, use the dot (.) operator; for example, actor.id refers to the id field of the actor STRUCT.
Root nodes
As a JSON object, a Caliper event will contain a number of root attributes, many of which are nodes for nested attributes and sub-nodes. For example, the following event describes that an instructor has graded a particular student assignment:
In this event, the object node contains a mix of attributes and sub-nodes. For every Caliper event, the UDP Event store will represent a certain number of root nodes as STRUCTs. Each of these structs will contain the following attributes:
Given the example above, here is how you might query the Event store for the attributes of the object node in Caliper events:
The following root nodes of Caliper events are represented as STRUCTs:
actor
edApp
extensions
federatedSession
generated
group
membership
object
profile
referrer
session
target
Sub-nodes
In a Caliper event, the generated and object root nodes often contain sub-nodes called assignable, assignee, and attempt, which usually refer to an assigned learning activity (assignable) and the person who completed or created it (assignee).
The UDP Event store extracts these sub-nodes and represents them as attributes of their parent nodes' STRUCTs, in the form of nested STRUCTs.
Given the example "graded" event above, the following SQL can be used to query the type attribute of a nested assignable attribute:
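A query of this kind could be assembled as in the sketch below, which wraps the SQL text in a small Python helper. The helper name build_select, the alias convention, and the table path my-project.event_store.expanded are all hypothetical; only the root node names and the dot operator come from this document. Each path segment is wrapped in backticks so that root nodes whose names collide with SQL keywords (group, for instance) remain usable, and each column gets an alias so that two fields named type do not clash in the result.

```python
import re
from typing import List, Sequence

# Root nodes listed in this document as being represented as STRUCTs.
ROOT_NODES = {
    "actor", "edApp", "extensions", "federatedSession", "generated", "group",
    "membership", "object", "profile", "referrer", "session", "target",
}

# A dotted path with at least two segments, e.g. object.assignable.type.
_FIELD_PATH = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)+")


def build_select(field_paths: Sequence[str], table: str) -> str:
    """Build a SELECT statement over dotted STRUCT field paths."""
    columns: List[str] = []
    for path in field_paths:
        if not _FIELD_PATH.fullmatch(path):
            raise ValueError(f"not a dotted field path: {path!r}")
        parts = path.split(".")
        if parts[0] not in ROOT_NODES:
            raise ValueError(f"unknown root node: {parts[0]!r}")
        quoted = ".".join(f"`{part}`" for part in parts)
        alias = "_".join(parts)  # e.g. object_assignable_type
        columns.append(f"{quoted} AS {alias}")
    column_sql = ",\n  ".join(columns)
    return f"SELECT\n  {column_sql}\nFROM `{table}`"


# The table path is a placeholder, not the real UDP table name.
sql = build_select(
    ["object.type", "object.assignable.type"],
    table="my-project.event_store.expanded",
)
print(sql)
```

The nested assignable STRUCT is reached simply by chaining another dot after its parent node, as in object.assignable.type.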
Dates and times
The UDP Event store records when the event was written to the UDP Event store (store_time) along with when the behavior occurred (event_time) and the date and hour when the behavior occurred (event_date and event_hour).
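To make the relationship between event_time, event_date, and event_hour concrete, the sketch below derives a date and an hour from a timestamp. It rests on assumptions this document does not confirm: that the date and hour are taken in UTC, and that the hour is an integer from 0 to 23. The function name date_and_hour is hypothetical.

```python
from datetime import date, datetime, timezone
from typing import Tuple


def date_and_hour(event_time: str) -> Tuple[date, int]:
    """Split an ISO 8601 timestamp into a UTC calendar date and hour (0-23)."""
    # Swap a trailing "Z" for an explicit offset so that
    # datetime.fromisoformat also accepts it on older Python versions.
    parsed = datetime.fromisoformat(event_time.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    in_utc = parsed.astimezone(timezone.utc)
    return in_utc.date(), in_utc.hour


event_date, event_hour = date_and_hour("2024-03-05T14:27:09.123Z")
```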
Table partitioning
The UDP Event store table is a partitioned table. A partitioned table is divided into segments (called partitions) along a particular column in the table schema. Queries against a partitioned table are faster and cheaper when they filter on the partitioning column, because BigQuery can skip partitions that fall outside the filter.
The UDP Event store table is partitioned on the event_time column. Consequently, queries against the UDP Event store that restrict event_time to a fixed period of time will generally be more performant and cost less.
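A time-bounded query might be put together as in the following sketch, which returns the SQL text along with values for BigQuery named query parameters. The helper name build_time_bounded_query, the parameter names @start_time and @end_time, the selected columns, and the table path are hypothetical, and the sketch assumes event_time is a TIMESTAMP column.

```python
from datetime import datetime, timezone
from typing import Dict, Tuple


def build_time_bounded_query(
    table: str, start: datetime, end: datetime
) -> Tuple[str, Dict[str, str]]:
    """Return SQL filtered on event_time plus its named parameter values."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if start >= end:
        raise ValueError("start must be earlier than end")
    # Filtering on the partitioning column (event_time) is what lets the
    # query engine skip partitions outside the requested window.
    sql = (
        "SELECT actor.id AS actor_id, event_time\n"
        f"FROM `{table}`\n"
        "WHERE event_time >= @start_time\n"
        "  AND event_time < @end_time"
    )
    params = {
        "start_time": start.astimezone(timezone.utc).isoformat(),
        "end_time": end.astimezone(timezone.utc).isoformat(),
    }
    return sql, params


# The table path is a placeholder, not the real UDP table name.
query, query_params = build_time_bounded_query(
    "my-project.event_store.expanded",
    datetime(2024, 3, 1, tzinfo=timezone.utc),
    datetime(2024, 3, 8, tzinfo=timezone.utc),
)
```

Using a half-open window (start inclusive, end exclusive) lets consecutive windows be queried without counting any event twice.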
Canvas events data mapping
Unique to the expanded events table is how the STRUCT data type allows for the normalization of Canvas events data. For further information, please visit our documentation on Canvas edApp mapping.
Schema
Each record in the UDP Event store table represents a single, timestamped event.
The UDP Event store table schema enables common event query patterns over sets of Caliper events.
Beyond storing the entirety of a Caliper event payload itself, the UDP Event store schema defines a set of columns for values that are common to query patterns. The values are extracted from the Caliper event payload during the event ETL process. The values include the UDP identifiers that correspond to an event’s native identifiers if the event qualifies for enrichment. For example, values in the actor_id column of the Event store correspond to the id value in the actor node of the Caliper event.
The following table describes the Event record schema for the UDP Event store.