Expanded table

The UDP Event store's expanded table explodes the nested attributes of IMS Global Caliper events into a tabular format. By decomposing and flattening an event's hierarchical data, the expanded table makes event data simpler and more efficient to query.

Features of the "expanded" table

The UDP Event store schema explodes the nested attributes and values of a Caliper event into a single, tabular schema. The UDP Event store also computes a small number of variables that make downstream querying and reporting more convenient.

Identifiers

The expanded table presents many identifiers related to an event. The identifiers in the expanded table fall into one of two categories.

  • Given identifiers. These are the values in the id fields of a Caliper event. They are almost always IRIs that use a qualified namespace to communicate a native identifier. These identifiers are represented, unchanged, in id columns in the expanded table. For example, the group.id column in the expanded table contains the IRI value found in the group[id] location of the Caliper event.

  • Learning tool identifiers. In some cases, the UDP can use event data to unambiguously identify, generate, or look up a tool's native identifier. When this is possible, the UDP stores the learning tool's identifiers in columns whose names communicate which tool an identifier belongs to. For example, the engage_id and sis_id columns represent identifiers for Unizin Engage and an SIS, respectively. The values for these identifiers are represented in their native form (not as IRIs). At present, learning tool identifiers are provided for Persons and Course offerings.

To illustrate the different categories of identifiers captured in the UDP Event store, consider a single event from Instructure Canvas. The actor.id column value for this event will be the given, fully qualified IRI value in the actor[id] location (e.g., "urn:instructure:canvas:user:1"). By contrast, the person.canvas_id value will be the native, unqualified identifier (e.g., "1").
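The two categories can be compared side by side in a single query. The following is a minimal sketch, not a definitive recipe: it assumes the event_store.expanded table name used in the query examples on this page, and it assumes that person is a STRUCT exposing a canvas_id field, as the column name above suggests.

```sql
-- Compare the given (IRI) identifier with the learning tool's native identifier.
-- Assumption: person is a STRUCT with a canvas_id field.
SELECT
  actor.id AS actor_iri                  -- given identifier, e.g. "urn:instructure:canvas:user:1"
  , person.canvas_id AS canvas_user_id   -- native identifier, e.g. "1"
FROM
  event_store.expanded
WHERE
  event_time >= '2021-01-01'
  AND person.canvas_id IS NOT NULL       -- keep only events where a Canvas identifier was resolved
;
```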

Use of the STRUCT data type

The UDP Event store's expanded table is implemented in Google BigQuery, which supports the STRUCT data type. A STRUCT data type is a container of ordered fields, each of which is defined by its own data type and a name. The advantage of a STRUCT data type is that it enables you to store multiple attributes related to a single object in a single row.

The UDP Event store's expanded table makes liberal use of the STRUCT data type to represent data points in Caliper events that contain one or more other data points. For example, the actor attribute in a Caliper event will typically look like this:

{
  "id": "urn:vendor:tool:user:12345",
  "type": "Person"
}

In the UDP Event store's expanded table, this data is represented as an actor STRUCT with two fields: id and type. To query a field in a STRUCT, you use the dot (.) operator, like so:

SELECT
  actor.id AS actor_id
FROM
  event_store.expanded
WHERE
  event_time >= '2021-01-01'
;
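To experiment with the dot operator without reading from the Event store, a STRUCT can be built inline. The sketch below is self-contained BigQuery SQL; the sample common table expression is a hypothetical stand-in for the expanded table and contains only the actor example shown above.

```sql
-- Build a one-row stand-in table whose actor column is a STRUCT with id and type fields.
WITH sample AS (
  SELECT
    STRUCT(
      'urn:vendor:tool:user:12345' AS id
      , 'Person' AS type
    ) AS actor
)
-- Read individual STRUCT fields with the dot operator.
SELECT
  actor.id AS actor_id
  , actor.type AS actor_type
FROM
  sample
;
```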

Root nodes

As a JSON object, a Caliper event will contain a number of root attributes, many of which are nodes for nested attributes and sub-nodes. For example, the following event describes that an instructor has graded a particular student assignment:

{
  "@context": "http://purl.imsglobal.org/ctx/caliper/v1p1",
  "id": "urn:uuid:cc0115bd-da36-450a-87aa-6c29a2f632f7",
  "eventTime": "2020-10-01T00:01:31.480Z",
  "type": "GradeEvent",
  "action": "Graded",
  "actor": {
    "id": "urn:tool:user:1",
    "type": "Person"
  },
  "edApp": {
    "id": "http://learningtool.com/",
    "type": "SoftwareApplication"
  },
  "object": {
    "id": "urn:tool:submission:5",
    "type": "Attempt",
    "assignable": {
      "id": "urn:tool:assignment:2",
      "type": "AssignableDigitalResource"
    },
    "assignee": {
      "id": "urn:tool:user:1",
      "type": "Person"
    }
  }
}

In this event, the object node contains a mix of attributes and sub-nodes. For every Caliper event, the UDP Event store will represent a certain number of root nodes as STRUCTs. Each of these structs will contain the following attributes:

Given the example above, here is how you might query the Event store for the attributes of the object node in Caliper events:

SELECT
  object.id AS object_id
  , object.extensions AS object_extensions
FROM
  event_store.expanded
WHERE
  event_time >= '2021-01-01'
;

The following root nodes of Caliper events are represented as STRUCTs:

  • actor

  • edApp

  • extensions

  • federatedSession

  • generated

  • group

  • membership

  • object

  • profile

  • referrer

  • session

  • target
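Several root nodes can be read in the same query. The sketch below selects the id field from three of the root-node STRUCTs named on this page. One practical detail: GROUP is a reserved keyword in BigQuery, so the group column has to be quoted with backticks. The output aliases are illustrative only.

```sql
-- Read the id field from several root-node STRUCTs in one pass.
SELECT
  actor.id AS actor_id
  , `group`.id AS group_id   -- backticks are required because GROUP is a reserved keyword
  , object.id AS object_id
FROM
  event_store.expanded
WHERE
  event_time >= '2021-01-01'
;
```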

Sub-nodes

In a Caliper event, the generated and object root nodes often contain sub-nodes called assignable, assignee, and attempt, which usually refer to an assigned learning activity (assignable) and the person who completed or created it (assignee).

The UDP Event store extracts these sub-nodes and stores them as attributes of their parent nodes' STRUCTs. The sub-nodes are thus represented as nested STRUCTs.

Given the example "graded" event above, the following SQL can be used to query the type attribute of the nested assignable sub-node:

SELECT
  object.assignable.type AS object_assignable_type
FROM
  event_store.expanded
WHERE
  event_time >= '2021-01-01'
;
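The same pattern extends to the other sub-nodes. As a hedged sketch, the query below pairs each assignable with its assignee for events that carry an assignable. It assumes that the nested assignable and assignee STRUCTs each expose an id field, mirroring the id attributes in the example event.

```sql
-- Pair each assigned activity with the person it was assigned to.
-- Assumption: the nested assignable and assignee STRUCTs each have an id field.
SELECT
  object.assignable.id AS assignable_id
  , object.assignee.id AS assignee_id
FROM
  event_store.expanded
WHERE
  event_time >= '2021-01-01'
  AND object.assignable.id IS NOT NULL   -- skip events without an assignable sub-node
;
```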

Dates and times

The UDP Event store records when the event was written to the UDP Event store (store_time) along with when the behavior occurred (event_time) and the date and hour when the behavior occurred (event_date and event_hour).
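The precomputed event_date and event_hour columns make time-bucketed reporting straightforward. The following sketch counts events per date and hour; the event_count alias is illustrative, and the query makes no assumption about the exact data types of the two columns.

```sql
-- Count events per date and hour using the precomputed columns.
SELECT
  event_date
  , event_hour
  , COUNT(*) AS event_count
FROM
  event_store.expanded
WHERE
  event_time >= '2021-01-01'
GROUP BY
  event_date
  , event_hour
ORDER BY
  event_date
  , event_hour
;
```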

Table partitioning

The UDP Event store table is a partitioned table. A partitioned table is divided into segments (called partitions) along a particular column in the table schema. Queries that filter on that column perform better and cost less, because only the relevant partitions are scanned.

The UDP Event store table is partitioned on the event_time column. Consequently, queries against the UDP Event store that restrict event_time to a fixed period of time will generally run faster and cost less.
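As an illustration, the sketch below bounds event_time on both ends so that only the partitions for a single month need to be read. The date range and the event_count alias are arbitrary examples.

```sql
-- Restrict the scan to one month of partitions by bounding event_time on both sides.
SELECT
  COUNT(*) AS event_count
FROM
  event_store.expanded
WHERE
  event_time >= '2021-01-01'
  AND event_time < '2021-02-01'
;
```

A query with no event_time filter would instead have to consider every partition in the table.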

Canvas events data mapping

Unique to the expanded events table is how the STRUCT data type allows for the normalization of Canvas events data. For further information, please visit our documentation on Canvas edApp mapping.

Schema

Each record in the UDP Event store table represents a single, timestamped event.

The UDP Event store table schema enables common event query patterns over sets of Caliper events.

Beyond storing the entirety of a Caliper event payload itself, the UDP Event store schema defines a set of columns for values that are common to query patterns. The values are extracted from the Caliper event payload during the event ETL process. The values include the UDP identifiers that correspond to an event’s native identifiers if the event qualifies for enrichment. For example, values in the actor_id column of the Event store correspond to the id value in the actor node of the Caliper event.

The following table describes the Event record schema for the UDP Event store.

Copyright © 2023, Unizin, Ltd.