Keymap support
Every UDP Loading schema will specify the data identifiers (or "keys") that a Learning tool, SIS, or LMS must include in its context dataset.
It is not uncommon for data platforms to integrate datasets that include primary and foreign keys in the dataset contents. Indeed, such identifiers are valuable for understanding relationships between records in the dataset.
The UDP is unique, however, insofar as it requires that two classes of identifiers be provided for any given record: an internal identifier and an external identifier. In almost every case, internal identifiers are the primary and foreign keys typical of a large dataset generated for batch processing. By contrast, external identifiers are a concept unique to the UDP.
The requirement to provide both classes stems from a central feature of the UDP: the UDP keymap. The purpose of the UDP keymap is to associate records from different systems that describe the same thing.
It is expected that any context dataset provides both internal and external identifiers, as indicated by the loading schema.
Internal identifiers
Internal identifiers are the keys that are native to a particular system. These identifiers are primary keys and foreign keys. The UDP calls them "internal identifiers" because their validity and utility are only internal to the system itself. Internal identifiers are used to define the uniqueness of a record within the context of that system and, when used as foreign keys, to express the relationships between entities in a dataset.
Internal identifiers are required by the UDP's context data import process to ensure the referential integrity of a dataset. Records whose foreign keys fail to refer to an existing primary key in another entity will be dropped.
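The referential integrity rule above can be sketched as follows. This is a minimal illustration, not the UDP's actual import code; the entity names, the field names (`id`, `person_id`), and the `drop_orphans` helper are all hypothetical.

```python
def drop_orphans(children, parents, fk_field, pk_field="id"):
    """Keep only child records whose foreign key matches a parent primary key."""
    parent_keys = {parent[pk_field] for parent in parents}
    return [child for child in children if child[fk_field] in parent_keys]


# Hypothetical entities from a single system's context dataset.
persons = [{"id": 1}, {"id": 2}]
enrollments = [
    {"id": 10, "person_id": 1},
    {"id": 11, "person_id": 3},  # refers to a person that does not exist
]

kept = drop_orphans(enrollments, persons, fk_field="person_id")
```

Here the second enrollment would be dropped because no person record carries the primary key it refers to.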
External identifiers
External identifiers are keys that one system provides (externalizes) to another system and that uniquely identify a record from the first system. The UDP uses external identifiers to join data from different systems.
External identifiers are usually shared between systems through a data provisioning or launch process. For example, an institution will include SIS identifiers (i.e., external identifiers) in the dataset used to provision the LMS on a nightly basis. As another example, consider that the LMS will pass its own identifiers (i.e., external identifiers) during an LTI launch process to a Learning tool. In both examples, one system is externalizing identifiers from its own system for another system.
It is common for the internal and external identifiers to be distinct. A system that externalizes identifiers for a record may have many reasons not to expose its actual internal identifiers (primary keys). These may include data modeling constraints (such as a requirement to generate a synthetic identifier) and security concerns.
Tools that receive external identifiers from another system may hold on to that system’s external identifiers in a variety of ways. Typically, however, the tool will include an external identifier received from another system as an attribute or property of its own records. For example, an LMS may model its "user" concept to include an "sis_id" attribute.
LMSs and Learning tools must provide the external identifiers they receive from other systems in their context datasets.
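As a sketch of what this looks like in practice, the example below models an LMS user that keeps the SIS identifier as an attribute and then emits a context dataset row carrying both identifier classes. The `LmsUser` class, the `to_context_row` helper, and the output field names `internal_id` and `external_id` are assumptions made for illustration; the actual field names are set by the relevant loading schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LmsUser:
    id: int                 # internal identifier: the LMS's own primary key
    name: str
    sis_id: Optional[str]   # external identifier received from the SIS


def to_context_row(user: LmsUser) -> dict:
    """Build a (hypothetical) context dataset row with both identifier classes."""
    return {
        "internal_id": user.id,
        "external_id": user.sis_id,
        "name": user.name,
    }


row = to_context_row(LmsUser(id=501, name="Ada Example", sis_id="SIS-000123"))
```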
Primary and composite entities
Most UCDM and Loading schema entities are "primary entities" that represent a fundamental concept or idea. These include entities such as Person, Course section, Academic degree, etc. There also exist "composite entities" that are, as their name suggests, composed of two or more entities. These include entities such as Person-Academic degree or Person-Academic term.
In loading schemas, primary entities will always require both “internal” and “external” identifiers in a context dataset. Composite entities will never require external identifiers. Instead, they will contain foreign keys (which are internal identifiers) to other entities in the context dataset.
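One way to picture the rule is a small validation check. It is only a sketch: the `missing_identifiers` function, the `kind` labels, and the field names are hypothetical, and it assumes that a composite entity is checked solely for the foreign keys its schema lists.

```python
def missing_identifiers(record, kind, foreign_keys=()):
    """Return the identifier fields that are absent or empty in a record."""
    if kind == "primary":
        # Primary entities need both identifier classes.
        required = ["internal_id", "external_id"]
    elif kind == "composite":
        # Composite entities need only foreign keys to other entities.
        required = list(foreign_keys)
    else:
        raise ValueError(f"unknown entity kind: {kind!r}")
    return [field for field in required if record.get(field) in (None, "")]


person = {"internal_id": 7, "external_id": None}
person_term = {"person_id": 7, "academic_term_id": 3}

person_problems = missing_identifiers(person, "primary")
composite_problems = missing_identifiers(
    person_term, "composite", foreign_keys=("person_id", "academic_term_id")
)
```

In this example the Person record is flagged for its missing external identifier, while the Person-Academic term record passes with foreign keys alone.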
Example scenarios
The following two examples illustrate the utility of providing both internal and external identifiers in support of the UDP keymap.
SIS context data integration
It is typical for the Student Information System (SIS) to provision the Learning Management System (LMS) with data for which the SIS is the system of record (e.g., users, academic terms, etc.). The provisioning requirements almost always oblige the SIS to include its own identifiers in the dataset used to provision LMS data.
During the provisioning process, the LMS will capture the SIS’s identifiers for a record even as it creates its own new primary keys for the record in its system. Consequently, both systems share a common key (the key included in the SIS dataset) even as they maintain their own primary keys for the same record.
For example, the Instructure Canvas LMS can import batch data from a Student Information System (SIS) to provision its own internal data for users, academic terms, courses, course sections, and enrollments (among others). The Canvas SIS import process will store the SIS keys (i.e., external identifiers) it receives from the SIS as “SIS IDs.” In its Canvas Data product, for example, the SIS ID that Canvas receives for a course is called the “sis_source_id.” The UDP uses Instructure’s Canvas Data product as its source of LMS context data.
It subsequently becomes possible to join SIS and LMS data, because the LMS has captured the identifiers that the SIS provided to it.
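The join can be sketched as follows. This is an illustrative model only, not the UDP keymap implementation: the record layouts, the sample values, and the `build_keymap` helper are assumptions, and only the `sis_source_id` field name comes from the Canvas Data example above.

```python
def build_keymap(sis_rows, lms_rows):
    """Pair SIS and LMS internal ids for records that share an SIS external id."""
    sis_by_external = {row["external_id"]: row for row in sis_rows}
    keymap = []
    for lms_row in lms_rows:
        sis_row = sis_by_external.get(lms_row.get("sis_source_id"))
        if sis_row is not None:
            keymap.append(
                {"sis_internal_id": sis_row["internal_id"],
                 "lms_internal_id": lms_row["internal_id"]}
            )
    return keymap


# Hypothetical course records from each system.
sis_courses = [
    {"internal_id": 9001, "external_id": "SIS-ENG101"},
    {"internal_id": 9002, "external_id": "SIS-BIO200"},
]
lms_courses = [
    {"internal_id": 501, "sis_source_id": "SIS-ENG101"},
    {"internal_id": 502, "sis_source_id": None},  # created in the LMS, not provisioned
]

keymap = build_keymap(sis_courses, lms_courses)
```

Each system keeps its own primary key, and the shared SIS identifier is what lets the two records be recognized as the same course.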
Learning tool context data integration
Learning tools are integrated into, and launched from, Learning Management Systems. When a Learning tool launch occurs in the LMS, it is almost always by using the IMS Global LTI standard. The LTI standard requires that the LMS pass certain data fields to a Learning tool as part of an LTI launch. Included in these required fields are identifiers for certain records (such as the user and course IDs) that the Learning tool can store for its own use. Such identifiers are effectively external LMS identifiers.
When the UDP consumes data from Learning tools, it will require that the Learning tool's UDP Loading schema include the LMS identifiers given to it by the LMS in the LTI launch. Those LMS identifiers are then matched to the LMS's external identifiers, enabling the LMS and Learning tool data to be joined.
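A rough sketch of this flow is shown below. The launch parameter names `user_id` and `context_id` follow LTI 1.1; everything else (the `capture_launch_ids` helper, the stored field names, the `lti_user_id` attribute on the LMS side, and the sample values) is hypothetical and stands in for whatever the tool and the loading schema actually define.

```python
def capture_launch_ids(launch_params):
    """Pick out the LMS identifiers a tool might store from an LTI launch."""
    return {
        "lms_user_id": launch_params.get("user_id"),
        "lms_course_id": launch_params.get("context_id"),
    }


# Hypothetical launch payload received by the Learning tool.
launch = {"user_id": "lti-user-abc", "context_id": "lti-course-xyz", "roles": "Learner"}

# The tool keeps its own primary key alongside the identifiers the LMS passed in.
tool_user = {"id": 42, **capture_launch_ids(launch)}

# Hypothetical LMS user records exposing the same identifier.
lms_users = [{"id": 501, "lti_user_id": "lti-user-abc"}]

match = next(
    (user for user in lms_users if user["lti_user_id"] == tool_user["lms_user_id"]),
    None,
)
```

The matched pair (tool user 42, LMS user 501) is the kind of association that makes it possible to join the two datasets.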