Canvas New Analytics vs. UDP

Introduction & Purpose

Canvas New Analytics is a powerful LTI tool baked into Canvas that allows instructors and admins to view grade analytics, participation metrics, file views, and more across a variety of course contexts. Instructure intends to have Canvas New Analytics replace the current analytics feature at the course and user levels. The main data source for Canvas New Analytics is the requests table: “[r]equest data is the foundation of our student activity data within Canvas.” (Instructure Community Site, Analytics Page Views and Participations).

Existing UDP users may be familiar with the requests table and also the Caliper event_store dataset in BigQuery. Both of these event feeds capture student activity, but their foci differ. The requests table captures what the server sees; its rows record the HTTP request method (GET, POST) as well as the HTTP response status (200, 403, etc.). The Caliper feed (documented here) is governed by 1EdTech’s standard, which centers on “providing a structured approach to describing, collecting and exchanging learning activity data.” (1EdTech Caliper V1.2 Spec, Design Goals and Rationale).

The Caliper standard is both tool- and server-agnostic; the requests table applies only to Canvas and does not generalize to any other teaching and learning tool. Unizin’s approach to aggregating and coalescing teaching and learning data has favored the Caliper standard because of this agnostic approach. The Caliper standard allows Unizin to engage consistently with all teaching and learning tool vendors and unify data across company and organizational silos. The requests table is rich and robust; however, it will always sit behind Instructure’s curtain.

Naturally, a disconnect of experience surfaces for users accustomed to the requests table and Caliper events: the activity metrics seem to differ for a given course in a consistent time frame. Which source, Canvas New Analytics or Caliper events in the UDP, should be trusted?

The answer is not a binary choice between the two, and the purpose of this document is not to villainize either of these solutions. Instead, this document aims to demonstrate each data source's strengths, differences, and quirks so that informed users can confidently navigate and explain the deltas that will naturally surface during comparisons. The requests table and the Caliper event feed from Canvas are independently generated and maintained, and neither focuses on parity with the other.

Analysis Framework & Approach

Jane Russell and Anna Marie Smith at the University of Iowa partnered with Unizin on this exploration of differences between the requests table and the Caliper events. For one of their courses in scope, Anna Marie generated the following histogram of page views between requests and Caliper:

According to Image 1 above, the Caliper histogram (blue, labeled UDP) is positively skewed and has positive kurtosis relative to the New Analytics histogram. This suggests that New Analytics is counting certain events that the UDP version is excluding (or is not capturing at all).
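
As a rough illustration of the two shape statistics mentioned above, the sketch below computes skewness and excess kurtosis from population moments. The per-student page-view counts are invented for the example and are not the University of Iowa data; the function name is also hypothetical.

```python
from statistics import fmean


def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) based on population moments."""
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])  # variance
    m3 = fmean([(v - mean) ** 3 for v in values])  # third central moment
    m4 = fmean([(v - mean) ** 4 for v in values])  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Hypothetical page-view counts per student: most students have few
# views and a handful have many, which produces a long right tail.
udp_page_views = [2, 3, 3, 4, 5, 6, 8, 12, 20, 45]

skew, excess_kurtosis = skewness_and_excess_kurtosis(udp_page_views)
print(f"skewness={skew:.2f}, excess kurtosis={excess_kurtosis:.2f}")
```

A positive skewness indicates a longer right tail, and a positive excess kurtosis indicates heavier tails than a normal distribution. Comparing these two values across sources is one way to quantify what a histogram shows visually.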

Our approach, given this distribution insight, is the following:

For the same course in scope, gather raw rows from both the requests table and the Caliper events
Filter the events from both sets to make a common baseline: we want to focus only on student-generated rows, so we filter out all instructor events and system/daemon-generated rows.
Join the result sets together to identify three sets of data for analysis, visualized as a Venn diagram.

The remainder of this document is the analysis of this Venn Diagram.

Sets of Events - A Venn Diagram

Based on the filtering described above (i.e. filtering to include only events that we know are attributable to a student), the following result set emerges.

The process to generate the above result set is the following:

Filter both the requests table and Caliper events for the same course in scope. We pull the LMS ID for course and apply this in both sources.
Filter both the requests table and Caliper events for only student-generated events. We leverage the course_section_enrollment data in the UDP to identify people enrolled in the course in scope with the role value of Student. The LMS IDs of these students are found, and we filter these sources for user_id (requests) and person.canvas_id (Caliper).
Join the requests data with Caliper data to see what matches and what is unique to each source. In many of the Caliper events, we see a request_id field that matches the id field in the requests table. In these cases, the join key is very straightforward: we simply match the request_id field to the id field.
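
The student filter described in these steps can be sketched as follows. This is a minimal illustration, not the query Unizin ran: the enrollment field name lms_person_id and all sample rows are assumptions, while user_id, person.canvas_id, and the Student role value come from the description above.

```python
# Hypothetical enrollment rows; "lms_person_id" is an illustrative
# field name, not necessarily the UDP column name.
enrollments = [
    {"lms_person_id": "101", "role": "Student"},
    {"lms_person_id": "102", "role": "Student"},
    {"lms_person_id": "900", "role": "Teacher"},
]

# Hypothetical rows from each source for the course in scope.
requests_rows = [
    {"id": "r1", "user_id": "101"},
    {"id": "r2", "user_id": "900"},  # instructor row
    {"id": "r3", "user_id": None},   # system/daemon row with no user
]
caliper_events = [
    {"request_id": "r1", "person": {"canvas_id": "101"}},
    {"request_id": "r9", "person": {"canvas_id": "900"}},
]

# LMS IDs of everyone enrolled in the course with the Student role.
student_ids = {e["lms_person_id"] for e in enrollments if e["role"] == "Student"}

# Keep only student-generated rows in both sources.
student_requests = [r for r in requests_rows if r["user_id"] in student_ids]
student_caliper = [
    e for e in caliper_events if e["person"]["canvas_id"] in student_ids
]
```

Applying the same ID set to both sources is what gives the two result sets a common baseline before they are joined.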

Based on this logic, the following result sets surface:

Overlap - Requests + Caliper: these are the rows that successfully complete the join on request_id and id.
Requests Only: these rows have id values in the requests table that we were not able to find matches for in the Caliper event request_id values.
Caliper Only: these rows have request_id values that we were not able to find matches for in the requests table id values.
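
The three result sets amount to a set comparison on the join key. The sketch below assumes each id and request_id value is unique within its source; the function name and sample IDs are hypothetical.

```python
def classify_events(request_ids, caliper_request_ids):
    """Split join keys into the three regions of the Venn diagram."""
    requests_set = set(request_ids)         # id values from the requests table
    caliper_set = set(caliper_request_ids)  # request_id values from Caliper
    return {
        "overlap": requests_set & caliper_set,
        "requests_only": requests_set - caliper_set,
        "caliper_only": caliper_set - requests_set,
    }


# Hypothetical join keys for illustration only.
result = classify_events(
    request_ids=["r1", "r2", "r3", "r4"],
    caliper_request_ids=["r1", "r2", "r9"],
)
for name, ids in result.items():
    print(name, sorted(ids))
```

In SQL, the same split would typically be a full outer join on request_id = id, with each row labeled according to which side is null.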

Overlap - Requests + Caliper (421K Events, 14.1%)

The majority of the 421K events that align between these two sources are what we call NavigationEvents. This term is a Caliper-specific name for a more general event that we can think of as a “user click”. The important point to emphasize is that each event is user-generated, not system-generated.

The top 10 course asset types navigated by the students are the following (in descending order):

Wiki Pages - (129K events, 30.64%)
Course Homepage - (124K events, 29.45%)
Course Modules - (49K events, 11.64%)
Attachments (also called files) - (41K events, 9.73%)
LTI Tools - (27K events, 6.41%)
Assignments - (16K events, 3.80%)
Discussion Threads - (9K events, 2.14%)
Course Grades - (9K events, 2.14%)
Conversation Topics - (4K events, 0.95%)
Announcements - (1.4K events, 0.33%)

In addition to NavigationEvents, the overlap includes assignment submission events. These account for 1.1K of the 421K events.

Note, the distribution of activity shown above is not intended to model all courses. This is just one example course, and the list of tools and distribution of activity will naturally deviate as courses are all designed differently. What will remain consistent, however, is the type of data that will overlap between the requests table and the Caliper events. User clicks within a Canvas course shell and assignment submissions are common between the two data sources.

Requests Only (2.57M events, 85.9%)

System-Generated API Calls

The largest share of events in this category consists of API calls that happen on the server side but aren’t explicitly user-generated in the way a “click” is. These account for 1.78M (69.3%) of the 2.57M records in this set. In the requests table, the server component handling each API call is identified by the web_application_controller and web_application_action fields. Examples include:

wiki_pages_api:show_revision
tabs:index
courses:ping
context_module_items_api:item_sequence
courses:activity_stream_summary
files:api_show
courses:permissions
feature_flags:enabled_features

The index rows seem to align with the server figuring out how different assets need to be arranged before sending a response back to the user interface. The show_revision for wiki_pages seems to resolve the latest published version of a page for users to see. The permissions rows seem to check for what items need to be visible based on the user in scope.

It should be noted, however, that not all API calls fall in this set. We do see Caliper events with URL values that signal an API call. What distinguishes the API calls that exist only in the requests table is that they appear to be system-generated or to occur as a consequence of a separate user action. For example, the show_revision events for wiki_pages have a preceding click or navigation to the wiki page in scope, and that navigation triggers the Canvas server to initiate the show_revision API call accordingly.
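
One way to surface these patterns is to tally the Requests Only rows by their controller and action pair. The sketch below uses a few of the pairs listed above; the row counts are invented for illustration and the function name is hypothetical.

```python
from collections import Counter

# Hypothetical Requests Only rows; only the two fields named in this
# section are shown.
requests_only_rows = [
    {"web_application_controller": "courses", "web_application_action": "ping"},
    {"web_application_controller": "courses", "web_application_action": "ping"},
    {"web_application_controller": "tabs", "web_application_action": "index"},
    {"web_application_controller": "wiki_pages_api",
     "web_application_action": "show_revision"},
]


def controller_action_counts(rows):
    """Count rows per 'controller:action' key."""
    return Counter(
        f'{row["web_application_controller"]}:{row["web_application_action"]}'
        for row in rows
    )


counts = controller_action_counts(requests_only_rows)
for key, n in counts.most_common():
    print(key, n)
```

Sorting by frequency makes it easier to spot which server-side calls dominate the set before deciding which ones are relevant to student activity.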

Item Redirects

Instances where Canvas issues a redirect event are only captured in the requests table. The most common web_application_controller values that trigger this are lti/ims/authentication and the context_module_items_api.

The lti/ims/authentication controller has a URL path that passes in the current user information and allows the LTI tool to authenticate (or reject) the tool launch. The tool launch itself is represented in both the Caliper events and requests; only the authentication response is represented in the requests table.

These redirects make up about 6,300 events for the example course in scope.

File Previews

There are three web_application_controller values that capture file previews in the requests table: files, file_previews, and submissions/previews. There are just over 600K (23.3% of the 2.57M) events that fall into this category.

The pattern we noticed with these events is that they seem to follow a navigation action to a page, and the “preview” part of the URL corresponds to content embedded into the page. On the Caliper and requests sides, we capture the navigation to the page equally. On the requests side, though, if the page has 5 files or images embedded, we also see 5 file “preview” events in the requests table that we don’t see in the Caliper events.
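
The embedded-content pattern can be sketched with a single page load. The three preview controller names come from this section; the wiki_pages navigation row and all IDs are hypothetical, and treating every row under those controllers as a preview is a simplification.

```python
# Controller values that this section associates with file previews.
PREVIEW_CONTROLLERS = {"files", "file_previews", "submissions/previews"}

# Hypothetical requests rows for one student opening one page that
# embeds five files: one navigation row plus five preview rows.
requests_rows = [{"id": "r0", "web_application_controller": "wiki_pages"}] + [
    {"id": f"r{i}", "web_application_controller": "file_previews"}
    for i in range(1, 6)
]

# Caliper records only the user-generated navigation to the page.
caliper_request_ids = {"r0"}

previews = [
    row for row in requests_rows
    if row["web_application_controller"] in PREVIEW_CONTROLLERS
]
navigations = [
    row for row in requests_rows
    if row["web_application_controller"] not in PREVIEW_CONTROLLERS
]

# Preview rows with no matching Caliper event land in Requests Only.
unmatched_previews = [r for r in previews if r["id"] not in caliper_request_ids]
print(len(navigations), len(previews), len(unmatched_previews))
```

In this toy case the requests table holds six rows for what Caliper records as one event, which is the kind of gap that inflates page-view counts on the requests side.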

This makes sense on the surface since the Caliper events are user-generated, and loading embedded content isn’t user-generated. However, the events in this category warrant learning analytics considerations. Even though embedded “preview” files and content are not explicitly clicked by users, the files still have “views”.

The University of Iowa’s initial analysis focused on deltas with page views between Canvas New Analytics and the UDP. The Unizin team believes the presence of these “preview” events in the requests table is the cause of this delta; Canvas New Analytics consistently counts higher page views than are present in the Caliper events.

The list of files and assets in Image 3 above illustrates this embedded behavior. It appears based on these filenames that some of these are icons that are parts of other pages; however, they are not standalone pieces of content that students would be expected to access in isolation. Thus, we see these as rows in the requests table and not in Caliper.

A larger analytics question surfaces from this insight. To what extent should “previews” be included in models and analytics for student activity? Previews to simple icons and buttons are probably irrelevant; it’s fine to leave these out of Caliper events. However, an instructive flowchart or embedded assignment PDF may be critical for learning analytics and student success. This may be a valid gap in the current set of Caliper events used for analytics.

Caliper Only (117 events, ~0%)

The size of this set of events is so small that Unizin’s intuition is that these 117 events should be included in the requests table (and thus, this set rolls into the Overlap - Requests + Caliper set).

These are all NavigationEvents, and these events also have a request_id value. We think there may have been a small outage on Instructure’s side that caused these 117 events to be dropped from the requests table. We are not certain, however, and the next step to understand this gap relies on speaking with Instructure. We don’t think these 117 events detract from the other insights discussed in this document.

Conclusion - Understanding Each Source

Both the requests table and the Caliper events uncover the interactions and activity happening in Canvas. The Caliper events focus solely on user-generated behavior. The requests table events focus on the view of Instructure’s servers.

The Caliper events are effectively a subset of the requests table: aside from the anomaly of 117 events, we found that every student Caliper event has a request_id value that matches an id value in the requests table.

Unizin’s opinion is that learning analytics and student success modeling should occur on the Caliper events, even though the set of data is much smaller than what is in the requests table. From this case study, 85.9% of the events in the requests table could be interpreted as “noise” from a learning analytics lens. The only caveat is the discussion around file previews; there may be cause to explore bridging the requests table and the Caliper events for these records from a student activity/engagement point of view.
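
The set shares quoted in this document follow from the three set sizes. The sketch below recomputes them from the rounded figures reported above, so it reproduces the percentages only to one decimal place.

```python
# Set sizes as reported in this document (rounded, example course only).
set_sizes = {
    "overlap": 421_000,
    "requests_only": 2_570_000,
    "caliper_only": 117,
}

total = sum(set_sizes.values())

# Share of each set as a percentage of all events across both sources.
shares = {
    name: round(100 * count / total, 1) for name, count in set_sizes.items()
}
print(shares)  # {'overlap': 14.1, 'requests_only': 85.9, 'caliper_only': 0.0}
```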

The requests table is exhaustive and is the source of truth for Canvas New Analytics; however, the amount of “noise” in the requests table makes it difficult to recommend Canvas New Analytics as a good source of course-level analytics. Unizin recommends obtaining more detailed documentation from Instructure on what filters and requirements are enforced on the requests table for each visualization in Canvas New Analytics before concluding that it can be trusted.

Unizin is taking two next steps:

Coordinating a follow-up conversation with the Instructure team on our findings, and working with them on more detailed explanations of how visualizations are designed and qualified against the requests table in Canvas New Analytics.
Modeling requirements for file preview events that should exist in the Caliper event set. This will entail working with consortium UDP users to make sure our efforts capture a valid set of student activity that may be currently missing in the Caliper events.

For more information or questions about anything discussed in this document, please submit a support request by emailing [email protected]. The Unizin DSS and Services teams will correspond accordingly.

