Copyright © 2023, Unizin, Ltd.

UDP Distributions


Last updated 7 months ago

The udp_distributions.context_store_distributions mart defines the distributions of categorical and continuous fields in the UDP context store. The distribution of a categorical field is defined as the proportion of records associated with each possible field value to the total records for the field. The distribution of a continuous field is defined as the average, median, standard deviation, minimum, and maximum value of the field. Some categorical distributions are defined with respect to other fields, to account for relationships between fields. Distributions are defined for most fields and entities in the UDP context store.

BQ Prod Dataset Location

udp_distributions

Schema

udp_distributions.context_store_distributions

| Field | Type | Description |
| --- | --- | --- |
| entity | STRING | The entity that includes the field for which the distribution is defined. |
| field_name | STRING | The field for which the distribution is defined. |
| field_value | STRING | For categorical distributions, a possible value of the field. Null for continuous distributions. |
| distribution_type | STRING | Whether the distribution is categorical or continuous. |
| avg_field_value | NUMERIC | For continuous distributions, the average of the values associated with the field. Null for categorical distributions. |
| median_field_value | NUMERIC | For continuous distributions, the median of the values associated with the field. Null for categorical distributions. |
| sd_field_value | NUMERIC | For continuous distributions, the standard deviation of the values associated with the field. Null for categorical distributions. |
| min_field_value | NUMERIC | For continuous distributions, the minimum of the values associated with the field. Null for categorical distributions. |
| max_field_value | NUMERIC | For continuous distributions, the maximum of the values associated with the field. Null for categorical distributions. |
| field_source | STRING | Whether the field is sourced directly from the UCDM or is a custom field. Currently, this field is always 'UCDM'. |
| num_records | INTEGER | For categorical distributions, the number of records associated with the field value. Null for continuous distributions. |
| pct_records | FLOAT | For categorical distributions, the number of records for that field value as a share of the number of records for all possible values of that field. Null for continuous distributions. |
| with_respect_to_field_name_1 | STRING | If the distribution is defined with respect to other fields, the first field that the distribution may be defined with respect to. |
| with_respect_to_field_name_2 | STRING | If the distribution is defined with respect to other fields, the second field that the distribution may be defined with respect to. |
| with_respect_to_field_name_3 | STRING | If the distribution is defined with respect to other fields, the third field that the distribution may be defined with respect to. |
| with_respect_to_field_value_1 | STRING | If the distribution is defined with respect to other fields, a possible value for the first field that the distribution may be defined with respect to. |
| with_respect_to_field_value_2 | STRING | If the distribution is defined with respect to other fields, a possible value for the second field that the distribution may be defined with respect to. |
| with_respect_to_field_value_3 | STRING | If the distribution is defined with respect to other fields, a possible value for the third field that the distribution may be defined with respect to. |
| number_of_dependencies | INTEGER | The number of dependencies of the distribution. If the distribution is defined with respect to no other fields, the number of dependencies is zero; if the distribution is defined with respect to one other field, the number of dependencies is one; and so on. Currently, the maximum possible number of dependencies is three. |

Categorical Distributions

In this mart, the distribution of a categorical field is defined as the proportion of the number of records associated with each possible value of the field to the total number of records for the field. For example, a boolean field has three possible values: true, false, and null. The proportions of records associated with each of these three values make up the distribution of that field.
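As a rough illustration of the definition above, the following Python sketch computes the count and proportion for each value of a boolean field, treating null (None) as its own value. The function name and sample data are hypothetical, and the proportion is expressed as a fraction; this page does not state whether pct_records is stored as a fraction or a percentage.

```python
from collections import Counter


def categorical_distribution(values):
    """Return {field_value: (num_records, pct_records)} for a list of values.

    None stands in for a null field value and is counted as its own category.
    pct_records is expressed as a fraction of all records (an assumption).
    """
    total = len(values)
    counts = Counter(values)
    return {value: (n, n / total) for value, n in counts.items()}


# Hypothetical boolean field with 10 records: 5 true, 3 false, 2 null.
sample = [True, True, True, True, True, False, False, False, None, None]
boolean_dist = categorical_distribution(sample)

for value, (num_records, pct_records) in boolean_dist.items():
    print(value, num_records, pct_records)
```

The three proportions sum to one because every record, including those with a null value, falls into exactly one category.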

Some categorical distributions are defined with respect to other fields. These distributions have a non-zero number_of_dependencies value. They are included in the mart so that important relationships between fields are preserved. For example, role_status in course_section_enrollment has a distribution with one dependency, with respect to role. In turn, enrollment_status has a distribution with two dependencies, with respect to role_status and role. For distributions with dependencies, the proportion of records is no longer the ratio of the number of records for each possible field value to the total number of records for the field. Instead, it is the ratio of the number of records for each possible field value to the number of records for each possible value of the related fields.
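One way to read this definition is as a conditional proportion: the share of each field value within each combination of related-field values. The Python sketch below works under that reading. The function name, the record layout, and the role and role_status values are all invented for illustration and are not taken from the UDP.

```python
from collections import Counter


def dependent_distribution(records, field, with_respect_to):
    """Proportion of each field value within each related-field combination.

    records: list of dicts (hypothetical stand-ins for context store rows)
    field: name of the field whose distribution is being described
    with_respect_to: list of related field names (the dependencies)
    Returns {(related_values, field_value): proportion}.
    """
    group_totals = Counter()
    pair_counts = Counter()
    for record in records:
        related = tuple(record[name] for name in with_respect_to)
        group_totals[related] += 1
        pair_counts[(related, record[field])] += 1
    # Divide by the size of the related-field group, not by all records.
    return {
        key: count / group_totals[key[0]]
        for key, count in pair_counts.items()
    }


# Invented enrollment records; the values are placeholders.
enrollments = [
    {"role": "Student", "role_status": "Enrolled"},
    {"role": "Student", "role_status": "Enrolled"},
    {"role": "Student", "role_status": "Dropped"},
    {"role": "Student", "role_status": "Enrolled"},
    {"role": "Teacher", "role_status": "Enrolled"},
]
role_status_dist = dependent_distribution(enrollments, "role_status", ["role"])
print(role_status_dist)
```

Under this reading, the proportions sum to one within each value of role rather than across the whole table, which is why the single Teacher record has a proportion of 1.0.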

Continuous Distributions

Distributions of continuous fields are defined as the average, median, standard deviation, minimum, and maximum values of the field. When defining the distributions for this mart, we wanted them to serve both as an accurate reflection of the UDP data and as a useful representation of learners and learning behavior. To achieve the latter, we remove some outliers before calculating the continuous distribution metrics. We detect possible outlier values for a field using the interquartile range (IQR), which is the range between the first quartile (Q1) and the third quartile (Q3). We remove any values below Q1-1.5*IQR or above Q3+1.5*IQR.
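The outlier rule and the five summary metrics can be sketched in Python as follows. This is an illustration only: the quartile method (the default of statistics.quantiles) and the use of the sample standard deviation are assumptions, since this page does not say which variants the mart uses. The function name and sample scores are hypothetical.

```python
import statistics


def continuous_distribution(values):
    """Summarize a numeric field after dropping IQR outliers.

    Values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR are removed first.
    The quartile method and sample standard deviation are assumptions.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    kept = [v for v in values if lower <= v <= upper]
    return {
        "avg_field_value": statistics.mean(kept),
        "median_field_value": statistics.median(kept),
        "sd_field_value": statistics.stdev(kept),
        "min_field_value": min(kept),
        "max_field_value": max(kept),
    }


# Hypothetical scores; 500 is far outside the bulk of the data.
scores = [70, 72, 75, 78, 80, 82, 85, 88, 90, 500]
score_stats = continuous_distribution(scores)
print(score_stats)
```

With these sample scores, the value 500 lies above the upper fence and is excluded, so the reported maximum is 90 rather than 500.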

Staging Helper Tables

Three other tables exist in the udp_distributions dataset:

  • categorical_distributions - categorical fields from the UCDM that have no dependency on other fields, described by the proportion of each categorical value (e.g. role, grade_state, completion_type)

  • continuous_distributions - fields from the UCDM that are numeric and of a continuous nature (e.g. score, weight, time_limit)

  • categorical_dependent_distributions - categorical fields from the UCDM that depend on other fields (e.g. role_status, which depends on role; grade_on_official_transcript, which depends on grading_basis; and is_anonymous_peer_reviews, which depends on has_peer_reviews)

The schema for all three of these staging tables matches the schema above. The three staging tables are created first, and the final context_store_distributions table is their union.
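The relationship between the staging tables and the final table can be sketched in Python, with lists of dicts standing in for table rows. The literal distribution_type values 'categorical' and 'continuous', the sample rows, and both function names are assumptions made for illustration; only the column names come from the schema above.

```python
def union_staging_tables(categorical, continuous, categorical_dependent):
    """Concatenate the rows of the three staging tables into one list."""
    return [*categorical, *continuous, *categorical_dependent]


def staging_table_for(row):
    """Guess which staging table a final-table row came from.

    Assumes distribution_type holds 'categorical' or 'continuous' and that
    dependent distributions have a non-zero number_of_dependencies.
    """
    if row["distribution_type"] == "continuous":
        return "continuous_distributions"
    if row["number_of_dependencies"] > 0:
        return "categorical_dependent_distributions"
    return "categorical_distributions"


# Invented rows showing only a few of the schema's columns.
categorical_rows = [
    {"field_name": "role", "distribution_type": "categorical",
     "number_of_dependencies": 0},
]
continuous_rows = [
    {"field_name": "score", "distribution_type": "continuous",
     "number_of_dependencies": 0},
]
dependent_rows = [
    {"field_name": "role_status", "distribution_type": "categorical",
     "number_of_dependencies": 1},
]

final_rows = union_staging_tables(categorical_rows, continuous_rows, dependent_rows)
origins = [staging_table_for(row) for row in final_rows]
print(origins)
```

Because all three staging tables share one schema, the union needs no column mapping, and a consumer of the final table can separate the three kinds of distribution again by filtering on distribution_type and number_of_dependencies.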
