The Argument for a “Data Cube” for Large-Scale Psychometric Data

TECHNOLOGY REPORT
published: 18 July 2019
doi: 10.3389/feduc.2019.00071

Alina A. von Davier*, Pak Chung Wong, Steve Polyak and Michael Yudelson
ACTNext, ACT Inc, Iowa City, IA, United States

Edited by: Frank Goldhammer, German Institute for International Educational Research (LG), Germany
Reviewed by: Pei Sun, Tsinghua University, China; Hendrik Drachsler, German Institute for International Educational Research (LG), Germany
*Correspondence: Alina A. von Davier, Alina.vonDavier@act.org
Specialty section: This article was submitted to Educational Psychology, a section of the journal Frontiers in Education
Received: 19 November 2018

In recent years, work with educational testing data has changed due to the affordances provided by technology, the availability of large data sets, and by the advances made in data mining and machine learning. Consequently, data analysis has moved from traditional psychometrics to computational psychometrics. Despite advances in the methodology and the availability of the large data sets collected at each administration, the way assessment data is collected, stored, and analyzed by testing organizations is not conducive to these real-time, data intensive computational methods that can reveal new patterns and information about students. In this paper, we propose a new way to label, collect, and store data from large scale educational learning and assessment systems (LAS) using the concept of the “data cube.” This paradigm will make the application of machine-learning, learning analytics, and complex analyses possible. It will also allow for storing the content for tests (items) and instruction (videos, simulations, items with scaffolds) as data, which opens up new avenues for personalized learning. This data paradigm will allow us to innovate at a scale far beyond the hypothesis-driven, small-scale research that has characterized educational research in the past.

Keywords: database alignment, learning analytics, diagnostic models, learning pathways, data standards

INTRODUCTION

In recent years, work with educational testing data has changed due to the affordances provided by technology, availability of large data sets, and due to advances made in data mining and machine learning. Consequently, data analysis moved from traditional psychometrics to computational psychometrics. In the computational psychometrics framework, psychometric theory is blended with large scale, data-driven knowledge discovery (von Davier, 2017). Despite advances in the methodology and the availability of the large data sets collected at each test administration, the way the data (from multiple test forms at multiple test administrations) is currently collected, stored and analyzed by testing organizations is not conducive to these real-time, data intensive computational psychometrics and analytics methods that can reveal new patterns and information about students.

In this paper we primarily focus on data collected from large-scale standardized testing programs that have been around for decades and that have multiple administrations per year.
Accepted: 03 July 2019
Published: 18 July 2019
Citation: von Davier AA, Wong PC, Polyak S and Yudelson M (2019) The Argument for a “Data Cube” for Large-Scale Psychometric Data. Front. Educ. 4:71. doi: 10.3389/feduc.2019.00071

Recently, many testing organizations have started to consider including performance or activity-based tasks in their assessments, developing formative assessments, or embedding assessments into the learning process, which has led to new challenges around data governance: data design, collection, alignment, and storage. Some of these challenges have similarities with those encountered and addressed in the field of learning analytics, in which multiple types of data are merged to provide a comprehensive picture of students’ progress. For example, Bakharia et al. (2016), Cooper (2014), and Rayon et al. (2014) propose solutions for the interoperability of learning data coming from multiple sources. In recent years, testing organizations have started to work with logfiles, and even before data exchange standards for activities and events, such as the Caliper or xAPI standards, had been developed, researchers worked on designing data schemas for this type of rich data (see Hao et al., 2016). The approach presented in this paper conceptually builds on these approaches, while focusing on data governance for testing organizations.

Content as Data

Additionally, in this paper we expand the traditional definition of educational data (learning and testing data) to include the content (items, passages, scaffolding to support learning), taxonomies (educational standards, domain specifications), and the items’ metadata (including item statistics, skills, and attributes associated with each item), alongside the students’ demographics, responses, and process data. Rayon et al. (2014) and Bakharia et al. (2016) also proposed including the content and context for learning data in their data interoperability structures for learning analytics, the Scalable Competence Assessment through a Learning Analytics approach (SCALA) and the Connected Learning Analytics (CLA) toolkit, respectively. The difference from their approach lies in the specifics of the content for tests (items), its usage in psychometrics (item banks with metadata), and domain structures such as taxonomies or learning progressions. In addition, we propose a natural language processing (NLP) perspective on these data types that facilitates the analysis and integration with the other types of data.

Database Alignment

In this paper, we propose a new way to label, collect, and store data from large-scale educational learning and assessment systems (LAS) using the concept of the “data cube,” which was introduced by data scientists in the past decade to deal with big data stratification problems in marketing contexts. This concept is also mentioned by Cooper (2014) in the context of interoperability for learning analytics. In statistics and data science the data cube is related to the concept of database alignment, where multiple databases are aligned on various dimensions under some prerequisites (see Gilbert et al., 2017).
Any meaningful learning and assessment system is based on
Applying this paradigm to educational test data a good match of the samples of items and test takers, in terms is quite challenging, due to the lack of coherence of traditional of the difficulty and content on the items’ side, and ability and content tagging, of a common identity management system for educational needs on the students’ side. In order to facilitate test-takers across testing instruments, of collaboration between this match at scale, the responses to the test items, the items psychometricians and data scientists, and until recently, of the themselves and their metadata, and demographic data, need to lack of proven validity of the newly proposed machine learning be aligned. Traditionally, in testing data, we collected and stored methods for measurement. Currently, data for psychometrics the students’ responses and the demographic data, but the items, is stored and analyzed as a two-dimensional matrix—item by instructional content, and the standards have been stored often examinee. In the time of big data, the expectation is not only as a narrative and often it has not been developed, tagged, or that one has access to large volumes of data, but also that the stored in a consistent way. There are numerous systems for data can be aligned and analyzed on different dimensions in real authoring test content, from paper-based, to Excel spreadsheets, time—including various item features like content standards. to sophisticated systems. Similarly, the taxonomies or theoretical The best part is that the testing data available from the large frameworks by which the content is tagged are also stored in testing organizations is valid (the test scores measure what they different formats and systems, again from paper to open-sources are supposed to measure, and these validity indices are known) systems, such as OpenSALT. OpenSALT is an Open source and data privacy policies have been followed appropriately when Standards ALignment Tool that can be used to inspect, ingest, the data was collected. These are two important features that edit, export and build crosswalks of standards expressed using the support quality data and the statistical alignment of separate IMS Global Competencies and Academic Standards Exchange databases (see Gilbert et al., 2017). (CASE) format; we will refer to data standards and models in more detail later in the paper. Some testing programs have well- Data Cubes designed item banks where the items and their metadata are The idea of relational databases has evolved over time, but the stored, but often the content metadata is not necessarily attached paradigm of the “data cube” is easy to describe. Obviously, to a taxonomy. the “data cube” is not a cube, given that different data-vectors We propose that we rewrite the taxonomies and standards are of different lengths. A (multidimensional) data cube is as data in NLP structures that may take the form of sets, or designed to organize the data by grouping it into different mathematical vectors, and add these vectors as dimensions to the dimensions, indexing the data, and precomputing queries “data cube.” Similarly, we should vectorize the items’ metadata frequently. Psychometricians and data scientists can interactively and/or item models and align them on different dimensions of navigate their data and visualize the results through slicing, the “cube.” dicing, drilling, rolling, and pivoting, which are various ways Data Lakes to query the data in a data science vocabulary. 
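The alignment just described (responses, item metadata with taxonomy tags, and demographics brought onto shared dimensions) can be sketched in a few lines. The following is a minimal, hypothetical illustration in Python/pandas; the column names and values (student_id, item_id, skill_code, and so on) are assumptions made for the example, not a schema prescribed in this paper.

```python
# Minimal sketch (assumed schema): aligning responses, item metadata, and
# demographics on shared keys so they can be analyzed along several dimensions.
import pandas as pd

responses = pd.DataFrame({
    "student_id": ["s1", "s1", "s2", "s2"],
    "item_id": ["i1", "i2", "i1", "i2"],
    "administration": ["2019-04"] * 4,
    "score": [1, 0, 1, 1],
})

items = pd.DataFrame({       # item bank: statistics plus taxonomy tags
    "item_id": ["i1", "i2"],
    "difficulty": [-0.3, 0.8],
    "skill_code": ["HF.N.1", "HF.A.2"],
})

students = pd.DataFrame({    # demographic / ancillary data
    "student_id": ["s1", "s2"],
    "grade": [11, 12],
})

# Align the three sources on their keys; the result can then be grouped,
# sliced, or pivoted along any of the attached dimensions.
aligned = (responses
           .merge(items, on="item_id", how="left")
           .merge(students, on="student_id", how="left"))

print(aligned.groupby(["skill_code", "grade"])["score"].mean())
```

Once the sources share keys in this way, the same records can be pivoted into the familiar examinee-by-item matrix for psychometric models, or aggregated by skill, administration, and demographic group.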
Because all the data are indexed and precomputed, a data cube query often runs The proposed data cube concept could be embedded within significantly faster than standard queries. Once a data cube is the larger context of psychometric data, such as ACT’s data built and precomputed, intuitive data projections on different lake. At ACT, we are building the LEarning Analytics Platform dimensions can be applied to it through a number of operations. (LEAP) for which we proposed an updated version of this Traditional psychometric models can also be applied at scale and data-structure: the in-memory database technology that allows in real time in ways which were not possible before. for newer interactive visualization tools to query a higher Frontiers in Education | www.frontiersin.org 2 July 2019 | Volume 4 | Article 71 von Davier et al. Data Cubes for Large-Scale Psychometric Data number of data dimensions interactively. A data lake is a storage solution based on an ability to host large amounts of unprocessed, raw data in the format the sender provides. This includes a range of data representations such as structured, semi-structured, and unstructured. Typically, in a data lake solution, the data structure, and the process for formally accessing it, are not defined until the point where access is required. An architecture for a data lake is typically based on a highly distributed, flexible, scalable storage solution like the Hadoop Distributed File System (HDFS). These types of tools are becoming familiar to testing organizations, as the volume and richness of event data increase. They also facilitate a parallel computational approach for the parameter estimation of complex psychometric models applied to large data sets (see von Davier, 2016). Data Standards for Exchange Data standards allow those interoperating in a data ecosystem to access and work with this complex, high-dimensional data (see for example, Cooper, 2014). There are several data standards that exist in the education space which allow schools, testing, and learning companies to share information and build new knowledge, such as combining the test scores with the GPA, attendance data, and demographics for each student in order to identify meaningful patterns that may lead to differentiated instructions or interventions to help students improve. We will describe several of these standards FIGURE 1 | A relational database. and emphasize the need for universal adoption of data standards for better collaboration and better learning analytics at scale. In the rest of the paper, we describe the evolution of data Relational Data Model and Relational storage and the usefulness of the data cube paradigm for large- scale psychometric data. We then describe the approach we are Databases Management System (RDBMS) considering for testing and learning data (including the content). In a relational data model, data are stored in a table with In the last section, we present preliminary results from a real- rows and columns that look similar to a spreadsheet, as shown data example of the alignment of two taxonomies from the in Figure 1. The columns are referred to as attributes or taxonomy-dimension in the “data cube.” fields, the rows are called tuples or records, and the table that comprises a set of columns and rows is the relation in RDMBS literature. THE FOUNDATIONS OF THE DATA CUBE The technology was developed when CPU speed was AND ITS EXTENSIONS slow, memory was expensive, and disk space was limited. 
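A minimal, self-contained sketch of the relational model just introduced (and of the normalization and SQL querying discussed next) is shown below, using Python's built-in sqlite3 module. The two normalized tables and the key-based join are illustrative assumptions in the spirit of Figure 1, not the actual schema of any testing program.

```python
# Illustrative relational schema: two normalized tables joined through a key.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE scores   (student_id INTEGER, subject TEXT, year INTEGER, score REAL);
    INSERT INTO students VALUES (1, 'Chloe'), (2, 'Ada');
    INSERT INTO scores   VALUES (1, 'Math', 2015, 3), (1, 'Science', 2015, 2),
                                (2, 'Math', 2015, 4), (2, 'Science', 2015, 3);
""")

# Information retrieval by joining the normalized tables through the shared key.
for row in con.execute("""
        SELECT s.name, sc.subject, sc.year, sc.score
        FROM scores AS sc
        JOIN students AS s USING (student_id)
        WHERE sc.year = 2015"""):
    print(row)
```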
Background and Terminology Consequently, design goals were influenced by the need to In computer science literature, a data cube is a multi- eliminate the redundancies (or duplicated information), such dimensional data structure, or a data array in a computer as “2015” in the Year column in Figure 1, through the programming context. Despite the implicit 3D structural concept concept of normalization. The data normalization process derived from the word “cube,” a data cube can represent involves breaking down a large table into smaller ones through any number of data dimensions such as 1D, 2D. . . nD. a series of normal forms (or procedures). The discussion In scientific computing studies, such as computational fluid of the normalization process is important, but beyond the dynamics, data structures similar to a data cube are often scope of this paper. Readers are referred to Codd (1970) for referred to as scalars (1D), vectors (2D), or tensors (3D). further details. We will briefly discuss the concept of the relational data Information retrieval from these normalized tables can be model (Codd, 1970) and the corresponding relational databases done by joining these tables through the use of unique keys management system (RDBMS) developed in the 70’s, followed identified during the normalization process. The standard by the concept of the data warehouse (Inmon, 1992; Devlin, RDBMS language for maintaining and querying a relational 1996) developed in the 80’s. Together they contributed to the database is Structured Query Language (SQL). Variants of development of the data cube (Gray et al., 1996) concept in SQL can still be found in most modern day databases and the 90’s. spreadsheet systems. Frontiers in Education | www.frontiersin.org 3 July 2019 | Volume 4 | Article 71 von Davier et al. Data Cubes for Large-Scale Psychometric Data Data Warehousing The concept of data warehousing was presented by Devlin and Murphy in 1988, as described by Hayes (2002). A data warehouse is primarily a data repository from one or more disparate sources, such as marketing or sales data. Within an enterprise system, such as those commonly found in many large organizations, it is not uncommon to find multiple systems operating independently, even though they all share the same stored data for market research, data mining, and decision support. The role of data warehousing is to eliminate the duplicated efforts in each decision support system. A data warehouse typically includes some business intelligence tools, tools to extract, transform, and load data into the repository, as well as tools to manage and retrieve the data. Running complex SQL queries on a large data warehouse, however, can be time consuming and too costly to be practical. Data Cube Due to the limitations of the data warehousing described above, data scientists developed the data cube. A data cube is designed to organize the data by grouping it into different dimensions, indexing the data, and precomputing queries frequently. Because all the data are indexed and precomputed, a data cube query often runs significantly faster than a standard SQL query. In business intelligence applications, the data cube concept is often referred to as Online Analytical Processing (OLAP). FIGURE 2 | A 3D data cube. Online Analytical Processing (OLAP) and Business Intelligence The business sector developed OnLine Analytical Processing technology (OLAP) to conduct business intelligence analysis Dicing and look for insights. 
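The slicing, dicing, drilling (rolling-up), and pivoting operations described in the subsections that follow can be made concrete with a small sketch. The snippet below uses Python/pandas as one illustrative tool (the paper does not prescribe a specific one); the scores are invented, except that Emily's 2015 Math sub-scores follow the worked example in the text.

```python
# Cube-style operations on a long-format score table (illustrative data).
import pandas as pd

scores = pd.DataFrame({
    "name":    ["Emily"] * 6 + ["Noah"] * 3,
    "year":    [2015, 2015, 2015, 2016, 2016, 2016, 2015, 2015, 2015],
    "subject": ["Calculus", "Algebra", "Topology"] * 3,
    "score":   [1, 3, 2,  3, 3, 4,  4, 2, 3],   # Emily's 2015 scores match the text
})

# Slicing: fix a single value on one dimension (here, the Year dimension).
slice_2015 = scores[scores["year"] == 2015]

# Dicing: pick specific values along several dimensions at once.
dice = scores[scores["name"].isin(["Emily", "Noah"])
              & scores["subject"].isin(["Calculus", "Algebra"])]

# Drilling-up / rolling-up: aggregate the sub-subjects into a single Math score.
math_rollup = scores.groupby(["name", "year"])["score"].mean()
print(math_rollup.loc[("Emily", 2015)])   # -> 2.0, the average of 1, 3, and 2

# Pivoting: view the same data from a Year-vs.-Subject perspective instead.
print(scores.pivot_table(index="year", columns="subject", values="score", aggfunc="mean"))
```

In an OLAP engine these projections are precomputed and indexed on disk; in the in-memory setting discussed later in the paper they can be computed on demand.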
An OLAP data cube is indeed a multidimensional array of data. For example, the data cube in Figure 2 represents the same relational data table shown in Figure 1 with scores from multiple years (i.e., 2015–2017) of the same five students (Noah, Chloe, Ada, Jacob, and Emily) in three academic fields (Science, Math, and Technology). Once again, there is no limitation on the number of dimensions within an OLAP data cube; the 3D cube in Figure 2 is simply for illustrative purposes. Once a data cube is built and precomputed, intuitive data projections (i.e., mappings of a set into a subset) can be applied to it through a number of operations.

Describing data as a cube has a lot of advantages when analyzing the data. Users can interactively navigate their data and visualize the results through slicing, dicing, drilling, rolling, and pivoting.

Slicing

Given a data cube, such as the one shown in Figure 2, users can, for example, extract a part of the data by slicing a rectangular portion of it from the cube, as highlighted in blue in Figure 3A. The result is a smaller cube that contains only the 2015 data in Figure 3B. Users can slice a cube along any dimension. For example, Figure 4 shows an example of slicing along the Name dimension highlighted in blue, and Figure 5 shows an example of slicing along the Subject dimension.

FIGURE 3 | (A,B) Slicing along the Year dimension of a data cube.
FIGURE 4 | Slicing along the Name dimension of a data cube.
FIGURE 5 | Slicing along the Subject dimension of a data cube.

Dicing

The dicing operation is similar to slicing, except dicing allows users to pick specific values along multiple dimensions. In Figure 6, the dicing operation is applied to both the Name (Chloe, Ada, and Jacob) and Subject (Calculus and Algebra) dimensions. The result is a small 2 × 3 × 3 cube shown in the second part of Figure 6.

FIGURE 6 | Dicing a 3D data cube.

Drilling

Drilling-up and -down are standard data navigation approaches for multi-dimensional data mining. Drilling-up often involves an aggregation (such as averaging) of a set of attributes, whereas drilling-down brings back the details of a prior drilling-up process.

The drilling operation is particularly useful when dealing with core academic skills that can be best described as a hierarchy. For example, Figure 7A shows four skills of Mathematics (i.e., Number and Quantity; Operations, Algebra, and Functions; Geometry and Measurement; and Statistics and Probability) as defined by the ACT Holistic Framework (Camara et al., 2015). Each of these skill sets can be further divided into finer sub-skills. Figure 7B shows an example of dividing the Number and Quantity skill from Figure 7A into eight sub-skills—from Counting and Cardinality to Vectors and Matrices.

FIGURE 7 | (A) Four skills of Mathematics. (B) Eight sub-skills of the Number and Quantity skill.

Figure 8 shows a drill-down operation in a data cube that first slices along the Subject dimension with the value “Math.” The result is a slice of only the Math scores for all five names from 2015 to 2017 in Figure 8. The drilling-down operation in Figure 8 then expands that single Math score into the three Math sub-scores of Calculus, Algebra, and Topology that it summarizes. For example, Emily’s 2015 Math score is 2, which is an average of her Calculus (1), Algebra (3), and Topology (2) scores, as depicted in Figure 8.

FIGURE 8 | Drilling-down of a data cube.

The drilling-up operation can go beyond aggregation and can apply rules or mathematical equations to multiple dimensions of a cube and create a new dimension for the cube. The idea, which is similar to the application of a “function” on a spreadsheet, is often referred to as “rolling-up” a data cube.

Pivoting

Pivoting a data cube allows users to look at the cube from different perspectives. Figure 9 depicts an example of pivoting the data cube from showing the Name vs. Subject front view in the first part of Figure 9 to a Year vs. Subject view in the third part of Figure 9, which shows not just Emily’s 2015 scores but also her scores from 2016 and 2017. The 3D data cube is rotated backward along the Subject dimension from the middle image to the last image in Figure 9.

FIGURE 9 | Pivoting a data cube from one perspective (dimensional view) to another.

Beyond Data Cubes

Data cube applications, such as OLAP, take advantage of pre-aggregated data along dimension levels and provide efficient database querying using languages such as MDX (2016). The more pre-aggregation done on disk, the better the performance for users. However, all operations are conducted at the disk level, which is slow and therefore introduces CPU load and latency issues. As the production cost of computer memory continues to go down while its computational performance goes up, it has become evident that it is more practical to query data in memory instead of pre-aggregating data on disk as OLAP data cubes do.

In-memory Computation

Today, researchers use computer clusters with as much as 1 TB of memory (or more) per computer node for high-dimensional, in-memory database queries in interactive response time. For example, T-Rex (Wong et al., 2015) is able to query billions of data records in interactive response time using a Resource Description Framework (RDF, 2014; https://en.wikipedia.org/wiki/Resource_Description_Framework) database and the SPARQL (2008) query language running on a Linux cluster with 32 nodes of Intel Xeon processors and ∼24.5 TB of memory installed across the 32 nodes. Because such a large amount of information can be queried from a database in interactive time, the role of data warehouses continues to diminish in the big data era and as cloud computing becomes the norm.

The Traditional Data Cubes Concept

Additionally, in-memory database technology allows researchers to develop newer interactive visualization tools to query a higher number of data dimensions interactively, which allows users to look at their data simultaneously from different perspectives. For example, T-Rex’s “data facets” design, as shown in Figure 10A, shows seven data dimensions of a cybersecurity benchmark dataset available in the public domain.

FIGURE 10 | Interactive database queries of a high dimensional dataset.

After the IP address 172.10.0.6 (in the SIP column) in
Figure 10A is selected, the data facets update the other 172.10.1.102 is queried in the DIP column. Figure 10C shows six columns as shown in Figure 10B simultaneously. The the results after two consecutive queries, shown in green in query effort continues in Figure 10B where the IP address the figure. Frontiers in Education | www.frontiersin.org 8 July 2019 | Volume 4 | Article 71 von Davier et al. Data Cubes for Large-Scale Psychometric Data The spreadsheet-like visual layout in Figure 10 performs more “Users define the computation in terms of a map and a reduce effectively than many traditional OLAP data interfaces found function, and the underlying runtime system automatically in business intelligence tools. Most importantly, the data facets parallelizes the computation across large-scale clusters of design allows users to queue data in interactive time without machines, handles machine failures, and schedules inter-machine the need for pre-aggregating data with pre-defined options. This communication to make efficient use of the network and disk” video (Pacific Northwest National Laboratory, 2014) shows how (Dean and Ghemawat, 2008). T-Rex operates using a number of benchmark datasets available Scripts for slicing, dicing, drilling, and pivoting [See Section in the public domain. Online Analytical Processing (OLAP) and Business Intelligence] The general in-memory data cube technology has extensive in a data cube fashion can be written, executed, and shared commercial and public domain support and is here to stay until via notebook-style interfaces such as those implemented by, for the next great technology comes along. example, open source solutions such as Apache Zeppelin and Jupyter. Zeppelin and Jupyter are web based tools that allow users to create, edit, reuse, and run “data cube”-like analytics DATA CUBE AS PART OF A DATA LAKE using a variety of languages (e.g., R, Python, Scala, etc.). Such SOLUTION AND THE LEAP FOR scripts can access data on an underlying data source such as HDFS. Organizing analytical code into “notebooks” means PSYCHOMETRIC DATA combining the descriptive narration of the executed analytical or The proposed data cube concept could be embedded within research methodology along with the code blocks and the results the larger context of collecting/pooling psychometric data in of running them. These scripts are sent to sets of computing something that is known in the industry as a data lake machines (called clusters) that manage the process of executing (Miloslavskaya and Tolstoy, 2016). An example of this is ACT’s the notebook in a scalable fashion. Data cube applications in the data lake solution known as the LEarning Analytics Platform data lake solution typically run as independent sets of processes, (LEAP). ACT’s LEAP is a data lake is a storage solution based coordinated by a main driver program. on an ability to host large amounts of unprocessed, raw data Data Standards for Exchange in the format the sender provides. This includes a range of While data lakes provide flexibility in storage and enable the data representations such as structured, semi-structured, and creation of scaleable data cube analysis, it is also typically a good unstructured. Typically, in a data lake solution, the data structure, idea for those operating in a data ecosystem to select a suitable and the process for formally accessing it, are not defined until the data standard for exchange. This makes it easier for those creating point where access is required. 
the data, transmitting, and receiving the data to avoid the need to A data lake changes the typical process of: extract data, create translations of the data from one system to the next. Data transform it (to a format suitable for querying) and load in to exchange standards allow for the alignment of databases (across tables (ETL) into one favoring extract, load and transform (ELT), various systems), and therefore, facilitate high connectivity of prioritizing the need to capture raw, streaming data prior to the data stored in the date cube. Specifically, the data exchange prescribing any specific transformation of the data. Thus, data standards impose a data schema (names and descriptions of transformation for future use in an analytic procedure is delayed the variables, units, format, etc.) that allow data from multiple until the need for running this procedure arises. We now describe sources to be accessed in a similar way. how the technologies of a data lake help to embed the data cube There are several data standards that exist in the education analysis functionality we described above. space that address the data exchange for different types of data, An architecture for a data lake is typically based on a such as: highly distributed, flexible, scalable storage solution like the Hadoop Distributed File System (HDFS). In a nutshell, an HDFS • Schools Interoperability Framework (SIF) Data instance is similar to a typical distributed file system, although Model Specification it provides higher data throughput and access through the use • SIF is a data sharing, open specification for academic of an implementation of the MapReduce algorithm. MapReduce institutions from kindergarten through workforce. The here refers to the Google algorithm defined in Dean and specification is “composed of two parts: an specification for Ghemawat (2008). ACT’s LEAP implementation of this HDFS modeling educational data which is specific to the educational architecture is based on the industry solution: Hortonworks Data locale, and a system architecture based on both direct and Platform (HDP) which is an easily accessed set of open source assisted models for sharing that data between institutions, technologies. This stores and preserves data in any format given which is international and shared between the locales.” across a set of available servers as data streams (a flow of data) • Ed-Fi Data Standard in stream event processors. These stream event processor uses The Ed-Fi Data Standard was developed in order to address an easy-to-use library for building highly scalable, distributed the needs of standard integration and organization of data in analyses in real time, such as learning events or (serious) game education. This integration and organization of information play events. Using map/reduce task elements, data scientists and https://en.wikipedia.org/wiki/Schools_Interoperability_Framework (Retrieved researchers can efficiently handle large volumes of incoming, raw May 7, 2018). data files. In the MapReduce paradigm: https://www.ed-fi.org/ Frontiers in Education | www.frontiersin.org 9 July 2019 | Volume 4 | Article 71 von Davier et al. Data Cubes for Large-Scale Psychometric Data ranges across a broad set of data sources so it can be analyzed, spent browsing), tutored interaction, synergetic activities filtered, and put to everyday use in various educational (e.g., interactive labs). platforms and systems. 
◦ Item classes may include: test items, quizzes, and tasks, • Common Education Data Standards (CEDS) tutorials, and reading materials. CEDS provides a lens for considering and capturing the • Data that contextualizes this item response analysis data standards’ relations and applied use in products and within a hierarchical expression of learning services. The area of emphasis for CEDS is on data items objectives/standards collection and representations across the pre-kindergarten, typical K- 12 learning, learning beyond high school, as well as jobs and ◦ Item contextualization that addresses multiple technical education, ongoing adult-based education, and into hypotheses of how the conceptualization is structured. workforce areas as well. Multiple hypotheses include accounts for human vs. • IMS Global Question and Test Interoperability Specification machine indexing and alternative conceptualizations in includes many standards. The most popular are the IMS the process for development. Caliper and CASE. • Demographic data that may include gender, Social and ◦ IMS Caliper, which allows us to stream in assessment item Emotional Skills (SES), locale, and cultural background. responses and processes data that indicate dichotomous • Item statistical metadata determined during design outcomes, processes, as well as grade/scoring. and calibration stages (beyond contextualization ◦ IMS Global Competencies and Academic Standards mentioned above). Exchange (CASE), which allows us to import and export The selection of which standards to use to accelerate or machine readable, hierarchical expressions of standards enhance the construction of data cubes (within data lakes) knowledge, skills, abilities and other characteristics for large-scale psychometric data depend on the nature of the (KSAOs). One of the notable examples could be found in educational data for the application. For example, CASE is (Rayon et al., 2014). an emerging standard for injecting knowledge about academic • xAPI – Experience API competencies whereas something like xAPI is used to inject the xAPI is a specification for education technology that direct feed of learner assessment results (potentially aligned to enables collection of data on the wide range of experiences those CASE-based standards) in a standards-based way into a a person has (both online and offline). xAPI records data data cube. in a consistent format about an individual or a group of By committing to these data standards, we can leverage individual learners interacting with multiple technologies. The the unique capability of the data lake (i.e., efficiently ingesting vocabulary of the xAPI is simple by design, and the rigor of the high volumes of raw data relating to item responses and item systems that are able to securely share data streams is high. On metadata) while also prescribing structured commitments to top of regulating data exchange, there exists a body of work incoming data so that we can build robust, reliable processing toward using xAPI for aligning the isomorphic user data from scripts. The data cube concept then acts as a high-powered multiple platforms (rf. Bakharia et al., 2016). An example of toolset that can take this processed data and enable the aligning activity across multiple social networking platforms is online analytical operations such as slicing, dicing, drilling, discussed. Also, concrete code and data snippets are given. and pivoting. 
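As a concrete illustration of the statements that an xAPI data stream carries, the sketch below assembles one minimal statement in Python. The learner, activity, and score values are made-up placeholders following the general actor, verb, and object shape of the specification; a real system would send such statements to a Learning Record Store rather than print them.

```python
# Illustrative xAPI-style statement (placeholder identifiers, no real endpoint).
import json
from datetime import datetime, timezone

statement = {
    "actor": {
        "objectType": "Agent",
        "name": "Learner 001",
        "mbox": "mailto:learner001@example.org",
    },
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/answered",
        "display": {"en-US": "answered"},
    },
    "object": {
        "objectType": "Activity",
        "id": "https://example.org/items/algebra-item-42",
        "definition": {"name": {"en-US": "Algebra item 42"}},
    },
    "result": {"success": True, "score": {"scaled": 0.8}},
    "timestamp": datetime.now(timezone.utc).isoformat(),
}

print(json.dumps(statement, indent=2))  # would be POSTed to a Learning Record Store
```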
Moreover, the availability of the data cube and • OpenSalt alignment of databases will influence the standards that will need We have built and released a tool called OpenSALT which to be available for a smooth integration. It is also possible that is an Open-source Standards ALignment Tool that can be new standards will be developed. used to inspect, ingest, edit, export and build crosswalks of standards expressed using the IMS Global CASE format. As we outlined in the data cube overview, we are interested EXAMPLE OF APPLICATIONS OF THE in fusing several main data perspectives: DATA CUBE CONCEPT • Data containing raw item vector analysis data Alignment of Instruments (e.g., correct/incorrect). One of the key elements of an assessment or learning system • Data containing complex student-item interactions for item is the contextualization of the items and learning activities in classes beyond assessment. terms of descriptive keywords that tie them to the subject. The keywords are often referred to as attributes in the Q-matrices (in ◦ Examples of complex outcomes may include: partial psychometrics—see Tatsuoka, 1985), skills, concepts, or tags (in credit results, media interaction results (play), the learning sciences). We will use “concepts” as an overarching engagement results, and process data (e.g., time term for simplicity. Besides items that psychometrics focuses on, the field of learning sciences has a suite of monikers for elements https://en.wikipedia.org/wiki/Common_Education_Data_Standards that cater to learning. The latter include: readings, tutorials, https://www.imsglobal.org/aboutims.html interactive visualizations, and tutored problems (both single- https://xapi.com/overview/ http://opensalt.opened.com/about loop and stepped). To cover all classes of deliverable learning Frontiers in Education | www.frontiersin.org 10 July 2019 | Volume 4 | Article 71 von Davier et al. Data Cubes for Large-Scale Psychometric Data and assessment items we would use the term “content-based yielded a 51% adjusted accuracy. Since the index could be resources” or “resources” for short. sparse, due to the large size of the concept taxonomy and the The relationships between concepts and resources are lower density of items per concept, and the classic machine often referred to as indexing. The intensive labor required learning definition of accuracy (matched classifications over total to create indexes for a set of items can be leveraged via cases classified) would yield an inflated accuracy result due to machine learning/NLP techniques over a tremendous corpus of overwhelming number of cases where the absence of a concept items/resources. This large scale application was not possible is easily confirmed (we obtained classical accuracies at 99% level before we had present day storage solutions and sophisticated consistently). Adjusted accuracy addresses this phenomenon by NLP algorithms. More specifically, the production of said limiting the denominator to the union of concepts that were indexing is time-consuming, laborious, and requires trained present in the human coder-supplied ground-truth training data, subject matter experts. There are multiple approaches that or in the prediction (the latter came in the form of pairings address lowering the costs of producing indices that contextualize of source and target taxonomy concepts, see Figure 11 for an assessment items and learning resources. These approaches can example). 
Thus, our work so far and the 51% accuracy should come in the form a machine learning procedure that, given be understood as the first step toward automating taxonomy the training data from an exemplary human indexing, would alignment. We learned that it is significantly harder to align perform automated indexing of resources. test items than it is to align the instructional resources, because Data cubes can offer affordances to support the process of the test items do not usually contain the words that describe production and management of concept-content/resource/item the concepts, while the instructional resources do have richer indices. First, even within one subject, such as Math or Science, descriptions. This motivated us to include additional data about there could be alternative taxonomies or ontologies that could be the test items and the test takers, to increase the samples for the used to contextualize resources. See Figures 7, 8 for illustrations. training data, and to refine the models. This is work in progress. Alternatives could come from multiple agencies that develop educational or assessment content or could rely upon an iterative Diagnostic Models process within one team. In addition to the alignment of content which is a relatively new Second, the case when multiple concept taxonomies are application in education, the data cube can support psychometric used to describe multiple non-overlapping pools of items or models that use data from multiple testing administrations resources reserves room for a class of machine learning indexing and multiple testing instruments. For example, one could procedures that could be described as taxonomy alignment develop cognitive diagnostic models (CDMs) that use the data from multiple tests taken by the same individual. CDMs procedures. These procedures are tasked with translating between the languages of multiple taxonomies to achieve a are multivariate latent variable models developed primarily ubiquitous indexing of resources. to identify the mastery of skills measured in a particular Third, all classes of machine learning procedures rely upon domain. The CDMs provide fine-grained inferences about the multiple features within a data cube. The definition and students’ mastery and relevance of these inferences to the student composition of these features is initially developed by subject learning process. matter experts. For example, the text that describes the item or Basically, a CDM in a data cube relates the response vector resource, its content, or its rationale could be parsed into a high- X = X , . . . , X , . . . , X T , where X represents the i i11 ijt iJ ijt dimensional linguistic space. Under these circumstances, a deck response of the ith individual to the jth item from the testing of binary classifiers (one per concept), or a multi-label classifier instrument t, using a lower dimensional discrete latent variable could be devised to produce the indexing. A = (A , . . . , A , . . . , A ) and A is a discrete latent variable for i i1 iK ik ik Also, when we are talking about translation form one concept individual i for latent dimension k as described by the taxonomy taxonomy to another, one could treat existing expert-produced or the Q-matrix. CDMs model the conditional probability of double-coding of a pool of resources, in terms of the two observing X given A , that is, P X |A . The specific form of ( ) i i i i taxonomies being translated, as a training set. 
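For reference, the CDM structure used in this section can be written out as follows; the DINA model is shown as one common concrete form, an illustrative assumption only, since the paper does not commit to a specific CDM.

```latex
% Response vector for person i over items j = 1..J and instruments t,
% and the discrete skill (attribute) vector defined by the taxonomy / Q-matrix:
%   X_i = (X_{i11}, \dots, X_{ijt}, \dots, X_{iJT})^{\top},
%   A_i = (A_{i1}, \dots, A_{ik}, \dots, A_{iK}).
% A CDM specifies the conditional probability P(X_i | A_i).
% One common special case (illustrative assumption) is the DINA model:
\[
  P(X_{ijt} = 1 \mid A_i) = (1 - s_{jt})^{\eta_{ijt}} \, g_{jt}^{\,1 - \eta_{ijt}},
  \qquad
  \eta_{ijt} = \prod_{k=1}^{K} A_{ik}^{\,q_{jtk}},
\]
% where q_{jtk} is the Q-matrix entry linking item j of instrument t to skill k,
% and s_{jt}, g_{jt} are the item's slip and guessing parameters.
```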
A machine the CDM depends on the assumptions we make regarding how learning procedure, then, would be learning the correspondence the elements of A interact to produce the probabilities of relationships. Often, in the form of an n-to-m mapping example, response X . ijt when one item/resource is assigned n concepts from one Traditional data governances in testing organizations cannot taxonomy and m from the other. easily support the application of the CDMs over many testing One of our first attempts with translating two alternative administrations and testing instruments: usually the data from concept taxonomies—between the ACT Subject Taxonomy and each testing instrument is saved in a separate database, that ACT Holistic Framework—has yielded only modest results. We often is not aligned with the data from other instruments. In had only 845 items indexed in both taxonomies and 2,388 addition, in the traditional data governance, the taxonomies (and items that only had ACT Subject Taxonomy indexing. Active the Q-matrices) across testing instruments are not part of the sets of concepts present in the combined set of 3,233 items same framework and are not aligned. included 435 and 455 for the Subject Taxonomy and Holistic Framework respectively. A machine learning procedure based Learning Analytics and Navigation on an ensemble of a deck of multinomial regressions (one Another example of the usefulness of a data cube is to per each of the 455 predicted Holistic Framework concepts) provide learning analytics based on the data available about Frontiers in Education | www.frontiersin.org 11 July 2019 | Volume 4 | Article 71 von Davier et al. Data Cubes for Large-Scale Psychometric Data FIGURE 11 | Examples of question items manually tagged with holistic framework and subject taxonomy. each student. As before, in a data cube, we start with the learning progress by continuously monitoring measurement response vector X = X , . . . , X , . . . , X T , where X data drawn from learner interactions across multiple sources, i i11 ijt iJ ijt represents the response of the ith individual to the jth item including ACT’s portfolio of learning and assessment products. from the testing instrument t. Then, let’s assume that we also Using test scores from ACT’s college readiness exam as a have ancillary data about the student (demographic data, school starting point, Companion identifies the underlying relationships data, attendance data, etc.) collected in the vector (or matrix) between a learner’s measurement data and skill taxonomies or B = (B , . . . , B , . . . , B ) and B represents a specific type across core academic areas identified in ACT’s Holistic i i1 im iM im of ancillary variable (gender, school type, attendance data, etc.). Framework (HF). If available, additional academic assessment Let’s assume that for some students we also have data about their data is drawn from a workforce skills assessment (ACT success in college, collected under C. These data, X, B, and C can WorkKeys), as well as Socio-Emotional Learning (SEL) data now be combined across students to first classify all the students, taken from ACT’s Tessera exam. Bringing these data streams and then later on, to predict the student’s success in the first together, the app predicts skill and knowledge mastery at multiple year of college for each student using only the X and B . Most levels in a taxonomy, such as the HF. 
i i importantly, these analytics can be used as the basis for learning See Figure 12 for an illustration of the architecture for the pathways for different learning goals and different students to Educational Companion App. More details about this prototype support navigation through educational and career journey. are given in von Davier et al. (2019). As explained in section Alignment of Instruments above, Learning, Measurement, and Navigation through aligning instructional resources and taxonomic structures using ML and NLP methods, and in conjunction with Systems continuously monitoring updates to a learner’s assessment data, The ACTNext prototype app, Educational Companion, illustrates Companion uses its knowledge of the learner’s predicted an applied instance of linking learning, assessment, and abilities along with the understanding of hierarchical, navigation data streams using the data governance described parent/child relationships within the content structure to above as the data cube. The app was designed as a mobile produce personalized lists of content and drive their learning solution for flexibly handling the alignment of learner data and activities forward. Over time, as learners continue to engage content (assessment and instructional) with knowledge and skill with the app, Companion refines, updates, and adapts its taxonomies, while also providing learning analytics feedback and recommendations and predictive analytics to best support an personalized resource recommendations based on the mastery individual learner’s needs. The Companion app also incorporates theory of learning to support progress in areas identified navigational tools developed by Mattern et al. (2017) which as needing intervention. Educational Companion evaluates Frontiers in Education | www.frontiersin.org 12 July 2019 | Volume 4 | Article 71 von Davier et al. Data Cubes for Large-Scale Psychometric Data FIGURE 12 | Illustration of the data flow for the ACTNext Educational Companion App. In this figure, the PLKG denotes the personal learning knowledge graph, and the LOR denotes Learning Object Repository. The Elo-based proficiency refers to the estimated proficiency using the Elo ranking algorithm. The knowledge graph is based on the hierarchical relationship of the skills and subskills as described by a taxonomy or standards. A detailed description is available in von Davier et al. (2019). provide learners with insights related to career interests, as well new structure will allow real-time, big data analyses, including as the relationships between their personal data (assessment machine-learning-based alignment of testing instruments, real- results, g.p.a., etc.) and longitudinal data related to areas of time updates of cognitive diagnostic models during the learning study in college and higher education outcome studies. The process, and real-time feedback and routing to appropriate Companion app was piloted with a group of Grades 11 and 12 resources for learners and test takers. The data cube it is high school students in 2017 (unpublished report, Polyak et al., almost like Rubik’s Cube where one is trying to find the 2018). ideal or typical combination of data. There could be clear Following the pilot, components from the Educational purposes for that search, for instance creating recommended Companion App were redeployed as capabilities that could pathways or recognizing typical patterns for students for extend this methodology to other learning and assessment specific goals. systems. 
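The Figure 12 caption refers to Elo-based proficiency estimates. One common way such an update is implemented for learner–item interactions is sketched below; this is a generic illustration, not the specific variant used in the Companion app or the RAD API, which the paper does not spell out.

```python
# Generic Elo-style update for a learner ability and an item difficulty.
import math

def elo_update(theta, beta, correct, k=0.4):
    """Update learner ability (theta) and item difficulty (beta) after one response.

    `correct` is 1 for a correct response, 0 otherwise; the expected probability
    of success uses a logistic link, as in Elo/Rasch-style rating schemes.
    """
    expected = 1.0 / (1.0 + math.exp(-(theta - beta)))
    theta_new = theta + k * (correct - expected)
    beta_new = beta - k * (correct - expected)
    return theta_new, beta_new

# Example: a learner slightly below the item's difficulty answers it correctly.
theta, beta = elo_update(0.0, 0.3, correct=1)
print(round(theta, 3), round(beta, 3))
```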
The ACTNext Recommendation and Diagnostics In many ways, the large testing companies are well-positioned (RAD) API was released and integrated into ACT’s free, to create flexible and well-aligned data cubes as described online test preparation platform ACT Academy, offering previously. Specifically, the testing data is valid (the test the same mastery theory of learning and free agency via scores measure what they are supposed to measure, and these evidence-based diagnostics and personalized recommendations validity indices are known) and data privacy policies have of resources. been followed appropriately when the data was collected, which are two important features that support quality data and the statistical alignment of separate databases. Nevertheless, CONCLUSION this new type of data governance has posed challenges for testing organizations. Part of the problem seems to be that In this paper we discussed and proposed a new way to structure the psychometric community has not embraced yet the data large-scale psychometric data at testing organizations based governance as part of the psychometrician’s duties. The role on the concepts and tools that exist in other fields, such of this paper is to bring these issues to the attention of as marketing and learning analytics. The simplest concept is matching the data across individuals, constructs, and testing psychometricians and underscore the importance of expanding the psychometric tool box to include elements of the data science instruments in a data cube. We outlined and described the data structure for taxonomies, item metadata, and item and governance. More research and work is needed to refine and responses in this matched multidimensional matrix that will allow for rapid and in-depth visualization and analysis. This improve AI-based methodologies, but without flexible Frontiers in Education | www.frontiersin.org 13 July 2019 | Volume 4 | Article 71 von Davier et al. Data Cubes for Large-Scale Psychometric Data data alignment, the AI-based methods are not possible ACKNOWLEDGMENTS at all. The authors thank Andrew Cantine for his help editing the paper. The authors thank Drs. John Whitmer and Maria AUTHOR CONTRIBUTIONS Bolsinova for their feedback on the previous version of the paper. The authors thank to the reviewers for their feedback All authors listed have made a substantial, direct and intellectual and suggestions. contribution to the work, and approved it for publication. REFERENCES Miloslavskaya, N., and Tolstoy, A. (2016). Big data, fast data and data lake concepts. Proc. Comput. Sci. 88, 300–305. doi: 10.1016/j.procs.2016.07.439 Bakharia, A., Kitto, K., Pardo, A., Gaševic´, D., and Dawson, S. (2016). “Recipe Pacific Northwest National Laboratory (2014). T. Rex Visual Analytics for for success: lessons learnt from using xAPI within the connected learning Transactional Exploration [Video File]. Retrieved from: https://www.youtube. analytics toolkit,” in Proceedings of the Sixth International Conference on com/watch?v=GSPkAGREO2E Learning Analytics and Knowledge (ACM), 378–382. doi: 10.1145/2883851.28 Polyak, S., Yudelson, M., Peterschmidt, K., von Davier, A. A., and Woo, A. (2018). 83882 ACTNext Educational Companion Pilot Study Report. Unpublished Manuscript. Camara, W., O’Connor, R., Mattern, K., and Hanson, M.-A. (2015). Beyond Rayon, A., Guenaga, M., and Nunez, A. (2014). 
“Ensuring the integrity Academics: A Holistic Framework for Enhancing Education and Workplace and interoperability of educational usage and social data through Caliper Success. ACT Research Report Series (4), ACT, Inc. framework to support competency-assessment,” in 2014 IEEE Frontiers in Codd, E. F. (1970). A relational model of data for large shared Education Conference (FIE) Proceedings (Madrid: IEEE), 1–9. doi: 10.1109/F. data banks. Commun. ACM 13, 377–387. doi: 10.1145/362384.3 I. E.2014.7044448 62685 RDF (2014). RDF-Semantic Web Standards. Available online at: https://www.w3. Cooper, A. (2014). Learning Analytics Interoperability-the Big Picture in Brief. org/RDF/ Learning Analytics Community Exchange. SPARQL (2008). SPARQL Query Language for RDF. Available online at: www.w3. Dean, J., and Ghemawat, S. (2008). MapReduce: simplified data processing org/TR/rdf-sparql-query/ on large clusters. Commun. ACM 51, 107–113. doi: 10.1145/1327452.13 Tatsuoka, K. (1985). A probabilistic model for diagnosing misconceptions 27492 in the pattern classification approach. J. Educ. Stat. 12, 55–73. Devlin, B. (1996). Data Warehouse: From Architecture to Implementation. Boston, doi: 10.3102/10769986010001055 MA: Addison-Wesley Longman Publishing Co., Inc. von Davier, A. A. (2017). Computational psychometrics in support of collaborative Gilbert, R., Lafferty, R., Hagger-Johnson, G., Harron, K., Zhang, L. C., Smith, P., educational assessments. J. Educ. Meas. 54, 3–11. doi: 10.1111/jedm.12129 et al. (2017). GUILD: guidance for information about linking data sets. J. Public von Davier, A. A., Deonovic, B., Polyak, S. T., and Woo, A. (2019). Computational Health 40, 191–198. doi: 10.1093/pubmed/fdx037 psychometrics approach to holistic learning and assessment systems. Gray, J., Bosworth, A., Layman, A., and Pirahesh, H. (1996). “Data Front. Educ. 4:69. doi: 10.3389/feduc.2019.00069 cube: a relational aggregation operator generalizing group-by, cross- von Davier, M. (2016). High-Performance Psychometrics: The Parallel-e Parallel-m tab, and sub-totals,” in Proceedings of the International Conference on Algorithm for Generalized Latent Variable Models. Princeton, NJ: ETS Research Data Engineering (ICDE) (IEEE Computer Society Press), 152–159. Report. doi: 10.1002/ets2.12120 doi: 10.1109/ICDE.1996.492099 Wong, P. C., Haglin, D. J., Gillen, D., Chavarria-Miranda, D. G., Giovanni, Hao, J., Smith, L., Mislevy, R., von Davier, A. A., and Bauer, M. (2016). C., Joslyn, C., et al. (2015). “A visual analytics paradigm enabling trillion- Taming Log Files From Game/Simulation-Based Assessments: Data Models and edge graph exploration,” in Proceedings IEEE Symposium on Large Data Data Analysis Tools. ETS Research Report Series. Available online at: http:// Analysis and Visualization (LDAV) 2015 (IEEE Computer Society Press), 57–64. onlinelibrary.wiley.com/doi/10.1002/ets2.12096/full doi: 10.1109/LDAV.2015.7348072 Hayes, F. (2002). The Story So Far. Available online at: https://www. computerworld.com/article/2588199/business-intelligence/the-story-so-far. Conflict of Interest Statement: AvD, SP, and MY are employed by ACT Inc. PW html was employed by ACT Inc. at the time this work was conducted. Inmon, W. H. (1992). Building the Data Warehouse. New York, NY: John Wiley & Sons, Inc. Copyright © 2019 von Davier, Wong, Polyak and Yudelson. This is an open-access Mattern, K., Radunzel, J., Ling, J., Liu, R., Allen, J., and Cruce, T. (2017). 
article distributed under the terms of the Creative Commons Attribution License (CC Personalized College Readiness Zone Technical Documentation. Unpublished BY). The use, distribution or reproduction in other forums is permitted, provided ACT Technical Manual. Iowa City, IA: ACT. the original author(s) and the copyright owner(s) are credited and that the original MDX (2016). Multidimensional Expressions (MDX) Reference. Available online publication in this journal is cited, in accordance with accepted academic practice. at: https://docs.microsoft.com/en-us/sql/mdx/multidimensional-expressions- No use, distribution or reproduction is permitted which does not comply with these mdx-reference terms. Frontiers in Education | www.frontiersin.org 14 July 2019 | Volume 4 | Article 71 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Frontiers in Education Unpaywall

The Argument for a “Data Cube” for Large-Scale Psychometric Data

Frontiers in EducationJul 18, 2019

Loading next page...
 
/lp/unpaywall/the-argument-for-a-data-cube-for-large-scale-psychometric-data-GKXieY2iSQ

References

References for this paper are not available at this time. We will be adding them shortly, thank you for your patience.

Publisher
Unpaywall
ISSN
2504-284X
DOI
10.3389/feduc.2019.00071
Publisher site
See Article on Publisher Site

Abstract

a good match of the samples of items and test takers, in terms of the difficulty and content on the items' side, and ability and educational needs on the students' side. In order to facilitate this match at scale, the responses to the test items, the items themselves and their metadata, and demographic data need to be aligned. Traditionally, in testing data, we collected and stored the students' responses and the demographic data, but the items, instructional content, and the standards have often been stored as narrative and often have not been developed, tagged, or stored in a consistent way. There are numerous systems for authoring test content, from paper-based, to Excel spreadsheets, to sophisticated systems. Similarly, the taxonomies or theoretical frameworks by which the content is tagged are also stored in different formats and systems, again from paper to open-source systems, such as OpenSALT. OpenSALT is an open-source Standards ALignment Tool that can be used to inspect, ingest, edit, export, and build crosswalks of standards expressed using the IMS Global Competencies and Academic Standards Exchange (CASE) format; we will refer to data standards and models in more detail later in the paper. Some testing programs have well-designed item banks where the items and their metadata are stored, but often the content metadata is not necessarily attached to a taxonomy.

We propose that we rewrite the taxonomies and standards as data in NLP structures that may take the form of sets, or mathematical vectors, and add these vectors as dimensions to the "data cube." Similarly, we should vectorize the items' metadata and/or item models and align them on different dimensions of the "cube."
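As a rough illustration of this proposal (ours, not an analysis from the original study), the sketch below assumes scikit-learn is available and uses a TF-IDF vectorizer to place a few hypothetical taxonomy statements and item stems in one shared linguistic space, so that both could be attached to the "cube" as numeric dimensions. The statements, items, and the similarity computation are invented for illustration only.

    # A minimal sketch (assuming scikit-learn): represent taxonomy statements and
    # item text as TF-IDF vectors in one shared space, so both can be stored as
    # numeric dimensions of a "data cube."
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # Hypothetical taxonomy statements and item stems (illustrative only).
    taxonomy_statements = [
        "Solve linear equations in one variable",
        "Compute the area and perimeter of polygons",
    ]
    item_stems = [
        "If 3x + 5 = 20, what is the value of x?",
        "A rectangle has sides of length 4 and 7. What is its area?",
    ]

    vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
    # Fit one vocabulary over both collections so the vectors are comparable.
    matrix = vectorizer.fit_transform(taxonomy_statements + item_stems)
    taxonomy_vectors = matrix[: len(taxonomy_statements)]
    item_vectors = matrix[len(taxonomy_statements):]

    # Cosine similarity between items and taxonomy statements: one possible
    # starting point for attaching content to a taxonomy dimension.
    print(cosine_similarity(item_vectors, taxonomy_vectors).round(2))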
Applying this paradigm to educational test data is quite challenging, due to the lack of coherence of traditional content tagging, of a common identity management system for test-takers across testing instruments, of collaboration between psychometricians and data scientists, and, until recently, of proven validity of the newly proposed machine learning methods for measurement. Currently, data for psychometrics is stored and analyzed as a two-dimensional matrix (item by examinee). In the time of big data, the expectation is not only that one has access to large volumes of data, but also that the data can be aligned and analyzed on different dimensions in real time, including various item features like content standards. The best part is that the testing data available from the large testing organizations is valid (the test scores measure what they are supposed to measure, and these validity indices are known) and data privacy policies have been followed appropriately when the data was collected. These are two important features that support quality data and the statistical alignment of separate databases (see Gilbert et al., 2017).

Data Cubes

The idea of relational databases has evolved over time, but the paradigm of the "data cube" is easy to describe. Obviously, the "data cube" is not a cube, given that different data vectors are of different lengths. A (multidimensional) data cube is designed to organize the data by grouping it into different dimensions, indexing the data, and precomputing frequently used queries. Psychometricians and data scientists can interactively navigate their data and visualize the results through slicing, dicing, drilling, rolling, and pivoting, which are various ways to query the data in a data science vocabulary. Because all the data are indexed and precomputed, a data cube query often runs significantly faster than standard queries. Once a data cube is built and precomputed, intuitive data projections on different dimensions can be applied to it through a number of operations. Traditional psychometric models can also be applied at scale and in real time in ways which were not possible before.

Data Lakes

The proposed data cube concept could be embedded within the larger context of psychometric data, such as ACT's data lake. At ACT, we are building the LEarning Analytics Platform (LEAP), for which we proposed an updated version of this data structure: the in-memory database technology that allows for newer interactive visualization tools to query a higher number of data dimensions interactively. A data lake is a storage solution based on an ability to host large amounts of unprocessed, raw data in the format the sender provides. This includes a range of data representations such as structured, semi-structured, and unstructured. Typically, in a data lake solution, the data structure and the process for formally accessing it are not defined until the point where access is required. An architecture for a data lake is typically based on a highly distributed, flexible, scalable storage solution like the Hadoop Distributed File System (HDFS). These types of tools are becoming familiar to testing organizations, as the volume and richness of event data increase. They also facilitate a parallel computational approach for the parameter estimation of complex psychometric models applied to large data sets (see von Davier, 2016).

Data Standards for Exchange

Data standards allow those interoperating in a data ecosystem to access and work with this complex, high-dimensional data (see, for example, Cooper, 2014). There are several data standards that exist in the education space which allow schools, testing, and learning companies to share information and build new knowledge, such as combining the test scores with the GPA, attendance data, and demographics for each student in order to identify meaningful patterns that may lead to differentiated instruction or interventions to help students improve. We will describe several of these standards and emphasize the need for universal adoption of data standards for better collaboration and better learning analytics at scale.

In the rest of the paper, we describe the evolution of data storage and the usefulness of the data cube paradigm for large-scale psychometric data. We then describe the approach we are considering for testing and learning data (including the content). In the last section, we present preliminary results from a real-data example of the alignment of two taxonomies from the taxonomy dimension in the "data cube."

THE FOUNDATIONS OF THE DATA CUBE AND ITS EXTENSIONS
Background and Terminology

In the computer science literature, a data cube is a multi-dimensional data structure, or a data array in a computer programming context. Despite the implicit 3D structural concept derived from the word "cube," a data cube can represent any number of data dimensions such as 1D, 2D, ... nD. In scientific computing studies, such as computational fluid dynamics, data structures similar to a data cube are often referred to as scalars (1D), vectors (2D), or tensors (3D). We will briefly discuss the concept of the relational data model (Codd, 1970) and the corresponding relational databases management system (RDBMS) developed in the 70's, followed by the concept of the data warehouse (Inmon, 1992; Devlin, 1996) developed in the 80's. Together they contributed to the development of the data cube (Gray et al., 1996) concept in the 90's.

Relational Data Model and Relational Databases Management System (RDBMS)

FIGURE 1 | A relational database.

In a relational data model, data are stored in a table with rows and columns that looks similar to a spreadsheet, as shown in Figure 1. The columns are referred to as attributes or fields, the rows are called tuples or records, and the table that comprises a set of columns and rows is the relation in the RDBMS literature.

The technology was developed when CPU speed was slow, memory was expensive, and disk space was limited. Consequently, design goals were influenced by the need to eliminate redundancies (or duplicated information), such as "2015" in the Year column in Figure 1, through the concept of normalization. The data normalization process involves breaking down a large table into smaller ones through a series of normal forms (or procedures). The discussion of the normalization process is important, but beyond the scope of this paper. Readers are referred to Codd (1970) for further details.

Information retrieval from these normalized tables can be done by joining the tables through the use of unique keys identified during the normalization process. The standard RDBMS language for maintaining and querying a relational database is Structured Query Language (SQL). Variants of SQL can still be found in most modern day databases and spreadsheet systems.
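To make the normalization and join ideas concrete, the following sketch (ours, with hypothetical tables and values in the spirit of Figure 1) uses Python's built-in sqlite3 module to store student names and scores in two normalized tables and reassemble them with a SQL join.

    # A small, self-contained illustration (hypothetical tables and values) of a
    # normalized relational schema and a SQL join, using Python's built-in sqlite3.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Normalization: student names live in one table; scores reference them by key,
    # so "Chloe" is not repeated as free text in every score row.
    cur.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("""CREATE TABLE scores (
                       student_id INTEGER REFERENCES students(student_id),
                       subject TEXT, year INTEGER, score REAL)""")

    cur.executemany("INSERT INTO students VALUES (?, ?)",
                    [(1, "Noah"), (2, "Chloe")])
    cur.executemany("INSERT INTO scores VALUES (?, ?, ?, ?)",
                    [(1, "Math", 2015, 3.0), (2, "Math", 2015, 4.0),
                     (2, "Science", 2015, 3.5)])

    # Information retrieval joins the normalized tables back together on the key.
    cur.execute("""SELECT s.name, sc.subject, sc.year, sc.score
                   FROM scores sc JOIN students s ON s.student_id = sc.student_id
                   WHERE sc.year = 2015""")
    for row in cur.fetchall():
        print(row)
    conn.close()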
Data Warehousing

The concept of data warehousing was presented by Devlin and Murphy in 1988, as described by Hayes (2002). A data warehouse is primarily a data repository from one or more disparate sources, such as marketing or sales data. Within an enterprise system, such as those commonly found in many large organizations, it is not uncommon to find multiple systems operating independently, even though they all share the same stored data for market research, data mining, and decision support. The role of data warehousing is to eliminate the duplicated efforts in each decision support system. A data warehouse typically includes some business intelligence tools, tools to extract, transform, and load data into the repository, as well as tools to manage and retrieve the data. Running complex SQL queries on a large data warehouse, however, can be time consuming and too costly to be practical.

Data Cube

Due to the limitations of the data warehousing described above, data scientists developed the data cube. A data cube is designed to organize the data by grouping it into different dimensions, indexing the data, and precomputing frequently used queries. Because all the data are indexed and precomputed, a data cube query often runs significantly faster than a standard SQL query. In business intelligence applications, the data cube concept is often referred to as Online Analytical Processing (OLAP).

Online Analytical Processing (OLAP) and Business Intelligence

FIGURE 2 | A 3D data cube.

The business sector developed OnLine Analytical Processing (OLAP) technology to conduct business intelligence analysis and look for insights. An OLAP data cube is indeed a multidimensional array of data. For example, the data cube in Figure 2 represents the same relational data table shown in Figure 1, with scores from multiple years (i.e., 2015-2017) of the same five students (Noah, Chloe, Ada, Jacob, and Emily) in three academic fields (Science, Math, and Technology). Once again, there is no limitation on the number of dimensions within an OLAP data cube; the 3D cube in Figure 2 is simply for illustrative purposes. Once a data cube is built and precomputed, intuitive data projections (i.e., mappings of a set into a subset) can be applied to it through a number of operations. Describing data as a cube has a lot of advantages when analyzing the data. Users can interactively navigate their data and visualize the results through slicing, dicing, drilling, rolling, and pivoting.

Slicing

FIGURE 3 | (A,B) Slicing along the Year dimension of a data cube.
FIGURE 4 | Slicing along the Name dimension of a data cube.
FIGURE 5 | Slicing along the Subject dimension of a data cube.

Given a data cube, such as the one shown in Figure 2, users can, for example, extract a part of the data by slicing a rectangular portion of it from the cube, as highlighted in blue in Figure 3A. The result is a smaller cube that contains only the 2015 data in Figure 3B. Users can slice a cube along any dimension. For example, Figure 4 shows an example of slicing along the Name dimension, highlighted in blue, and Figure 5 shows an example of slicing along the Subject dimension.

Dicing

FIGURE 6 | Dicing a 3D data cube.

The dicing operation is similar to slicing, except dicing allows users to pick specific values along multiple dimensions. In Figure 6, the dicing operation is applied to both the Name (Chloe, Ada, and Jacob) and Subject (Calculus and Algebra) dimensions. The result is a small 2 × 3 × 3 cube shown in the second part of Figure 6.

Drilling

FIGURE 7 | (A) Four skills of Mathematics. (B) Eight sub-skills of the Number and Quantity skill.
FIGURE 8 | Drilling-down of a data cube.

Drilling-up and -down are standard data navigation approaches for multi-dimensional data mining. Drilling-up often involves an aggregation (such as averaging) of a set of attributes, whereas drilling-down brings back the details of a prior drilling-up process. The drilling operation is particularly useful when dealing with core academic skills that can be best described as a hierarchy. For example, Figure 7A shows four skills of Mathematics (i.e., Number and Quantity; Operations, Algebra, and Functions; Geometry and Measurement; and Statistics and Probability) as defined by the ACT Holistic Framework (Camara et al., 2015). Each of these skill sets can be further divided into finer sub-skills. Figure 7B shows an example of dividing the Number and Quantity skill from Figure 7A into eight sub-skills, from Counting and Cardinality to Vectors and Matrices.

Figure 8 shows a drill-down operation in a data cube that first slices along the Subject dimension with the value "Math." The result is a slice of only the Math scores for all five names from 2015 to 2017 in Figure 8. The drilling-down operation in Figure 8 then shows the single Math score that summarizes the three different Math sub-scores of Calculus, Algebra, and Topology. For example, Emily's 2015 Math score is 2, which is an average of her Calculus (1), Algebra (3), and Topology (2) scores, as depicted in Figure 8. The drilling-up operation can go beyond aggregation and can apply rules or mathematical equations to multiple dimensions of a cube and create a new dimension for the cube. The idea, which is similar to the application of a "function" on a spreadsheet, is often referred to as "rolling-up" a data cube.
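The slicing, dicing, and rolling-up operations described above can be sketched on a small labeled array. The example below is ours (assuming numpy and xarray are available), with hypothetical score values on the same names, subjects, and years used in Figure 2.

    # A sketch of slicing, dicing, and drilling-up on a small, labeled 3D array
    # (names x subjects x years) with hypothetical scores, using xarray.
    import numpy as np
    import xarray as xr

    names = ["Noah", "Chloe", "Ada", "Jacob", "Emily"]
    subjects = ["Science", "Math", "Technology"]
    years = [2015, 2016, 2017]

    scores = xr.DataArray(
        np.random.randint(1, 5, size=(5, 3, 3)),   # hypothetical score values
        coords={"name": names, "subject": subjects, "year": years},
        dims=("name", "subject", "year"),
    )

    # Slicing: keep everything for a single value of one dimension (e.g., 2015).
    slice_2015 = scores.sel(year=2015)

    # Dicing: pick specific values along several dimensions at once.
    dice = scores.sel(name=["Chloe", "Ada", "Jacob"], subject=["Science", "Math"])

    # Drilling-up / rolling-up: aggregate (here, average) over a dimension,
    # e.g., summarizing subject sub-scores into one score per name and year.
    rolled_up = scores.mean(dim="subject")

    print(slice_2015.shape, dice.shape, rolled_up.shape)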
Pivoting

FIGURE 9 | Pivoting a data cube from one perspective (dimensional view) to another.

Pivoting a data cube allows users to look at the cube from different perspectives. Figure 9 depicts an example of pivoting the data cube from showing the Name vs. Subject front view in the first part of Figure 9 to a Year vs. Subject view in the third part of Figure 9, which shows not just Emily's 2015 scores but also scores from 2016 and 2017. The 3D data cube is indeed rotated backward along the Subject dimension from the middle image to the last image in Figure 9.

Beyond Data Cubes

Data cube applications, such as OLAP, take advantage of pre-aggregated data along dimension levels and provide efficient database querying using languages such as MDX (2016). The more pre-aggregations done on the disk, the better the performance for users. However, all operations are conducted at disk level, which involves slow operations and thus CPU load and latency issues. As the production cost of computer memory continues to go down and its computational performance continues to go up, it has become evident that it is more practical to query data in memory instead of pre-aggregating data on the disk as OLAP data cubes do.

In-memory Computation

Today, researchers use computer clusters with as much as 1 TB of memory (or more) per computer node for high dimensional, in-memory database queries in interactive response time. For example, T-Rex (Wong et al., 2015) is able to query billions of data records in interactive response time using a Resource Description Framework (RDF, 2014; https://en.wikipedia.org/wiki/Resource_Description_Framework) database and the SPARQL (2008) query language running on a Linux cluster with 32 nodes of Intel Xeon processors and ∼24.5 TB of memory installed across the 32 nodes. Because such a large amount of information can be queried from a database in interactive time, the role of data warehouses continues to diminish in the big data era and as cloud computing becomes the norm.
The Traditional Data Cubes Concept

FIGURE 10 | Interactive database queries of a high dimensional dataset.

Additionally, in-memory database technology allows researchers to develop newer interactive visualization tools to query a higher number of data dimensions interactively, which allows users to look at their data simultaneously from different perspectives. For example, T-Rex's "data facets" design, as shown in Figure 10A, shows seven data dimensions of a cybersecurity benchmark dataset available in the public domain. After the IP address 172.10.0.6 (in the SIP column) in Figure 10A is selected, the data facets simultaneously update the other six columns, as shown in Figure 10B. The query effort continues in Figure 10B, where the IP address 172.10.1.102 is queried in the DIP column. Figure 10C shows the results after the two consecutive queries, shown in green in the figure.

The spreadsheet-like visual layout in Figure 10 performs more effectively than many traditional OLAP data interfaces found in business intelligence tools. Most importantly, the data facets design allows users to query data in interactive time without the need for pre-aggregating data with pre-defined options. This video (Pacific Northwest National Laboratory, 2014) shows how T-Rex operates using a number of benchmark datasets available in the public domain. The general in-memory data cube technology has extensive commercial and public domain support and is here to stay until the next great technology comes along.
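The "data facets" interaction can be imitated, very roughly, on a single machine. The sketch below is ours (assuming pandas, with a hypothetical event table): selecting a value in one column restricts the values visible in every other column, with no pre-aggregation.

    # A rough, in-memory imitation of the "data facets" idea (hypothetical data):
    # filtering on one column immediately updates what remains in the others.
    import pandas as pd

    events = pd.DataFrame({
        "sip":  ["172.10.0.6", "172.10.0.6", "172.10.0.9"],
        "dip":  ["172.10.1.102", "172.10.1.44", "172.10.1.102"],
        "port": [443, 80, 443],
    })

    def facet(frame, **selections):
        """Filter on the selected column values and report, for every other
        column, which values remain visible."""
        mask = pd.Series(True, index=frame.index)
        for column, value in selections.items():
            mask &= frame[column] == value
        remaining = frame[mask]
        return {col: sorted(remaining[col].unique()) for col in frame.columns}

    # First query: select a source IP; the other facets update.
    print(facet(events, sip="172.10.0.6"))
    # Second, consecutive query: add a destination IP.
    print(facet(events, sip="172.10.0.6", dip="172.10.1.102"))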
DATA CUBE AS PART OF A DATA LAKE SOLUTION AND THE LEAP FOR PSYCHOMETRIC DATA

The proposed data cube concept could be embedded within the larger context of collecting/pooling psychometric data in something that is known in the industry as a data lake (Miloslavskaya and Tolstoy, 2016). An example of this is ACT's data lake solution known as the LEarning Analytics Platform (LEAP). ACT's LEAP is a data lake: a storage solution based on an ability to host large amounts of unprocessed, raw data in the format the sender provides. This includes a range of data representations such as structured, semi-structured, and unstructured. Typically, in a data lake solution, the data structure and the process for formally accessing it are not defined until the point where access is required. A data lake changes the typical process of extract data, transform it (to a format suitable for querying), and load it into tables (ETL) into one favoring extract, load, and transform (ELT), prioritizing the need to capture raw, streaming data prior to prescribing any specific transformation of the data. Thus, data transformation for future use in an analytic procedure is delayed until the need for running this procedure arises. We now describe how the technologies of a data lake help to embed the data cube analysis functionality we described above.

An architecture for a data lake is typically based on a highly distributed, flexible, scalable storage solution like the Hadoop Distributed File System (HDFS). In a nutshell, an HDFS instance is similar to a typical distributed file system, although it provides higher data throughput and access through the use of an implementation of the MapReduce algorithm. MapReduce here refers to the Google algorithm defined in Dean and Ghemawat (2008). ACT's LEAP implementation of this HDFS architecture is based on an industry solution, the Hortonworks Data Platform (HDP), which is an easily accessed set of open source technologies. This stores and preserves data in any format given across a set of available servers as data streams (a flow of data) in stream event processors. These stream event processors use an easy-to-use library for building highly scalable, distributed analyses in real time, such as learning events or (serious) game play events. Using map/reduce task elements, data scientists and researchers can efficiently handle large volumes of incoming, raw data files. In the MapReduce paradigm: "Users define the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disk" (Dean and Ghemawat, 2008).
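The quoted map/reduce pattern can be written out in a few lines of plain Python. The toy below is ours, with made-up learning-event records: the "map" step emits key-value pairs and the "reduce" step combines all values that share a key; real frameworks distribute exactly these two steps across a cluster.

    # A toy, single-machine rendition of the map/reduce idea (made-up events).
    from collections import defaultdict

    raw_events = [
        {"learner": "A12", "item": "alg-01", "correct": 1},
        {"learner": "B07", "item": "alg-01", "correct": 0},
        {"learner": "A12", "item": "geo-02", "correct": 1},
    ]

    def map_phase(event):
        # Emit (item, correctness) pairs from each raw record.
        yield event["item"], event["correct"]

    def reduce_phase(key, values):
        # Combine all emitted values for one key, here into a proportion correct.
        values = list(values)
        return key, sum(values) / len(values)

    grouped = defaultdict(list)
    for event in raw_events:
        for key, value in map_phase(event):
            grouped[key].append(value)

    results = dict(reduce_phase(k, v) for k, v in grouped.items())
    print(results)   # {'alg-01': 0.5, 'geo-02': 1.0}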
Scripts for slicing, dicing, drilling, and pivoting [see Section Online Analytical Processing (OLAP) and Business Intelligence] in a data cube fashion can be written, executed, and shared via notebook-style interfaces such as those implemented by, for example, open source solutions such as Apache Zeppelin and Jupyter. Zeppelin and Jupyter are web-based tools that allow users to create, edit, reuse, and run "data cube"-like analytics using a variety of languages (e.g., R, Python, Scala, etc.). Such scripts can access data on an underlying data source such as HDFS. Organizing analytical code into "notebooks" means combining the descriptive narration of the executed analytical or research methodology along with the code blocks and the results of running them. These scripts are sent to sets of computing machines (called clusters) that manage the process of executing the notebook in a scalable fashion. Data cube applications in the data lake solution typically run as independent sets of processes, coordinated by a main driver program.

Data Standards for Exchange

While data lakes provide flexibility in storage and enable the creation of scalable data cube analysis, it is also typically a good idea for those operating in a data ecosystem to select a suitable data standard for exchange. This makes it easier for those creating the data, transmitting the data, and receiving the data to avoid the need to create translations of the data from one system to the next. Data exchange standards allow for the alignment of databases (across various systems) and, therefore, facilitate high connectivity of the data stored in the data cube. Specifically, the data exchange standards impose a data schema (names and descriptions of the variables, units, format, etc.) that allows data from multiple sources to be accessed in a similar way.

There are several data standards that exist in the education space that address the data exchange for different types of data, such as:

• Schools Interoperability Framework (SIF) Data Model Specification (https://en.wikipedia.org/wiki/Schools_Interoperability_Framework, retrieved May 7, 2018). SIF is a data sharing, open specification for academic institutions from kindergarten through workforce. The specification is "composed of two parts: a specification for modeling educational data which is specific to the educational locale, and a system architecture based on both direct and assisted models for sharing that data between institutions, which is international and shared between the locales."
• Ed-Fi Data Standard (https://www.ed-fi.org/). The Ed-Fi Data Standard was developed in order to address the needs of standard integration and organization of data in education. This integration and organization of information ranges across a broad set of data sources so it can be analyzed, filtered, and put to everyday use in various educational platforms and systems.
• Common Education Data Standards (CEDS; https://en.wikipedia.org/wiki/Common_Education_Data_Standards). CEDS provides a lens for considering and capturing the data standards' relations and applied use in products and services. The area of emphasis for CEDS is on data item collection and representation across pre-kindergarten, typical K-12 learning, learning beyond high school, as well as jobs and technical education, ongoing adult-based education, and into workforce areas as well.
• IMS Global Question and Test Interoperability Specification (https://www.imsglobal.org/aboutims.html), which includes many standards. The most popular are IMS Caliper and CASE.
  ◦ IMS Caliper, which allows us to stream in assessment item responses and process data that indicate dichotomous outcomes, processes, as well as grade/scoring.
  ◦ IMS Global Competencies and Academic Standards Exchange (CASE), which allows us to import and export machine readable, hierarchical expressions of standards knowledge, skills, abilities and other characteristics (KSAOs). One of the notable examples can be found in Rayon et al. (2014).
• xAPI, the Experience API (https://xapi.com/overview/). xAPI is a specification for education technology that enables collection of data on the wide range of experiences a person has (both online and offline). xAPI records data in a consistent format about an individual or a group of individual learners interacting with multiple technologies. The vocabulary of the xAPI is simple by design, and the rigor of the systems that are able to securely share data streams is high. On top of regulating data exchange, there exists a body of work toward using xAPI for aligning isomorphic user data from multiple platforms (cf. Bakharia et al., 2016), where an example of aligning activity across multiple social networking platforms is discussed and concrete code and data snippets are given. An illustrative xAPI-style statement is sketched after this list.
• OpenSalt (http://opensalt.opened.com/about). We have built and released a tool called OpenSALT, which is an Open-source Standards ALignment Tool that can be used to inspect, ingest, edit, export, and build crosswalks of standards expressed using the IMS Global CASE format.
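As flagged in the xAPI entry above, the sketch below shows what an xAPI-style statement could look like, built as a plain Python dictionary following the actor/verb/object/result pattern of the specification. The learner, item identifier, and values are illustrative placeholders, not records from any ACT system.

    # An illustrative xAPI-style statement as a plain dictionary (all identifiers
    # and values below are invented placeholders).
    import json

    statement = {
        "actor": {"name": "Learner A12", "mbox": "mailto:learner.a12@example.org"},
        "verb": {"id": "http://adlnet.gov/expapi/verbs/answered",
                 "display": {"en-US": "answered"}},
        "object": {"id": "https://example.org/items/alg-01",
                   "definition": {"name": {"en-US": "Linear equation item"}}},
        "result": {"success": True, "duration": "PT45S"},
        "timestamp": "2019-05-07T14:02:00Z",
    }

    print(json.dumps(statement, indent=2))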
As we outlined in the data cube overview, we are interested in fusing several main data perspectives:

• Data containing raw item vector analysis data (e.g., correct/incorrect).
• Data containing complex student-item interactions for item classes beyond assessment.
  ◦ Examples of complex outcomes may include: partial credit results, media interaction results (play), engagement results, and process data (e.g., time spent browsing), tutored interaction, and synergetic activities (e.g., interactive labs).
  ◦ Item classes may include: test items, quizzes and tasks, tutorials, and reading materials.
• Data that contextualizes this item response analysis data within a hierarchical expression of learning objectives/standards.
  ◦ Item contextualization that addresses multiple hypotheses of how the conceptualization is structured. Multiple hypotheses include accounts for human vs. machine indexing and alternative conceptualizations in the process for development.
• Demographic data that may include gender, Social and Emotional Skills (SES), locale, and cultural background.
• Item statistical metadata determined during the design and calibration stages (beyond the contextualization mentioned above).

The selection of which standards to use to accelerate or enhance the construction of data cubes (within data lakes) for large-scale psychometric data depends on the nature of the educational data for the application. For example, CASE is an emerging standard for injecting knowledge about academic competencies, whereas something like xAPI is used to inject the direct feed of learner assessment results (potentially aligned to those CASE-based standards) in a standards-based way into a data cube.

By committing to these data standards, we can leverage the unique capability of the data lake (i.e., efficiently ingesting high volumes of raw data relating to item responses and item metadata) while also prescribing structured commitments to incoming data so that we can build robust, reliable processing scripts. The data cube concept then acts as a high-powered toolset that can take this processed data and enable the online analytical operations such as slicing, dicing, drilling, and pivoting. Moreover, the availability of the data cube and the alignment of databases will influence the standards that will need to be available for a smooth integration. It is also possible that new standards will be developed.

EXAMPLE OF APPLICATIONS OF THE DATA CUBE CONCEPT

Alignment of Instruments

One of the key elements of an assessment or learning system is the contextualization of the items and learning activities in terms of descriptive keywords that tie them to the subject. The keywords are often referred to as attributes in the Q-matrices (in psychometrics; see Tatsuoka, 1985), skills, concepts, or tags (in the learning sciences). We will use "concepts" as an overarching term for simplicity. Besides the items that psychometrics focuses on, the field of learning sciences has a suite of monikers for elements that cater to learning. The latter include: readings, tutorials, interactive visualizations, and tutored problems (both single-loop and stepped). To cover all classes of deliverable learning and assessment items, we use the term "content-based resources," or "resources" for short.

The relationships between concepts and resources are often referred to as indexing. The intensive labor required to create indexes for a set of items can be leveraged via machine learning/NLP techniques over a tremendous corpus of items/resources. This large-scale application was not possible before we had present day storage solutions and sophisticated NLP algorithms. More specifically, the production of said indexing is time-consuming, laborious, and requires trained subject matter experts. There are multiple approaches that address lowering the costs of producing indices that contextualize assessment items and learning resources. These approaches can come in the form of a machine learning procedure that, given training data from an exemplary human indexing, would perform automated indexing of resources.

Data cubes can offer affordances to support the process of production and management of concept-content/resource/item indices. First, even within one subject, such as Math or Science, there could be alternative taxonomies or ontologies that could be used to contextualize resources. See Figures 7, 8 for illustrations. Alternatives could come from multiple agencies that develop educational or assessment content or could rely upon an iterative process within one team. Second, the case when multiple concept taxonomies are used to describe multiple non-overlapping pools of items or resources reserves room for a class of machine learning indexing procedures that could be described as taxonomy alignment procedures. These procedures are tasked with translating between the languages of multiple taxonomies to achieve a ubiquitous indexing of resources. Third, all classes of machine learning procedures rely upon multiple features within a data cube. The definition and composition of these features is initially developed by subject matter experts. For example, the text that describes the item or resource, its content, or its rationale could be parsed into a high-dimensional linguistic space. Under these circumstances, a deck of binary classifiers (one per concept), or a multi-label classifier, could be devised to produce the indexing.
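The "deck of binary classifiers" idea can be sketched with off-the-shelf tools. The example below is ours, assuming scikit-learn; the resource texts and concept labels are invented, and a production system would of course use far larger corpora and a richer feature space.

    # A hedged sketch of one-vs-rest concept indexing over TF-IDF features
    # (illustrative texts and concept labels only).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import MultiLabelBinarizer

    resource_texts = [
        "Tutorial on solving linear equations step by step",
        "Reading on area and perimeter of rectangles and triangles",
        "Practice set on systems of linear equations",
        "Interactive lab on measuring polygon areas",
    ]
    resource_concepts = [
        {"algebra.linear_equations"},
        {"geometry.measurement"},
        {"algebra.linear_equations"},
        {"geometry.measurement"},
    ]

    binarizer = MultiLabelBinarizer()
    y = binarizer.fit_transform(resource_concepts)

    # One binary classifier per concept, over a shared TF-IDF representation.
    model = make_pipeline(
        TfidfVectorizer(stop_words="english"),
        OneVsRestClassifier(LogisticRegression(max_iter=1000)),
    )
    model.fit(resource_texts, y)

    new_resource = ["Worked examples for two-step linear equations"]
    predicted = binarizer.inverse_transform(model.predict(new_resource))
    print(predicted)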
Also, when we are talking about translation from one concept taxonomy to another, one could treat existing expert-produced double-coding of a pool of resources, in terms of the two taxonomies being translated, as a training set. A machine learning procedure, then, would be learning the correspondence relationships, often in the form of an n-to-m mapping, for example, when one item/resource is assigned n concepts from one taxonomy and m from the other.

One of our first attempts at translating two alternative concept taxonomies, between the ACT Subject Taxonomy and the ACT Holistic Framework, has yielded only modest results. We had only 845 items indexed in both taxonomies and 2,388 items that only had ACT Subject Taxonomy indexing. Active sets of concepts present in the combined set of 3,233 items included 435 and 455 for the Subject Taxonomy and the Holistic Framework, respectively. A machine learning procedure based on an ensemble of a deck of multinomial regressions (one per each of the 455 predicted Holistic Framework concepts) yielded a 51% adjusted accuracy. Because the index could be sparse, due to the large size of the concept taxonomy and the lower density of items per concept, the classic machine learning definition of accuracy (matched classifications over total cases classified) would yield an inflated accuracy result, owing to the overwhelming number of cases where the absence of a concept is easily confirmed (we obtained classical accuracies at the 99% level consistently). Adjusted accuracy addresses this phenomenon by limiting the denominator to the union of concepts that were present in the human coder-supplied ground-truth training data, or in the prediction (the latter came in the form of pairings of source and target taxonomy concepts; see Figure 11 for an example).

FIGURE 11 | Examples of question items manually tagged with the Holistic Framework and the Subject Taxonomy.

Thus, our work so far and the 51% accuracy should be understood as the first step toward automating taxonomy alignment. We learned that it is significantly harder to align test items than it is to align the instructional resources, because the test items do not usually contain the words that describe the concepts, while the instructional resources do have richer descriptions. This motivated us to include additional data about the test items and the test takers, to increase the samples for the training data, and to refine the models. This is work in progress.
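One reading of the adjusted accuracy described above, for a single item, can be written in a few lines; the concept codes below are hypothetical and the per-item scores would then be averaged over the evaluation set.

    # A small sketch of the adjusted accuracy described above: the denominator is
    # limited to the union of concepts present in the human coding or in the
    # prediction, so the many easy "absent concept" cases do not inflate the score.
    def adjusted_accuracy(human_concepts, predicted_concepts):
        union = human_concepts | predicted_concepts
        if not union:
            return 1.0
        matched = human_concepts & predicted_concepts
        return len(matched) / len(union)

    human = {"HF.N.301", "ST.ALG.12"}            # hypothetical concept codes
    predicted = {"HF.N.301", "ST.GEO.04"}
    print(adjusted_accuracy(human, predicted))    # 1 match out of 3 in the union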
Diagnostic Models

In addition to the alignment of content, which is a relatively new application in education, the data cube can support psychometric models that use data from multiple testing administrations and multiple testing instruments. For example, one could develop cognitive diagnostic models (CDMs) that use the data from multiple tests taken by the same individual. CDMs are multivariate latent variable models developed primarily to identify the mastery of skills measured in a particular domain. The CDMs provide fine-grained inferences about the students' mastery and the relevance of these inferences to the student learning process.

Basically, a CDM in a data cube relates the response vector $X_i = (X_{i11}, \ldots, X_{ijt}, \ldots, X_{iJT})^T$, where $X_{ijt}$ represents the response of the $i$th individual to the $j$th item from testing instrument $t$, to a lower dimensional discrete latent variable $A_i = (A_{i1}, \ldots, A_{ik}, \ldots, A_{iK})$, where $A_{ik}$ is a discrete latent variable for individual $i$ on latent dimension $k$ as described by the taxonomy or the Q-matrix. CDMs model the conditional probability of observing $X_i$ given $A_i$, that is, $P(X_i \mid A_i)$. The specific form of the CDM depends on the assumptions we make regarding how the elements of $A_i$ interact to produce the probabilities of response $X_{ijt}$.

Traditional data governance in testing organizations cannot easily support the application of CDMs over many testing administrations and testing instruments: usually the data from each testing instrument is saved in a separate database that often is not aligned with the data from other instruments. In addition, in the traditional data governance, the taxonomies (and the Q-matrices) across testing instruments are not part of the same framework and are not aligned.
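The paper does not commit to a specific functional form for $P(X_i \mid A_i)$. As one common example (ours, with invented parameters), the DINA model defines the probability of a correct response from the attribute vector and the Q-matrix via slip and guessing parameters.

    # One concrete CDM, the DINA model, as a sketch with invented parameters:
    # eta_ij = 1 if examinee i has mastered every attribute item j requires
    # (per the Q-matrix); P(X_ij = 1) is then (1 - slip_j) or guess_j.
    import numpy as np

    Q = np.array([[1, 0],      # item 1 requires attribute 1 only
                  [1, 1]])     # item 2 requires attributes 1 and 2
    alpha = np.array([1, 0])   # hypothetical mastery profile of one examinee

    slip = np.array([0.1, 0.2])
    guess = np.array([0.2, 0.25])

    # eta: does the examinee master all attributes required by each item?
    eta = np.all(alpha >= Q, axis=1).astype(float)
    p_correct = (1 - slip) ** eta * guess ** (1 - eta)
    print(p_correct)   # item 1: 0.9 (mastered), item 2: 0.25 (not mastered)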
Learning Analytics and Navigation

Another example of the usefulness of a data cube is to provide learning analytics based on the data available about each student. As before, in a data cube, we start with the response vector $X_i = (X_{i11}, \ldots, X_{ijt}, \ldots, X_{iJT})^T$, where $X_{ijt}$ represents the response of the $i$th individual to the $j$th item from testing instrument $t$. Then, let's assume that we also have ancillary data about the student (demographic data, school data, attendance data, etc.) collected in the vector (or matrix) $B_i = (B_{i1}, \ldots, B_{im}, \ldots, B_{iM})$, where $B_{im}$ represents a specific type of ancillary variable (gender, school type, attendance data, etc.). Let's assume that for some students we also have data about their success in college, collected under $C$. These data, $X$, $B$, and $C$, can now be combined across students to first classify all the students, and then later on, to predict the success in the first year of college for each student using only the $X_i$ and $B_i$. Most importantly, these analytics can be used as the basis for learning pathways for different learning goals and different students, to support navigation through the educational and career journey.
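A hedged sketch of this two-step idea (ours, with simulated data and scikit-learn) is shown below: the responses $X$ and ancillary data $B$ form one feature matrix, all students are first clustered, and the labeled subset with college outcomes $C$ is used to fit a model that scores the remaining students.

    # Combining X and B to classify students and predict C (simulated data).
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n_students = 200
    X = rng.integers(0, 2, size=(n_students, 30))      # item responses
    B = rng.normal(size=(n_students, 5))               # ancillary variables
    features = np.hstack([X, B])

    # Unsupervised step: classify (cluster) all students on X and B.
    clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(features)

    # Supervised step: for the subset with known first-year success C, fit a
    # model and predict success probabilities for everyone from X and B only.
    has_outcome = np.arange(n_students) < 120          # hypothetical labeled subset
    C = rng.integers(0, 2, size=has_outcome.sum())
    model = LogisticRegression(max_iter=1000).fit(features[has_outcome], C)
    predicted_success = model.predict_proba(features)[:, 1]
    print(clusters[:5], predicted_success[:5].round(2))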
Learning, Measurement, and Navigation through Systems

The ACTNext prototype app, Educational Companion, illustrates an applied instance of linking learning, assessment, and navigation data streams using the data governance described above as the data cube. The app was designed as a mobile solution for flexibly handling the alignment of learner data and content (assessment and instructional) with knowledge and skill taxonomies, while also providing learning analytics feedback and personalized resource recommendations based on the mastery theory of learning to support progress in areas identified as needing intervention. Educational Companion evaluates learning progress by continuously monitoring measurement data drawn from learner interactions across multiple sources, including ACT's portfolio of learning and assessment products. Using test scores from ACT's college readiness exam as a starting point, Companion identifies the underlying relationships between a learner's measurement data and skill taxonomies across core academic areas identified in ACT's Holistic Framework (HF). If available, additional academic assessment data is drawn from a workforce skills assessment (ACT WorkKeys), as well as Socio-Emotional Learning (SEL) data taken from ACT's Tessera exam. Bringing these data streams together, the app predicts skill and knowledge mastery at multiple levels in a taxonomy, such as the HF. See Figure 12 for an illustration of the architecture for the Educational Companion App. More details about this prototype are given in von Davier et al. (2019).

FIGURE 12 | Illustration of the data flow for the ACTNext Educational Companion App. In this figure, the PLKG denotes the personal learning knowledge graph, and the LOR denotes the Learning Object Repository. The Elo-based proficiency refers to the estimated proficiency using the Elo ranking algorithm. The knowledge graph is based on the hierarchical relationship of the skills and subskills as described by a taxonomy or standards. A detailed description is available in von Davier et al. (2019).

As explained in the section Alignment of Instruments above, by aligning instructional resources and taxonomic structures using ML and NLP methods, and in conjunction with continuously monitoring updates to a learner's assessment data, Companion uses its knowledge of the learner's predicted abilities, along with the understanding of hierarchical, parent/child relationships within the content structure, to produce personalized lists of content and drive their learning activities forward. Over time, as learners continue to engage with the app, Companion refines, updates, and adapts its recommendations and predictive analytics to best support an individual learner's needs. The Companion app also incorporates navigational tools developed by Mattern et al. (2017), which provide learners with insights related to career interests, as well as the relationships between their personal data (assessment results, GPA, etc.) and longitudinal data related to areas of study in college and higher education outcome studies. The Companion app was piloted with a group of Grades 11 and 12 high school students in 2017 (unpublished report; Polyak et al., 2018).

Following the pilot, components from the Educational Companion App were redeployed as capabilities that could extend this methodology to other learning and assessment systems. The ACTNext Recommendation and Diagnostics (RAD) API was released and integrated into ACT's free, online test preparation platform, ACT Academy, offering the same mastery theory of learning and free agency via evidence-based diagnostics and personalized recommendations of resources.

CONCLUSION

In this paper we discussed and proposed a new way to structure large-scale psychometric data at testing organizations, based on concepts and tools that exist in other fields, such as marketing and learning analytics. The simplest concept is matching the data across individuals, constructs, and testing instruments in a data cube. We outlined and described the data structure for taxonomies, item metadata, and item responses in this matched multidimensional matrix that will allow for rapid and in-depth visualization and analysis. This new structure will allow real-time, big data analyses, including machine-learning-based alignment of testing instruments, real-time updates of cognitive diagnostic models during the learning process, and real-time feedback and routing to appropriate resources for learners and test takers. The data cube is almost like a Rubik's Cube, where one is trying to find the ideal or typical combination of data. There could be clear purposes for that search, for instance creating recommended pathways or recognizing typical patterns for students with specific goals.

In many ways, the large testing companies are well-positioned to create flexible and well-aligned data cubes as described previously. Specifically, the testing data is valid (the test scores measure what they are supposed to measure, and these validity indices are known) and data privacy policies have been followed appropriately when the data was collected, which are two important features that support quality data and the statistical alignment of separate databases. Nevertheless, this new type of data governance has posed challenges for testing organizations. Part of the problem seems to be that the psychometric community has not yet embraced data governance as part of the psychometrician's duties. The role of this paper is to bring these issues to the attention of psychometricians and underscore the importance of expanding the psychometric tool box to include elements of data science and governance. More research and work is needed to refine and improve AI-based methodologies, but without flexible data alignment, the AI-based methods are not possible at all.

AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

ACKNOWLEDGMENTS

The authors thank Andrew Cantine for his help editing the paper. The authors thank Drs. John Whitmer and Maria Bolsinova for their feedback on the previous version of the paper. The authors thank the reviewers for their feedback and suggestions.

REFERENCES
Bakharia, A., Kitto, K., Pardo, A., Gašević, D., and Dawson, S. (2016). "Recipe for success: lessons learnt from using xAPI within the connected learning analytics toolkit," in Proceedings of the Sixth International Conference on Learning Analytics and Knowledge (ACM), 378–382. doi: 10.1145/2883851.2883882

Camara, W., O'Connor, R., Mattern, K., and Hanson, M.-A. (2015). Beyond Academics: A Holistic Framework for Enhancing Education and Workplace Success. ACT Research Report Series (4). ACT, Inc.

Codd, E. F. (1970). A relational model of data for large shared data banks. Commun. ACM 13, 377–387. doi: 10.1145/362384.362685

Cooper, A. (2014). Learning Analytics Interoperability: The Big Picture in Brief. Learning Analytics Community Exchange.

Dean, J., and Ghemawat, S. (2008). MapReduce: simplified data processing on large clusters. Commun. ACM 51, 107–113. doi: 10.1145/1327452.1327492

Devlin, B. (1996). Data Warehouse: From Architecture to Implementation. Boston, MA: Addison-Wesley Longman Publishing Co., Inc.

Gilbert, R., Lafferty, R., Hagger-Johnson, G., Harron, K., Zhang, L. C., Smith, P., et al. (2017). GUILD: guidance for information about linking data sets. J. Public Health 40, 191–198. doi: 10.1093/pubmed/fdx037

Gray, J., Bosworth, A., Layman, A., and Pirahesh, H. (1996). "Data cube: a relational aggregation operator generalizing group-by, cross-tab, and sub-totals," in Proceedings of the International Conference on Data Engineering (ICDE) (IEEE Computer Society Press), 152–159. doi: 10.1109/ICDE.1996.492099

Hao, J., Smith, L., Mislevy, R., von Davier, A. A., and Bauer, M. (2016). Taming Log Files From Game/Simulation-Based Assessments: Data Models and Data Analysis Tools. ETS Research Report Series. Available online at: http://onlinelibrary.wiley.com/doi/10.1002/ets2.12096/full

Hayes, F. (2002). The Story So Far. Available online at: https://www.computerworld.com/article/2588199/business-intelligence/the-story-so-far.html

Inmon, W. H. (1992). Building the Data Warehouse. New York, NY: John Wiley & Sons, Inc.

Mattern, K., Radunzel, J., Ling, J., Liu, R., Allen, J., and Cruce, T. (2017). Personalized College Readiness Zone Technical Documentation. Unpublished ACT Technical Manual. Iowa City, IA: ACT.

MDX (2016). Multidimensional Expressions (MDX) Reference. Available online at: https://docs.microsoft.com/en-us/sql/mdx/multidimensional-expressions-mdx-reference

Miloslavskaya, N., and Tolstoy, A. (2016). Big data, fast data and data lake concepts. Proc. Comput. Sci. 88, 300–305. doi: 10.1016/j.procs.2016.07.439

Pacific Northwest National Laboratory (2014). T-Rex Visual Analytics for Transactional Exploration [Video File]. Retrieved from: https://www.youtube.com/watch?v=GSPkAGREO2E

Polyak, S., Yudelson, M., Peterschmidt, K., von Davier, A. A., and Woo, A. (2018). ACTNext Educational Companion Pilot Study Report. Unpublished manuscript.

Rayon, A., Guenaga, M., and Nunez, A. (2014). "Ensuring the integrity and interoperability of educational usage and social data through Caliper framework to support competency-assessment," in 2014 IEEE Frontiers in Education Conference (FIE) Proceedings (Madrid: IEEE), 1–9. doi: 10.1109/FIE.2014.7044448

RDF (2014). RDF - Semantic Web Standards. Available online at: https://www.w3.org/RDF/

SPARQL (2008). SPARQL Query Language for RDF. Available online at: www.w3.org/TR/rdf-sparql-query/

Tatsuoka, K. (1985). A probabilistic model for diagnosing misconceptions in the pattern classification approach. J. Educ. Stat. 12, 55–73. doi: 10.3102/10769986010001055

von Davier, A. A. (2017). Computational psychometrics in support of collaborative educational assessments. J. Educ. Meas. 54, 3–11. doi: 10.1111/jedm.12129

von Davier, A. A., Deonovic, B., Polyak, S. T., and Woo, A. (2019). Computational psychometrics approach to holistic learning and assessment systems. Front. Educ. 4:69. doi: 10.3389/feduc.2019.00069

von Davier, M. (2016). High-Performance Psychometrics: The Parallel-E Parallel-M Algorithm for Generalized Latent Variable Models. Princeton, NJ: ETS Research Report. doi: 10.1002/ets2.12120

Wong, P. C., Haglin, D. J., Gillen, D., Chavarria-Miranda, D. G., Giovanni, C., Joslyn, C., et al. (2015). "A visual analytics paradigm enabling trillion-edge graph exploration," in Proceedings of the IEEE Symposium on Large Data Analysis and Visualization (LDAV) 2015 (IEEE Computer Society Press), 57–64. doi: 10.1109/LDAV.2015.7348072
article distributed under the terms of the Creative Commons Attribution License (CC Personalized College Readiness Zone Technical Documentation. Unpublished BY). The use, distribution or reproduction in other forums is permitted, provided ACT Technical Manual. Iowa City, IA: ACT. the original author(s) and the copyright owner(s) are credited and that the original MDX (2016). Multidimensional Expressions (MDX) Reference. Available online publication in this journal is cited, in accordance with accepted academic practice. at: https://docs.microsoft.com/en-us/sql/mdx/multidimensional-expressions- No use, distribution or reproduction is permitted which does not comply with these mdx-reference terms. Frontiers in Education | www.frontiersin.org 14 July 2019 | Volume 4 | Article 71
