Strengthening Data Literacy across the Curriculum (SDLC)

This project is developing and studying high school curriculum modules that integrate social justice topics with statistical data investigations to promote skills and interest in data science among groups underrepresented in STEM.

Full Description: 

The Strengthening Data Literacy across the Curriculum (SDLC) project is an exploratory/early-stage design and development effort that aims to promote understanding of core statistical concepts and interest in quantitative data analysis among high school students from groups underrepresented in STEM. Led by a collaboration of researchers and developers at Education Development Center (EDC), statistics educators at California Polytechnic State University (Cal Poly), and technology developers at The Concord Consortium, the project is creating and studying a set of curriculum modules for high school students taking mathematics or statistics classes below the advanced-placement (AP) level. Iteratively developed and tested in collaboration with high school statistics and social studies teachers, the modules consist of applied data investigations, structured around a four-step data investigation cycle, that engage students in exploring authentic social science issues using large-scale data sets from the U.S. Census Bureau. The project hypothesizes that students who engage in guided investigations using data visualization tools to explore statistical concepts may develop deeper understandings of these concepts as well as of the data investigation process. Similarly, high school students, particularly those from historically marginalized groups who are underrepresented in STEM fields, may develop greater interest in statistics when they can use data to examine patterns of social and economic inequality and questions related to social justice.

One module, Investigating Income Inequality in the U.S., focuses on describing, comparing, and making sense of quantitative variables. Students deepen their understanding of this content by investigating questions such as: How have incomes for higher- and lower-income individuals in the U.S. changed over time? How much income inequality exists between males and females in the U.S.? Does education explain the wage gap between males and females? Another module, Investigating Immigration to the U.S., focuses on describing, comparing, and making sense of categorical variables. Students investigate questions such as: Are there more immigrants in the U.S. today than in previous years? Where have immigrants to the U.S. come from, now and in the past? Are immigrants as likely as those born in the U.S. to be participating in the labor force, after adjusting for education? Students conduct these analyses using the Common Online Data Analysis Platform (CODAP), an open-source set of tools that emphasizes data visualization and conceptual understanding of statistical ideas over calculation. Lessons encourage collaborative inquiry and provide students with experiences in multivariable analysis, an important domain that is underemphasized in current high school mathematics and statistics curricula but critical for analyzing data in a big-data world.

The project is using a mixed-methods approach to study three primary research questions: 1) What is the feasibility of implementing SDLC modules, and what supports may teachers and students need to use the modules? 2) In what ways may different features and components of the SDLC modules help to promote positive student learning and interest outcomes? 3) To what extent do students show greater interest in statistics and data analysis, as well as improved understandings of target statistical concepts, after module use? To investigate these questions, the project has worked with 12 mathematics and 6 social studies teachers in diverse public high schools in Massachusetts and California to conduct iterative research with over 600 students. Through this work, the project aims to build knowledge of curriculum-based approaches that prepare more diverse populations for, and attract them to, data science fields.