Assessment

NLP-Enabled Automated Assessment of Scientific Explanations: Towards Eliminating Linguistic Discrimination

As use of artificial intelligence (AI) has increased, concerns about AI bias and discrimination have been growing. This paper discusses an application called PyrEval in which natural language processing (NLP) was used to automate assessment and provide feedback on middle school science writing without linguistic discrimination. Linguistic discrimination in this study was operationalized as unfair assessment of scientific essays based on writing features that are not considered normative, such as subject-verb disagreement.

Author/Presenter

ChanMin Kim

Rebecca J. Passonneau

Eunseo Lee

Mahsa Sheikhi Karizaki

Dana Gnesdilow

Sadhana Puntambekar

Year
2025
Short Description

As use of artificial intelligence (AI) has increased, concerns about AI bias and discrimination have been growing. This paper discusses an application called PyrEval in which natural language processing (NLP) was used to automate assessment and provide feedback on middle school science writing without linguistic discrimination.

A Usability Analysis and Consequences of Testing Exploration of the Problem-Solving Measures–Computer-Adaptive Test

Testing is a part of education around the world; however, there are concerns that consequences of testing are underexplored within current educational scholarship. Moreover, usability studies are rare within education. One aim of the present study was to explore the usability of a mathematics problem-solving test called the Problem Solving Measures–Computer-Adaptive Test (PSM-CAT) designed for students in grades six to eight (ages 11–14).

Author/Presenter

Sophie Grace King

Jonathan David Bostic

Toni A. May

Gregory E. Stone

Year
2025
Short Description

Testing is a part of education around the world; however, there are concerns that consequences of testing are underexplored within current educational scholarship. Moreover, usability studies are rare within education. One aim of the present study was to explore the usability of a mathematics problem-solving test called the Problem Solving Measures–Computer-Adaptive Test (PSM-CAT) designed for students in grades six to eight (ages 11–14). The second aim of this mixed-methods research was to unpack consequences of testing validity evidence related to the results and test interpretations, leveraging the voices of participants.

Expanding Uses of the STEM Observation Protocol (STEM-OP): Secondary Science Teachers’ Reflections on Integrated STEM Practice

There are few guidelines related to how to implement integrated STEM education in the K-12 science classroom. It is important that teachers have opportunities to reflect on integrated STEM instruction when implemented so that they may further develop their practice. This research aimed to understand how the STEM Observation Protocol (STEM-OP) may be used as a way for teachers to reflect on their integrated STEM practice.

Author/Presenter

Emily Dare

Joshua Ellis

Christopher Irwin

Year
2025
Short Description

There are few guidelines related to how to implement integrated STEM education in the K-12 science classroom. It is important that teachers have opportunities to reflect on integrated STEM instruction when implemented so that they may further develop their practice. This research aimed to understand how the STEM Observation Protocol (STEM-OP) may be used as a way for teachers to reflect on their integrated STEM practice. This exploratory case study was designed to better understand secondary science teachers’ reflections on the STEM-OP by addressing the following research questions: 1) What are secondary science teachers’ reflections on integrated STEM practices as measured by the STEM-OP? and 2) In what ways do secondary science teachers envision using the STEM-OP as a tool in their practice?

Employing Automatic Analysis Tools Aligned to Learning Progressions to Assess Knowledge Application and Support Learning in STEM

We discuss transforming STEM education using three aspects: learning progressions (LPs), constructed response performance assessments, and artificial intelligence (AI). Using LPs to inform instruction, curriculum, and assessment design helps foster students’ ability to apply content and practices to explain phenomena, which reflects deeper science understanding. To measure the progress along these LPs, performance assessments combining elements of disciplinary ideas, crosscutting concepts and practices are needed.

Author/Presenter

Leonora Kaldaras

Kevin Haudek

Joseph Krajcik

Year
2024
Short Description

We discuss transforming STEM education using three aspects: learning progressions (LPs), constructed response performance assessments, and artificial intelligence (AI). Using LPs to inform instruction, curriculum, and assessment design helps foster students’ ability to apply content and practices to explain phenomena, which reflects deeper science understanding. To measure the progress along these LPs, performance assessments combining elements of disciplinary ideas, crosscutting concepts and practices are needed. However, these tasks are time-consuming and expensive to score and provide feedback for. AI makes it possible to validate the LPs and evaluate performance assessments for many students quickly and efficiently.

Combining Natural Language Processing with Epistemic Network Analysis to Investigate Student Knowledge Integration within an AI Dialog

In this study, we used Epistemic Network Analysis (ENA) to represent data generated by Natural Language Processing (NLP) analytics during an activity based on the Knowledge Integration (KI) framework. The activity features a web-based adaptive dialog about energy transfer in photosynthesis and cellular respiration. Students write an initial explanation, respond to two adaptive prompts in the dialog, and write a revised explanation. The NLP models score the KI level of the initial and revised explanations. They also detect the ideas in the explanations and the dialog responses.

Author/Presenter

Weiying Li

Hsin-Yi Chang

Allison Bradford

Libby Gerard

Marcia C. Linn

Year
2024
Short Description

In this study, we used Epistemic Network Analysis (ENA) to represent data generated by Natural Language Processing (NLP) analytics during an activity based on the Knowledge Integration (KI) framework. The activity features a web-based adaptive dialog about energy transfer in photosynthesis and cellular respiration.

Unpacking the Nuances: An Exploratory Multilevel Analysis on the Operationalization of Integrated STEM Education and Student Attitudinal Change

Integrated STEM education (iSTEM) is recognized for its potential to improve students’ scientific and mathematical knowledge, as well as to nurture positive attitudes toward STEM, which are essential for motivating students to consider STEM-related careers. While prior studies have examined the relationship between specific iSTEM activities or curricula and changes in student attitudes, research is lacking on how the aspects of iSTEM are operationalized and their influence on shifts in student attitudes towards STEM, especially when considering the role of demographic factors.

Author/Presenter

Benny Mart R. Hiwatig

Gillian H. Roehrig

Mark D. Rouleau

Year
2024
Short Description

Integrated STEM education (iSTEM) is recognized for its potential to improve students’ scientific and mathematical knowledge, as well as to nurture positive attitudes toward STEM, which are essential for motivating students to consider STEM-related careers. While prior studies have examined the relationship between specific iSTEM activities or curricula and changes in student attitudes, research is lacking on how the aspects of iSTEM are operationalized and their influence on shifts in student attitudes towards STEM, especially when considering the role of demographic factors. Addressing this gap, our study applied multilevel modeling to analyze how different iSTEM aspects and demographic variables predict changes in student attitudes.