Science

Fostering Mathematics Engagement Through Citizen Science

Teach mathematics and science using materials for the weather-focused Community Collaborative Rain, Hail, & Snow Network project.

Author/Presenter

Danielle R. Scharen

Erin McInerney

Lindsey H. Sachs

Meredith L. Hayes

P. Sean Smith

Lead Organization(s)
Year
2025
Short Description

Teach mathematics and science using materials for the weather-focused Community Collaborative Rain, Hail, & Snow Network project.

Citizen Science in the Elementary Classroom: Going Beyond Data Collection

This article describes how citizen science (CS) projects can be integrated into elementary classrooms to enhance students’ sensemaking skills and connect them to real-world science problems. For the last several years, we have been involved in a study, Teacher Learning for Effective School-Based Citizen Science (TL4CS), that developed materials for elementary school teachers to engage their students in data collection, analysis, and interpretation for two existing CS projects: Community Collaborative Rain, Hail, and Snow Network (CoCoRaHS) and the Lost Ladybug Project (LLP).
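The data analysis mentioned above lends itself to simple classroom arithmetic. The sketch below is illustrative only, not drawn from the TL4CS materials; the rain gauge readings are invented. It shows the kind of summary students might compute from a week of CoCoRaHS-style observations.

# A minimal sketch (not from the TL4CS materials) of the arithmetic students
# might do with a week of CoCoRaHS-style rain gauge readings.
# The observations below are invented for illustration.
daily_rainfall_inches = {
    "Mon": 0.12, "Tue": 0.00, "Wed": 0.45,
    "Thu": 0.30, "Fri": 0.00, "Sat": 0.08, "Sun": 0.22,
}

total = sum(daily_rainfall_inches.values())           # weekly total
average = total / len(daily_rainfall_inches)          # mean daily rainfall
wettest_day = max(daily_rainfall_inches, key=daily_rainfall_inches.get)

print(f"Weekly total: {total:.2f} in")
print(f"Average per day: {average:.2f} in")
print(f"Wettest day: {wettest_day} ({daily_rainfall_inches[wettest_day]:.2f} in)")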

Author/Presenter

Jill K. McGowan

Lindsey Sachs

Anna Bruce

Danielle R. Scharen

Meredith Hayes

P. Sean Smith

Lead Organization(s)
Year
2025
Short Description

This article describes how citizen science (CS) projects can be integrated into elementary classrooms to enhance students’ sensemaking skills and connect them to real-world science problems.

Unveiling Scoring Processes: Dissecting the Differences Between LLMs and Human Graders in Automatic Scoring

Large language models (LLMs) have demonstrated strong potential in performing automatic scoring for constructed response assessments. While human graders typically score constructed responses against given grading rubrics, the methods by which LLMs assign scores remain largely unclear. It is also uncertain how closely AI’s scoring process mirrors that of humans or whether it adheres to the same grading criteria. To address this gap, this paper uncovers the grading rubrics that LLMs use to score students’ written responses to science tasks and how well those rubrics align with human scores.

Author/Presenter

Xuansheng Wu

Padmaja Pravin Saraf

Gyeonggeon Lee

Ehsan Latif

Ninghao Liu

Xiaoming Zhai

Lead Organization(s)
Year
2025
Short Description

Large language models (LLMs) have demonstrated strong potential in performing automatic scoring for constructed response assessments. While human graders typically score constructed responses against given grading rubrics, the methods by which LLMs assign scores remain largely unclear. It is also uncertain how closely AI’s scoring process mirrors that of humans or whether it adheres to the same grading criteria. To address this gap, this paper uncovers the grading rubrics that LLMs use to score students’ written responses to science tasks and how well those rubrics align with human scores. We also examine whether enhancing this alignment can improve scoring accuracy.
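The scoring setup described here can be pictured with a short sketch: prompt an LLM with a grading rubric, collect its scores, and compare them with human scores using an agreement statistic. This is not the authors' pipeline; the rubric, model name, and student responses below are placeholders.

"""A minimal sketch (not the authors' pipeline) of rubric-based LLM scoring
and of measuring agreement with human scores via quadratic weighted kappa.
The rubric, model name, and responses are placeholders."""
from openai import OpenAI                      # assumes openai>=1.0 is installed
from sklearn.metrics import cohen_kappa_score

client = OpenAI()  # reads OPENAI_API_KEY from the environment

RUBRIC = (
    "Score 0-2. 2: response states the claim and supports it with evidence; "
    "1: claim only; 0: neither."
)

def llm_score(response_text: str, model: str = "gpt-4o-mini") -> int:
    """Ask the model to apply the rubric and return a single integer score."""
    reply = client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[
            {"role": "system", "content": f"Grade with this rubric:\n{RUBRIC}\n"
                                          "Reply with the score only."},
            {"role": "user", "content": response_text},
        ],
    )
    return int(reply.choices[0].message.content.strip())

# Placeholder student responses and human scores, for illustration only.
responses = ["The ice melts because heat flows from the warm water.", "It melts."]
human_scores = [2, 1]
llm_scores = [llm_score(r) for r in responses]

# Quadratic weighted kappa is a common agreement statistic for ordinal scores.
print(cohen_kappa_score(human_scores, llm_scores, weights="quadratic"))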
