The world is being quietly rearranged by people who write very long documents.


The title they went with:
Designing Reliable LLM-Assisted Rubric Scoring for Constructed Responses: Evidence from Physics Exams

Noisy translates that to:

AI can grade physics exams, but only if the rubric is a checklist


AI can reliably score handwritten physics exams, but only if the grading rules are very specific. This means that for AI to be useful in grading, educators need to design detailed, checklist-style rubrics.
For years, grading complex student responses has been slow and inconsistent, especially when partial credit is involved. This paper shows that AI can help, but it also reveals a critical bottleneck: the quality of the rubric. If the rubric is too vague, the AI struggles, so the human effort shifts from grading to writing extremely precise instructions for the AI.
Watch for universities and testing organizations to start publishing new, highly granular rubric designs for AI-assisted grading, or for AI tools to offer built-in rubric design features.
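
For a sense of what "checklist-style" could mean in practice, here is a minimal sketch in Python. It is not taken from the paper: the rubric items, point values, and the names ChecklistItem, RUBRIC, and score are all made up for illustration. The idea is that each item is a narrow yes/no question, so the grader (human or AI) only decides whether each box is ticked, and the total follows mechanically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistItem:
    """One narrowly worded, yes/no rubric item (hypothetical structure)."""
    description: str
    points: float


# Hypothetical rubric for an energy-conservation problem; not from the paper.
RUBRIC = [
    ChecklistItem("Writes an energy conservation equation", 1.0),
    ChecklistItem("Substitutes the given mass and height correctly", 2.0),
    ChecklistItem("Reports the final speed with units", 1.0),
]


def score(rubric, judgments):
    """Sum the points of every item judged as satisfied.

    `judgments` holds one boolean per rubric item, in the same order.
    """
    if len(rubric) != len(judgments):
        raise ValueError("need exactly one judgment per rubric item")
    return sum(item.points for item, ok in zip(rubric, judgments) if ok)


# A response that sets up the equation and reports units, but botches the substitution.
partial_credit = score(RUBRIC, [True, False, True])
```

A vague rubric ("shows understanding of energy conservation, 0 to 4 points") leaves the partial-credit call to the grader; the checklist version moves that decision into the rubric itself, which is the design effort the summary above describes.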
