AI can grade physics exams, but only if the rubric is a checklist

What happened

A paper reports that AI can reliably score handwritten physics exams, but only when the grading rules are very specific. For AI to be useful in grading, educators need to write detailed, checklist-style rubrics.

Why it matters

Grading complex student responses has long been slow and inconsistent, especially where partial credit is involved. This paper shows that AI can help, but it also reveals a critical bottleneck: the quality of the rubric. If the rubric is too vague, the AI struggles, so the human effort shifts from grading itself to writing extremely precise instructions for the AI.
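
To make "checklist-style" concrete, here is a minimal Python sketch, assuming a rubric is a list of yes/no criteria that each carry a fixed number of points. The physics problem, the criteria, the point values, and the names ChecklistItem, RUBRIC, and score are all hypothetical illustrations, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistItem:
    """One independently checkable criterion (hypothetical example content)."""
    description: str
    points: int


# Hypothetical rubric for an invented problem; not taken from the paper.
RUBRIC = [
    ChecklistItem("States Newton's second law, F = ma", 2),
    ChecklistItem("Draws a free-body diagram with all forces labeled", 3),
    ChecklistItem("Substitutes the given values with units", 2),
    ChecklistItem("Reports the final answer with correct units", 3),
]


def score(rubric, checks):
    """Sum the points of every item marked as satisfied.

    `checks` holds one True/False decision per rubric item, in order.
    """
    if len(checks) != len(rubric):
        raise ValueError("need exactly one yes/no decision per rubric item")
    return sum(item.points for item, ok in zip(rubric, checks) if ok)


# A grader (human or AI) only has to answer yes/no for each item.
decisions = [True, True, False, True]
print(score(RUBRIC, decisions), "out of", sum(i.points for i in RUBRIC))
# prints: 8 out of 10
```

A vague rubric line such as "shows understanding of forces" leaves the partial-credit judgment to the grader. The checklist form moves that judgment into the rubric, which is where the design effort described above ends up.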

The signal

Watch for universities and testing organizations to start publishing new, highly granular rubric designs for AI-assisted grading, or for AI tools to offer built-in rubric design features.