One university tested AI tutors for teaching software design: they were accurate but cold

What happened

A master's course used a customized ChatGPT chatbot to teach 29 students domain-specific knowledge (cryptocurrency finance and design patterns) over two weeks, with all student interactions logged and evaluated. The AI gave factually correct answers 98.9% of the time and stayed relevant 92.2% of the time, but it felt impersonal and didn't offer follow-up support. Students reported big gains in confidence about what they learned, yet the teaching method left room for improvement in tone and feedback.

Why it matters

This is a small, single-course experiment, but it's one of the first times anyone has actually measured how well a large language model works as a tutor in a real classroom with real students rather than on a lab benchmark. The numbers matter: 98.9% accuracy on factual questions suggests that the risk people worry about most (AI hallucinating wrong answers) wasn't the main problem here. Instead, the AI was technically competent but socially wooden. That points to where the real bottleneck in AI-assisted education may be: not whether the machine knows the answer, but whether it knows how to encourage a student who's stuck.

The signal

Track whether other universities run similar experiments with larger cohorts and report comparable data on accuracy, relevance, and student confidence gains. If accuracy near 99% holds across different subjects and institutions, that is evidence that factual tutoring is largely a solved problem. If supportiveness remains the weak point across studies, it becomes the design focus for AI learning tools.