Historians can now query AI to extract data from old documents — no fixed pipeline required

What happened

A new AI tool lets historians convert scans of primary sources into structured data by talking to an AI agent in plain language, rather than forcing all documents through one fixed extraction process. Researchers can adapt the tool to different types of historical sources and check how well the AI performs on their own materials.
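
One way a researcher might check extraction quality on their own materials is to hand-code a small sample and compare it field by field with the AI output. The Python sketch below is illustrative only and is not the tool's own method: the function names, the field names (name, year, occupation), and the sample records are all invented for this example.

```python
import re


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences do not count as errors."""
    if value is None:
        return ""
    return re.sub(r"\s+", " ", str(value)).strip().casefold()


def field_accuracy(extracted, hand_coded, fields):
    """Return, per field, the share of records where the AI value matches the hand-coded one."""
    if len(extracted) != len(hand_coded):
        raise ValueError("both samples must contain the same records in the same order")
    if not hand_coded:
        raise ValueError("the hand-coded sample is empty")
    scores = {}
    for field in fields:
        matches = sum(
            normalize(ai_row.get(field)) == normalize(gold_row.get(field))
            for ai_row, gold_row in zip(extracted, hand_coded)
        )
        scores[field] = matches / len(hand_coded)
    return scores


# Hypothetical sample: four register entries transcribed by hand...
hand_coded = [
    {"name": "Anna Weber", "year": "1843", "occupation": "weaver"},
    {"name": "Johann Meier", "year": "1851", "occupation": "farmer"},
    {"name": "Clara Vogt", "year": "1860", "occupation": "servant"},
    {"name": "Peter Lang", "year": "1872", "occupation": "miller"},
]

# ...and the same four entries as an AI agent might have extracted them (invented values).
extracted = [
    {"name": "Anna Weber", "year": "1843", "occupation": "Weaver"},
    {"name": "Johann Meier", "year": "1857", "occupation": "farmer"},
    {"name": "Clara Vogt", "year": "1860", "occupation": None},
    {"name": "Peter  Lang", "year": "1872", "occupation": "miller"},
]

scores = field_accuracy(extracted, hand_coded, ["name", "year", "occupation"])
print(scores)
```

On this invented sample, names match in all four records, while one misread year and one missing occupation bring those two fields down to 0.75 each. Exact matching after light normalization is a deliberately strict choice. A real project would have to decide how to treat spelling variants, abbreviations, and partially legible entries.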

Why it matters

For decades, historians working with large document collections have either hired teams to transcribe and code materials by hand or used software designed for modern digital text. Neither approach fit the messy reality of old papers, handwriting, and degraded scans. This tool lets historians use AI without surrendering control to a pre-built system. The open question is whether it works: can historians trust the extractions enough to build arguments on them, or does the AI introduce enough errors to make the output useless for scholarship?

The signal

Track whether historians adopt this on substantial archival projects in the next 18 months, and whether published historical scholarship starts citing datasets extracted this way as reliable evidence.