Model extraction from AI systems harder than recent research claimed

What happened

A recent paper claimed that extracting a trained neural network's internal structure becomes easier and faster as networks get deeper, even when the attacker can see only the final classification label. This new research shows the opposite: a single neuron that barely changes state can make extraction exponentially slower, and the claimed polynomial speedup doesn't hold up in practice. This matters because it means that keeping a proprietary model's internals hidden behind a query-only interface offers more protection than the field thought, but only because the attacks are themselves harder than advertised.
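To build intuition for why a neuron that barely changes state is a problem, here is a minimal sketch. It is not the attack from either paper: real hard-label attacks search along decision boundaries rather than sampling inputs at random. The sketch assumes a single ReLU neuron with pre-activation w.x + b and standard Gaussian inputs, and the function names are hypothetical. It only shows how quickly the expected number of random queries needed to observe the neuron switching on grows as its bias becomes more negative.

```python
import math


def activation_probability(bias, weight_norm=1.0):
    """Probability that w.x + bias > 0 when x ~ N(0, I).

    Under that assumption the pre-activation is N(bias, weight_norm**2),
    so the probability is the Gaussian upper tail above zero.
    """
    return 0.5 * math.erfc(-bias / (weight_norm * math.sqrt(2.0)))


def expected_queries_to_activate(bias, weight_norm=1.0):
    """Mean number of independent random inputs until the neuron first turns on.

    This is the mean of a geometric distribution with success probability p.
    """
    return 1.0 / activation_probability(bias, weight_norm)


if __name__ == "__main__":
    # Illustrative biases only; these values are not taken from the research.
    for bias in (0.0, -2.0, -4.0, -6.0):
        queries = expected_queries_to_activate(bias)
        print(f"bias={bias:+.1f}  expected random queries ~ {queries:.3g}")
```

In this toy setting, a neuron with zero bias turns on for half of all inputs, while one with a strongly negative bias may need on the order of a billion random queries before it turns on even once. An attacker who cannot observe a neuron changing state has little signal from which to recover its weights.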

Why it matters

If model extraction is actually much harder than recent high-profile papers suggest, it reshapes the practical security posture of deployed AI systems and may reduce urgency around legal protections for proprietary model weights.