The world is being quietly rearranged by people who write very long documents.


The title they went with ASPIRin: Action Space Projection for Interactivity-Optimized Reinforcement Learning in Full-Duplex Speech Language Models Noisy translates that to

AI models can now decide when to speak, not just what to say


AI speech models often struggle to have natural conversations. Researchers built a new method that lets these models decide when to speak, not just what words to use, making interactions sound less robotic.
Current AI speech models often talk over people or repeat themselves. This makes conversations feel unnatural. This new method separates the timing of speech from the actual words. It means future AI assistants could have much smoother, more human-like interactions, removing a key barrier to wider adoption of AI in customer service or personal assistant roles.
Watch for this method to appear in commercial speech AI products, and whether it actually makes conversations feel less robotic to users.

If you insist
Read the original →