Microsoft has spent the last two years adding new productivity features to Teams, and now it is overhauling how the fundamentals work thanks to artificial intelligence (AI). Microsoft’s new AI-powered voice quality improvements should reduce or even eliminate day-to-day annoyances such as echo and the dreaded 'Can you hear me now?' issues.
Microsoft now uses machine learning models to improve room acoustics so you’ll no longer sound like you’re hiding in a cave. “While we have been trying our best with digital signal processing to do a really good job in Teams, we have now started using machine learning for the first time to build echo cancellation where you can truly reduce echo from all the different devices,” explains Robert Aichner, a principal program manager for intelligent conversation and communications cloud at Microsoft, in an interview with The Verge.
Microsoft has been testing for months, measuring its models in the real world to ensure Teams users are noticing the echo reduction and improvements in call quality. Its developers used 30,000 hours of speech to help train the models, and captured audio from thousands of devices through crowdsourcing, in which Teams users are paid to record their voice and play back audio from their device.
“We also simulate about 100,000 different rooms... the room acoustics play a big role in echo cancellation,” says Aichner. This has resulted in big improvements in call audio quality, and an elimination of echo that also allows multiple people to speak at the same time. You can see all of the improvements Microsoft has made in the video above.
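To give a feel for what simulating a room can mean, here is a minimal sketch of one common approach: convolving a clean signal with a synthetic room impulse response to produce an echoey version. This is an illustration only, not Microsoft's pipeline. The function names (`synthetic_impulse_response`, `convolve`, `simulate_room`), the exponentially decaying noise model and all parameter values are assumptions made for the example.

```python
import math
import random


def synthetic_impulse_response(length, decay, seed=0):
    """Build a crude, hypothetical room impulse response.

    Random reflections whose strength decays exponentially over time,
    a very rough stand-in for how sound dies away in a real room.
    """
    rng = random.Random(seed)
    ir = [rng.uniform(-1.0, 1.0) * math.exp(-decay * n) for n in range(length)]
    ir[0] = 1.0  # direct path: the sound arriving without any reflection
    return ir


def convolve(signal, ir):
    """Direct-form convolution of a signal with an impulse response.

    Every input sample triggers a scaled copy of the impulse response,
    and the overlapping copies add up to the reverberant output.
    """
    out = [0.0] * (len(signal) + len(ir) - 1)
    for i, sample in enumerate(signal):
        for j, tap in enumerate(ir):
            out[i + j] += sample * tap
    return out


def simulate_room(clean, length=64, decay=0.1, seed=0):
    """Return the clean signal as it might sound in one simulated room."""
    return convolve(clean, synthetic_impulse_response(length, decay, seed))
```

Changing `seed`, `length` and `decay` yields a different "room" each time, which hints at how a large set of acoustic conditions could be generated from the same recordings to pair clean and echoey audio for training.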
If Teams detects that sound is bouncing or reverberating around a room, resulting in shallow audio, the model will also process the captured audio so that Teams participants sound as if they are speaking into a close-range microphone instead of an echoey mess.
The most impressive part is that people can now interrupt each other on Teams calls without the awkward overlap where you can’t hear the other person due to the echo. Microsoft is now shipping all this work in Teams, alongside the improvements it previously made with AI-based noise suppression. All of the processing is done locally on client devices, instead of in the cloud.
“We said we want to do it on the client, because the cloud is still expensive if you want to do every call processed in the cloud... and obviously we’d have to pass that cost onto the customer,” explains Aichner. Cloud processing could have meant restricting these important Teams improvements to paying customers, whereas the on-device route means features like noise suppression are available on 90 percent of devices using Teams.
All of these new Microsoft Teams improvements are now live, alongside some real-time screen optimisations for text in videos and AI-based improvements for coping with bandwidth constraints during video or screen-sharing calls.