A new AI can focus on one voice in a crowd
The program can identify and suppress background noise using both visual and audio cues
BY MARIA TEMMING 9:00AM, JUNE 11, 2018
Much like someone listening to a conversation at a crowded party, a new artificial intelligence can tune out background noise in videos to hear what a particular person on screen is saying.
Humans are naturally good at focusing on specific voices amidst the din — a phenomenon known as the cocktail party effect (SN Online: 4/29/14). But until now, programs designed to listen for specific speakers in noisy audio tracks have struggled to mimic humans’ selective mental muting. The new AI is designed to use both audio and visual cues, such as mouth movements, to separate sounds produced by different speakers in videos.
Researchers at Google tested their AI on cocktail party–like video clips that featured two or three people talking over each other, with various levels of background noise. By watching and listening to the videos, the new AI could distinguish which sounds were coming from each speaker much more accurately than a similar algorithm that simply listened to the audio.
This AI, to be presented in August at the 2018 SIGGRAPH meeting in Vancouver, could be used to caption videos more accurately than current transcription systems. And a future, faster version of the program that can filter background noise from live video feeds could help people hear each other more clearly during teleconferences, says Shmuel Peleg, a computer scientist at the Hebrew University of Jerusalem.
What’s more, this kind of AI could help virtual assistants hear voice commands more clearly, adds Jen-Cheng Hou, an engineer at the Research Center for Information Technology Innovation, Academia Sinica in Taiwan.
Citations
A. Ephrat et al. Looking to listen at the cocktail party: A speaker-independent audio-visual model for speech separation. International Conference and Exhibition on Computer Graphics and Interactive Techniques. Vancouver, Canada, August 15, 2018.
Further Reading
A. Grant. 3-D printed device cracks cocktail party problem. Science News. Vol. 188, September 19, 2015, p. 16.
B. Brookshire. How brains filter the signal from the noise. Science News Online, April 29, 2014.
L. Sanders. Attention tunes the mind’s ear. Science News Online, April 18, 2012.
A. Witze. How to hear above the cocktail party din. Science News Online, January 3, 2011.