The University of Washington has developed an artificial intelligence system that lets a headphone wearer hear just the voice of a specific speaker in real time, even as the listener moves through noisy environments. Once the wearer enrolls a speaker by looking at them for a few seconds, the system cancels out all other sound and plays only the enrolled voice. The innovation, called “Target Speech Hearing,” builds on the team’s previous work on noise-canceling headphones and semantic hearing systems.

The system uses off-the-shelf headphones fitted with microphones. The wearer faces the desired speaker and taps a button to begin enrollment; sound waves from the speaker’s voice then reach the microphones on both sides of the headset, and machine learning software running on an embedded computer learns the speaker’s vocal patterns. From that point on, the system plays back only the enrolled voice in real time, and it tunes in more effectively the longer the speaker talks, since continued speech gives the model additional training data.
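The enroll-then-extract flow described above can be sketched roughly as follows. This is only an illustration of the pipeline shape, not the UW implementation: the real system uses a learned neural network on an embedded computer, so the spectral-average "embedding" and per-frequency-bin masking below are toy stand-ins, and all function names are hypothetical.

```python
import numpy as np

FRAME = 512  # samples per processing frame (illustrative choice)

def enroll(binaural_clip: np.ndarray) -> np.ndarray:
    """Derive a speaker 'embedding' from a short enrollment clip.

    binaural_clip has shape (2, FRAME): left and right microphone
    channels. A real system would feed both channels to a neural
    encoder; here we average them and take a normalized magnitude
    spectrum as a stand-in signature of the voice.
    """
    mono = binaural_clip.mean(axis=0)
    spectrum = np.abs(np.fft.rfft(mono))
    return spectrum / (np.linalg.norm(spectrum) + 1e-9)

def extract(frame: np.ndarray, embedding: np.ndarray) -> np.ndarray:
    """Suppress audio that doesn't match the enrolled voice.

    Stand-in for the learned target-speech-extraction network:
    scale each frequency bin by a 0..1 gain derived from the
    enrollment spectrum, then return the filtered time-domain frame.
    """
    spectrum = np.fft.rfft(frame)
    mask = embedding / (embedding.max() + 1e-9)
    return np.fft.irfft(spectrum * mask, n=frame.size)

def stream(frames, embedding):
    """Real-time loop: filter each incoming frame as it arrives."""
    return [extract(f, embedding) for f in frames]
```

In this sketch, "tuning in as the speaker continues talking" would correspond to periodically re-running `enroll` on fresh audio and blending the result into the stored embedding.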

While noise-canceling headphones such as Apple’s AirPods Pro can automatically adjust sound levels during a conversation, the UW system goes further by letting the user choose whom to listen to, and when. The technology could be especially useful in crowded settings such as restaurants or cafeterias, where background noise makes it hard to hear the person across the table. With a button press and a glance at the speaker, the system locks onto the enrolled voice and boosts its clarity.

Currently, the system can enroll only one speaker at a time, and enrollment works only when no other loud voice is coming from the same direction as the target speaker. If playback is not clear enough, the user can run another enrollment on the same speaker to improve it. The UW team presented its findings at the ACM CHI Conference on Human Factors in Computing Systems, and the code for the proof-of-concept device is available for others to build on, although the system is not yet commercially available.

Overall, the University of Washington’s system represents a significant advance in speech clarity for headphone users in noisy environments. By leveraging machine learning, it isolates and plays back a specific speaker’s voice in real time, giving listeners a more focused, personalized experience. As the technology matures, it promises to improve communication in the many settings where background noise gets in the way of hearing the person you want to hear.
