Acoustic blind source separation
Assignment title: | Acoustic blind source separation |
Start date: | August 2010 |
Client: | University of Surrey |
Investigator: | Dr David Nugent, david.nugent@elucidare.co.uk |
Introduction
Researchers at the Centre for Communication Systems Research, University of Surrey, have developed a novel method for isolating multiple sound sources in a noisy environment. Sound sources can be individually separated, emphasized, suppressed, or modified and then recombined in any 3D spatial configuration. All processing is done in real time, and no prior knowledge of the number or location of the sources is required.
Technology
Blind source separation (BSS) is performed using acoustic pressure gradients derived from a small array of condenser microphones, or obtained directly from commercial B-format tetrahedral microphones. Time-frequency representations of the pressure and pressure gradient signals are calculated using a modified discrete cosine transform or a fast Fourier transform. These representations are used to derive the direction of the intensity vector in each time-frequency bin. Beamforming is then applied using a directivity function defined for each sound source and time-frequency bin. Finally, the individual time-domain signals are obtained using an inverse modified discrete cosine transform or an inverse fast Fourier transform.
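The processing chain described above can be illustrated with a minimal single-frame sketch. It is not the Surrey implementation, and several things in it are assumptions made for illustration: a horizontal-only, plane-wave B-format encoding (W = s, X = s cos θ, Y = s sin θ, without the conventional W scaling), a naive DFT standing in for the FFT, a von Mises-shaped directivity function with a hypothetical sharpness parameter `kappa`, known look directions, and two test tones that occupy different frequency bins. All function names are hypothetical.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform of one frame (stand-in for an FFT)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of the time-domain frame."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def encode_b_format(sources, n):
    """Assumed plane-wave encoding: pressure W plus gradient-like X and Y channels.

    sources is a list of (signal, azimuth_in_radians) pairs.
    """
    w, x, y = [0.0] * n, [0.0] * n, [0.0] * n
    for signal, azimuth in sources:
        for t in range(n):
            w[t] += signal[t]
            x[t] += signal[t] * math.cos(azimuth)
            y[t] += signal[t] * math.sin(azimuth)
    return w, x, y


def intensity_directions(W, X, Y):
    """Direction of the active intensity vector in each frequency bin."""
    return [
        math.atan2((W[k].conjugate() * Y[k]).real, (W[k].conjugate() * X[k]).real)
        for k in range(len(W))
    ]


def directivity(gamma, look, kappa=20.0):
    """Von Mises-shaped gain, equal to 1 when gamma matches the look direction."""
    return math.exp(kappa * (math.cos(gamma - look) - 1.0))


def separate(w, x, y, look, kappa=20.0):
    """Weight each bin of W by the directivity gain, then return to the time domain."""
    W, X, Y = dft(w), dft(x), dft(y)
    gammas = intensity_directions(W, X, Y)
    masked = [directivity(g, look, kappa) * Wk for g, Wk in zip(gammas, W)]
    return idft(masked)


# Two tones in different bins, arriving from 30 and 135 degrees azimuth.
n = 64
src_a = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
src_b = [math.sin(2 * math.pi * 9 * t / n) for t in range(n)]
w, x, y = encode_b_format(
    [(src_a, math.radians(30)), (src_b, math.radians(135))], n
)

est_a = separate(w, x, y, look=math.radians(30))
err = max(abs(e - s) for e, s in zip(est_a, src_a))
print(f"max reconstruction error for source A: {err:.2e}")
```

A real-time system would process overlapping windowed frames and would have to estimate the source directions from the data, since no prior knowledge of the sources is assumed; the sketch skips both steps to keep the per-bin logic visible.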
Performance
The intensity vector method offers several important advantages over conventional BSS techniques.
Feature | Conventional BSS techniques | Intensity vector analysis |
Number of sources | The number of sound sources is limited to the number of microphones. | Any number of directivity functions can be calculated, so more sources than microphones can be separated, although performance would be limited. For a large number of sources, it may be more practical to use fixed directivity functions for each window, as calculating them for every window would be computationally demanding. |
Moving sources | Independent component analysis requires the sound sources to be stationary. | Real-time separation is achievable within 25 ms, so the system can lock on to moving sound sources. |
Compactness | The accuracy of time-delay-of-arrival techniques generally increases with the size of the microphone array. | The physical separation of microphones in the array must be small compared to the acoustic wavelength in air. Source separation performance improves with smaller microphone arrays, such as those manufactured using MEMS. |
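To put the compactness and timing figures in context, the short calculation below relates a microphone spacing to the acoustic wavelength and converts a 25 ms window into a sample count. The 5 mm spacing and the sample rates are hypothetical values chosen for illustration; only the 25 ms figure comes from the table, and the speed of sound is taken as roughly 343 m/s in air at about 20 °C.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C (approximate)


def wavelength(frequency_hz):
    """Acoustic wavelength in metres at the given frequency."""
    return SPEED_OF_SOUND / frequency_hz


def spacing_ratio(spacing_m, frequency_hz):
    """Microphone spacing as a fraction of the wavelength; should be well below 1."""
    return spacing_m / wavelength(frequency_hz)


def frame_samples(duration_s, sample_rate_hz):
    """Number of samples in a processing window of the given duration."""
    return round(duration_s * sample_rate_hz)


spacing = 0.005  # hypothetical 5 mm array spacing
for f in (100, 1000, 8000):
    print(f"{f} Hz: wavelength {wavelength(f):.4f} m, "
          f"spacing/wavelength {spacing_ratio(spacing, f):.4f}")

for rate in (16000, 48000):  # assumed sample rates
    print(f"25 ms at {rate} Hz: {frame_samples(0.025, rate)} samples")
```

The ratio grows with frequency, which is why a smaller array keeps the small-spacing condition valid over a wider audio bandwidth.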
The following videos demonstrate two first-generation BSS implementations. Neither is optimised for any specific application, and improvements in functionality and audio quality can be expected once the system is tailored to a particular specification. An uncompressed version of the original audio and video can be viewed here.
Applications
Hearing aids: Listening to selected sounds or conversations, and improved speech intelligibility for the hearing-impaired.
Teleconferencing: Speaker localization, volume equalization or selective enhancement.
Mobile phones: Environmental noise and interference suppression.
Speech recognition: Pre-processing to improve signal-to-noise ratio.
Broadcasting: Real-time audio capture and synthesis for 3D TV productions, ensuring spatial synchronicity of sound and picture, including multi-view rendering.
Audio post-production: Audio personalization, automated dialogue replacement, volume balancing.
Immersive remote collaboration: Selective transmission of multiple speech sounds and their processing for 3D reproduction.
Automotive: Noise and acoustic echo cancellation.
Surveillance: Automatic detection of sound sources and camera zooming. Automatic keyword / threat detection in noisy, multi-speaker environments such as airports.
Biometrics: Pre-processing to improve speaker identification.
Documents available for download
High-res demonstration of acoustic BSS in reverberant office environment
Introduction to Real-time Sound Source Separation and Localisation
PCT WO 2009/050487 "Acoustic Source Separation"
Acoustic Source Separation of Convolutive Mixtures Based on Intensity Vector Statistics
Intensity Vector Direction Exploitation For Exhaustive Blind Source Separation Of Convolutive Mixtures
Blind source separation and directional audio synthesis for binaural auralization of multiple sound sources using microphone array recordings
Performance of Closed-Form Acoustic Scene Decomposition for Forensic Analysis
Spatial Synchronization of Audiovisual Objects by 3D Audio Object Coding