Acoustic blind source separation

Assignment title:	Acoustic blind source separation
Start date:	August 2010
Client:	University of Surrey
Investigator:	Dr David Nugent david.nugent@elucidare.co.uk

Introduction

Researchers at the Centre for Communications Systems Research, University of Surrey, have developed a novel method for isolating multiple sound sources in a noisy environment. Sound sources can be individually separated, emphasized, suppressed, or modified and then recombined in any 3D spatial configuration. All processing is done in real time, and no prior knowledge of the number or location of the sources is required.

Technology

Blind source separation (BSS) is performed using acoustic pressure gradients derived from a small array of condenser microphones, or obtained directly from commercial B-format tetrahedral microphones. Time-frequency representations of the pressure and pressure gradient signals are calculated using a modified discrete cosine transform or fast Fourier transform. These are used to derive intensity vector directions. Beamforming is applied using a directivity function defined for each sound source and time-frequency bin. Finally, individual time-domain signals are obtained using an inverse modified discrete cosine transform or inverse fast Fourier transform.
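
The processing chain described above can be sketched in a few lines of Python. This is a minimal illustration, not the Surrey implementation: it works on a single frame in the horizontal plane only, uses a naive DFT in place of an FFT or modified discrete cosine transform, and omits windowing and overlap. The signal convention (pressure `w`, gradients `x = s cos θ` and `y = s sin θ` for a source at azimuth θ), the von Mises-shaped `directivity` function, its `kappa` parameter, and all test values are assumptions introduced for the example.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform, standing in for an FFT."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Naive inverse DFT; returns the real part of the time-domain signal."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def intensity_directions(W, X, Y):
    """Direction (radians) of the intensity vector in each frequency bin."""
    return [math.atan2((w.conjugate() * y).real, (w.conjugate() * x).real)
            for w, x, y in zip(W, X, Y)]

def directivity(gamma, look, kappa=20.0):
    """Hypothetical von Mises-shaped weight: 1 toward `look`, near 0 elsewhere."""
    return math.exp(kappa * (math.cos(gamma - look) - 1.0))

def separate(w, x, y, look):
    """Extract the source arriving from azimuth `look` out of one frame."""
    W, X, Y = dft(w), dft(x), dft(y)
    gammas = intensity_directions(W, X, Y)
    weighted = [Wk * directivity(g, look) for Wk, g in zip(W, gammas)]
    return idft(weighted)

# Synthetic frame: two tones arriving from different azimuths (assumed values).
N = 64
theta1, theta2 = math.radians(30), math.radians(-120)
s1 = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
s2 = [math.sin(2 * math.pi * 10 * t / N) for t in range(N)]
w = [a + b for a, b in zip(s1, s2)]  # pressure signal
x = [a * math.cos(theta1) + b * math.cos(theta2) for a, b in zip(s1, s2)]
y = [a * math.sin(theta1) + b * math.sin(theta2) for a, b in zip(s1, s2)]

out1 = separate(w, x, y, theta1)  # should resemble s1
out2 = separate(w, x, y, theta2)  # should resemble s2
print("max error, source 1:", max(abs(a - b) for a, b in zip(out1, s1)))
print("max error, source 2:", max(abs(a - b) for a, b in zip(out2, s2)))
```

In this toy case the two tones fall in different frequency bins, so each bin's intensity direction points cleanly at one source. When two sources share a bin, the estimated direction is a blend of both, and separation would be expected to be less clean.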

Performance

The intensity vector method offers several important advantages over conventional BSS techniques.

Feature	Conventional BSS techniques	Intensity vector analysis
Number of sources	The number of sound sources is limited to the number of microphones.	Any number of directivity functions can be calculated, so more sources than microphones can be separated, although performance would then be limited. For a large number of sources, it may be more practical to use fixed directivity functions for each window, as calculating them for every window would be computationally demanding.
Moving sources	Independent component analysis requires the sound sources to be stationary.	Real-time separation is achievable within 25 ms, so the system can lock on to moving sound sources.
Compactness	The accuracy of time-delay-of-arrival techniques generally increases with the size of the microphone array.	The physical separation of microphones in the array must be small compared to the acoustic wavelength in air. Source separation performance improves with smaller microphone arrays, such as those manufactured using MEMS.
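
The compactness requirement in the table can be illustrated with a short calculation. The sketch below assumes that the pressure gradient is estimated as a finite difference between two microphones a distance d apart, which the source does not state explicitly. For an on-axis plane wave with wavenumber k, that estimate equals the true gradient scaled by sin(kd/2)/(kd/2), a factor that approaches 1 only when d is small compared to the wavelength. The function name, the speed of sound, and the example spacings and frequency are illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C (assumed)

def gradient_accuracy(spacing_m, freq_hz, c=SPEED_OF_SOUND):
    """Finite-difference gradient divided by true gradient, on-axis plane wave."""
    half_kd = math.pi * freq_hz * spacing_m / c  # k*d/2 with k = 2*pi*f/c
    return math.sin(half_kd) / half_kd if half_kd else 1.0

# Compare a few hypothetical microphone spacings at 4 kHz.
for spacing in (0.005, 0.02, 0.05):
    wavelength = SPEED_OF_SOUND / 4000.0
    print(f"d = {spacing * 1000:.0f} mm, d/wavelength = {spacing / wavelength:.2f}, "
          f"accuracy = {gradient_accuracy(spacing, 4000.0):.3f}")
```

The factor falls further below 1 as the spacing grows relative to the wavelength, which is consistent with the table's statement that smaller arrays perform better.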

The following videos demonstrate two first-generation BSS implementations. Neither is optimised for any specific application, and improvements in functionality and audio quality can be expected once the specification is tailored to a particular use. An uncompressed version of the original audio and video can be viewed here.

Applications

Hearing aids: Listening to selected sounds/conversations and improved speech intelligibility for the hearing-impaired.
Teleconferencing: Speaker localization, volume equalization or selective enhancement.
Mobile phones: Environmental noise and interference suppression.
Speech recognition: Pre-processing to improve signal-to-noise ratio.
Broadcasting: Real-time audio capturing and synthesis for 3D TV productions, ensuring spatial synchronicity of sound and picture, including multi-view rendering.
Audio post production: Audio personalization, automated dialogue replacement, volume balancing.
Immersive remote collaboration: Selective transmission of multiple speech sounds and their processing for 3D reproduction.
Automotive: Noise and acoustic echo cancellation.
Surveillance: Automatic detection of sound sources and camera zooming. Automatic keyword / threat detection in noisy, multi-speaker environments such as airports.
Biometrics: Pre-processing to improve speaker identification.

Documents available for download

High-res demonstration of acoustic BSS in reverberant office environment
Introduction to Real-time Sound Source Separation and Localisation
PCT WO 2009/050487 "Acoustic Source Separation"
Acoustic Source Separation of Convolutive Mixtures Based on Intensity Vector Statistics
Intensity Vector Direction Exploitation For Exhaustive Blind Source Separation Of Convolutive Mixtures
Blind source separation and directional audio synthesis for binaural auralization of multiple sound sources using microphone array recordings
Performance of Closed-Form Acoustic Scene Decomposition for Forensic Analysis
Spatial Synchronization of Audiovisual Objects by 3D Audio Object Coding