Acoustic blind source separation


Assignment title: Acoustic blind source separation
Start date: August 2010
Client: University of Surrey
Investigator: Dr David Nugent
david.nugent@elucidare.co.uk

 


Introduction

Researchers at the Centre for Communications Systems Research, University of Surrey, have developed a novel method for isolating multiple sound sources in a noisy environment. Sound sources can be individually separated, emphasized, suppressed, or modified and then recombined in any 3D spatial configuration. All processing is done in real time, and no prior knowledge of the number or location of the sources is required.


Technology

Blind source separation (BSS) is performed using acoustic pressure gradients derived from a small array of condenser microphones, or obtained directly from commercial B-format tetrahedral microphones. Time-frequency representations of the pressure and pressure gradient signals are calculated using a modified discrete cosine transform or fast Fourier transform. These representations are used to derive intensity vector directions. Beamforming is then applied using a directivity function defined for each sound source and time-frequency bin. Finally, individual time-domain signals are obtained using an inverse modified discrete cosine transform or inverse fast Fourier transform.
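The processing chain described above can be illustrated with a minimal sketch. It is not the Surrey implementation: it assumes horizontal-only B-format signals (w for pressure, x and y for the pressure gradients), non-overlapping FFT frames, and a von Mises shaped directivity function, none of which are specified in the text. The function names and the sharpness parameter kappa are hypothetical.

```python
import numpy as np

def intensity_directions(W, X, Y):
    """Azimuth (radians) of the intensity vector in each time-frequency bin.

    W is the pressure spectrum; X and Y are the pressure-gradient spectra.
    The B-format sign convention is assumed, so the angle points toward
    the source. Positive scaling of W does not change the angle.
    """
    ix = np.real(np.conj(W) * X)
    iy = np.real(np.conj(W) * Y)
    return np.arctan2(iy, ix)

def directivity(gamma, mu, kappa):
    """Gain in (0, 1] that peaks at direction mu (assumed von Mises shape)."""
    return np.exp(kappa * (np.cos(gamma - mu) - 1.0))

def separate(w, x, y, source_azimuths, kappa=8.0, frame=512):
    """Return one time-domain signal per target azimuth."""
    n_frames = len(w) // frame
    shape = (n_frames, frame)
    # Non-overlapping rectangular frames keep the sketch short; a practical
    # system would use overlapping windows.
    W, X, Y = (np.fft.rfft(s[: n_frames * frame].reshape(shape), axis=1)
               for s in (w, x, y))
    gamma = intensity_directions(W, X, Y)
    outputs = []
    for mu in source_azimuths:
        gain = directivity(gamma, mu, kappa)  # one gain per bin
        outputs.append(np.fft.irfft(gain * W, n=frame, axis=1).ravel())
    return outputs

# Synthetic check: two tones arriving from azimuths 0 and pi/2.
fs, frame = 16000, 512
t = np.arange(8 * frame) / fs
s1 = np.sin(2 * np.pi * 500.0 * t)
s2 = np.sin(2 * np.pi * 2000.0 * t)
az1, az2 = 0.0, np.pi / 2
w = s1 + s2
x = s1 * np.cos(az1) + s2 * np.cos(az2)
y = s1 * np.sin(az1) + s2 * np.sin(az2)
outputs = separate(w, x, y, [az1, az2], frame=frame)
```

In this idealised example the two tones occupy different frequency bins, so each output closely matches one source. Real speech sources overlap in time and frequency, and reverberation blurs the direction estimates, so practical performance would depend on how the directivity functions are chosen.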


Performance

The intensity vector method offers several important advantages over conventional BSS techniques.

Feature: Number of sources
Conventional BSS techniques: The number of sound sources is limited to the number of microphones.
Intensity vector analysis: An unlimited number of directivity functions can be calculated, so more sources than microphones can be separated, although performance would be limited. For a large number of sources, it may be more practical to use fixed directivity functions rather than calculating them for each window, which would be computationally demanding.

Feature: Moving sources
Conventional BSS techniques: Independent component analysis requires the sound sources to be stationary.
Intensity vector analysis: Real-time separation is achievable within 25 ms, so the system can lock on to moving sound sources.

Feature: Compactness
Conventional BSS techniques: The accuracy of time-delay-of-arrival techniques generally increases with the size of the microphone array.
Intensity vector analysis: The physical separation of microphones in the array must be small compared to the acoustic wavelength in air. Source separation performance improves with smaller microphone arrays, such as those manufactured using MEMS.
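
The compactness requirement can be made concrete with a short calculation. The sketch below assumes the pressure gradient is estimated as a finite difference between two microphones a distance d apart, which the text does not state. For an on-axis plane wave with wavenumber k, that estimate equals the true gradient multiplied by sin(kd/2) / (kd/2), so it degrades as d approaches the wavelength. The function name and the example spacings are illustrative only.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C

def gradient_accuracy(spacing_m, freq_hz, c=SPEED_OF_SOUND):
    """Ratio of finite-difference to true pressure gradient (1.0 is exact).

    Derived for an on-axis plane wave p(x) = exp(-j k x) sampled at
    x = -d/2 and x = +d/2.
    """
    k = 2.0 * np.pi * freq_hz / c
    # np.sinc(u) is sin(pi u) / (pi u), so pass kd / (2 pi) to get
    # sin(kd/2) / (kd/2).
    return np.sinc(k * spacing_m / (2.0 * np.pi))

freq = 4000.0
wavelength = SPEED_OF_SOUND / freq  # about 8.6 cm at 4 kHz
for spacing in (0.005, 0.05):       # 5 mm and 5 cm
    ratio = gradient_accuracy(spacing, freq)
    print(f"d = {spacing * 1000:.0f} mm, d/wavelength = "
          f"{spacing / wavelength:.2f}, accuracy = {ratio:.3f}")
```

Under these assumptions a 5 mm spacing keeps the gradient estimate within about 1% at 4 kHz, while a 5 cm spacing loses nearly half of it, which is consistent with the preference for very small arrays.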

 

The following videos demonstrate two first-generation BSS implementations. Neither is optimised for any specific application, and improvements in functionality and audio quality can be expected once the system is tailored to a particular specification. An uncompressed version of the original audio and video can be viewed here.

 


Applications

Hearing aids: Listening to selected sounds or conversations, and improved speech intelligibility for the hearing-impaired.
Teleconferencing: Speaker localization, volume equalization or selective enhancement.
Mobile phones: Environmental noise and interference suppression.
Speech recognition: Pre-processing to improve signal-to-noise ratio.
Broadcasting: Real-time audio capture and synthesis for 3D TV productions, ensuring spatial synchronicity of sound and picture, including multi-view rendering.
Audio post production: Audio personalization, automated dialogue replacement, volume balancing.
Immersive remote collaboration: Selective transmission of multiple speech sounds and their processing for 3D reproduction.
Automotive: Noise and acoustic echo cancellation.
Surveillance: Automatic detection of sound sources and camera zooming. Automatic keyword / threat detection in noisy, multi-speaker environments such as airports.
Biometrics: Pre-processing to improve speaker identification.


Documents available for download

High-res demonstration of acoustic BSS in reverberant office environment
Introduction to Real-time Sound Source Separation and Localisation
PCT WO 2009/050487 "Acoustic Source Separation"
Acoustic Source Separation of Convolutive Mixtures Based on Intensity Vector Statistics
Intensity Vector Direction Exploitation For Exhaustive Blind Source Separation Of Convolutive Mixtures
Blind source separation and directional audio synthesis for binaural auralization of multiple sound sources using microphone array recordings
Performance of Closed-Form Acoustic Scene Decomposition for Forensic Analysis
Spatial Synchronization of Audiovisual Objects by 3D Audio Object Coding