Blind Source Separation of
Convolutive Speech Mixtures
Yizheng Yuan
University of Brasília (UnB)
Department of Electrical Engineering (ENE)
Laboratory of Array Signal Processing
PO Box 4386
Zip Code 70.919-970, Brasília - DF
Homepage: http://www.pgea.unb.br/~lasp
Universidade de Brasília
Laboratório de Processamento de Sinais em Arranjos
In Cooperation with

243 research internships in 34 countries in 2014
Outline

Introduction
Problem Setting
Steps of Solution
  - Short-Time Fourier Transform
  - Solution Approach
  - Estimating the Autocorrelation Matrices
  - Reconstructing the Source Signals
Sensitivity to Estimation Errors
Simulations
Introduction


Imagine several people talking at the same time
We wish to recognize the voice of each one



This procedure is called blind source separation
We record the sounds with a microphone array and try to reconstruct the signal emitted by each source
Current methods to accomplish this include ICA, TRINICON, joint diagonalization and PARAFAC
Problem Setting

I speakers, emitting the source signals s

J microphones, recording the mixed signals x

Linear convolutive speech mixture with mixing channel H of length L
  - speaker signals can arrive with time delay

Goal: Reconstruct s from x by estimating H
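The mixing model above can be simulated with a minimal NumPy sketch. It assumes the usual finite impulse response form x_j(t) = Σ_i Σ_l h_ji(l) s_i(t - l) with l = 0, ..., L-1. The function name convolutive_mix and the array layout are illustrative choices, not taken from the slides.

```python
import numpy as np

def convolutive_mix(s, H):
    """Mix I source signals into J microphone signals.

    s: array of shape (I, T), one row per speaker.
    H: array of shape (J, I, L); H[j, i, :] is the impulse response
       (L taps) from speaker i to microphone j.
    Returns x of shape (J, T).
    """
    J, I, L = H.shape
    T = s.shape[1]
    x = np.zeros((J, T))
    for j in range(J):
        for i in range(I):
            # The full convolution has T + L - 1 samples; keep the first T.
            x[j] += np.convolve(s[i], H[j, i])[:T]
    return x

rng = np.random.default_rng(0)
s = rng.standard_normal((2, 1000))        # I = 2 speakers
H = 0.5 * rng.standard_normal((3, 2, 4))  # J = 3 microphones, L = 4 taps
x = convolutive_mix(s, H)
```

With L = 1 the model reduces to an instantaneous mixture; the delayed taps are what make the problem convolutive.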
Short-Time Fourier Transform

Multiply the signals x by a window function w that is non-zero only for a short period of time

Apply the DFT to the windowed signals within the frame of the window function

Repeat while shifting the window function along the time axis
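The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used for the simulations: the function name stft, the Hanning window and the test tone are assumptions chosen for the example.

```python
import numpy as np

def stft(x, win_len, hop):
    """Return an array of shape (win_len, Q): one DFT per window frame q."""
    w = np.hanning(win_len)                        # window, applied to one frame at a time
    starts = range(0, len(x) - win_len + 1, hop)   # shift the window along the time axis
    frames = [np.fft.fft(w * x[n:n + win_len]) for n in starts]
    return np.array(frames).T

fs = 16000                              # sample rate in Hz
t = np.arange(4000) / fs
x = np.sin(2 * np.pi * 1000 * t)        # 1 kHz test tone
X = stft(x, win_len=400, hop=160)       # hop of 40% of the window, i.e. 60% overlap
```

Each column of X is the spectrum of one window frame; the frequency resolution is fs / win_len.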
[Figure: successive window frames of the signal, each transformed by a DFT]
Solution Approach

Step 1
  - Apply the STFT to x
  - By the convolution property, the mixture becomes approximately instantaneous at each frequency f:
    x(f,q) ≈ H(f) s(f,q)
    (where q is an index of STFT window frames)
  - The approximation improves as the STFT window length grows relative to L
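The dependence of Step 1 on the window length can be checked numerically. The sketch below assumes a single speaker and a single microphone with a random impulse response of L = 4 taps, and compares the STFT of the mixture with H(f) times the STFT of the source. The helper names and the two window lengths are illustrative assumptions.

```python
import numpy as np

def stft(x, win_len, hop):
    w = np.hanning(win_len)
    starts = range(0, len(x) - win_len + 1, hop)
    return np.array([np.fft.fft(w * x[n:n + win_len]) for n in starts]).T

def relative_error(win_len, L=4, T=20000, seed=0):
    """Relative error of x(f,q) ≈ H(f) s(f,q) for one source and one microphone."""
    rng = np.random.default_rng(seed)
    s = rng.standard_normal(T)
    h = 0.5 * rng.standard_normal(L)     # impulse response with L taps
    x = np.convolve(s, h)[:T]            # convolutive mixture in the time domain
    hop = max(1, int(0.4 * win_len))     # 60% overlap
    S, X = stft(s, win_len, hop), stft(x, win_len, hop)
    H_f = np.fft.fft(h, n=win_len)       # H(f) on the same frequency grid
    return np.linalg.norm(X - H_f[:, None] * S) / np.linalg.norm(X)

err_short = relative_error(win_len=16)    # window only 4 times longer than L
err_long = relative_error(win_len=256)    # window 64 times longer than L
```

The error for the long window should come out clearly smaller than for the short one, in line with the remark about the window length.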

Step 2
  - Divide x into P blocks along the time axis
  - To merge the blocks more smoothly, we make neighboring blocks overlap by 10%
  - When reassembling the entire signal, we take a weighted average of the samples in the overlapping intervals
  - We model the signals s as random processes, and for each block p we consider the autocorrelation matrices Rx(f,p) = E[x(f,q) x(f,q)^H] and Rs(f,p) = E[s(f,q) s(f,q)^H], with q ranging over the window frames of block p
  - Inserting the approximation x(f,q) ≈ H(f) s(f,q) from Step 1 into the expectation leads to the following system of equations:
    Rx(f,p) = H(f) Rs(f,p) H(f)^H,  p = 1, ..., P
  - Assuming zero-mean, mutually uncorrelated speaker signals, the matrices Rs(f,p) are ideally diagonal
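The block division of Step 2 can be sketched as follows. The slides do not specify how the 10% overlap is measured or which weights are used in the average, so this sketch assumes the overlap is a fraction of the block length and uses triangular weights. The function names split_blocks and merge_blocks are hypothetical.

```python
import numpy as np

def split_blocks(x, P, overlap=0.10):
    """Split x into P equally long blocks; neighbors share about `overlap` of a block."""
    T = len(x)
    # P blocks of length B with overlap o*B cover B * (P - (P - 1) * o) samples.
    B = int(np.ceil(T / (P - (P - 1) * overlap)))
    starts = np.linspace(0, T - B, P).round().astype(int)
    return [(int(n), x[n:n + B]) for n in starts]

def merge_blocks(blocks, T):
    """Reassemble the blocks, taking a weighted average where they overlap."""
    out = np.zeros(T)
    weight = np.zeros(T)
    for n, b in blocks:
        B = len(b)
        # Triangular weights: samples near a block edge count less.
        w = np.minimum(np.arange(1, B + 1), np.arange(B, 0, -1)).astype(float)
        out[n:n + B] += w * b
        weight[n:n + B] += w
    # Outside the overlaps only one block contributes, so the sample is unchanged.
    return out / weight

x = np.arange(50000, dtype=float)
blocks = split_blocks(x, P=5)
y = merge_blocks(blocks, len(x))
```

In the separation pipeline each block would be processed on its own before merging, so the weighted average smooths the transition between neighboring blocks.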
Estimating the Autocorrelation Matrices


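Denote by x(f,k_pm) the DFT of the m-th STFT window of block p
Compute the average
    R̂x(f,p) = (1/M) Σ_{m=1}^{M} x(f,k_pm) x(f,k_pm)^H
(where M denotes the number of STFT windows in a block)
  - The estimates of Rs become closer to diagonal as the STFT windows get shorter, since a block then contains more windows to average over
  - Trade-off: the approximation in Step 1 favors long windows, so a “good” length has to balance both effects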
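The averaging of outer products over the M windows of a block can be written compactly in NumPy. This is a minimal sketch; the function name estimate_autocorrelation and the array layout (microphones, frequency bins, windows) are assumptions made for the example, and the input is random data rather than a real STFT.

```python
import numpy as np

def estimate_autocorrelation(X_block):
    """Estimate Rx(f,p) for one block p.

    X_block: complex array of shape (J, F, M) holding the STFT of the block
             (J microphones, F frequency bins, M window frames).
    Returns R of shape (F, J, J) with R[f] = (1/M) sum_m x(f,m) x(f,m)^H.
    """
    M = X_block.shape[2]
    return np.einsum('jfm,kfm->fjk', X_block, X_block.conj()) / M

rng = np.random.default_rng(0)
X_block = rng.standard_normal((3, 8, 50)) + 1j * rng.standard_normal((3, 8, 50))
R = estimate_autocorrelation(X_block)
```

Each R[f] is Hermitian by construction, which is a quick sanity check on the estimate.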
Reconstructing the Source Signals

Joint diagonalization or PARAFAC algorithms can be applied to find estimates of H(f) up to permutation and scaling of the columns
These ambiguities must be resolved consistently across all frequencies f
Use the identity
    s(f,q) = H(f)^+ x(f,q)    (H(f)^+ denotes the pseudo-inverse of H(f))
to recover the STFT of s
Apply the inverse STFT to obtain estimates of the original source signals
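The recovery of the source STFT from an estimate of H(f) can be sketched as below. The sketch assumes that the inversion is done with the pseudo-inverse at each frequency, which requires at least as many microphones as speakers. The function name unmix and the array layout are illustrative.

```python
import numpy as np

def unmix(X, H_f):
    """Recover the source STFT frequency by frequency.

    X:   complex array (J, F, Q), STFT of the microphone signals.
    H_f: complex array (F, J, I), mixing matrix H(f) for every frequency bin.
    Returns S_hat of shape (I, F, Q) with S_hat(f,q) = pinv(H(f)) x(f,q).
    """
    W = np.linalg.pinv(H_f)                  # stacked pseudo-inverses, shape (F, I, J)
    return np.einsum('fij,jfq->ifq', W, X)

rng = np.random.default_rng(0)
F, J, I, Q = 5, 3, 2, 7
H_f = rng.standard_normal((F, J, I)) + 1j * rng.standard_normal((F, J, I))
S = rng.standard_normal((I, F, Q)) + 1j * rng.standard_normal((I, F, Q))
X = np.einsum('fji,ifq->jfq', H_f, S)        # exact instantaneous mixture per frequency
S_hat = unmix(X, H_f)
```

With the exact H(f) the sources are recovered up to rounding; with an estimated H(f) the permutation and scaling ambiguities carry over to S_hat.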


Sensitivity to Estimation Errors


Problem: The reconstructed source signals are sensitive to estimation errors of H
We test the sensitivity by adding noise to H and reconstructing the source signals from the noisy versions
  - Try noise with standard deviation σ = 0.00, 0.01, ..., 0.10
  - To compare the reconstructed and original source signals, we consider their quotients
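A rough sketch of such a sensitivity test is given below. Several details are assumptions, since the slides do not state them: the noise is added to H(f) directly in the frequency domain, the sources are inverted with the pseudo-inverse, and the quotients are summarized by the median of |ŝ/s - 1|. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
F, J, I, Q = 16, 3, 2, 200   # frequency bins, microphones, speakers, window frames

def crandn(*shape):
    """Complex Gaussian samples with unit-variance real and imaginary parts."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

H_f = 0.5 * crandn(F, J, I)                  # mixing matrices H(f)
S = crandn(I, F, Q)                          # source STFT
X = np.einsum('fji,ifq->jfq', H_f, S)        # x(f,q) = H(f) s(f,q)

def median_quotient_deviation(sigma):
    """Perturb H(f) with noise of standard deviation sigma and compare s_hat / s to 1."""
    H_noisy = H_f + sigma * crandn(F, J, I)
    S_hat = np.einsum('fij,jfq->ifq', np.linalg.pinv(H_noisy), X)
    return np.median(np.abs(S_hat / S - 1))

sigmas = np.linspace(0.0, 0.10, 11)          # 0.00, 0.01, ..., 0.10
deviations = [median_quotient_deviation(sig) for sig in sigmas]
```

A quotient of exactly 1 means perfect reconstruction, so the deviation at σ = 0 is zero up to rounding and grows once noise is added.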

Settings of simulation:
  - 2 speakers, 3 microphones, L = 4
  - random speaker signals with standard deviation 1
  - random mixing channel with standard deviation 0.5
  - 50000 time samples, divided into 5 blocks
  - STFT using Hanning window of length 40 with 60% overlap




Now we try to estimate H(f) by joint diagonalization
Assume we can match the permutation and scaling ambiguities
  - For now we match them by comparing to the original H(f)
We measure the estimation error of H by transforming Ĥ(f) to the time domain and computing its root mean square error
We then reconstruct the source signals from the estimated H(f) and check their root mean square error
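Matching the ambiguities against the original H(f) and measuring the error in the time domain can be sketched as follows. The joint diagonalization itself is not shown; the estimate is simulated by permuting and scaling the columns of the true H(f) at every frequency. The least-squares scale per column and the names match_to_reference and rmse are assumptions made for this example.

```python
import numpy as np
from itertools import permutations

def match_to_reference(H_est, H_ref):
    """Undo column permutation and scaling of H_est (J, I) by comparing to H_ref (J, I)."""
    I = H_ref.shape[1]
    best, best_err = None, np.inf
    for perm in permutations(range(I)):
        cand = H_est[:, list(perm)]
        # Least-squares complex scale a per column, minimizing ||a * c - r||.
        scales = np.sum(cand.conj() * H_ref, axis=0) / np.sum(np.abs(cand) ** 2, axis=0)
        cand = cand * scales
        err = np.linalg.norm(cand - H_ref)
        if err < best_err:
            best, best_err = cand, err
    return best

def rmse(a, b):
    return np.sqrt(np.mean(np.abs(a - b) ** 2))

rng = np.random.default_rng(2)
L, J, I, F = 4, 3, 2, 40
H_time = 0.5 * rng.standard_normal((L, J, I))   # true channel in the time domain
H_f = np.fft.fft(H_time, n=F, axis=0)           # true H(f), shape (F, J, I)

# Simulated estimate: random column order and complex scaling at every frequency.
H_est = np.empty_like(H_f)
for f in range(F):
    perm = rng.permutation(I)
    scale = rng.uniform(0.5, 2.0, I) * np.exp(1j * rng.uniform(0, 2 * np.pi, I))
    H_est[f] = H_f[f][:, perm] * scale

H_matched = np.array([match_to_reference(H_est[f], H_f[f]) for f in range(F)])
H_time_est = np.fft.ifft(H_matched, axis=0).real[:L]   # back to the time domain
error = rmse(H_time_est, H_time)
```

Here the error is zero up to rounding because the simulated estimate contains no estimation noise; with a real joint diagonalization result the same two functions would report the residual error.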
Simulations

Simulation with real sound:
  - 2 speakers, 2 microphones, L = 512 (see picture below)
  - 354105 time samples (sample rate: 16000 Hz), divided into 5 blocks (overlapping by 10%)
  - STFT using Hanning window of length 2000 with 60% overlap
References



D. Nion, K. N. Mokios, N. D. Sidiropoulos, A. Potamianos: Batch
and Adaptive PARAFAC-Based Blind Separation of Convolutive
Speech Mixtures. In IEEE Transactions on Audio, Speech &
Language Processing, vol. 18, no. 6, Aug. 2010, pp. 1193.
R. K. Miranda: Métodos para melhoria da qualidade de
separação cega de fontes sonoras em ângulos oblíquos de
radiação.
A. Yeredor: Non-Orthogonal Joint Diagonalization in the Least-Squares
Sense with Application in Blind Source Separation. In
IEEE Transactions on Signal Processing, vol. 50, no. 7, July
2002, pp. 1545.
  - Incl. MATLAB code on https://www.eng.tau.ac.il/~arie/Links.html [2014-09-02]

Additional images:
  - http://www.mi.fu-berlin.de/images/mi/fu_logo.gif?1378202177 [2014-10-09]
  - https://www.daad.de/rise-weltweit/de/banner.jpg [2014-10-09]
  - http://www.shalab.phys.waseda.ac.jp/image/res08/yamahata.gif [2014-10-09]