Blind Source Separation of Convolutive Speech Mixtures

Yizheng Yuan
University of Brasília (UnB)
Department of Electrical Engineering (ENE)
Laboratory of Array Signal Processing
PO Box 4386, Zip Code 70.919-970, Brasília - DF
Homepage: http://www.pgea.unb.br/~lasp
Universidade de Brasília, Laboratório de Processamento de Sinais em Arranjos

In Cooperation with
243 research internships in 34 countries in year 2014

Outline
- Introduction
- Problem Setting
- Steps of Solution
  - Short-Time Fourier Transform
  - Solution Approach
  - Estimating the Autocorrelation Matrices
  - Reconstructing the Source Signals
- Sensitivity to Estimation Errors
- Simulations

Introduction
- Imagine several people talking at the same time.
- We wish to recognize the voice of each one.
- This procedure is called blind source separation.
- We record the sounds with a microphone array and try to reconstruct the sound emitted by each source.
- Current methods to accomplish this include ICA, TRINICON, joint diagonalization and PARAFAC.

Problem Setting
- I speakers
- J microphones
- Linear convolutive speech mixture: speaker signals can arrive with time delay
- Goal: reconstruct the speaker signals s from the microphone signals x by estimating the mixing channel H

Short-Time Fourier Transform
- Multiply the signals x
by a window function w that is non-zero only for a short period of time.
- Apply the DFT to the weighted signals within the frame of the window function.
- Repeat while shifting the window function along the time axis.
[Figure: successive windowed frames of the signal, each passed through a DFT]

Solution Approach: Step 1
- Apply the STFT.
- By the convolution property, the mixture becomes approximately instantaneous in each frequency bin f: $x(f,q) \approx H(f)\,s(f,q)$, where q is the index of the STFT window frame.
- The approximation improves as the STFT window length grows relative to the channel length L.

Solution Approach: Step 2
- Divide x into P blocks along the time axis.
- So that the blocks merge better, we let them overlap each other by 10%.
- When reassembling the entire data block, we take a weighted average of the samples in the overlapping intervals.
- We assume that the signals s are drawn from a random distribution, and for each block p we consider the autocorrelation matrices $R_x(f,p)$ and $R_s(f,p)$ of x and s.
- This leads to the following system of equations: $R_x(f,p) = H(f)\,R_s(f,p)\,H(f)^H$ for p = 1, ..., P.
- Assuming zero-mean, mutually uncorrelated speaker signals, the matrices $R_s$ are ideally diagonal.

Estimating the Autocorrelation Matrices
- Denote by $x(f,k_{pm})$ the DFT of the m-th STFT window of block p.
- Compute the average $\hat{R}_x(f,p) = \frac{1}{M} \sum_{m=1}^{M} x(f,k_{pm})\,x(f,k_{pm})^H$, where M denotes the number of STFT windows in a block.
- The estimates of $R_s$ are closer to diagonal the shorter the
length of the STFT windows is.
- Since Step 1 favors long windows, this raises the problem of choosing a "good" length.

Reconstructing the Source Signals
- Joint diagonalization or PARAFAC algorithms can be applied to find estimates of H(f) up to permutation and scaling of the columns.
- These ambiguities must be resolved consistently across all frequencies f.
- Use the identity $x(f,q) \approx H(f)\,s(f,q)$, that is $s(f,q) \approx H(f)^{\dagger}\,x(f,q)$ with the pseudo-inverse $H(f)^{\dagger}$, to recover the STFT of s.
- Apply the inverse STFT to obtain the original source signals.

Sensitivity to Estimation Errors
- Problem: the reconstructed source signals are sensitive to estimation errors of H.
- We test the sensitivity by adding noise to H and reconstructing the source signals from the noisy versions.
- We try noise with standard deviation σ = 0.00, 0.01, ..., 0.10.
- To compare the reconstructed and original source signals, we consider their quotients.

Settings of simulation:
- 2 speakers, 3 microphones, L = 4
- random speaker signals with standard deviation 1
- random mixing channel with standard deviation 0.5
- 50000 time samples, divided into 5 blocks
- STFT using Hanning window of length 40 with 60% overlap

Sensitivity to
Estimation Errors
- Now we try to estimate H(f) by joint diagonalization.
- Assume we can resolve the permutation and scaling ambiguities.
- For now we resolve them by comparing to the original H(f).
- We check the standard deviation of the error in the estimation of H by transforming Ĥ(f) into the time domain and computing its root mean square error.
- We reconstruct the source signals from the estimated H(f) and check their root mean square error.

Simulations
Simulation with real sound:
- 2 speakers, 2 microphones, L = 512 (see picture below)
- 354105 time samples (sample rate: 16000 Hz), divided into 5 blocks (overlapping by 10%)
- STFT using Hanning window of length 2000 with 60% overlap
[Picture not preserved]

References
- D. Nion, K. N. Mokios, N. D. Sidiropoulos, A. Potamianos: Batch and Adaptive PARAFAC-Based Blind Separation of Convolutive Speech Mixtures. In IEEE Transactions on Audio, Speech & Language Processing, vol. 18, no. 6, Aug. 2010, pp. 1193.
- R. K. Miranda: Métodos para melhoria da qualidade de separação cega de fontes sonoras em ângulos oblíquos de radiação.
- A.
Yeredor: Non-Orthogonal Joint Diagonalization in the Least-Squares Sense with Application in Blind Source Separation. In IEEE Transactions on Signal Processing, vol. 50, no. 7, July 2002, pp. 1545. Incl. MATLAB code on https://www.eng.tau.ac.il/~arie/Links.html [2014-09-02]

Additional images:
- http://www.mi.fu-berlin.de/images/mi/fu_logo.gif?1378202177 [2014-10-09]
- https://www.daad.de/rise-weltweit/de/banner.jpg [2014-10-09]
- http://www.shalab.phys.waseda.ac.jp/image/res08/yamahata.gif [2014-10-09]
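
Supplementary sketch: window length and the Step 1 approximation

The slides state that, after the STFT, the convolutive mixture is approximately a per-frequency product x(f,q) ≈ H(f) s(f,q), and that this holds better for longer windows relative to L. The following is a minimal NumPy sketch of that claim, not the author's code. The function names, the number of samples T, the seed, and the comparison of window lengths 40 and 400 are assumptions chosen for illustration; the 2 speakers, 3 microphones, L = 4, the standard deviations, and the 60% overlap follow the simulation settings in the slides.

```python
import numpy as np

def stft_frames(sig, win_len, hop):
    """Hann-windowed DFT frames: (channels, T) -> (channels, frames, win_len)."""
    window = np.hanning(win_len)
    starts = range(0, sig.shape[1] - win_len + 1, hop)
    frames = np.stack([sig[:, t:t + win_len] * window for t in starts], axis=1)
    return np.fft.fft(frames, axis=-1)

def narrowband_error(win_len, L=4, I=2, J=3, T=20000, seed=0):
    """Relative error of x(f,q) ~ H(f) s(f,q) for one window length.

    Hypothetical helper: I speakers, J microphones, channel length L.
    """
    rng = np.random.default_rng(seed)
    s = rng.standard_normal((I, T))           # speaker signals, std 1
    h = 0.5 * rng.standard_normal((J, I, L))  # mixing channel, std 0.5

    # Convolutive mixture: x_j(t) = sum_i sum_l h_ji(l) s_i(t - l)
    x = np.zeros((J, T))
    for j in range(J):
        for i in range(I):
            x[j] += np.convolve(s[i], h[j, i])[:T]

    hop = win_len * 2 // 5                    # 60% overlap between frames
    X = stft_frames(x, win_len, hop)          # (J, Q, F)
    S = stft_frames(s, win_len, hop)          # (I, Q, F)
    Hf = np.fft.fft(h, n=win_len, axis=-1)    # (J, I, F)

    # Per-bin instantaneous mixture predicted by the convolution property
    pred = np.einsum('jif,iqf->jqf', Hf, S)
    return np.linalg.norm(X - pred) / np.linalg.norm(X)

err_short = narrowband_error(40)    # window length used in the slides
err_long = narrowband_error(400)    # longer window, chosen for comparison
print(f"relative error, window 40:  {err_short:.3f}")
print(f"relative error, window 400: {err_long:.3f}")
```

The longer window should give the smaller relative error, which is the trade-off against the diagonality of the estimated autocorrelation matrices mentioned in the slides.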
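
Supplementary sketch: autocorrelation estimate and sensitivity to errors in H

The sensitivity test from the slides (add noise of standard deviation σ = 0.00, 0.01, ..., 0.10 to H, then reconstruct the sources) can be illustrated in a single frequency bin. This is a simplified sketch under stated assumptions, not the simulation from the slides: it assumes the model x(f,q) = H(f) s(f,q) holds exactly, draws the STFT coefficients directly as complex Gaussian samples, uses M = 500 frames, unmixes with the pseudo-inverse, and measures the root mean square error rather than the quotients mentioned in the slides. The helper `crandn` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, M = 2, 3, 500  # speakers, microphones, STFT frames in one block (M assumed)

def crandn(*shape, std=1.0):
    """Circular complex Gaussian samples with the given standard deviation."""
    real = rng.standard_normal(shape)
    imag = rng.standard_normal(shape)
    return std * (real + 1j * imag) / np.sqrt(2)

H = crandn(J, I, std=0.5)  # mixing matrix H(f) for one frequency bin
S = crandn(I, M)           # source STFT coefficients s(f, q), q = 1..M
X = H @ S                  # microphone STFT coefficients x(f, q)

# Sample autocorrelation matrices, averaged over the M frames
R_x = X @ X.conj().T / M
R_s = S @ S.conj().T / M   # close to diagonal for uncorrelated sources

# The sample estimates satisfy R_x = H R_s H^H exactly under this model
identity_holds = np.allclose(R_x, H @ R_s @ H.conj().T)

# Sensitivity test: perturb H, unmix with the pseudo-inverse, measure RMSE
sigmas = np.linspace(0.0, 0.10, 11)
rmse = []
for sigma in sigmas:
    H_noisy = H + crandn(J, I, std=sigma)
    S_hat = np.linalg.pinv(H_noisy) @ X
    rmse.append(np.sqrt(np.mean(np.abs(S_hat - S) ** 2)))

for sigma, err in zip(sigmas, rmse):
    print(f"sigma = {sigma:.2f}  RMSE = {err:.4f}")
```

With σ = 0 the reconstruction is exact up to rounding, so any error at σ > 0 in this sketch comes only from the perturbation of H. In the full pipeline the Step 1 approximation error and the joint diagonalization error add to it.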