EE-240/2009
PCA — Principal Component Analysis

Motivating example: an N x m data matrix x of sensor readings (m = 4 sensors, one column per sensor; the original plot highlights Medida 9 and Medida 14, i.e. measurements 9 and 14). The readings are strongly correlated, which suggests that fewer than four variables are enough to describe the data. Values as listed on the slide (the exact row/column layout of the 4-sensor table was lost in extraction):

x = [ 1.8196 1.0397 1.1126 1.3735 2.4124 3.2898 2.9436 1.6561
      0.4757 2.1659 1.4852 0.7544 2.3136 2.4068 2.5996 1.8136
      5.6843 6.8238 4.7767 4.1195 6.0911 4.2638 4.6507 7.8633
      5.5043 6.0801 8.0462 5.6324 3.6608 7.1652 5.0156 4.7301
      6.0169 4.2119 4.2940 7.4612 5.2228 3.1509 5.2104 3.6473
      1.4661 6.6255 4.6379 1.6211 9.2418 6.4692 3.6537 7.3074
      5.1152 3.3891 6.1951 4.3365 3.5918 7.6207 5.3345 2.5844
      6.8016 4.7611 4.2271 7.7247 5.4073 4.2080 6.7446 4.7212 ]

A two-dimensional example. Centered data matrix x with N = 50 samples of two variables (x1, x2); values as listed on the slide (the two-column layout was lost in extraction):

x = [  0.5632  0.0525  0.1992  0.9442  1.0090 -1.4904  0.0169  0.0914 -0.3765  0.5112
       0.2334 -0.1388 -0.7678  2.3856  0.4861 -2.3396 -1.2983  0.5951  0.0604  0.9069
       0.6180  1.5972 -0.3849 -0.0941 -0.5145 -0.6150 -0.3874 -0.4512  1.7430  1.4551
       0.0444 -0.1314 -1.0344 -0.9598 -0.2293 -0.6040  1.5369  0.4161  1.3969  2.3948
       1.4875  1.2579  1.4772  0.5944  0.1178 -0.5428 -3.0050 -1.9646  0.8500  0.9795
      -1.1580 -0.1741 -1.4907 -0.7346 -0.6379 -1.7942 -0.0970 -0.7409  1.0003 -0.4333
      -0.9971 -0.8030  0.2530 -0.1742 -0.1868 -0.2996  1.8173 -0.5746 -0.8211 -1.7688
      -2.0968  1.2367  2.3720  0.1575 -0.1781  0.3793  0.1620 -0.5782  1.5607  0.0264
       0.0675 -0.3321 -2.5254 -1.8191  1.3183  0.7651  0.5193  0.9366 -0.8986 -0.5059
       1.7935  1.6216 -0.1140  1.3910 -1.7429 -1.9603 -0.6703  0.1073  0.4392 -0.3092 ]

[Figure: scatter plot of x2 versus x1 for this data set (axes from -4 to 4).]

MATLAB code used on the slides to generate the data and draw the principal directions:

clear all;
N = 50;
sigma2 = [1 0.8 ; 0.8 1];            % covariance of the generating Gaussian
mi = repmat([0 0], N, 1);            % zero mean for every sample
xx = mvnrnd(mi, sigma2);             % N correlated Gaussian samples
xmean = mean(xx, 1);
[lin, col] = size(xx);
x = xx - repmat(xmean, lin, 1);      % center the data

[p, lat, exp] = pcacov(x);           % as on the slide: pcacov is called on the data matrix itself
                                     % (its documented input is a covariance matrix; see the help text further below)
plot(x(:,1), x(:,2), '+');
hold on
plot([p(1,1) 0 p(1,2)], [p(2,1) 0 p(2,2)])   % the two principal directions (unit vectors) drawn from the origin

[Figure: the scatter of x2 versus x1 with the two principal directions overlaid; the same data replotted in the rotated coordinates (xnew1, xnew2), where the cloud is aligned with the axes.]

How do we determine these directions?

[Figure: two Gaussian samples side by side: cov = [1 0 ; 0 1] gives an axis-aligned cloud, cov = [1 0.9 ; 0.9 1] gives an elongated, tilted cloud.]

For Gaussians:
- uncorrelated ⇒ independent;
- uncorrelated ⇔ diagonal covariance matrix.
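To make the Gaussian comparison above concrete, here is a minimal sketch (my own, not from the original slides; variable names are illustrative) that draws samples from the two covariance matrices and checks their sample correlation — near zero for the diagonal covariance, close to 0.9 for the correlated one:

% assumes MATLAB with the Statistics Toolbox (mvnrnd)
Ns = 500;
xa = mvnrnd([0 0], [1 0   ; 0   1], Ns);   % uncorrelated Gaussian cloud
xb = mvnrnd([0 0], [1 0.9 ; 0.9 1], Ns);   % correlated Gaussian cloud
corrcoef(xa)                               % off-diagonal entries close to 0
corrcoef(xb)                               % off-diagonal entries close to 0.9
subplot(1,2,1); plot(xa(:,1), xa(:,2), '+'); axis equal
subplot(1,2,2); plot(xb(:,1), xb(:,2), '+'); axis equal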
Problem statement. Given the (centered) data matrix X (N x m), find a matrix P such that the transformed data Y = XP has a diagonal sample covariance

    S = (1/(N-1)) Y^T Y.

Background — eigenvalues and eigenvectors. For a square matrix A (m x m), the eigenvalues λ and eigenvectors v are characterized by

    A v = λ v,   that is,   det(λI - A) = 0.

Since the characteristic polynomial Δ(λ) has degree m, Δ(λ) = 0 has m roots λ1, λ2, ..., λm, associated with eigenvectors v1, v2, ..., vm. Stacking them,

    A [v1 v2 ... vm] = [v1 v2 ... vm] Λ,   i.e.   A P = P Λ,

with P = [v1 v2 ... vm] and Λ = diag(λ1, λ2, ..., λm). In the case of distinct eigenvalues, P is invertible and

    P^{-1} A P = Λ.

Application to PCA. With Y = XP,

    S = (1/(N-1)) Y^T Y = (1/(N-1)) P^T X^T X P.

Take A = X^T X and let P = [v1 v2 ... vm] be its eigenvectors, normalized so that ||vi|| = 1. Because X^T X is symmetric, eigenvectors associated with distinct eigenvalues are orthogonal:

    λi vi^T vj = (A vi)^T vj = vi^T A^T vj = vi^T A vj = λj vi^T vj
    ⇒ (λi - λj) vi^T vj = 0   ⇒   vi^T vj = 0 for i ≠ j.

Together with ||vi|| = 1 this gives vi^T vj = δij, i.e. P^T P = I: P is orthogonal and P^{-1} = P^T. Therefore

    S = (1/(N-1)) P^T X^T X P = (1/(N-1)) P^{-1} X^T X P = (1/(N-1)) Λ,

which is diagonal, as required: the columns of P (the eigenvectors of X^T X) are the principal directions, and λi/(N-1) is the sample variance along direction vi.
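A minimal numerical check of this result (my own sketch, not from the slides; it assumes the centered data matrix x and the sample size N from the generation code above are in the workspace):

% Build P from the eigenvectors of x'*x and verify that Y = x*P has a diagonal sample covariance.
A = x' * x;                 % m-by-m, symmetric
[P, Lambda] = eig(A);       % columns of P: orthonormal eigenvectors
Y = x * P;                  % data expressed in the principal directions
S = (Y' * Y) / (N - 1);     % sample covariance of Y
disp(S)                     % off-diagonal entries ~ 0, diagonal equals Lambda/(N-1)
disp(Lambda / (N - 1))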
The same P can be obtained from the Singular Value Decomposition (SVD). Applying the SVD to the scaled data matrix X/sqrt(N-1),

    X / sqrt(N-1) = U Σ V^T,

gives

    (1/(N-1)) X^T X = (U Σ V^T)^T (U Σ V^T) = V Σ U^T U Σ V^T = V Σ² V^T,

since U^T U = I, and therefore

    V^T [ (1/(N-1)) X^T X ] V = Σ².

In other words:

    P = matrix of eigenvectors of X^T X,
    V = right matrix of the SVD of X/sqrt(N-1),

the two coincide, and P^{-1} X^T X P = Λ with Σ² = Λ/(N-1).

Numerical example (diagonalization method, using the 50 x 2 centered data matrix x above):

>> xx = x'*x
xx =
   70.3445   50.3713
   50.3713   55.6982

>> [P, Lambda] = eig(xx)
P =
   -0.7563    0.6543
   -0.6543   -0.7563
Lambda =
  113.9223         0
         0   12.1205

>> Lambda = inv(P)*xx*P      % OK: recovers the diagonal matrix of eigenvalues
Lambda =
  113.9223         0
    0.0000   12.1205

Note that Lambda(1,1) >> Lambda(2,2): the first principal direction carries most of the variance.

>> xnew = x * P              % data expressed in the principal directions

[Figure: scatter of xnew2 versus xnew1 — the rotated cloud is aligned with the axes.]

To reduce the dimension, keep only the dominant direction and discard the other one (zeroing its column):

Pelim =
   -0.7563         0
   -0.6543         0

>> xelim = x * Pelim         % projection onto the first principal direction only

SVD method, on the same data:

>> xn = x / sqrt(N-1);
>> [u, sigma, v] = svd(xn)
v =
   -0.7563    0.6543
   -0.6543   -0.7563          % OK: same as P from the diagonalization
sigma =                       % (nonzero block)
    1.5248         0
         0    0.4973

>> sigmadiag = sqrt(Lambda/(N-1))
sigmadiag =
    1.5248         0
         0    0.4973          % OK: matches sigma

The same directions with the Statistics Toolbox function pcacov:

>> help pcacov
 PCACOV Principal Component Analysis using the covariance matrix.
   [PC, LATENT, EXPLAINED] = PCACOV(X) takes the covariance matrix X and returns the
   principal components in PC, the eigenvalues of the covariance matrix of X in LATENT,
   and the percentage of the total variance in the observations explained by each
   eigenvector in EXPLAINED.

>> [pc, latent, explained] = pcacov(x)
pc =
   -0.7563    0.6543
   -0.6543   -0.7563
latent =
   10.6734
    3.4814
explained =
   75.4046
   24.5954

The principal components agree with the previous results (pc = P = v, OK). Because pcacov was given the data matrix x rather than a covariance matrix, LATENT contains the singular values of x, i.e. the square roots of the eigenvalues of x'*x:

>> sqrlamb = sqrt(Lambda)
sqrlamb =
   10.6734         0
         0    3.4814

and EXPLAINED is each of them as a percentage of their sum:

>> e = [sqrlamb(1,1) ; sqrlamb(2,2)]
>> soma = sum(e)
>> percent = e*100/soma
percent =
   75.4046
   24.5954
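For comparison, a minimal sketch (my own, not from the slides) of the documented usage of pcacov, in which it receives the sample covariance of x instead of x itself; LATENT then contains the variances along the principal directions, i.e. the eigenvalues of cov(x) = x'*x/(N-1):

% assumes the centered data matrix x and the sample size N from the example above
C = cov(x);                                   % equals x'*x/(N-1), since x is centered
[pc2, latent2, explained2] = pcacov(C);
% pc2        : the same principal directions (column signs may differ)
% latent2    : eigenvalues of cov(x) -- here approximately 2.3249 and 0.2474
% explained2 : 100*latent2/sum(latent2), the percentage of total variance per direction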
The function princomp performs the whole procedure directly on the data matrix:

>> help princomp
 PRINCOMP Principal Component Analysis (centered and scaled data).
   [PC, SCORE, LATENT, TSQUARE] = PRINCOMP(X) takes a data matrix X and returns the
   principal components in PC, the so-called Z-scores in SCORES, the eigenvalues of the
   covariance matrix of X in LATENT, and Hotelling's T-squared statistic for each data
   point in TSQUARE.

>> [pc, score, latent1, tsquare] = princomp(x)
pc =
   -0.7563    0.6543
   -0.6543   -0.7563
score =
   -0.4603    0.3288
   -0.7684   -0.5837
    1.1161   -0.9879
   -0.5393    0.1579
   -2.1222    1.1932
   -0.4896    0.3437
     ...        ...
latent1 =
    2.3249
    0.2474
tsquare =
    0.5281
    1.6315
    4.4813
    0.2260
    7.6929
    0.5806
     ...

Again pc = P (OK). The scores are the data expressed in the principal directions:

>> sco = x*P
sco =
   -0.4603    0.3288
   -0.7684   -0.5838
    1.1161   -0.9880
   -0.5393    0.1580
   -2.1223    1.1933
   -0.4896    0.3437
     ...        ...       % matches score up to rounding

and LATENT contains the eigenvalues of the sample covariance matrix:

>> eig(x'*x/(N-1))
ans =
    0.2474
    2.3249

Hotelling's T² statistic (TSQUARE) measures, for each data point, its distance from the center of the cloud in the metric of the sample covariance. The 100·α % confidence region is characterized by

    T²_α = [ m (N-1)(N+1) / ( N (N-m) ) ] · F_α(m, N-m),

where F_α(m, N-m) is the α-quantile of the F distribution with m and N-m degrees of freedom.
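A minimal sketch (my own, not from the slides) of how this threshold can be applied to the example data; it assumes MATLAB with the Statistics Toolbox (finv, princomp — in recent releases princomp has been replaced by pca):

% x and N as in the example above; flag points outside the 95% confidence region
m     = size(x, 2);
alpha = 0.95;
T2lim = m*(N-1)*(N+1) / (N*(N-m)) * finv(alpha, m, N-m);   % T^2 limit from the formula above
[pc, score, latent1, tsquare] = princomp(x);
outliers = find(tsquare > T2lim)   % indices of the points outside the confidence region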
Non-Gaussian data — is PCA still OK?

[Figure: several pairs of scatter plots of non-Gaussian data, shown in the original coordinates (x1, x2) and in the principal-direction coordinates (xnew1, xnew2).]

Centering step: the mean is removed from each column before PCA is applied:

>> xm = mean(xx);
>> [lin, col] = size(xx);
>> xm = repmat(xm, lin, 1);
>> xx = xx - xm;

[Figure: the cloud of points before and after the mean is removed.]

[Figure: x1 and x2 plotted against the sample index k (k = 0 ... 200), and the corresponding scatter plot of x2 versus x1.]

A final SVD example with a small data matrix (N = 10 samples, 2 variables):

x =
   -0.4326   -1.6656
    0.1253    0.2877
   -1.1465    1.1909
    1.1892   -0.0376
    0.3273    0.1746
   -0.1867    0.7258
   -0.5883    2.1832
   -0.1364    0.1139
    1.0668    0.0593
   -0.0956   -0.8323

>> [u, sigma, v] = svd(x)

u = (10 x 10 orthogonal matrix; with the full SVD, u is N x N — full listing on the original slide)

sigma =
    3.2980         0
         0    2.0045
         0         0
       ...       ...       % remaining rows are zero (sigma is N x 2)

v =
    0.2876   -0.9578
   -0.9578   -0.2876

Thank you very much!
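Appendix: a minimal sketch (my own, not part of the original slides) reproducing this last example. It uses the economy-size SVD, which avoids forming the full N x N matrix u, and checks that the squared singular values equal the eigenvalues of x'*x:

% the 10 x 2 matrix from the final example (rows as reconstructed above)
x = [ -0.4326  -1.6656;  0.1253   0.2877; -1.1465   1.1909;  1.1892  -0.0376;
       0.3273   0.1746; -0.1867   0.7258; -0.5883   2.1832; -0.1364   0.1139;
       1.0668   0.0593; -0.0956  -0.8323 ];

[u, sigma, v]    = svd(x);           % full SVD: u is 10 x 10, sigma is 10 x 2
[ue, sigmae, ve] = svd(x, 'econ');   % economy SVD: ue is 10 x 2, sigmae is 2 x 2, same v (up to sign)

diag(sigmae).^2                      % squared singular values: about 10.88 and 4.02 ...
eig(x'*x)                            % ... the eigenvalues of x'*x (possibly listed in a different order)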