Classification
Adriano Joaquim de O Cruz ©2002 NCE/UFRJ [email protected]

Classification
- Technique that assigns samples to previously known classes.
- May be crisp or fuzzy.
- Some methods are trained (e.g. the MLP); others, such as K-NN and fuzzy K-NN, require no training stage.

K-NN and Fuzzy K-NN Classification
- Classes are identified by patterns: previous knowledge about the problem classes.
- The methods classify by the k nearest neighbours.
- They are not restricted to a specific distribution of the samples.

Classification: Crisp K-NN

Crisp K-NN
- Supervised clustering (classification) method: the classes are defined beforehand.
- Classes are characterized by sets of elements; the number of elements may differ among classes.
- The main idea is to assign the sample to the class that contains more of its neighbours.

[Figure: five classes containing the labelled patterns w1..w14 and an unclassified sample s; with 3 nearest neighbours, sample s is closest to pattern w6 in class 5.]

Crisp K-NN
- Consider W = {w1, w2, ..., wt} a set of t labelled data.
- Each object wi is defined by l characteristics, wi = (wi1, wi2, ..., wil).
- y is an unclassified input element.
- k is the number of closest neighbours of y.
- E is the set of the k nearest neighbours (NN).

Crisp K-NN
- Let t be the number of elements that identify the classes.
- Let c be the number of classes.
- Let W be the set that contains the t elements.
- Each cluster is represented by a subset of the elements of W.

Crisp K-NN algorithm
set k
{calculating the NN}
for i = 1 to t
    calculate the distance from y to wi
    if i <= k then
        add wi to E
    else if wi is closer to y than any previous NN then
        delete the farthest neighbour and include wi in the set E

Crisp K-NN algorithm (cont.)
Determine the majority class represented in the set E and include y in this class.
if there is a draw then
    calculate the sum of the distances from y to the neighbours in each tied class
    if the sums are different then
        add y to the class with the smallest sum
    else
        add y to the class where the last minimum was found
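The neighbour search and the draw rule above map directly to code. The following is a minimal sketch in Python with numpy (illustrative, not part of the original slides); W, labels, y and k follow the definitions above, and the helper name crisp_knn is hypothetical.

import numpy as np

def crisp_knn(W, labels, y, k):
    """Classify y by majority vote among its k nearest patterns in W;
    draws are broken by the smallest sum of distances, as in the slides."""
    W, y = np.asarray(W, dtype=float), np.asarray(y, dtype=float)
    dists = np.linalg.norm(W - y, axis=1)        # distance from y to every wi
    nn = np.argsort(dists)[:k]                   # indexes of the k NN (set E)
    nn_labels = np.array([labels[i] for i in nn])

    classes, votes = np.unique(nn_labels, return_counts=True)
    tied = classes[votes == votes.max()]         # majority class(es)
    if len(tied) == 1:
        return tied[0]
    # Draw: smallest sum of distances from y to the neighbours in each class.
    sums = {c: dists[nn][nn_labels == c].sum() for c in tied}
    return min(sums, key=sums.get)

# Example: three patterns, two classes, one query point.
# crisp_knn([[0, 0], [1, 0], [5, 5]], [1, 1, 2], [0.5, 0.2], k=3)  # -> 1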
Classification: Fuzzy K-NN

Fuzzy K-NN
- The basis of the algorithm is to assign membership as a function of the object's distance from its k nearest neighbours and of those neighbours' memberships in the possible classes.
- J. Keller, M. Gray, J. Givens, "A Fuzzy K-Nearest Neighbor Algorithm", IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-15, no. 4, July/August 1985.

[Figure: the same five classes and labelled patterns w1..w14 as in the crisp K-NN example.]

Fuzzy K-NN
- Consider W = {w1, w2, ..., wt} a set of t labelled data.
- Each object wi is defined by l characteristics, wi = (wi1, wi2, ..., wil).
- y is an unclassified input element.
- k is the number of closest neighbours of y; E is the set of the k nearest neighbours (NN).
- c is the number of classes.
- μi(y) is the membership of y in class i.
- μij is the membership in the ith class of the jth vector of the labelled set (wj is labelled in class i).

Fuzzy K-NN algorithm
- set k; the k nearest neighbours of y are found exactly as in the crisp K-NN algorithm, producing the set E.
- Then calculate μi(y) for i = 1 to c using

$$\mu_i(y) = \frac{\sum_{j=1}^{k} \mu_{ij} \left( 1 / \lVert y - x_j \rVert^{2/(m-1)} \right)}{\sum_{j=1}^{k} \left( 1 / \lVert y - x_j \rVert^{2/(m-1)} \right)}$$

where xj denotes the jth of the k nearest neighbours in E and m > 1 is the fuzzy constant.

Computing μij
μij can be assigned in several ways:
- Give the labelled samples complete membership in their known class and non-membership in all others.
- Assign membership based on the distance from their class mean.
- Assign membership based on the distance from labelled samples of their own class and those from other classes.
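As a companion to the formula above, a minimal numpy sketch (illustrative): U is the c x t matrix of the μij values, assumed precomputed by one of the three schemes just listed, and the function name fuzzy_knn is hypothetical.

import numpy as np

def fuzzy_knn(W, U, y, k, m=2.0, eps=1e-12):
    """Return mu_i(y) for every class i (Keller et al., 1985)."""
    W, U, y = np.asarray(W, float), np.asarray(U, float), np.asarray(y, float)
    dists = np.linalg.norm(W - y, axis=1)
    nn = np.argsort(dists)[:k]                        # the set E of k NN
    w = 1.0 / (dists[nn] ** (2.0 / (m - 1.0)) + eps)  # inverse-distance weights
    return (U[:, nn] * w).sum(axis=1) / w.sum()       # one membership per class

The eps term only guards against a zero distance when y coincides with a labelled pattern.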
Classification: ICC-KNN System

ICC-KNN System
- Non-parametric statistical pattern recognition system.
- Combines FCM, fuzzy K-NN and ICC.
- Handles data whose classes come in several different shapes.

ICC-KNN System
- Divided into two modules.
- First module (training): chooses the best patterns to use with K-NN, the best fuzzy constant (m) and the best number of neighbours (k).
- Second module (classification): uses fuzzy K-NN to classify.

ICC-KNN First Module (Classification Module)
- Finds structure in the data sample.
- Divided into two phases.
- First phase of training: finds the best patterns for fuzzy K-NN.
  - FCM: applied to each class using many numbers of categories.
  - ICC: finds the best number of categories to represent each class.

ICC-KNN First Phase
Results of applying FCM and ICC:
- Patterns for K-NN, which are the centres of the FCM run chosen by ICC.
- Number of centres, which is the number of categories selected by applying ICC to all FCM runs.

ICC-KNN Second Phase
- Second phase of training: evaluates the fuzzy constant and the number of neighbours that achieve the best K-NN performance.
- Tests several m and k values.
- Finds the m and k with the maximum rate of crisp hits.

ICC-KNN Pattern Recognition Module
- Assigns each datum to its class.
- Uses the chosen patterns, m and k, to classify the data.

[Block diagram: in the Classification Module, each class 1..s is fed to FCM runs for c = cmin..cmax, producing the membership matrices Uic; ICC selects the best run of each class, yielding the patterns wi. The resulting set W (with memberships Uw) feeds fuzzy K-NN, which selects m and k. In the Pattern Recognition Module, fuzzy K-NN with W, m and k classifies the unclassified data.]

ICC-KNN
- Let R = {r1, r2, ..., rn} be the set of samples; each sample ri belongs to one of s known classes.
- Let Uic be the inclusion matrix for class i with c categories.
- Let Vic be the centre matrix for class i with c categories.
- Let wi be the best Vic of each class.
- Let W be the set of sets of centres wi.

ICC-KNN algorithm: Classification Module, first phase of training
Step 1. Set m.
Step 2. Set cmin and cmax.
Step 3. For each known class s:
    Generate the set Rs with the points from R belonging to class s.
    For each number of categories c in the interval [cmin, cmax]:
        Run FCM with c categories on the set Rs, generating Usc and Vsc.
        Calculate ICC for Rs and Usc.
    Define the patterns ws of class s as the matrix Vsc that maximizes ICC.
Step 4. Generate the set W = {w1, ..., ws}.

ICC-KNN algorithm: second phase of training
Step 5. Set mmin and mmax.
Step 6. Set kmin and kmax.
    For each m in [mmin, mmax]:
        For each k in [kmin, kmax]:
            Run fuzzy K-NN with the patterns from the set W, generating Umk.
            Calculate the number of crisp hits for Umk.
Step 7. Choose the m and k that yield the best crisp-hit figures.
Step 8. If there is a draw:
    if the k values differ, choose the smaller k;
    else choose the smaller m.

ICC-KNN algorithm: Pattern Recognition Module
Step 9. Apply fuzzy K-NN, using the patterns from the set W and the chosen parameters m and k, to the data to be classified.
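Steps 5 to 8 amount to a grid search over m and k. Below is a sketch under stated assumptions (illustrative, not the original implementation): centres and centre_U stand for the set W flattened into a pattern matrix plus its class membership matrix, both produced by the first phase; true_classes holds class indexes aligned with the rows of centre_U; fuzzy_knn is the sketch given earlier.

import numpy as np

def second_phase(centres, centre_U, samples, true_classes,
                 m_values=(1.1, 1.25, 1.5, 2.0), k_values=(3, 4, 5, 6, 7)):
    """Choose m and k with the most crisp hits; on a draw prefer the
    smaller k, then the smaller m (step 8)."""
    best = None
    for m in m_values:
        for k in k_values:
            hits = 0
            for x, cls in zip(samples, true_classes):
                mu = fuzzy_knn(centres, centre_U, x, k, m)
                hits += int(np.argmax(mu) == cls)   # crisp hit: top class wins
            key = (hits, -k, -m)                    # encodes the draw rule
            if best is None or key > best[0]:
                best = (key, m, k)
    return best[1], best[2]                         # the chosen m and k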
ICC-KNN results
[Figure: the data set used in the experiments.]

ICC-KNN results
- 2000 samples, 4 classes, 500 samples in each class.
- Classes 1 and 4: concave.
- Classes 2 and 3: convex, with an elliptic format.

ICC-KNN results
First phase of training:
- FCM applied to each class.
- Training data: 80% (400 samples from each class).
- c = 3..7 and m = 1.25.
- ICC applied to the results: classes 1 and 4 get 4 categories; classes 2 and 3 get 3 categories.

ICC-KNN results
Second phase of training, running fuzzy K-NN with:
- the patterns from the first phase of training (PFT);
- random patterns, for comparison;
- k = 3 to 7 neighbours;
- m = {1.1, 1.25, 1.5, 2}.

ICC-KNN results
Conclusion: K-NN is more stable with respect to the value of m for the PFT patterns.

ICC-KNN results: training data (rows: true classes; columns: assigned classes)

             PFT patterns             Random patterns
Classes    1    2    3    4         1    2    3    4
1        388   10    0    2       213   66    0  121
2         14  379    0    7        19  380    0    1
3          0    0  376   24         3    0  324   73
4          0    1    2  397         4   46    1  349

- PFT patterns: m = 1.5 and k = 3, 96.25% hits.
- Random patterns: m = 1.1 and k = 3, 79.13% hits.

ICC-KNN results: test data (rows: true classes; columns: assigned classes)

             PFT patterns             Random patterns
Classes    1    2    3    4         1    2    3    4
1         97    2    0    1        53   27    0   20
2          4   93    0    3         4   96    0    0
3          0    0   90   10         0    0   82   18
4          0    0    1   99         0   15    0   85

- PFT patterns: 94.75% hits. Random patterns: 79% hits.

ICC-KNN x Others
- Compared against FCM, FKCN, GG and GK.
- Training phase (FTr): training data, c = 4 and m = {1.1, 1.25, 1.5, 2}.
- Categories are associated with classes by the sum-of-membership criterion:
  - compute the sum of the membership degrees of the points of each class in each category;
  - a class may be represented by more than one category.

ICC-KNN x Others
Test phase:
- Test data; the methods are initialized with the centres obtained in the training phase.
- Compute the membership degree of each point in each category.
- When a class is represented by more than one category, the membership degree in the class is the sum of the membership degrees of the point in the categories that represent the class.

ICC-KNN x Others (R and N are hit rates; T is the running time)

     ICC-KNN   KNN (random)   FCM      FKCN     GG       GK
R    94.75%    79%            66%      66%      69%      84%
N    95.75%    83%            70.75%   70.75%   69%      89.5%
T    36.5 s    23.11 s        2.91 s   2.59 s   22.66 s  18.14 s

- GK: 84% for m = 2.
- FCM and FKCN: 66% for m = 1.1 and m = 1.25.
- GG initialized with FCM: 69% for m = 1.1 and 1.25.
- GG with random initialization: 57.75% for m = 1.1 and 25% for m = 1.5.

GK
[Figure: GK classification result.]

GK
Classes    1    2    3    4
1         77    6    0   17
2          6   94    0    0
3          0    0   97    3
4          0    0   32   68

Classification: KNN+Fuzzy C-Means System

KNN+Fuzzy C-Means algorithm
- The idea is a two-layer clustering algorithm.
- First, an unsupervised tracking of the cluster centres is made using K-NN rules.
- The second layer involves one iteration of the fuzzy c-means to compute the membership degrees and the new fuzzy centres.
- Ref.: N. Zahit et al., Fuzzy Sets and Systems 120 (2001) 239-247.

First Layer (K-NN)
- Let X = {x1, ..., xn} be a set of n unlabelled objects; c is the number of clusters.
- The first layer partitions X into c cells using the first part of K-NN.
- Each cell Ei (1 <= i <= c) is represented as (yi, the K-NN of yi, Gi).
- Gi is the centre of cell Ei, defined as

$$G_i = \frac{1}{k+1} \sum_{x_k \in E_i} x_k$$

KNN-1FCMA settings
- Let X = {x1, ..., xn} be a set of n unlabelled objects.
- Fix c, the number of clusters.
- Choose m > 1 (the nebulisation factor).
- Set k = Integer(n/c - 1).
- Let I = {1, 2, ..., n} be the set of all indexes of X.

KNN-1FCMA algorithm, step 1
Calculate G0.
for i = 1 to c
    search in I for the index of the farthest object yi from Gi-1
    for j = 1 to n
        calculate the distance from yi to xj
        if j <= k then
            add xj to Ei
        else if xj is closer to yi than any previous NN then
            delete the farthest neighbour and include xj in the set Ei

KNN-1FCMA algorithm, step 1 (cont.)
- Include yi in the set Ei and calculate Gi.
- Delete the index of yi and the indexes of the K-NN of yi from I.
- If I ≠ ∅ then, for each remaining object x:
  - determine the minimum distance to any centre Gi of Ei;
  - classify x to the nearest centre;
  - update all centres.

KNN-1FCMA algorithm, step 2
Compute the matrix U according to

$$\mu_{ik} = \left[ \sum_{l=1}^{c} \left( \frac{d_{ik}}{d_{lk}} \right)^{2/(m-1)} \right]^{-1}$$

Calculate all the fuzzy centres using

$$v_{ij} = \frac{\sum_{e=1}^{n} (\mu_{ie})^m \, x_{ej}}{\sum_{e=1}^{n} (\mu_{ie})^m}$$

Results KNN-1FCMA

                     Misclassification rate     Avg. number of iterations
Data     Elem   c    FCMA    KNN-1FCMA          FCMA
S1         20   2      0         0                11
S2         60   3      1         0                 8
S3         80   4      2         0                10
S4        120   6     13        13                19
IRIS23    150   2     14        13                10
IRIS      100   3     16        17                12
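To make the two layers concrete, here is a minimal numpy sketch of the KNN-1FCMA procedure described above. It is an illustrative reconstruction, not the authors' code: G0 is assumed to be the centroid of X, the neighbours of each yi are drawn from the still-unassigned indexes so that the cells stay disjoint, and the reassignment of the leftover objects in I (when n is not a multiple of c) is omitted for brevity.

import numpy as np

def knn_1fcma(X, c, m=2.0, eps=1e-12):
    X = np.asarray(X, dtype=float)
    n = len(X)
    k = n // c - 1                         # k = Integer(n/c - 1)
    I = set(range(n))                      # indexes not yet assigned
    G = X.mean(axis=0)                     # G0: assumed centroid of the whole set
    centres = []
    for _ in range(c):                     # first layer: build the c cells
        yi = max(I, key=lambda j: np.linalg.norm(X[j] - G))   # farthest from G(i-1)
        rest = sorted(I - {yi}, key=lambda j: np.linalg.norm(X[j] - X[yi]))
        cell = [yi] + rest[:k]             # E_i = y_i plus its k nearest neighbours
        G = X[cell].mean(axis=0)           # G_i = (1/(k+1)) * sum over E_i
        centres.append(G)
        I -= set(cell)
    V = np.array(centres)

    # Second layer: one FCM iteration (membership update, then centre update),
    # per the two formulas of step 2 above.
    d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2) + eps   # d_ik
    U = 1.0 / ((d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0))).sum(axis=1)
    Um = U ** m
    V = (Um @ X) / Um.sum(axis=1, keepdims=True)
    return U, V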