Clustering Approach for Unsupervised Image Classification using Genetic Algorithm

number: 
1870
English
department: 
Degree: 
Author: 
Kholood Jamal Moulod
Supervisor: 
Dr. Bara'a Ali Attea
Dr. Sawsan Kamal Thamer
year: 
2008

Clustering is a discipline devoted to finding and describing cohesive or homogeneous groups in data, the clusters. An example of a clustering problem is the automatic discovery of meaningful regions in a digitized image. Data clustering is an important process in pattern recognition and machine learning, and clustering algorithms are used in many applications, such as image segmentation, vector and color image quantization, and compression. Finding an efficient clustering algorithm is therefore important to researchers in many disciplines.

The primary objective of this thesis is to use the Genetic Algorithm (GA) as a clustering tool for the unsupervised classification of grayscale image data. It presents two variants of GA: the first is based on the canonical GA, and the second on the compact GA (cGA). The characteristic components of each algorithm are presented in terms of individual and population representation, fitness function evaluation, evolution operators (selection, crossover, and update), and stopping condition. These two genetic algorithms are then coupled with a popular local-search clustering algorithm, K-means. The coupling aims to harness the strength of each algorithm: the exploration power of GA and the exploitation power of K-means. Moreover, the canonical perturbation operators, crossover and mutation, are imitated in a modified version of cGA in an attempt to improve its search power.

To show the applicability of the presented clustering algorithms, human medical MRI and Landsat images are used in the experiments, which also consider different numbers of clusters. Comparison results are reported qualitatively (i.e., visually) and quantitatively using quantization error, weighted error (the sum of cluster compactness, cluster separation, and quantization error), and compactness-separation ratio. The results demonstrate that cross-fertilization between GA and K-means benefits image data clustering: the coupled algorithms outperform both K-means and the genetic-based algorithms operating individually. Additionally, and more interestingly, the modified cGA outperforms the traditional cGA, which reflects the influence of the added perturbation operators, including two-point crossover and binary mutation.
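
As a rough illustration of the coupling idea, the sketch below treats one GA individual as a list of candidate gray-level centroids, refines it with a single K-means step, and scores it with a quantization error. Everything here is an assumption made for illustration: the function names, the toy pixel values, and the particular definition of quantization error (the mean, over non-empty clusters, of the average pixel-to-centroid distance) are not taken from the thesis, which may define these components differently.

```python
from typing import List


def assign(pixels: List[float], centroids: List[float]) -> List[int]:
    """Label each gray level with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
        for p in pixels
    ]


def kmeans_step(pixels: List[float], centroids: List[float]) -> List[float]:
    """One K-means refinement: move each centroid to the mean of its members.

    A centroid with no members is left where it is (an illustrative choice).
    """
    labels = assign(pixels, centroids)
    refined = []
    for k, c in enumerate(centroids):
        members = [p for p, lab in zip(pixels, labels) if lab == k]
        refined.append(sum(members) / len(members) if members else c)
    return refined


def quantization_error(pixels: List[float], centroids: List[float]) -> float:
    """Mean over non-empty clusters of the average pixel-to-centroid distance.

    This is one common definition, assumed here; lower values are better.
    """
    labels = assign(pixels, centroids)
    per_cluster = []
    for k, c in enumerate(centroids):
        dists = [abs(p - c) for p, lab in zip(pixels, labels) if lab == k]
        if dists:
            per_cluster.append(sum(dists) / len(dists))
    return sum(per_cluster) / len(per_cluster)


# Hypothetical toy data: six gray levels forming two obvious groups.
pixels = [10, 12, 14, 200, 202, 204]
candidate = [0.0, 255.0]                  # centroids a GA might propose
refined = kmeans_step(pixels, candidate)  # local-search refinement

print(refined)
print(quantization_error(pixels, candidate))
print(quantization_error(pixels, refined))
```

In this toy case the refinement step moves the proposed centroids onto the two group means and lowers the error, which mirrors the division of labor described in the abstract: the genetic search proposes candidates broadly, and K-means improves each candidate locally.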