Please use this identifier to cite or link to this item:
http://hdl.handle.net/20.500.12358/18783
Title | Enhancing and Combining a Recent K-means Family of Algorithms for Better Results |
---|---|
Abstract | Clustering is widely used to explore and understand large collections of data. The K-means clustering method is one of the most popular approaches because it is easy to use and simple to implement. In this thesis, the researcher introduces a Distance-based Initialization Method for the K-means clustering algorithm (DIMK-means), developed to select the set of initial centroids carefully so that it achieves higher accuracy than standard K-means, whose random selection of initial centroids yields low-accuracy results. This initialization method is as fast and as simple as the K-means algorithm itself, at almost the same low cost, which makes it attractive in practice. The researcher also introduces a Density-based Split-and-Merge K-means clustering Algorithm (DSMK-means), developed to address the stability problems of K-means clustering and to improve clustering performance on datasets that contain clusters of different complex shapes, noise, or outliers. Based on a set of experiments, this research concludes that the developed algorithms are better able to find high-accuracy results than other algorithms, especially because they can process datasets containing clusters that differ in shape and density, are not linearly separable, or include outliers and noise. The experimental datasets were chosen from artificial examples and from real-world examples in the UCI Machine Learning Repository. |
Authors | |
Supervisors | |
Type | Master's thesis |
Date | 2013 |
Language | English |
Publisher | Islamic University of Gaza (الجامعة الإسلامية - غزة) |
Citation | |
License | |
Collections | |
Files in this item | ||
---|---|---|
file_1.pdf | 3.527Mb |