|Title||An Improvement for DBSCAN Algorithm for Best Results in Varied Densities|
Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a foundational algorithm for density-based clustering. It can find clusters of different shapes and sizes in large data sets that contain noise and outliers. However, it fails to handle local density variations within a cluster. In this thesis, an enhancement of the DBSCAN algorithm is proposed that detects clusters of different shapes and sizes that also differ in local density. We introduce three new algorithms. The first, Vibration Method DBSCAN (VMDBSCAN), finds the "core" of each cluster produced by DBSCAN and then "vibrates" points toward the cluster that has the maximum influence on them. The second, Dynamic Method DBSCAN (DMDBSCAN), selects several values of the neighbourhood radius (Eps), one for each density level, according to a k-dist plot. DBSCAN is run for each value of Eps so that all clusters at the corresponding density are found. Points that have already been clustered are ignored in subsequent runs, which avoids marking denser and sparser areas as a single cluster. The last algorithm, Vibration and Dynamic DBSCAN (VDDBSCAN), combines the first two to produce the best clustering results. It first determines the Eps value for each density level, then uses DBSCAN to find all clusters, and finally applies the vibration method of VMDBSCAN to solve the problem of split clusters. Experimental results are obtained from artificial data sets and three real data sets from the UCI repository. These data sets have varied densities, matching our goal for testing the proposed algorithms. The results show that our algorithms perform well compared with the original DBSCAN algorithm and the DVBSCAN algorithm. On the artificial data sets, we obtain the correct number of clusters.
On the real data sets, merging VMDBSCAN with DMDBSCAN decreases the error rate. It reaches 9.76% on the Iris data set, compared with 17.22% for DVBSCAN; 12.54% on the Haberman data set, compared with 32.65% for DVBSCAN; and 33.43% on the Glass data set, compared with 41.23% for DVBSCAN.
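The multi-level idea behind DMDBSCAN (one Eps per density level, DBSCAN run per level, already clustered points ignored afterwards) can be illustrated with a minimal pure-Python sketch. This is not the thesis implementation: the function names `k_dist`, `dbscan` and `multi_eps_dbscan`, the toy points, the `min_pts` value and the hand-picked Eps levels are all assumptions made for illustration. In particular, the Eps values here are chosen by eye from the sorted k-dist values rather than by whatever selection rule the thesis uses.

```python
import math

def k_dist(points, k):
    """Sorted distances from each point to its k-th nearest neighbour."""
    out = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(d[k - 1])
    return sorted(out)

def dbscan(points, idx, eps, min_pts):
    """Plain DBSCAN restricted to the indices in idx.
    Returns a list of clusters (sets of indices); noise is left out."""
    idx = list(idx)
    # Eps-neighbourhoods include the point itself (distance 0).
    neigh = {i: [j for j in idx if math.dist(points[i], points[j]) <= eps]
             for i in idx}
    core = {i for i in idx if len(neigh[i]) >= min_pts}
    seen, clusters = set(), []
    for i in idx:
        if i in seen or i not in core:
            continue
        cluster, stack = set(), [i]
        while stack:
            j = stack.pop()
            if j in seen:
                continue
            seen.add(j)
            cluster.add(j)
            if j in core:  # only core points expand the cluster
                stack.extend(n for n in neigh[j] if n not in seen)
        clusters.append(cluster)
    return clusters

def multi_eps_dbscan(points, eps_levels, min_pts):
    """Run DBSCAN once per Eps (smallest first) on the points not yet
    clustered. Returns one label per point; -1 marks noise."""
    remaining = set(range(len(points)))
    labels = [-1] * len(points)
    next_label = 0
    for eps in sorted(eps_levels):
        for cluster in dbscan(points, sorted(remaining), eps, min_pts):
            for i in cluster:
                labels[i] = next_label
            next_label += 1
            remaining -= cluster  # clustered points are ignored afterwards
    return labels

# Toy data (assumed): one dense square, one sparse square, one outlier.
points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
          (10, 10), (11, 10), (10, 11), (11, 11),
          (50, 50)]
print(k_dist(points, 2))  # jumps in this sorted list suggest the Eps levels
print(multi_eps_dbscan(points, eps_levels=[0.15, 1.1], min_pts=3))
```

On this toy input the sorted k-dist values fall into three groups (dense points, sparse points, the outlier), and an Eps just above each of the first two groups separates the dense and sparse squares while leaving the outlier as noise.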
|Publisher||Islamic University of Gaza (الجامعة الإسلامية - غزة)|