|Title||New Method to Improve Mining of Multi-Class Imbalanced Data|
Class imbalance is one of the challenging problems for data mining and machine learning techniques. Data in real-world applications often has an imbalanced class distribution. This occurs when most examples belong to a majority class and only a few examples belong to a minority class. In this case, standard classifiers tend to classify all examples as the majority class and completely ignore the minority class. Researchers have proposed many solutions to this problem at both the data and algorithmic levels. Most efforts concentrate on binary class problems. However, the binary case is not the only scenario in which the class imbalance problem arises. In multi-class data sets, it is much more difficult to define the majority and minority classes. Hence, multi-class classification on imbalanced data sets remains an important research topic. In our research, we propose a new approach based on SMOTE (Synthetic Minority Over-sampling TEchnique) and clustering that can deal with imbalanced data problems involving multiple classes. We implemented our approach using open source machine learning tools: Weka and RapidMiner. The experimental results show that our approach is effective on multi-class imbalanced data sets and can improve classification performance both on the minority class and on the whole data set. In the best case, our F-measure improved from 66.91 to 95.18. We compared our approach with other approaches and found that it achieved the best F-measure results in most cases.
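To make the oversampling idea concrete, the following is a minimal sketch of SMOTE-style interpolation applied to every class that is smaller than the largest one. It is not the implementation described in the abstract, which was built with Weka and RapidMiner and also includes a clustering step that is omitted here. The function name `smote_like_oversample`, the parameters `k` and `seed`, the choice to balance every class up to the size of the largest class, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def smote_like_oversample(X, y, k=3, seed=0):
    """Oversample every smaller class up to the size of the largest class.

    Each synthetic point lies on the line segment between a randomly chosen
    sample and one of its k nearest same-class neighbours (the core SMOTE idea).
    Hypothetical helper for illustration only; no clustering step is included.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = [list(row) for row in X], list(y)

    for label, count in counts.items():
        members = [row for row, lab in zip(X, y) if lab == label]
        if count == target or len(members) < 2:
            continue  # nothing to add, or no neighbour to interpolate with
        for _ in range(target - count):
            base = rng.choice(members)
            # k nearest same-class neighbours by squared Euclidean distance
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # interpolation factor in [0, 1)
            X_out.append([a + gap * (b - a) for a, b in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


# Toy three-class data set (made up): 6, 3 and 2 examples per class.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.3], [0.1, 0.0],
     [5.0, 5.0], [5.2, 5.1], [5.1, 4.9],
     [9.0, 0.0], [9.1, 0.2]]
y = ["a"] * 6 + ["b"] * 3 + ["c"] * 2

X_bal, y_bal = smote_like_oversample(X, y)
print(Counter(y_bal))  # every class now has 6 examples
```

Interpolating only between same-class neighbours keeps synthetic points inside the region already occupied by that class, which is why this family of methods is usually preferred over duplicating minority examples. In a multi-class setting the balancing target is itself a design choice; matching the largest class is only one option.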
|Publisher||The Islamic University|