DOI: 10.12928/jti.v7i2.
Software Defect Prediction Using Synthetic Minority Over-sampling Technique and Extreme Learning Machine
Abstract
Software testing is one of the crucial processes in the software development life cycle and directly influences software quality. One strategy to support the testing process is to predict which parts or modules of the software are prone to defects, so that testing can focus on those parts. In this research, a classifier model for predicting software defects was built. One of the most important problems in software defect prediction is the imbalanced data distribution between samples of the positive class (prone to defect) and the negative class. Therefore, in this research SMOTE (Synthetic Minority Over-sampling Technique) is implemented to handle the imbalanced data problem, and extreme learning machine (ELM) is implemented as the classification algorithm. For comparison with SMOTE-ELM, weighted-ELM, a modification of ELM that copes with the imbalance problem directly, is also evaluated. This research used the NASA MDP datasets PC1, PC2, PC3, and PC4. The experimental results using 10-fold cross-validation show that direct classification with ELM obtains worse results than SMOTE-ELM and weighted-ELM. When the imbalance ratio is not very small, SMOTE-ELM performs better than weighted-ELM. When the imbalance ratio is very small, the g-mean of weighted-ELM is higher than that of SMOTE-ELM, but its accuracy is lower. Therefore, for this software defect prediction case, it can be concluded that SMOTE is effective in increasing the generalization performance of the classifier on the minority class as long as the imbalance ratio is not very small.
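The abstract reports both accuracy and g-mean, and the two can disagree on imbalanced data. The following is a minimal sketch of why, assuming the usual definition of g-mean as the geometric mean of sensitivity and specificity (the abstract itself does not define it). The function names and the toy labels are hypothetical and are not taken from the NASA MDP datasets or from the experiments.

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp

def accuracy(y_true, y_pred):
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fn + tn + fp)

def g_mean(y_true, y_pred):
    # Assumed definition: sqrt(sensitivity * specificity)
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

# Hypothetical toy data: 5 defect-prone modules (label 1) out of 100
y_true = [1] * 5 + [0] * 95

# Classifier A ignores the minority class entirely
pred_a = [0] * 100

# Classifier B finds 4 of the 5 defects at the cost of 15 false alarms
pred_b = [1, 1, 1, 1, 0] + [1] * 15 + [0] * 80

print(accuracy(y_true, pred_a), g_mean(y_true, pred_a))
print(accuracy(y_true, pred_b), g_mean(y_true, pred_b))
```

In this toy case classifier A has the higher accuracy but a g-mean of zero, while classifier B gives up some accuracy to gain a much higher g-mean. That is the same kind of trade-off the abstract describes between SMOTE-ELM and weighted-ELM at very small imbalance ratios.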