Full name: IEEE Cyber Science and Technology Congress
Talk title: DYCUSBoost: Adaboost-based imbalanced learning using dynamic clustering and undersampling
Ensemble learning is a powerful approach to classifying imbalanced data in machine learning. AdaBoost, one such ensemble method, is often modified to deal with imbalanced problems. However, because sample weights change across AdaBoost iterations, the data distribution seen by each weak classifier is not consistent. As a result, resampling based on the feature space alone fails to reflect these shifts in distribution. To address this problem, this paper proposes DYCUSBoost, an AdaBoost-based imbalanced learning approach using dynamic clustering and undersampling. In DYCUSBoost, the clustering process is synchronized with the AdaBoost iterations: clusters formed at different stages of AdaBoost are adjusted, which lets DYCUSBoost track the shifts in distribution. The undersampling method assesses the importance of each cluster and draws more samples from the important ones. Experiments show that DYCUSBoost performs well on commonly accepted evaluation metrics such as AUC, G-Mean, and F-Measure. Moreover, its prediction stability exceeds that of most undersampling methods.
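The cluster-aware undersampling step can be illustrated with a minimal sketch. The abstract does not say how cluster importance is measured, so this example assumes it is the total AdaBoost sample weight inside each cluster. The function names `cluster_quotas` and `undersample`, and the `budget` parameter, are hypothetical and not taken from the paper.

```python
import random
from collections import defaultdict


def cluster_quotas(cluster_ids, weights, budget):
    """Split `budget` majority samples across clusters by weight mass.

    Assumption: a cluster's importance is the sum of the current
    AdaBoost weights of its members (the paper may use another measure).
    """
    mass = defaultdict(float)
    size = defaultdict(int)
    for c, w in zip(cluster_ids, weights):
        mass[c] += w
        size[c] += 1
    total = sum(mass.values())
    quotas = {}
    for c in mass:
        # Rounded proportional share, capped at the cluster size.
        # Because of rounding, quotas may not sum exactly to `budget`.
        quotas[c] = min(size[c], round(budget * mass[c] / total))
    return quotas


def undersample(cluster_ids, weights, budget, seed=0):
    """Return sorted indices of the majority samples kept for this round."""
    rng = random.Random(seed)
    quotas = cluster_quotas(cluster_ids, weights, budget)
    members = defaultdict(list)
    for i, c in enumerate(cluster_ids):
        members[c].append(i)
    chosen = []
    for c, idx in members.items():
        # Draw without replacement inside each cluster.
        chosen.extend(rng.sample(idx, quotas[c]))
    return sorted(chosen)


# Two clusters of four majority samples each; cluster 1 carries
# four times the weight mass of cluster 0.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
weights = [0.05] * 4 + [0.2] * 4
print(cluster_quotas(cluster_ids, weights, budget=5))
print(undersample(cluster_ids, weights, budget=5))
```

With these inputs the heavier cluster receives four of the five slots and the lighter one receives a single slot. In a full boosting loop, both the cluster assignments and the weights would be recomputed at each iteration, so the quotas follow the shifting distribution.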