A Purified Stacking Ensemble Framework for Cytology Classification
30th International Conference on Multimedia Modeling, 2024
Linyi Qian, Qian Huang, Yulin Chen, and Junzhou Chen
Abstract
Cancer remains a major threat to human life. Early detection and diagnosis can significantly reduce mortality, and cytology classification is an indispensable part of that process. Researchers have proposed many deep learning-based methods for automated cancer diagnosis. Nevertheless, because pathological features in cytology images are highly similar and high-quality datasets are scarce, existing approaches fall short of practical needs: single networks are not accurate enough, and ensemble methods rely on complex architectures. To address this, we propose a purified Stacking ensemble framework. It employs three homogeneous convolutional neural networks (CNNs) as base learners and integrates their outputs into a new dataset through a k-fold split and concatenation strategy. A distance weighted voting technique is then applied to purify this dataset, and a multinomial logistic regression model with a designed loss function is trained on the purified data as the meta-learner, which produces the final predictions. The method is evaluated on the FNAC, Ascites, and SIPaKMeD datasets, achieving accuracies of 99.85%, 99.24%, and 99.75%, respectively. These results surpass the current state-of-the-art (SOTA) methods, demonstrating the framework's potential for reducing screening workload and helping pathologists detect cancer.
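As an illustration of how base-learner outputs can be turned into a new dataset by a k-fold split and concatenation, the following is a minimal sketch of standard out-of-fold stacking. It is not the authors' implementation: the abstract does not specify the details of their strategy, and the names `kfold_indices`, `out_of_fold_features`, and `make_centroid_learner` are hypothetical. A toy nearest-centroid classifier stands in for the three CNNs so the example runs with the standard library alone.

```python
import math
import random
from typing import Callable, List, Sequence

Vector = List[float]
# A "fit" function trains on (X, y) and returns a predict-probabilities callable.
PredictProba = Callable[[Vector], Vector]
Fit = Callable[[Sequence[Vector], Sequence[int]], PredictProba]


def kfold_indices(n: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n-1 and deal them round-robin into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def out_of_fold_features(X, y, learners: Sequence[Fit], k: int = 5) -> List[Vector]:
    """One meta-feature row per sample: the out-of-fold class probabilities
    of every base learner, concatenated in learner order."""
    meta: List[Vector] = [[] for _ in X]
    folds = kfold_indices(len(X), k)
    for fit in learners:
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(len(X)) if i not in held]
            predict = fit([X[i] for i in train], [y[i] for i in train])
            # Each sample is predicted only by a model that never saw it.
            for i in held_out:
                meta[i].extend(predict(X[i]))
    return meta


def make_centroid_learner(temperature: float, n_classes: int) -> Fit:
    """Hypothetical stand-in for a CNN: softmax over negative centroid distances."""
    def fit(X, y):
        dims = len(X[0])
        sums = [[0.0] * dims for _ in range(n_classes)]
        counts = [0] * n_classes
        for x, label in zip(X, y):
            counts[label] += 1
            for d in range(dims):
                sums[label][d] += x[d]
        centroids = [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]

        def predict(x):
            logits = [-math.dist(x, c) / temperature for c in centroids]
            top = max(logits)
            exps = [math.exp(v - top) for v in logits]
            total = sum(exps)
            return [e / total for e in exps]

        return predict
    return fit


# Toy data: two well-separated 2-D classes, 10 samples each (assumed, not from the paper).
rng = random.Random(1)
X = [[rng.gauss(c * 3.0, 0.5), rng.gauss(c * 3.0, 0.5)] for c in (0, 1) for _ in range(10)]
y = [c for c in (0, 1) for _ in range(10)]
learners = [make_centroid_learner(t, n_classes=2) for t in (0.5, 1.0, 2.0)]
meta_X = out_of_fold_features(X, y, learners, k=5)
```

Here each row of `meta_X` has 3 learners × 2 classes = 6 entries. Predicting every sample with a model that did not train on it keeps the meta-learner from being fitted to base-learner outputs that are overconfident on their own training data.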
Framework of the purified Stacking ensemble
The overall workflow of the purified Stacking ensemble, where KFSC denotes k-fold split and concatenation, DW-Voting denotes distance weighted voting, and MLR with AW-Softmax denotes a multinomial logistic regression model with an adaptive weighted softmax loss function.
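The purification step can be pictured with a small sketch. The text here does not define DW-Voting precisely, so the code below shows only one plausible reading, stated as an assumption: each sample's label is checked against an inverse-distance-weighted vote of its nearest neighbours, and samples that disagree with the vote are dropped. The function names `dw_vote` and `purify`, the neighbour count `k`, and the toy data are all hypothetical.

```python
import math
from collections import defaultdict
from typing import List, Sequence, Tuple

Vector = List[float]


def dw_vote(query: Vector, X: Sequence[Vector], y: Sequence[int],
            k: int = 3, eps: float = 1e-12) -> int:
    """Return the label winning an inverse-distance-weighted vote
    among the k nearest neighbours of `query`."""
    neighbours = sorted((math.dist(query, x), label) for x, label in zip(X, y))[:k]
    votes = defaultdict(float)
    for distance, label in neighbours:
        votes[label] += 1.0 / (distance + eps)  # closer neighbours count more
    return max(votes, key=votes.get)


def purify(X: Sequence[Vector], y: Sequence[int], k: int = 3) -> Tuple[List[Vector], List[int]]:
    """Keep only samples whose label agrees with the vote of the other samples.
    Assumption: this leave-one-out filter is an illustrative reading of
    'purifying' the dataset, not the authors' exact rule."""
    X, y = list(X), list(y)
    kept_X, kept_y = [], []
    for i, (x, label) in enumerate(zip(X, y)):
        others_X = X[:i] + X[i + 1:]
        others_y = y[:i] + y[i + 1:]
        if dw_vote(x, others_X, others_y, k) == label:
            kept_X.append(x)
            kept_y.append(label)
    return kept_X, kept_y


# Toy meta-features (class probabilities); the last sample carries a suspicious label.
X = [[0.90, 0.10], [0.80, 0.20], [0.85, 0.15],
     [0.10, 0.90], [0.20, 0.80], [0.15, 0.85],
     [0.70, 0.30]]
y = [0, 0, 0, 1, 1, 1, 1]
clean_X, clean_y = purify(X, y, k=3)
```

In this toy run the sample `[0.70, 0.30]` labelled 1 sits next to the class-0 cluster, so its neighbours outvote it and it is removed, while the six consistent samples are kept. A meta-learner such as multinomial logistic regression would then be trained on `clean_X` and `clean_y`.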
Comparison with other methods.
Comparison with other methods, where P denotes the number of parameters, S denotes inference speed, Acc denotes accuracy, Pre denotes precision, and Rec denotes recall.
