Abstract: To control the noise that arises when semi-supervised classification algorithms learn from pseudo-labeled samples, a new method based on a Gaussian mixture model and a pseudo-validation set is proposed to quantify and analyze that noise, including distribution noise, which is hard to quantify and is usually ignored in existing research. Based on an analysis of the generalization error in the presence of noise, a traceable strategy for iterative classifier training is proposed to reduce the impact of noise among the pseudo-labeled samples. Combining this training strategy with ensemble learning yields an ensemble self-learning (ESL) algorithm that further improves the generalization ability of classification algorithms. The proposed method was compared with other state-of-the-art algorithms on six open datasets. The results show that it achieves the highest mean accuracy overall and the highest accuracy on 75% of the experimental datasets.