Abstract: Many features in high-dimensional data are redundant or irrelevant, and feature selection was introduced to address this problem. At the same time, many machine learning problems involve examples that naturally comprise multiple views and carry only a limited number of labels, which has made multi-view learning and semi-supervised learning active research topics. The authors therefore investigate how to select relevant features with minimum redundancy from multi-view data with a limited number of labels, and propose a semi-supervised feature selection and clustering framework. To remove redundant and irrelevant features, the framework exploits relations among views and relations among features within each view, and uses the limited labeled data to guide feature selection. The proposed framework is systematically evaluated on multi-view datasets, and the results demonstrate the effectiveness and potential of the proposed method.