Abstract: The sequential information bottleneck (sIB) algorithm is a widely used clustering algorithm. It describes data with a joint probability model, which expresses the relationship between data samples and data attributes well. However, the sIB algorithm treats all data attributes as equally important, which degrades clustering quality. To address this issue, this paper proposes the weighting joint probability model (WJPM). The proposed model uses mutual information to measure the importance of each data attribute, so as to highlight representative attributes and suppress redundant ones. Experiments on UCI datasets show that the sIB algorithm based on WJPM improves the F1 measure by 5.90% over the original sIB algorithm.
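The idea of weighting attributes by mutual information can be illustrated with a minimal sketch. The abstract does not state the weighting formula, so everything below is an assumption made for illustration only: each attribute's weight is taken to be its share of the mutual information I(X;Y) between samples X and attributes Y, and the weighted joint distribution is obtained by rescaling each attribute column and renormalizing. The function names (`joint_probability`, `attribute_mi_contributions`, `weighted_joint_probability`) are hypothetical and do not come from the paper.

```python
import numpy as np


def joint_probability(counts):
    """Normalize a samples-by-attributes count matrix into p(x, y)."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    if total <= 0:
        raise ValueError("counts must contain at least one positive entry")
    return counts / total


def attribute_mi_contributions(p_xy):
    """Per-attribute share of I(X;Y): sum_x p(x,y) * log(p(x,y) / (p(x) p(y))).

    ASSUMPTION: this per-attribute decomposition is one plausible importance
    score; the paper's actual measure may differ.
    """
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    expected = p_x * p_y
    # Where p(x,y) == 0 the ratio stays 1, so the term contributes 0.
    ratio = np.ones_like(p_xy)
    np.divide(p_xy, expected, out=ratio, where=p_xy > 0)
    contrib = (p_xy * np.log(ratio)).sum(axis=0)
    # Guard against tiny negative values caused by floating-point rounding.
    return np.clip(contrib, 0.0, None)


def weighted_joint_probability(counts):
    """Return (weighted joint distribution, attribute weights).

    ASSUMPTION: weights are the normalized contributions, applied column-wise
    to p(x, y) and followed by renormalization.
    """
    p_xy = joint_probability(counts)
    contrib = attribute_mi_contributions(p_xy)
    total = contrib.sum()
    n_attributes = p_xy.shape[1]
    if total <= 0:
        # No attribute is informative: fall back to the unweighted model.
        return p_xy, np.full(n_attributes, 1.0 / n_attributes)
    weights = contrib / total
    weighted = p_xy * weights
    return weighted / weighted.sum(), weights


if __name__ == "__main__":
    # Two samples, three attributes; the third occurs equally in both samples.
    p_w, w = weighted_joint_probability([[4, 0, 1], [0, 4, 1]])
    print("attribute weights:", w)
    print("weighted joint distribution:\n", p_w)
```

In this toy input the third attribute is distributed identically across both samples, so under the assumed scheme it carries no mutual information and receives zero weight, while the two discriminative attributes share the weight equally. A practical variant would likely smooth the weights so that no attribute is removed outright.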