Abstract: Image features are key to content-based image retrieval (CBIR). Most hand-crafted features struggle to represent breast masses effectively, and a semantic gap exists between low-level features and high-level semantics. To improve CBIR retrieval performance, this paper uses deep learning to extract high-level semantic features from images. Because the deep convolutional features of mammograms contain redundancy and noise in both the spatial and feature dimensions, this paper optimizes the spatial and semantic components of the deep features using a vocabulary tree and inverted files, and constructs two different deep semantic trees. To fully exploit the discriminative power of the deep convolutional features, the weights of tree nodes are refined according to the local characteristics of the deep features of breast images, and two node-weighting schemes are proposed to obtain better retrieval results. In this paper, 2,200 regions of interest (ROIs) extracted from the Digital Database for Screening Mammography (DDSM) serve as the dataset. Experimental results show that the proposed method effectively improves both the retrieval accuracy and the classification accuracy for mass ROIs, and that it scales well.
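To make the retrieval pipeline in the abstract concrete, the sketch below shows the generic vocabulary-tree idea the paper builds on: deep-feature vectors are quantized by descending a tree of centroids to a leaf "visual word", images are recorded in inverted files at the leaves, and nodes are weighted by inverse document frequency so rarer words contribute more to the similarity score. This is a minimal illustration, not the paper's implementation: the two-level toy tree, the 2-D features, and the IDF weighting are all assumptions standing in for deep convolutional features of mammogram ROIs and the paper's two proposed weighting schemes.

```python
import math
from collections import defaultdict

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Toy two-level vocabulary tree: two coarse centroids, each with two leaves.
# In practice the centroids come from hierarchical k-means on deep features.
TREE = [
    {"center": (0.0, 0.0), "leaves": [(-1.0, -1.0), (1.0, 1.0)]},
    {"center": (10.0, 10.0), "leaves": [(9.0, 9.0), (11.0, 11.0)]},
]

def quantize(feature):
    """Descend the tree greedily and return a global leaf-word id."""
    branch = min(range(len(TREE)), key=lambda i: dist2(feature, TREE[i]["center"]))
    leaves = TREE[branch]["leaves"]
    leaf = min(range(len(leaves)), key=lambda j: dist2(feature, leaves[j]))
    return branch * 2 + leaf  # 2 leaves per branch in this toy tree

def build_index(db):
    """db: {image_id: [feature, ...]} -> (inverted files, IDF node weights)."""
    inverted = defaultdict(lambda: defaultdict(int))  # word -> image -> count
    for img, feats in db.items():
        for f in feats:
            inverted[quantize(f)][img] += 1
    n = len(db)
    # Rare visual words (few images in their inverted file) get higher weight.
    idf = {w: math.log(n / len(imgs)) for w, imgs in inverted.items()}
    return inverted, idf

def query(feats, inverted, idf):
    """Score database images by IDF-weighted counts of shared visual words."""
    scores = defaultdict(float)
    for f in feats:
        w = quantize(f)
        for img, cnt in inverted.get(w, {}).items():
            scores[img] += idf.get(w, 0.0) * cnt
    return sorted(scores.items(), key=lambda kv: -kv[1])
```

Because scoring only touches the inverted files of the words hit by the query, retrieval cost grows with the number of quantized query features rather than with the database size, which is what gives tree-plus-inverted-file schemes their scalability.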