WU Ye, HE Lanxiang, ZHANG Xinyuan, FU Yunhe, LIU Xiaoming, HE Jianzhong
2024, 39(4):776-793. DOI: 10.16337/j.1004-9037.2024.04.002
Abstract:Diffusion magnetic resonance imaging (dMRI) is an advanced medical imaging modality that provides detailed insight into tissue microstructure by measuring the diffusion of water molecules within biological tissues, and it is increasingly being integrated into clinical practice for diagnosis and treatment. Within traditional Chinese medicine (TCM), dMRI has demonstrated unique potential, providing an empirical foundation for TCM’s “differentiation and treatment”. Its utility extends beyond precise disease diagnosis to disease progression monitoring and treatment efficacy evaluation, aligning with TCM’s principles of “preventive treatment” and “individualized treatment”. Nonetheless, the integration of dMRI into TCM faces notable challenges. This review examines recent applications of dMRI within TCM and discusses its prospects and limitations. By fostering interdisciplinary collaboration between medicine and engineering, particularly in TCM-intelligent imaging technology, this study aims to advance the application and development of dMRI in TCM diagnosis and treatment.
LIU Kaiwen, JIN Yingying, WANG Shouju
2024, 39(4):794-812. DOI: 10.16337/j.1004-9037.2024.04.003
Abstract:Neoadjuvant chemotherapy has become a standard treatment strategy for breast cancer, and magnetic resonance imaging (MRI) is the preferred imaging method for assessing the response of breast cancer to neoadjuvant chemotherapy. Although MRI can provide detailed information about a tumor, including its location, size, and microenvironment, precise assessment of the response to neoadjuvant chemotherapy is hampered by the diverse tumor changes that appear in MRI images. Artificial intelligence methods based on machine learning and deep learning have demonstrated the ability to recognize complex patterns in MRI data. Through clinical radiologic feature analysis, radiomics analysis, and habitat analysis, artificial intelligence technology has significantly enhanced the performance and efficiency of assessments for breast cancer neoadjuvant chemotherapy, helping to realize personalized treatment strategies. This paper introduces the MRI data and performance indicators used in assessing breast cancer neoadjuvant chemotherapy, summarizes the progress of artificial intelligence applications in this field, and discusses the current challenges and potential future research directions for artificial intelligence technology in practical applications.
SUN Hailin, YAN Jiadong, ZHANG Rong, KENDRICK Keith, JIANG Xi
2024, 39(4):813-826. DOI: 10.16337/j.1004-9037.2024.04.004
Abstract:Brain functional connectivity (FC) networks serve as potential neuroimaging biomarkers for the auxiliary diagnosis and treatment of autism spectrum disorder (ASD). However, most existing generative models are based solely on neuroimaging data and neglect individual clinical indicators, leading to the loss of disorder-specific information. Moreover, ASD is a spectrum disorder that exhibits significant individual differences in clinical indicators. These traditional generative models are therefore limited in generating accurate individual FC of ASD that reflects specific clinical symptoms. To address this limitation, a novel clinical-indicator-aware Wasserstein generative adversarial network (CI-WGAN) is proposed to generate individual FC of ASD. The proposed model introduces an effective guidance mechanism based on individual clinical indicators to generate individualized FC networks. Extensive experiments are performed on the ABIDE I dataset, one of the largest publicly available ASD brain imaging datasets. The results show that the FC generated by the proposed method achieves a peak signal-to-noise ratio (PSNR) of 19.037, a structural similarity (SSIM) of 0.236 and a mean absolute error (MAE) of 0.178, improvements of 3%, 12% and 2%, respectively, over the traditional models. Additionally, representational similarity analysis (RSA) is performed between the generated FC and two independent clinical indicators. The results show that the RSA values based on the proposed method increase by 0.1 and 3.7 times compared to those based on traditional models, demonstrating that the FC generated by the proposed CI-WGAN contains more individual symptom information of ASD. In summary, the proposed CI-WGAN model achieves high-quality generation of individual FC and provides a powerful tool for the early diagnosis and personalized treatment of ASD.
HAO Xiaoke, HE Zilong, LU Xinchu, MA Mingming, LIU Shiyu
2024, 39(4):827-842. DOI: 10.16337/j.1004-9037.2024.04.005
Abstract:In recent years, functional brain networks have been used in the diagnosis of brain disorders such as autism spectrum disorder (ASD). Existing studies have shown that combining resting-state functional magnetic resonance imaging (rs-fMRI) data with non-imaging information to form a population graph, and then learning from and classifying the data with a graph neural network (GNN), is very effective in the diagnosis of ASD. However, most studies still face two challenges. First, functional connectivity matrices constructed with methods such as the Pearson correlation coefficient cannot effectively identify and analyze the localized brain regions and biomarkers associated with diseases. Second, it is difficult for a GNN to efficiently learn multi-scale information about node features in population graphs. To solve these problems, a multi-scale residual fusion graph convolutional network (MSRF-GCN) based on the attention mechanism is proposed. The algorithm efficiently localizes and identifies brain regions useful for diagnosis by designing a functional connection generator that extracts temporally relevant features with long-range dependencies. Meanwhile, the multi-scale information in the population graph is learned by a multi-scale residual fusion algorithm. An Edge Sparse strategy is also introduced to increase the sparsity of node connections by randomly discarding edges in the initial population graph, which in turn reduces the risk of overfitting during training. The effectiveness of MSRF-GCN in the diagnosis of ASD is demonstrated by experiments on data from the autism brain imaging data exchange (ABIDE) program.
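As a hedged illustration of the edge-discarding idea described in the abstract above, the following minimal Python sketch removes a fixed fraction of edges from an edge list at random. The function name `drop_edges`, the `drop_rate` parameter, and the toy edge list are assumptions made for illustration; the abstract does not specify how the Edge Sparse strategy is actually implemented.

```python
import random


def drop_edges(edges, drop_rate, seed=None):
    """Return a copy of `edges` with a fraction `drop_rate` removed at random.

    Hypothetical helper: it only illustrates random edge removal on an edge
    list and is not taken from the MSRF-GCN paper.
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    rng = random.Random(seed)
    keep_count = len(edges) - int(round(drop_rate * len(edges)))
    # Sample the indices to keep without replacement, then restore input order.
    kept_indices = sorted(rng.sample(range(len(edges)), keep_count))
    return [edges[i] for i in kept_indices]


# Toy population graph: nodes are subjects, edges link similar subjects.
population_edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
                    (3, 4), (0, 4), (2, 4), (1, 4), (0, 3)]
sparse_edges = drop_edges(population_edges, drop_rate=0.3, seed=0)
print(len(population_edges), "->", len(sparse_edges))
```

In a training loop, such a step would typically be re-sampled each epoch so that the model sees a different sparse graph every time; that usage is likewise an assumption here.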
GONG Rongfang, HUANG Linya, ZHU Qi, LI Shengrong
2024, 39(4):843-862. DOI: 10.16337/j.1004-9037.2024.04.006
Abstract:The multi-modal brain network, which integrates the brain structural and functional networks, can effectively extract complementary information from different modalities, significantly improving the diagnostic accuracy of neurological diseases such as epilepsy. However, because multi-modal data acquisition is time-consuming and costly, modalities are often missing in practical applications, which reduces the diagnostic accuracy and generalization ability of the model. To address the case in which one modality is completely missing, we propose a method named Graph-CycleGAN that combines graph learning with cycle-consistent generative adversarial networks. The method captures feature information between different brain regions in the brain network by introducing graph neural networks, such as graph convolutional neural networks and graph attention mechanisms. This strengthens the feature extraction ability of the generative framework and enables the mutual generation of the brain structural network and the functional network. In addition, to address the lack of diagnosis-based evaluations of the quality of generated data, this paper proposes a classification model that integrates real and generated brain networks. Experimental results on an epilepsy dataset indicate that the proposed Graph-CycleGAN method can effectively generate the missing brain network from the available modality.
JIAO Ruike, ZHANG Xiaofeng, YE Chuyang
2024, 39(4):863-873. DOI: 10.16337/j.1004-9037.2024.04.007
Abstract:White matter fiber tract segmentation methods provide crucial neural pathway reference information for brain connectivity analysis by identifying white matter tracts connecting distinct brain regions. Traditional segmentation methods predominantly depend on diffusion magnetic resonance imaging (dMRI), but the lengthy acquisition time of dMRI severely restricts its clinical applicability. To address this limitation, this paper introduces a white matter fiber tract segmentation approach based on T1-weighted imaging. The method uses the structure tensor of T1-weighted images to infer potential fiber orientations, thereby enhancing the segmentation accuracy of white matter tracts. Moreover, the proposed method incorporates privileged information from dMRI during model training to guide the learning process, which further improves the performance of the segmentation model. The segmentation of challenging tracts is improved significantly, with a 5% improvement in Dice score for the left fornix (FX_left) and a 6% improvement for the right fornix (FX_right). This approach mitigates the limitations of conducting neural pathway analysis in the absence of dMRI, broadening the application scope of neural pathway analysis.
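For readers unfamiliar with the Dice score reported in the abstract above, the following minimal Python sketch computes it for two binary masks using the standard definition 2|A ∩ B| / (|A| + |B|). The flattened toy masks and the function name `dice_score` are assumptions for illustration and are not data or code from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient between two equally sized flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention (an assumption here): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy flattened masks: 1 marks a voxel labeled as belonging to the tract.
predicted_mask = [1, 1, 0, 0, 1, 0]
reference_mask = [1, 0, 0, 0, 1, 1]
dice = dice_score(predicted_mask, reference_mask)
print(f"Dice = {dice:.3f}")
```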
YANG Yinkai, WAN Peng, SHI Hang, XUE Haiyan, SHAO Wei
2024, 39(4):874-885. DOI: 10.16337/j.1004-9037.2024.04.008
Abstract:In recent years, liver cancer has become a disease that seriously threatens human health, and multi-modal ultrasound imaging is one of the important tools for diagnosing it. Just as clinicians use multi-modal ultrasound to diagnose liver cancer, multi-modal fusion methods that integrate the image features of each ultrasound modality are expected to improve diagnostic accuracy. However, existing multi-modal fusion methods often treat the feature information of each modality in isolation during fusion, failing to fully consider intra-modal sample similarity and inter-modal semantic consistency, and they ignore modality uncertainty. Therefore, this paper proposes a liver cancer diagnosis method based on contrastive learning over multi-modal ultrasound, aiming to make full use of the feature information of each ultrasound modality to improve diagnostic accuracy. Specifically, the method employs supervised contrastive learning to explore modality features, capturing both the similarity among samples within a modality and the semantic consistency across different modalities. In addition, the method introduces a measure of modality uncertainty based on Subjective Logic, enabling dynamic fusion of modality information and good robustness. Evaluation on multi-modal ultrasound images shows that the proposed method achieves a diagnostic accuracy of 85.21%, an improvement over other mainstream multi-modal fusion methods.
Lan Tianxu, Zhu Qiuming, Bai Yunpeng, Lin Zhipeng, Wu Qihui, Duan Hongtao, LYU Bing
2024, 39(4):886-897. DOI: 10.16337/j.1004-9037.2024.04.009
Abstract:Antenna pattern measurement is an important part of antenna measurement. To address the difficulty of measuring antenna patterns outdoors, this paper presents an outdoor measurement scheme for three-dimensional antenna patterns based on a generative adversarial network. An unmanned aerial vehicle (UAV) is used to collect the antenna pattern data. The collected data are then corrected to obtain the direct-path data for the case in which the receiving antenna is polarization-matched to the antenna under test. Finally, the three-dimensional pattern of the antenna is reconstructed using the trained generative adversarial network. Simulation results show that the proposed scheme can measure the three-dimensional antenna pattern efficiently and accurately, demonstrating its practical value.
Wang Jiadong, Zhang Weike, Zhang Pan
2024, 39(4):898-907. DOI: 10.16337/j.1004-9037.2024.04.001
Abstract:In this paper, a method based on a frequency group coding signal is proposed. The frequency group coding signal is constructed from the linear frequency modulation (LFM) signal so that the pulse carrier frequency sequence of the transmitted signal has a certain randomness, which guarantees the anti-interference ability of the waveform. At the same time, a coherent processing method for the encoded signal is designed to address the main lobe broadening and sidelobe elevation caused by the non-coherent phase of frequency-agile signals. Firstly, high-resolution distance compensation is applied to the pulse-compressed echo signal, and intra-pulse coherent processing is then achieved through velocity interpolation traversal and distance consistency correction. Finally, the properties of the carrier frequency sequence of the coding signal are used to realize inter-group coherent accumulation of the pulse groups. In the simulation experiments, the anti-interference advantage of the constructed coding signal over the LFM signal is verified, and the validity of the proposed method is compared with that of a sparse reconstruction algorithm based on compressive sensing.
ZHOU Xuan, GE Qi, SHAO Wenze
2024, 39(4):908-921. DOI: 10.16337/j.1004-9037.2024.04.011
Abstract:To address the low detection accuracy caused by complex backgrounds and densely distributed small targets in unmanned aerial vehicle (UAV) aerial images, this paper proposes a small target detection algorithm based on high-resolution feature enhancement. Firstly, a high-resolution feature enhancement network is proposed, which enlarges the output feature map by reducing the number of down-sampling operations in the backbone. At the same time, bilinear interpolation is introduced to reduce the loss of feature information after up-sampling, thereby preserving more semantic and detailed features. Secondly, the spatial pyramid pooling-fast module combined with the cross stage partial structure is embedded in the backbone to enhance the fusion of local and global features and obtain a larger receptive field. Finally, the mosaic-mixup data augmentation method is used to increase the complexity of image backgrounds and improve the generalization ability of the model. Experimental results on the public dataset VisDrone 2019 show that the mean average precision of the proposed algorithm is significantly higher than that of other mainstream algorithms such as the “you only look once” (YOLO) series. The advantages of the proposed algorithm have been verified in different scenarios, indicating that it is practical for dense small target detection in UAV aerial images.
2024, 39(4):922-932. DOI: 10.16337/j.1004-9037.2024.04.012
Abstract:Traditional image captioning methods use only the visual and semantic information of the current moment to predict words, without considering the visual and semantic information of past moments, which makes the model output relatively homogeneous along the temporal dimension. As a result, the generated captions lack accuracy. To address this problem, an image captioning method that fuses visual and semantic information across multiple temporal dimensions is proposed. The method effectively fuses the visual and semantic information of past moments and uses a gating mechanism to dynamically select between the two kinds of information. Experimental validation on the MSCOCO dataset shows that the method generates more accurate captions, and its performance improves considerably on all evaluation metrics compared with current state-of-the-art image captioning methods.
WU Peng, ZHANG Sunjie, WANG Yongxiong, CHEN Yuanfeng, QIN Haiwang
2024, 39(4):933-943. DOI: 10.16337/j.1004-9037.2024.04.013
Abstract:Image inpainting based on deep learning has made remarkable progress. However, when a large area is masked, the lack of reasonable prior information often leads to artifacts and blurred textures in the repaired results. Therefore, we propose an image inpainting algorithm that combines prior features with image predictive filtering. It consists of two branches: an image filtering kernel prediction branch, and a feature inference and image filtering branch. Features are extracted from the decoder part of the image filtering kernel prediction branch. Multi-scale external spatial feature fusion is used to reconstruct the features of the masked region, and the decoding-stage features are passed to the other branch as prior features to provide richer semantic information for image inpainting. Then, a spatial feature-aware inference block is introduced in the feature inference and image filtering branch, which can filter out distracting features and capture informative long-distance image context for inference. Finally, the predicted image filtering kernel is used to filter the image and eliminate artifacts. Comparisons with other inpainting networks on the CelebA and Places2 datasets demonstrate the superior repair quality of the method.
TAO Zhiyong, DOU Miaosen, LI Heng, LIN Sen
2024, 39(4):944-953. DOI: 10.16337/j.1004-9037.2024.04.014
Abstract:Effective acquisition of point cloud features is the key to analyzing and processing 3D point cloud scenes. To address the problem that current deep learning methods extract feature information inadequately and have difficulty capturing deep semantic information, a fusion fine-grained feature encoding network is proposed to improve the accuracy of point cloud classification and segmentation tasks. First, the feature extraction module contains two sub-modules: a dilation graph convolution module, which can extract richer geometric information than graph convolution, and a fine-grained feature encoding module, which can capture detailed features of local regions. Second, the two modules are dynamically fused by learnable parameters to efficiently learn the contextual information of each point. Finally, all the extracted features are summed and passed through the channel-wise affinity attention module, which helps the feature map avoid redundancy by emphasizing its distinct channels. Point cloud classification experiments are performed on the ModelNet40 and ScanObjectNN datasets, and the overall accuracy is 93.3% and 80.0%, respectively. The mean intersection over union (mIoU) is 85.6% for part segmentation experiments on the ShapeNet Part dataset. Experimental results show that the proposed method performs better than current mainstream methods.
2024, 39(4):954-966. DOI: 10.16337/j.1004-9037.2024.04.015
Abstract:In the early screening of colorectal cancer, diagnostic efficiency and accuracy can be improved by automated polyp detection and segmentation of colonoscopy images. Due to the complexity of the intestinal environment and the limitations of image quality, automated polyp segmentation is still a challenging problem. To address this problem, this paper proposes a dual-decoding polyp segmentation model (FTDC-Net) that uses a Transformer and dilated convolution to achieve feature fusion. ResNet50 is used as the encoder to better extract deep image features. A Transformer encoding module is used, whose self-attention mechanism captures long-distance dependencies between the inputs, and different dilated convolutions are used to expand the receptive field of the model so that it can capture a larger range of information in the colonoscopy image. The decoding part of the network uses a dual-decoding structure, including an autoencoder branch that reconstructs the input and a segmentation branch that produces the segmentation result. The output of the autoencoder is used to generate an attention map, which guides the segmentation result. Experimental validation is carried out on the Kvasir-SEG and ETIS-LaribPolypDB standard datasets, and the results show that FTDC-Net can effectively segment colon polyps and improves on current mainstream polyp segmentation models in all evaluation metrics.
MA Jia, WU Haifeng, LI Shunliang
2024, 39(4):967-983. DOI: 10.16337/j.1004-9037.2024.04.016
Abstract:Functional connectivity (FC) between brain regions, obtained with resting-state functional magnetic resonance imaging, is widely used in classification studies of mild cognitive impairment (MCI). However, classification based on whole-brain FC usually suffers from information redundancy and the curse of dimensionality. Therefore, a new “G-Lasso + feature compression” method is proposed to solve these problems. Firstly, blind source separation is used to obtain the active signal time series of the whole-brain functional regions, and a sparse FC network is constructed by G-Lasso. Secondly, the group-average sparse FC of MCI subjects, normal subjects and all subjects is calculated, and the centers of clusters Class 1 to Class 3 are determined in combination with the Euclidean distance to obtain the feature differences between clusters. Finally, the sparse FC of each participant is expressed as a linear combination of the cluster centers, and the resulting compressed FC is used as the key feature for classification. The results show that the proposed method obtains significant differences in inter-cluster features after the Class decision and provides effective marker information. The classification accuracy of the key features obtained by further compression (89.8%) is 5% to 10% higher than that of the sparse method alone. The results also show that feature selection and dimensionality reduction need to be considered to solve the problems of whole-brain FC, that many uncertain factors remain, and that “sparse + compression” can be appropriately combined.
HUANG Jianhui, MA Di, ZHANG Li
2024, 39(4):984-995. DOI: 10.16337/j.1004-9037.2024.04.017
Abstract:Autism spectrum disorder (ASD) is one of the most prevalent and heritable neurodevelopmental disorders, characterized by a multitude of clinical symptoms, notably social communication deficits. Effective identification of biomarkers is of great importance in facilitating early interventions for ASD. Many current methods use multi-site imaging data to increase sample size and thereby enhance diagnostic accuracy. However, the heterogeneity of data across sites, resulting from variations in imaging devices, imaging parameters, and data processing workflows, is frequently overlooked. To overcome this problem, this paper proposes a graph structure learning method for multi-site autism diagnosis based on multi-view low-rank subspace (MVLL-GSL). Firstly, multiple views of the brain network, encompassing diverse topological information, are constructed for each sample. Subsequently, samples from different classes are projected into their respective low-rank subspaces to mitigate the impact of data heterogeneity. Finally, graph structure learning is integrated with multi-task graph embedding learning, incorporating prior subnetworks and multi-view consistency regularization constraints, to extract more discriminative and coherent features from the multi-view low-rank subspaces. The public Autism Brain Imaging Data Exchange (ABIDE) database is used to verify the proposed method. Experimental results show that the MVLL-GSL method improves the performance of ASD diagnosis and explains the association of different prior subnetworks with ASD pathogenesis.
TANG Lu, YANG Xilin, WANG Xiangrui, HU Qianyuan, ZHENG Hui
2024, 39(4):996-1008. DOI: 10.16337/j.1004-9037.2024.04.018
Abstract:Decoding knee motion intention is crucial for the wearing comfort of lower extremity exoskeleton robots. Patients with neurological disorders often have lower limb movement disorders, which can be assessed using surface electromyography (sEMG) signals. To integrate motion assessment and joint angle prediction for these patients, a novel CNN-LSTM framework based on the attention mechanism is proposed to predict the knee joint angle from 10-channel sEMG signals for three daily motions, i.e., level walking, walking uphill, and climbing stairs. The prediction error indicators, i.e., the root mean squared error (RMSE), the mean absolute error (MAE), and the coefficient of determination (R²), reach 2.74, 2.50, and 0.97, respectively, outperforming the traditional network. Furthermore, the ablation experiments show that the three indicators decrease by 20.47%, 34.36% and 6.59% on average, respectively. The proposed end-to-end prediction framework based on the attention mechanism reaches the highest prediction accuracy, providing a reference for the human-robot interaction scheme of lower limb exoskeleton robot systems.
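The three indicators named in the abstract above have standard definitions, which the following minimal Python sketch spells out. The function name `regression_metrics` and the toy angle values are assumptions for illustration; they are not the authors' code or data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for two equally long sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Hypothetical measured and predicted knee joint angles, in degrees.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
rmse, mae, r2 = regression_metrics(measured, predicted)
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  R2={r2:.3f}")
```

Note that RMSE and MAE are in the units of the predicted quantity and are better when lower, whereas R² is unitless and better when closer to 1.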
MA Xiao, LU Xiaoguang, ZHANG Zhe, SUO Chenhao, YANG Lei
2024, 39(4):1009-1019. DOI: 10.16337/j.1004-9037.2024.04.019
Abstract:The physical health of civil aviation personnel is an important factor affecting aviation safety, and respiration and heart rate are extremely important indicators of health. To avoid the limitations and interference that contact or wearable measurement systems impose on personnel at work, linear frequency-modulated continuous wave (FMCW) radar can be used to achieve non-contact measurement. Vital sign signals are time-varying and non-stationary, and empirical mode decomposition (EMD) suffers from mode aliasing when decomposing them. Time-varying filtering based on EMD (TVF-EMD) adaptively adjusts the local cutoff frequency of the signal, which effectively improves signal separation performance and solves the mode aliasing problem. By using the intrinsic mode function (IMF) components decomposed by TVF-EMD to reconstruct the time-domain signal corresponding to the heartbeat, the frequency and inter-beat interval (IBI) of the heartbeat signal can be estimated, and the relevant indicators of heart rate variability (HRV) can then be estimated. Simulation experiments and the processing of measured data show that TVF-EMD can effectively separate respiration and heartbeat signals from millimeter wave radar measurement signals. In addition, a simulation analysis comparing the decomposition results of TVF-EMD and EMD in terms of the degree of mode aliasing and signal separation performance shows that TVF-EMD can effectively solve the mode aliasing problem. Therefore, the TVF-EMD method can accurately and effectively extract vital sign information from millimeter wave radar measurement signals and provide accurate time-domain information for IBI estimation and HRV analysis, and it has broad application prospects.
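To make the last step of the pipeline in the abstract above concrete, the following minimal Python sketch derives inter-beat intervals and two common time-domain HRV indices (SDNN and RMSSD) from a list of heartbeat times. It assumes the beat times have already been detected in the reconstructed heartbeat signal; the function name `hrv_from_beats`, the choice of indices, and the toy beat times are assumptions, since the abstract does not state which HRV indicators are used.

```python
import math


def hrv_from_beats(beat_times_s):
    """Compute IBI (ms), mean heart rate (bpm), SDNN (ms) and RMSSD (ms)."""
    ibi_ms = [1000.0 * (later - earlier)
              for earlier, later in zip(beat_times_s, beat_times_s[1:])]
    if len(ibi_ms) < 2:
        raise ValueError("at least three beats are needed")
    mean_ibi = sum(ibi_ms) / len(ibi_ms)
    # SDNN: sample standard deviation of the inter-beat intervals.
    sdnn = math.sqrt(sum((x - mean_ibi) ** 2 for x in ibi_ms) / (len(ibi_ms) - 1))
    # RMSSD: root mean square of successive IBI differences.
    successive = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in successive) / len(successive))
    return {
        "ibi_ms": ibi_ms,
        "heart_rate_bpm": 60000.0 / mean_ibi,
        "sdnn_ms": sdnn,
        "rmssd_ms": rmssd,
    }


# Hypothetical heartbeat peak times in seconds.
hrv = hrv_from_beats([0.0, 0.8, 1.7, 2.5, 3.4])
print({key: value for key, value in hrv.items() if key != "ibi_ms"})
```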
Wang Xin, Wei Chuyuan, Zhang Lei, Wan Shanshan
2024, 39(4):1020-1032. DOI: 10.16337/j.1004-9037.2024.04.020
Abstract:Current named entity recognition based on the pre-training and fine-tuning paradigm suffers from a gap between pre-training and fine-tuning, which makes it difficult to effectively model the relationship between entities and contexts, and current Chinese named entity recognition methods cannot obtain sufficient character or word meanings. To address these problems, this paper proposes a named entity recognition method based on prompt learning that incorporates multi-level feature information. Firstly, the prompt text is constructed based on the prompt learning mechanism. The character-, word- and entity-level feature information of the input text is then concatenated with it and used as the input of the pre-trained model, in order to effectively capture the semantic information between contexts, narrow the gap between the pre-trained model and the downstream task, and improve the ability of the model to perceive named entities. The proposed method makes full use of prior knowledge to increase the learning ability of the model and improve the effectiveness of named entity recognition in the complex and variable semantic environment of Chinese. The F1 values reach 97.09%, 96.68%, 83.44%, 97.48% and 76.05% on the People’s Daily, MSRA, Weibo, Resume and CMeEE datasets, respectively. Experimental results show that the proposed method is generally better than current mainstream Chinese named entity recognition methods.
Cao Yingying, Huan Zhan, Chen Zhen, Chen Ying
2024, 39(4):1033-1042. DOI: 10.16337/j.1004-9037.2024.04.021
Abstract:Bearing fault types are complex, and it is difficult to obtain enough training samples for each fault type under different working conditions. A convolutional neural network with training interference (TICNN) that uses wide convolutional kernels is introduced as the feature-extracting subnetwork of a Siamese network, reducing the impact of industrial environment noise. The Siamese network is a structure commonly used for few-shot learning. By training on pairs of samples from the same or different categories, the network learns the mapping between samples with different attributes and their features, and the similarity between samples is used as the measurement index. A test sample is classified by finding the class of its nearest neighbor. Experimental results on the standard Case Western Reserve University (CWRU) bearing fault diagnosis benchmark dataset show that the proposed model achieves better fault diagnosis results when data are limited. When tested with the least training data in different noise environments, the proposed few-shot learning model outperforms the baseline model at reasonable noise levels, and the fault diagnosis accuracy reaches 94.41%. The proposed model also performs well when evaluated on test sets with new fault types or new working conditions.
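The nearest-neighbor classification step mentioned in the abstract above can be sketched in a few lines of Python. The sketch assumes that a trained subnetwork has already mapped each vibration sample to an embedding vector and that Euclidean distance is the dissimilarity measure; both are assumptions, as are the function names, the toy embeddings, and the label strings.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_by_nearest_support(query_embedding, support_set):
    """Return (label, distance) of the support sample closest to the query.

    `support_set` is a list of (embedding, label) pairs, e.g. one or a few
    labeled examples per fault type in a few-shot setting.
    """
    best_label, best_distance = None, math.inf
    for embedding, label in support_set:
        distance = euclidean(query_embedding, embedding)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label, best_distance


# Hypothetical 2-D embeddings standing in for the subnetwork outputs.
support = [
    ([0.0, 0.1], "normal"),
    ([1.0, 0.9], "inner_race_fault"),
    ([0.1, 1.0], "outer_race_fault"),
]
label, distance = classify_by_nearest_support([0.9, 1.0], support)
print(label, round(distance, 3))
```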
Mailing Address: 29 Yudao Street, Nanjing, China
Post Code: 210016 Fax: 025-84892742
Phone: 025-84892742 E-mail: sjcj@nuaa.edu.cn
Supported by: Beijing E-Tiller Technology Development Co., Ltd.
Copyright: ® 2026 All Rights Reserved