Abstract:
This paper comprehensively analyzes the technical origins and evolution of ChatGPT by reviewing the development of deep learning, language models, semantic representation and pre-training techniques. In terms of language models, early statistical N-gram methods gradually gave way to neural network language models. Research advances in machine translation also led to the emergence of the Transformer, which in turn catalyzed the development of neural network language models. Regarding semantic representation and pre-training techniques, there has been an evolution from early statistical methods such as TF-IDF, pLSA and LDA, to neural network-based word vector representations like Word2Vec, and then to pre-trained language models such as ELMo, BERT and GPT-2. Pre-training frameworks have become increasingly sophisticated, providing models with rich semantic knowledge. The emergence of GPT-3 revealed the potential of large language models, but problems such as uncontrollable generation, hallucinated knowledge and weak logical reasoning remained. To alleviate these problems, ChatGPT, built on GPT-3.5, was further aligned with humans through instruction learning, supervised fine-tuning, and reinforcement learning from human feedback, continuously improving its capabilities. The emergence of large language models like ChatGPT signifies that this field has entered a new stage of development, opening up new possibilities for human-computer interaction and general artificial intelligence.