ZHANG Xiongwei , GE Xiaoyi , SUN Meng , SONG GONG Kunkun , LI Li
2023, 38(5):995-1016. DOI: 10.16337/j.1004-9037.2023.05.001
Abstract:As a widely used medium in cyberspace, digital audio serves as an excellent cover for carrying secret information and is often employed in the construction of covert communication systems that prioritize real-time performance, low complexity, and imperceptibility. Audio steganography, one of the key techniques for ensuring network information security and confidential communication, has attracted increasing attention from scholars. This paper presents a systematic review of the development of audio steganography methods. Firstly, we introduce the basics of audio steganography, and summarize the problem description, evaluation indicators, common data formats, and tools. Secondly, according to their embedding domains, traditional audio steganography methods are classified into time domain methods, transform domain methods and compression domain methods, and their advantages and disadvantages are analyzed. Furthermore, based on the steganographic cover used, deep learning-based steganography methods are categorized into embedding cover-based, generating cover-based, and coverless audio steganography, and the three categories are compared and analyzed. Finally, directions for further research in audio steganography are suggested.
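As an illustration of the time domain category named in this abstract, the following minimal sketch shows least-significant-bit (LSB) embedding, a classic time domain technique. It is not taken from the reviewed paper; the function names `embed_lsb` and `extract_lsb` are hypothetical, and the cover is assumed to be a list of signed 16-bit PCM sample values.

```python
def embed_lsb(samples, payload):
    """Hide payload bytes in the least significant bit of successive samples.

    Assumption: `samples` is a list of signed 16-bit PCM integers.
    """
    # Unpack the payload into bits, most significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("payload too large for this cover")
    stego = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(samples, n_bytes):
    """Recover n_bytes of payload from the lowest bit of the first samples."""
    out = bytearray()
    for i in range(n_bytes):
        byte = 0
        for sample in samples[8 * i: 8 * i + 8]:
            byte = (byte << 1) | (sample & 1)
        out.append(byte)
    return bytes(out)


cover = [0, -3, 1200, -32768, 32767, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
stego = embed_lsb(cover, b"Hi")
recovered = extract_lsb(stego, 2)
```

Each sample changes by at most one quantization step, which is why LSB embedding is hard to hear but also fragile under compression or resampling.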
2023, 38(5):1017-1034. DOI: 10.16337/j.1004-9037.2023.05.002
Abstract:This paper comprehensively analyzes the technical origins and evolution of ChatGPT by reviewing the development of deep learning, language models, semantic representation and pre-training techniques. In terms of language models, the early N-gram statistical method gradually evolved into neural network language models. Research and advances in machine translation also led to the emergence of the Transformer, which in turn catalyzed the development of neural network language models. Regarding semantic representation and pre-training techniques, there has been an evolution from early statistical methods such as TF-IDF, pLSA and LDA, to neural network-based word vector representations like Word2Vec, and then to pre-trained language models like ELMo, BERT and GPT-2. The pre-training frameworks have become increasingly sophisticated, providing rich semantic knowledge for models. The emergence of GPT-3 revealed the potential of large language models, but hallucination problems such as uncontrollable generation, knowledge fallacies and poor logical reasoning capability still existed. To alleviate these problems, ChatGPT was further aligned with humans on the basis of GPT-3.5 through instruction learning, supervised fine-tuning, and reinforcement learning from human feedback, continuously improving its capabilities. The emergence of large language models like ChatGPT signifies that the field is entering a new developmental stage, opening up new possibilities for human-computer interaction and general artificial intelligence.
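The early N-gram statistical method mentioned in this abstract can be sketched as a bigram model with maximum-likelihood estimates. This is a generic textbook illustration, not code from the paper; the helper name `train_bigram` and the toy sentence are assumptions.

```python
from collections import Counter


def train_bigram(tokens):
    """Return a function estimating P(word | prev) by relative frequency."""
    # Count each token that has a successor, and each adjacent pair.
    context_counts = Counter(tokens[:-1])
    pair_counts = Counter(zip(tokens, tokens[1:]))

    def prob(prev, word):
        if context_counts[prev] == 0:
            return 0.0  # unseen context: no estimate without smoothing
        return pair_counts[(prev, word)] / context_counts[prev]

    return prob


prob = train_bigram("the cat sat on the mat".split())
p_cat_given_the = prob("the", "cat")
```

Without smoothing, any unseen pair receives probability zero, which is one of the sparsity problems that neural language models were introduced to address.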
JIN Xin , WU Bingya , XU Jianqiu
2023, 38(5):1035-1047. DOI: 10.16337/j.1004-9037.2023.05.003
Abstract:Moving objects databases (MODs) manage spatial objects that continuously change their locations over time and have been studied in a wide range of applications. Although a number of relevant techniques, such as indexing and query algorithms, have been proposed, cache management in MODs has been ignored, even though it is essential for database performance. Traditional cache methods ignore the spatial-temporal characteristics of data and cannot achieve good performance. This paper proposes to improve the query performance of trajectory data at the cache level. Firstly, based on the unique storage structure and read/write process of trajectory data, a cache access mechanism suitable for MODs is designed. Then, because MODs lack cache policies related to application scenarios and access modes, a customized cache replacement strategy is designed. Finally, a cache management tool for trajectory data, MOCache, is implemented. With MOCache, cache state changes are dynamically tracked and visualized after each query statement. Compared with traditional algorithms, the proposed cache replacement strategy improves the hit ratio to 76.56% and reduces query time, and using the cache tool to monitor historical state information facilitates comprehensive feedback and the analysis of performance problems.
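For readers unfamiliar with the hit ratio reported in this abstract, the sketch below measures it for a least-recently-used (LRU) cache, a common traditional replacement policy. It does not implement the replacement strategy proposed in the paper, whose details are not given in the abstract; the function name `lru_hit_ratio` and the access trace are hypothetical.

```python
from collections import OrderedDict


def lru_hit_ratio(accesses, capacity):
    """Replay a page-access trace through an LRU cache and return hits / accesses."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for page in accesses:
        if page in cache:
            hits += 1
            cache.move_to_end(page)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return hits / len(accesses) if accesses else 0.0


ratio = lru_hit_ratio([1, 2, 3, 1, 4, 2], capacity=3)
```

A policy that accounts for spatial-temporal locality would be evaluated the same way, by replaying the same trace and comparing hit ratios.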
2023, 38(5):1048-1057. DOI: 10.16337/j.1004-9037.2023.05.004
Abstract:Unlike RGB images, pixels in depth images represent the distance from the acquisition device to points in the scene, and directly applying inpainting methods designed for natural images cannot effectively restore the scene structure of missing areas in depth images. This paper proposes a two-stage generative adversarial network with an encoder-decoder structure to solve the problem of depth image inpainting. Unlike standard generative adversarial network (GAN) models, the generator network in this paper includes depth build
Wang Libing , Feng Yate , Wen Yimin
2023, 38(5):1058-1068. DOI: 10.16337/j.1004-9037.2023.05.005
Abstract:Vietnamese characters, which are composed of Latin characters and diacritic symbols, make recognition more challenging. On the one hand, diacritic symbols are more likely to lead to attention drift. On the other hand, Vietnamese characters include many categories, and the differences between characters are small; for example, some characters differ only in their diacritic symbols, which further increases the difficulty of recognition. Based on the decoupled attention network (DAN) algorithm, this paper designs a visual feature and sequence feature fusion module (VSFM), which utilizes bidirectional gated recurrent units (Bi-GRU) to model sequences in the horizontal and vertical directions, further alleviating attention drift and enhancing the correlation between diacritics and Latin characters. An enhanced decoupled text decoder module (ETDM) is also designed, which employs more feature information to identify similar characters more effectively. A series of experiments validate the effectiveness of the proposed method.
BAI Menglin , ZHOU Fei , SHU Haofeng
2023, 38(5):1069-1078. DOI: 10.16337/j.1004-9037.2023.05.006
Abstract:Cross-scene and cross-device shooting greatly increases the amount of pedestrian data. However, due to the different postures and partial occlusion of pedestrians, it is difficult to avoid introducing sample noise. During the clustering process, incorrect pseudo-labels are easily generated, resulting in label noise and affecting the optimization of the model. In order to reduce the influence of noise, a camera-aware distance matrix is applied to combat the sample noise caused by camera offset, and a noise-robust dynamic symmetric contrast loss is used to reduce label noise. Specifically, the distance matrix that measures the similarity of pedestrian features is changed before clustering, and the camera-aware distance matrix is used to enhance the accuracy of the intra-class distance measurement, reducing the negative impact of different perspectives on the clustering effect. Drawing on noisy-label learning methods, a robust loss is designed, a dynamic symmetric contrast loss function is proposed, and joint loss training is used to continuously refine the pseudo-labels. Experiments are carried out on the DukeMTMC-reID and Market-1501 datasets to verify the effectiveness of the proposed method.
Hu Zhaohua , Liu Haonan , Lin Xiao
2023, 38(5):1079-1091. DOI: 10.16337/j.1004-9037.2023.05.007
Abstract:Object tracking algorithms based on Siamese networks usually adopt simple cross-correlation matching, but this simple matching method introduces a lot of irrelevant information and weakens the response of the target region. Although anchor-free Siamese tracking networks avoid the adjustment of anchor box parameters, they cannot adapt well to scale changes of the target due to the loss of prior information. To address these problems, this paper proposes SiamBM, a matching-enhanced object tracking algorithm based on Siamese networks. By encoding the bounding box coordinates of the target, effective guidance information is provided for the tracking model. The discriminative ability of the tracking model is further improved by means of depthwise separable cross-correlation and cascaded pixel matching cross-correlation. Multi-scale cross-correlation is adopted to enhance the scale adaptability of the tracking model. On the OTB100 dataset, the success rate and accuracy rate of SiamBM reached 0.684 and 0.906, respectively, which are 5.2% and 4.2% higher than those of the benchmark model. The experimental results show that, compared with current mainstream trackers, SiamBM achieves competitive results across the indicators of various datasets.
HU Yifan , QIN Ling , YANG Xiaojian
2023, 38(5):1092-1103. DOI: 10.16337/j.1004-9037.2023.05.008
Abstract:To address the low accuracy of face detection caused by the high similarity between background and face and by the small scale of face targets, an improved face detection algorithm based on YOLOv3 is proposed. Firstly, a K-means clustering algorithm based on a genetic algorithm is used to reduce the influence of random initialization in the original algorithm and to generate prediction boxes that better match the target size. Secondly, a lightweight network is used to replace the original feature extraction network and increase the face detection speed. Finally, a bounding box regression loss is used to replace the YOLOv3 coordinate loss function, and the confidence loss function is modified, to improve the training convergence speed and the accuracy of the results. Both the accuracy and the speed of the designed face detection algorithm are improved on the Wider Face dataset.
WANG Jian , CHENG Chufan , CHEN Fang
2023, 38(5):1104-1111. DOI: 10.16337/j.1004-9037.2023.05.009
Abstract:Early detection of COVID-19 allows medical intervention to improve the survival rate of patients. The use of deep neural networks (DNN) to detect COVID-19 can improve the sensitivity and speed of interpretation of chest CT for COVID-19 screening. However, applying DNNs to the medical field is known to be affected by limited samples and imperceptible noise perturbations. In this paper, we propose a multi-loss hybrid adversarial function (MLAdv) to search for effective adversarial attack samples that can potentially spoof networks. These adversarial attack samples are then added to the training data to improve the robustness and generalization of the network against unanticipated noise perturbations. In particular, MLAdv not only implements a multiple-loss function including style, origin, and detail losses to craft medical adversarial samples with realistic-looking styles, but also uses a heuristic projection algorithm to produce noise with strong aggregation and interference. These samples are shown to have stronger anti-noise ability and attack transferability. Evaluation on a COVID-19 dataset shows that networks augmented with adversarial attacks from the MLAdv algorithm improve the diagnosis accuracy by 4.75%. Therefore, an augmented network based on MLAdv adversarial attacks can improve model capability and is resistant to noise perturbations.
HUANG Yuqing , LI Huafeng , YUAN Ming , ZHANG Yafei
2023, 38(5):1112-1124. DOI: 10.16337/j.1004-9037.2023.05.010
Abstract:Existing single-image super-resolution reconstruction algorithms mostly pursue a high peak signal-to-noise ratio (PSNR) and pay little attention to image texture details during feature extraction, resulting in poor subjective perception of the reconstructed images. To solve this problem, this paper proposes a single-image super-resolution reconstruction algorithm based on a convolutional neural network with gradient and texture compensation. Specifically, three branches are designed for structure feature extraction, texture detail feature extraction and gradient compensation, and the proposed fusion module is then used to fuse the structure features and texture detail features. To prevent the loss of texture information in the reconstruction process, this paper proposes a texture detail feature extraction module to compensate for the texture detail information of the image and enhance the texture retention ability of the network. At the same time, the gradient information extracted by the gradient compensation module is used to enhance the structure information. In addition, a deep feature extraction structure is constructed, combining channel attention and spatial attention to screen and enhance the information in the deep features. Finally, a second-order residual block is used to fuse the structure and texture features, so that the feature information of the reconstructed image is more complete. The effectiveness and superiority of the proposed method are verified by comparative experiments.
2023, 38(5):1125-1141. DOI: 10.16337/j.1004-9037.2023.05.011
Abstract:Existing nonnegative matrix factorization methods mainly focus on learning the global structure of the data, while ignoring the learning of local information. Meanwhile, methods that attempt to exploit local similarity often adopt manifold learning, which suffers from some issues. To solve this problem, a new method named robust nonnegative matrix factorization with local similarity learning (RLS-NMF) is proposed. In this paper, a new local similarity learning method is adopted, which is starkly different from the widely used manifold learning. Moreover, the new method can simultaneously learn the global structural information of the data, and thus exploit the intra-class similarity and the inter-class separability of the data. To address the issues of outliers and noise effects in real-world applications, the
WANG Xinxin , SONG Xiaoying , CHAI Li
2023, 38(5):1142-1150. DOI: 10.16337/j.1004-9037.2023.05.012
Abstract:Attention deficit hyperactivity disorder (ADHD) seriously affects children’s development, so its effective diagnosis has received extensive attention. A new method for calculating graph similarity is proposed, which combines the topological information of brain networks with the signals on the network. The Pearson correlation coefficient is used to construct the fully connected brain network. Based on sparse representation, node subnetworks are extracted from the underlying structure, and the similarity of the subnetworks is calculated with a graph kernel function. Finally, a global index of brain network similarity is given. Classification experiments on the public ADHD-200 dataset, using the similarity between subjects as the feature, show that the proposed method can distinguish ADHD patients from healthy people with 93.1% accuracy, and that its classification performance is significantly superior to that of other existing methods. In addition, it is found that ADHD patients have stronger connections in brain regions such as the anterior central gyrus, thalamus, hippocampus and insula.
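The first step described in this abstract, building a fully connected brain network from Pearson correlation coefficients, can be sketched as follows. This is a generic illustration under the assumption that each brain region is represented by one time series of equal length; the names `pearson` and `connectivity_matrix` and the toy signals are hypothetical, and the sparse-representation and graph-kernel steps are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def connectivity_matrix(signals):
    """Edge weight (i, j) is the correlation between region i and region j."""
    n = len(signals)
    return [
        [1.0 if i == j else pearson(signals[i], signals[j]) for j in range(n)]
        for i in range(n)
    ]


# Three toy "regions": the second rises with the first, the third falls.
signals = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
matrix = connectivity_matrix(signals)
```

The resulting matrix is symmetric with ones on the diagonal, so every pair of regions is connected by a weighted edge.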
Li Ziyu , Ge Fen , Zhang Jindong , Zhao Jiachen
2023, 38(5):1151-1161. DOI: 10.16337/j.1004-9037.2023.05.013
Abstract:To meet the continuous intelligent anti-jamming decision-making and high real-time requirements of radar in highly dynamic environments, this paper constructs a deep Q network (DQN) model for radar intelligent anti-jamming decision-making, and proposes a hardware decision acceleration architecture based on a field programmable gate array (FPGA). In this architecture, an on-chip access mode is designed for the environment interaction of radar intelligent decision-making to improve real-time performance; it simplifies the iterative process of continuous decision-making of the DQN agent through on-chip quantitative storage and state iterative calculation for environment interaction. The proposed architecture also adopts both parallel computing and pipeline control acceleration for the agent's deep neural network, which further improves the real-time performance of decision-making. Simulation and experimental results show that, while preserving the accuracy of decision-making, the designed intelligent anti-jamming decision-making accelerator achieves a speedup of nearly 46 times in single decision-making and a speedup of nearly 84 times in continuous decision-making compared with the existing decision-making system based on the CPU platform.
LIU Tingci , YAO Gaofan , WU Wei , SONG Rongfang
2023, 38(5):1162-1171. DOI: 10.16337/j.1004-9037.2023.05.014
Abstract:The intelligent reflecting surface (IRS) is one of the most attractive key techniques for realizing smart radio environments. Deploying IRSs is an effective way to mitigate the Doppler effect in high-mobility environments. An IRS-assisted Doppler mitigation method has been proposed for high-mobility communications in the literature. However, the computational complexity of its IRS phase optimization is high because it uses a maximum likelihood estimator for partial channel parameters. A simplified IRS phase optimization method is proposed, and the phase expression is derived. The channel improved parameters can be obtained by a low-complexity channel estimation method. Compared with the existing scheme, the new scheme avoids complex estimation methods, prevents additional estimation errors, and effectively reduces the computational complexity. Numerical simulation results show that the new scheme can effectively reduce the program running time, while still achieving superior passive beamforming gain and strong robustness when pilot overhead is limited.
HOU Dacheng , ZHANG Haoyu , LIN Yifan , ZHANG Wanxiang
2023, 38(5):1172-1179. DOI: 10.16337/j.1004-9037.2023.05.015
Abstract:To speed up antenna modeling and optimization, this paper conducts a modeling study of antenna parameter optimization using commercially available antenna design software. Firstly, back propagation (BP) neural networks are optimized by several commonly used heuristic algorithms and used to improve the antenna parameters. The results are compared, and the best is obtained with the BP network optimized by the genetic algorithm (GABP). Secondly, the adaptive algorithm and the simulated annealing algorithm are used to optimize GABP. Finally, simulation tests verify that the adaptive GABP algorithm achieves the minimum error for antenna parameter optimization. The study provides a new method for antenna optimization in antenna design software with fewer errors. It has higher prediction accuracy and much faster fitting speed. The feasibility of this algorithm is also demonstrated by experimental comparison.
Gu Chuan , Guo Daoxing , WU Bingbing
2023, 38(5):1180-1190. DOI: 10.16337/j.1004-9037.2023.05.016
Abstract:To address the path planning problem of an unmanned aerial vehicle (UAV) with a limited field of view in an unknown environment, a particle swarm optimization (PSO) algorithm based on convex optimization is proposed to select path points. In the iterative optimization process, the fitness function of the particle swarm is designed based on the trajectory solved by convex optimization, obstacle avoidance, and the distance to the end point. The trajectory between the path points is displayed after the optimal path point is obtained. The obtained trajectory is used as part of simultaneous localization and mapping (SLAM) to build a more reliable environment map. Theoretical analysis and simulation results show that, compared with other intelligent algorithms and sample-based path planning algorithms, the proposed PSO based on convex optimization can effectively improve the efficiency of path planning and reduce the length of the planned path.
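The particle swarm optimization step named in this abstract can be illustrated with a plain PSO loop. The sketch below is a generic version under stated assumptions, not the paper's algorithm: the convex-optimization trajectory term and obstacle-avoidance term of the fitness function are omitted, and the fitness used here is only the squared distance to a hypothetical end point `GOAL`. The function name `pso_minimize` and all parameter values are assumptions.

```python
import random


def pso_minimize(fitness, dim, bounds, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `fitness` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    low, high = bounds
    pos = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # best position seen by each particle
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp the particle to the search box.
                pos[i][d] = min(high, max(low, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


GOAL = (3.0, 4.0)  # hypothetical end point


def distance_to_goal(p):
    return (p[0] - GOAL[0]) ** 2 + (p[1] - GOAL[1]) ** 2


best_point, best_value = pso_minimize(distance_to_goal, dim=2, bounds=(-5.0, 5.0))
```

In a path planner, additional penalty terms would be added to the fitness function so that candidate path points near obstacles score worse than free ones.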
2023, 38(5):1191-1205. DOI: 10.16337/j.1004-9037.2023.05.017
Abstract:Data-driven methods make it more convenient and effective for decision-makers to obtain information. Under the theoretical framework of the graph model for conflict resolution, this paper firstly mines conflict strategies using a data-driven approach, and realizes the rational construction of conflict strategies. Secondly, considering that in real conflicts a decision-maker's choice of a certain strategy is better described as a possibility of that strategy being selected, this paper integrates hesitant fuzzy linguistic information with the theory of the graph model for conflict resolution, and uses the hesitant fuzzy linguistic information for evaluation. Based on rough set theory, the hesitant fuzzy semantic evaluation information is aggregated to represent this possibility. Furthermore, a new option prioritizing method for the graph model for conflict resolution based on hesitant fuzzy linguistic information is proposed. Finally, the cross-border water pollution of the Shu River is modeled and analyzed to compare the novel and classic methods, so as to verify the rationality of the method proposed in this paper.
ZHONG Zhaoman , HUANG Xianbo , XIONG Yulong
2023, 38(5):1206-1213. DOI: 10.16337/j.1004-9037.2023.05.018
Abstract:In order to accurately analyze the sentiment of Internet users towards different objects in breaking events, a method of fine-grained sentiment analysis of breaking events based on RoBERTa word embedding and interactive attention is proposed. By constructing a RoBERTa-CRF comment object extraction model, the extraction of comment objects related to breaking events is completed. The RoBERTa-IAN model is constructed using the interactive attention mechanism and a pre-trained model to achieve the sentiment analysis of comment objects. Finally, the sentiments of Internet users towards different objects in breaking events are analyzed and visualised. On the constructed Weibo news comment dataset, the F1 values of the RoBERTa-CRF comment object extraction model and the RoBERTa-IAN sentiment analysis model are 0.76 and 0.79, respectively.
DING Jingxian , LI Xiang , SUN Jizhou , ZHOU Hong
2023, 38(5):1214-1225. DOI: 10.16337/j.1004-9037.2023.05.019
Abstract:Expert recommendation is a research hotspot in the field of recommendation systems. The rationality of expert information feature extraction directly affects the accuracy of recommendation. However, most expert recommendation methods do not build text graphs of feature relations for multi-source information, and ignore the correlation between attribute features. Additionally, most expert recommendation methods cannot expand the features of the knowledge field according to the relevance of the text graph. Therefore, we propose CMFBG, an expert recommendation method combining multi-features and bi-directional graph classification. Specifically, CMFBG obtains multi-feature information of experts through multi-source information fusion, and constructs text graphs for different attribute features within categories. Then, CMFBG employs bidirectional encoder representation from transformer (BERT) and graph convolutional network (GCN) models to extract features and fuse them. Finally, CMFBG employs the bidirectional attention mechanism to enhance the extension of the source data to the graph features and realize the classification of the graph structure. The experimental analysis on the same expert data set shows that the precision of CMFBG is 91.71%, higher than that of other algorithms in the task of graph classification.
Xia Zhengxin , Su Chong , Liu Yong
2023, 38(5):1226-1234. DOI: 10.16337/j.1004-9037.2023.05.020
Abstract:Document relationship extraction (DRE) is designed to identify the relationships between entities across multiple sentences, where an entity may correspond to multiple mentions across sentence boundaries. Pronoun entity mentions are a common grammatical phenomenon arising from the connection between sentences, and are also an important factor affecting sentence reasoning. However, most previous studies focused on the relationships between common entity references, and paid little attention to the co-reference and relation capture of pronoun entity references. Therefore, we propose a contextual coreference entity dependency (CCED) model, which integrates common entity and pronoun entity representations to build a context graph structure of co-referring entity dependency, and carries out global interactive reasoning between entity pairs on the graph, so as to model the interdependence of entity relations. We evaluated the CCED model on the public datasets DocRED, DialogRE and MPDD. The results showed that, compared with DocuNet-BERT, the best baseline model, the CCED model improved Ign F1 by 0.55% and F1 by 0.35% on the DocRED dataset. On the DialogRE and MPDD datasets, the CCED model improved F1 by 1.02% on the DialogRE test set and ACC by 1.19% on the MPDD test set compared with COLN, the best-performing baseline model. The experimental results verify the effectiveness of the new model for document-level relationship extraction.
Mailing Address: 29 Yudao Street, Nanjing, China
Post Code: 210016 Fax: 025-84892742
Phone: 025-84892742 E-mail: sjcj@nuaa.edu.cn