Volume 38, Issue 6, 2023 Table of Contents

Research Progress of Adversarial Attack and Defense for Signal Modulation Recognition

Jiang Han, Hu Lin, Li Wen, Jiao Yutao, Xu Yuhua, Xu Yifan

2023, 38(6):1235-1256. DOI: 10.16337/j.1004-9037.2023.06.001


Abstract: This paper reviews adversarial example attacks on modulation recognition, an active research topic. First, we introduce the concepts and terminology of adversarial examples for modulation recognition. We then survey existing work on adversarial attack and defense methods, classify the attack methods, and explain how each generates adversarial examples. Finally, drawing on existing research, open opportunities and challenges, and the strengths of artificial intelligence algorithms, we outline the technical directions and prospects for adversarial attacks in next-generation intelligent wireless communications.


Dimension Reduced Fourth-Order Cumulant Near-Field Source Localization Method

LI Wanru, DENG Ke, YIN Qinye, ZHANG Yan

2023, 38(6):1257-1267. DOI: 10.16337/j.1004-9037.2023.06.002


Abstract: To address the low degrees of freedom and low accuracy of near-field source localization, a localization algorithm based on a fourth-order cumulant matrix is proposed. First, a high-dimensional virtual covariance matrix is constructed whose equivalent steering vector contains both direction of arrival (DOA) and distance information. For angle estimation, a one-dimensional search based on rank deficiency is proposed that searches over the reciprocal of the minimum singular value, which reduces the computational burden. The degrees of freedom are increased, and the fact that the high-order cumulants of Gaussian noise are zero is exploited to improve estimation performance at low signal-to-noise ratio. For distance estimation, the distance information contained in the singular vector obtained from the singular value decomposition in the angle-estimation step is used directly, without additional computation, and the distance is estimated by the least squares method. Simulation results show that the method estimates both the angle and the distance of a near-field source with only a one-dimensional search over a high-order cumulant matrix, which reduces the computational burden and improves estimation accuracy compared with existing algorithms. Moreover, the proposed method has twice as many degrees of freedom as the reduced-dimension MUSIC method.


Entropy-Rate Function of FIR Actuator

Zhang Desen, Xu Dazhuan, Liu Tian, Zhao Manman

2023, 38(6):1268-1275. DOI: 10.16337/j.1004-9037.2023.06.003


Abstract: The trade-off among control accuracy, bandwidth, and transmission rate in control systems is an important open question. This paper studies the trade-off between information transmission rate and distortion from an information-theoretic perspective. The actuator of the control system is analyzed using the entropy-rate function, and an entropy-rate analysis framework is established for a class of reproducible probability distribution functions. The derived entropy-rate inequality describes the relationship between posterior entropy and the desired transmission rate in a finite impulse response (FIR) actuator. Based on the derived expressions, the mutual information of the tracking system is simulated under different signal-to-noise ratios, and the simulation results agree with the values predicted by the expressions.


Transmission Method of UAV Based on Cooperation of Relay and IRS

LUO Yijie, HOU Zhifeng, YANG Yang

2023, 38(6):1276-1285. DOI: 10.16337/j.1004-9037.2023.06.004


Abstract: Unmanned aerial vehicle (UAV) communication and intelligent reflecting surface (IRS)-aided communication are typical air-ground transmission schemes in 6G mobile communication systems. Cooperation between UAVs and IRSs can enhance the transmission performance of air-ground communication networks. In this paper, a full-duplex UAV relay cooperates with the passive reflection of an IRS, and the transmit power and the number of IRS reflecting elements are jointly optimized to improve energy efficiency. Simulation results show that the proposed method converges quickly and outperforms the UAV-relay-only and IRS-aided-only schemes.


Computation Offloading Algorithm for Multi-UAV Network Based on Edge Intelligence

WANG Xinyi, CHEN Zhijiang, LEI Lei, SONG Xiaoqin

2023, 38(6):1286-1298. DOI: 10.16337/j.1004-9037.2023.06.005


Abstract: Large-scale deployment of fixed edge computing nodes suffers from high cost, poor mobility, and difficulty in coping with emergencies. To address these problems and meet the needs of computation-intensive and delay-sensitive mobile services, a computation task offloading algorithm based on deep reinforcement learning is proposed. Subject to constraints such as the flight range, flight speed, and system fairness benefits of multiple unmanned aerial vehicles (UAVs), the method aims to minimize the weighted sum of the average network computing delay and the UAV energy consumption. This non-convex, non-deterministic polynomial (NP)-hard problem is transformed into a partially observable Markov decision process, and a multi-agent deep deterministic policy gradient algorithm is used to optimize the mobile users' offloading decisions and the UAV flight trajectories. Simulation results show that the proposed algorithm outperforms the baseline algorithms in terms of fairness among mobile service terminals, average system delay, and total energy consumption of the UAVs. In particular, the proposed algorithm achieves the best power consumption control across different computing performance settings. When the CPU frequency is 12.5 GHz, its energy consumption is 29.16% lower than that of the Cruise algorithm and 8.67% lower than that of the advantage actor-critic (A2C) algorithm.


Achievable Rate Analysis for GSIC-Based Cell-Free Massive MIMO-NOMA Downlink Systems

LIU Chengcheng, TAI Qixin, LIU Liu, SONG Rongfang

2023, 38(6):1299-1306. DOI: 10.16337/j.1004-9037.2023.06.006


Abstract: This paper investigates the integration of cell-free massive multiple-input multiple-output (MIMO) and non-orthogonal multiple access (NOMA). NOMA based on group successive interference cancellation (GSIC) is applied to downlink cell-free massive MIMO systems. Furthermore, a novel grouping method based on users' equivalent path loss is developed, and an expression for the per-user achievable rate is derived. Simulation results show that downlink cell-free massive MIMO-NOMA systems based on GSIC achieve a higher achievable rate than those based on successive interference cancellation (SIC).


Optimal Design of Generalized Polynomial Broadband Beamformers with Robustness

Xu Zhiqiang, Chen Huawei

2023, 38(6):1307-1318. DOI: 10.16337/j.1004-9037.2023.06.007


Abstract: Traditional polynomial broadband beamformers are generally designed with finite impulse response (FIR) filters. This paper proposes a generalized polynomial broadband beamformer that replaces the FIR filters of the traditional design with orthogonal basis filters. The proposed beamformer can be regarded as an extension of the traditional polynomial broadband beamformer, so its structure is more flexible. To enhance its robustness, a robust optimization design method based on an average performance optimization criterion is further proposed. In this method, the poles of the orthogonal basis filters are optimized by particle swarm optimization, so that the degrees of freedom provided by the poles can be used to improve beamformer performance. Introducing a spatial derivative constraint on the array response reduces the mainlobe pointing deviation caused by finite polynomial interpolation. Simulation results show that, compared with the traditional polynomial-structure design method, the proposed method achieves better frequency invariance and robustness, effectively improves the mainlobe pointing accuracy, and improves the directivity index of polynomial broadband beamformers.


A Space-Frequency Anti-jamming Algorithm Based on Variable Step LMS of Tongue-Like Curve Function

GUO Chenfeng, SHU Dongliang, LU Yin, JIN Xiaoqin

2023, 38(6):1319-1330. DOI: 10.16337/j.1004-9037.2023.06.008


Abstract: Space-frequency algorithms based on least mean square (LMS) cannot achieve good anti-jamming performance and fast convergence at the same time. To solve this problem, a space-frequency anti-jamming algorithm based on variable-step LMS with a tongue-like curve function, called the space-frequency TLCVSLMS algorithm, is proposed. While balancing anti-jamming performance and convergence speed, the algorithm avoids the difficulty of manually selecting a suitable fixed iteration step-size factor for each frequency point, and it adjusts the amplitude factor and shape factor of the tongue-like curve function more precisely according to the signal power at each frequency point. Simulation results show that, at comparable anti-jamming performance, the space-frequency TLCVSLMS algorithm needs at least 400 fewer iterations than the space-frequency LMS algorithm and therefore converges faster. At comparable convergence speed, its anti-jamming performance is at least 3 dB better than that of the space-frequency LMS algorithm.
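
Editorial illustration (not taken from the paper): the idea of an error-dependent step size can be sketched with a plain time-domain LMS filter. The step function below, mu = beta * (1 - 1 / (alpha * e^2 + 1)), is an inverted tongue-like (witch of Agnesi) shape that is assumed here for illustration; the paper's exact function, its space-frequency structure, and its per-frequency adjustment of the amplitude factor (`beta`) and shape factor (`alpha`) are not reproduced. The 3-tap channel and all parameter values are hypothetical.

```python
import random


def tongue_step(error, alpha, beta):
    """Assumed step-size curve: 0 at zero error, saturating at beta for large error."""
    return beta * (1.0 - 1.0 / (alpha * error * error + 1.0))


def vs_lms(x, d, num_taps, alpha=100.0, beta=0.05):
    """Identify an FIR system with LMS, using the error-dependent step above."""
    w = [0.0] * num_taps
    for n in range(num_taps - 1, len(x)):
        window = [x[n - k] for k in range(num_taps)]  # x[n], x[n-1], ...
        y = sum(wk * xk for wk, xk in zip(w, window))
        e = d[n] - y
        mu = tongue_step(e, alpha, beta)  # large step far from the optimum, small step near it
        w = [wk + mu * e * xk for wk, xk in zip(w, window)]
    return w


# Toy system-identification run with a hypothetical 3-tap channel.
rng = random.Random(0)
h_true = [0.5, -0.3, 0.2]
x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
d = [
    sum(h_true[k] * x[n - k] for k in range(3) if n - k >= 0) + rng.gauss(0.0, 0.01)
    for n in range(len(x))
]
w_est = vs_lms(x, d, num_taps=3)
```

The step is bounded by `beta`, so stability is governed by the same condition as fixed-step LMS, while the shrinking step near convergence limits the steady-state error.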


Two-Dimensional Data Transmission Method with Constellation Rotation Mapping

LIU Fang, CHEN Lizhi, MU Lin, FENG Yongxin

2023, 38(6):1331-1341. DOI: 10.16337/j.1004-9037.2023.06.009


Abstract: To increase the number of bits transmitted per second in direct sequence spread spectrum (DSSS) systems and to enhance the security of information transmission, a mapping transmission mechanism is established and a two-dimensional data transmission method with constellation rotation mapping is proposed. Two-dimensional data is added while the one-dimensional data is being transmitted, and a relationship model is established using M-ary conversion and constellation rotation. The constellation is selected according to the ratio of the one-dimensional data rate to the two-dimensional data rate, and the two-dimensional data is then converted into mapping data by constellation rotation mapping. This yields the corresponding pseudo-code channel, which carries the one-dimensional data directly and the two-dimensional data through the mapping. Simulation results show that, compared with the traditional DSSS system, the proposed method achieves a higher data transmission rate and better bit error rate performance while also providing better confidentiality.


Maximum Generalized Correntropy Spectrum Sensing Based on Stochastic Resonance Under α Noise

LI Ruxue, LU Jin, LUO Cong

2023, 38(6):1342-1352. DOI: 10.16337/j.1004-9037.2023.06.010


Abstract: Spectrum sensing under α noise has become a hot topic in recent years. The statistical model of this noise has pronounced impulsive and heavy-tailed characteristics, and signal features are not distinct enough under weak-signal conditions. To address this, a maximum generalized correntropy spectrum sensing method based on stochastic resonance is proposed. The method uses the transitions of particles between the two potential wells of the stochastic resonance model to transfer part of the energy of the α noise into the signal, which improves the output signal-to-noise ratio. The maximum generalized correntropy method is then used to construct high-order statistics for spectrum sensing: it detects the output signal after stochastic resonance and is combined with the conjugate gradient descent method to optimize the objective function. Simulation results demonstrate that the proposed algorithm effectively improves detection performance at low signal-to-noise ratio.


Blockchain-Based Collaborative Caching for Multi-edge Server Video Streaming

GUO Yong’an, ZHOU Yi, WANG Quan, WANG Yu’ao, CHENG Yao, ZHU Hao

2023, 38(6):1353-1368. DOI: 10.16337/j.1004-9037.2023.06.011


Abstract: With the growth of Internet video traffic and rising user expectations for quality of experience, the traditional backbone network is under great pressure. Mobile edge caching can reduce latency, lighten the backhaul link load, and improve the quality of experience for video users. However, the limited cache resources of edge servers, the dynamic nature of video requests, and users' concern for the security of cached data pose new challenges for edge caching strategies. To address these problems, this paper proposes a blockchain-assisted collaborative video stream caching optimization scheme for multiple edge servers. A four-layer network architecture is constructed, consisting of a content delivery network (CDN) server, edge servers, user devices, and a blockchain. A blockchain consensus mechanism is introduced to protect paid videos whose requests are delay-insensitive and to ensure the security of user data. Based on a three-layer cache mechanism of local hit, proximity hit, and CDN hit, collaboration among edge servers is further strengthened by defining proximity hit reward factors, which improves the cache hit ratio of the edge servers. The state of the edge servers, changes in content popularity, and the resource allocation of cooperative caching among multiple edge servers are considered jointly to formulate an optimization problem that minimizes access latency, traffic cost, and system energy consumption. A cooperative cache optimization algorithm based on multi-agent proximal policy optimization (MAPPO) is used to solve the problem. Simulation results show that, compared with existing caching strategies, the proposed scheme effectively improves the cache hit rate of video streaming and reduces energy consumption and delay.


IoT Resource Discovery Algorithm Based on Latent Factor Model

Shan Tao, Qian Qijie, Jing Shenqi, Ye Jiyuan, Guo Yong’an, Liu Yun

2023, 38(6):1369-1379. DOI: 10.16337/j.1004-9037.2023.06.012


Abstract: The traditional keyword-based “passive” semantic service search technology of the Internet is no longer suitable for the Internet of Things (IoT) environment because of the sharp growth in the number of sensors and the frequent changes in device state. The key to resource discovery in the IoT is how to use and analyze the large amount of interaction information between users and devices to recommend the most relevant device resources according to users' preferences. A representation model of user-device interaction based on hypergraph theory is presented, together with its corresponding representation matrix. Based on this model, the resource recommendation problem is formulated as a correlation degree prediction problem based on matrix decomposition. The alternating least squares (ALS) method from optimization theory is then introduced to solve this optimal decomposition problem. Finally, an IoT resource recommendation algorithm based on the latent factor model is proposed. Simulations show that the proposed approach outperforms the item-based collaborative filtering (ItemCF) algorithm in terms of root mean square error (RMSE) and mean absolute error (MAE).
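
Editorial illustration (not taken from the paper): the generic ALS step for a latent factor model can be sketched as follows. This is textbook regularized matrix factorization over the observed entries of a user-device matrix; the hypergraph-based representation matrix from the paper is not reproduced, and the matrix sizes, `rank`, `reg`, and the synthetic data are all assumptions.

```python
import numpy as np


def als(R, mask, rank=2, reg=0.1, iters=50, seed=0):
    """Factor R ~ U @ V.T over observed entries (mask == True) by alternating ridge solves."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = rng.normal(scale=0.1, size=(m, rank))
    V = rng.normal(scale=0.1, size=(n, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        for i in range(m):  # fix V, solve a small ridge regression per user
            Vi = V[mask[i]]
            U[i] = np.linalg.solve(Vi.T @ Vi + ridge, Vi.T @ R[i, mask[i]])
        for j in range(n):  # fix U, solve a small ridge regression per device
            Uj = U[mask[:, j]]
            V[j] = np.linalg.solve(Uj.T @ Uj + ridge, Uj.T @ R[mask[:, j], j])
    return U, V


def rmse(R, mask, U, V):
    """Root mean square error over the observed entries only."""
    diff = (R - U @ V.T)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic rank-2 "user x device" relevance matrix with about 70% of entries observed.
rng = np.random.default_rng(1)
R = rng.normal(size=(8, 2)) @ rng.normal(size=(2, 6))
mask = rng.random(R.shape) < 0.7
U, V = als(R, mask)
fit_error = rmse(R, mask, U, V)
```

Each half-step has a closed-form solution, which is why ALS is a common choice for this kind of decomposition; unobserved entries are then predicted by `U @ V.T`.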


GPU-Based Real-Time Imaging Algorithm for Long-Track SAR

TAN Yunxin, HUANG Haifeng, LAI Tao, DAN Qihong, OU Pengfei

2023, 38(6):1380-1391. DOI: 10.16337/j.1004-9037.2023.06.013


Abstract: To meet the fast imaging requirements of long-track, ultra-high-resolution W-band synthetic aperture radar (SAR), this paper proposes a graphics processing unit (GPU)-based ω-K real-time imaging algorithm that adopts a parallel architecture with dual-stream, multithreaded processing. The default stream processes the data in the order dictated by the physical principle of the algorithm. It first parallelizes range compensation, error correction, zero padding, and other operations, and then adopts a one-layer nested interpolation method. By maintaining the dependencies between successive steps and managing synchronization, it achieves a speedup of about 30. The blocking stream starts at the same time as the default stream, generates the parameters and functions that the default stream requires, and stores them in video memory before they are needed, which greatly reduces the running time of the algorithm. Meanwhile, events set on the default stream allow the two streams to execute in parallel while staying synchronized. Experimental results show that the overall speedup of the algorithm reaches about 13 and that the relative errors of amplitude and phase are close to 0, so the algorithm offers good real-time and focusing performance while maintaining good imaging quality.


MAFDNet: A New Method of Image Adaptive Classification in Complex Environment

YE Jihua, LI Xin, CHEN Jin, JIANG Aiwen, HUA Zhizhang, WAN Wentao

2023, 38(6):1392-1405. DOI: 10.16337/j.1004-9037.2023.06.014


Abstract: In complex environments, difficult samples and simple ones often coexist. Existing classification methods are designed mainly for difficult samples, so the resulting networks waste computing resources when they classify simple ones. Network pruning and weight quantization, however, cannot balance accuracy against storage cost. To use computing resources more efficiently while improving accuracy, this paper focuses on the spatial redundancy of input samples and proposes MAFDNet, an adaptive image classification network for complex environments. It introduces confidence as the criterion for judging classification accuracy and proposes an adaptive loss function composed of content loss, fusion loss, and classification loss. MAFDNet consists of three subnets. Input images are first sent to the low-resolution subnet, which efficiently extracts low-resolution features. Samples with high confidence are identified first and exit the network early, while samples with low confidence enter the higher-resolution subnets in turn. The high-resolution subnet is able to identify difficult samples. MAFDNet thus combines resolution adaptation and depth adaptation. Experiments show that MAFDNet improves top-1 accuracy on the CIFAR-10, CIFAR-100, and ImageNet datasets under the same computing resource conditions.


Hyperspectral Image Fusion via Deep Unfolding and Dual-stream Networks

LIU Cong, YAO Jiahao

2023, 38(6):1406-1421. DOI: 10.16337/j.1004-9037.2023.06.015


Abstract: Hyperspectral image fusion algorithms based on deep learning typically stack multiple convolutional layers to learn mapping relationships; they do not fully exploit the characteristics of the task and lack interpretability. To address these problems, this paper proposes a deep network that combines deep unfolding with a dual-stream network. First, an image fusion model is established using convolutional sparse coding, which maps low-resolution hyperspectral images (LR-HSI) and high-resolution multispectral images (HR-MSI) into a low-dimensional subspace. The fusion model considers both the information common to LR-HSI and HR-MSI and the information unique to LR-HSI, and it adds HR-MSI to the model as auxiliary information. Next, the fusion model is unfolded into a learnable, interpretable deep network. Finally, the dual-stream network is used to obtain more accurate high-resolution hyperspectral images (HR-HSI). Experiments show that the network achieves excellent results on the hyperspectral image fusion task.


Facial Expression Recognition Under Complex Scenes Based on Multi-region Detection Network

PAN Xinchen, QIN Ling, YANG Xiaojian

2023, 38(6):1422-1433. DOI: 10.16337/j.1004-9037.2023.06.016


Abstract: Facial expressions are the most intuitive representation of human emotional states, and convolutional neural networks have shown excellent performance in facial expression recognition. However, occlusion and pose changes in complex scenes remain two major problems in automatic facial expression recognition: they significantly change the appearance of faces and affect the final recognition results. To address them, a facial expression recognition method based on dual attention and a multi-region detection network is proposed. Dual attention improves the feature extraction capability of the overall network, enabling it to focus on more detailed feature information. Multi-region detection adaptively captures the important local regions of faces under occlusion and pose changes and suppresses their negative effects. Finally, the effectiveness of the proposed method is verified on three public natural-scene facial expression datasets: AffectNet, RAF-DB, and SFEW.


Improved Lightweight Traffic Sign Detection Algorithm of YOLOv5

Jia Zihao, Wang Wenqing, Liu Guangcan

2023, 38(6):1434-1444. DOI: 10.16337/j.1004-9037.2023.06.017


Abstract: With the rapid development of science, technology, and artificial intelligence, interest in driverless technology keeps growing. For safety reasons, traffic signs must be detected in real time during driving. To this end, a lightweight traffic sign detection algorithm is proposed that improves on the YOLOv5 model. An attention mechanism is added to the feature fusion part of the model to make target features more prominent. A lightweight sub-pixel convolution layer is then added in front of the detection layer to effectively improve the resolution of the detection feature map without increasing the amount of computation. Finally, the complete intersection over union (CIoU) loss function is improved, which speeds up network convergence and yields a better convergence result than before the improvement. Experimental results show that the accuracy of the model reaches 90.6%, which is 14.5% higher than that of the baseline network, and that the detection speed reaches 70 frames/s, which basically meets the requirements for real-time, accurate detection of traffic signs.


ValidFlow: Unsupervised Image Defect Detection Based on Normalizing Flows

ZHANG Lanyao, CHEN Xiaoling, ZHANG Damin, CEN Yigang, ZHANG Linna, HUANG Yansen

2023, 38(6):1445-1457. DOI: 10.16337/j.1004-9037.2023.06.018


Abstract: The CS-Flow method, which is based on normalizing flows, has achieved good results in defect detection, but its repeated stacking of a single type of coupling block increases the complexity of the network. We therefore propose ValidFlow, a network built by stacking two types of coupling blocks: the feature advection flow (FA flow) and the feature blending flow (FB flow). In the subnetwork of the FA flow, the shortcut branches for upsampling and downsampling are removed and depthwise separable convolution is introduced. The subnetworks within the FB flow are fused across three scales. This allows ValidFlow to reduce the number of parameters while keeping the information well mixed. ValidFlow is compared with existing methods on the MVTec AD, MTD, and DAGM datasets. On MVTec AD, its average AUROC over the 15 categories is 99.2%, and its AUROC reaches 100% in four categories. On MTD, its AUROC is 99.6%. Compared with CS-Flow, ValidFlow has 207.61M fewer parameters and an inference speed that is 22 FPS higher. On DAGM, its average AUROC over the 10 categories is 99.0%, which is very close to the performance of supervised methods.


Multi-scale Expressive Chinese Speech Synthesis

GAO Jie, XIAO Dajun, XU Xialing, LIU Shaohan, YANG Qun

2023, 38(6):1458-1468. DOI: 10.16337/j.1004-9037.2023.06.019


Abstract: Common methods for enhancing the expressiveness of synthesized speech typically encode the reference audio as a fixed-dimensional prosody embedding. This embedding is fed into the decoder of the speech synthesis model together with the text embedding, thereby introducing prosody information into the synthesis process. However, this approach captures prosody only at the global level of the utterance and neglects fine-grained prosody details at the word or phoneme level. Consequently, the synthesized speech may still exhibit unnatural pronunciation and flat intonation on certain words. To tackle these issues, this paper introduces a multi-scale expressive Chinese speech synthesis method based on Tacotron2. First, two variational auto-encoders extract global-level prosody information and phoneme-level pitch information from the reference audio. This multi-scale variational information is then incorporated into the speech synthesis model. In addition, during training we minimize the mutual information between the prosody embedding and the pitch embedding, which removes the correlation between the two feature representations and disentangles them. Experimental results show that the proposed method improves the subjective mean opinion score by 2% and reduces the F₀ frame error rate by 14% compared with the single-scale expressive speech synthesis method. These findings suggest that the method generates more natural and expressive speech.


Speech Steganalysis Method for Echo Hiding Based on Image of Cepstrum

Tang Junhao, Du Qingzhi, Long Hua, Shao Yubin, Li Yimin

2023, 38(6):1469-1481. DOI: 10.16337/j.1004-9037.2023.06.020


Abstract: After echo hiding, the cepstrum coefficients of a speech signal peak at the echo delay. Traditional echo hiding steganalysis mainly uses statistical characteristics of the cepstrum coefficients as steganalysis features. However, when the echo amplitude is low, the cepstral peak of the stego signal is not obvious, and the detection performance of methods based on statistical characteristics is unsatisfactory. This paper combines cepstrum analysis with image recognition and proposes a steganalysis method for speech echo hiding based on cepstrum images. The speech signal is divided into frames and windowed for cepstrum calculation. An image is then generated with time on the horizontal axis, cepstrum sequence index on the vertical axis, and cepstrum coefficient amplitude as the gray level. The generated cepstrum image is used as the steganalysis input, and a residual neural network is used as the classifier. Experimental results show that the detection accuracy for three classical echo hiding algorithms reaches 98.2%, 98.6%, and 96.1%, respectively, at low echo amplitude. This is a large improvement over traditional echo hiding steganalysis methods, whose detection performance at low echo amplitude is unsatisfactory.
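
Editorial illustration (not taken from the paper): the cepstral peak at the echo delay can be reproduced on a toy signal. The sketch assumes a white-noise frame in place of speech, a single echo, and a deliberately strong echo amplitude of 0.5 so that the peak is unambiguous; the low-amplitude case that motivates the paper, the cepstrum-image construction, and the residual-network classifier are not shown. The function names are hypothetical.

```python
import numpy as np


def real_cepstrum(frame):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(frame)
    return np.fft.ifft(np.log(np.abs(spectrum) + 1e-12)).real


def add_echo(signal, delay, amplitude):
    """Single-echo embedding: y[n] = x[n] + amplitude * x[n - delay]."""
    out = signal.copy()
    out[delay:] += amplitude * signal[:-delay]
    return out


rng = np.random.default_rng(0)
host = rng.normal(size=4096)  # white-noise stand-in for one speech frame
stego = add_echo(host, delay=50, amplitude=0.5)

ceps = real_cepstrum(stego)
search = ceps[1:len(ceps) // 2]  # skip quefrency 0 and the mirrored half
estimated_delay = int(np.argmax(search)) + 1
```

The echo multiplies the spectrum by (1 + a·e^{-jωd}), so its logarithm adds a ripple whose inverse transform concentrates at quefrency d; lowering `amplitude` shrinks this peak toward the noise floor.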


Research and Application of Imbalanced Credit Data Classification Based on NaN-Bicluster SMOTE

HE Liang, XU Haiyan, CHEN Lu

2023, 38(6):1482-1494. DOI: 10.16337/j.1004-9037.2023.06.021


Abstract: To assess borrowers' credit risk using imbalanced data, we propose NaN-Bicluster SMOTE, an improved synthetic minority oversampling technique (SMOTE) that incorporates natural neighbor (NaN) and bicluster. First, we use the parameter-free NaN to set logical rules for selecting the samples to be oversampled, which avoids the instability caused by partitioning samples with r nearest neighbors. Second, based on the stable-structure neighbor relationship, we set logical rules that specify a safe range to prevent synthetic samples from becoming noise samples. Then, we use bicluster to mine local rules, let the synthetic samples inherit these local rules, and improve the synthesis formula. Finally, we apply several sampling methods and machine learning models, run experiments with NaN-Bicluster SMOTE and the comparison models on Prosper's credit data, and use statistical tests to verify the performance of NaN-Bicluster SMOTE.
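
Editorial illustration (not taken from the paper): for reference, the baseline SMOTE synthesis step that the abstract sets out to improve can be sketched as below. It interpolates between a minority sample and one of its k nearest minority neighbors; the natural-neighbor selection rules, safe range, and bicluster-based formula of NaN-Bicluster SMOTE are not reproduced. The sample points and parameter values are made up.

```python
import math
import random


def smote(minority, k, n_new, seed=0):
    """Baseline SMOTE: x_new = x + lam * (neighbor - x), with lam drawn from U(0, 1)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the chosen sample (Euclidean distance)
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        mate = rng.choice(neighbors)
        lam = rng.random()
        synthetic.append(tuple(b + lam * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Hypothetical two-feature minority class.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, k=2, n_new=5)
```

Because every synthetic point lies on a segment between two existing minority samples, the result depends directly on the fixed neighbor count `k`, which is the sensitivity the parameter-free neighbor rule in the abstract is meant to avoid.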
