• Volume 34, Issue 4, 2019 Table of Contents
    • Artificial Intelligence (Machine Learning and Pattern Recognition)
    • Vietnamese Multi-category Words Disambiguation Combined with Language Features

      2019, 34(4):577-584. DOI: 10.16337/j.1004-9037.2019.04.003

      Abstract: Disambiguation of multi-category words directly affects part-of-speech (POS) tagging accuracy. This paper proposes a statistical disambiguation method that incorporates the linguistic characteristics of Vietnamese multi-category words. First, a Vietnamese multi-category word dictionary and a Vietnamese multi-category word corpus are built, and effective feature sets are selected by analyzing the Vietnamese language and its multi-category words. Second, exploiting the ability of the conditional random field (CRF) model to incorporate arbitrary features, syntactic and lexical features are introduced in addition to word and POS features, and the disambiguation model is built. Finally, tests on a real multi-category word corpus yield an accuracy of 87.23%. Experimental results show that the proposed Vietnamese multi-category word disambiguation model is effective and feasible and can improve the accuracy of POS tagging.

    • Group Behavior Recognition Method Based on ActionVLAD Pooling and Hierarchical Deep Learning Network

      2019, 34(4):585-593. DOI: 10.16337/j.1004-9037.2019.04.004

      Abstract: In group behavior recognition, the behavior of the entire group can be inferred by detecting the behavior of each person in the group over a period of time. An end-to-end deep learning network combining an action vector of locally aggregated descriptors (ActionVLAD) pooling layer with multi-layer long short-term memory (LSTM) is constructed to solve the group behavior recognition problem. In addition to the traditional single-image red-green-blue (RGB) input, dense optical flow (Dense_flow) is added to describe the motion between video frames, and together they form the input of a two-stream network. The feature information is modeled by the bottom LSTM, and individual behavior is represented by the fused two-stream features. The ActionVLAD pooling layer fuses features from different times and different image positions, which better integrates the information of individuals. Finally, the top LSTM is connected to a Softmax classifier, which determines the group activity from the merged individual information. Tests on the Collective Activity dataset yield an average recognition accuracy of 82.3%.

    • Transcriptome Expression Analysis of ISO-seq Data with Non-Full-Length Reads Reserved

      2019, 34(4):594-604. DOI: 10.16337/j.1004-9037.2019.04.005

      Abstract: In recent years, ISO-seq data based on single-molecule sequencing have been widely used for novel isoform detection because of their long read length. Most current studies use only full-length reads, so much of the information in non-full-length reads is lost. To address this problem, two models, DSIDP and MCIDP, are proposed to predict isoform structures and calculate their expression levels while retaining non-full-length reads. Both models establish a predicted isoform set from full-length reads and calculate expression levels using all reads, both full-length and non-full-length. DSIDP maps all reads to the set and solves the multi-mapping problem with Dirichlet sampling. By using Markov chains to simulate alternative splicing between gene exons, MCIDP can also predict isoforms that are not supported by any full-length read in the raw data. Both models are validated on simulated and real data.

    • Low Heart Rate Variability Heart Sound Segmentation Method Using DHMM

      2019, 34(4):605-614. DOI: 10.16337/j.1004-9037.2019.04.006

      Abstract: To address the limited precision of existing heart sound localization and segmentation methods, a method for modeling and segmenting heart sound signals with low heart rate variability is proposed. First, the effective intrinsic mode function (IMF) components obtained by ensemble empirical mode decomposition (EEMD) are used to characterize the heart sound signal and make it easier to analyze. Then, a Gaussian mixture model (GMM) is established from the Gaussian constraint relationship between the basic and non-basic heart sounds. Next, the hidden Markov model (HMM) is optimized and a duration-dependent hidden Markov model (DHMM) is established, which describes the segmentation model more concisely and reduces the complexity of the algorithm. Finally, S1, the systolic phase, S2, and the diastolic phase are distinguished by time-domain features. The proposed algorithm is compared with the classical Hilbert method and the logistic regression hidden semi-Markov model (LRHSMM). Experimental results show that the proposed algorithm performs better on evaluation indicators such as detection accuracy and computation time.

    • Blind Enhancement Algorithm for Throat Microphone Speech Based on LSTM Recurrent Neural Networks

      2019, 34(4):615-624. DOI: 10.16337/j.1004-9037.2019.04.007

      Abstract: Throat microphones are used in a variety of high-noise scenarios because of their excellent noise immunity. However, the speech they produce has shortcomings, such as much higher energy at middle frequencies and severe loss of high frequencies, which greatly degrade speech quality and intelligibility. To improve speech quality, a blind speech enhancement algorithm based on long short-term memory recurrent neural networks (LSTM-RNN) is proposed. In contrast to previous estimation algorithms based on low-dimensional spectral envelope features, this algorithm first directly models the relationship between the high-dimensional logarithmic amplitude spectra of throat microphone speech and air-conducted microphone speech; this kind of neural network captures context information well when reconstructing the signal. Second, the estimated speech amplitude spectrum is processed by non-negative matrix factorization (NMF), which effectively suppresses the over-smoothing problem and further improves speech quality. Simulation results in terms of LLR, LSD, and PESQ show that the algorithm effectively improves the speech quality of throat microphones.

    • Using Speech and Text Features Fusion to Improve Speech Emotion Recognition

      2019, 34(4):625-631. DOI: 10.16337/j.1004-9037.2019.04.008

      Abstract: Emotion recognition is important in human-computer interaction. The purpose of this study is to improve the accuracy of emotion recognition by fusing speech and text features. The speech features are acoustic features and phonological features, and the text features are traditional Bag-of-Words (BoW) features based on an emotion dictionary and an N-gram model. These features are used for emotion recognition, and their performance is compared on the IEMOCAP dataset. Different feature fusion methods, including feature-layer fusion and decision-layer fusion, are also compared. Experimental results show that fusing speech and text features performs better than any single feature, and that decision-layer fusion of speech and text features performs better than feature-layer fusion. In addition, with a CNN classifier, the unweighted average recall (UAR) of decision-layer fusion with three features reaches 68.98%, surpassing the previous best results on the IEMOCAP dataset.
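
The two fusion strategies and the UAR metric named in this abstract can be illustrated with a minimal Python sketch. The abstract does not state the fusion rule, so averaging class posteriors for decision-layer fusion is an assumption here, and all function names are hypothetical. UAR is computed in its usual sense, the mean of per-class recalls.

```python
from typing import List, Sequence


def feature_level_fusion(speech: Sequence[float], text: Sequence[float]) -> List[float]:
    """Feature-layer fusion: concatenate feature vectors before classification."""
    return list(speech) + list(text)


def decision_level_fusion(posteriors: Sequence[Sequence[float]]) -> int:
    """Decision-layer fusion (assumed rule): average the class posteriors
    produced by separate classifiers and return the argmax class index."""
    n_classes = len(posteriors[0])
    avg = [sum(p[c] for p in posteriors) / len(posteriors) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])


def unweighted_average_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """UAR: mean of per-class recalls, so every class counts equally
    regardless of how many samples it has."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(1 for i in idx if y_pred[i] == c) / len(idx))
    return sum(recalls) / len(recalls)
```

Because UAR weights each class equally, it is a common choice for emotion corpora whose classes are unevenly represented.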

    • Improved Hybrid Iterative Detection Algorithm for Uplink Massive MIMO System

      2019, 34(4):642-648. DOI: 10.16337/j.1004-9037.2019.04.010

      Abstract: The task of signal detection is to estimate the signals sent by users from the signal received at the base station. For uplink massive multiple-input multiple-output (MIMO) systems, a hybrid iterative detection algorithm called SDGS, based on the steepest descent (SD) algorithm and Gauss-Seidel (GS) iteration, is proposed. The algorithm avoids the matrix inversion required by the minimum mean square error (MMSE) algorithm and reduces the computational complexity from O(K^3) to O(K^2), where K is the number of users. The SD algorithm also provides a good convergence direction, which speeds up convergence of the iteration. In addition, an improved method for computing the log-likelihood ratio (LLR) is proposed to improve detection performance while keeping the complexity at O(K^2). Simulation results show that the improved hybrid iterative algorithm converges rapidly and approaches the performance of the MMSE algorithm with only a small number of iterations.
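
The idea of replacing the MMSE matrix inversion with an iteration can be sketched as follows. This is a real-valued toy in pure Python, assuming the MMSE system has already been formed as A x = b with A symmetric positive definite; the ordering (one SD step, then GS sweeps) and the names `sd_step`, `gs_sweep`, and `sdgs_solve` are illustrative assumptions, not the paper's exact procedure.

```python
def matvec(A, x):
    """Dense matrix-vector product, O(K^2)."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sd_step(A, b, x):
    """One steepest-descent step: move along the residual r = b - A x
    with the step size that minimizes the error along r."""
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
    rr = dot(r, r)
    if rr == 0.0:
        return list(x)
    alpha = rr / dot(r, matvec(A, r))
    return [xi + alpha * ri for xi, ri in zip(x, r)]


def gs_sweep(A, b, x):
    """One Gauss-Seidel sweep: update each unknown in turn,
    reusing the values already updated in this sweep."""
    x = list(x)
    n = len(b)
    for i in range(n):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i][i]
    return x


def sdgs_solve(A, b, gs_iters=3):
    """Hybrid solve: SD supplies the starting direction, GS refines it."""
    x = sd_step(A, b, [0.0] * len(b))
    for _ in range(gs_iters):
        x = gs_sweep(A, b, x)
    return x
```

Each step costs one or two matrix-vector products, which is where the O(K^2) per-iteration cost comes from; no inverse of A is ever formed.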

    • Underwater Acoustic Emission System Based on Differential Time Delay Shift Coding

      2019, 34(4):649-658. DOI: 10.16337/j.1004-9037.2019.04.011

      Abstract: Differential time delay shift coding can effectively suppress multipath channel interference, so it is widely used in underwater acoustic communication to achieve reliable information transmission. In this paper, an underwater acoustic emission system based on differential time delay shift coding is designed. According to the characteristics and requirements of differential time delay shift coding, symbols are generated by directly reading a stored waveform, which is first transferred to and stored in memory. The system is then configured with the communication coding parameters. Finally, the communication information is encoded by differential time delay shift coding and transmitted through the digital-to-analog converter, low-pass filter, and transducer. A lake communication experiment shows that the acoustic emission system ensures the accuracy of the coded information and the reliability of the transmitted information under different parameters, which demonstrates its practicality.

    • Ultra-broadband Unidirectional Waveguide Based on Magnetic Domain Wall

      2019, 34(4):659-664. DOI: 10.16337/j.1004-9037.2019.04.012

      Abstract: A new ultra-broadband unidirectional waveguide is proposed based on the magnetic domain wall of magneto-optical (MO) materials. The basic model of the proposed waveguide is a layered metal-YIG-YIG-metal structure, where YIG is yttrium iron garnet under anti-parallel static external magnetic fields. Theoretical analysis and simulation results show that the system supports one-way electromagnetic (EM) modes not only within the inherent photonic bandgap of YIG, but also within a new photonic bandgap arising from the finite thickness of the YIG. Both photonic bandgaps support complete one-way EM modes, which are immune to scattering and back-reflection. Because of its simple structure, robust immunity, and ultra-broadband one-way operating frequency band, the proposed waveguide is an effective way to realize all-photonic integrated circuits.

    • CCSDS Adaptive Transmission System Based on FPGA

      2019, 34(4):665-672. DOI: 10.16337/j.1004-9037.2019.04.013

      Abstract: In deep space communication, transmitting information at a fixed rate degrades communication quality and system efficiency. When the channel quality changes during communication, adjusting the transmission parameters of the communication system in real time ensures that performance indexes such as the bit error rate (BER) meet requirements and improves the efficiency of the communication system. In line with the requirements of the Proximity-1 space communication protocol, a symbol-rate adaptive transmission scheme that meets the requirements of the CCSDS protocol is designed, using the estimated signal-to-noise ratio (SNR) of the channel as the measure of channel quality. The hardware implementation platform is based on a Xilinx Kintex-7 FPGA chip. Experimental results show that the system can estimate SNRs above 0 dB under additive white Gaussian noise and adjust the symbol rate adaptively. Compared with fixed-rate transmission, both the BER and the throughput of the symbol-rate adaptive transmission system are improved, which provides a useful reference for the design of deep space communication systems.

    • Pilot Design Schemes for Compressed Sensing-Based MIMO-OFDM Channel Estimation

      2019, 34(4):673-681. DOI: 10.16337/j.1004-9037.2019.04.014

      Abstract: As a combination of MIMO and OFDM, the multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) system has high band utilization and can effectively combat multipath effects in wireless channels. This paper studies sparse channel estimation and pilot optimization for MIMO-OFDM systems. The channel estimation problem in MIMO-OFDM systems is transformed into the sparse signal reconstruction problem of compressed sensing (CS) theory. The pilot optimization is based on minimizing the mutual coherence of the measurement matrix. By combining the existing stochastic sequential search (SSS) and extension scheme 2 (ES2) algorithms with a pilot shift mechanism, a fast pilot optimization algorithm, stochastic sequential search-shift mechanism (SSS-SM), is proposed. The algorithm has lower computational complexity, and its computation time is not affected by the number of transmit antennas. The pilot designs obtained by the SSS-SM and ES2 algorithms are applied to channel estimation in a MIMO-OFDM system. Simulation results show that SSS-SM achieves the same channel estimation performance as ES2 with less computational complexity. At high signal-to-noise ratio (SNR), the mean square error (MSE) of SSS-SM is about 3 dB lower on average than that of ES2, which shows that the method has an advantage at high SNR.
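
The optimization criterion mentioned here, the mutual coherence of the measurement matrix, can be computed as in the sketch below. It assumes the common single-antenna CS-OFDM model in which the measurement matrix is the partial DFT matrix restricted to the pilot subcarriers and the first channel taps; the function name and parameters are hypothetical, and the SSS-SM search itself is not reproduced.

```python
import cmath


def mutual_coherence(pilots, n_subcarriers, n_taps):
    """Largest normalized inner product between two distinct columns of the
    partial DFT matrix A[p][l] = exp(-2j*pi*p*l/N), with p ranging over the
    pilot subcarriers and l over the channel taps (assumed model)."""
    cols = [
        [cmath.exp(-2j * cmath.pi * p * l / n_subcarriers) for p in pilots]
        for l in range(n_taps)
    ]
    mu = 0.0
    for m in range(n_taps):
        for n in range(m + 1, n_taps):
            inner = sum(a.conjugate() * b for a, b in zip(cols[m], cols[n]))
            # Every column has norm sqrt(len(pilots)), so divide by len(pilots).
            mu = max(mu, abs(inner) / len(pilots))
    return mu
```

A pilot search would evaluate this quantity for candidate pilot sets and keep the set with the smallest value; lower coherence generally gives better sparse recovery.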

    • Microphone Array Direction of Arrival Estimation Based on Block Sparse Feature

      2019, 34(4):682-688. DOI: 10.16337/j.1004-9037.2019.04.015

      Abstract: Unlike traditional direction of arrival (DOA) estimation algorithms such as the steered response power phase transform (SRP-PHAT) algorithm and the delay-and-sum (DS) algorithm, the compressed sensing (CS) microphone array DOA algorithm transforms sound source localization into a sparse signal reconstruction problem to achieve better performance. However, in practical environments, the direction vector of a far-field sound source tends to exhibit block sparsity because of sound source directivity, spatial reverberation, and other factors, which degrades the performance of traditional sparse recovery algorithms such as orthogonal matching pursuit (OMP). In this paper, the block approximated l0 is introduced into the microphone array CS DOA algorithm. Under the CS framework, block approximated l0 sparse recovery is used to reconstruct the direction vector of the sound source and obtain the DOA. Experimental results show that the proposed algorithm yields higher positioning accuracy than both the traditional algorithms and the traditional sparse recovery algorithm based on OMP.

    • High Performance Hardware Architecture of Lattice-Based Cryptography and Its FPGA Implementation

      2019, 34(4):689-696. DOI: 10.16337/j.1004-9037.2019.04.016

      Abstract: With the development of quantum computers, conventional public-key cryptographic schemes such as RSA and elliptic curve cryptography (ECC) are under serious threat. To resist quantum attacks, lattice-based cryptography has attracted research attention; in particular, the ring learning with errors (R-LWE) lattice encryption algorithm has great application potential because it is easy to implement and resistant to quantum attacks. From the perspective of hardware application, a parallel circuit architecture for polynomial multiplication in the R-LWE encryption scheme is proposed and implemented. The number theoretic transform (NTT) method and two parallel butterfly operation units are used. The results show that the proposed architecture improves performance by up to 42% with only a slight increase in hardware resources.
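
NTT-based polynomial multiplication, the operation this architecture accelerates, can be sketched in software as follows. The toy parameters (n = 4, q = 17, psi = 2, where psi is a primitive 2n-th root of unity modulo q) are assumptions chosen for readability and are not the paper's parameters; the code models the arithmetic only, not the two parallel butterfly units.

```python
def ntt(a, omega, q):
    """Recursive radix-2 NTT. omega must be a primitive len(a)-th root of
    unity modulo the prime q, and len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = ntt(a[0::2], omega * omega % q, q)
    odd = ntt(a[1::2], omega * omega % q, q)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % q
        out[k] = (even[k] + t) % q           # butterfly, upper output
        out[k + n // 2] = (even[k] - t) % q  # butterfly, lower output
        w = w * omega % q
    return out


def negacyclic_mul(a, b, psi, q):
    """Multiply a and b in Z_q[x]/(x^n + 1) via the NTT.
    psi is a primitive 2n-th root of unity modulo the prime q."""
    n = len(a)
    omega = psi * psi % q
    psi_inv = pow(psi, q - 2, q)  # modular inverse via Fermat's little theorem
    n_inv = pow(n, q - 2, q)
    a_t = ntt([x * pow(psi, i, q) % q for i, x in enumerate(a)], omega, q)
    b_t = ntt([x * pow(psi, i, q) % q for i, x in enumerate(b)], omega, q)
    c_t = [x * y % q for x, y in zip(a_t, b_t)]  # pointwise product
    c = ntt(c_t, pow(omega, q - 2, q), q)        # inverse transform, unscaled
    return [x * n_inv % q * pow(psi_inv, i, q) % q for i, x in enumerate(c)]


def schoolbook_negacyclic(a, b, q):
    """O(n^2) reference: terms that wrap past x^n pick up a minus sign."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            if i + j < n:
                c[i + j] = (c[i + j] + a[i] * b[j]) % q
            else:
                c[i + j - n] = (c[i + j - n] - a[i] * b[j]) % q
    return c
```

Scaling the inputs by powers of psi turns the cyclic convolution computed by the NTT into the negacyclic one required by the ring; the schoolbook routine is included only as a cross-check.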

    • PARALIND Decomposition-Based Coherent Direction of Arrival Estimation Algorithm for Electro-magnetic Vector Array

      2019, 34(4):697-705. DOI: 10.16337/j.1004-9037.2019.04.017

      Abstract: This paper combines the parameter estimation problem of electromagnetic vector arrays with the parallel profiles with linear dependencies (PARALIND) model and proposes a PARALIND decomposition-based coherent direction of arrival (DOA) estimation algorithm for electromagnetic vector arrays. The proposed algorithm estimates the angles of coherent sources effectively without peak searching. In addition, the corresponding correlation matrix can be obtained, and the algorithm is effective for both uniform and non-uniform linear arrays. Compared with the traditional forward-backward spatial smoothing estimation of signal parameters via rotational invariance techniques (FBSS-ESPRIT) algorithm and the forward-backward spatial smoothing propagator method (FBSS-PM) algorithm, the proposed approach has better angle estimation performance; moreover, it can distinguish closely spaced coherent sources and estimate their angles effectively.

    • An Efficient Method for Target Tracking in Large Area

      2019, 34(4):706-714. DOI: 10.16337/j.1004-9037.2019.04.018

      Abstract: To decrease the time complexity of both training and prediction in indoor target tracking using radio fingerprints in wireless sensor networks, a novel method is proposed that is well suited to tracking portable devices in large areas. A local match of the anchors and reference positions is introduced before applying weighted K-nearest neighbors, which significantly decreases the complexity of positioning the target. A Kalman filter then combines the previous position estimate with acceleration information to further increase the estimation accuracy. The performance of the method is studied thoroughly, indicating an average tracking accuracy of 1.4 m with the reference positions uniformly distributed at 10 m spacing and a noise standard deviation of 16. The results show that the proposed method is applicable to mobile target tracking in indoor environments.
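
The weighted K-nearest-neighbors step on radio fingerprints can be sketched as below. Inverse-distance weighting in signal space is a common choice and is assumed here; the function name, the fingerprint layout, and the example values are hypothetical, and the paper's local anchor matching and Kalman filtering steps are not reproduced.

```python
import math


def wknn_locate(rss, fingerprints, k=3, eps=1e-9):
    """Estimate a 2-D position from a received-signal-strength vector.

    fingerprints: list of ((x, y), rss_vector) pairs recorded at the
    reference positions. The k fingerprints closest to `rss` in signal
    space are averaged with weights 1 / (distance + eps).
    """
    ranked = sorted(fingerprints, key=lambda item: math.dist(rss, item[1]))[:k]
    weights = [1.0 / (math.dist(rss, fp) + eps) for _, fp in ranked]
    total = sum(weights)
    x = sum(w * pos[0] for w, (pos, _) in zip(weights, ranked)) / total
    y = sum(w * pos[1] for w, (pos, _) in zip(weights, ranked)) / total
    return x, y
```

Restricting the search to reference positions near the matched anchors, as the abstract describes, would shrink the `fingerprints` list passed in and therefore the cost of the ranking step.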

    • Ultra-high Speed Digital Phase-Locked Amplifier System Based on Clock Tree Mechanism

      2019, 34(4):715-722. DOI: 10.16337/j.1004-9037.2019.04.019

      Abstract: In an ultra-high-speed digital phase-locked amplifier (PLA) system, the trade-off between sampling rate and sampling accuracy can be resolved by using the conventional time-interleaved parallel analog-to-digital converter (ADC) structure. However, such a system is very vulnerable to sampling clock jitter in each channel. Based on an analysis of the relationship between sampling clock jitter, the effective number of bits, and the dynamic range, a high-speed digital phase-locked amplifier system is realized using a parallel ADC alternating sampling structure based on a clock tree mechanism. Experimental results show that, under the same test conditions, the signal-to-noise ratio of this system is about 17.5 dB higher than that of a commercial PLA from mainstream foreign manufacturers.

    • PSO-DEC-IFSVM Classification Algorithm for Unbalanced Data

      2019, 34(4):723-735. DOI: 10.16337/j.1004-9037.2019.04.020

      Abstract: On unbalanced datasets, the traditional fuzzy support vector machine (FSVM) algorithm does not classify well, and the parameters it introduces are not optimized. This paper therefore proposes an improved fuzzy support vector machine (IFSVM) algorithm based on the particle swarm optimization (PSO) algorithm, namely the PSO-DEC-IFSVM algorithm. First, a fuzzy membership function is designed that considers the distance from a training sample to its class center, the tightness around the sample, and the amount of information in the sample. The IFSVM algorithm is then combined with the different error costs (DEC) algorithm to obtain the DEC-IFSVM algorithm. Finally, the PSO algorithm is used to optimize the parameters introduced in the DEC-IFSVM algorithm. Experiments on six unbalanced datasets from the UCI public repository, such as Pima, show that the PSO-DEC-IFSVM algorithm classifies both positive and negative samples better and is more robust than the existing FSVM algorithm and its improved variants.

    • Static Decoupling Algorithm of Six-Axis Wrist Force Sensor Based on Multi-output Support Vector Regression

      2019, 34(4):736-743. DOI: 10.16337/j.1004-9037.2019.04.021

      Abstract: Because the measurement accuracy of a six-axis wrist force sensor is affected by interdimensional coupling, a decoupling algorithm based on multi-output support vector regression (MSVR) is proposed. A static calibration experiment is carried out on a six-axis wrist force sensor designed in the laboratory, and the data are processed and analyzed with both the MSVR-based decoupling algorithm and the traditional decoupling algorithm that solves for the calibration matrix by least squares. Experimental results show that the proposed decoupling algorithm is stable, highly reliable, and highly precise, and that it effectively suppresses interdimensional coupling interference.

    • Design of Temperature Environment Simulation System for Outer Space Based on PID Control Method and Inverter Power Adjusting Technology

      2019, 34(4):744-752. DOI: 10.16337/j.1004-9037.2019.04.022

      Abstract: Considering the high-dose radiation that a temperature environment simulation system for outer space must withstand, a structural model of the space temperature and radiation environment simulation system is built. A refrigerating unit and heating components are used to cool and heat the antifreeze. Remote temperature control and the anti-radiation design of the system are achieved through the circulation of antifreeze and gas. The temperature control process is divided into three stages: a refrigerating stage, a heating stage, and a heat-insulating stage. The incremental PID control method, inverter power adjusting technology, and a fuzzy control method are adopted for temperature control of the environment simulation system, and an excellent control effect is achieved. Experimental results show that the actual temperature control curve agrees with the given one and satisfies the requirements of the space temperature environment simulation system.
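
The incremental PID method named here can be sketched in its standard velocity form, where each step computes a change in output from the last three errors. The class name and gains below are hypothetical, the sampling period is assumed to be folded into `ki` and `kd`, and the inverter power adjustment and fuzzy control parts of the system are not modeled.

```python
class IncrementalPID:
    """Incremental (velocity-form) PID controller.

    Each update computes
        du = kp*(e[k] - e[k-1]) + ki*e[k] + kd*(e[k] - 2*e[k-1] + e[k-2])
    and adds it to the previous output.
    """

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.e1 = 0.0  # error at step k-1
        self.e2 = 0.0  # error at step k-2
        self.u = 0.0   # accumulated controller output

    def update(self, setpoint, measured):
        e = setpoint - measured
        du = (self.kp * (e - self.e1)
              + self.ki * e
              + self.kd * (e - 2.0 * self.e1 + self.e2))
        self.e2, self.e1 = self.e1, e
        self.u += du
        return self.u
```

Because only the increment is computed, the controller keeps no explicit integral sum, which makes it straightforward to clamp or hand over the output when switching between control stages.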
