• Volume 39, Issue 2, 2024 Table of Contents
    • Research Progress of Computational Enhanced Optical Coherence Tomography

      2024, 39(2):248-270. DOI: 10.16337/j.1004-9037.2024.02.002

      Abstract: Optical coherence tomography (OCT) has become an important non-invasive three-dimensional imaging technology with a wide range of applications. Evolving application scenarios have placed new demands on OCT, such as resolution improvement, depth-of-focus decoupling, aberration correction, and anisotropic resolution correction. Over the past decades, computational imaging methods have proven effective in improving these performance parameters. This paper focuses on the four demands and reviews several representative computational methods. It compares the strengths and weaknesses of the respective solutions and discusses future trends in computation-enhanced OCT, with the aim of providing a reference for further study and applications.

    • Point Spread Function Engineering in Computational Imaging Technology

      2024, 39(2):271-296. DOI: 10.16337/j.1004-9037.2024.02.003

      Abstract: This paper focuses on the new meaning and applications of the point spread function (PSF) of optical imaging in computational imaging. First, the concept of the PSF in traditional optical imaging and its key role in optical system design are introduced, and several algorithms for image restoration using the PSF, along with image evaluation indices, are briefly explained. On this basis, the meaning of the PSF is re-examined from the perspective of information transfer under the framework of computational imaging, and relevant research in the field is summarized for both narrowly and broadly defined optical systems. Finally, the application prospects and development trends of PSF engineering are discussed.

    • Research Progress on Application of Computational Imaging in Holographic Storage Phase Retrieval

      2024, 39(2):297-311. DOI: 10.16337/j.1004-9037.2024.02.004

      Abstract: Holographic storage technology, a data storage technology with three-dimensional volume storage and two-dimensional data transmission, is characterized by high storage density and fast data transmission, making it one of the powerful solutions for long-term storage of massive data. Traditional holographic storage is limited by the fact that the photodetector responds only to intensity, so data are usually encoded by pure amplitude modulation. However, using only amplitude information cannot fully exploit the advantages of holography, and decoding the phase information in a simple, fast, stable and accurate way is a real problem faced by holographic storage technology. Computational imaging opens a new way to solve the phase retrieval problem for holographic storage because of its algorithmic versatility and high-dimensional sensing capability, among other characteristics. This paper reviews recent work on solving the phase retrieval problem of holographic storage with computational imaging, covering iterative computational phase retrieval and deep learning phase retrieval. The work is analyzed in terms of improving storage density, data reading speed, and data reading stability. Finally, we give an outlook on the future development of this direction.

    • Special Topic of Computational Imaging
    • Fast 3D Imaging of Small Solar System Bodies Based on FFBP Algorithm

      2024, 39(2):312-323. DOI: 10.16337/j.1004-9037.2024.02.005

      Abstract: Radar imaging technology has attracted increasing attention in the field of deep space exploration because it is fast, non-destructive, and high-resolution. To address the low computational efficiency of synthetic aperture radar (SAR) 3D imaging, a fast factorized back-projection (FFBP) 3D imaging algorithm suitable for slow flyby observation modes is proposed, leveraging the weak gravity and rapid spin of small solar system bodies. First, the equivalent motion model under slow flyby mode is analyzed, extending the imaging domain from a 2D polar coordinate system to a 3D spherical coordinate system. Aperture division and image fusion within the 3D FFBP algorithm are then analyzed in depth, deriving rules for 2D sub-aperture division and recursive image fusion methods, along with a detailed implementation process. Finally, the effectiveness of the algorithm is validated through numerical simulations and measured data. Experimental results show that the proposed imaging algorithm significantly enhances computational efficiency. Compared with the back-projection (BP) algorithm, it achieves a speedup of 30 to 50 times while obtaining imaging performance comparable to the classical BP algorithm.

    • Fourier Single-Pixel Imaging Method Based on Adaptive Sampling of Spectral Features

      2024, 39(2):324-336. DOI: 10.16337/j.1004-9037.2024.02.006

      Abstract: Imaging efficiency in Fourier single-pixel imaging (FSI) is mainly improved through optimized reconstruction algorithms and optimized sampling methods. However, with a limited number of samples, FSI cannot accurately sample critical frequencies, resulting in poor imaging quality. To solve this problem, a strategy for adaptive sampling of spectral features is proposed. First, the degree of energy concentration in the Fourier domain is investigated to determine the optimal radius of low-frequency equidistant pre-sampling. The corresponding Fourier coefficients are then measured by pre-sampling the low-frequency components to estimate the key spectral positions, from which the image is reconstructed. Compared with the adaptive sampling method based on energy continuity in the high-frequency direction, this method can adaptively select better sampling paths for targets with different spectral features and obtain the key Fourier coefficients, improving imaging quality with a peak signal-to-noise ratio increase of 2.28 dB and a structural similarity increase of 15.83%. The method therefore acquires spatial information efficiently in FSI of targets with unknown features, and is expected to be applied in fast real-time single-pixel imaging.

    • High Dynamic Range 3D Reconstruction Based on Event Information and Deep Learning

      2024, 39(2):337-347. DOI: 10.16337/j.1004-9037.2024.02.007

      Abstract: Three-dimensional (3D) measurement of high dynamic range (HDR) surfaces, such as metal parts, black objects, and translucent objects, using optical 3D imaging technology remains a challenging problem. Traditional methods have limitations in reconstructing HDR scenes with low-reflection and translucent areas, and have difficulty eliminating the internal reflection noise of translucent objects. Existing deep learning-based methods typically use strong laser intensification, which can damage the sample and overexpose the acquired image, necessitating tedious adjustments to the laser intensity. To address these issues, this paper proposes a 3D measurement method for HDR scenes that uses an event camera and a deep learning algorithm. By asynchronously recording the brightness changes of individual pixels, the event camera has a high dynamic range response and can therefore fully capture the laser fringe of HDR scenes. In addition, we introduce a deep convolutional neural network (DCNN) to eliminate the noise caused by reflections inside transparent objects and by overexposed, highly reflective areas of metallic objects, while enhancing the weak laser stripes on the surface. Experimental results demonstrate that the proposed method achieves high-quality 3D reconstruction of HDR scenes using low-power line laser scanning.

    • Three-Dimensional Reconstruction Method for Single-View Optical Remote Sensing Images Based on Semantic Segmentation and Residual U-Net Fusion

      2024, 39(2):348-360. DOI: 10.16337/j.1004-9037.2024.02.008

      Abstract: Three-dimensional (3D) reconstruction from single-view remote sensing images is an ill-posed problem, which often requires a lot of manual experience to supplement the missing information and construct a complete 3D model. To solve this problem, a 3D reconstruction method for single-view remote sensing images based on semantic segmentation and a fused residual U-Net is proposed. The method includes two stages: semantic segmentation and height estimation of single-view remote sensing images. In the semantic segmentation stage, U-Net is used to determine the category of ground objects. On this basis, U-Net is improved to estimate the height of the remote sensing image, and anchored height regression is combined with semantic features to improve the reconstruction accuracy. Specifically, the feature extraction capability of the encoder is enhanced by embedding residual blocks with different numbers and channels, and the decoder output layer is modified to suit the height regression task, so as to achieve pixel-to-pixel prediction of digital surface model (DSM) height values. A root mean square error (RMSE) of 2.751 m and a mean absolute error (MAE) of 1.446 m are obtained on the public US3D dataset, and the reconstructed results are superior to those of other networks, confirming that the method can realize 3D estimation from single-view remote sensing images and reconstruct the distribution structure of ground objects.

    • Recent Advancement in Multi-granulation Three-Way Decisions

      2024, 39(2):361-375. DOI: 10.16337/j.1004-9037.2024.02.009

      Abstract: The multi-granulation three-way decisions approach uses three-way decision theory to analyze and process complex problems from multiple views and levels, and is gradually becoming an efficient and reliable intelligent decision-making method. This paper reviews the research on multi-granulation three-way decisions. It introduces the multi-granulation fusion strategy, multiview three-way decisions, and multilevel three-way decisions, discusses multi-granulation three-way decisions from both qualitative and quantitative perspectives, illustrates the relationships between different multi-granulation three-way decision models, and points out several open problems in existing work. The results can serve as a reference for further research in this field.

    • Research Papers
    • Distributed Sparse Soft Large Margin Clustering

      2024, 39(2):376-384. DOI: 10.16337/j.1004-9037.2024.02.010

      Abstract: Soft large margin clustering (SLMC) has been shown to achieve better clustering performance and interpretability than other algorithms, such as K-Means. However, for large-scale distributed data storage, computing the kernel matrix involved is time-consuming. One effective strategy to reduce this cost is to approximate the kernel function with a random Fourier feature transform, but the feature dimension on which the approximation accuracy depends is often too high, which carries a risk of overfitting. This paper embeds sparsity into kernel SLMC and combines the alternating direction method of multipliers (ADMM) with SLMC, yielding a distributed sparse soft large margin clustering algorithm (DS-SLMC) that overcomes the scalability problem and achieves better interpretability through sparsity.
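
The random Fourier feature idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes a Gaussian (RBF) kernel k(x, y) = exp(-gamma * ||x - y||^2), which the abstract does not specify, and every name here (`make_rff_map`, `rbf_kernel`, `dot`) is hypothetical rather than taken from the paper.

```python
import math
import random


def make_rff_map(dim, n_features, gamma, seed=0):
    """Return z(x) such that dot(z(x), z(y)) approximates exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    # Assumption: RBF kernel, whose spectral density is Gaussian with std sqrt(2 * gamma).
    std = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def feature_map(x):
        # One cosine feature per random direction and random phase offset.
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return feature_map


def rbf_kernel(x, y, gamma):
    """Exact kernel value, used only to check the approximation."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))
```

The approximation error shrinks roughly as one over the square root of `n_features`, which is why the feature dimension tends to be large and why a sparsity constraint on the resulting weights is attractive.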

    • Semi-supervised Multi-label Classification Method for Financial Events

      2024, 39(2):385-394. DOI: 10.16337/j.1004-9037.2024.02.011

      Abstract: With the continuous development of the digital financial service industry, the Internet and financial service systems have accumulated a large amount of text data. Automatic classification of the financial events described in financial text is a practical demand of financial technology and a widespread concern in natural language processing and machine learning. Deep learning methods are now widely used in text classification. However, multi-label classification of financial events in text data suffers from a lack of labeled data, the high resource consumption of existing deep learning methods, and a failure to exploit the specific characteristics of financial event texts. To address these issues, a semi-supervised multi-label classification method for financial events is proposed, which uses representation tools such as ALBERT and TextCNN and introduces a subject word attention mechanism. First, the problem of insufficient labeled data is alleviated through unsupervised data augmentation (UDA). Second, the subject word attention mechanism is introduced, and the ALBERT dynamic word vector representation is used to represent the words in the text. Then, TextCNN is used to represent the text comprehensively. Finally, cross entropy and KL divergence are used to measure the loss on labeled and unlabeled data, respectively, to train the model. The effectiveness of the proposed method is verified on a financial text dataset.
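
The combined loss described at the end of this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it treats each label as an independent Bernoulli output, applies the KL term between predictions on an unlabeled text and on its augmented copy (the usual UDA consistency setup), and uses a hypothetical `weight` parameter to balance the two terms.

```python
import math

EPS = 1e-12  # clamp probabilities to avoid log(0)


def _clamp(p):
    return min(max(p, EPS), 1.0 - EPS)


def binary_cross_entropy(probs, labels):
    """Mean binary cross entropy over the labels of one labeled sample."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = _clamp(p)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def bernoulli_kl(p_probs, q_probs):
    """Mean KL(p || q) between per-label Bernoulli distributions."""
    total = 0.0
    for p, q in zip(p_probs, q_probs):
        p, q = _clamp(p), _clamp(q)
        total += p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))
    return total / len(p_probs)


def semi_supervised_loss(labeled_probs, labels, unlabeled_probs, augmented_probs, weight=1.0):
    """Supervised cross entropy plus a weighted consistency (KL) term."""
    supervised = binary_cross_entropy(labeled_probs, labels)
    consistency = bernoulli_kl(unlabeled_probs, augmented_probs)
    return supervised + weight * consistency
```

When the model predicts the same label probabilities for a text and its augmented copy, the KL term vanishes and only the supervised term remains.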

    • Adaptive Transmissivity Correction Algorithm for Defogging Combining Image Texture Information

      2024, 39(2):395-405. DOI: 10.16337/j.1004-9037.2024.02.012

      Abstract: Image defogging algorithms are widely used in outdoor intelligent monitoring and traffic navigation. Defogging improves image clarity and thereby the recognition of targets. The dark channel algorithm and its improved variants produce errors in transmissivity estimation in bright gray areas such as the sky, and are prone to distortion and blurred image details, which affects image recognition in intelligent transportation. An adaptive defogging method that compensates the transmissivity is proposed. A logarithmic transformation is used to obtain a logarithmic compensation operator that adjusts the transmissivity in the depth-of-field area. The confidence of the dark channel is calculated according to the richness of image information, and a texture compensation operator is constructed from the image texture information, which effectively reduces image distortion after defogging. Compared with other defogging algorithms, the proposed algorithm improves the average gradient, signal-to-noise ratio (SNR), information entropy and other objective indicators. Image quality is effectively improved, with good transmissivity compensation in gray bright areas, clear and natural image details, and moderate brightness.
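
For context, the baseline that this abstract sets out to correct can be sketched as below. The code shows the widely used dark channel prior estimate t = 1 - omega * dark_channel(I / A), not the paper's logarithmic or texture compensation operators. The image layout (nested lists of RGB tuples in [0, 1]), the `airlight` argument, and the default `omega` are assumptions made for the sketch.

```python
def dark_channel(image, radius):
    """Minimum over color channels and over a square patch around each pixel.

    image[r][c] is an (R, G, B) tuple of floats.
    """
    height, width = len(image), len(image[0])
    per_pixel_min = [[min(px) for px in row] for row in image]
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            out[r][c] = min(
                per_pixel_min[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            )
    return out


def estimate_transmission(image, airlight, radius=1, omega=0.95):
    """Baseline dark channel prior transmission map (no compensation applied)."""
    normalized = [
        [tuple(v / a for v, a in zip(px, airlight)) for px in row] for row in image
    ]
    dark = dark_channel(normalized, radius)
    return [[1.0 - omega * d for d in row] for row in dark]
```

In a bright, nearly uniform region the dark channel is close to the airlight, so the estimated transmission collapses toward 1 - omega. That is the kind of sky-region error that a compensation step is meant to correct.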

    • An End-to-End Singing Voice Synthesis Method with Excitation and Vibrato Modeling

      2024, 39(2):406-415. DOI: 10.16337/j.1004-9037.2024.02.013

      Abstract: In recent years, singing voice synthesis technology has developed rapidly, and end-to-end singing voice synthesis (VISinger) based on variational inference and normalizing flow has become mainstream. However, there is still a gap between its output and the sound quality of real singers, mainly reflected in audibly discontinuous pitch, poor vibrato synthesis, and unstable articulation in the synthesized singing voice. We propose three main improvements. First, to address fundamental frequency stability, we add an excitation module to the decoder to explicitly provide fundamental frequency information in the form of an excitation signal. Second, to address unnatural vibrato synthesis, we add a vibrato prediction module that explicitly models the vibrato in the song using flow with variational data augmentation. Third, we add a ReZero strategy to the frame prior network. Experimental results show that adding the excitation signal improves the stability of the synthesized fundamental frequency, that vibrato modeling significantly enhances the recovery of vibrato, and that the ReZero strategy brings some improvement in training speed and articulation stability. Subjective evaluation demonstrates that the proposed model has a significant advantage over VISinger in the naturalness of singing voice synthesis, with a mean opinion score (MOS) of 3.95, and also has a significant advantage over the two-stage modeling method DiffSinger+HiFiGAN, demonstrating the effectiveness of the proposed method.

    • Audio Adversarial Examples Generation Method Based on Self-attention Mechanism

      2024, 39(2):416-423. DOI: 10.16337/j.1004-9037.2024.02.014

      Abstract: With the wide dissemination of personal speech and the development of automatic speaker recognition algorithms, personal privacy is at high risk. Audio adversarial examples can protect personal voiceprint features by disabling automatic speaker recognition algorithms while leaving the subjective perception of the human ear unchanged. We improve the typical adversarial attack algorithm FoolHD with a multi-head self-attention mechanism and call the result FoolHD-MHSA. First, convolutional neural networks are introduced as the encoder to extract adversarial perturbation spectrograms. Second, a self-attention mechanism extracts correlation features between different parts of the perturbation spectrogram from a global perspective, focusing the network on important information and suppressing useless information. Finally, a decoder hides the processed perturbation spectrogram in the input spectrogram to obtain the adversarial example spectrogram. Experimental results show that FoolHD-MHSA generates adversarial examples with a higher attack success rate and a higher average PESQ score than FoolHD.

    • Speech Emotion Recognition with Multi-task Learning

      2024, 39(2):424-432. DOI: 10.16337/j.1004-9037.2024.02.015

      Abstract: In recent speech emotion recognition research, deep learning models have been used to identify emotion from speech signals. However, traditional single-task learning models do not pay enough attention to the acoustic emotional information in speech, resulting in low emotion recognition accuracy. In view of this, this paper proposes an end-to-end multi-task learning network for speech emotion recognition that mines acoustic emotion in speech and improves recognition accuracy. To avoid the information loss caused by frequency-domain features, the model adopts Wav2vec2.0 as its backbone network to extract the acoustic and semantic features of speech, and an attention mechanism integrates the two kinds of features as self-supervised features. To make full use of the acoustic emotional information in speech, emotion-related phoneme recognition is used as an auxiliary task, and a multi-task learning model mines acoustic emotion from the self-supervised features. Experimental results on the public dataset IEMOCAP show that the proposed multi-task learning model achieves a weighted accuracy of 76.0% and an unweighted accuracy of 76.9%, a significant improvement over the traditional single-task learning model. Ablation experiments further verify the effectiveness of the auxiliary task and the self-supervised network fine-tuning strategy.

    • AQBFO-Based Passive Beamforming Scheme for Intelligent Reflecting Surface-Aided Massive MIMO Systems with Residual Hardware Impairments

      2024, 39(2):433-444. DOI: 10.16337/j.1004-9037.2024.02.016

      Abstract: Residual hardware impairments (HWIs) caused by the non-ideal characteristics of transceiver hardware are unavoidable in intelligent reflecting surface (IRS)-assisted massive multiple-input multiple-output (MIMO) systems and seriously affect the uplink achievable rate. To solve this problem, a passive beamforming scheme based on the adaptive quantum bacterial foraging optimization (AQBFO) algorithm is proposed to suppress the negative impact of HWIs on system performance. First, an approximate analytical expression of the uplink achievable rate is derived based on statistical channel state information (CSI). Then, passive beamforming optimization based on the AQBFO algorithm is carried out to maximize the sum rate. Simulation results show that, in an IRS-assisted massive MIMO system, the AQBFO-based passive beamforming scheme effectively suppresses the influence of residual HWIs and significantly improves the uplink ergodic sum rate.

    • Invisible WFRFT Communication Method with Jump Vector

      2024, 39(2):445-455. DOI: 10.16337/j.1004-9037.2024.02.017

      Abstract: The weighted fractional Fourier transform (WFRFT) can greatly change the characteristics of a signal and diversify its statistical characteristics, thereby helping to secure the communicated information. To address the insufficient anti-scanning ability of single-parameter WFRFT communication, the formation mechanism of the single-parameter fractional domain is studied in depth, taking single-parameter WFRFT as the entry point, and its potential microscopic features and dark features are analyzed. On this basis, an implicit WFRFT communication method with jump vector (IWVJ) is proposed. Using the relationship between the modulation order and the constellation diagram, the hopping matrix and the hopping vector are established and the control rules are formulated. The dynamic modulation order is then obtained through hopping vector control to achieve secure communication. Simulation results show that, with the IWVJ method, licensed receivers achieve higher inverse-transform demodulation similarity and a lower bit error rate than unlicensed receivers with universal scanning capability. Suggestions are also given for setting the demodulation order error, the basic modulation order and the jump frequency, so that the IWVJ method can be better applied to communication systems and provide secure information transmission with anti-jamming, anti-interception and anti-spoofing capabilities.

    • Indoor Location Privacy Protection Algorithm Based on Ciphertext KNN Retrieval

      2024, 39(2):456-470. DOI: 10.16337/j.1004-9037.2024.02.018

      Abstract: In location request services, protecting both the user's location privacy and the data privacy of the location service provider (LSP) is a challenging issue for WiFi fingerprinting applications. Based on K-nearest neighbors (KNN) retrieval over ciphertext, this paper proposes a positioning privacy protection algorithm suitable for three-party settings, which effectively strengthens the protection of LSP fingerprint information and reduces computational overhead. Positioning is performed by a third party on the encrypted fingerprint database and the encrypted positioning request, so it is carried out without revealing private data. By randomly embedding the location information in the fingerprint, the algorithm avoids disclosing the physical location of the reference points (RPs) in the fingerprint database. A Bloom filter (BF) is further used to complete online matching of reference points while hiding the access point information, which achieves coarse positioning while preserving user privacy and reduces the computational overhead of the positioning algorithm. The security, overhead and positioning performance of the two algorithms are comprehensively evaluated on public datasets and laboratory data. Compared with similar encryption algorithms, the proposed approach further enhances the protection of data privacy without reducing positioning accuracy.
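
The Bloom filter used for matching in this abstract is a standard probabilistic set structure, and a minimal plain (unencrypted) version is sketched below. It is not the paper's protocol: the class, its parameters, and the idea of inserting access point identifiers as strings are all assumptions chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Set-membership sketch: no false negatives, occasional false positives."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits)  # one byte per bit, kept simple for clarity

    def _positions(self, item):
        # Derive n_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))
```

Because only bit positions are stored, a party holding the filter can test whether an identifier was inserted without seeing the list of inserted identifiers, which is the property that makes the structure useful for coarse matching.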

    • Modified I-Rife Algorithm for Frequency Estimation of Sinusoid Wave

      2024, 39(2):471-480. DOI: 10.16337/j.1004-9037.2024.02.019

      Abstract: Frequency estimation of sinusoidal signals is a common problem in the radar field. When the true frequency approaches a quantized frequency point, the calculation of the frequency shift factor in the I-Rife algorithm can introduce significant errors. To improve the accuracy of frequency estimation, this paper analyzes the performance and error sources of the Rife and I-Rife algorithms and, using a spectral refinement method, proposes a modified I-Rife algorithm. It replaces the amplitude of the spectral peak with the amplitudes at offsets of 0.5 frequency points to the left and right of the peak, and interpolates the amplitude using the second-highest frequency point. This allows a more accurate estimate of the frequency offset. The proposed algorithm improves estimation accuracy while maintaining computational complexity similar to that of the original I-Rife algorithm. Simulation results demonstrate that the modified I-Rife algorithm outperforms the original I-Rife algorithm in overall performance and achieves an estimation root mean square error closer to the Cramér-Rao lower bound.
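
As background for the algorithms named in this abstract, the classic Rife interpolation step can be sketched as follows. This is the textbook two-point estimator, not the I-Rife or the modified I-Rife algorithm. The function names are hypothetical, and a direct O(N^2) DFT is used only to keep the sketch dependency-free.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of the first N/2 DFT bins (direct evaluation, small N only)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)))
        for k in range(n // 2)
    ]


def rife_frequency(samples, sample_rate):
    """Classic Rife estimate: peak bin plus an offset from its larger neighbor."""
    n = len(samples)
    mags = dft_magnitudes(samples)
    # Peak bin, excluding the edges so both neighbors exist.
    k0 = max(range(1, len(mags) - 1), key=mags.__getitem__)
    direction = 1 if mags[k0 + 1] >= mags[k0 - 1] else -1
    second = mags[k0 + direction]
    delta = second / (mags[k0] + second)  # fractional bin offset in [0, 0.5]
    return (k0 + direction * delta) * sample_rate / n
```

When the true frequency lies very close to a bin center, the two neighbors have nearly equal, noise-dominated magnitudes, so the sign of `direction` can flip. That failure mode is the one the I-Rife family of corrections targets.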

    • Low-Complexity Design of Sparse-Constrained Variable Fractional Delay Filter

      2024, 39(2):481-489. DOI: 10.16337/j.1004-9037.2024.02.020

      Abstract: Since a variable fractional delay (VFD) filter contains a large number of coefficients to be solved, this paper studies a sparse-constrained VFD filter with the Farrow structure. Building on coefficient symmetry, we add an L1 regularization constraint to further enhance sparsity and optimize the frequency response to approximate a desired frequency response in the minimax error sense. The alternating direction method of multipliers (ADMM) is used to obtain the filter coefficients iteratively. Simulation experiments demonstrate that the proposed sparse-constrained VFD filter not only ensures high delay accuracy but also reduces the number of multipliers and adders by 47.69% and 58.60%, respectively, greatly lowering system computation and complexity.
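
In ADMM-based sparse designs, the L1 term is commonly handled by the soft-thresholding (shrinkage) operator, which sets small coefficients exactly to zero. The sketch below shows only that generic step. It is not the paper's minimax formulation, and the function names and threshold value are assumptions.

```python
import math


def soft_threshold(values, threshold):
    """Proximal operator of threshold * ||x||_1, applied elementwise."""
    return [math.copysign(max(abs(v) - threshold, 0.0), v) for v in values]


def count_nonzero(values, tol=0.0):
    """Number of coefficients that would still need a multiplier."""
    return sum(1 for v in values if abs(v) > tol)
```

For example, `soft_threshold([3.0, -0.5, -2.0], 1.0)` shrinks the large coefficients by 1 and zeroes the small one, so `count_nonzero` drops from 3 to 2. Each coefficient driven to zero is one multiplier the filter structure no longer needs.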

    • Detection of VR-induced Motion Sickness Levels Based on EEG Rhythm Energy and Fuzzy Entropy

      2024, 39(2):490-500. DOI: 10.16337/j.1004-9037.2024.02.021

      Abstract: Motion sickness has been a key factor affecting the virtual reality user experience and limiting the growth of the virtual reality industry. To address this issue, this paper investigates the effects of virtual reality motion sickness on neural activity in the brain and uses electroencephalogram (EEG) features to detect levels of motion sickness. To obtain features that can measure the level of vertigo, the EEG signals of subjects are recorded before and during exposure to a vertigo test scene, the rhythm energy and fuzzy entropy are calculated, statistical analysis is used for feature selection, and the validity of the features is finally verified by classification. The results show that the energy in the θ and α bands at CP4 and Oz and the energy in the β and γ bands at C4 are significantly reduced when subjects develop motion sickness (p < 0.01). In terms of fuzzy entropy, values at FC4 and Cz in the δ band are significantly higher (p < 0.0001) and values at O1 in the β band are significantly lower (p < 0.0001). Compared with linear discriminant analysis (LDA), logistic regression (LR) and support vector machine (SVM), K-nearest neighbor (KNN) shows better classification results, with 89% and 91% classification accuracy on rhythm energy and fuzzy entropy, respectively. This study shows that EEG rhythm energy and fuzzy entropy are promising indicators for motion sickness level detection, providing an objective basis for studying the causes of virtual reality motion sickness and options for mitigating it.
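
The fuzzy entropy feature mentioned in this abstract can be sketched as below, following the commonly used definition with baseline-removed templates, Chebyshev distance, and an exponential similarity function exp(-d^n / r). The abstract does not give the parameters, so the defaults for `m`, `n`, and `r` are assumptions, and other variants of the similarity function exist.

```python
import math


def _mean_similarity(signal, m, count, n, r):
    """Average fuzzy similarity between all pairs of length-m templates."""
    templates = []
    for i in range(count):
        window = signal[i:i + m]
        baseline = sum(window) / m
        templates.append([v - baseline for v in window])  # remove local baseline
    total = 0.0
    for i in range(count):
        for j in range(count):
            if i != j:
                # Chebyshev distance between the two templates.
                d = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                total += math.exp(-(d ** n) / r)
    return total / (count * (count - 1))


def fuzzy_entropy(signal, m=2, n=2, r=0.2):
    """ln(phi_m) - ln(phi_{m+1}); values near zero indicate a regular signal."""
    count = len(signal) - m  # same number of templates for lengths m and m + 1
    phi_m = _mean_similarity(signal, m, count, n, r)
    phi_m1 = _mean_similarity(signal, m + 1, count, n, r)
    return math.log(phi_m) - math.log(phi_m1)
```

In practice `r` is often scaled by the standard deviation of the signal, and the feature is computed per channel and per frequency band after band-pass filtering. Both steps are omitted here.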
