Information
Responsible Institution: China Association for Science and Technology
Sponsored by: Chinese Institute of Electronics
Nanjing University of Aeronautics and Astronautics
ISSN: 1004-9037
CN: 32-1367/TN
Address: 29 Yudao Street, Nanjing, China
Telephone: 025-84892742
Chief Editor: 025-84892742
E-mail: sjcj@nuaa.edu.cn
Postal Code: 210016
Abstracting and Indexing
· Chinese Core Periodicals
· Chinese Science Citation Database(CSCD)
· Chinese Core Journals of Science and Technology
· China National Knowledge Infrastructure(CNKI)
· China Science and Technology Journal Database (VIP)
· Chinese Core Journal Database (Wanfang Data)
· Chinese Academic Journal Comprehensive Evaluation Database(CAJCED)
· Scopus
· EBSCO
· Directory of Open Access Journals (DOAJ)
· INSPEC
GE Quanbo, LI Kai, LU Zhenyu, LI Bo, YANG Liang, HUANG Yanjun
2026(1):2-27, DOI: 10.16337/j.1004-9037.2026.01.002
Abstract:
This paper aims to systematically review the research progress in cooperative technologies for sea-air heterogeneous multi-agent systems, clarify their core cooperative paradigms, and reveal the systemic challenges arising from the tight coupling of the “perception, decision-making, and control” full chain. Adopting a system-level functional perspective, the research analyzes the full-chain challenges induced by scale effects and heterogeneity effects, such as system scalability, dynamic matching, and overall robustness. Subsequently, it conducts a review and comparative analysis of the mainstream methods for five key technologies underpinning system cooperation—multi-source data fusion, communication networks, task allocation, path planning, and formation control, evaluating their advantages, limitations, and applicable scenarios. Analysis indicates that while such systems show significant potential in tasks like maritime search and rescue and wide-area inspection, their transition to practical engineering applications remains constrained by bottlenecks such as cross-platform integration difficulties, insufficient adaptability to dynamic environments, and a lack of testing and evaluation frameworks. Future efforts should focus on making sustained breakthroughs in areas such as full-chain cooperative theory modeling, lightweight intelligent algorithms, and standardized engineering architectures, while also exploring new cross-domain cooperative paradigms, to promote the development of marine-aerial heterogeneous multi-agent systems towards greater intelligence, enhanced robustness, and broader application.
SHI Yunhe, ZHANG Xiaofei, WU Qihui
2026(1):28-52, DOI: 10.16337/j.1004-9037.2026.01.003
Abstract:
The unmanned aerial vehicle (UAV) multi-modal ultra-wide spectrum cognitive instrument constructs an intelligent remote sensing system by deeply integrating visible light, infrared, synthetic aperture radar (SAR), and wireless spectrum sensors. It aims to overcome fundamental bottlenecks in traditional UAV remote sensing: limited endurance severely constraining detection range, insufficient payload capacity restricting multi-modal perception, weak onboard computing capability causing real-time processing delays, and finite communication capacity hindering high-fidelity situational assessment. To address endurance challenges, the design employs a hybrid energy configuration combining piston engines and lithium batteries with a vertical take-off and landing (VTOL) flying-wing layout, significantly enhancing operational longevity. For payload limitations, it develops a compound-eye multi-camera array for wide-field high-resolution imaging and integrates a W-band miniaturized SAR with submillimeter-level vibration compensation technology, enabling air-time-frequency multi-dimensional collaborative perception. To resolve real-time processing constraints, a spatiotemporal registration framework and lightweight deep learning model establish a multi-level fusion mechanism (data-feature-semantic layers), elevating detection accuracy for low-observable targets beyond 90%. Targeting communication bottlenecks, innovative generative coding combined with knowledge-graph-driven situational reconstruction achieves high-fidelity 3D situational generation under 400-fold compression, quantified via a no-reference quality assessment model for semantic fidelity. Validated in defense reconnaissance for real-time tracking of concealed targets in complex electromagnetic environments and in emergency response for flood monitoring and 3D reconstruction, the instrument demonstrates practical value in complex scenarios.
Future research should deepen cross-modal semantic understanding optimization and dynamic cooperative control of UAV swarms to advance intelligent remote sensing toward real-time, autonomous cognitive evolution.
SUN Hancun, BAI Zixuan, XU Jin, GE Ning
2026(1):53-65, DOI: 10.16337/j.1004-9037.2026.01.004
Abstract:
With the rapid development of the low-altitude economy in China, the low-altitude airspace is characterized by massive device connectivity and intensive spectrum utilization, posing significant challenges to the real-time safety supervision of unmanned aerial vehicles (UAVs). According to China’s mandatory standards and civil aviation regulations, UAVs are required to continuously broadcast their remote identification (Remote ID) information for monitoring and identification. However, the absence of source authentication in standard broadcast protocols introduces security risks. Moreover, existing research lacks theoretical capacity analysis and quantitative evaluation specifically for the Chinese standard broadcast format. To address these issues, this paper proposes a trustworthy construction method for broadcast Remote ID utilizing the national SM2 cryptographic algorithm. By appending digital signatures to standard messages, this method ensures resilient authentication and eliminates potential security vulnerabilities inherent in international algorithms. Furthermore, we formulate a channel capacity model for the Wi-Fi Beacon broadcast system. Simulation results show that the carrier sense multiple access (CSMA) mechanism achieves an 85% performance improvement compared to the pure ALOHA protocol. Under ideal channel assumptions, using 2.4 GHz single-band, 20 MHz bandwidth, 1 s update cycle, and 18 dBm transmission power, the theoretical capacity of the trustworthy broadcast with SM2 signatures is 82 aircraft/km², which effectively meets the current high-density capacity demand of approximately 15 to 22 aircraft/km². Additionally, a dynamic signature frequency strategy is developed to balance security and capacity. The proposed signing method and capacity analysis model provide a theoretical foundation and design reference for the future deployment of low-altitude regulatory systems.
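For context on the pure ALOHA baseline named in this abstract, the textbook throughput under a Poisson offered load G is S = G·exp(-2G). The sketch below evaluates only that classical formula; it is not the paper's Wi-Fi Beacon capacity model, and the function name is a hypothetical choice made for illustration.

```python
import math


def pure_aloha_throughput(offered_load: float) -> float:
    """Textbook pure-ALOHA throughput S = G * exp(-2G) for Poisson offered load G."""
    return offered_load * math.exp(-2.0 * offered_load)


# The curve peaks at G = 0.5, where S = 1 / (2e), roughly 18.4% channel utilization.
peak = pure_aloha_throughput(0.5)
```

The low peak utilization of pure ALOHA is the usual reason carrier sensing is expected to help in dense broadcast scenarios.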
LI Yizhe, XIE Chenyu, LIU Shuming, WAN Ziheng, WEI Xintan, DONG Lu
2026(1):66-88, DOI: 10.16337/j.1004-9037.2026.01.005
Abstract:
This paper addresses the dual challenge of security and robustness in collaborative decision-making for multi-UAV systems operating in dynamic and adversarial environments, where traditional approaches that decouple safety mechanisms from control policies often fail under anomalies. To this end, we propose adaptive security control with adversarial-resilient endogenous strategy (ASC-ARES), a novel framework grounded in “security by design” and “security left shift” principles that systematically embeds multi-layer constraints, including biconnected topology control, physical collision avoidance, and energy management, into deep reinforcement learning via structured state modeling and reward shaping. Methodologically, ASC-ARES extends the deep deterministic policy gradient (DDPG) algorithm to handle hybrid action spaces through a dual-head policy network for joint optimization of three-dimensional continuous attitude and discrete yaw actions. It further integrates a centroid-guided biconnectivity control algorithm to enable proactive network connectivity awareness and constructs a mean opinion score (MOS)-driven multi-objective adaptive reward mechanism to synergistically optimize quality of experience (QoE), network resilience, safety, and energy efficiency. Experimental results demonstrate that ASC-ARES achieves superior convergence and stability, maintaining an MOS fluctuation rate of only 0.36% and a biconnectivity success rate of 99.98%. Under fast gradient sign method (FGSM), projected gradient descent (PGD), and strong noise interference (?=2.0), the system exhibits exceptional topology reconstruction and state recovery capabilities, with an average performance restoration rate exceeding 80% after interference removal. Ablation studies confirm that the topology control module improves service quality by 59%, while the repulsion mechanism reduces collision risk by 85%.
These findings establish ASC-ARES as an effective paradigm for achieving integrated performance-security co-optimization in resource-constrained multi-agent systems.
CHEN Yu, HAN Tengfei, YANG Peng, XIONG Zehui, CAO Xianbin
2026(1):89-108, DOI: 10.16337/j.1004-9037.2026.01.006
Abstract:
High-precision clock synchronization is a fundamental technology enabling collaborative functions such as distributed sensing, formation control, and data fusion in general aviation swarms. However, in high-dynamic maneuvering scenarios, traditional round-trip time (RTT) synchronization methods suffer from significant accuracy degradation due to the coupling effects of relative motion-induced Doppler shifts and stochastic unequal reply time (URT) delays within airborne nodes. To address these challenges, this paper proposes a novel RTT clock synchronization algorithm that integrates relative-velocity compensation with a hybrid data-driven error correction mechanism. First, a kinematic model considering radial relative velocity is established to explicitly correct propagation delays caused by node mobility. Building on this, a batch-estimation-based delay modeling strategy is introduced. By extracting statistical features from multi-cycle timing data, this method calculates the equivalent processing delay sensitivity to eliminate systematic URT deviations. Furthermore, to address non-linear clock frequency drifts and complex environmental noise that traditional linear filters cannot resolve, a cascaded time-keeping architecture is developed. This architecture combines a Kalman filter (KF) for real-time state recursion with a Back-Propagation (BP) neural network for residual prediction. The BP network utilizes a lightweight topology to learn and compensate for non-linear errors based on inputs such as signal-to-noise ratio (SNR) and historical residuals. Extensive Monte Carlo simulations are conducted across continuous parameter spaces, including relative velocities up to 2 000 m/s and SNRs ranging from 4 dB to 20 dB. The numerical results demonstrate that the proposed algorithm achieves superior robustness and accuracy. Specifically, under strong URT interference (80 ns), the synchronization error remains stable below 0.25 ns. 
In low-SNR environments (4 dB), the root mean square error (RMSE) is controlled at approximately 0.2 ns, which represents a nearly tenfold improvement compared to the baseline.
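As background for the RTT baseline this abstract improves on, conventional two-way timing recovers clock offset and propagation delay from four timestamps, assuming a symmetric path and no relative motion. The sketch below shows only that conventional calculation; the velocity compensation, URT modeling, and KF/BP stages described in the abstract are not reproduced, and the function and argument names are hypothetical.

```python
def two_way_offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """Conventional two-way time transfer between nodes A and B.

    t1: A transmits (A's clock)    t2: B receives (B's clock)
    t3: B replies   (B's clock)    t4: A receives (A's clock)
    Assumes equal forward and return propagation delay (no node motion).
    Returns (offset of B's clock relative to A's, one-way propagation delay).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = ((t4 - t1) - (t3 - t2)) / 2.0
    return offset, delay


# Synthetic check: B's clock runs 5 us ahead of A's, one-way delay is 1 us,
# and B takes 2 us to reply.
offset, delay = two_way_offset_and_delay(0.0, 6e-6, 8e-6, 4e-6)
```

When the nodes move during the exchange, the forward and return delays differ, so the symmetric-path assumption breaks; that is the error source the abstract targets.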
WANG Zhen, HAN Jiqing, HE Yongjun, ZHENG Tieran, ZHENG Guibin
2026(1):109-116, DOI: 10.16337/j.1004-9037.2026.01.007
Abstract:
In recent years, the emergence of the Transformer model has significantly enhanced the accuracy of automatic speech recognition technology. This research aims to address the critical security vulnerabilities in Transformer-based automatic speech recognition systems by enhancing the transferability of universal speech adversarial examples. While Transformer models have significantly advanced speech processing, their susceptibility to universal adversarial perturbations remains a major concern. To exploit these weaknesses effectively, we propose a novel attack framework that leverages the structural commonalities of Transformer architectures. First, we implement a feature-level disruption strategy that maximizes the dissimilarity between perturbed and original speech within the middle-layer representations. By altering these latent representation patterns, the attack successfully shifts the internal decision boundaries of models. Second, given that sample-dependent semantic information often inhibits the generalization of universal noise, we introduce an attention gradient control mechanism. This mechanism strategically weakens the gradients associated with semantic context features, forcing the perturbation to capture underlying, sample-independent acoustic vulnerabilities instead. Finally, experimental evaluations conducted on LibriSpeech demonstrate the superior performance of the proposed method. The results indicate that our approach achieves an average word error rate of 80.6% across multiple target models, representing a 36.6% improvement in transferability compared to existing baseline universal attacks. These findings conclude that the targeted manipulation of middle-layer features combined with the suppression of semantic dependencies is a highly effective strategy for cross-model adversarial threats.
Highlights:
1. Propose a novel framework of universal speech adversarial attacks that maximizes middle-layer feature dissimilarity to exploit the structural similarities inherent in Transformer-based speech recognition models.
2. Introduce a targeted attention gradient control mechanism to decouple sample-independent acoustic features from sample-dependent semantic context, significantly boosting attack transferability.
3. Achieve a substantial increase in universal attack success rates across diverse Transformer architectures, outperforming traditional universal perturbation methods.
LIU Yuezhao, GUO Haiyan, WANG Tianshun, CHEN Feifei
2026(1):117-131, DOI: 10.16337/j.1004-9037.2026.01.008
Abstract:
In multi-user speech transmission scenarios, the statistical heterogeneity of data among users degrades transmission performance when all users share a single semantic-communication-based speech transmission model. To address this problem, this paper proposes a novel deep learning-based semantic communication system using federated learning based on hypernetworks (DeepSC-FedHN), enabling each user to obtain a personalized model adapted to its own data characteristics without compromising data privacy. Specifically, considering that different modules of the semantic encoder play different roles in extracting semantic information, the edge server employs a per-user hypernetwork to generate a personalized aggregation weight matrix by dynamically evaluating the importance of each module in the semantic encoder. The generated aggregation weight matrix is then used to update the corresponding model parameters, effectively tailoring the global knowledge to different users’ needs. Concurrently, since the channel codec and semantic decoder are not involved in extracting the semantic features of each local user’s data, the standard federated averaging (FedAvg) algorithm is used to perform weighted aggregation and updates on the channel codecs and semantic decoders of all the users. Experimental results on the TIMIT and Edinburgh DataShare datasets show that the proposed DeepSC-FedHN scheme significantly improves speech transmission performance. Specifically, it outperforms conventional local training, the standard FedAvg approach, the federated proximal (FedProx) method, and the layer-wise personalized FL scheme (DeepSC-pFedLA) in terms of perceptual evaluation of speech quality (PESQ), signal-to-distortion ratio (SDR) and short-time objective intelligibility (STOI), particularly in non-independent and identically distributed (non-IID) data settings.
Additionally, the proposed DeepSC-FedHN model exhibits better generalization ability for unseen speakers’ data and also demonstrates significantly lower computational overhead for model aggregation compared to DeepSC-pFedLA. We conclude that the integration of a hypernetwork for generating personalized weights offers a highly effective mechanism for tackling data heterogeneity in federated semantic communication systems, leading to superior and more adaptable speech transmission performance while fully preserving user data privacy.
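The standard FedAvg step that this abstract applies to the channel codecs and semantic decoders can be sketched as a sample-count-weighted average of client parameters. This is a minimal illustration of generic FedAvg only; the hypernetwork-generated aggregation weights for the semantic encoder are not reproduced, and the parameter-dictionary layout and function name are assumptions.

```python
import numpy as np


def fedavg(client_params, client_sizes):
    """Average per-client parameter dicts, weighting each client by its sample count.

    client_params: list of dicts mapping parameter name -> NumPy array (hypothetical layout)
    client_sizes:  list of local dataset sizes, one per client
    """
    total = float(sum(client_sizes))
    keys = client_params[0].keys()
    return {
        name: sum(params[name] * (size / total)
                  for params, size in zip(client_params, client_sizes))
        for name in keys
    }


# Two clients holding 1 and 3 samples; the larger client dominates the average.
global_params = fedavg(
    [{"w": np.array([1.0, 1.0])}, {"w": np.array([3.0, 3.0])}],
    [1, 3],
)
```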
ZHOU Qian, WU Jiayang, ZHOU Yuhang
2026(1):132-146, DOI: 10.16337/j.1004-9037.2026.01.009
Abstract:
This study aims to address the critical challenges of spatial imbalance and low utilization efficiency in the siting of electric taxi charging facilities in large-scale urban environments. To this end, this paper proposes a multi-objective particle swarm optimization algorithm integrating epsilon-constraint handling and fuzzy mathematical programming, referred to as FMPPSO, with the objective of achieving a balanced and efficient charging facility layout that simultaneously considers economic cost, service efficiency, and battery health. The proposed method formulates the electric taxi charging station siting problem as a multi-objective optimization model incorporating construction and operation costs, taxi charging waiting time (reflecting passenger pickup rate), and battery degradation cost. To overcome the limitations of traditional weighted-sum methods and conventional evolutionary algorithms, fuzzy membership functions are constructed to normalize heterogeneous objectives into a unified fuzzy decision space, enabling adaptive adjustment of objective preferences while preserving the original optimization structure. Furthermore, an epsilon-constraint mechanism is introduced to transform secondary objectives into dynamic constraints, which effectively balances solution convergence and Pareto front diversity, mitigates premature convergence, and enhances global search capability. The transformed problem is solved using an enhanced particle swarm optimization framework, where particles represent candidate charging station locations and evolve iteratively under fuzzy-evaluated fitness and epsilon-controlled feasibility conditions. Extensive simulation experiments are conducted based on realistic electric taxi operation scenarios, and the proposed FMPPSO algorithm is compared with several state-of-the-art multi-objective optimization algorithms. 
Experimental results demonstrate that FMPPSO achieves superior performance in terms of convergence speed, solution stability, and Pareto solution diversity. Quantitatively, the proposed method improves the final objective values by approximately 3.8% compared with benchmark algorithms, while also exhibiting faster convergence under the same computational budget.
LI Zekun, SHI Zhenwei, ZOU Zhengxia
2026(1):147-159, DOI: 10.16337/j.1004-9037.2026.01.010
Abstract:
Remote-sensing instance segmentation often suffers from ambiguous object boundaries and cluttered backgrounds, while adding heavy mask heads can increase computational cost and reduce deployment flexibility. This paper aims to develop a fast, accurate, and detector-agnostic mask-generation scheme that can be integrated into existing detection pipelines with minimal engineering overhead and without extra training. We propose a two-stage framework that couples a replaceable object detector (e.g., YOLOv10 or DINO) with a plug-and-play harmonic background modelling (HBM) module. For each detected bounding box, HBM treats the local background as a harmonic function and reconstructs it by least-squares fitting of a truncated harmonic-polynomial basis. Boundary constraints are formed by sampling pixel values along the bounding-box boundary, and the coefficients are solved efficiently via the Moore-Penrose pseudoinverse. The foreground mask is then derived from the channel-wise residual between the original image and the reconstructed background, followed by a contrast-enhancing nonlinearity, Otsu thresholding, and connected-component filtering to suppress spurious fragments. The overall pipeline is fully decoupled from the detector: the detector is not modified or retrained, and the additional computation mainly comes from solving a small least-squares problem per proposal rather than processing full-resolution feature maps with a learned segmentation head. Extensive experiments on NWPU VHR-10 and iSAID-mini datasets demonstrate consistent gains in both box and mask metrics, while maintaining high throughput. With DINO as the proposal generator, DINO+HBM achieves AP-Box and AP-Mask of 69.3% and 66.3% on NWPU VHR-10 and reaches AP-Mask-50 of 92.1%, improving the previous best result by 2.5 percentage points. 
On iSAID-mini, DINO+HBM obtains AP-Box and AP-Mask of 55.3% and 42.3% with AP-Mask-50 and AP-Mask-75 of 72.1% and 53.3%, showing clear benefits under more complex scenes. Ablation studies further verify the roles of truncation order, constraint-point number, and sampling strategy, and indicate that bounding-box boundary sampling is more stable than random sampling for background regression and mask extraction without sacrificing speed. The proposed training-free harmonic background suppression provides an efficient way to obtain boundary-faithful instance masks in remote-sensing images and offers a practical, modular add-on to detector-based pipelines when rapid inference and easy deployment are required.
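The background-reconstruction step described in this abstract (fitting a truncated harmonic-polynomial basis to pixels on the bounding-box boundary and solving via the Moore-Penrose pseudoinverse) can be sketched as follows. This is a single-channel illustration under assumptions: the basis uses the real and imaginary parts of (x + iy)^k, coordinates are normalized to [-1, 1], every border pixel is used as a constraint, and all function names are hypothetical. The paper's exact basis, sampling density, and post-processing (nonlinearity, Otsu thresholding, component filtering) are not reproduced.

```python
import numpy as np


def harmonic_basis(x, y, order):
    """Stack 1, Re(z^k), Im(z^k) for k = 1..order, with z = x + iy (each term is harmonic)."""
    z = x + 1j * y
    cols = [np.ones_like(x)]
    for k in range(1, order + 1):
        zk = z ** k
        cols.extend([zk.real, zk.imag])
    return np.stack(cols, axis=-1)


def reconstruct_background(patch, order=3):
    """Fit the basis to the patch's border pixels, then evaluate it over the whole patch."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Normalize coordinates to [-1, 1] to keep the least-squares system well conditioned.
    x = 2.0 * xs / (w - 1) - 1.0
    y = 2.0 * ys / (h - 1) - 1.0
    border = np.zeros((h, w), dtype=bool)
    border[0, :] = border[-1, :] = True
    border[:, 0] = border[:, -1] = True
    design = harmonic_basis(x[border], y[border], order)
    coeffs = np.linalg.pinv(design) @ patch[border]
    return harmonic_basis(x, y, order) @ coeffs


def foreground_residual(patch, order=3):
    """Absolute residual between the patch and its reconstructed background."""
    return np.abs(patch - reconstruct_background(patch, order))
```

A large residual inside the box marks pixels that the smooth background model cannot explain, which is the quantity a thresholding step would then binarize.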
XIANG Wenliang, XIONG Shuhua, HE Haibo, TENG Qizhi, HE Xiaohai
2026(1):160-173, DOI: 10.16337/j.1004-9037.2026.01.011
Abstract:
Rock thin-section microscopic images frequently exhibit complex local textures, blurriness, and high noise levels, posing significant challenges for traditional feature extraction and matching algorithms. These methods often fail to identify effective feature points in high-resolution rock thin-section images, hindering the realization of panoramic stitching while also resulting in slow processing speeds. To address the aforementioned issues, a rock thin-section microscopic image stitching method based on an improved GLU-Net has been proposed. This method integrates an enhanced correlation computation module to improve global and local correspondence, employs a feature pyramid network for multi-scale feature fusion, incorporates a designed adaptive convolutional attention mechanism to optimize attention to key regions, utilizes global and local decoders to obtain optical flow, and applies homography transformation for image stitching, thereby constructing a novel image stitching network model. Experimental results demonstrate that, compared to traditional image stitching algorithms and other classical image stitching network models, the proposed network achieves superior stitching performance. In stitching tests on a self-constructed dataset, a stitching accuracy of 86.75% has been attained with an average registration time of 0.394 s per pair, effectively balancing enhanced accuracy with processing efficiency.
NIU Hongxia, SONG Dingxin, HOU Tao
2026(1):174-186, DOI: 10.16337/j.1004-9037.2026.01.012
Abstract:
To address problems of color cast, low sharpness, and poor performance of dark channel prior methods in processing sky regions in sand dust images, a sand dust image enhancement method based on color cast correction and sky segmentation is proposed. First, color cast in sand dust images is corrected by combining color channel compensation and the gray world algorithm. Second, a dehazing method based on sky segmentation is proposed. The image segmentation threshold is determined by information entropy, and the image is segmented into sky and non-sky regions using this threshold. A fusion window is then used to optimize the dark channel. Next, an adaptive adjustment factor is introduced to adjust the transmittance, and an atmospheric scattering model is used to restore the image, achieving the effect of removing haze. Finally, in the hue, saturation, value (HSV) space, global adaptive saturation compensation is performed on the S channel through adaptive saturation enhancement, while the V channel is enhanced with adaptive gamma correction. The proposed method improves the average gradient, standard deviation, and information entropy by 2.27%, 4.34%, and 0.25%, respectively. Experimental results show that the proposed method can correct the color cast phenomenon in sand dust images, improve image quality, and enhance the improvement effect on sky regions.
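The gray world algorithm mentioned in this abstract rests on the assumption that the average scene color is neutral, so each channel is rescaled until its mean matches the overall mean. The sketch below shows only that generic step; the color channel compensation that the abstract combines with it is not reproduced, and the function name and the 0 to 255 value range are assumptions.

```python
import numpy as np


def gray_world(image):
    """Scale each color channel so its mean equals the mean over all channels.

    image: H x W x C array with values in [0, 255] (assumed range).
    """
    img = image.astype(np.float64)
    channel_means = img.reshape(-1, img.shape[-1]).mean(axis=0)
    # Guard against division by zero for an all-black channel.
    gains = channel_means.mean() / np.maximum(channel_means, 1e-12)
    return np.clip(img * gains, 0.0, 255.0)
```

On a yellowish sand-dust image the blue channel mean is typically the lowest, so it receives the largest gain.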
LUO Juncheng, XIE Minghong, ZHANG Yafei, LI Huafeng
2026(1):187-201, DOI: 10.16337/j.1004-9037.2026.01.013
Abstract:
Due to the limitations of existing imaging equipment, it is difficult to obtain high dynamic range (HDR) images directly. High dynamic range imaging technology is designed to generate HDR images by processing low dynamic range (LDR) images. Most existing deep learning methods reconstruct HDR images by fusing multiple images with different exposures. However, due to the relative movement of foreground and background, artifacts appear in the final reconstruction result. Existing methods only perform artifact elimination before fusing multiple images with different exposures, which leads to a heavy dependence of the final HDR image quality on the artifact suppression results before fusion. Moreover, the artifact information introduced during the fusion process is difficult to eliminate in subsequent reconstruction due to unsatisfactory artifact suppression. To address this, we propose a network framework for multi-artifact suppression of reconstructed features and multilevel information fusion to efficiently reconstruct HDR images. First, we handle the differences between different images and features through multiple artifact suppression. Unlike existing methods that only process images or features before fusion, we perform multiple artifact suppression block (MASB) on the features during the reconstruction process to further suppress artifacts in the reconstructed features. Simultaneously, to better utilize the features of non-reference input images, we propose a multilevel fusion block (MFB), through which complementary information from non-reference images can be further extracted. Experimental comparisons on multiple datasets demonstrate that the proposed method achieves better performance in both subjective visual effects and objective metrics.
2026(1):202-214, DOI: 10.16337/j.1004-9037.2026.01.014
Abstract:
Synthetic aperture radar (SAR) imagery is characterized by a large number of targets with diverse categories and significant scale variations, as well as highly complex background clutter caused by coherent speckle noise. These inherent properties substantially degrade detection accuracy and pose significant challenges to reliable target detection. To address the problem of insufficient detection performance under such conditions, this paper proposes a SAR target detection algorithm that jointly exploits spatial-channel feature fusion and frequency selection. Specifically, a ResNet-50 network pre-trained on large-scale datasets is adopted as the backbone to extract hierarchical and multi-scale feature representations from SAR images. On this basis, a feature pyramid network (FPN) augmented with a joint multi-scale spatial-channel feature enhancement module is constructed to strengthen the representation capability of features at different scales. This design enables the network to more effectively capture discriminative target information while alleviating the adverse impact of scale diversity among targets. By jointly modeling spatial and channel-wise dependencies, the proposed enhancement module improves feature expressiveness and robustness, particularly for small and weak targets embedded in cluttered backgrounds. Furthermore, a frequency selection module is introduced in the feature domain to explicitly exploit the frequency characteristics of SAR imagery. This module selectively suppresses noise components while preserving informative target-related signals, thereby enhancing target features and improving the signal-to-noise ratio. Through adaptive frequency-domain filtering, the proposed method effectively mitigates the influence of speckle noise without sacrificing critical structural information, leading to more reliable feature representations for subsequent detection.
Extensive comparative experiments are conducted on two widely used benchmark datasets, MSAR and SARDet-100K, to evaluate the effectiveness of the proposed approach. Experimental results demonstrate that the proposed algorithm consistently outperforms several representative and state-of-the-art SAR image target detection methods, including Faster R-CNN, ConvNeXt, PVT-T, and YOLOF, across both datasets. These results indicate that the proposed framework achieves superior detection performance and exhibits strong generalization capability under complex SAR imaging conditions. Overall, the proposed method provides an effective solution for improving SAR target detection accuracy in scenarios involving complex backgrounds, severe speckle noise, and multi-scale target distributions.
2026(1):215-230, DOI: 10.16337/j.1004-9037.2026.01.015
Abstract:
Multi-view clustering is a powerful technique for improving analytical performance by fusing complementary multi-source information. However, existing methods are deficient in two ways: they neglect the strong inherent correlation between representation tensors and affinity matrices, and their separate two-step strategy of representation learning and clustering leaves the two processes unassociated, making them inefficient at handling missing data, noise, and outliers in multi-view data. To address these issues, this paper proposes a multi-view subspace clustering method based on tensor low-rank learning. The method analyzes high-order correlations among data points and identifies the intrinsic structure of the data by introducing a high-order tensor constraint based on low-rank representation (LRR) and adopting tensor nuclear norm minimization (TNNM) based on tensor singular value decomposition (t-SVD). This approach transforms the original non-convex optimization problem into a solvable convex one. An adaptive weighted Schatten-p norm is applied to capture the inherent differences between singular values using their prior information. Spectral clustering is integrated into a unified framework to optimize the affinity matrix and characterize clustering structures more effectively. The inexact augmented Lagrange multiplier (ALM) method is used to decompose the model into four solvable sub-problems for efficient optimization. Comprehensive experiments are conducted on six benchmark datasets spanning facial images, news stories, handwritten digits and general objects, with systematic optimization of key parameters to ensure reliability.
The findings demonstrate that the proposed method exhibits a substantial enhancement in performance when compared to four contemporary algorithms, namely t-SVD-MSC, ETLMSC, WTNNM and MLAN. The proposed method demonstrated an accuracy of 0.981 on the Yale dataset, 0.995 on the UCI-Digits dataset, and 0.971 on the Scene-15 dataset. The proposed method effectively increases the robustness of the affinity matrix against noise and outliers. It accurately extracts the intrinsic subspace structure of multi-view data and demonstrates excellent practical performance and strong generalization ability in the analysis of high-dimensional and incomplete multi-view data.
XUE Bao, ZHOU Junjie, SHAO Wei
2026(1):231-243, DOI: 10.16337/j.1004-9037.2026.01.016
Abstract:
Whole slide images (WSIs) serve as the gold standard for pathological diagnosis, and their accurate classification provides critical information on tumor type, grade, and stage, which is essential for cancer prognosis and treatment strategy selection. In computational pathology, multi-instance learning (MIL) has become the mainstream approach for WSI classification. However, most existing MIL methods focus on single-scale pathological images, limiting the understanding of cancer development and progression mechanisms across different levels. Additionally, the high resolution of WSIs and information discrepancies across scales pose challenges to efficiently integrating and analyzing patches both within a single scale and across multiple scales. To address these issues, this paper proposes a WSI classification method based on deformable attention and multi-scale multi-instance learning (DMSMIL). Specifically, a deformable attention branch is designed to learn associations among patches within the same scale, enhancing attention computation efficiency. Meanwhile, an optimal transport (OT)-based association algorithm is developed to integrate pathological information across different scales, enabling efficient multi-scale information alignment. Experimental results on breast cancer and lung cancer subtype classification tasks demonstrate that the proposed method achieves classification accuracies of 85.39% and 92.00%, respectively, outperforming mainstream WSI classification methods. The proposed DMSMIL effectively integrates multi-scale pathological features and improves the accuracy of WSI-based cancer subtype classification, providing a promising approach for computational pathological diagnosis.
2026(1):244-258, DOI: 10.16337/j.1004-9037.2026.01.017
Abstract:
Reconstructing visual images from electroencephalogram (EEG) signals has become an emerging frontier in brain-computer interface (BCI) research, offering substantial potential in medical image reconstruction, neural decoding, and cognitive state analysis. However, the inherently noisy, low-amplitude, and highly temporal characteristics of EEG signals pose considerable challenges to robust feature extraction and high-fidelity image synthesis. To address these limitations, this study aims to establish an effective EEG-driven visual reconstruction framework capable of capturing fine-grained temporal dynamics while ensuring semantic consistency in the generated images. The proposed model integrates a double residual long short-term memory (LSTM) architecture with a self-designed deep convolutional generative adversarial network (DCGAN). Specifically, an LSTM network based on an attention residual network and triplet loss (ARTLNet) is constructed to improve EEG feature extraction by combining residual learning, temporal modeling, and self-attention mechanisms. Batch normalization and global average pooling are further employed to enhance signal stability and suppress feature redundancy. In the reconstruction stage, a customized DCGAN incorporating feature fusion is adopted to enrich semantic representation and improve image clarity and diversity. Experimental evaluations on both the Characters and Objects EEG datasets show that ARTLNet achieves consistently higher classification and clustering accuracy across multiple algorithms than baseline LSTM and non-residual architectures. The generated images exhibit clearer structural details and more distinguishable category attributes, verifying the effectiveness of the proposed generative strategy. The results demonstrate that combining residual-enhanced temporal modeling with feature-fusion-based adversarial generation can significantly improve EEG-driven visual reconstruction performance.
This study confirms the viability of exploiting advanced deep learning mechanisms to decode and visualize EEG information with improved interpretability, providing methodological support for future BCI-based image reconstruction and neural representation studies.
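For readers unfamiliar with the triplet loss mentioned in the abstract, a minimal sketch of its standard form follows. It is not taken from ARTLNet: the unsquared Euclidean distance, the margin of 1.0, and the function names are assumptions, and the embeddings are toy two-dimensional vectors rather than EEG features.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: pull the positive closer than the negative by a margin."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


# The positive is already much closer than the negative, so the loss is zero.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])

# Swapping the roles violates the margin and yields a positive loss.
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])

print(easy, hard)
```

Some formulations use squared distances instead; the choice changes the gradient scale but not the idea of separating same-class from different-class embeddings.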
2026(1):259-271, DOI: 10.16337/j.1004-9037.2026.01.018
Abstract:
To enhance the throughput of the mobile edge computing (MEC) system, this paper investigates a rate splitting multiple access (RSMA)-MEC system assisted by a collaborative combination of an active reconfigurable intelligent surface (RIS) and a decode-and-forward (DF) relay. In this system, the active RIS is deployed to improve the signal transmission conditions, while the DF relay is employed to extend the communication range. Additionally, the RSMA protocol is used to enhance spectrum efficiency in multi-user access scenarios, and both the relay and the base station (BS) apply successive interference cancellation to decode the transmitted signals. The interaction among these components requires a systematic approach to resource allocation and interference management, so fully exploiting the potential gains of this composite architecture calls for joint optimization across all key parameters. To maximize the system throughput, a joint optimization problem is investigated that involves the relay’s decoding order and transmit power, the BS’s receive beamforming and decoding order, the active RIS reflection coefficients, and the users’ offloading strategies. An alternating optimization algorithm is proposed to obtain a suboptimal solution to the throughput maximization problem. Finally, numerical results validate that the collaborative assistance of the active RIS and the DF relay can effectively enhance the throughput of the RSMA-MEC system.
LI Ting, SHEN Mingyu, ZHANG Chunjie, XIE Peizhong
2026(1):272-286, DOI: 10.16337/j.1004-9037.2026.01.019
Abstract:
Orbital angular momentum (OAM) provides additional degrees of freedom that can improve the spectral efficiency of wireless communication. However, its application typically requires strict alignment between the transmitting and receiving antennas to maintain modal orthogonality. When misaligned concentric uniform circular arrays (UCAs) are used to generate and receive OAM beams, there is not only modal interference caused by the misalignment of each single-ring UCA, but also array interference among the different UCAs in the concentric structure. This paper therefore combines UCA-based OAM multiplexing with multiple-input multiple-output (MIMO) technology to form a uniform concentric circular array orbital angular momentum (UCCA-OAM) communication system. To further improve the performance of OAM communication, a UCCA-OAM multiplexing transmission system for the misaligned case is proposed. The system uses multiple concentric UCAs at both the transmitting and receiving ends and achieves signal multiplexing through discrete Fourier transform processing. To suppress both kinds of interference, a dual-module interference cancellation scheme is proposed that approximately eliminates modal interference and array interference. First, the modal interference caused by the misalignment of the single-ring UCA is approximately eliminated through a beam control scheme based on phase compensation.
Then, building on the inversion of a four-block matrix, a progressive block matrix inversion is proposed to eliminate the array interference caused by the multi-ring UCA, completing the dual-module interference cancellation scheme. Experimental results demonstrate that the proposed scheme effectively suppresses both modal and array interference, thereby improving the communication performance of the UCCA-OAM system.
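The role of discrete Fourier transform processing in UCA-based OAM multiplexing can be sketched for the ideal case. The example below covers only a single ring with perfect alignment and no channel distortion or noise; it is not the paper's misaligned UCCA-OAM system or its interference cancellation scheme. The function names, the four-element array, and the symbol values are assumptions chosen for illustration.

```python
import cmath
import math


def oam_multiplex(symbols):
    """Map one symbol per OAM mode onto N array elements (inverse DFT).

    Element k receives mode l with phase exp(j*2*pi*l*k/N).
    """
    n = len(symbols)
    return [
        sum(s * cmath.exp(2j * math.pi * mode * k / n) for mode, s in enumerate(symbols))
        / math.sqrt(n)
        for k in range(n)
    ]


def oam_demultiplex(samples):
    """Separate the modes again with the forward DFT across array elements."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * mode * k / n) for k, x in enumerate(samples))
        / math.sqrt(n)
        for mode in range(n)
    ]


# One complex symbol per OAM mode (modes 0..3 on a hypothetical 4-element UCA).
symbols = [1 + 1j, -1 + 0j, 0.5j, 2 + 0j]
element_signals = oam_multiplex(symbols)
recovered = oam_demultiplex(element_signals)

print([complex(round(r.real, 6), round(r.imag, 6)) for r in recovered])
```

In this idealized setting the modes are exactly orthogonal, so every symbol is recovered without crosstalk. Misalignment breaks that orthogonality, which is the interference the abstract's cancellation scheme is designed to address.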