Low Bit Rate Generative Drone Video Compression
Author: LIU Meiqin, CHEN Hongyu, ZHOU Yiming, NI Wenhao

Affiliation:

1. Institute of Information Science, Beijing Jiaotong University, Beijing 100044, China
2. Visual Intelligence+X International Cooperation Joint Laboratory, Beijing Jiaotong University, Beijing 100044, China

CLC Number: TP391


Abstract:

In complex environments spanning air, space, land, and sea, the massive volume of video data places tremendous pressure on limited transmission bandwidth and storage devices. Improving the coding efficiency of video compression at low bit rates is therefore crucial. In recent years, deep learning-based video compression algorithms have made significant progress, yet flaws in model design, mismatches between optimization objectives and perceptual quality, and biases in training data distributions degrade visual quality at extremely low bit rates. Generative coding learns the data distribution to restore texture and structure at low bit rates, alleviating the blur artifacts common in deep video compression. However, existing research still faces two major bottlenecks: first, temporal correlation modeling is insufficient and inter-frame feature correlation is missing; second, the lack of a dynamic bit allocation mechanism makes adaptive extraction of key information difficult. This article therefore proposes a video coding algorithm based on a conditional guided diffusion model for video compression (CGDM-VC), which aims to improve the perceptual quality of videos at low bit rates while enhancing inter-frame feature modeling and preserving key information. Specifically, the algorithm designs an implicit inter-frame alignment strategy that uses a diffusion model to capture latent inter-frame features, reducing the computational cost of estimating explicit motion information. Meanwhile, the designed adaptive spatio-temporal importance-aware encoder dynamically allocates bit rates to improve the generation quality of key regions. Furthermore, a perceptual loss function built on the learned perceptual image patch similarity (LPIPS) constraint improves the visual fidelity of the reconstructed frames. Experimental results demonstrate that, compared with algorithms such as deep contextual video compression (DCVC), the proposed method reduces LPIPS by an average of 36.49% at low bit rates (<0.1 BPP), producing richer texture details and more natural visual effects.
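The importance-aware bit allocation described above can be illustrated with a toy sketch. The function below is an assumption for illustration only, not the paper's actual CGDM-VC mechanism: it splits a fixed bit budget across regions in proportion to softmax-normalized importance scores, so more salient regions receive more bits.

```python
import math

def allocate_bits(importance, total_bits):
    """Distribute a bit budget across regions in proportion to
    softmax-normalized importance scores (illustrative toy model,
    not the CGDM-VC allocator itself)."""
    exps = [math.exp(s) for s in importance]
    z = sum(exps)
    return [total_bits * e / z for e in exps]

# Three regions: one salient foreground, two background tiles.
bits = allocate_bits([2.0, 0.5, 0.5], 1000.0)
```

Here the budget sums to 1000 bits and the high-importance region receives the largest share; a real codec would derive the importance scores from learned spatio-temporal features.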

Citation:

LIU Meiqin, CHEN Hongyu, ZHOU Yiming, NI Wenhao. Low Bit Rate Generative Drone Video Compression[J].,2025,40(2):320-333.

History
  • Received: February 15, 2025
  • Revised: March 14, 2025
  • Online: April 11, 2025