Robust Detection Method for AI-Generated Images Based on CNN-Transformer Hybrid Architecture
Author:
Affiliation:

1. Shandong Provincial Key Laboratory of Ubiquitous Intelligent Computing, School of Information Science and Engineering, University of Jinan, Jinan 250022, China; 2. Beijing Estun Medical Technology Co., Ltd., Beijing 102200, China; 3. College of Computer Science and Technology, Qingdao University, Qingdao 266071, China; 4. Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Jinan 250353, China; 5. Qilu University of Technology (Shandong Academy of Sciences), Shandong Provincial Key Laboratory of Industrial Network and Information System Security, Shandong Fundamental Research Center for Computer Science, Jinan 250353, China

Fund Project:

National Natural Science Foundation of China (62103165); Open Project of the Key Laboratory of Computing Power Network and Information Security, Ministry of Education (2023ZD038); Natural Science Foundation of Shandong Province (ZR2022ZD01).

    Abstract:

    With the rapid development of deep generative models, synthetic images have become increasingly realistic. Generative techniques ranging from image generation to face manipulation are now part of everyday life, which has raised concerns about image authenticity. Mainstream image classification models are pre-trained mainly on natural-scene datasets with rich and varied styles, whereas a single prompt can generate a large amount of data that is noticeably homogeneous. This homogeneity unbalances the learning difficulty across samples, so conventional binary image classification training generalizes poorly on the generated-image detection task. To address these issues, we propose a detection method for the setting in which hard and easy samples are imbalanced. The method requires no modification to existing classification models: it establishes an effective data augmentation paradigm through self-augmentation of the generated data, which expands the diversity of the generated data and thereby balances the learning difficulty of the model. In addition, a corrected class cross-entropy loss applies a sensitive penalty to hard and easy samples. The proposed method achieved the best result in the Computer Vision Application Challenge (real and fake image recognition track) held by the Artificial Intelligence Society of Shandong Province in November 2023.
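The abstract does not give the form of the corrected cross-entropy loss, so the following is only a minimal sketch of the general idea of penalizing hard and easy samples differently. It substitutes a focal-style weight for the authors' loss; the function name `difficulty_weighted_bce` and the parameter `gamma` are hypothetical and do not come from the paper.

```python
import math


def difficulty_weighted_bce(probs, labels, gamma=2.0, eps=1e-7):
    """Mean binary cross-entropy with a focal-style difficulty weight.

    This is an illustrative stand-in, not the loss used in the paper.
    probs  : predicted probability that each image is AI-generated
    labels : 1 for generated, 0 for real
    gamma  : hypothetical focusing parameter; 0 recovers plain cross-entropy
    """
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equal length")
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)    # clamp to avoid log(0)
        p_true = p if y == 1 else 1.0 - p  # probability given to the true class
        weight = (1.0 - p_true) ** gamma   # small for easy samples, large for hard ones
        total += -weight * math.log(p_true)
    return total / len(probs)


# An easy sample (confidently correct) versus a hard one (confidently wrong).
easy = difficulty_weighted_bce([0.9], [1])
hard = difficulty_weighted_bce([0.1], [1])
print(f"easy={easy:.6f} hard={hard:.6f}")
```

With `gamma` above zero, confidently correct samples contribute far less to the total than misclassified ones, which is one common way to keep a large pool of homogeneous, easy generated images from dominating training.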

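Cite this article

康馨元, 李帆, 赵慧, 王保栋, 李鑫. Robust Detection Method for AI-Generated Images Based on CNN-Transformer Hybrid Architecture[J]. Journal of Data Acquisition and Processing (数据采集与处理), 2025, 40(5): 1283-1293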

History
  • Received: 2024-02-24
  • Last revised: 2024-07-20
  • Published online: 2025-10-15