RD-GAN: A High Definition Animation Face Generation Method Combined with Residual Dense Network

doi:10.16337/j.1004-9037.2021.01.002

2021, No. 1, pp. 22-34

Affiliation: School of Computer Information Engineering, Jiangxi Normal University, Nanchang 330022, China
Fund Project: Supported by the National Natural Science Foundation of China (61462042, 61966018).

Abstract:

With the rapid development of the animation industry, generating anime faces has become a key technology. Anime faces have a highly simplified and abstract style and tend to have clear edges, smooth shading, and relatively simple textures. These characteristics pose great challenges to the loss functions of existing methods, and style transfer techniques designed for paintings cannot produce satisfactory anime results. This paper therefore proposes a novel loss function suited to anime images. Its semantic loss is expressed as a regularized form on the high-level feature maps of the VGG network to cope with the style differences between real and anime images, and its edge sharpness loss, which incorporates edge enhancement, preserves the sharp edges of anime images. Experiments on four public datasets show that the proposed loss function generates clear and vivid anime face images. Compared with existing methods, the recognition rate of the proposed method increases by 0.43% (Hayao Miyazaki style) and 3.29% (Makoto Shinkai style) on the CK+ dataset, by 0.85% (Hayao Miyazaki style) and 2.42% (Makoto Shinkai style) on the RAF dataset, and by 0.71% (Hayao Miyazaki style) and 3.14% (Makoto Shinkai style) on the SFEW dataset. Results on the CelebA dataset also demonstrate the generation quality of the proposed method. The experimental results indicate that the proposed method combines the advantages of deep learning models and makes the detection results more accurate.
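The abstract names two loss terms but does not give their formulas. The following is only a minimal sketch of how such terms could be combined, under stated assumptions: the semantic term is taken to be a mean absolute (L1) difference between feature maps, the edge term is a hypothetical proxy based on mean squared differences between neighboring pixels, and the weights `w_sem` and `w_edge` are illustrative values. None of these choices, names, or values come from the paper, and plain Python lists stand in for VGG feature maps and images.

```python
from typing import List

# Stand-in for one 2D feature-map channel or one grayscale image (assumption).
Grid = List[List[float]]


def l1_distance(a: Grid, b: Grid) -> float:
    """Mean absolute difference between two equally sized grids."""
    diffs = [abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def edge_strength(img: Grid) -> float:
    """Mean squared difference between horizontally and vertically adjacent pixels."""
    diffs = []
    for i, row in enumerate(img):
        for j, value in enumerate(row):
            if j + 1 < len(row):
                diffs.append((row[j + 1] - value) ** 2)
            if i + 1 < len(img):
                diffs.append((img[i + 1][j] - value) ** 2)
    return sum(diffs) / len(diffs)


def edge_sharpness_penalty(generated: Grid, reference: Grid) -> float:
    """Hypothetical proxy: penalize output whose edges are weaker than the reference's."""
    return max(0.0, edge_strength(reference) - edge_strength(generated))


def total_loss(adversarial: float, feat_photo: Grid, feat_generated: Grid,
               generated: Grid, anime_reference: Grid,
               w_sem: float = 10.0, w_edge: float = 1.0) -> float:
    """Weighted sum of the three terms; the weights are illustrative assumptions."""
    semantic = l1_distance(feat_photo, feat_generated)
    edge = edge_sharpness_penalty(generated, anime_reference)
    return adversarial + w_sem * semantic + w_edge * edge


# Toy inputs: a hard step edge and a smoothed version of the same edge.
sharp = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
blurred = [[0.0, 0.25, 0.75, 1.0], [0.0, 0.25, 0.75, 1.0]]

print(edge_sharpness_penalty(blurred, sharp))  # positive: the blurred edge is penalized
print(edge_sharpness_penalty(sharp, sharp))    # zero: the edge is as sharp as the reference
```

Squared differences are used in `edge_strength` because a mean of absolute differences gives the same value for a hard step and a smooth ramp between the same two levels, so it would not separate sharp edges from blurred ones.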

Table 1 Expression recognition results for anime images with different expressions on the CK+ dataset

Table 2 Expression recognition results for anime images with different expressions on the RAF dataset

Fig.1 Generative adversarial network framework

Fig.3 Generated anime images with and without the edge sharpness loss (Hayao Miyazaki style)

Fig.4 Generated anime images with and without the edge sharpness loss (Makoto Shinkai style)

Fig.5 Images generated without the semantic loss

Fig.6 Confusion matrix of expression recognition results on the CK+ dataset for the proposed model (Makoto Shinkai style)

Fig.7 Confusion matrix of expression recognition results on the RAF dataset for the proposed model (Makoto Shinkai style)

Fig.8 Confusion matrix of expression recognition results on the SFEW dataset for the proposed model (Makoto Shinkai style)