2020.3-SEAN: Image Synthesis with Semantic Region-Adaptive Normalization

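My conclusions (personal opinion only)

* A paper from March 2020.

* The structural information and image sharpness of the restored results are improved.

* Resolution: 256*256

* Works well on frontal faces and small-angle profile faces; no test results are given for large-angle profile faces, and without code this could not be tested.

* Restoration works quite well on both faces and everyday scene photos.

* The paper itself claims to outperform the following papers.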

20. "Pix2PixHD: Context encoders: Feature learning by inpainting" (CVPR 2016)

28. "SPADE: Image inpainting via deep feature rearrangement" (ECCV 2017)

26. "GMCNN: Image inpainting via generative multi-column convolutional neural networks" (NeurIPS 2018)

33. "PICNet: Pluralistic image completion" (CVPR 2019)

1、Title

"SEAN: Image Synthesis with Semantic Region-Adaptive Normalization"

Computer Vision and Pattern Recognition (CVPR) 2020, Oral

Authors:

Paper:

https://arxiv.org/pdf/1911.12861.pdf

Code:

https://github.com/ZPdesu/SEAN

Demo:

https://www.youtube.com/watch?v=0Vbj9xFgoUw&feature=youtu.be

2、Contributions

a、achieve clear improvements

SEAN improves the quality of the synthesized images for conditional GANs. We compare to the state-of-the-art methods SPADE and Pix2PixHD and achieve clear improvements in quantitative metrics (e.g. FID score) and visual inspection.

b、a geometrical alignment constraint

SEAN improves the per-region style encoding, so that reconstructed images can be made more similar to the input style images as measured by PSNR and visual inspection.

c、a dense multi-scale fusion generator

We propose a dense multi-scale fusion generator, which has the merit of strong representation ability to extract useful features. Our generative image inpainting framework achieves compelling visual results (as illustrated in Figure 1) on challenging datasets, compared with previous state-of-the-art approaches.
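The PSNR metric mentioned in item b can be computed as in the following minimal sketch. It assumes 8-bit images flattened into sequences of pixel values (peak value 255); the function name `psnr` is illustrative and is not taken from the SEAN code.

```python
import math

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if not reference or len(reference) != len(reconstructed):
        raise ValueError("inputs must be non-empty and have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

A higher value means the reconstruction is closer to the reference image.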

3、Network architecture

A、Per-Region Style Encoder

style matrix ST
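As a rough illustration of how a per-region style matrix ST can be formed: the SEAN paper describes pooling the style encoder's feature map region by region, so that each semantic region yields one style vector and the vectors are stacked into ST. The sketch below assumes a feature map stored as nested lists `features[c][y][x]` and an integer label map `labels[y][x]`; the function name and the zero vector used for absent regions are assumptions made for illustration, not the authors' implementation.

```python
def region_style_matrix(features, labels, num_regions):
    """Average-pool features[c][y][x] inside each labelled region.

    Returns a list of num_regions style vectors, each of length C.
    """
    channels = len(features)
    sums = [[0.0] * channels for _ in range(num_regions)]
    counts = [0] * num_regions
    for y, row in enumerate(labels):
        for x, region in enumerate(row):
            counts[region] += 1
            for c in range(channels):
                sums[region][c] += features[c][y][x]
    # Regions with no pixels keep an all-zero style vector (assumption).
    return [
        [s / counts[r] for s in sums[r]] if counts[r] else sums[r]
        for r in range(num_regions)
    ]
```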

4、Loss functions

a、 Self-guided regression loss

b、Geometrical alignment constraint

c、Adversarial loss

d、Overall Loss

In this paper, the loss weights are, in order, 25, 5, 0.03, and 1.
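A weighted sum using the weights listed above could look like the following sketch. The notes do not say which weight belongs to which term, so the four inputs are kept as an ordered, unnamed sequence; `overall_loss` and `LOSS_WEIGHTS` are hypothetical names.

```python
LOSS_WEIGHTS = (25.0, 5.0, 0.03, 1.0)  # weights in the order given in the notes

def overall_loss(loss_terms, weights=LOSS_WEIGHTS):
    """Weighted sum of individual loss values (one weight per term)."""
    if len(loss_terms) != len(weights):
        raise ValueError("expected one loss value per weight")
    return sum(w * term for w, term in zip(weights, loss_terms))
```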

5、Experiments

AA、Datasets

a、CelebAMask-HQ dataset (Progressive growing of GANs for improved quality, stability, and variation. In ICLR, 2018. 6)

b、CityScapes dataset (Context encoders: Feature learning by inpainting. In CVPR, pages 2536–2544, 2016. 1, 2, 6, 7)

c、ADE20K dataset (Places: A 10 million image database for scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(6):1452–1464. 6)

d、Facades dataset (A style-based generator architecture for generative adversarial networks. In CVPR, pages 4401–4410, 2019. 6)

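BB、Implementation details

* Batch size = 16, NVIDIA TITAN Xp GPU (12 GB memory).

* Adam optimizer, learning rate 0.0002, beta1 = 0.5, beta2 = 0.9.

* Training + validation: CelebA-HQ, Places2, FFHQ; testing: Paris Street View, CelebA

For training, the input image size is 256*256 and the hole size is 128*128.

a、CelebA-HQ dataset: face images are resized directly to 256*256.

b、Paris Street View: original size 936*537, randomly cropped to 537*537, then resized directly to 256*256.

c、Places2: original size 512*?, randomly cropped to 512*512, then resized directly to 256*256.

d、FFHQ dataset: face images are resized directly to 256*256.

* End-to-end, with no post-processing or pre-processing.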
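The crop-then-resize preparation listed above (for example 936*537 to 537*537 to 256*256 for Paris Street View) can be sketched as follows. This is a dependency-free illustration on nested lists `image[y][x]` with nearest-neighbour resampling; the notes do not state which interpolation was used, so that choice and the function names are assumptions.

```python
import random

def random_square_crop(image, size, rng=random):
    """Crop a size x size window at a random position from image[y][x]."""
    height, width = len(image), len(image[0])
    if size > height or size > width:
        raise ValueError("crop size exceeds image size")
    top = rng.randint(0, height - size)    # randint is inclusive on both ends
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]

def resize_nearest(image, out_size):
    """Resize image[y][x] to out_size x out_size with nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    return [
        [image[y * height // out_size][x * width // out_size] for x in range(out_size)]
        for y in range(out_size)
    ]
```

For a 936*537 image, `resize_nearest(random_square_crop(image, 537), 256)` would follow the same two steps as item b.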

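Igt is the original image and M is the mask applied to Igt (1 in the occluded region, 0 in the non-occluded region). Iin = Igt ⊙ (1 − M), where ⊙ is the element-wise product: non-occluded pixels keep their Igt values, and occluded pixels are 0 and appear black.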
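The masking relation Iin = Igt ⊙ (1 − M) can be illustrated with a small sketch. It assumes a single-channel image stored as nested lists and a centred square hole; the notes give the hole size (128*128 in a 256*256 input) but not its position, so the centring and the helper names are assumptions.

```python
def center_hole_mask(size, hole):
    """Binary mask M[y][x]: 1 inside a centred hole x hole square, 0 elsewhere."""
    start = (size - hole) // 2
    end = start + hole
    return [
        [1 if start <= y < end and start <= x < end else 0 for x in range(size)]
        for y in range(size)
    ]

def apply_mask(image, mask):
    """Iin = Igt * (1 - M): masked pixels become 0 (black), others are unchanged."""
    return [
        [pixel * (1 - m) for pixel, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]
```

With `center_hole_mask(256, 128)`, a quarter of the input pixels (128*128 out of 256*256) are zeroed.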