【SFT】Recovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform

Paper: https://arxiv.org/pdf/1804.02815.pdf
Project page: http://mmlab.ie.cuhk.edu.hk/projects/SFTGAN/
Code: https://github.com/xinntao/SFTGAN

Contributions:

Proposes the Spatial Feature Transform (SFT) layer.

In this paper, we show that it is possible to recover textures faithful to semantic classes.[semantic priors]
Our final results show that an SR network equipped with SFT can generate more realistic and visually pleasing textures in comparison to state-of-the-art SRGAN [27] and EnhanceNet [38].

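Details are recovered based on semantic class priors, and an SR network equipped with SFT produces more realistic and visually pleasing textures.
Analyzes and discusses several of the losses used in SR.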

conventional pixel-wise mean squared error (MSE) loss [7] that tends to encourage blurry and overly-smoothed results
adversarial loss to encourage the network to favor solutions that look more like natural images
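Pixel-wise losses lead to blurry, over-smoothed images; optimizing in feature space with a perceptual loss, combined with an adversarial loss, gives more natural-looking results.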
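A toy illustration of why a pixel-wise MSE loss favors blur. This is a minimal sketch, not taken from the paper: the two "sharp" 1-D signals and the helper names `mse` and `expected_mse` are made up for the example. When two sharp textures are equally plausible for the same LR input, the prediction with the lowest expected MSE is their per-pixel average, which is flat.

```python
def mse(pred, target):
    """Mean squared error between two equal-length signals."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def expected_mse(pred, targets):
    """Average MSE of one prediction against several equally likely targets."""
    return sum(mse(pred, t) for t in targets) / len(targets)


# Two hypothetical sharp textures that could both explain the same LR input.
sharp_a = [0.0, 1.0, 0.0, 1.0]
sharp_b = [1.0, 0.0, 1.0, 0.0]
targets = [sharp_a, sharp_b]

# Per-pixel average of the candidates: a flat signal with no texture left.
blurry = [sum(values) / len(targets) for values in zip(*targets)]

print("expected MSE, blurry average:", expected_mse(blurry, targets))
print("expected MSE, sharp guess:   ", expected_mse(sharp_a, targets))
```

The flat average scores better under MSE than either sharp candidate, which is the behavior that perceptual and adversarial losses are meant to counteract.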

Approach:

applying an affine transformation spatially to each intermediate feature maps in an SR network
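Written out, the spatial affine transformation has the following form, as I read the paper. Here F is an intermediate feature map, γ and β are the scale and shift maps with the same dimensions as F, ⊙ is the element-wise product, and M is the mapping that produces (γ, β) from the conditions Ψ.

```latex
\mathrm{SFT}(F \mid \gamma, \beta) = \gamma \odot F + \beta,
\qquad (\gamma, \beta) = \mathcal{M}(\Psi)
```

Because γ and β vary per spatial location, different regions of the same image can be modulated differently.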

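Personal understanding:
The network is structured as follows. The SFT network takes two inputs: the LR image and a condition. The condition is produced by a condition network made of four convolution layers, applied to the per-class semantic segmentation maps that a segmentation network predicts from the LR image. Inside the SFT network, each SFT layer processes this condition to obtain a scale and a shift, which are then used to transform the feature map. As a result, the network learns category-specific processing, and the semantic segmentation mask can be used to guide what the network does.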
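The SFT layer described above can be sketched roughly as below. This is a minimal NumPy illustration, not the authors' implementation: the 1x1 convolutions, the leaky ReLU slope, the channel sizes, the random weights, and the names `conv1x1`, `sft`, and `SFTLayer` are all assumptions made for the example. The four-convolution condition network is left out, and the condition is passed in directly as one-hot class maps.

```python
import numpy as np


def conv1x1(x, weight, bias):
    """1x1 convolution: x is (C_in, H, W), weight is (C_out, C_in), bias is (C_out,)."""
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]


def leaky_relu(x, slope=0.1):
    return np.where(x > 0, x, slope * x)


def sft(feat, gamma, beta):
    """Spatial affine transform: element-wise scale and shift of the feature map."""
    return gamma * feat + beta


class SFTLayer:
    """Hypothetical SFT layer: two small branches map the condition to (gamma, beta)."""

    def __init__(self, cond_ch, feat_ch, rng):
        def init(out_ch, in_ch):
            # Random weights stand in for learned parameters.
            return rng.normal(0.0, 0.1, (out_ch, in_ch)), np.zeros(out_ch)

        self.scale0, self.scale1 = init(cond_ch, cond_ch), init(feat_ch, cond_ch)
        self.shift0, self.shift1 = init(cond_ch, cond_ch), init(feat_ch, cond_ch)

    def __call__(self, feat, cond):
        gamma = conv1x1(leaky_relu(conv1x1(cond, *self.scale0)), *self.scale1)
        beta = conv1x1(leaky_relu(conv1x1(cond, *self.shift0)), *self.shift1)
        return sft(feat, gamma, beta)


rng = np.random.default_rng(0)
num_classes, feat_ch, height, width = 4, 8, 2, 2

# Condition: one-hot class maps; the left column is class 0, the right column class 1.
cond = np.zeros((num_classes, height, width))
cond[0, :, 0] = 1.0
cond[1, :, 1] = 1.0

feat = np.ones((feat_ch, height, width))
layer = SFTLayer(num_classes, feat_ch, rng)
out = layer(feat, cond)
print(out.shape)
```

Even though `feat` is identical everywhere, the two columns of `out` differ because they carry different class labels in `cond`, which is the sense in which the segmentation mask guides the processing.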