[CBAM] 2018-ECCV-CBAM: Convolutional Block Attention Module (paper reading notes)

CBAM

2018-ECCV-CBAM: Convolutional block attention module

Source: ChenBong, 博客园 (cnblogs)

Institute: KAIST, Lunit, Adobe
Author: Sanghyun Woo, Jongchan Park, Joon-Young Lee, In So Kweon
GitHub:
- https://github.com/Jongchan/attention-module 1k+
- https://github.com/luuuyi/CBAM.PyTorch 600+
Citation: 1600+

Introduction

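The paper proposes an attention module that operates along both the channel and the spatial dimensions. It can be embedded in any CNN and significantly improves model performance at a small additional computational cost.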

Motivation

Human vision attends to the important parts of an image rather than to every pixel.

Contribution

A simple and efficient attention module (CBAM) that can be embedded in any CNN architecture.

Method

Feature map: \(\mathbf{F} \in \mathbb{R}^{C \times H \times W}\)

1D channel attention map: \(\mathbf{M}_{\mathbf{c}} \in \mathbb{R}^{C \times 1 \times 1}\)

2D spatial attention map: \(\mathbf{M}_{\mathbf{s}} \in \mathbb{R}^{1 \times H \times W}\)

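The feature map is first multiplied by the 1D channel attention map and then by the 2D spatial attention map, where \(\otimes\) denotes element-wise multiplication with the attention values broadcast to the shape of the feature map: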

\(\mathbf{F}^{\prime}=\mathbf{M}_{\mathbf{c}}(\mathbf{F}) \otimes \mathbf{F}\)
\(\mathbf{F}^{\prime \prime}=\mathbf{M}_{\mathbf{s}}\left(\mathbf{F}^{\prime}\right) \otimes \mathbf{F}^{\prime}\)
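
The two-step refinement above relies on broadcasting: a \(C \times 1 \times 1\) map scales whole channels, and a \(1 \times H \times W\) map scales spatial positions. Below is a minimal NumPy sketch of just that multiplication order. The attention maps here are random placeholders (an assumption made for illustration); in CBAM they are computed from \(\mathbf{F}\) and \(\mathbf{F}^{\prime}\) by the two sub-modules described in the following sections.

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W = 8, 5, 5                      # illustrative sizes, not from the paper

F = rng.standard_normal((C, H, W))     # input feature map

# Placeholder attention maps with values in [0, 1).
# In CBAM these come from the channel and spatial sub-modules.
M_c = rng.uniform(size=(C, 1, 1))      # one weight per channel
M_s = rng.uniform(size=(1, H, W))      # one weight per spatial position

F1 = M_c * F                           # F'  = M_c ⊗ F   (broadcast over H and W)
F2 = M_s * F1                          # F'' = M_s ⊗ F'  (broadcast over C)

print(F1.shape, F2.shape)
```

Each output element is the input element scaled by its channel weight and then by its position weight, so the output keeps the shape of the input feature map.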

Channel attention module

\(\begin{aligned} \mathbf{M}_{\mathbf{c}}(\mathbf{F}) &=\sigma(\operatorname{MLP}(\operatorname{AvgPool}(\mathbf{F}))+\operatorname{MLP}(\operatorname{MaxPool}(\mathbf{F}))) \\ &=\sigma\left(\mathbf{W}_{\mathbf{1}}\left(\mathbf{W}_{\mathbf{0}}\left(\mathbf{F}_{\mathbf{avg}}^{\mathbf{c}}\right)\right)+\mathbf{W}_{\mathbf{1}}\left(\mathbf{W}_{\mathbf{0}}\left(\mathbf{F}_{\mathbf{max}}^{\mathbf{c}}\right)\right)\right) \end{aligned}\)

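where \(\mathbf{W}_{\mathbf{0}}\) and \(\mathbf{W}_{\mathbf{1}}\) are the weights of the two-layer MLP, which is shared between the average-pooled and max-pooled branches.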
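
The channel attention formula can be sketched as a forward pass in NumPy. This is an illustrative sketch, not the authors' implementation: the function name `channel_attention`, the random weights, and the sizes are hypothetical. Two details are not stated in these notes and are assumptions taken from my reading of the paper: a ReLU is applied after \(\mathbf{W}_{\mathbf{0}}\), and the hidden layer has \(C/r\) units for a reduction ratio \(r\). Bias terms are omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W0, W1):
    """Compute M_c(F) with shape (C, 1, 1).

    F:  feature map, shape (C, H, W)
    W0: first MLP layer, shape (C // r, C)   (r is an assumed reduction ratio)
    W1: second MLP layer, shape (C, C // r)
    """
    f_avg = F.mean(axis=(1, 2))        # global average pooling -> (C,)
    f_max = F.max(axis=(1, 2))         # global max pooling     -> (C,)

    def mlp(v):
        # Shared two-layer MLP; the ReLU after W0 is an assumption (see lead-in).
        return W1 @ np.maximum(W0 @ v, 0.0)

    # The same weights process both descriptors; outputs are summed before the sigmoid.
    return sigmoid(mlp(f_avg) + mlp(f_max)).reshape(-1, 1, 1)

rng = np.random.default_rng(0)
C, H, W, r = 16, 7, 7, 4               # illustrative sizes
F = rng.standard_normal((C, H, W))
W0 = rng.standard_normal((C // r, C)) * 0.1
W1 = rng.standard_normal((C, C // r)) * 0.1

M_c = channel_attention(F, W0, W1)
F_prime = M_c * F                      # F' = M_c(F) ⊗ F
print(M_c.shape, F_prime.shape)
```

Because the MLP is shared, the module adds only the parameters of \(\mathbf{W}_{\mathbf{0}}\) and \(\mathbf{W}_{\mathbf{1}}\) regardless of the spatial size of the input.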

Spatial attention module

\(\begin{aligned} \mathbf{M}_{\mathbf{s}}(\mathbf{F}) &=\sigma\left(f^{7 \times 7}([\operatorname{AvgPool}(\mathbf{F}) ; \operatorname{MaxPool}(\mathbf{F})])\right) \\ &=\sigma\left(f^{7 \times 7}\left(\left[\mathbf{F}_{\mathbf{avg}}^{\mathbf{s}} ; \mathbf{F}_{\mathbf{max}}^{\mathbf{s}}\right]\right)\right) \end{aligned}\)
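
The spatial attention formula can be sketched in the same way. Here the pooling runs along the channel axis, the two resulting \(H \times W\) maps are concatenated, and \(f^{7 \times 7}\) is a convolution with a \(7 \times 7\) kernel. The sketch below is illustrative only: the name `spatial_attention` and the random kernel are hypothetical, zero padding is assumed so that the output keeps the input's spatial size, the convolution is written as cross-correlation (the usual deep-learning convention), and bias and normalization layers are left out.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(F, kernel):
    """Compute M_s(F) with shape (1, H, W).

    F:      feature map, shape (C, H, W)
    kernel: convolution weights, shape (2, k, k) with odd k (k = 7 in the formula)
    """
    # Pool along the channel axis and stack: [F_avg^s ; F_max^s] -> (2, H, W)
    pooled = np.stack([F.mean(axis=0), F.max(axis=0)])

    k = kernel.shape[-1]
    p = k // 2
    # Zero padding (an assumption) keeps the output at H x W.
    padded = np.pad(pooled, ((0, 0), (p, p), (p, p)))

    H, W = pooled.shape[1:]
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            # Cross-correlation over both pooled channels at position (i, j).
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)

    return sigmoid(out)[None]          # add the leading singleton channel axis

rng = np.random.default_rng(0)
C, H, W = 16, 9, 9                     # illustrative sizes
F_prime = rng.standard_normal((C, H, W))
kernel = rng.standard_normal((2, 7, 7)) * 0.1

M_s = spatial_attention(F_prime, kernel)
F_refined = M_s * F_prime              # F'' = M_s(F') ⊗ F'
print(M_s.shape, F_refined.shape)
```

The explicit double loop is chosen for readability; a framework convolution layer with two input channels, one output channel, and padding 3 would compute the same quantity far more efficiently.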