Deep Imbalanced Attribute Classification using Visual Attention Aggregation

Notes on reproducing the paper Deep Imbalanced Attribute Classification using Visual Attention Aggregation.

Contributions of the paper:

1. A new hybrid attention, which looks like a good structure: it combines an ordinary attention over w*h with a channel attention.

The code is here:

－－－－－－－－－－－－－－－－－－－－－

x = self.layer4(x)  # output of the fourth block: 32*2048*7*7

sigx = torch.sigmoid(x)  # ordinary attention over w*h

－－－－－－－－－－－－－－－－－－－－

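softmx = torch.exp(x)  # torch.exp, because torch.nn.functional has no exp

softmx = softmx / softmx.sum(2).sum(2).unsqueeze(2).unsqueeze(3)  # attention for each channel: normalize its 7*7 map so it sums to 1

x = sigx * softmx  # i.e. attention is applied from both the w*h side and the channel side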

－－－－－－－－－－－－－－－－－－－－－－－－－
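The two branches above can be wrapped into a small self-contained module. This is a minimal sketch, not the paper's reference implementation: the class name `HybridAttention` is hypothetical, and `torch.softmax` over the flattened spatial axis stands in for the manual `exp`/`sum`, since it computes the same normalization and is numerically more stable.

```python
import torch
import torch.nn as nn


class HybridAttention(nn.Module):
    """Sigmoid map times a per-channel spatial softmax map (hypothetical name)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        sigx = torch.sigmoid(x)  # element-wise gate in (0, 1)
        # Softmax over the h*w positions of each channel: every channel map sums to 1.
        softmx = torch.softmax(x.flatten(2), dim=2).reshape(x.shape)
        return sigx * softmx


feats = torch.randn(4, 16, 7, 7)  # small stand-in for the 32*2048*7*7 features
out = HybridAttention()(feats)
print(out.shape)
```

The output keeps the input shape, so the module can be dropped in right after `layer4` without changing the rest of the network.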

This block can be improved: the channel attention can be passed through a fully connected layer to produce new values:

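softmx = softmx / softmx.sum(2).sum(2).unsqueeze(2).unsqueeze(3)  # unsqueeze so the division broadcasts

softmx = softmx.view(32, -1)

# fully connected layer here; its output must be 32*2048 for the product below to broadcast

softmx = softmx.unsqueeze(2).unsqueeze(3)

x = sigx * softmx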

－－－－－－－－－－－－－－－－－－－－－－－－－－－
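One possible reading of this improved block is sketched below, under stated assumptions: the normalized maps are flattened, a single `nn.Linear` (the layer the notes only call "fully connected") maps them to one weight per channel, and that weight rescales the sigmoid map. The class name `ChannelFCAttention`, the layer size, and the absence of an activation after the linear layer are all assumptions, not details from the paper.

```python
import torch
import torch.nn as nn


class ChannelFCAttention(nn.Module):
    """Hypothetical variant: per-channel weights produced by a linear layer."""

    def __init__(self, channels: int, height: int, width: int):
        super().__init__()
        # Assumption: one linear layer from the flattened maps to `channels` weights.
        self.fc = nn.Linear(channels * height * width, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        sigx = torch.sigmoid(x)
        softmx = torch.softmax(x.flatten(2), dim=2)      # N x C x (H*W), each map sums to 1
        weights = self.fc(softmx.flatten(1))             # N x C
        return sigx * weights.unsqueeze(2).unsqueeze(3)  # broadcast over H and W


att = ChannelFCAttention(channels=16, height=7, width=7)
out = att(torch.randn(4, 16, 7, 7))
print(out.shape)
```

With the real 2048*7*7 input this single layer would hold 100352*2048, roughly 205 million, weights, so a smaller bottleneck may be worth considering before trying it at full size.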

This hybrid attention consists of the two maps above: the sigmoid map and the per-channel softmax map.

2. Loss = before/after relation loss + weighted BCE loss
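For the weighted BCE term only, a minimal sketch follows. The notes do not say how the weights are chosen; here the positive examples of each attribute are weighted by `(1 - p) / p`, where `p` is that attribute's positive ratio in the training set. That weighting rule, the helper name `weighted_bce`, and the example ratios are assumptions for illustration.

```python
import torch
import torch.nn.functional as F


def weighted_bce(logits, targets, pos_ratio):
    """BCE with per-attribute positive weights (hypothetical helper).

    logits, targets: N x A tensors; pos_ratio: length-A tensor of positive ratios.
    """
    pos_weight = (1.0 - pos_ratio) / pos_ratio  # rare attributes get larger weights
    return F.binary_cross_entropy_with_logits(logits, targets, pos_weight=pos_weight)


logits = torch.zeros(2, 3)
targets = torch.tensor([[1.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
pos_ratio = torch.tensor([0.5, 0.1, 0.9])  # made-up ratios for three attributes
print(weighted_bce(logits, targets, pos_ratio).item())
```

With `p = 0.5` for every attribute the weights are all 1 and the function reduces to the plain BCE loss.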

The before/after relation loss is

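3. The third point is a small detail that differs from what I usually do. The last block of resnet50 outputs 32*2048*7*7 (batch*channel*height*width).

My approach is to apply avg_pool2d to get 32*2048*1*1, view it to 32*2048, and then apply the cross-entropy loss.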
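The pooled head described above can be sketched as follows. The class name `PooledHead` and the 35-way linear classifier are assumptions for illustration; the notes only mention pooling, reshaping, and the loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PooledHead(nn.Module):
    """Average-pool the feature map, then classify (hypothetical name)."""

    def __init__(self, in_channels: int = 2048, num_attrs: int = 35):
        super().__init__()
        self.fc = nn.Linear(in_channels, num_attrs)  # assumed classifier layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.avg_pool2d(x, kernel_size=(x.size(2), x.size(3)))  # N x C x 1 x 1
        x = x.view(x.size(0), -1)                                # N x C
        return self.fc(x)                                        # N x num_attrs logits


logits = PooledHead()(torch.randn(4, 2048, 7, 7))
print(logits.shape)
```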

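What the paper does here:

1. Produce 32*35*7*7 (batch*channel*height*width) directly, where the channel count is the number of classes. Note: on PETA the evaluation usually covers 35 attributes.

2. Apply attention.

3. The key point: where I would apply avg_pool2d, the paper flattens directly into a 2-D (two-dimensional) tensor, i.e. 32*(35*7*7).

4. The 32*1715 tensor then goes through two fully connected layers. avg_pool does discard a lot of information, so I could also flatten to 32*(2048*7*7) and feed that straight into a fully connected layer.

Whether this helps has to be tried, since more parameters can lead to overfitting. In practice it made no difference.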
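Steps 1 to 4 can be sketched as below. The class name `FlattenHead`, the hidden width of 512, and the ReLU between the two linear layers are assumptions; the notes only say that two fully connected layers follow. The attention step reuses the sigmoid-times-softmax form from the code earlier in the post.

```python
import torch
import torch.nn as nn


class FlattenHead(nn.Module):
    """1x1 conv to attribute maps, attention, flatten, two linear layers (hypothetical name)."""

    def __init__(self, in_channels=2048, num_attrs=35, size=7, hidden=512):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, num_attrs, kernel_size=1, bias=False)
        self.fc1 = nn.Linear(num_attrs * size * size, hidden)  # 35*7*7 = 1715 inputs
        self.fc2 = nn.Linear(hidden, num_attrs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.conv(x)                                       # step 1: N x 35 x 7 x 7
        soft = torch.softmax(x.flatten(2), dim=2).reshape(x.shape)
        x = torch.sigmoid(x) * soft                            # step 2: attention
        x = x.flatten(1)                                       # step 3: N x 1715, no pooling
        return self.fc2(torch.relu(self.fc1(x)))               # step 4: two linear layers


logits = FlattenHead()(torch.randn(4, 2048, 7, 7))
print(logits.shape)
```

Reducing to 35 channels before flattening keeps the first linear layer at 1715 inputs; flattening the raw 2048*7*7 features instead would give it 100352 inputs.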

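4. One detail when reproducing: does initialization matter? Yes. The loss decreases very slowly, and without a good initialization training takes a long time.

The initialization is typically done like this: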

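# in __init__; assumes `from torch.nn import init`
self.conv44w = nn.Conv2d(2048, 35, kernel_size=1, stride=1, padding=0, bias=False)  # 1x1 conv: 2048 channels to 35 attribute maps

init.kaiming_normal_(self.conv44w.weight, a=0, mode='fan_in')  # in-place variant; kaiming_normal is deprecated

self.bn44w = nn.BatchNorm2d(35)

init.normal_(self.bn44w.weight, 1.0, 0.02)  # BN scale drawn around 1

init.constant_(self.bn44w.bias, 0.0)  # BN shift starts at 0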
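A self-contained version of this initialization, written with the in-place `torch.nn.init` functions (names ending in an underscore), is sketched below. The wrapper class `AttrConv` is hypothetical and exists only to make the snippet runnable.

```python
import torch
import torch.nn as nn
from torch.nn import init


class AttrConv(nn.Module):
    """1x1 conv plus batch norm with explicit initialization (hypothetical name)."""

    def __init__(self, in_channels: int = 2048, num_attrs: int = 35):
        super().__init__()
        self.conv44w = nn.Conv2d(in_channels, num_attrs, kernel_size=1, bias=False)
        self.bn44w = nn.BatchNorm2d(num_attrs)
        # Kaiming normal: std = sqrt(2 / fan_in) when a=0.
        init.kaiming_normal_(self.conv44w.weight, a=0, mode='fan_in')
        init.normal_(self.bn44w.weight, mean=1.0, std=0.02)
        init.constant_(self.bn44w.bias, 0.0)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.bn44w(self.conv44w(x))


layer = AttrConv()
print(layer(torch.randn(4, 2048, 7, 7)).shape)
print(layer.conv44w.weight.std().item())
```

For the 2048-channel input the conv weights should come out with a standard deviation near sqrt(2/2048), about 0.031, which is a quick way to confirm the initializer was applied.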