MNIST Handwritten Digit Recognition (Part 3): Replacing the ReLU Activation with Antirectifier

This example comes from the official Keras repository: https://github.com/keras-team/keras/blob/master/examples/antirectifier.py

Compared with the fully connected MNIST network from the earlier posts, the only change in this example is that the ReLU activation is replaced with a custom Antirectifier layer.

'''The example demonstrates how to write custom layers for Keras.

We build a custom activation layer called 'Antirectifier',
which modifies the shape of the tensor that passes through it.
We need to specify two methods: `compute_output_shape` and `call`.

Note that the same result can also be achieved via a Lambda layer.

Because our custom layer is written with primitives from the Keras
backend (`K`), our code can run both on TensorFlow and Theano.
'''

from __future__ import print_function
import keras
from keras.models import Sequential
from keras import layers
from keras.datasets import mnist
from keras import backend as K

class Antirectifier(layers.Layer):
    '''This is the combination of a sample-wise
    L2 normalization with the concatenation of the
    positive part of the input with the negative part
    of the input. The result is a tensor of samples that are
    twice as large as the input samples.

    It can be used in place of a ReLU.

    # Input shape
        2D tensor of shape (samples, n)

    # Output shape
        2D tensor of shape (samples, 2*n)

    # Theoretical justification
        When applying ReLU, assuming that the distribution
        of the previous output is approximately centered around 0.,
        you are discarding half of your input. This is inefficient.

        Antirectifier returns all-positive outputs like ReLU,
        but without discarding any data.

        Tests on MNIST show that Antirectifier can train networks
        with half as many parameters yet comparable
        classification accuracy to an equivalent ReLU-based network.
    '''

    def compute_output_shape(self, input_shape):
        shape = list(input_shape)
        assert len(shape) == 2  # only valid for 2D tensors
        shape[-1] *= 2  # output is twice as wide as the input
        return tuple(shape)

    def call(self, inputs):
        inputs -= K.mean(inputs, axis=1, keepdims=True)  # center each sample
        inputs = K.l2_normalize(inputs, axis=1)  # sample-wise L2 normalization
        pos = K.relu(inputs)   # positive part of the input
        neg = K.relu(-inputs)  # negative part of the input
        return K.concatenate([pos, neg], axis=1)  # doubles the feature axis
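
# Added illustration (not part of the original example): a plain-NumPy
# re-implementation of what Antirectifier.call computes, run on a toy
# batch to show the centering, L2 normalization, and the concatenation
# that doubles the feature dimension. The variable names are hypothetical.
import numpy as np
_x = np.array([[1.0, -2.0, 3.0, -4.0]])              # shape (1, 4)
_x = _x - _x.mean(axis=1, keepdims=True)             # center the sample
_x = _x / np.linalg.norm(_x, axis=1, keepdims=True)  # L2-normalize it
_out = np.concatenate([np.maximum(_x, 0),            # positive part
                       np.maximum(-_x, 0)], axis=1)  # negative part
print(_out.shape)  # (1, 8): twice the input width, no data discarded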

# global parameters
batch_size = 128
num_classes = 10
epochs = 40

# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# build the model
model = Sequential()
model.add(layers.Dense(256, input_shape=(784,)))
model.add(Antirectifier())
model.add(layers.Dropout(0.1))
model.add(layers.Dense(256))
model.add(Antirectifier())
model.add(layers.Dropout(0.1))
model.add(layers.Dense(num_classes))
model.add(layers.Activation('softmax'))

# compile the model
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

# train the model
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))

# next, compare with an equivalent network
# with 2x bigger Dense layers and ReLU (see the sketch below)
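
The script ends with that comment; the comparison network itself is not included in the post. As a minimal sketch (not from the original source), the baseline the comment describes would keep the same topology but double the Dense width to 512 and use ReLU; the name ref_model is hypothetical, and it reuses the data and parameters defined above:

ref_model = Sequential()
ref_model.add(layers.Dense(512, activation='relu', input_shape=(784,)))
ref_model.add(layers.Dropout(0.1))
ref_model.add(layers.Dense(512, activation='relu'))
ref_model.add(layers.Dropout(0.1))
ref_model.add(layers.Dense(num_classes))
ref_model.add(layers.Activation('softmax'))
ref_model.compile(loss='categorical_crossentropy',
                  optimizer='rmsprop',
                  metrics=['accuracy'])
ref_model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              verbose=1,
              validation_data=(x_test, y_test))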

Execution result (note: this run was trained for 4 epochs, not the 40 set in the script above):

60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/4
60000/60000 [==============================] - 4s 62us/step - loss: 0.6030 - acc: 0.9131 - val_loss: 0.1637 - val_acc: 0.9565
Epoch 2/4
60000/60000 [==============================] - 3s 58us/step - loss: 0.1264 - acc: 0.9652 - val_loss: 0.0910 - val_acc: 0.9730
Epoch 3/4
60000/60000 [==============================] - 3s 57us/step - loss: 0.0822 - acc: 0.9762 - val_loss: 0.0836 - val_acc: 0.9757
Epoch 4/4
60000/60000 [==============================] - 3s 57us/step - loss: 0.0638 - acc: 0.9810 - val_loss: 0.0762 - val_acc: 0.9780
<keras.callbacks.History at 0x7f355fba6c88>

Evaluate the model:

score = model.evaluate(x_test, y_test,
                       verbose=1)
print('Test score:', score[0])
print('Test accuracy:', score[1])

Evaluation result:

10000/10000 [==============================] - 1s 59us/step
Test score: 0.07624727148264647
Test accuracy: 0.978

References:

1. https://github.com/keras-team/keras/blob/master/examples/antirectifier.py
2. https://blog.csdn.net/wyx100/article/details/80678735

Original post: https://www.cnblogs.com/lfri/p/10485862.html