Training a shallow neural network with no hidden layer on the MNIST dataset using TensorFlow

Basics

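Reference ① describes in detail the structure of a neural network with no hidden layer: it has only an input layer and an output layer, and the two are fully connected. The structure is as follows: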

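We train this network on the MNIST dataset (reference ②). Each MNIST image is 28×28 pixels, which flattens to a 784-dimensional vector and corresponds to x in the figure above. The output is the digit class, 0 to 9, so y in the figure has 10 dimensions. The weight matrix W therefore has shape [784, 10]. Here w_{i,j} denotes the weight of the j-th feature for the i-th class; in the code the matrix is stored as [784, 10], with one row per feature and one column per class, so that weight sits at row j, column i. This layout is chosen mainly to make the matrix multiplication convenient, as the code below shows.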
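To make the [784, 10] layout concrete, here is a minimal sketch in plain Python (no TensorFlow) of the computation y = xW + b for a single flattened image. The helper `forward`, the constant names, and the values are illustrative assumptions and do not appear in the original code.

```python
# Plain-Python sketch of y = xW + b for one flattened 28x28 image.
# `forward` and all constants here are illustrative, not from the article's code.
NUM_PIXELS = 28 * 28   # 784 input features
NUM_DIGITS = 10        # classes 0-9


def forward(x, W, b):
    """Return the 10 class scores (logits) for one input vector x."""
    return [
        sum(x[pixel] * W[pixel][digit] for pixel in range(NUM_PIXELS)) + b[digit]
        for digit in range(NUM_DIGITS)
    ]


# W has shape [784, 10]: the row selects the input pixel, the column the digit.
W = [[0.0] * NUM_DIGITS for _ in range(NUM_PIXELS)]
b = [0.0] * NUM_DIGITS
x = [1.0] * NUM_PIXELS

W[0][3] = 2.0          # pixel 0 now contributes to the score of digit 3
logits = forward(x, W, b)
print(len(logits), logits[3])
```

With a batch of inputs of shape [batch, 784], the same row-by-column product yields an output of shape [batch, 10], which is why this layout needs no transpose.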

Training process

1. The optimizer used for backpropagation is gradient descent. Because the code trains on batches, this is mini-batch gradient descent, with a batch size of 100 in the code.

2. The loss function is the softmax cross-entropy.
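The two choices above can be sketched together in plain Python on a toy problem. This is only an illustration under stated assumptions: the functions `softmax`, `cross_entropy` and `batch_step`, the 3-feature, 3-class toy data, and the use of `random.sample` to draw batches are all made up for this sketch. TensorFlow derives the gradient automatically, whereas the sketch uses the closed form for softmax with cross-entropy, where the gradient with respect to each logit is the predicted probability minus the label.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, one_hot):
    """Cross-entropy between predicted probabilities and a one-hot label."""
    return -sum(t * math.log(p) for p, t in zip(probs, one_hot) if t > 0)


def batch_step(W, b, xs, ys, lr):
    """Apply one mini-batch gradient-descent update in place.

    Returns the mean loss of the batch measured before the update.
    """
    n_in, n_out = len(W), len(b)
    grad_W = [[0.0] * n_out for _ in range(n_in)]
    grad_b = [0.0] * n_out
    loss = 0.0
    for x, y in zip(xs, ys):
        logits = [sum(x[i] * W[i][j] for i in range(n_in)) + b[j]
                  for j in range(n_out)]
        p = softmax(logits)
        loss += cross_entropy(p, y)
        for j in range(n_out):
            d = p[j] - y[j]  # gradient of the loss w.r.t. logit j
            grad_b[j] += d
            for i in range(n_in):
                grad_W[i][j] += x[i] * d
    n = len(xs)
    for j in range(n_out):
        b[j] -= lr * grad_b[j] / n
        for i in range(n_in):
            W[i][j] -= lr * grad_W[i][j] / n
    return loss / n


# Toy data (hypothetical): feature c is large for examples of class c.
random.seed(0)
data = []
for _ in range(300):
    c = random.randrange(3)
    x = [random.random() * 0.1 for _ in range(3)]
    x[c] += 1.0
    data.append((x, [1.0 if k == c else 0.0 for k in range(3)]))

W = [[0.0] * 3 for _ in range(3)]  # zero initialisation, as in the article
b = [0.0] * 3
losses = []
for _ in range(50):
    batch = random.sample(data, 10)  # simplified stand-in for next_batch
    xs = [x for x, _ in batch]
    ys = [y for _, y in batch]
    losses.append(batch_step(W, b, xs, ys, lr=0.5))

print(losses[0], losses[-1])
```

Because the weights start at zero, the first prediction is uniform and the first loss equals ln(3) on this toy problem. By the same argument, a 10-class model initialised with zeros starts at a loss of ln(10), about 2.30.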

from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

# Load the data
mnist = input_data.read_data_sets('/home/workspace/python/tf/data/mnist', one_hot=True)

# Build the model
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W) + b

# Ground-truth labels (one-hot encoded)
y_ = tf.placeholder(tf.float32, [None, 10])

# Loss: softmax is applied to the logits y, then cross-entropy is taken against y_
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

# Training loop
for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

# Evaluate accuracy on the test set
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print (sess.run(accuracy, feed_dict = {x: mnist.test.images,
                                       y_: mnist.test.labels}))

Output: about 0.92, that is, a test accuracy of roughly 92%.
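The evaluation step compares the index of the largest logit with the index of the 1 in the one-hot label and averages the matches. A minimal plain-Python sketch of that logic follows; the helpers `argmax` and `accuracy` and the three sample rows are made up for illustration.

```python
def argmax(values):
    """Index of the largest entry (the first one on ties)."""
    return max(range(len(values)), key=lambda k: values[k])


def accuracy(logits_batch, one_hot_labels):
    """Fraction of rows whose predicted class matches the one-hot label."""
    correct = [
        1.0 if argmax(logits) == argmax(label) else 0.0
        for logits, label in zip(logits_batch, one_hot_labels)
    ]
    return sum(correct) / len(correct)


# Three made-up examples over 3 classes; the second one is misclassified.
logits_batch = [[2.0, 0.1, -1.0], [0.3, 0.2, 0.9], [-0.5, 1.5, 0.0]]
labels = [[1, 0, 0], [0, 1, 0], [0, 1, 0]]
acc = accuracy(logits_batch, labels)
print(acc)
```

Softmax preserves the ordering of its inputs, so taking the argmax of the raw logits gives the same class as taking the argmax of the probabilities. That is why the evaluation can work on y directly without applying softmax first.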

Software versions

TensorFlow 1.0.1 + Python 2.7.12

References

① Using softmax regression to turn neural network outputs into a probability distribution

② Working with MNIST data in TensorFlow

③ The official TensorFlow example code on GitHub

④ The introductory MNIST tutorial on the official TensorFlow website