Building on the earlier post 《TensorFlow 实战之Softmax Regression识别手写数字(一)》 (TensorFlow in Practice: Recognizing Handwritten Digits with Softmax Regression, Part 1), this post adds a hidden layer with 200 nodes and a ReLU activation, raising the handwritten-digit recognition accuracy from 91.2% to 95.8%.


What changed:

Added a 200-node hidden layer with ReLU as its activation function. I also tried sigmoid as the activation, but the results were not as good as with ReLU. A dedicated later post will compare sigmoid and ReLU in detail.
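As a quick intuition for why ReLU trains better here (ahead of the dedicated comparison post), the gradients of the two activations can be compared in plain Python. This is a standalone sketch, independent of the TensorFlow model below:

```python
import math

def sigmoid_grad(z):
    # derivative of sigmoid: s * (1 - s), where s = sigmoid(z)
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

def relu_grad(z):
    # derivative of ReLU: 1 for positive inputs, 0 otherwise
    return 1.0 if z > 0 else 0.0

# sigmoid's gradient peaks at 0.25 (at z = 0) and vanishes for large |z|,
# so the error signal shrinks as it passes back through a sigmoid layer;
# ReLU passes the gradient through unchanged for any positive input
for z in (-4.0, 0.0, 4.0):
    print("z=%+.1f  sigmoid'=%.4f  relu'=%.1f" % (z, sigmoid_grad(z), relu_grad(z)))
```

The factor-of-at-most-0.25 scaling per sigmoid layer is one common explanation for the accuracy gap observed in the runs below.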


If you run into problems while following along, see this blog post: http://www.sohu.com/a/125061373_465975


Code:

# coding:utf-8
'''
MNIST digit recognition with one 200-node ReLU hidden layer
and a softmax output layer.
'''
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

####################################################
#
# @brief Model hyperparameters
#
####################################################

LEARNING_RATE = 0.01
BATCH_NUMBER = 1000
BATCH_SIZE = 50
HIDDEN_SIZE = 200
INPUT_SIZE = 784
CLASS_NUMBER = 10

####################################################
#
# @brief Inputs
#
####################################################

# x: input images, shape [None, 784]
x = tf.placeholder("float", [None, INPUT_SIZE])

# y: one-hot labels, shape [None, 10]
y = tf.placeholder("float", [None, CLASS_NUMBER])

####################################################
#
# @brief Trainable parameters
#
####################################################

# w1: hidden-layer weights, 784 x 200
# (zero initialization from Part 1, kept for reference:)
# w1 = tf.Variable(tf.zeros([INPUT_SIZE, HIDDEN_SIZE]), name="weights")
w1 = tf.Variable(tf.truncated_normal([INPUT_SIZE, HIDDEN_SIZE], stddev=0.1))
# b1: hidden-layer biases, 200-dim, initialized to 0.1
# b1 = tf.Variable(tf.zeros([HIDDEN_SIZE]), name="biases")
b1 = tf.Variable(tf.ones([HIDDEN_SIZE])/10, name="biases1")

# hidden-layer activation: ReLU vs. sigmoid (test accuracies from the runs below)
# h1 = tf.sigmoid(tf.matmul(x, w1) + b1) # 0.9212
h1 = tf.nn.relu(tf.matmul(x, w1) + b1) # 0.9558

# w2: output-layer weights, 200 x 10
w2 = tf.Variable(tf.truncated_normal([HIDDEN_SIZE, CLASS_NUMBER], stddev=0.1))
# b2: output-layer biases, 10-dim
b2 = tf.Variable(tf.ones([CLASS_NUMBER])/10, name="biases2")

####################################################
#
# @brief Loss and optimizer
#
####################################################

# predict_y: softmax probabilities over the 10 classes, shape [None, 10]
predict_y = tf.nn.softmax(tf.matmul(h1, w2) + b2)

# cross-entropy loss, summed over the batch
cross_entropy = - tf.reduce_sum(y * tf.log(predict_y))

# gradient descent
train_op = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cross_entropy)

####################################################
#
# @brief Accuracy metric op
#
####################################################

equal_op = tf.equal(tf.argmax(y, 1), tf.argmax(predict_y, 1))
accuracy_op = tf.reduce_mean(tf.cast(equal_op, "float"))

####################################################
#
# @brief Training
#
####################################################

# variable-initialization op
init_op = tf.global_variables_initializer()

# load the MNIST dataset
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

test_xs = mnist.test.images
test_ys = mnist.test.labels

# saver op for checkpointing variables (not used in this run)
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(init_op)
    # training loop
    for step in range(BATCH_NUMBER):
        train_xs, train_ys = mnist.train.next_batch(BATCH_SIZE)
        sess.run(train_op, feed_dict={x: train_xs, y: train_ys})
        train_accuracy = sess.run(accuracy_op, feed_dict={x: train_xs, y: train_ys})
        test_accuracy = sess.run(accuracy_op, feed_dict={x: test_xs, y: test_ys})

        print("step-%d accuracy: train-%f test-%f" % \
              (step, train_accuracy, test_accuracy))
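One caveat on the loss above: cross_entropy applies tf.log directly to softmax probabilities, which yields -inf (and then NaN gradients) if a probability underflows to zero. The standard remedy is to fold the log into the softmax via the log-sum-exp trick, which TensorFlow exposes as tf.nn.softmax_cross_entropy_with_logits. The trick itself, shown in plain Python as a standalone sketch:

```python
import math

def log_softmax(logits):
    # log(softmax(z)_i) = z_i - max(z) - log(sum_j exp(z_j - max(z)))
    # subtracting the max keeps exp() from overflowing for large logits
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def cross_entropy(logits, label_index):
    # loss for a single one-hot example: -log p(true label)
    return -log_softmax(logits)[label_index]

# with extreme logits, a naive softmax-then-log would overflow/underflow,
# but the fused form stays finite
print(cross_entropy([1000.0, 0.0, -1000.0], 0))
```

The accuracy numbers in this post came from the unfused version, which happens to stay finite here; the fused form is simply the safer default.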


Output:

step-0 accuracy: train-0.240000 test-0.104600
step-1 accuracy: train-0.520000 test-0.312800
step-2 accuracy: train-0.300000 test-0.097400
step-3 accuracy: train-0.340000 test-0.415200
step-4 accuracy: train-0.620000 test-0.404300
step-5 accuracy: train-0.560000 test-0.486000
step-6 accuracy: train-0.720000 test-0.525000
step-7 accuracy: train-0.700000 test-0.604200
step-8 accuracy: train-0.640000 test-0.618300
step-9 accuracy: train-0.760000 test-0.607800
step-10 accuracy: train-0.740000 test-0.617200
step-11 accuracy: train-0.680000 test-0.617000
step-12 accuracy: train-0.800000 test-0.678400
step-13 accuracy: train-0.780000 test-0.656600
step-14 accuracy: train-0.920000 test-0.738200
step-15 accuracy: train-0.760000 test-0.716000
step-16 accuracy: train-0.840000 test-0.734400
step-17 accuracy: train-0.800000 test-0.763600
step-18 accuracy: train-0.920000 test-0.727200
step-19 accuracy: train-0.920000 test-0.800100
step-20 accuracy: train-0.840000 test-0.731300
......
step-980 accuracy: train-0.960000 test-0.961400
step-981 accuracy: train-1.000000 test-0.959300
step-982 accuracy: train-1.000000 test-0.963400
step-983 accuracy: train-1.000000 test-0.964900
step-984 accuracy: train-1.000000 test-0.964200
step-985 accuracy: train-1.000000 test-0.962800
step-986 accuracy: train-1.000000 test-0.964000
step-987 accuracy: train-0.980000 test-0.961000
step-988 accuracy: train-1.000000 test-0.958100
step-989 accuracy: train-1.000000 test-0.959300
step-990 accuracy: train-1.000000 test-0.959900
step-991 accuracy: train-1.000000 test-0.960400
step-992 accuracy: train-1.000000 test-0.962100
step-993 accuracy: train-0.980000 test-0.964100
step-994 accuracy: train-1.000000 test-0.965600
step-995 accuracy: train-1.000000 test-0.959000
step-996 accuracy: train-1.000000 test-0.961000
step-997 accuracy: train-1.000000 test-0.963800
step-998 accuracy: train-0.980000 test-0.963600
step-999 accuracy: train-1.000000 test-0.957700