Building on 《TensorFlow 实战之Softmax Regression识别手写数字(一)》, this post adds a hidden layer (200 nodes) with the relu activation function, raising handwritten-digit recognition accuracy from 91.2% to 95.8%.
What changed:
Added a 200-node hidden layer, with relu as its activation function. sigmoid was also tried as the activation, but it did not perform as well as relu. A comparison of sigmoid and relu will be covered in a dedicated later chapter.
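Ahead of that dedicated chapter, a minimal NumPy sketch (not part of the original code; the function names are illustrative) shows the gradient behavior that usually explains the gap: the sigmoid gradient vanishes for large |z| (saturation), while the relu gradient stays at 1 for all positive inputs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])

# sigmoid gradient: s * (1 - s); at most 0.25 (at z = 0),
# and nearly zero for large |z| -- the saturation regime
sig_grad = sigmoid(z) * (1.0 - sigmoid(z))

# relu gradient: 1 for z > 0, 0 otherwise -- no saturation for positive z
relu_grad = (z > 0).astype(float)

print(sig_grad)
print(relu_grad)
```

With one hidden layer the difference is modest, but it compounds as networks get deeper, since each layer multiplies in another gradient factor.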
If you run into problems in practice, see this post: http://www.sohu.com/a/125061373_465975
Code:
# coding:utf-8
'''
Handwritten-digit recognition: softmax regression with one hidden layer
'''
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

####################################################
#
# @brief Model hyperparameters
#
####################################################
LEARNING_RATE = 0.01
BATCH_NUMBER = 1000
BATCH_SIZE = 50
HIDDEN_SIZE = 200
INPUT_SIZE = 784
CLASS_NUMBER = 10

####################################################
#
# @brief Inputs
#
####################################################
# x: input images, None * 784
x = tf.placeholder("float", [None, INPUT_SIZE])
# y: one-hot labels, None * 10
y = tf.placeholder("float", [None, CLASS_NUMBER])

####################################################
#
# @brief Trainable parameters
#
####################################################
# w1: hidden-layer weights, 784 * 200
w1 = tf.Variable(tf.truncated_normal([INPUT_SIZE, HIDDEN_SIZE], stddev=0.1))
# b1: hidden-layer biases, 200, initialized to 0.1
b1 = tf.Variable(tf.ones([HIDDEN_SIZE]) / 10, name="biases1")

# hidden layer; relu reached 0.9558 test accuracy vs 0.9212 for sigmoid
# h1 = tf.sigmoid(tf.matmul(x, w1) + b1)  # 0.9212
h1 = tf.nn.relu(tf.matmul(x, w1) + b1)    # 0.9558

# w2: output-layer weights, 200 * 10
w2 = tf.Variable(tf.truncated_normal([HIDDEN_SIZE, CLASS_NUMBER], stddev=0.1))
# b2: output-layer biases, 10
b2 = tf.Variable(tf.ones([CLASS_NUMBER]) / 10, name="biases2")

####################################################
#
# @brief Optimization
#
####################################################
# predict_y: softmax(h1 * w2 + b2), None * 10
predict_y = tf.nn.softmax(tf.matmul(h1, w2) + b2)
# cross-entropy loss
cross_entropy = -tf.reduce_sum(y * tf.log(predict_y))
# gradient descent
train_op = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cross_entropy)

####################################################
#
# @brief Evaluation ops
#
####################################################
equal_op = tf.equal(tf.argmax(y, 1), tf.argmax(predict_y, 1))
accuracy_op = tf.reduce_mean(tf.cast(equal_op, "float"))

####################################################
#
# @brief Training
#
####################################################
# variable-initialization op
init_op = tf.initialize_all_variables()

# MNIST input data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
test_xs = mnist.test.images
test_ys = mnist.test.labels

# op for saving variables
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(init_op)
    # training loop
    for step in range(BATCH_NUMBER):
        train_xs, train_ys = mnist.train.next_batch(BATCH_SIZE)
        sess.run(train_op, feed_dict={x: train_xs, y: train_ys})
        train_accuracy = sess.run(accuracy_op, feed_dict={x: train_xs, y: train_ys})
        test_accuracy = sess.run(accuracy_op, feed_dict={x: test_xs, y: test_ys})
        print("step-%d accuracy: train-%f test-%f" %
              (step, train_accuracy, test_accuracy))
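One caveat worth flagging (not raised in the original post): the hand-rolled loss `-tf.reduce_sum(y * tf.log(predict_y))` returns NaN as soon as any entry of `predict_y` reaches exactly 0, which is why `tf.nn.softmax_cross_entropy_with_logits` (operating on the pre-softmax logits) is the usual choice. The stability comes from the log-sum-exp trick; a NumPy sketch of the idea, with an illustrative function name (the actual TensorFlow implementation differs in detail):

```python
import numpy as np

def stable_softmax_xent(logits, labels):
    """Per-example cross-entropy computed directly from logits."""
    # subtract the per-row max so exp() cannot overflow
    shifted = logits - logits.max(axis=1, keepdims=True)
    # log softmax = shifted logits minus log of the normalizer
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1)

# extreme logits: a naive softmax would overflow, since exp(1000) == inf
logits = np.array([[1000.0, 0.0], [0.0, 1000.0]])
labels = np.array([[1.0, 0.0], [0.0, 1.0]])

print(stable_softmax_xent(logits, labels))  # finite losses near 0
```

At this model's scale the naive loss usually works, which matches the results shown below, but the stable form removes a common source of silent NaN runs.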
Output:
step-0 accuracy: train-0.240000 test-0.104600
step-1 accuracy: train-0.520000 test-0.312800
step-2 accuracy: train-0.300000 test-0.097400
step-3 accuracy: train-0.340000 test-0.415200
step-4 accuracy: train-0.620000 test-0.404300
step-5 accuracy: train-0.560000 test-0.486000
step-6 accuracy: train-0.720000 test-0.525000
step-7 accuracy: train-0.700000 test-0.604200
step-8 accuracy: train-0.640000 test-0.618300
step-9 accuracy: train-0.760000 test-0.607800
step-10 accuracy: train-0.740000 test-0.617200
step-11 accuracy: train-0.680000 test-0.617000
step-12 accuracy: train-0.800000 test-0.678400
step-13 accuracy: train-0.780000 test-0.656600
step-14 accuracy: train-0.920000 test-0.738200
step-15 accuracy: train-0.760000 test-0.716000
step-16 accuracy: train-0.840000 test-0.734400
step-17 accuracy: train-0.800000 test-0.763600
step-18 accuracy: train-0.920000 test-0.727200
step-19 accuracy: train-0.920000 test-0.800100
step-20 accuracy: train-0.840000 test-0.731300
......
step-980 accuracy: train-0.960000 test-0.961400
step-981 accuracy: train-1.000000 test-0.959300
step-982 accuracy: train-1.000000 test-0.963400
step-983 accuracy: train-1.000000 test-0.964900
step-984 accuracy: train-1.000000 test-0.964200
step-985 accuracy: train-1.000000 test-0.962800
step-986 accuracy: train-1.000000 test-0.964000
step-987 accuracy: train-0.980000 test-0.961000
step-988 accuracy: train-1.000000 test-0.958100
step-989 accuracy: train-1.000000 test-0.959300
step-990 accuracy: train-1.000000 test-0.959900
step-991 accuracy: train-1.000000 test-0.960400
step-992 accuracy: train-1.000000 test-0.962100
step-993 accuracy: train-0.980000 test-0.964100
step-994 accuracy: train-1.000000 test-0.965600
step-995 accuracy: train-1.000000 test-0.959000
step-996 accuracy: train-1.000000 test-0.961000
step-997 accuracy: train-1.000000 test-0.963800
step-998 accuracy: train-0.980000 test-0.963600
step-999 accuracy: train-1.000000 test-0.957700