When using TensorFlow for multiple linear regression, I run into a problem with the parameters not converging, and it seems to depend on the choice of optimizer. If I use tf.train.AdamOptimizer, the parameters converge and the loss looks reasonable, but the fitted weights and bias are not consistent with the original coefficients, which is the first thing I don't understand. If I use tf.train.GradientDescentOptimizer, the loss keeps increasing and I cannot find the reason. I am a beginner and have been stuck on this for a while; the amount of code is small, so I hope someone can explain what is going on. Here is the code, with two quick diagnostic sketches after the listing:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
# generate three input features, each a 100 x 1 column
X1 = np.matrix(np.random.uniform(-10, 10, 100)).T
X2 = np.matrix(np.linspace(-10, 10, 100)).T
X3 = np.matrix(np.linspace(-10, 10, 100)).T
X_input = np.concatenate((X1, X2, X3), axis=1)
# true parameters: W = [20, -35, 4.3], b = 25
Y_input = 20 * X1 - 35 * X2 + 4.3 * X3 + 25 * np.ones((100, 1))
# trainable weight and bias variables
W = tf.Variable(tf.random_uniform(shape=[3, 1]))
b = tf.Variable(tf.random_uniform(shape=[1, 1]))
# placeholders for the inputs and targets
X = tf.placeholder(dtype=tf.float32, shape=[None, 3])
Y = tf.placeholder(dtype=tf.float32, shape=[None, 1])
# linear model prediction (the bias is broadcast across the 100 rows)
Y_pred = tf.matmul(X, W) + b * np.ones((100, 1))
# mean squared error averaged over the 100 samples
loss = tf.reduce_sum(tf.square(Y_pred - Y)) / 100
# Adam optimizer with learning rate 0.01
opt = tf.train.AdamOptimizer(0.01).minimize(loss)
# alternative: plain gradient descent with the same learning rate
# opt = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
# lists for recording the loss curve
x_axis = []
y_axis = []
with tf.Session() as sess:
    # initialize all variables before training
    sess.run(tf.global_variables_initializer())
    print("training, please wait...")
    for i in range(20000):
        sess.run(opt, feed_dict={Y: Y_input, X: X_input})
        x_axis.append(i)
        y_axis.append(sess.run(loss, feed_dict={Y: Y_input, X: X_input}))
    print("finished training!")
    print("W:", sess.run(W), "\nb:", sess.run(b))
    print(sess.run(loss, feed_dict={Y: Y_input, X: X_input}))
plt.plot(x_axis, y_axis)
plt.show()
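
First sketch, about the weights not matching the original: in the data-generation code above, X2 and X3 are both np.linspace(-10, 10, 100), so the second and third columns of X_input are identical. With identical (collinear) columns, only the sum of the corresponding weights is determined by the data, so a converged W does not have to reproduce -35 and 4.3 individually even when the loss is near zero. This is a minimal standalone check in plain NumPy, independent of the TensorFlow graph; the alternative pair (w2, w3) below is hypothetical and chosen only for illustration:

import numpy as np

# Recreate the two linspace features exactly as in the question's code.
X2 = np.linspace(-10, 10, 100)
X3 = np.linspace(-10, 10, 100)
print(np.allclose(X2, X3))  # True: the two columns are identical

# Because X2 == X3, any pair (w2, w3) with w2 + w3 == -35 + 4.3 == -30.7
# produces exactly the same predictions as the original coefficients.
w2, w3 = -20.0, -10.7  # hypothetical alternative pair, for illustration only
print(np.allclose(-35 * X2 + 4.3 * X3, w2 * X2 + w3 * X3))  # True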
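
Second sketch, about the loss increasing under GradientDescentOptimizer: for a quadratic loss like this one, full-batch gradient descent is stable only when the learning rate is below 2 / lambda_max, where lambda_max is the largest eigenvalue of the Hessian (2/n) * X_aug^T X_aug, with X_aug including a column of ones for the bias. The sketch below only computes that bound for these inputs; it is a diagnostic, not a fix, and the random seed is added here purely so the check is reproducible:

import numpy as np

np.random.seed(0)  # seed added only to make this check reproducible
X1 = np.random.uniform(-10, 10, 100)
X2 = np.linspace(-10, 10, 100)
X3 = np.linspace(-10, 10, 100)

# Augment with a column of ones so the bias is treated like a fourth weight.
X_aug = np.column_stack([X1, X2, X3, np.ones(100)])

# Hessian of the mean squared error (1/n) * ||X_aug @ theta - y||^2.
H = (2.0 / 100) * X_aug.T @ X_aug
lam_max = np.linalg.eigvalsh(H).max()
print("largest Hessian eigenvalue:", lam_max)
print("gradient descent is stable only for learning rates below", 2.0 / lam_max)

Comparing 0.01 against the printed bound shows whether the step size alone can explain the growing loss; if 0.01 is below the bound, the divergence has to come from something else in the setup.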