Now that we've explored placeholders and variables, let's build an example model for regression analysis, similar to the one we created in Chapter 13, Parallelizing Neural Network Training with TensorFlow, where our goal is to implement a linear regression model: $\hat{y} = wx + b$.
In this model, $w$ and $b$ are the two parameters of this simple regression model that need to be defined as variables. Note that $x$ is the input to the model, which we can define as a placeholder. Furthermore, recall that to train this model, we need to formulate a cost function. Here, we use the Mean Squared Error (MSE) cost function that we defined in Chapter 10, Predicting Continuous Target Variables with Regression Analysis: $MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y^{(i)} - \hat{y}^{(i)}\right)^2$.
Here, $y$ is the true value, which is given as the input to this model for training. Therefore, we need to define $y$ as a placeholder as well. Finally, $\hat{y}$ is the prediction output, which will be computed using the TensorFlow operations tf.matmul and tf.add (for our single-feature input, the code below writes the product with the overloaded * operator instead of tf.matmul). Recall that TensorFlow operations return zero or more tensors; here, tf.matmul and tf.add each return one tensor.
We can also use the overloaded operator + for adding two tensors; however, the advantage of tf.add is that we can give the resulting tensor a name via the name parameter.
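Before expressing the model as a graph, it may help to see the same arithmetic in plain NumPy. The following is a minimal sketch, not the TensorFlow implementation: the helper names predict and mse and the toy values are assumptions chosen only for illustration.

```python
import numpy as np

def predict(x, w, b):
    """Linear model y_hat = w * x + b, applied element-wise to a 1-D input."""
    return w * x + b

def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return np.mean((y_true - y_pred) ** 2)

# Hypothetical toy values (not from the chapter); they lie exactly on y = 2x + 1
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])

perfect = mse(y, predict(x, w=2.0, b=1.0))  # model matches the data exactly
off = mse(y, predict(x, w=2.0, b=0.0))      # every prediction is off by 1

print(perfect, off)
```

With the correct parameters the cost is zero, and shifting the bias by 1 makes every residual equal to 1, so the cost becomes 1. The graph we build next computes the same two quantities, only symbolically.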
So, let's summarize all our tensors with their mathematical notation and their names in the code, as follows:
- Input $x$: tf_x defined as a placeholder
- Input $y$: tf_y defined as a placeholder
- Model parameter $w$: weight defined as a variable
- Model parameter $b$: bias defined as a variable
- Model output $\hat{y}$: y_hat returned by the TensorFlow operations to compute the prediction using the regression model
The code to implement this simple regression model is as follows:
>>> import tensorflow as tf
>>> import numpy as np
>>>
>>> g = tf.Graph()
>>>
>>> with g.as_default():
...     tf.set_random_seed(123)
...     ## placeholders
...     tf_x = tf.placeholder(shape=(None),
...                           dtype=tf.float32,
...                           name='tf_x')
...     tf_y = tf.placeholder(shape=(None),
...                           dtype=tf.float32,
...                           name='tf_y')
...
...     ## define the variable (model parameters)
...     weight = tf.Variable(
...         tf.random_normal(
...             shape=(1, 1),
...             stddev=0.25),
...         name='weight')
...     bias = tf.Variable(0.0, name='bias')
...
...     ## build the model
...     y_hat = tf.add(weight * tf_x, bias,
...                    name='y_hat')
...
...     ## compute the cost
...     cost = tf.reduce_mean(tf.square(tf_y - y_hat),
...                           name='cost')
...
...     ## train the model
...     optim = tf.train.GradientDescentOptimizer(
...         learning_rate=0.001)
...     train_op = optim.minimize(cost, name='train_op')
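To get a feel for what running train_op does, here is a rough NumPy sketch of a single gradient descent update for the MSE cost. TensorFlow derives the gradients automatically; in this sketch they are written out by hand. The function name gd_step, the toy data, and the learning rate of 0.05 are assumptions made for illustration and are not part of the graph above.

```python
import numpy as np

def gd_step(x, y, w, b, learning_rate=0.001):
    """One gradient descent update for y_hat = w*x + b under the MSE cost.

    Returns the updated (w, b) and the cost measured before the update.
    """
    residual = y - (w * x + b)
    cost = np.mean(residual ** 2)
    # Partial derivatives of mean((y - (w*x + b))**2) with respect to w and b
    grad_w = -2.0 * np.mean(residual * x)
    grad_b = -2.0 * np.mean(residual)
    return w - learning_rate * grad_w, b - learning_rate * grad_b, cost

# Hypothetical noise-free data lying exactly on y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

w, b = 0.0, 0.0
costs = []
for _ in range(200):
    w, b, c = gd_step(x, y, w, b, learning_rate=0.05)
    costs.append(c)

print('w=%.3f, b=%.3f, final cost=%.6f' % (w, b, costs[-1]))
```

On this noise-free data the parameters move toward the slope 2 and intercept 1 that generated it, and the recorded cost shrinks from one step to the next.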
Now that we've built the graph, our next steps are to create a session to launch the graph and train the model. But before we go further, let's see how we can evaluate tensors and execute operations. We'll create a random regression dataset with one feature using the make_random_data function and then visualize the data:
>>> ## create a random toy dataset for regression
>>>
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> np.random.seed(0)
>>>
>>> def make_random_data():
...     x = np.random.uniform(low=-2, high=4, size=200)
...     y = []
...     for t in x:
...         r = np.random.normal(loc=0.0,
...                              scale=(0.5 + t*t/3),
...                              size=None)
...         y.append(r)
...     return x, 1.726*x -0.84 + np.array(y)
>>>
>>>
>>> x, y = make_random_data()
>>>
>>> plt.plot(x, y, 'o')
>>> plt.show()
The following figure shows the random regression data that we generated:
Now we're ready to train the previous model. Let's start by creating a TensorFlow session object called sess. Then, we want to initialize our variables, which, as we saw, we can do with sess.run(tf.global_variables_initializer()). After this, we can create a for loop to execute the train operator and calculate the training cost at the same time.
So let's combine the two tasks, executing an operator and evaluating a tensor, into one sess.run method call. The code for this is as follows:
>>> ## train/test splits
>>> x_train, y_train = x[:100], y[:100]
>>> x_test, y_test = x[100:], y[100:]
>>>
>>>
>>> n_epochs = 500
>>> training_costs = []
>>> with tf.Session(graph=g) as sess:
...     sess.run(tf.global_variables_initializer())
...
...     ## train the model for n_epochs
...     for e in range(n_epochs):
...         c, _ = sess.run([cost, train_op],
...                         feed_dict={tf_x: x_train,
...                                    tf_y: y_train})
...         training_costs.append(c)
...         if not e % 50:
...             print('Epoch %4d: %.4f' % (e, c))
Epoch    0: 12.2230
Epoch   50: 8.3876
Epoch  100: 6.5721
Epoch  150: 5.6844
Epoch  200: 5.2269
Epoch  250: 4.9725
Epoch  300: 4.8169
Epoch  350: 4.7119
Epoch  400: 4.6347
Epoch  450: 4.5742
>>> plt.plot(training_costs)
>>> plt.show()
The code generates the following graph that shows the training costs after each epoch:
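To put the training costs in context, it can be useful to know the lowest training MSE that any straight line could reach on this data. The sketch below is an addition for comparison, not part of the TensorFlow workflow: it regenerates the toy data with the same seed and fits the line in closed form with np.polyfit. The helper name line_mse and the variables w_ls, b_ls, and best_train_mse are assumptions introduced here.

```python
import numpy as np

np.random.seed(0)

def make_random_data():
    """Same recipe as the chapter: a linear trend plus input-dependent noise."""
    x = np.random.uniform(low=-2, high=4, size=200)
    noise = np.array([np.random.normal(loc=0.0, scale=0.5 + t * t / 3)
                      for t in x])
    return x, 1.726 * x - 0.84 + noise

def line_mse(x, y, w, b):
    """MSE of the line y_hat = w*x + b on the given data."""
    return np.mean((y - (w * x + b)) ** 2)

x, y = make_random_data()
x_train, y_train = x[:100], y[:100]

# Closed-form least-squares fit of a degree-1 polynomial: (slope, intercept)
w_ls, b_ls = np.polyfit(x_train, y_train, deg=1)
best_train_mse = line_mse(x_train, y_train, w_ls, b_ls)

print('Least-squares fit: w=%.3f, b=%.3f, training MSE=%.4f'
      % (w_ls, b_ls, best_train_mse))
```

Because least squares minimizes the MSE over all lines, the printed value is a floor for the training cost. If the cost after the last epoch is still noticeably above it, more epochs or a larger learning rate may be worth trying.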