An introduction to Tensorflex


The aim of this blog post

This blog post is a tutorial for Tensorflex, the project I have been working on as part of Google Summer of Code. Tensorflex is a bridge to Tensorflow, Google’s machine learning framework. Follow the work on Github here and star/watch the repository if it helps you out!

This post first covers training and saving a model in Python, and then shows how to make predictions from the saved model using Tensorflex.

Tensorflow and Tensorflex

  • Tensorflow is Google’s machine learning framework and is, along with PyTorch, one of the de facto standards for training models and making predictions in industry and academia.
  • The best way to use Tensorflow for training models is through Python. The C++ API can also be used but is generally more cumbersome and low-level than needed. Moreover, the documentation for the Tensorflow Python API is the easiest to read and understand.
  • Tensorflex consists of Tensorflow bindings for Elixir. It is based on the Tensorflow C API, which at the moment only supports making predictions from saved models.
  • Tensorflex can be used for making predictions in Elixir from models trained in Python.
  • The first half of this post covers using Python to train a simple neural network model on the widely known Iris dataset.
  • The second half covers making predictions from a set of inputs using Tensorflex.

Introduction to computational graphs

[Figure: an example of a computational graph]

Computational graphs form the basis for how Tensorflow works. The figure shown above is an example of a computational graph. As is evident, a graph consists of a set of operations that are performed on the inputs provided to the graph. Tensorflow uses this graph to:

  • describe the architecture for your machine learning model
  • save this graph (after training) so that predictions can be made later by performing the same set of actions (that is, operations) on whatever input is provided at prediction time
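
To make the idea concrete, here is a minimal pure-Python sketch of a computational graph. It is not how Tensorflow is implemented; the Graph class and its add_op and run methods are hypothetical names invented for this illustration. The only point is that operations are stored by name and evaluated on whatever input is fed in at run time.

```python
# Toy computational graph: named operations, evaluated on demand.
# Graph, add_op and run are hypothetical names, not part of Tensorflow.

class Graph:
    def __init__(self):
        self.ops = {}  # op name -> (function, list of input op names)

    def add_op(self, name, fn, inputs=()):
        self.ops[name] = (fn, list(inputs))

    def run(self, output_name, feed):
        """Evaluate `output_name`, taking placeholder values from `feed`."""
        if output_name in feed:
            return feed[output_name]
        fn, inputs = self.ops[output_name]
        args = [self.run(name, feed) for name in inputs]
        return fn(*args)


graph = Graph()
graph.add_op("input", None)            # placeholder: must be fed at run time
graph.add_op("weight", lambda: 3.0)    # stored constants stand in for trained values
graph.add_op("bias", lambda: 1.0)
graph.add_op("mul", lambda x, w: x * w, ["input", "weight"])
graph.add_op("output", lambda m, b: m + b, ["mul", "bias"])

# Same graph, any input: here 2.0 * 3.0 + 1.0
result = graph.run("output", feed={"input": 2.0})
print(result)
```

Note how the caller only needs to know two names, "input" and "output", to use the graph. The same holds when a saved Tensorflow graph is loaded from another language.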

With that background, we now discuss the dataset used for our ML model, then the Python Tensorflow code, and finally Tensorflex.

The Iris dataset

This is perhaps the best known database to be found in the pattern recognition literature. Fisher’s paper is a classic in the field and is referenced frequently to this day. (See Duda & Hart, for example.) The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other. The predicted attribute is the class of iris plant.

  • Attribute Information:
    • sepal length in cm
    • sepal width in cm
    • petal length in cm
    • petal width in cm
    • class:
      • Iris Setosa
      • Iris Versicolour
      • Iris Virginica

The dataset can be downloaded as a CSV file here.

Python code for training a simple model on the Iris dataset

This code has been adapted from the original Github repository here. All of the code that follows is available as an example in the Tensorflex repository. Here is the link.

The first step requires cloning the Tensorflex repository, and then you can navigate to the examples/iris-example folder and run the example directly from there using python train_model.py on the command line.

A brief description of what is happening in the train_model.py Python code:

We first make all the required imports and then read in the entire dataset from the CSV file. We put the features in the X NumPy array and assign the class labels to the y NumPy array. In the last few lines of this snippet we shuffle the data by drawing a random permutation of row indices with np.random.choice, and assign the shuffled rows of X to X_values and of y to y_values. It is also important to note that when we read in the class labels, we encode them as one-hot variables, because the neural network can only work with numerical values and not strings.

import tensorflow as tf
import pandas as pd
import numpy as np
from tensorflow.python.tools import freeze_graph
import os

dir = os.path.dirname(os.path.realpath(__file__)) + '/data/'

seed = 1234
np.random.seed(seed)
tf.set_random_seed(seed)

dataset = pd.read_csv('dataset.csv')
dataset = pd.get_dummies(dataset, columns=['Species']) 
values = list(dataset.columns.values)

y = dataset[values[-3:]]
y = np.array(y, dtype='float32')
X = dataset[values[1:-3]]
X = np.array(X, dtype='float32')

indices = np.random.choice(len(X), len(X), replace=False)
X_values = X[indices]
y_values = y[indices]
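
The one-hot encoding and shuffling steps can also be sketched without pandas or NumPy. This only illustrates what pd.get_dummies and np.random.choice(..., replace=False) achieve in the snippet above, not how they are implemented. The one_hot helper is a hypothetical name, and the three rows and their label strings are illustrative assumptions about what the CSV contains.

```python
import random

def one_hot(labels):
    """Map each string label to a 0/1 vector, one column per class (sorted by name)."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    vectors = [[1.0 if index[label] == i else 0.0 for i in range(len(classes))]
               for label in labels]
    return classes, vectors

# Illustrative rows: sepal length, sepal width, petal length, petal width
features = [[5.1, 3.5, 1.4, 0.2], [7.0, 3.2, 4.7, 1.4], [6.3, 3.3, 6.0, 2.5]]
species = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]  # assumed label strings

classes, encoded = one_hot(species)

# Shuffle features and labels with the same permutation so rows stay aligned.
random.seed(1234)
indices = random.sample(range(len(features)), len(features))
X_values = [features[i] for i in indices]
y_values = [encoded[i] for i in indices]

for row, label in zip(X_values, y_values):
    print(row, label)
```

Using one shared list of indices for both arrays is the important detail: shuffling the features and the labels independently would break the pairing between them.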

Next, we separate the data into train and test sets. Since we want to feed the input into Elixir manually later on and see the results, we set the test set size to just 10 rows. We also create the session to be used for computation later on.

Then we define the neural network architecture for our model. As can be seen it consists of the following operations:

  • A placeholder for our input data/features, which can consist of any number of rows and 4 columns (corresponding to the 4 features)
  • A placeholder for the class labels corresponding to the input features, which has the same number of rows as the input but 3 columns (for the 3 classes, after one-hot encoding)
  • One hidden layer with 8 nodes
  • The first set of weights, multiplied with the input, and the first set of biases, added to this product; the result is passed through a ReLU activation to give the hidden layer output
  • The second set of weights, multiplied with the hidden layer output, and the second set of biases, added to this product
  • The result is passed through the softmax function to obtain the output. Softmax squashes the values into the range 0 to 1 so that each row sums to 1, which lets them be read as class probabilities (although they are not calibrated probabilities in a strict statistical sense). The predicted class is the one with the highest value at the output

It is extremely important to understand that the input and output operations need to be named explicitly. This is crucial to reusing this graph later for making predictions in Elixir, because these names are what we refer to when feeding input data and obtaining outputs there. In our case, we name the input operation input and the output operation output.

test_size = 10
X_test = X_values[-test_size:]
X_train = X_values[:-test_size]
y_test = y_values[-test_size:]
y_train = y_values[:-test_size]

sess = tf.Session()

interval = 50
epoch = 500

X_data = tf.placeholder(shape=[None, 4], dtype=tf.float32, name='input')
y_target = tf.placeholder(shape=[None, 3], dtype=tf.float32)

hidden_layer_nodes = 8

w1 = tf.Variable(tf.random_normal(shape=[4,hidden_layer_nodes]), name='weights1') 
b1 = tf.Variable(tf.random_normal(shape=[hidden_layer_nodes]), name='biases1')   
w2 = tf.Variable(tf.random_normal(shape=[hidden_layer_nodes,3]), name='weights2') 
b2 = tf.Variable(tf.random_normal(shape=[3]),name='biases2')


hidden_output = tf.nn.relu(tf.add(tf.matmul(X_data, w1), b1))
final_output = tf.nn.softmax(tf.add(tf.matmul(hidden_output, w2), b2), name='output')
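
The forward pass defined above (matrix multiply, add bias, ReLU, then matrix multiply, add bias, softmax) can be sketched for a single input row in plain Python. The dense, relu and softmax helpers are hypothetical names, the weights are made up rather than trained, and the hidden layer here has 2 nodes instead of 8 purely to keep the example short.

```python
import math

def relu(vector):
    return [max(0.0, v) for v in vector]

def softmax(vector):
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(vector)
    exps = [math.exp(v - peak) for v in vector]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """Compute inputs . weights + biases; weights has shape [n_inputs][n_outputs]."""
    return [sum(x * w for x, w in zip(inputs, column)) + b
            for column, b in zip(zip(*weights), biases)]

# Made-up (untrained) parameters: 4 inputs -> 2 hidden nodes -> 3 classes
w1 = [[0.1, -0.2], [0.0, 0.3], [0.5, 0.1], [-0.4, 0.2]]
b1 = [0.0, 0.1]
w2 = [[0.3, -0.1, 0.2], [0.1, 0.4, -0.3]]
b2 = [0.0, 0.0, 0.0]

row = [5.1, 3.5, 1.4, 0.2]  # one flower: 4 features
hidden = relu(dense(row, w1, b1))
probabilities = softmax(dense(hidden, w2, b2))

print(probabilities)       # three values between 0 and 1
print(sum(probabilities))  # sums to 1 up to floating-point error
```

Because the weights are arbitrary, the printed values carry no meaning; what matters is the shape of the computation, which is the same one the saved graph performs.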

Next, we define the loss function to be used and the optimization algorithm for minimizing it, and then initialize the model's variables in our session. We then train the model and print the loss at regular intervals to watch it decrease over the epochs.

loss = tf.reduce_mean(-tf.reduce_sum(y_target * tf.log(final_output), axis=0))


optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001).minimize(loss)


init = tf.global_variables_initializer()
sess.run(init)

saver = tf.train.Saver()

print('Training the model...')
for i in range(1, (epoch + 1)):
    sess.run(optimizer, feed_dict={X_data: X_train, y_target: y_train})
    if i % interval == 0:
        print('Epoch', i, '|', 'Loss:', sess.run(loss, feed_dict={X_data: X_train, y_target: y_train}))
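
For readers who want to see what the loss and the optimizer are doing, here is a small pure-Python sketch. The cross_entropy function mirrors the expression used above (sum over samples for each class, then average over classes); the function name and the example predictions are made up for illustration. The second half shows the plain gradient descent update rule on a toy one-parameter loss, not on the network itself.

```python
import math

def cross_entropy(targets, predictions):
    """Sum -t * log(p) over samples for each class, then average over classes."""
    num_classes = len(targets[0])
    per_class = [
        -sum(t[c] * math.log(p[c]) for t, p in zip(targets, predictions))
        for c in range(num_classes)
    ]
    return sum(per_class) / num_classes

targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]       # one-hot labels
confident = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]]   # mostly right
unsure = [[0.3, 0.4, 0.3], [0.5, 0.2, 0.3]]        # mostly wrong

good_loss = cross_entropy(targets, confident)
bad_loss = cross_entropy(targets, unsure)
print(good_loss < bad_loss)  # better predictions give a lower loss

# Gradient descent on a toy loss (w - 3)^2, whose gradient is 2 * (w - 3).
w = 0.0
learning_rate = 0.1
for _ in range(100):
    gradient = 2 * (w - 3.0)
    w -= learning_rate * gradient  # step against the gradient
print(w)  # approaches the minimum at w = 3
```

The training loop above does the same thing at a larger scale: each call to the optimizer computes the gradient of the loss with respect to every weight and bias and nudges them in the direction that lowers the loss.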

The next snippet of code is crucial, as this is where we save the graph. We first save the Session’s graph definition as a .pbtxt (human-readable) file, because it is needed when we finally save the graph in the required .pb format using the freeze_graph function. In the next line we use the save function to write the checkpoint files that hold the trained variable values. Lastly, we use the freeze_graph function to combine these files into one single binary .pb file. This is the file format required by Tensorflex to load the pre-defined graph in Elixir. Quick note: after running the script, make sure to copy and keep the graphdef.pb file from the data folder, as it will be used to load the model in Tensorflex later.

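# Write the graph structure in human-readable text form
tf.train.write_graph(sess.graph_def, dir, 'graph.pbtxt')
# Save the trained variable values as a checkpoint
saver.save(sess, dir+'graph')
# Merge structure and trained values into one binary file, keeping what is needed to compute 'output'
freeze_graph.freeze_graph(dir+'graph.pbtxt','', False, dir+'graph', 'output', 'save/restore_all', 'save/Const:0', 'data/graphdef.pb', True, '')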

The last few lines of code below just print the input features being fed from the test dataset, along with the actual class label and the predicted label, so that the two can be compared. It can be seen that the neural network predicts the correct label each time. To verify that Tensorflex works, we will use this same input data and check that its predictions match those of the trained model saved above.

for i in range(len(X_test)):
    print('For datapoint: '+ str(X_test[i])+ ' Actual:', y_test[i], 'Predicted:', np.rint(sess.run(final_output, feed_dict={X_data: [X_test[i]]})))

Generating predictions in Elixir using Tensorflex

Before we can generate predictions for our test data in Tensorflex, the following needs to be taken care of:

  • Make sure you have the graphdef.pb file generated by running the Python script
  • Ensure that after cloning the Tensorflex repository you have also followed the How to run section of the README. If the code does not compile for you, check whether your install directories for Tensorflow and Erlang/Elixir are somewhere unusual, and change the Makefile accordingly

Once everything is working, run iex -S mix in the console and iex should start up.

Now, we first read in the graph definition model created in Python into Elixir. Then we see all the operations this graph has as a sanity check and see if it has our input and output operations listed. We do this as follows:

iex(1)> {:ok, graph} = Tensorflex.read_graph("graphdef.pb")
2018-06-13 01:55:06.968589: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
{:ok, #Reference<0.1149503059.832962562.223497>}

iex(2)> Tensorflex.get_graph_ops graph
["biases2", "biases2/read", "weights2", "weights2/read", "biases1",
 "biases1/read", "weights1", "weights1/read", "input", "MatMul", "Add", "Relu",
 "MatMul_1", "Add_1", "output"]

Then we need to create our input and output tensors. For the input tensor, we first create matrices containing the data values (the 10 rows from the test set) and their dimensions. We then pass these matrices to the float32_tensor function to create the input tensor. For the output tensor, since its values will only be populated once the predictions are run, we allocate the tensor by specifying its dimensions using float32_tensor_alloc.

iex(3)> in_vals = Tensorflex.create_matrix(10,4,[[7.7, 2.6, 6.9, 2.3],[5.0, 3.4,1.6,0.4], [6.7,3.3,5.7,2.1],[4.8,3.1,1.6,0.2],[5.1,3.3,1.7,0.5],[6.8,3.2,5.9,2.3],[6.5,3.0,5.5,1.8],[5.5,2.3,4.0,1.3],[4.4,3.0,1.3,0.2],[4.6,3.2,1.4,0.2]])
#Reference<0.1149503059.832962561.224293>

iex(4)> in_dims = Tensorflex.create_matrix(1,2,[[10,4]])
#Reference<0.1149503059.832962561.224301>

iex(5)> {:ok, input_tensor} = Tensorflex.float32_tensor(in_vals, in_dims)
{:ok, #Reference<0.1149503059.832962561.224309>}

iex(6)> out_dims = Tensorflex.create_matrix(1,2,[[10,3]])
#Reference<0.1149503059.832962561.224317>

iex(7)> {:ok, output_tensor} = Tensorflex.float32_tensor_alloc(out_dims)
{:ok, #Reference<0.1149503059.832962561.224325>}

Lastly, we just need to run the Session to generate predictions. For this, the run_session function is used. It takes in 5 arguments: the graph definition, the input tensor, the output tensor, the name of the input operation and the output operation. For our case, as mentioned before, the name of the input operation is input and the name of the output operation is output. Thus, this is done as follows:

iex(8)> Tensorflex.run_session(graph, input_tensor, output_tensor, "input","output")
[
  [2.065545068319352e-9, 2.6526619330979884e-4, 0.9997346997261047],
  [0.9949457049369812, 0.005054288078099489, 1.2820266626079047e-8],
  [9.829102964431513e-6, 0.047891464084386826, 0.9520986676216125],
  [0.9923586249351501, 0.007641407195478678, 1.2165889629045523e-8],
  [0.9862624406814575, 0.013737381435930729, 6.134903429710903e-8],
  [5.727139296141104e-7, 0.0067168646492064, 0.9932825565338135],
  [5.32361154910177e-5, 0.12258552014827728, 0.8773612380027771],
  [0.0023881904780864716, 0.862952709197998, 0.13465915620326996],
  [0.9962936043739319, 0.003706414718180895, 9.868954542469055e-9],
  [0.9973812699317932, 0.002618684433400631, 4.519155716309342e-9]
]

Now, while this output from the Session might look a bit unusual, it isn’t. It is a matrix of 10 rows and 3 columns, as expected. The values come directly from the softmax function. If you look closely, you will see that in each row one of the three columns holds a value close to 1 (the lowest is about 0.86) and the other two hold values close to 0. These values can therefore be read as 1 or 0 after rounding. Once you compare them with the values printed when the Python script was run, you will see that the rounded predictions match exactly.
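
As a rough sketch of how these softmax rows could be turned back into labels, the snippet below takes three rows from the output above, picks the largest value in each, and maps it to a class name. The helper names are hypothetical, and the class order (setosa, versicolor, virginica) is an assumption based on the alphabetical column order that pd.get_dummies produces; check it against your own CSV.

```python
# Three rows copied from the run_session output above (rows 1, 2 and 8).
predictions = [
    [2.065545068319352e-9, 2.6526619330979884e-4, 0.9997346997261047],
    [0.9949457049369812, 0.005054288078099489, 1.2820266626079047e-8],
    [0.0023881904780864716, 0.862952709197998, 0.13465915620326996],
]

# Assumed column order after one-hot encoding.
CLASS_NAMES = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

def argmax(row):
    """Index of the largest value in the row."""
    return max(range(len(row)), key=lambda i: row[i])

def to_one_hot(row):
    """Round a softmax row to a one-hot vector, like np.rint does in the Python script."""
    winner = argmax(row)
    return [1.0 if i == winner else 0.0 for i in range(len(row))]

one_hot_rows = [to_one_hot(row) for row in predictions]
labels = [CLASS_NAMES[argmax(row)] for row in predictions]

for row, label in zip(one_hot_rows, labels):
    print(row, label)
```

Taking the largest value is slightly more robust than rounding each entry, since it still yields exactly one class when no value exceeds 0.5.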

Conclusion

This post served as a tutorial for Tensorflex, the Tensorflow bindings for Elixir. To keep up with development on the project, watch and star the repository on Github. Feel free to contact me with any questions or feature requests at anshumanc[dot]1996[at]gmail[dot]com.

Have a nice day! :-)