Today we're looking at running inference (a forward pass) on a neural network model in Golang. If you're a beginner like me, using a framework like Keras makes writing deep learning algorithms significantly easier. If you're very fresh to deep learning, please have a look at my previous post: Deep Learning, Bottom Up.

But why run it in Go? There are several reasons why that would be beneficial:

  • The current infrastructure already runs Kubernetes / Docker containers, and Golang makes the binaries extremely small and efficient
  • Web frameworks for Go are much faster than the Python ones
  • The team aren't necessarily data scientists working in Python; they may work in Go
  • Data can be pushed internally over gRPC for faster communication between microservices

There are some caveats and things we need to watch out for when doing so and I'll get to them in a bit. The high level process is:

  1. Build and train the model using Keras
  2. Use a TF session with keras.backend when building and training the model
  3. Name the input layer and output layer in the model (we'll see why later)
  4. Use that TF session to save the model as a computation graph with the variables (Keras normally saves to HDF5, but we skip that)
  5. Load the model in Go and run inference

The full code can be found on my Github page for the more savvy folks:

I'll only be pulling out snippets of the code that are relevant in the rest of the article. If something isn't clear, please refer to the python notebook. You should be able to get the idea without reading every line though.

Binary Classification with Keras

The chest x-ray dataset is high resolution, and the CSV is very clean and well labelled. You can get it from here. It's about 150GB of 1024x1024 PNGs. After putting the data into a CSV and taking a quick peek with pandas, the data and a corresponding image look like this:

Pandas df.head()

![chest x-ray csv](/content/images/2018/04/chestrays-csv-head.jpg)

Plot one of the images in the notebook:

![chest xray](/content/images/2018/04/chest-xray.png)

Okay, so the only two columns we care about are the image file names and the labels. I'd like to classify whether an image shows No Finding or, say, Atelectasis.

Prepping the Data For Keras Image Generator

Keras has a nice way of building models using generators, so that's what we'll do here. It automatically picks up the labels based on the folder structure, so I have the following:

train/
    Atelectasis/
    NoFinding/
test/
    Atelectasis/
    NoFinding/

The generator will automatically pick the folder name as the label. We just need to write a bit of code to put the images in the right folders. You can have a look at the full source for the imported variables, but the ones you need to know here are:

train_rows=3600 # arbitrarily picked a smaller number which we'll read from the csv

The rest are pretty self-explanatory, or you can infer their meaning.

# Prepare train and test sets

# Factorize the labels and make the directories, convert all | to _'s, remove spaces
labels, names = pd.factorize(df[1])
image_names = image_dir + df.iloc[0:rows,0].values

# data mover function, also populates the dictionary so we can see the distribution of data
def copyImages(dataframe, idx, directory="train"):
    classification = dataframe.iloc[idx][1].replace(" ","").replace("|","_")
    source = image_dir + dataframe.iloc[idx][0]
    destination = directory + "/"
    if classification == "NoFinding":
        shutil.copy(source, destination + "NoFinding")
    elif classification.find(toClassify) >= 0:
        shutil.copy(source, destination + toClassify)

# Make classification directories
pathlib.Path("train/" + "NoFinding").mkdir(parents=True, exist_ok=True)
pathlib.Path("train/" + toClassify).mkdir(parents=True, exist_ok=True)
pathlib.Path("test/" + "NoFinding").mkdir(parents=True, exist_ok=True)
pathlib.Path("test/" + toClassify).mkdir(parents=True, exist_ok=True)

for r in range(train_rows):
    copyImages(df, r, "train")

for r in range(test_rows):
    copyImages(df, train_rows + r, "test")

Build the Model

Now to build the NN Model. The code itself is very short and concise which is why I really like Keras.

sess = tf.Session()
K.set_session(sess)  # make Keras build and train the model in this TF session
model = Sequential()

model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(IMG_WIDTH, IMG_HEIGHT, CH), name="inputLayer"))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='sigmoid', name="inferenceLayer"))

sgd = optimizers.SGD(lr=0.01, momentum=0.0, decay=0.0, nesterov=False)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=["accuracy"])

Here K is the Keras backend, imported with: from keras import backend as K

There are three important points to make this work with Golang:

  1. We initiated a tf.Session(). We need this to save the model as a computation graph later. Normally you wouldn't need to do this at all in Keras.
  2. The input layer is named "inputLayer". That's the first node we need to know when running inference.
  3. The final output is our desired result, so that layer is called "inferenceLayer".

The Go bindings execute operations on the graph, so they need to know the node names.

Listing all the Nodes of the Computation Graph

At this point we can actually show what the node names are by running:

[n.name for n in tf.get_default_graph().as_graph_def().node]

Here's a shortened version of the list:

164 rows in total

The ones we're interested in are inputLayer_input and inferenceLayer/Sigmoid. Notice that these aren't exactly the names we gave the layers. That's because Keras/TF appends different numbers and values to make each node name unique. Numbers may be added at the end depending on the number of times you run the model. We named the layers anyway to make the nodes easier to find.

So if we re-run this model at a later date and export it, our Go code needs to change to match the new node names, otherwise it will just chuck an error.
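To make that less brittle, one option is to resolve the node names instead of hard-coding them. The following is only a sketch under my own assumptions: findNode is a hypothetical helper, the names slice is made-up sample data standing in for the list printed above, and the suffixed names are just what a second export might look like.

```go
package main

import (
	"fmt"
	"strings"
)

// findNode returns the single node name that starts with prefix and ends
// with suffix. It returns an error when nothing matches or when more than
// one name matches, so a stale name fails before the session is run.
func findNode(names []string, prefix, suffix string) (string, error) {
	var matches []string
	for _, name := range names {
		if strings.HasPrefix(name, prefix) && strings.HasSuffix(name, suffix) {
			matches = append(matches, name)
		}
	}
	switch len(matches) {
	case 0:
		return "", fmt.Errorf("no node matching %q...%q", prefix, suffix)
	case 1:
		return matches[0], nil
	default:
		return "", fmt.Errorf("ambiguous node %q...%q: %v", prefix, suffix, matches)
	}
}

func main() {
	// Hypothetical node list from a second export of the same notebook.
	names := []string{
		"inputLayer_input_1",
		"inputLayer_1/kernel",
		"inferenceLayer_1/kernel",
		"inferenceLayer_1/Sigmoid",
	}
	in, err := findNode(names, "inputLayer_input", "")
	if err != nil {
		fmt.Println("input:", err)
		return
	}
	out, err := findNode(names, "inferenceLayer", "/Sigmoid")
	if err != nil {
		fmt.Println("output:", err)
		return
	}
	fmt.Println(in, out)
}
```

The resolved strings would then be passed to the graph lookup in place of the literals. Erroring on an ambiguous match is deliberate: silently picking the first hit could feed the wrong node.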

So when we run the generator we get this output:

Found 1865 images belonging to 2 classes.
Found 683 images belonging to 2 classes.
{'Atelectasis': 0, 'NoFinding': 1}

Anything classified as a 0 is Atelectasis; anything classified as a 1 is No Finding. Keras will give us a 0 or a 1 when we run model.predict_classes(input). However, since the Go code reads the output of the last Sigmoid node directly, we'll get a value between 0 and 1. It's the same idea: we can use a threshold value to decide when it's a 0 and when it's a 1.
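As a rough sketch of that thresholding step in Go: classify and classNames are names I made up for illustration, and the 0.5 cut-off is an assumption rather than a tuned value.

```go
package main

import "fmt"

// classNames follows the generator's mapping: {'Atelectasis': 0, 'NoFinding': 1}.
var classNames = [2]string{"Atelectasis", "NoFinding"}

// classify turns the sigmoid output into a class label. Scores above the
// threshold map to class 1, everything else maps to class 0.
func classify(score, threshold float32) string {
	if score > threshold {
		return classNames[1]
	}
	return classNames[0]
}

func main() {
	// 0.5 is an assumed cut-off; a real deployment would pick it from
	// validation data.
	fmt.Println(classify(0.5441803, 0.5)) // prints "NoFinding"
}
```

Raising or lowering the threshold trades false positives against false negatives, which matters more for medical images than for most toy problems.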

Alright, now run the model and then save the output with:

# Use TF to save the graph model instead of Keras save model to load it in Golang
builder = tf.saved_model.builder.SavedModelBuilder("myModel")
# Tag the model, required for Go
builder.add_meta_graph_and_variables(sess, ["myTag"])
# Write the graph and variables to disk
builder.save()

Important notes here:

  1. The model is saved in a folder called myModel
  2. The graph is tagged with myTag

This writes the graph (as a protobuf) and the variables as binary files into the myModel folder. We'll need both string values, myModel and myTag, in the Go code.
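Before loading the folder (or after copying it to another machine), a quick sanity check can save some head scratching. This is a minimal sketch, assuming the export follows the usual SavedModel layout of a saved_model.pb file next to a variables directory; checkSavedModelDir is a hypothetical helper, not part of any library.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// checkSavedModelDir reports an error when dir does not look like a
// SavedModel export: a saved_model.pb file plus a variables directory.
func checkSavedModelDir(dir string) error {
	pb, err := os.Stat(filepath.Join(dir, "saved_model.pb"))
	if err != nil {
		return fmt.Errorf("graph definition: %w", err)
	}
	if pb.IsDir() {
		return fmt.Errorf("%s is a directory, expected a file", pb.Name())
	}
	vars, err := os.Stat(filepath.Join(dir, "variables"))
	if err != nil {
		return fmt.Errorf("variables: %w", err)
	}
	if !vars.IsDir() {
		return fmt.Errorf("%s is a file, expected a directory", vars.Name())
	}
	return nil
}

func main() {
	if err := checkSavedModelDir("myModel"); err != nil {
		fmt.Println("myModel does not look complete:", err)
		return
	}
	fmt.Println("myModel looks like a SavedModel export")
}
```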

Loading and Running the Model in Go

Here's the code in its entirety:

package main

import (
	"fmt"

	tf "github.com/tensorflow/tensorflow/tensorflow/go"
)

func main() {
	// replace myModel and myTag with the appropriate exported names in the chestrays-keras-binary-classification.ipynb
	model, err := tf.LoadSavedModel("myModel", []string{"myTag"}, nil)

	if err != nil {
		fmt.Printf("Error loading saved model: %s\n", err.Error())
		return
	}

	defer model.Session.Close()

	// Dummy input: a batch of one 250x250 image with 3 channels, all zeros
	tensor, _ := tf.NewTensor([1][250][250][3]float32{})

	result, err := model.Session.Run(
		// Feeds: map the input node to the tensor
		map[tf.Output]*tf.Tensor{
			model.Graph.Operation("inputLayer_input").Output(0): tensor, // Replace this with your input layer name
		},
		// Fetches: the nodes whose values we want back
		[]tf.Output{
			model.Graph.Operation("inferenceLayer/Sigmoid").Output(0), // Replace this with your output layer name
		},
		nil,
	)

	if err != nil {
		fmt.Printf("Error running the session with input, err: %s\n", err.Error())
		return
	}

	fmt.Printf("Result value: %v \n", result[0].Value())
}

It should now be clear why we needed those strings.

The tensor we input has the shape [batch size][width][height][channels].
In this case we just used empty dummy values, but to actually use it we need to convert an image into those dimensions.
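Here is one way that conversion could look using only the Go standard library. Treat it as a sketch with several assumptions of mine: imageToBatch and whiteImage are hypothetical helpers, the resize is a crude nearest-neighbour sample, pixel values are scaled to the range 0 to 1 (which only makes sense if training used the same rescaling), and the result is indexed by row first, which makes no difference to the shape for a square 250x250 input.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// imageToBatch resizes img to size x size with nearest-neighbour sampling and
// returns it as a batch of one: [1][size][size][3] float32 values in [0, 1].
func imageToBatch(img image.Image, size int) [][][][]float32 {
	b := img.Bounds()
	rows := make([][][]float32, size)
	for y := 0; y < size; y++ {
		rows[y] = make([][]float32, size)
		for x := 0; x < size; x++ {
			// Map the output pixel back to the nearest source pixel.
			sx := b.Min.X + x*b.Dx()/size
			sy := b.Min.Y + y*b.Dy()/size
			r, g, bl, _ := img.At(sx, sy).RGBA() // 16-bit channels, 0..65535
			rows[y][x] = []float32{
				float32(r) / 65535,
				float32(g) / 65535,
				float32(bl) / 65535,
			}
		}
	}
	return [][][][]float32{rows}
}

// whiteImage builds a solid white image, a stand-in for a decoded x-ray.
func whiteImage(w, h int) image.Image {
	img := image.NewRGBA(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			img.Set(x, y, color.White)
		}
	}
	return img
}

func main() {
	batch := imageToBatch(whiteImage(4, 4), 250)
	// Print the four dimensions of the batch.
	fmt.Println(len(batch), len(batch[0]), len(batch[0][0]), len(batch[0][0][0]))
}
```

As far as I know the Go bindings' tensor constructor accepts nested slices like this as well as fixed-size arrays, so the returned value could replace the dummy array. A real PNG would be decoded with the standard image/png package before being passed in.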

I trained the model on Windows 10 with an Nvidia GTX 970 (4GB). I found out later that the Go bindings only work on Linux and Mac, so I copied the myModel folder over to my Linux machine and ran the Go code there.

You'll need to install the Go bindings and also run go get

A successful run should yield something like this:

(ML) tony@tony-nuc:$GOPATH/src/$ go run main.go
2018-04-02 20:30:51.905087: I tensorflow/core/platform/] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-04-02 20:30:51.905281: I tensorflow/cc/saved_model/] Loading SavedModel with tags: { myTag }; from: myModel
2018-04-02 20:30:51.913855: I tensorflow/cc/saved_model/] Restoring SavedModel bundle.
2018-04-02 20:30:52.121236: I tensorflow/cc/saved_model/] Running LegacyInitOp on SavedModel bundle.
2018-04-02 20:30:52.122132: I tensorflow/cc/saved_model/] SavedModel load for tags { myTag }; Status: success. Took 216855 microseconds.
Result value: [[0.5441803]] 

Some Performance Numbers

Recall the model was:

3x3x32 Convolutional Layer
3x3x32 Convolutional Layer
2x2 Max Pool Layer
64 Node Fully Connected Layer with Dropout
1 Sigmoid output Layer

For Python:

  • CPU: ~2.72s to warm up and run one inference, and ~0.049s for each inference after
  • GPU: ~3.52s to warm up and run one inference, and ~0.009s for each inference after
  • Saved model size (HDF5): 242MB

For Go:

  • CPU: ~0.255s to warm up and run one inference, and ~0.045s for each inference after
  • GPU: N/A
  • Saved model size (Protobuf binaries): 236MB

I didn't run it many times, so take these numbers with a grain of salt. I did try to keep the Python test code equivalent to the Go one by using a small dummy tensor:

%%time
from keras.preprocessing import image
from keras.models import load_model
import numpy as np
model = load_model("model.h5")
img = np.zeros((1,250,250,3))
x = np.vstack([img]) # just append to this if we have more than one image.
classes = model.predict_classes(x)

That's the code for the first run. Just comment out the imports and model loading afterwards for consecutive runs. The %%time cell magic measures the execution time of the Jupyter Notebook cell.
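On the Go side, a similar warm-up versus steady-state measurement can be sketched with the standard time package. This is not the exact code I used for the numbers above; timeRuns is a hypothetical helper and the sleeping closure is a placeholder for the real session run.

```go
package main

import (
	"fmt"
	"time"
)

// timeRuns calls run once to capture the warm-up cost, then n more times, and
// returns the warm-up duration and the average duration of the later calls.
func timeRuns(run func(), n int) (warmup, average time.Duration) {
	start := time.Now()
	run()
	warmup = time.Since(start)
	if n <= 0 {
		return warmup, 0
	}
	start = time.Now()
	for i := 0; i < n; i++ {
		run()
	}
	average = time.Since(start) / time.Duration(n)
	return warmup, average
}

func main() {
	// Placeholder workload; in the real program this closure would wrap
	// the model.Session.Run call.
	run := func() { time.Sleep(2 * time.Millisecond) }
	warmup, average := timeRuns(run, 10)
	fmt.Printf("first run: %v, later runs: %v each\n", warmup, average)
}
```

Note that this only times calls to run; model loading would have to be timed separately or moved inside the first call.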

It goes without saying that the Go Docker container would be much smaller than the Python one, and the web framework would probably be the biggest differentiator between Python and Go.

Here's a writeup by Bijan comparing web framework performance across Node, Go and Python. I'll put the summary of a few relevant ones here:

Python + Flask:
11751 Requests/sec => 16393 requests in 30s 
Average Latency 55.54ms

PyPy2.7 Python + Twisted:
12633 Requests/sec => 379001 requests in 30s

Golang + bmizerany Pat + GOMAXPROCS(7):
51684 Requests/sec => 1550508 requests in 30s

Golang + Gorilla Pat (using Gorillas Muxer)
37756 Requests/sec => 1132689 requests in 30s 
Average Latency 1.71ms

Golang (no external dependencies)
63300 Requests/sec

I'll have to try the Go + GPU combo to see how it performs, but I suspect it will be very similar.

Do you use Go to serve up your models in prod? I'd love to know about your experience. Drop me an email or comment.