To begin with machine learning, we can start from the very basics and optimize our neural network step by step. Keras offers good built-in datasets; here we'll be using the MNIST digits classification dataset. The Keras official page gives good examples of how to use it, which you may want to check out for future reference.
What is the MNIST dataset?
MNIST stands for the Modified National Institute of Standards and Technology dataset. It consists of 60,000 28x28 grayscale images of the 10 digits, along with a test set of 10,000 images.
Load the data
To use data in a neural network, we need to extract, transform, and load it (ETL). The code below shows what's needed for this dataset.
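Here's a minimal sketch of that step (the reshape to 784 and the one-hot encoding are my assumptions, chosen to match the model summary shown later):

```python
from tensorflow import keras

# Load MNIST: 60,000 training and 10,000 test images, each 28x28 grayscale
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Flatten each 28x28 image into a 784-long vector and scale pixels to [0, 1]
x_train = x_train.reshape(60000, 784).astype("float32") / 255
x_test = x_test.reshape(10000, 784).astype("float32") / 255

# One-hot encode the labels so categorical_crossentropy can be used later
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
```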
As the code above shows, each image is flattened so that every pixel feeds one input neuron of the network. Pixel values are normally normalized into the range `[0, 1]` (each pixel value is divided by 255).
The output is one class among 10 classes. The last layer uses the softmax activation function. Softmax normalizes the output of a network into a probability distribution over the predicted output classes: every value lands between 0 and 1, and they sum to 1. You can find more about activation functions here.
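To make that concrete, here's softmax computed by hand with NumPy (a quick illustrative sketch, not Keras's internal implementation):

```python
import numpy as np

def softmax(z):
    # Shift by the max for numerical stability; the result is unchanged
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
print(softmax(logits))  # [0.659 0.242 0.099] -- values in (0, 1), summing to 1
```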
Build the model
Before building the model, I always create a few callback functions that make the notebook easier to work with. Here I'll show one example that creates a TensorBoard callback.
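Something like this (the log directory name is my choice, not the post's):

```python
import datetime
from tensorflow import keras

# Write TensorBoard logs into a timestamped subdirectory of logs/fit/
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
```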
If you're using a local machine, you should change the path for the logs.
Now let's build our first model.
I personally like to import the model and layers directly so that I can just type `model.add`; it's up to you how you write your code.
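A sketch of that first model (the layer name `dense_layer` and the 784-pixel input are taken from the model summary shown later):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
# A single fully connected layer: 784 pixel inputs -> 10 class outputs (softmax)
model.add(Dense(10, input_shape=(784,), activation="softmax", name="dense_layer"))
```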
So the model is defined; before training we need to compile it, and a few things are needed:
- An optimizer (https://www.tensorflow.org/api_docs/python/tf/keras/optimizers)
- A loss function (https://www.tensorflow.org/api_docs/python/tf/keras/losses)
- Metrics to evaluate the trained model
A few of my favorite loss functions:
- MSE
- binary_crossentropy
- categorical_crossentropy
Some useful metrics are:
- Accuracy : Calculates how often predictions equal labels.
- Precision: Computes the precision of the predictions with respect to the labels.
- and many more (https://www.tensorflow.org/api_docs/python/tf/keras/metrics)
Note that metrics are only used to evaluate the model, not to train it: the loss function is what the optimizer minimizes, while the metrics measure how well the network performs.
Compile model
We've created our first neural network. Let's compile it and see what's inside.
The SGD (Stochastic Gradient Descent) optimizer is used to reduce the loss at each epoch. A few parameters worth knowing:
- `epochs`: the number of complete passes the model makes through the entire training dataset.
- `batch_size`: the number of training instances in a single batch.
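Putting it together, compiling might look like this (SGD and categorical cross-entropy follow the post; the exact call is my sketch):

```python
# Compile with SGD and categorical cross-entropy, tracking accuracy
model.compile(optimizer="sgd",                  # Stochastic Gradient Descent
              loss="categorical_crossentropy",  # labels are one-hot encoded
              metrics=["accuracy"])
model.summary()
```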
The code above compiles the model and prints a summary of how it is put together; so far we've added nothing more than a single dense layer.
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= dense_layer (Dense) (None, 10) 7850 ================================================================= Total params: 7,850 Trainable params: 7,850 Non-trainable params: 0 _________________________________________________________________
Training a model in TensorFlow is fairly easy: simply call `fit`.
Train the model
Now we've got our model ready to be trained, so let's train it. This process will take around 3~5 minutes, so go grab a coffee.
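A sketch of the training call (`epochs=200` comes from the output below; `batch_size=128` and `validation_split=0.2` are inferred from the 375 steps per epoch, since 48,000 / 128 = 375):

```python
model.fit(x_train, y_train,
          epochs=200,
          batch_size=128,         # 48,000 / 128 = 375 steps per epoch
          validation_split=0.2,   # hold out 20% of training data for validation
          callbacks=[tensorboard_callback])
```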
It will output something like this:
```
Epoch 195/200
375/375 [==============================] - 1s 2ms/step - loss: 0.2779 - accuracy: 0.9228 - val_loss: 0.2765 - val_accuracy: 0.9231
Epoch 196/200
375/375 [==============================] - 1s 2ms/step - loss: 0.2795 - accuracy: 0.9221 - val_loss: 0.2764 - val_accuracy: 0.9230
Epoch 197/200
375/375 [==============================] - 1s 2ms/step - loss: 0.2659 - accuracy: 0.9269 - val_loss: 0.2764 - val_accuracy: 0.9226
Epoch 198/200
375/375 [==============================] - 1s 2ms/step - loss: 0.2742 - accuracy: 0.9252 - val_loss: 0.2764 - val_accuracy: 0.9234
Epoch 199/200
375/375 [==============================] - 1s 2ms/step - loss: 0.2750 - accuracy: 0.9220 - val_loss: 0.2762 - val_accuracy: 0.9232
Epoch 200/200
375/375 [==============================] - 1s 2ms/step - loss: 0.2769 - accuracy: 0.9216 - val_loss: 0.2762 - val_accuracy: 0.9236
<keras.callbacks.History at 0x7feac26f2310>
```
You can see the accuracy increasing (and eventually plateauing) as the epochs pass.
Evaluate our model
Now that training is done, it's time to evaluate the model on the 10,000 images we left aside as test data.
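For example (a minimal sketch using the test split loaded earlier):

```python
# Evaluate on the held-out test set and report accuracy
test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test accuracy:", test_acc)
```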
For mine, I got “Test accuracy: 0.9117000007629395”. This means roughly one incorrect classification in every 10 images.
Let's see how our model was trained over time. Simply call the `%tensorboard` magic.
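In a notebook, that looks like this (assuming the `logs/fit` directory used by the callback above):

```python
# Load the notebook extension, then point TensorBoard at the log directory
%load_ext tensorboard
%tensorboard --logdir logs/fit
```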
Red is the training curve and blue is the validation curve.
Conclusion
It's not bad, but we can improve this model. In the next post we'll add a hidden layer to the current model and see how it improves performance.
You can access the full notebook for this post here.