Deep Learning – Introduction By Example


There have been major breakthroughs in the field of Artificial Intelligence and automation in recent years. We use these sophisticated tools in unsophisticated ways to achieve what might have felt impossible a few years ago. Major examples of these advances include Face ID, introduced with the Apple iPhone X, Google Translate and, not to forget, Netflix’s super cool recommendations (you totally get my needs on those chill days, Netflix).

It is cool to use them but super hot to build them. Don’t believe me? Sure, google it and you will feel the heat. In this post let’s dive straight in and build one for ourselves. We will not be building a hadron collider, so I think we will be fine after this.


In this post we will be building a Fruit Recognizer (an Artificial Intelligence that can classify images of fruits into 60 different categories). We will be skipping minute details such as how we read data into memory. This allows us to focus on what is important.

You can download the dataset from Kaggle at this link and extract it next to the notebook, like this:


|___ Deep Learning Introduction.ipynb
|___ fruits-360/
| |___ ...

All the code for this post is available here.


This post assumes basic understanding of:

  • Python Programming Language (Python 3 is used herein) – Learn Python
  • Numpy (Numerical Computation Library for Python) – Learn Numpy
  • Matplotlib (Graph and image plotting library for Python) – Learn Matplotlib
  • PIL (Python Imaging Library), basic reading and writing of images – Learn PIL

In this post we can get away without having much knowledge about Numpy, Matplotlib and PIL. Knowing Python is a must!!

I have made some helper functions (available in the repository) to make your task easy!! 🙂

Notations Followed *Explained Ahead



A neural network is an algorithmic approach that combines the concept of representation learning with machine learning (well, kind of). So, what are these two terms?


Machine learning is a set of approaches and algorithms used to carry out predictive analysis. In other words, machine learning lets us make predictions about events or objects using the previous knowledge we embed into the algorithm. Examples include stock price prediction, search indexing and property price prediction.

This is all hunky-dory, but there are many nuts and bolts to fix before we can use these approaches. Machine learning lacks the power to learn directly from raw data. It needs various processing steps to be applied to the data to make it understandable to the computer. For example, most property price prediction algorithms need the area, the BHK count (bedrooms, hall, kitchen) and other locality-based parameters to be provided for prediction; they cannot just calculate the property price directly from images of the house.


Representation learning is a set of approaches that learn to turn raw data into a meaningful representation. This means these approaches can produce data that is acceptable for machine learning techniques. So much so that machine learning algorithms trained on these learned representations in many cases outperform those trained on representations hand-crafted by humans.


A neural network is a combination of artificial neurons that work in conjunction towards a common goal.

An artificial neuron is a simple unit (inspired by the neurons in the brain) that computes a weighted sum of its inputs and applies a non-linear function to it (as shown in the figure).

A neuron taking n inputs and giving one output!!
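To make the weighted sum a bit more concrete, here is a minimal sketch of a single neuron in plain Python. The sigmoid non-linearity, the bias term and the weight values are my assumptions for illustration; they are picked by hand, not learned, and are not taken from the helper code. With these hand-picked weights the neuron behaves like the AND and OR operations.

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs, followed by a sigmoid non-linearity."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hand-picked (not learned) weights and biases, assumed for illustration only.
def and_gate(a, b):
    # Output is close to 1 only when both inputs are 1.
    return round(neuron([a, b], weights=[10, 10], bias=-15))

def or_gate(a, b):
    # Output is close to 1 when at least one input is 1.
    return round(neuron([a, b], weights=[10, 10], bias=-5))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", and_gate(a, b), "OR:", or_gate(a, b))
```

The only difference between the two gates is the bias, which shifts how large the weighted sum must be before the neuron "fires".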

A neuron can be used for many different tasks:

  • Stock Price Prediction
  • Recommend whether to do a cesarean or not
  • Perform AND and OR Boolean operations

and the list goes on.

But when given data that doesn’t have a good representation, neurons perform poorly.


In Neural Networks there are 2 types of neurons: hidden neurons and output neurons. Hidden neurons are given the task of learning the underlying representation so as to ease the work of the output neurons. This gives us the power of Machine Learning and Representation Learning in one single entity.


Enough with theory for this post, I think. Now let us dive into the fun part. I have made some helper functions to ensure that you don’t have to go into full details here. I’ll be opening up this black box code in further posts. Meanwhile, feel free to fiddle with the code.

Note: Before starting make sure to install all required libraries by using 
pip install -r requirements.txt
and download helper function files from below given link.

Download code from here

Our Imports

from utils import *
from nn import *

Let’s collect our training data. get_data_from_dir(path, size=[100, 100]) is one of the helper functions. It collects all the data from the fruits-360 directory. The fruits-360 directory contains images; each image sits in a directory named after the fruit present in the image. For example, all apples are present in fruits-360/Training/Apple and so on.

X, y, labels_list = get_data_from_dir('fruits-360/Training/')
Found 28736 images belonging to 60 different classes

The data currently present in X is the images, in lexicographical order of the names of the fruit directories present in fruits-360, and y holds the index numbers 0 to 59 representing the fruit names present in labels_list.

This means all apples are present in X before all oranges. Neural networks are good at learning patterns in data. But sometimes they fit the given data so well that they start memorizing it. That sounds good, but it makes them forget the general notion of being an apple and focus only on the particular apples present in the dataset. Fed in this order, the network would also see only one kind of fruit for long stretches at a time. To stop this from happening we shuffle the data randomly, so that the NN sees a mix of fruits throughout and cannot fit too well to one part of the data while completely overlooking generality.

We will be discussing in a little while what I meant by fitting in the above lines.

X, y = shuffle(X, y)
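The important detail when shuffling is that images and labels must be shuffled with the same random order, otherwise an apple could end up labelled as an orange. Here is a rough sketch of one way to do that with NumPy. The function name shuffle_together and the toy arrays are made up for this example, and the real shuffle helper may well work differently.

```python
import numpy as np

def shuffle_together(X, y, seed=None):
    """Shuffle X and y with the same random permutation (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))   # one random ordering of the row indices
    return X[order], y[order]         # applied to both arrays, so pairs stay matched

# Toy stand-in data: 5 fake "images" of 2 pixels each; row i is [2*i, 2*i + 1].
X_demo = np.arange(10).reshape(5, 2)
y_demo = np.array([0, 1, 2, 3, 4])

X_shuf, y_shuf = shuffle_together(X_demo, y_demo, seed=0)
print(X_shuf)
print(y_shuf)
```

Every row of X_shuf still sits next to its original label in y_shuf; only the order of the pairs has changed.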

After training a neural network on a dataset, we generally need to see how well it performs on data which it hasn’t seen. To do so we separate the training and testing data to make sure our NN has no idea about the testing data.

X_train, y_train, X_test, y_test = train_test_split(X, y)
print('Number of training Examples: {} \nNumber of testing Examples {}'.format(
    X_train.shape[0], X_test.shape[0]))

Here I chose to hold out 20% of the training data for testing only.
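If you are curious what such a split could look like, here is a small sketch. It assumes the data has already been shuffled and simply cuts off the last 20%. The name split_train_test and the rounding rule are my assumptions, not necessarily what the train_test_split helper does; the return order mirrors the call above.

```python
import math
import numpy as np

def split_train_test(X, y, test_fraction=0.2):
    """Hold out the last `test_fraction` of already-shuffled data (sketch)."""
    n_test = math.ceil(len(X) * test_fraction)
    n_train = len(X) - n_test
    # Same return order as in the post: X_train, y_train, X_test, y_test
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]

# Toy stand-in data: 10 fake "images" with 4 values each.
X_demo = np.zeros((10, 4))
y_demo = np.arange(10)

X_tr, y_tr, X_te, y_te = split_train_test(X_demo, y_demo)
print(X_tr.shape[0], X_te.shape[0])   # 8 training examples, 2 testing examples
```

With the 28736 images found above, holding out 20% this way leaves 22988 images for training, which is the number you will see in the training log.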

Apart from training and testing data we also need data to see the performance of our NN after every training cycle (if you are still wondering what I mean by training, take a deep breath, we are almost there). But why do we need it? Think of it as the first hurdle in the NN’s lifespan to prove its capabilities. Our Neural Network needs to perform well on this dataset before we can actually test it on the testing dataset.

This dataset is like a preliminary qualification round to make sure we aren’t bringing a NN for testing before having a general idea of how well it works. This dataset is known as the validation dataset or development dataset.

So, we validate after every training cycle and we test after completing all training cycles.

X_dev, y_dev, _ = get_data_from_dir('fruits-360/Validation/')
X_dev, y_dev = shuffle(X_dev, y_dev)
Found 9673 images belonging to 60 different classes

Now the collection of data for training, validating and testing is complete. Just one last thing to look forward to.

Neural Networks used for classification, like the one at hand (classifying fruits), output the probabilities of an image belonging to each class. This means if an image belongs to class apple and the NN is 90% sure that the image is that of an apple, then it outputs an array of length 60 (since there are 60 classes) holding probabilities, something like [0.9, 0.01, 0.001…60 items]. If you have studied probability then you will realize that the sum of all these probabilities will be 1 (as these are all the possible classes the image can belong to).
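How does a network end up with numbers that add up to 1? A common way is a function called softmax, which turns any list of raw scores into probabilities. I am assuming softmax here for illustration; this post does not show which output function the model actually uses, and the scores below are made up.

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into probabilities that are positive and sum to 1."""
    shifted = scores - np.max(scores)   # subtracting the max avoids overflow in exp
    exps = np.exp(shifted)
    return exps / exps.sum()

# Made-up raw scores for 4 classes (a real model here would have 60).
scores = np.array([4.0, 1.0, 0.5, -2.0])
probs = softmax(scores)

print(probs)
print(probs.sum())
```

The class with the largest score gets the largest probability, and the whole array sums to 1 (up to floating point rounding).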

But what about true values? Of course, in the above case the NN produced a prediction and was 90% sure. But in the case of true values we are 100% sure that the image is that of an apple. Thus we give this image a label in the form of an array such as [1, 0, 0, .. 60 items]. This representation, where a label is an array with only one position set to 1, is known as a one-hot vector.

Remember Apple has an index 0 in labels_list.

Let us compute one hot vectors.

y_train_hot = one_hot(y_train)
y_dev_hot = one_hot(y_dev)
y_test_hot = one_hot(y_test)
array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.])
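For the curious, building one-hot vectors takes only a few lines of NumPy. This is a sketch of the idea and not the actual one_hot helper; the name one_hot_sketch and its explicit n_classes argument are my own.

```python
import numpy as np

def one_hot_sketch(labels, n_classes):
    """Return one row per label, with a single 1 at the label's index."""
    labels = np.asarray(labels)
    encoded = np.zeros((len(labels), n_classes))
    encoded[np.arange(len(labels)), labels] = 1.0   # row i gets a 1 in column labels[i]
    return encoded

# Three toy labels and 3 classes instead of 60, to keep the output readable.
demo = one_hot_sketch([0, 2, 1], n_classes=3)
print(demo)
```

Each row has exactly one 1, sitting at the index of the true class, just like the 60-item array printed above.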

Let us have a look at some of the images. Remember Y is true value and Y_hat is predicted value.

plot_some(X_train, y_train, labels_list=labels_list)
[20700 22003 11925 15794 15179  6766  6577  5860 15354 11392

Each image is of shape 100, 100, 3 (3 for RGB) and we have 60 classes. We will be allowing our neural network to see a batch of only 32 images at a time, and we will allow the Neural Network to train for only one cycle (each cycle is known as an epoch). Let us define these constants in code:

INPUT_SHAPE = [100, 100, 3]
N_CLASSES = 60

Let us define our Neural Network with these params and see its summary. Don’t worry if you do not understand the summary as of now. I’ll be explaining it in the next post. Just go along for now. 🙂

model = defin_model(INPUT_SHAPE, N_CLASSES)

Training a Neural Network

Remember when I told you that an artificial neuron does a weighted sum of its inputs? You must be wondering how we get those magical numbers (w1, w2, …, wn) used in the weighted sum. Let me explain how this works, but only from a satellite view for now, as it is too complex a thing to just throw into this post.

Consider a neural network having all weights randomly assigned. What will happen? Well, it will take an input and throw out some garbage that is of no use to anyone. But we have data that knows what the true value is. What if we just measure the error between the garbage and the true value, nudge our weights in the direction that reduces the error, and keep doing it until the error is acceptable?
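Here is a tiny sketch of that nudging on the simplest possible "network": a single weight w that should learn y = 3 * x. The technique shown is plain gradient descent on a mean squared error, which is my choice for illustration; the helper code hides whatever optimizer the real model uses. The toy data, the starting weight and the learning rate are all made up.

```python
import numpy as np

# Toy data generated from the rule y = 3 * x, which the weight should discover.
x = np.array([1.0, 2.0, 3.0, 4.0])
y_true = 3.0 * x

w = 0.0                # arbitrary starting weight (a real network starts from random values)
learning_rate = 0.01   # how big each nudge is
losses = []

for step in range(200):
    y_pred = w * x                        # the current "garbage" prediction
    error = y_pred - y_true
    losses.append(np.mean(error ** 2))    # mean squared error before the nudge
    gradient = np.mean(2 * error * x)     # slope of the error with respect to w
    w -= learning_rate * gradient         # nudge w in the direction that lowers the error

print("learned weight:", w)
print("first loss:", losses[0], "last loss:", losses[-1])
```

The loss shrinks with every step and the weight settles very close to 3. A real network does the same thing for all of its weights at once.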

This is what we are going to do with the next statement. We will nudge the randomly assigned weights to make the result better and better. And we will do it after every batch of 32 images rather than after all the 22988 images at once. This keeps us within the bounds of the available memory and helps keep the NN from overly fitting the training data and overlooking generalization.

model.fit(X_train, y_train_hot,
          batch_size=32, epochs=1,
          validation_data=(X_dev, y_dev_hot))
Train on 22988 samples, validate on 9673 samples
Epoch 1/1
22988/22988 [==============================] - 705s 31ms/step - loss: 0.9468 - acc: 0.7427 - val_loss: 0.5162 - val_acc: 0.8657
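To get a feel for what "32 images at a time" means for 22988 training images, here is a small sketch that only computes the batch boundaries. The function name iter_batches is made up for this example; the real batching happens inside the training call.

```python
import math

def iter_batches(n_samples, batch_size):
    """Yield (start, stop) index pairs that cover the data one batch at a time."""
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)

batches = list(iter_batches(22988, 32))

print("number of batches in one epoch:", len(batches))
print("first batch:", batches[0])
print("last batch:", batches[-1])
```

One epoch therefore means 719 small nudges to the weights. The last batch is smaller than the rest (12 images), because 22988 is not a multiple of 32.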

Let’s evaluate on training data first:

results = model.evaluate(X_train, y_train_hot)
print('Loss on Training set: {}  and Accuracy on Training Set: {}'.format(
    results[0], results[1]))
22988/22988 [==============================] - 177s 8ms/step
Loss on Training set: 0.059198942950391784  and Accuracy on Training Set: 0.979076039662501

Let’s evaluate on development set:

results = model.evaluate(X_dev, y_dev_hot)
print('Loss on Validation set: {}  and Accuracy on Validation Set: {}'.format(
    results[0], results[1]))
9673/9673 [==============================] - 74s 8ms/step
Loss on Validation set: 0.5162139866278878  and Accuracy on Validation Set: 0.8657086736337852

Here we saw that we are getting an accuracy of about 98% on the training data and 86.6% on the validation data. Come on, let’s find out the results on the testing data. Remember that the testing data was originally taken from the training data, and thus it might not be a good representation of general data.

Let’s finally evaluate our test data!!

results = model.evaluate(X_test, y_test_hot)
print('Loss on Test set: {}  and Accuracy on Test Set: {}'.format(
    results[0], results[1]))

Let’s visualize our results!!

plot_some(X_test, y_test, np.argmax(model.predict(X_test), axis=1), labels_list=labels_list)
[4870 3857 2178 4037 4227 1719 4608 5546 1334 2730]
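A quick word on the np.argmax(..., axis=1) part of the call above: the model returns one row of probabilities per image, and argmax picks the index of the largest probability in each row, which is the predicted class. Here is a toy sketch with made-up probabilities for 4 images and 3 classes.

```python
import numpy as np

# Made-up predicted probabilities: one row per image, one column per class.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
    [0.6, 0.3, 0.1],
])
y_true = np.array([0, 2, 1, 1])       # made-up true class indices

y_hat = np.argmax(probs, axis=1)      # index of the most probable class in each row
accuracy = np.mean(y_hat == y_true)   # fraction of images predicted correctly

print(y_hat)      # [0 2 1 0]
print(accuracy)   # 0.75
```

The last image is predicted as class 0 while its true class is 1, so 3 out of 4 predictions are right.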

This brings us to the end of this post. I hope you got a sense of what Deep Learning can do. This is just a demo, and it can be improved many times over by using other techniques. But I think it is good enough for any beginner.

Find source code here


  1. Deep Learning with Python
  2. Deep Learning: A Practitioner’s Approach
  3. Deep Learning Book
  4. Hands-On Machine Learning with Scikit-Learn and TensorFlow
