Image classifier

An introduction of Convolutional neural network and how to use GPU.

In this tutorial, we’ll apply a convolutional neural network (CNN) to another standardized dataset, the “CIFAR” data. This dataset is a collection of 60,000 images over 10 classes.

Required libraries

In [1]:
from __future__ import division, print_function
import os
import sys
import pickle

import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelBinarizer
from sklearn.metrics import confusion_matrix, classification_report

import renom as rm
from renom.optimizer import Sgd, Adam
from renom.cuda.cuda import set_cuda_active

GPU-enabled Computing

If you wish to use a GPU, you’ll need to call the set_cuda_active() with the single argument True . This will generally allow training to run much faster than relying on the CPU. You’ll need an NVIDIA GPU installed on your machine.

In [2]:
set_cuda_active(True)

Load data

Here we just unpickle the CIFAR image data, collected from the CIFAR website ( https://www.cs.toronto.edu/~kriz/cifar.html ). As in Tutorial 1, we scale the data to a range of 0 to 1 and binarize the labels.

In [3]:
dir = "./cifar-10-batches-py/"
paths = ["data_batch_1", "data_batch_2", "data_batch_3",
         "data_batch_4", "data_batch_5"]

def unpickle(f):
    fo = open(f, 'rb')
    if sys.version_info.major == 2:
        # Python 2.7
        d = pickle.load(fo)
    elif sys.version_info.major == 3:
        # Python 3.4
        d = pickle.load(fo, encoding="latin-1")
    fo.close()
    return d

# Load train data.
data = list(map(unpickle, [os.path.join(dir, p) for p in paths]))
train_x = np.vstack([d["data"] for d in data])
train_y = np.vstack([d["labels"] for d in data])

# Load test data.
data = unpickle(os.path.join(dir, "test_batch"))
test_x = np.array(data["data"])
test_y = np.array(data["labels"])

# Rehsape and rescale image.
train_x = train_x.reshape(-1, 3, 32, 32)
train_y = train_y.reshape(-1, 1)
test_x = test_x.reshape(-1, 3, 32, 32)
test_y = test_y.reshape(-1, 1)

train_x = train_x / 255.
test_x = test_x / 255.

# Binalize
labels_train = LabelBinarizer().fit_transform(train_y)
labels_test = LabelBinarizer().fit_transform(test_y)

# Change types.
train_x = train_x.astype(np.float32)
test_x = test_x.astype(np.float32)
labels_train = labels_train.astype(np.float32)
labels_test = labels_test.astype(np.float32)

N = len(train_x)

Neural network definition

Setup the CNN- essentially similar to Tutorial 1, except that here we are using several hidden layers. Also, we try to avoid over-fitting, by using the’‘dropout’’ technique.

In [4]:
class Cifar10(rm.Model):

    def __init__(self):
        super(Cifar10, self).__init__()
        self._l1 = rm.Conv2d(channel=32)
        self._l2 = rm.Conv2d(channel=32)
        self._l3 = rm.Conv2d(channel=64)
        self._l4 = rm.Conv2d(channel=64)
        self._l5 = rm.Dense(512)
        self._l6 = rm.Dense(10)
        self._sd = rm.SpatialDropout(dropout_ratio=0.25)
        self._pool = rm.MaxPool2d(filter=2, stride=2)

    def forward(self, x):
        t1 = rm.relu(self._l1(x))
        t2 = self._sd(self._pool(rm.relu(self._l2(t1))))
        t3 = rm.relu(self._l3(t2))
        t4 = self._sd(self._pool(rm.relu(self._l4(t3))))
        t5 = rm.flatten(t4)
        t6 = rm.dropout(rm.relu(self._l5(t5)))
        t7 = self._l6(t5)
        return t7

Definition of a nural network with sequential model

In [5]:
sequential = rm.Sequential([
        rm.Conv2d(channel=32),
        rm.Relu(),
        rm.Conv2d(channel=32),
        rm.Relu(),
        rm.MaxPool2d(filter=2, stride=2),
        rm.Dropout(dropout_ratio=0.25),
        rm.Conv2d(channel=64),
        rm.Relu(),
        rm.Conv2d(channel=64),
        rm.Relu(),
        rm.MaxPool2d(filter=2, stride=2),
        rm.Dropout(dropout_ratio=0.25),
        rm.Flatten(),
        rm.Dense(512),
        rm.Relu(),
        rm.Dropout(dropout_ratio=0.5),
        rm.Dense(10),
    ])

Instantiation

In [6]:
# Choose neural network.
network = Cifar10()
#network = sequential
optimizer = Adam()

Training loop

In the training loop, we recommend running a validation step for each minibatch. This will allow us to check the learning process, and prevent overfitting. This also allows you to diagnose training problems by comparing the validation and training learning curves.

In [7]:
# Hyper parameters
batch = 128
epoch = 20

learning_curve = []
test_learning_curve = []

for i in xrange(epoch):
    perm = np.random.permutation(N)
    loss = 0
    for j in xrange(0, N // batch):
        train_batch = train_x[perm[j * batch:(j + 1) * batch]]
        responce_batch = labels_train[perm[j * batch:(j + 1) * batch]]

        # Loss function
        network.set_models(inference=False)
        with network.train():
            l = rm.softmax_cross_entropy(network(train_batch), responce_batch)

        # Back propagation
        grad = l.grad()

        # Update
        grad.update(optimizer)
        loss += l.as_ndarray()

    train_loss = loss / (N // batch)

    # Validation
    test_loss = 0
    M = len(test_x)
    network.set_models(inference=True)
    for j in range(M//batch):
        test_batch = test_x[j * batch:(j + 1) * batch]
        test_label_batch = labels_test[j * batch:(j + 1) * batch]
        prediction = network(test_batch)
        test_loss += rm.softmax_cross_entropy(prediction, test_label_batch).as_ndarray()
    test_loss /= (j+1)

    test_learning_curve.append(test_loss)
    learning_curve.append(train_loss)
    print("epoch %03d train_loss:%f test_loss:%f"%(i, train_loss, test_loss))
epoch 000 train_loss:1.759590 test_loss:1.427164
epoch 001 train_loss:1.452497 test_loss:1.292583
epoch 002 train_loss:1.334705 test_loss:1.205753
epoch 003 train_loss:1.248209 test_loss:1.120521
epoch 004 train_loss:1.184235 test_loss:1.080423
epoch 005 train_loss:1.124357 test_loss:1.009399
epoch 006 train_loss:1.077931 test_loss:0.984938
epoch 007 train_loss:1.036453 test_loss:0.957155
epoch 008 train_loss:1.004979 test_loss:0.930456
epoch 009 train_loss:0.978650 test_loss:0.931968
epoch 010 train_loss:0.952131 test_loss:0.922379
epoch 011 train_loss:0.924238 test_loss:0.892902
epoch 012 train_loss:0.907049 test_loss:0.894063
epoch 013 train_loss:0.887215 test_loss:0.878045
epoch 014 train_loss:0.875834 test_loss:0.868688
epoch 015 train_loss:0.858393 test_loss:0.861893
epoch 016 train_loss:0.845052 test_loss:0.867170
epoch 017 train_loss:0.832762 test_loss:0.863249
epoch 018 train_loss:0.825219 test_loss:0.856124
epoch 019 train_loss:0.816549 test_loss:0.863323

Model evaluation

Finally, we evaluate the models labeling performance using the same metrics as in Tutorial 1.

In [8]:
network.set_models(inference=True)
predictions = np.argmax(network(test_x).as_ndarray(), axis=1)

# Confusion matrix and classification report.
print(confusion_matrix(test_y, predictions))
print(classification_report(test_y, predictions))

# Learning curve.
plt.plot(learning_curve, linewidth=3, label="train")
plt.plot(test_learning_curve, linewidth=3, label="test")
plt.title("Learning curve")
plt.ylabel("error")
plt.xlabel("epoch")
plt.legend()
plt.grid()
plt.show()
[[695  20  46  24  20   7  16   9 123  40]
 [  5 840   3  12   3   4  14   5  23  91]
 [ 58   2 494  77 121  74 112  32  18  12]
 [  9   6  52 522  71 148 120  24  22  26]
 [ 23   1  58  67 637  35 102  58  12   7]
 [  8   2  48 180  51 587  61  44  11   8]
 [  3   3  32  45  26  22 846   7   9   7]
 [ 12   0  27  54  80  68  19 715   4  21]
 [ 44  33   7  10   7   6   6   6 860  21]
 [ 17  77   5  14   8   5  13   9  38 814]]
             precision    recall  f1-score   support

          0       0.80      0.69      0.74      1000
          1       0.85      0.84      0.85      1000
          2       0.64      0.49      0.56      1000
          3       0.52      0.52      0.52      1000
          4       0.62      0.64      0.63      1000
          5       0.61      0.59      0.60      1000
          6       0.65      0.85      0.73      1000
          7       0.79      0.71      0.75      1000
          8       0.77      0.86      0.81      1000
          9       0.78      0.81      0.80      1000

avg / total       0.70      0.70      0.70     10000

../../../_images/notebooks_image_processing_image-classifier_notebook_16_1.png