Image Binary Classification

An image binary classification problem: applying a convolutional neural network (CNN) model to Caltech 101

We’ll apply a convolutional neural network (CNN) to the Caltech 101 dataset, an image dataset created by Fei-Fei Li at the California Institute of Technology.

The data reference is as below.
Caltech 101
L. Fei-Fei, R. Fergus and P. Perona. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006.

Required libraries

In [31]:
import renom as rm
from renom.utility.distributor import ImageClassificationDistributor
from renom.utility.distributor.imageloader import ImageLoader
from renom.utility.image import *
from renom.optimizer import Sgd, Adam
from renom.cuda.cuda import set_cuda_active
import matplotlib.pyplot as plt
import numpy as np
import math
import os

from PIL import Image
from sklearn.preprocessing import LabelBinarizer
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.model_selection import train_test_split


GPU-enabled Computing

If you wish to use a GPU, call set_cuda_active() with the single argument True. Training then generally runs much faster than on the CPU. You’ll need an NVIDIA GPU installed on your machine.

In [32]:
set_cuda_active(True)
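
If your machine has no NVIDIA GPU, keep everything on the CPU instead; the rest of the notebook runs unchanged, only more slowly:

set_cuda_active(False)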

Load for classification

Define a function that collects the image file paths and the corresponding one-hot labels for the two target classes, Motorbikes and airplanes.

In [33]:
def load_for_classification(path):
    # Target classes: index 0 -> Motorbikes, index 1 -> airplanes.
    class_list = ["Motorbikes", "airplanes"]
    # Build a one-hot vector for each class.
    onehot_vectors = []
    for i in range(len(class_list)):
        temp = [0] * len(class_list)
        temp[i] = 1
        onehot_vectors.append(temp)
    # Collect every file path together with its class's one-hot label.
    X_list = []
    y_list = []
    for classname in class_list:
        imglist = os.listdir(path + classname)
        for filename in imglist:
            filepath = path + classname + "/" + filename
            X_list.append(filepath)
            onehot = onehot_vectors[class_list.index(classname)]
            y_list.append(onehot)

    return X_list, y_list, class_list

Load the file paths and label data

In [34]:
path = "101_ObjectCategories/"
X_list, Y_list, class_list = load_for_classification(path)
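
As a quick sanity check (an addition, not part of the original notebook), you can confirm how many image paths were collected and how they split across the two classes:

# Total number of image paths and per-class counts, in class_list order.
print(len(X_list))
print(np.sum(Y_list, axis=0))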

Load and resize the image data

We load the images in RGB format from the collected paths and resize them. Resizing is an important preprocessing step because the number of units in the input layer is fixed. We also scale the pixel values to the range 0 to 1. Finally, we split the images into training and test sets.

In [35]:
x_size = 32
y_size = 32
channel = 3

# Load data
X_tmp = []
for i in range(len(X_list)):
    img = Image.open(X_list[i]).convert('RGB')
    img = img.resize((x_size, y_size))
    # Scale pixel values to [0, 1] and flatten to one row per image.
    img = np.asarray(img) / 255.
    X_tmp.append(img.flatten())
X_tmp = np.array(X_tmp)
Y_tmp = np.array(Y_list)

# Split images
test_size = 0.2

X_train, X_test, y_train, y_test = train_test_split(X_tmp, Y_tmp, test_size=test_size, random_state=16)

# Reshape images to NCHW and binarize the labels
X_train = X_train.reshape(-1, x_size, y_size, channel)
X_train = X_train.transpose(0, 3, 1, 2)
labels_train = LabelBinarizer().fit_transform(y_train).astype(np.float32)

X_test = X_test.reshape(-1, x_size, y_size, channel)
X_test = X_test.transpose(0, 3, 1, 2)
labels_test = LabelBinarizer().fit_transform(y_test).astype(np.float32)
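
A small sanity check (not in the original notebook) confirms that the arrays are in the NCHW layout, (samples, channels, height, width), that the network expects:

# Expected: (n_samples, 3, 32, 32) with matching label rows.
print(X_train.shape, labels_train.shape)
print(X_test.shape, labels_test.shape)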

Definition of a neural network with the sequential model

In [36]:
sequential = rm.Sequential([
        rm.Conv2d(channel=32),             # 32 convolution filters
        rm.Relu(),
        rm.MaxPool2d(filter=2, stride=2),  # halve the spatial resolution
        rm.Dropout(dropout_ratio=0.25),
        rm.Flatten(),
        rm.Dense(256),                     # fully connected hidden layer
        rm.Relu(),
        rm.Dropout(dropout_ratio=0.5),
        rm.Dense(2),                       # one output unit per class
    ])
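
To see how many units the Flatten layer feeds into the first Dense layer, you can walk the spatial sizes through the network by hand. The arithmetic below assumes ReNom's Conv2d defaults of a 3x3 filter with no padding and stride 1; if your version uses different defaults, adjust accordingly:

# Spatial-size arithmetic for a 32x32 input (assumed Conv2d defaults:
# filter=3, padding=0, stride=1).
conv_out = (32 - 3) // 1 + 1           # 30x30 feature maps after convolution
pool_out = (conv_out - 2) // 2 + 1     # 15x15 after 2x2 max pooling, stride 2
flat_units = 32 * pool_out * pool_out  # 32 channels -> 7200 flattened units
print(conv_out, pool_out, flat_units)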

Instantiation

In [37]:
# Choose neural network.
network = sequential
# Choose optimizer
optimizer = Adam()
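
Adam generally converges quickly on this task, but the Sgd optimizer imported above is a drop-in alternative. A sketch, assuming Sgd accepts learning-rate and momentum arguments (verify the exact signature against the ReNom documentation):

# Alternative optimizer (assumed signature; see the ReNom docs):
# optimizer = Sgd(lr=0.01, momentum=0.4)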

Training loop

In [38]:
N = len(X_train)

# Hyper parameters
batch = 100
epoch = 10

learning_curve = []
test_learning_curve = []

for i in range(epoch):
    perm = np.random.permutation(N)
    loss = 0
    for j in range(0, N // batch):
        train_batch = X_train[perm[j * batch:(j + 1) * batch]]
        response_batch = labels_train[perm[j * batch:(j + 1) * batch]]

        # Loss function
        network.set_models(inference=False)
        with network.train():
            l = rm.softmax_cross_entropy(network(train_batch), response_batch)

        # Back propagation
        grad = l.grad()

        # Update
        grad.update(optimizer)
        loss += l.as_ndarray()

    train_loss = loss / (N // batch)

    # Validation
    test_loss = 0
    M = len(X_test)
    network.set_models(inference=True)
    for j in range(M//batch):
        test_batch = X_test[j * batch:(j + 1) * batch]
        test_label_batch = labels_test[j * batch:(j + 1) * batch]
        prediction = network(test_batch)
        test_loss += rm.softmax_cross_entropy(prediction, test_label_batch).as_ndarray()
    test_loss /= (j+1)

    test_learning_curve.append(test_loss)
    learning_curve.append(train_loss)
    print("epoch %03d train_loss:%f test_loss:%f"%(i, train_loss, test_loss))

epoch 000 train_loss:0.845165 test_loss:0.472857
epoch 001 train_loss:0.263124 test_loss:0.168094
epoch 002 train_loss:0.126543 test_loss:0.085918
epoch 003 train_loss:0.059745 test_loss:0.073824
epoch 004 train_loss:0.051910 test_loss:0.061211
epoch 005 train_loss:0.035872 test_loss:0.059432
epoch 006 train_loss:0.023361 test_loss:0.039140
epoch 007 train_loss:0.018275 test_loss:0.029468
epoch 008 train_loss:0.017973 test_loss:0.027034
epoch 009 train_loss:0.011490 test_loss:0.020061

Model evaluation

We evaluated the model's labeling performance using a confusion matrix. The classifier performs well: the F1 score, one measure of a test's accuracy, is 0.99.

In [39]:
network.set_models(inference=True)
predictions = np.argmax(network(X_test).as_ndarray(), axis=1)

# Confusion matrix and classification report.
test_seikai_label = np.argmax(y_test, axis=1).reshape(-1, 1)  # true labels ("seikai" means "correct answer")
print(confusion_matrix(test_seikai_label, predictions))
print(classification_report(test_seikai_label, predictions))

# Learning curve.
plt.plot(learning_curve, linewidth=3, label="train")
plt.plot(test_learning_curve, linewidth=3, label="test")
plt.title("Learning curve")
plt.ylabel("error")
plt.xlabel("epoch")
plt.legend()
plt.grid()
plt.show()
[[148   2]
 [  1 169]]
             precision    recall  f1-score   support

          0       0.99      0.99      0.99       150
          1       0.99      0.99      0.99       170

avg / total       0.99      0.99      0.99       320

[Figure: learning curve, training and test error per epoch]
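
The report's precision, recall, and F1 values can be reproduced by hand from the confusion matrix above. A worked example for class 0, the first entry of class_list:

# Class-0 metrics derived from the printed confusion matrix.
cm = np.array([[148,   2],
               [  1, 169]], dtype=float)
precision = cm[0, 0] / cm[:, 0].sum()  # 148 / 149 ~ 0.993
recall = cm[0, 0] / cm[0, :].sum()     # 148 / 150 ~ 0.987
f1 = 2 * precision * recall / (precision + recall)  # ~ 0.99
print(precision, recall, f1)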

Wrong classification

What did the misclassified images look like? Here, we show each of them.

In [40]:
# Indices where the predicted label differs from the true label.
diff = test_seikai_label.reshape(-1,) - predictions
diff_num = np.count_nonzero(diff)
diff_list = np.where(diff != 0)

for i in range(diff_num):
    # Display the first channel of each misclassified image in grayscale.
    plt.imshow(X_test[diff_list][i][0], 'gray')
    plt.show()

[Figures: the three misclassified test images]

Some of these images were difficult to distinguish even with our own eyes.