Auto Encoder Visualization

Visualizing Features of Explanatory Variables Using an Auto Encoder

Features of explanatory variables are important for analyzing data characteristics and for guiding feature engineering.
An Auto Encoder can be used as one of the dimensionality reduction methods, and its hidden units can be interpreted as compressed, summarized values of the explanatory variables. However, every visualization method, such as PCA or t-SNE, has its own characteristics.
In this tutorial we look at the characteristics of Auto Encoder visualization.

Requirements

For this tutorial, you will need the following modules:

  • numpy 1.13.1
  • matplotlib 2.0.2
  • scikit-learn 0.18.2
In [1]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from sklearn.preprocessing import LabelBinarizer
from sklearn.datasets import load_iris, fetch_mldata

import renom as rm
from renom.optimizer import Adam
from renom.cuda import set_cuda_active
set_cuda_active(False)

Sample Balance

Load the data and display the number of samples for each class.

In [2]:
iris = load_iris()
X = iris.data
y = iris.target

X = X.astype(np.float32)
y = y.astype(np.float32)

# Count the number of samples for each class label
left = []
height = []
label = []
uniq_label = list(set(y))
for i in range(0, len(uniq_label)):
    label.append(uniq_label[i])
    left.append(uniq_label[i])
    height.append(len(y[y==uniq_label[i]]))
plt.bar(left, height, color="black", tick_label=label, align="center")
plt.xlabel("Label")
plt.ylabel("Samples")
plt.show()
[Output figure: bar chart of sample counts per iris class]

Define the Model

Define the Auto Encoder model and implement a predict function to use for visualization.
The hidden layer has 2 units.
The output is trained to reconstruct the input, so the role of the hidden units is to find a simple representation of the explanatory variables.
In general, when the explanatory variables are correlated with each other, the Auto Encoder can compress the input information.
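As a quick check of that correlation, a minimal sketch (assuming X still holds the iris feature matrix loaded above) is to print the correlation matrix of the four explanatory variables:

# Correlation matrix of the four iris features
# (rows are samples, columns are variables, so rowvar=False)
corr = np.corrcoef(X, rowvar=False)
print(np.round(corr, 2))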
In [3]:
lb = LabelBinarizer().fit(y)
N = len(X)
class AutoEncoder(rm.Model):
    def __init__(self):
        super(AutoEncoder, self).__init__()
        self.layer1 = rm.Dense(2)
        self.layer2 = rm.Dense(4)

    def forward(self, X):
        t1 = self.layer1(X)
        out = self.layer2(t1)
        return out

    def predict(self, X, y):
        # Project the input onto the 2-dimensional hidden layer
        t1 = self.layer1(X)
        uniq_label = list(set(y))
        # Scatter plot of the hidden representation, colored by class label
        for i in range(0, len(uniq_label)):
            mask = y == uniq_label[i]
            plt.scatter(t1[mask, :][:, 0], t1[mask, :][:, 1],
                        color=cm.get_cmap("tab20").colors[i],
                        label=str(uniq_label[i]), alpha=0.5)
        plt.legend()
        plt.show()
        out = self.layer2(t1)
        return out

Run the Model

We use the Auto Encoder as a visualization method, but its training result depends on the random seed and the weight initialization.
So the output differs from run to run, and sometimes a very different representation appears.
This is the first Auto Encoder run.
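If reproducible results are needed, a minimal sketch is to fix NumPy's random seed before building the model (this assumes ReNom draws its initial weights from NumPy's global random state):

# Fix the seed so that the weight initialization, and therefore the learned
# 2-dimensional representation, is the same on every run (example seed value).
np.random.seed(10)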
In [4]:
model = AutoEncoder()
optimizer = Adam()

batch = 64
epoch = 100
for i in range(epoch):
    # Mini-batch training: reconstruct each batch and minimize the MSE
    # between the reconstruction and the input itself.
    for j in range(N//batch):
        train_batch = X[j*batch : (j+1)*batch]
        with model.train():
            z = model.forward(train_batch)
            loss = rm.mse(z, train_batch)
        loss.grad().update(optimizer)
    if i%5 == 0:
        print("epoch %2d train_loss:%f" % (i, loss))

pred = model.predict(X, y)
epoch  0 train_loss:66.602905
epoch  5 train_loss:63.385071
epoch 10 train_loss:60.402035
epoch 15 train_loss:57.676476
epoch 20 train_loss:55.197670
epoch 25 train_loss:52.943356
epoch 30 train_loss:50.887455
epoch 35 train_loss:49.004417
epoch 40 train_loss:47.271103
epoch 45 train_loss:45.667252
epoch 50 train_loss:44.175373
epoch 55 train_loss:42.780369
epoch 60 train_loss:41.469101
epoch 65 train_loss:40.230038
epoch 70 train_loss:39.052925
epoch 75 train_loss:37.928589
epoch 80 train_loss:36.848644
epoch 85 train_loss:35.805466
epoch 90 train_loss:34.792000
epoch 95 train_loss:33.801750
[Output figure: 2-dimensional hidden representation of the iris data, first run]

Run the Model Again

This is the second Auto Encoder run. Compare the resulting scatter plot with the first run.

In [5]:
model = AutoEncoder()
optimizer = Adam()

batch = 64
epoch = 100
for i in range(epoch):
    for j in range(N//batch):
        train_batch = X[j*batch : (j+1)*batch]
        with model.train():
            z = model.forward(train_batch)
            loss = rm.mse(z, train_batch)
        loss.grad().update(optimizer)
    if i%5 == 0:
        print("epoch %2d train_loss:%f" % (i, loss))

pred = model.predict(X, y)
epoch  0 train_loss:53.558716
epoch  5 train_loss:51.806366
epoch 10 train_loss:50.138603
epoch 15 train_loss:48.602467
epoch 20 train_loss:47.198566
epoch 25 train_loss:45.918667
epoch 30 train_loss:44.751385
epoch 35 train_loss:43.684692
epoch 40 train_loss:42.707012
epoch 45 train_loss:41.807610
epoch 50 train_loss:40.976612
epoch 55 train_loss:40.204933
epoch 60 train_loss:39.484173
epoch 65 train_loss:38.806442
epoch 70 train_loss:38.164253
epoch 75 train_loss:37.550442
epoch 80 train_loss:36.958046
epoch 85 train_loss:36.380226
epoch 90 train_loss:35.810226
epoch 95 train_loss:35.241344
[Output figure: 2-dimensional hidden representation of the iris data, second run]
So far we have visualized the iris dataset, which has only four input dimensions and three classes, so it was a very easy case.
Next, we use the MNIST dataset, which has ten classes and 784 input dimensions.
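Note that fetch_mldata has been removed from newer scikit-learn releases. If you are on a newer version than the one listed in the requirements, a rough equivalent is sketched below (the exact return types vary between scikit-learn versions):

# Alternative download for scikit-learn releases without fetch_mldata.
from sklearn.datasets import fetch_openml

mnist = fetch_openml("mnist_784", version=1, data_home=".")
X = np.asarray(mnist.data, dtype=np.float32) / 255
y = np.asarray(mnist.target, dtype=np.float32)  # OpenML labels arrive as strings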
In [6]:
mnist = fetch_mldata("MNIST original", data_home=".")
X = mnist.data / 255  # scale pixel values to [0, 1]
y = mnist.target

X = X.astype(np.float32)
y = y.astype(np.float32)

left = []
height = []
label = []
uniq_label = list(set(y))
for i in range(0, len(uniq_label)):
    label.append(uniq_label[i])
    left.append(uniq_label[i])
    height.append(len(y[y==uniq_label[i]]))
plt.clf()
plt.bar(left, height, color="black", tick_label=label, align="center")
plt.xlabel("Label")
plt.ylabel("Samples")
plt.show()
[Output figure: bar chart of sample counts per MNIST digit class]

Define the Model for MNIST

In [7]:
lb = LabelBinarizer().fit(y)
N = len(X)
class AutoEncoder(rm.Model):
    def __init__(self):
        super(AutoEncoder, self).__init__()
        self.layer1 = rm.Dense(2)
        self.layer2 = rm.Dense(784)

    def forward(self, X):
        t1 = self.layer1(X)
        out = self.layer2(t1)
        return out

    def predict(self, X, y):
        # Project the input onto the 2-dimensional hidden layer
        t1 = self.layer1(X)
        uniq_label = list(set(y))
        # Scatter plot of the hidden representation, colored by class label
        for i in range(0, len(uniq_label)):
            mask = y == uniq_label[i]
            plt.scatter(t1[mask, :][:, 0], t1[mask, :][:, 1],
                        color=cm.get_cmap("tab20").colors[i],
                        label=str(uniq_label[i]), alpha=0.5)
        plt.legend()
        plt.show()
        out = self.layer2(t1)
        return out

Run the Model

This is the Auto Encoder run for MNIST.

In [8]:
model = AutoEncoder()
optimizer = Adam()

batch = 64
epoch = 100
for i in range(epoch):
    for j in range(N//batch):
        train_batch = X[j*batch : (j+1)*batch]
        with model.train():
            z = model.forward(train_batch)
            loss = rm.mse(z, train_batch)
        loss.grad().update(optimizer)
    if i%5 == 0:
        print("epoch %2d train_loss:%f" % (i, loss))

pred = model.predict(X, y)
epoch  0 train_loss:18.098236
epoch  5 train_loss:19.707720
epoch 10 train_loss:18.138201
epoch 15 train_loss:18.016071
epoch 20 train_loss:17.998529
epoch 25 train_loss:17.979641
epoch 30 train_loss:17.986599
epoch 35 train_loss:17.984344
epoch 40 train_loss:18.061237
epoch 45 train_loss:18.016376
epoch 50 train_loss:18.029505
epoch 55 train_loss:18.093777
epoch 60 train_loss:18.066256
epoch 65 train_loss:18.138508
epoch 70 train_loss:18.124315
epoch 75 train_loss:18.271906
epoch 80 train_loss:18.221874
epoch 85 train_loss:18.161594
epoch 90 train_loss:18.229214
epoch 95 train_loss:18.162909
[Output figure: 2-dimensional hidden representation of the MNIST data]
As we can see above, it is difficult to visualize the characteristics of data that has many classes and dimensions.
Unlike PCA and t-SNE, the Auto Encoder tries to find the representation that best reconstructs the input.
So we consider the Auto Encoder visualization useful as a rough measure of whether a fully connected network can classify the data easily.
We cannot say which visualization method is the best, but as we can see, each method has its own characteristics and focuses on a different aspect of the data.
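For comparison, a minimal sketch of PCA and t-SNE projections with scikit-learn is shown below (assuming X and y still hold the MNIST features and labels loaded above; t-SNE on the full dataset is slow, so an arbitrary subsample of 2000 points is used here as an example):

from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Subsample to keep t-SNE tractable (example size, adjust as needed).
idx = np.random.choice(len(X), size=min(len(X), 2000), replace=False)
X_sub, y_sub = X[idx], y[idx]

for name, emb in [("PCA", PCA(n_components=2).fit_transform(X_sub)),
                  ("t-SNE", TSNE(n_components=2).fit_transform(X_sub))]:
    # Scatter plot of the 2-dimensional projection, colored by class label
    for label in sorted(set(y_sub)):
        mask = y_sub == label
        plt.scatter(emb[mask, 0], emb[mask, 1], label=str(label), alpha=0.5)
    plt.title(name)
    plt.legend()
    plt.show()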