Tutorial 0.1 Auto Differentiation

This is an introduction to automatic differentiation with ReNom.

As an example, we create a dataset from the sine function with small added noise and split it into a train dataset and a test dataset. Then we define a 2-layer neural network in the most naive representation.

Requirements

The following modules are required for this tutorial.

In [1]:
# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import numpy as np
import renom as rm

Data preparation

As stated above, we create a dataset from the sine function. The population of our dataset, population_distribution, is defined below.
The train data and test data are both generated from population_distribution.
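In formula form, each sample below is drawn as

    y = \sin(x) + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0,\, 0.05^2), \qquad x \sim \mathrm{Uniform}(0,\, 2\pi)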
In [2]:
population_distribution = lambda x:np.sin(x) + np.random.randn(*x.shape)*0.05

train_x = np.random.rand(1024, 1)*np.pi*2
train_y = population_distribution(train_x)

test_x = np.random.rand(128, 1)*np.pi*2
test_y = population_distribution(test_x)

The following graph shows the population of the generated dataset. The blue points are the train set, and the orange points are the test set.

In [3]:
plt.clf()
plt.grid()
plt.scatter(train_x, train_y, label="train")
plt.scatter(test_x, test_y, label="test")
plt.title("Population of dataset")
plt.ylabel("y")
plt.xlabel("x")
plt.legend()
plt.show()
[Figure: Population of dataset (scatter plot of the train and test sets)]

Neural network definition

We define a 2-layer neural network. It has 2 weight parameters and 2 bias parameters. These parameters are updated according to their gradients, so they are created as Variable objects.
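Written out, the forward pass and the squared-error loss computed in the cell below are

    z = \tanh(x W_1 + b_1)\, W_2 + b_2, \qquad L = \sum \frac{(z - y)^2}{2}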

In [4]:
INPUT_SIZE = 1
OUTPUT_SIZE = 1
HIDDEN_SIZE = 5

w1 = rm.Variable(np.random.randn(INPUT_SIZE, HIDDEN_SIZE)*0.01)
b1 = rm.Variable(np.zeros((1, HIDDEN_SIZE)))
w2 = rm.Variable(np.random.randn(HIDDEN_SIZE, OUTPUT_SIZE)*0.01)
b2 = rm.Variable(np.zeros((1, OUTPUT_SIZE)))

optimiser = rm.Sgd(0.01)

def nn_forward(x):
    z = rm.dot(rm.tanh(rm.dot(x, w1) + b1), w2) + b2
    return z

def nn(x, y):
    z = nn_forward(x)
    loss = rm.sum(((z - y)**2)/2)
    return loss
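Before writing the training loop, it may help to see what loss.grad() and update do on a single Variable. The following is a minimal sketch using the same calls as above; the get method on the Grads object returned by grad() is assumed here only for inspecting an individual gradient.

v = rm.Variable(np.array([[2.0]]))
sketch_loss = rm.sum(v * v)      # loss = v**2, so d(loss)/dv = 2*v = 4
grads = sketch_loss.grad()       # backpropagation over the recorded graph
print(grads.get(v))              # gradient with respect to v (get() is an assumed accessor)
grads.update(rm.Sgd(0.1))        # SGD step: v <- v - 0.1 * gradient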

Training loop

The training loop is described below.

In [5]:
N = len(train_x)
batch_size = 32
train_curve = []
for i in range(1, 101):
    perm = np.random.permutation(N)
    total_loss = 0
    for j in range(N//batch_size):
        index = perm[j*batch_size:(j+1)*batch_size]
        train_batch_x = train_x[index]
        train_batch_y = train_y[index]
        loss = nn(train_batch_x, train_batch_y)
        loss.grad().update(optimiser)
        total_loss += loss.as_ndarray()
    train_curve.append(total_loss/(j+1))
    if i%10 == 0:
        print("epoch %02d train_loss:%f"%(i, train_curve[-1]))

plt.clf()
plt.grid()
plt.plot(train_curve)
plt.title("Training curve")
plt.ylabel("train error")
plt.xlabel("epoch")
plt.show()
epoch 10 train_loss:1.167116
epoch 20 train_loss:0.470738
epoch 30 train_loss:0.424667
epoch 40 train_loss:0.458149
epoch 50 train_loss:0.422721
epoch 60 train_loss:0.375404
epoch 70 train_loss:0.382299
epoch 80 train_loss:0.265065
epoch 90 train_loss:0.444043
epoch 100 train_loss:0.262017
[Figure: Training curve (train error per epoch)]

Prediction

Finally, we test our model on the test dataset. ReNom operations normally return a Node object, and further computation on it keeps expanding the computational graph. For that reason, the method as_ndarray should be called to obtain a plain ndarray.

In [6]:
predicted = nn_forward(test_x).as_ndarray()
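As a quick check, the raw forward output is a Node subclass (further operations on it would keep growing the graph), while as_ndarray returns a plain numpy array:

out = nn_forward(test_x)
print(type(out))               # a ReNom Node subclass, still attached to the graph
print(type(out.as_ndarray()))  # <class 'numpy.ndarray'>, detached from the graph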

We can confirm that the model approximates the population of the test dataset.

In [7]:
plt.clf()
plt.grid()
plt.scatter(test_x, test_y, label="true")
plt.scatter(test_x, predicted, label="predicted")
plt.title("Prediction result")
plt.ylabel("y")
plt.xlabel("x")
plt.legend()
plt.show()
[Figure: Prediction result (true vs. predicted values on the test set)]