Tutorial 0.2 Functional Model

An introduction to building a functional model.

In this tutorial, you can learn the following:

  • How to define a neural network.
  • How to make it learn.
  • How to access each layer's weight parameters.

No datasets or GPUs are used in this tutorial. ReNom version 2.0 and the numpy module are required.

Required modules

In [1]:
from __future__ import division, print_function
import numpy as np
import renom as rm
from renom.optimizer import Sgd
from renom.cuda.cuda import set_cuda_active

Create original model class

In ReNom, it is recommended to define your neural network as a class that inherits from renom.Model. The renom.Model class prevents memory leaks and makes the code easier to understand. Moreover, utilities such as those for saving and copying weights are designed for the model class.

Start by defining our own model class that inherits from renom.Model.

Then define the learnable layers (parameters) in the __init__() method as attributes of the class.

Finally, the forward calculation is defined in the forward() method.

In [2]:
class Tutorial02(rm.Model):

    def __init__(self):
        # Definition of learnable layer.
        # Layer objects are created but weights are not created yet.
        self._layer1 = rm.Dense(100)
        self._layer2 = rm.Dense(10)

    # Definition of forward calculation.
    def forward(self, x):
        return self._layer2(rm.relu(self._layer1(x)))

Dense object

A dense object represents a fully connected layer as described below:

z = W·x + b

where W and b are the learnable parameters.

The class Tutorial02, which is a 2-layer neural network, can be written as follows:

z = W2·activation(W1·x + b1) + b2

All objects that have learnable parameters inherit from the Parametrized class.
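
To make the two-layer formula concrete, here is a minimal NumPy-only sketch of the same computation (the shapes and the use of relu as the activation are illustrative assumptions; in practice ReNom creates and manages these weights for you):

# Plain NumPy version of z = W2·activation(W1·x + b1) + b2,
# written in the row-vector convention (x has shape (n_records, n_dims)).
x = np.random.rand(1, 100)                        # one record with 100 dims
W1, b1 = np.random.randn(100, 100), np.zeros(100)
W2, b2 = np.random.randn(100, 10), np.zeros(10)
h = np.maximum(x.dot(W1) + b1, 0)                 # relu(W1·x + b1)
z = h.dot(W2) + b2                                # W2·h + b2
print(z.shape)                                    # (1, 10)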

In [3]:
# Instantiation.
layer = rm.Dense(2)
layer(np.random.rand(1, 2))

# Type
print(type(layer))

# Learnable parameters.
# 'params' is a dictionary.
keys = layer.params.keys()
print("This object has {} learnable parameters {}".format(len(keys), keys))

# Confirmation of the inheritance
print("Is this object a child of Parametrized?",
      isinstance(layer, rm.Parametrized))
<class 'renom.layers.function.dense.Dense'>
This object has 2 learnable parameters ['b', 'w']
Is this object a child of Parametrized? True
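
As a sanity check, the output of the Dense layer above can be compared with the formula z = W·x + b. This is only a sketch: it assumes the row-vector convention (z = x·W + b for a batch of row vectors) and that both the layer output and its parameters can be converted to arrays with as_ndarray().

x = np.random.rand(1, 2)
z = layer(x)
manual = x.dot(layer.params.w.as_ndarray()) + layer.params.b.as_ndarray()
print(np.allclose(z.as_ndarray(), manual))        # expected: True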

Activation functions

In the class Tutorial02, we use rm.relu as the activation function. For other activation functions, please check the API section of the documentation.

In [4]:
r = np.random.randn(2)
a = rm.relu(r)
print("func    :          input            →          output  ")
print("relu    : {} → {}".format(r, a))
a = rm.sigmoid(r)
print("sigmoid : {} → {}".format(r, a))
a = rm.tanh(r)
print("tanh    : {} → {}".format(r, a))
func    :          input            →          output
relu    : [ 0.46189755  1.45523968] → [ 0.46189755  1.45523965]
sigmoid : [ 0.46189755  1.45523968] → [ 0.61346424  0.81080353]
tanh    : [ 0.46189755  1.45523968] → [ 0.43162951  0.8967241 ]
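
For reference, these activations correspond to the following element-wise formulas. The sketch below is plain NumPy, not ReNom code; applied to the same input it should reproduce the outputs shown above.

def relu_np(x):
    return np.maximum(x, 0)            # max(0, x)

def sigmoid_np(x):
    return 1.0 / (1.0 + np.exp(-x))    # 1 / (1 + e^(-x))

def tanh_np(x):
    return np.tanh(x)                  # hyperbolic tangent

print(relu_np(r), sigmoid_np(r), tanh_np(r))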

Execute forward propagation

After preparing the input data x and the target data y, instantiate the model and execute forward propagation.

In [5]:
# The input matrix has 10 records, each with 100 dimensions.
x = np.random.rand(10, 100)
# The target matrix has 10 records, each with 10 dimensions.
y = np.random.rand(10, 10)

# Instantiation.
model = Tutorial02()

# Forward propagation.
z = model(x)
print("Output shape is {}.".format(z.shape))
Output shape is (10, 10).

Define loss function

For building the model, we use the gradient descent method to update the weight parameters. First, we have to define an objective function.

renom.mean_squared_error(z, y) is a loss function that measures the distance between z and y (ex.1). You can also write it without the function (ex.2).

In [6]:
# These two expressions are equivalent.
loss = rm.mean_squared_error(model(x), y) # ex.1
loss = rm.sum((model(x) - y)**2)/10/2      # ex.2
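
According to the text above, the two expressions give the same value. The following sketch checks this numerically (it assumes both results can be converted to arrays with as_ndarray()):

l1 = rm.mean_squared_error(model(x), y)    # ex.1
l2 = rm.sum((model(x) - y)**2)/10/2        # ex.2
print(np.allclose(l1.as_ndarray(), l2.as_ndarray()))   # expected: True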

Execute backward propagation

Call the method grad() to execute backpropagation. grad() returns a Grad class object, which contains references to the learnable parameters. Therefore, you can update the learnable parameters by calling the update() method of the Grad object.

At this point, there are two differences between ReNom versions 1 and 2. One is the name of the backpropagation function. The other is the with block.

Because of the auto-differentiation function, computational graphs are generated throughout the code. This can cause memory leaks in both the CPU and GPU cases.

Therefore, in ReNom version 2, the auto-differentiation function is only enabled inside the with block.

‘with model.train():’ means that a computational graph containing the learnable parameters of the ‘model’ instance will be generated.

If there is no with block, the learnable parameters will never be updated.

In [7]:
for _ in range(5):
    with model.train():
        loss = rm.sum((model(x) - y)**2)/10/2
    loss.grad().update()
    print(loss.as_ndarray())
3.31950998306
2.04589605331
1.4258275032
1.07277190685
0.85322368145

Update weights with optimizer

ReNom provides several optimizers, such as stochastic gradient descent (Sgd).

In [8]:
optimizer = Sgd(lr = 0.01)
for _ in range(5):
    with model.train():
        loss = rm.sum((model(x) - y)**2)/10/2
    loss.grad().update(optimizer)
    print(loss.as_ndarray())
0.712773025036
0.618365347385
0.527970135212
0.473213016987
0.437764585018
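
Conceptually, plain stochastic gradient descent updates each parameter as w ← w − lr·∂L/∂w. The following NumPy sketch shows one such step; it assumes no momentum term, so check the Sgd arguments in the API reference for the actual defaults.

lr = 0.01
w = np.random.randn(3)      # a parameter
dw = np.random.randn(3)     # its gradient dL/dw
w = w - lr * dw             # one plain SGD step
print(w)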

Access the weight parameters

Once the weight parameters have been created by forward propagation, you can access them through the ‘params’ attribute.

‘params’ is a dictionary which contains learnable parameters.

In [9]:
# Confirm the weight parameters
print("keys of weight")
print(model._layer1.params.keys())

print()

# Get the parameters using either of the following ways
print("The weight of first layer's 'w'.")
print(model._layer1.params.w[:2, 0])
print(model._layer1.params["w"][:2, 0])

print()

# Assign new random values to the weight parameter
shape = model._layer1.params.w.shape
model._layer1.params.w = rm.Variable(np.random.randn(*shape)*0.1)
print("Set another value to the above weight.")
print(model._layer1.params.w[:2, 0])
keys of weight
['b', 'w']

The first layer's weight 'w'.
[-0.07399973  0.03172175]
[-0.07399973  0.03172175]

Set another value to the above weight.
[-0.12289888  0.03108433]
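
As mentioned at the beginning, the model class also comes with utilities for saving and copying weights. The sketch below shows how saving and restoring might look; the save()/load() method names and the HDF5 file name are assumptions here, so please check the API reference for the exact interface.

# Save the learned weights to a file (assumed API).
model.save("tutorial02_weights.h5")

# Restore them into a fresh model instance (assumed API).
model2 = Tutorial02()
model2.load("tutorial02_weights.h5")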