Functional Model

An introduction to building a functional model.

In this tutorial, you can learn the following:

  • How to define a neural network.
  • How to train it.
  • How to access each layer's weight parameters.

No datasets or GPUs are used in this tutorial. Only ReNom version 2.0 and the numpy module are required.

Required modules

  • numpy 1.12.1
In [1]:
from __future__ import division, print_function
import numpy as np
import renom as rm
from renom.optimizer import Sgd
from renom.cuda.cuda import set_cuda_active

Create original model class

In ReNom, there are two main ways to define a neural network model: the sequential model and the functional model.
The sequential model is recommended for simple neural networks.
A complicated model, however, can be difficult to describe sequentially.
In such cases, the functional model is useful.
The class renom.Model helps prevent memory leaks and makes the code easier to understand. Moreover, utilities for saving and copying weights are designed for the model class.
Let's start by defining our own model class using renom.Model.
You can define the learnable layers (parameters) in __init__() as attributes of the class.
The forward calculation is then defined in the forward() method.
In [2]:
class Tutorial02(rm.Model):

    def __init__(self):
        # Definition of learnable layer.
        # Layer objects are created but weights are not created yet.
        self._layer1 = rm.Dense(100)
        self._layer2 = rm.Dense(10)

    # Definition of forward calculation.
    def forward(self, x):
        return self._layer2(rm.relu(self._layer1(x)))

Dense object

A fully connected layer is represented by a Dense object, as described below:

z = W・x + b

where W and b are the learnable parameters.

The class Tutorial02, which is a 2 layer neural network, can be written as follows:

z = w2・activation(w1・x + b1) + b2
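
As a rough illustration of the two formulas above, the following numpy sketch computes a dense layer and then the two-layer composition. The helper names dense and relu, the small shapes, and the row-vector convention (each record is a row, so the product is written x·w with w of shape (inputs, outputs)) are assumptions made for this sketch; it is not ReNom's implementation.

```python
import numpy as np

def dense(x, w, b):
    # Affine map applied to every row of x: z = x.w + b
    return np.dot(x, w) + b

def relu(v):
    # Element-wise max(v, 0)
    return np.maximum(v, 0.0)

rng = np.random.RandomState(0)
x = rng.rand(4, 3)                       # 4 records with 3 features each
w1, b1 = rng.randn(3, 5), np.zeros(5)    # first layer: 3 -> 5
w2, b2 = rng.randn(5, 2), np.zeros(2)    # second layer: 5 -> 2

# z = w2 . activation(w1 . x + b1) + b2
z = dense(relu(dense(x, w1, b1)), w2, b2)
print(z.shape)  # (4, 2)
```

The output has one row per input record and one column per unit of the last layer.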

All objects that have learnable parameters inherit from the Parametrized class.

In [3]:
# Instantiation.
layer = rm.Dense(2)
layer(np.random.rand(1, 2))

# Type
print(type(layer))

# Learnable parameters.
# 'params' is a dictionary.
keys = layer.params.keys()
print("This object has {} learnable parameters {}".format(len(keys), keys))

# Confirmation of the inheritance
print("Is this object a child of Parametrized?",
      isinstance(layer, rm.Parametrized))
<class 'renom.layers.function.dense.Dense'>
This object has 2 learnable parameters dict_keys(['w', 'b'])
Is this object a child of Parametrized? True
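
The cell above calls the layer once before reading params, which fits the earlier note that layer objects are created before their weights. One way such lazy creation can work is sketched below with a hypothetical LazyDense class written in plain numpy. The class name, the initialization scale, and the internals are assumptions for illustration only and do not describe how ReNom implements Dense.

```python
import numpy as np

class LazyDense(object):
    """Hypothetical layer that allocates its weights on the first call."""

    def __init__(self, output_size):
        self.output_size = output_size
        self.params = {}   # stays empty until the first forward pass

    def __call__(self, x):
        if not self.params:
            # The input width is only known here, so the weights are created now.
            input_size = x.shape[1]
            self.params["w"] = np.random.randn(input_size, self.output_size) * 0.01
            self.params["b"] = np.zeros(self.output_size)
        return np.dot(x, self.params["w"]) + self.params["b"]

layer = LazyDense(2)
print(sorted(layer.params.keys()))   # []
out = layer(np.random.rand(1, 2))
print(sorted(layer.params.keys()))   # ['b', 'w']
```

Deferring allocation lets the layer be declared with only its output size, since the input size is inferred from the first batch.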

Activation functions

In the class Tutorial02, we use rm.relu as the activation function. For other activation functions, please see the API section of the documentation.

In [4]:
r = np.random.randn(2)
a = rm.relu(r)
print("func    :          input            →          output  ")
print("relu    : {} → {}".format(r, a))
a = rm.sigmoid(r)
print("sigmoid : {} → {}".format(r, a))
a = rm.tanh(r)
print("tanh    : {} → {}".format(r, a))
func    :          input            →          output
relu    : [-0.10757246  1.86587957] → [ 0.          1.86587954]
sigmoid : [-0.10757246  1.86587957] → [ 0.47313279  0.8659808 ]
tanh    : [-0.10757246  1.86587957] → [-0.10715944  0.95321912]
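
For reference, the three activations can be written directly in numpy using their standard mathematical definitions. The sketch below applies them to the input values printed above; the helper names relu and sigmoid are local to this example and are not ReNom functions.

```python
import numpy as np

def relu(v):
    # max(v, 0), element-wise
    return np.maximum(v, 0.0)

def sigmoid(v):
    # 1 / (1 + exp(-v)), element-wise
    return 1.0 / (1.0 + np.exp(-v))

r = np.array([-0.10757246, 1.86587957])   # input values from the output above
print("relu    :", relu(r))
print("sigmoid :", sigmoid(r))
print("tanh    :", np.tanh(r))
```

Up to floating-point precision, the results agree with the values printed by the ReNom cell.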

Execute forward propagation

After preparing the input data x and the target data y, instantiate the model class and execute forward propagation.

In [5]:
# The input matrix has 10 records, each with 100 dimensions.
x = np.random.rand(10, 100)
# The target matrix has 10 records, each with 10 dimensions.
y = np.random.rand(10, 10)

# Instantiation.
model = Tutorial02()

# Forward propagation.
z = model(x)
print("Output shape is {}.".format(z.shape))
Output shape is (10, 10).

Define loss function

To train the model, we use gradient descent to update the weight parameters. First, we have to define an objective function.

renom.mean_squared_error(z, y) is a loss function that measures the distance between z and y (ex.1). You can also write the same loss out explicitly without the function (ex.2).

In [6]:
# These are equivalent.
loss = rm.mean_squared_error(model(x), y) # ex.1
loss = rm.sum((model(x) - y)**2)/10/2      # ex.2
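
The arithmetic of ex.2 can be checked by hand on a tiny example. The numpy sketch below assumes that the 10 in ex.2 is the number of records (the batch size) and not the number of output dimensions, which also happens to be 10 in this tutorial. The function name mse_ex2 is hypothetical.

```python
import numpy as np

def mse_ex2(z, y):
    # Sum of squared errors, divided by the number of records and by 2.
    n_records = z.shape[0]
    return np.sum((z - y) ** 2) / n_records / 2

z = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([[0.0, 2.0], [3.0, 6.0]])
print(mse_ex2(z, y))   # (1 + 0 + 0 + 4) / 2 / 2 = 1.25
```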

Execute backward propagation

Call the grad() method of the loss to execute backpropagation. It returns a Grad object, which contains references to the learnable parameters. These references allow you to update the learnable parameters by calling the update() method of the Grad object.

At this point, there are two differences between ReNom versions 1 and 2: one is the name of the backpropagation function, and the other is the with block.

Because of auto-differentiation, computational graphs are generated throughout the code. This can cause memory leaks on both the CPU and the GPU.

Therefore, in ReNom version 2, auto-differentiation is only enabled inside the with block.

'with model.train():' means that the computational graph containing the learnable parameters of the 'model' instance can be generated.

Without the with block, the learnable parameters will never be updated.

In [7]:
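for _ in range(5):
    with model.train():
        loss = rm.sum((model(x) - y)**2)/10/2
    grad = loss.grad()  # Backpropagation; returns a Grad object.
    grad.update()       # Update the learnable parameters.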
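
The idea of enabling graph generation only inside a with block can be illustrated with an ordinary Python context manager. The ToyModel class below is entirely hypothetical: it records forward calls in a list only while its train() block is active. It is meant to show the pattern, not how ReNom tracks its computational graph.

```python
from contextlib import contextmanager

class ToyModel(object):
    """Hypothetical model whose operations are recorded only inside train()."""

    def __init__(self):
        self.recording = False
        self.graph = []   # stands in for a computational graph

    @contextmanager
    def train(self):
        self.recording = True
        try:
            yield self
        finally:
            self.recording = False   # recording is always switched off again

    def __call__(self, x):
        if self.recording:
            self.graph.append(("forward", x))
        return 2 * x

toy = ToyModel()
toy(1)                  # outside the block: nothing is recorded
with toy.train():
    toy(2)              # inside the block: the call is recorded
print(len(toy.graph))   # 1
```

Because nothing is recorded outside the block, there is nothing to differentiate there, which mirrors the statement that parameters are never updated without the with block.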

Update weights with an optimizer

ReNom provides several optimizers, such as stochastic gradient descent (Sgd).

In [8]:
optimizer = Sgd(lr=0.01)
for _ in range(5):
    with model.train():
        loss = rm.sum((model(x) - y)**2)/10/2
    # Backpropagate, then update the weights with the optimizer.
    loss.grad().update(optimizer)
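
To make the gradient descent update concrete, the following numpy sketch trains the same two-layer architecture on the same loss with the gradients derived by hand. It is a stand-in for what backpropagation followed by an SGD update computes, under assumed initial weights (small random values, chosen arbitrarily here); ReNom obtains the gradients by auto-differentiation and its initialization may differ.

```python
import numpy as np

rng = np.random.RandomState(0)
x = rng.rand(10, 100)   # 10 records, 100 dimensions
y = rng.rand(10, 10)    # 10 records, 10 dimensions
n = x.shape[0]

# Assumed initialization for this sketch only.
w1, b1 = rng.randn(100, 100) * 0.01, np.zeros(100)
w2, b2 = rng.randn(100, 10) * 0.01, np.zeros(10)
lr = 0.01
losses = []

for _ in range(5):
    # Forward pass: z = w2 . relu(w1 . x + b1) + b2
    h_pre = np.dot(x, w1) + b1
    h = np.maximum(h_pre, 0.0)
    z = np.dot(h, w2) + b2
    losses.append(np.sum((z - y) ** 2) / n / 2)

    # Backward pass: chain rule applied by hand.
    dz = (z - y) / n
    dw2, db2 = np.dot(h.T, dz), dz.sum(axis=0)
    dh_pre = np.dot(dz, w2.T) * (h_pre > 0)
    dw1, db1 = np.dot(x.T, dh_pre), dh_pre.sum(axis=0)

    # SGD update: parameter <- parameter - lr * gradient
    w1 -= lr * dw1
    b1 -= lr * db1
    w2 -= lr * dw2
    b2 -= lr * db2

print(losses)
```

With this small learning rate the recorded loss shrinks slightly at every step.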

Access the weight parameters

Once weight parameters have been created by forward propagation, you can access them through the 'params' attribute.

'params' is a dictionary which contains learnable parameters.

In [9]:
# Confirm the weight parameters
print("keys of weight")
print(model._layer1.params.keys())

# Get the parameters in either of the following ways
print("The weight of first layer's 'w'.")
print(model._layer1.params.w[:2, 0])
print(model._layer1.params["w"][:2, 0])


# Initialize the parameters with random values
shape = model._layer1.params.w.shape
model._layer1.params.w = rm.Variable(np.random.randn(*shape)*0.1)
print("Set another value to the above weight.")
print(model._layer1.params.w[:2, 0])
keys of weight
dict_keys(['w', 'b'])

The weight of first layer's 'w'.
[ -7.86825895 -13.92789936]
[ -7.86825895 -13.92789936]

Set another value to the above weight.
[ 0.056779   -0.09167486]
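
The cell above reads the same parameter both as an attribute (params.w) and as a dictionary item (params["w"]). One common way to support both styles is a dict subclass that forwards attribute access to its items. The AttrDict class below is a hypothetical sketch of that pattern; whether ReNom's params object is implemented this way is an assumption.

```python
import numpy as np

class AttrDict(dict):
    """Hypothetical dictionary whose items are also readable as attributes."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        # Attribute assignment stores the value as a dictionary item.
        self[name] = value

params = AttrDict(w=np.zeros((3, 2)), b=np.zeros(2))
print(params.w is params["w"])   # True

# Replacing a parameter through attribute access updates the dictionary too.
params.w = np.ones((3, 2))
print(params["w"][0, 0])         # 1.0
```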