Introduction to Perceptrons and Layers

A Summary of Perceptrons and Layers

In this tutorial, we focus on the perceptrons and layers that make up a neural network. The term perceptron is also used for a function that is compared to the neural network itself, but here we use it to mean the unit used in a neural network.

About Perceptrons

A neural network consists of units called ‘perceptrons’. These units are modeled on neuron cells.

Neuron cells exist in our brain, and it is said that the human brain consists of 10 to 100 billion neurons. When information travels through the brain, a neuron receives electrical pulses, and once it has received a certain amount of electricity, it fires an electrical pulse of its own, sending it on to another neuron.

McCulloch and Pitts [1] used this behavior to present a mathematical model called the ‘perceptron’. A perceptron receives input signals, multiplies each one by a weight constant, takes the sum of all the weighted inputs, and passes that sum through an activation function. The activation function outputs a certain amount of signal, depending on the amount of input received. The diagram below represents the perceptron model.

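The equation below expresses the calculation shown in the diagram above. Here x_1, x_2, ... are the inputs, w_1, w_2, ... are the weights, b is a constant (bias) term, f is the activation function, and y is the output.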

\begin{split}\begin{array}{l} y=f(z)\\ z=w_1x_1+w_2x_2+w_3x_3+\dots+b \end{array}\end{split}
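
As a minimal sketch of the equation above, the weighted sum and the activation can be written with NumPy. The function name `perceptron`, the example inputs and weights, and the identity function used as a stand-in activation are all assumptions made for illustration; none of them come from the tutorial itself.

```python
import numpy as np

def perceptron(x, w, b, f):
    """Compute y = f(z) with z = w_1*x_1 + w_2*x_2 + ... + b."""
    z = np.dot(w, x) + b  # weighted sum of the inputs plus the constant term
    return f(z)

# Hypothetical example values, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -1.0, 0.25])
b = 0.5

# The identity function stands in for the activation function here,
# so y is simply the weighted sum z.
y = perceptron(x, w, b, f=lambda z: z)
print(y)
```

Passing the activation in as the argument `f` keeps the weighted sum separate from the choice of activation function, so any activation can be swapped in without touching the rest of the computation.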

As explained earlier, a neuron fires once it has received a certain amount of electricity. Thus, the first activation function to be used was a threshold function, which outputs 1 when its input reaches the threshold and 0 when it does not.

\begin{split}f(z)=\left\{ \begin{array}{ll} 0 & (z \lt 0) \\ 1 & (z \geq 0) \end{array} \right.\end{split}
In [9]:
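import numpy as np
import matplotlib.pyplot as plt

# Inputs from -30 to 30 in steps of 0.01
x = np.arange(-3000, 3000) / 100
# Threshold function: 1 where the input is >= 0, otherwise 0
plt.plot(x, np.where(x >= 0, 1, 0))
plt.show()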

However, building a network with the threshold function had some drawbacks, such as the difficulty of computing the weight updates. To address this problem, Rumelhart, Hinton, and Williams [2] presented a new type of activation function: the sigmoid function.

f(z) = \frac{1}{1 + \exp(-z)}
In [8]:
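import renom as rm

x = np.arange(-3000, 3000) / 100
# Assumes rm.sigmoid applies the sigmoid element-wise;
# the NumPy equivalent is 1 / (1 + np.exp(-x))
plt.plot(x, rm.sigmoid(x))
plt.show()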

When the magnitude of the input is large, the sigmoid function gives nearly the same output as the threshold function: close to 0 for large negative inputs and close to 1 for large positive inputs. In between, it outputs values in the range (0, 1). Because the sigmoid function is smooth and differentiable everywhere, the weights also become easier to update.
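
The following sketch compares the sigmoid function with the threshold function at a few sample inputs and also evaluates the sigmoid's derivative, f'(z) = f(z)(1 - f(z)). The helper names `sigmoid` and `sigmoid_grad` and the sample inputs are assumptions chosen for illustration. The derivative is included because gradient-based weight updates, such as the back-propagation of [2], depend on it.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid function f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid: f'(z) = f(z) * (1 - f(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

# Hypothetical sample inputs, from strongly negative to strongly positive.
z = np.array([-30.0, -1.0, 0.0, 1.0, 30.0])

s = sigmoid(z)                     # smooth output between 0 and 1
step = np.where(z >= 0, 1.0, 0.0)  # threshold function for comparison
g = sigmoid_grad(z)                # slope of the sigmoid at each input

for zi, si, ti, gi in zip(z, s, step, g):
    print(f"z={zi:6.1f}  sigmoid={si:.6f}  threshold={ti:.0f}  gradient={gi:.6f}")
```

At the two extreme inputs the sigmoid and threshold outputs agree almost exactly, while near zero the sigmoid has a non-zero slope. The threshold function, by contrast, is flat everywhere except at the jump, which gives a gradient-based update nothing to work with.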

Getting values between 0 and 1 is something we should appreciate. If a perceptron can only output 0 or 1, we can only apply it to classification problems. However, because the sigmoid function can output any value between 0 and 1, we can also apply it to regression problems. Moreover, if we build a network of perceptrons, we can construct functions that are applicable to much more difficult non-linear regression or classification problems.
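
A small illustration of what a network of perceptrons can do that a single one cannot is the XOR function, which is not linearly separable. The sketch below wires three threshold perceptrons together. The weights and constants are picked by hand for illustration (they are not learned and do not come from the tutorial), and the names `step` and `xor_network` are hypothetical.

```python
import numpy as np

def step(z):
    """Threshold function: 1 where z >= 0, otherwise 0."""
    return np.where(z >= 0, 1, 0)

def xor_network(x1, x2):
    """Two-layer network of threshold perceptrons with hand-picked weights."""
    x = np.array([x1, x2])

    # Hidden layer: the first perceptron behaves like OR, the second like NAND.
    W_hidden = np.array([[1.0, 1.0],
                         [-1.0, -1.0]])
    b_hidden = np.array([-0.5, 1.5])
    h = step(W_hidden @ x + b_hidden)

    # Output perceptron: behaves like AND on the two hidden outputs.
    w_out = np.array([1.0, 1.0])
    b_out = -1.5
    return int(step(w_out @ h + b_out))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

Each perceptron on its own only draws a single straight decision boundary. Combining the OR and NAND boundaries through the output perceptron produces the non-linear XOR decision.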

Nowadays, many other activation functions have been proposed since the sigmoid function. All of them can likewise represent values other than 0 and 1, so we can also use them to develop networks for difficult non-linear regression or classification problems.

Perceptrons are also called units, nodes, or neurons. Stricter definitions may distinguish these terms, and those definitions differ from person to person, but we believe that treating them as the same term is not a problem.

About Layers

So what are layers? A layer is a group of perceptrons. As mentioned above, a neural network consists of perceptrons, but from a broader view you can also see it as a network of layers. Usually, in a feed-forward calculation, we compute the network one layer at a time. Within a layer, perceptrons do not connect with each other; instead, they connect with the perceptrons in the layers behind and in front of them.
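
The layer-by-layer calculation described above can be sketched with NumPy by storing the weights of every perceptron in a layer as one row of a matrix. The names `layer_forward`, `W1`, `W2`, the layer sizes, and the random weights are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation function."""
    return 1.0 / (1.0 + np.exp(-z))

def layer_forward(x, W, b):
    """Compute one layer: row i of W holds the weights of perceptron i."""
    return sigmoid(W @ x + b)

rng = np.random.default_rng(0)

# Hypothetical sizes: 3 inputs, a hidden layer of 4 perceptrons,
# and an output layer of 2 perceptrons.
x = rng.normal(size=3)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

# Feed-forward calculation: each layer uses only the previous layer's output.
h = layer_forward(x, W1, b1)
y = layer_forward(h, W2, b2)
print(h.shape, y.shape)
```

Because each row of a weight matrix is applied to the same input vector independently, the perceptrons within a layer never read each other's outputs. This matches the description of connections running only between adjacent layers.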


A neural network consists of perceptrons and layers. From a broad perspective, we can see it as a connection of layers; from a narrow view, we can see it as a connection of perceptrons. Perceptrons are based on the human neuron cell, and by implementing them and constructing a network out of them, we can build a function that can solve difficult regression or classification problems.

As a side note, there is no consistent definition of the term perceptron, so its meaning differs depending on who uses it. Because a multilayer perceptron (MLP) is constructed from multiple perceptrons as nodes, we think of a perceptron as a mathematical model representing the function of a neuron.

[1] McCulloch, Warren S., and Walter Pitts. "A logical calculus of the ideas immanent in nervous activity." The Bulletin of Mathematical Biophysics 5.4 (1943): 115-133.

[2] Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams. "Learning representations by back-propagating errors." Nature 323.6088 (1986): 533-536.