# Bike Share Prediction

Bike share prediction model using a fully connected neural network with multiple units in the output layer.

In this section, we’ll construct a fully connected neural network to predict the daily counts of two kinds of bike share users from season, weather, and other attributes. Please download the free data from the UCI website in advance ( https://archive.ics.uci.edu/ml/datasets/Bike+Sharing+Dataset ).

## Required libraries

- matplotlib 2.0.2
- numpy 1.12.1
- scikit-learn 0.18.2
- pandas 0.20.3

```
In [2]:
```

```
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import renom as rm
from renom import Sequential
from renom import Dense, Relu
from renom import Adam
```

## Load & preprocess the data

First of all, we’ll load the data. Although the downloaded folder contains two files, we’ll use only day.csv, because in this tutorial we predict the daily counts of casual and registered users.

```
In [3]:
```

```
df = pd.read_csv("../day.csv")
```

We drop the unnecessary columns (‘instant’, ‘dteday’, ‘cnt’) from the dataframe. ‘dteday’ is dropped because it can be represented by other parameters (season, yr, mnth, holiday, weekday, workingday) and to simplify the preprocessing of the dataset. Including ‘dteday’ might, however, improve the precision of the prediction.
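If you did want to keep the date information, one option (not used in this tutorial) is to convert ‘dteday’ into numeric features before dropping the raw string column. A minimal sketch on a tiny stand-in dataframe:

```python
import pandas as pd

# Tiny stand-in frame; the real day.csv has many more rows and columns.
df = pd.DataFrame({"dteday": ["2011-01-01", "2011-01-02"], "cnt": [985, 801]})

dates = pd.to_datetime(df["dteday"])
df["dayofyear"] = dates.dt.dayofyear  # 1..365/366, captures seasonality
df["dayofweek"] = dates.dt.dayofweek  # 0 = Monday, 6 = Sunday
df = df.drop(["dteday"], axis=1)      # raw string column no longer needed
```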

```
In [4]:
```

```
df1=df.drop(['instant','dteday','cnt'],axis=1)
df1.head()
```

```
Out[4]:
```

| | season | yr | mnth | holiday | weekday | workingday | weathersit | temp | atemp | hum | windspeed | casual | registered |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 1 | 0 | 6 | 0 | 2 | 0.344167 | 0.363625 | 0.805833 | 0.160446 | 331 | 654 |
| 1 | 1 | 0 | 1 | 0 | 0 | 0 | 2 | 0.363478 | 0.353739 | 0.696087 | 0.248539 | 131 | 670 |
| 2 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0.196364 | 0.189405 | 0.437273 | 0.248309 | 120 | 1229 |
| 3 | 1 | 0 | 1 | 0 | 2 | 1 | 1 | 0.200000 | 0.212122 | 0.590435 | 0.160296 | 108 | 1454 |
| 4 | 1 | 0 | 1 | 0 | 3 | 1 | 1 | 0.226957 | 0.229270 | 0.436957 | 0.186900 | 82 | 1518 |

Now, we standardize the data in each column and convert it to a numpy array.

```
In [5]:
```

```
df_s = df1.copy()
col_std = []
col_mean = []
for col in df1.columns:
    v_std = df1[col].std()
    v_mean = df1[col].mean()
    col_std.append(v_std)
    col_mean.append(v_mean)
    df_s[col] = (df_s[col] - v_mean) / v_std
df_s.head()
```

```
Out[5]:
```

| | season | yr | mnth | holiday | weekday | workingday | weathersit | temp | atemp | hum | windspeed | casual | registered |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -1.347291 | -1.000684 | -1.599066 | -0.171863 | 1.497783 | -1.470218 | 1.109667 | -0.826097 | -0.679481 | 1.249316 | -0.387626 | -0.753218 | -1.924153 |
| 1 | -1.347291 | -1.000684 | -1.599066 | -0.171863 | -1.495054 | -1.470218 | 1.109667 | -0.720601 | -0.740146 | 0.478785 | 0.749089 | -1.044499 | -1.913899 |
| 2 | -1.347291 | -1.000684 | -1.599066 | -0.171863 | -0.996248 | 0.679241 | -0.725551 | -1.633538 | -1.748570 | -1.338358 | 0.746121 | -1.060519 | -1.555624 |
| 3 | -1.347291 | -1.000684 | -1.599066 | -0.171863 | -0.497441 | 0.679241 | -0.725551 | -1.613675 | -1.609168 | -0.263001 | -0.389562 | -1.077996 | -1.411417 |
| 4 | -1.347291 | -1.000684 | -1.599066 | -0.171863 | 0.001365 | 0.679241 | -0.725551 | -1.466410 | -1.503941 | -1.340576 | -0.046275 | -1.115863 | -1.370398 |
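The standardization loop above can also be written column-wise in a single expression with pandas broadcasting (an equivalent sketch on a tiny stand-in frame; keeping the per-column mean and std is what lets us convert predictions back to user counts later):

```python
import pandas as pd

# Tiny stand-in for df1; the real frame has the 13 columns shown above.
df1 = pd.DataFrame({"temp": [0.2, 0.4, 0.6], "casual": [100.0, 200.0, 300.0]})

col_mean = df1.mean()              # per-column means
col_std = df1.std()                # per-column (sample) standard deviations
df_s = (df1 - col_mean) / col_std  # broadcasts over every column at once
```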

## Split the data

We split the dataset into training and test sets.

```
In [6]:
```

```
X, y = np.array(df_s.iloc[:, :11]), np.array(df_s.iloc[:, 11:13])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)
```

## Definition of a neural network with the sequential model

A fully connected neural network is constructed because all parameters seem to have an effect on the numbers of casual and registered users. The output layer has two units because there are two predicted quantities. The numbers of units in the other layers are hyperparameters; referring to the Hyperparameter search tutorial is helpful for deciding them.

```
In [7]:
```

```
sequential = Sequential([
    Dense(10),
    Relu(),
    Dense(8),
    Relu(),
    Dense(6),
    Relu(),
    Dense(2)
])
```

## Training loop for the user-count regression

```
In [8]:
```

```
# Parameters
BATCH = 10
EPOCH = 100
optimizer = Adam(lr=0.01)

# Learning curves
learning_curve = []
test_curve = []

# Training loop
for i in range(1, 1 + EPOCH):
    N = X_train.shape[0]  # Number of records in the training data
    perm = np.random.permutation(N)
    train_loss = 0
    for j in range(N // BATCH):
        # Make a mini-batch
        index = perm[j*BATCH:(j+1)*BATCH]
        train_batch_x = X_train[index]
        train_batch_y = y_train[index]
        # Forward propagation
        with sequential.train():
            z = sequential(train_batch_x)
            loss = rm.mean_squared_error(z, train_batch_y)
        # Backpropagation
        grad = loss.grad()
        # Update
        grad.update(optimizer)
        train_loss += loss.as_ndarray()
    # Calculate the mean squared error on the training data
    train_loss = train_loss / (N // BATCH)
    learning_curve.append(train_loss)
    # Calculate the mean squared error on the test data
    y_test_pred = sequential(X_test)
    test_loss = rm.mean_squared_error(y_test_pred, y_test).as_ndarray()
    test_curve.append(test_loss)
    # Print training progress
    if i % 10 == 0:
        print("Epoch %d - loss: %f - test_loss: %f" % (i, train_loss, test_loss))
print('Finished!')
```

```
Epoch 10 - loss: 0.135388 - test_loss: 0.131142
Epoch 20 - loss: 0.119636 - test_loss: 0.117048
Epoch 30 - loss: 0.108519 - test_loss: 0.118423
Epoch 40 - loss: 0.105684 - test_loss: 0.140581
Epoch 50 - loss: 0.099742 - test_loss: 0.130865
Epoch 60 - loss: 0.099561 - test_loss: 0.127347
Epoch 70 - loss: 0.098927 - test_loss: 0.122917
Epoch 80 - loss: 0.095182 - test_loss: 0.140498
Epoch 90 - loss: 0.100924 - test_loss: 0.156761
Epoch 100 - loss: 0.093405 - test_loss: 0.150348
Finished!
```

## Model evaluation

### Plot the learning curve

First, let’s plot the learning curve to confirm whether the model has learned properly.

```
In [9]:
```

```
plt.figure(figsize=(10, 4))
plt.plot(learning_curve, label='train_loss')
plt.plot(test_curve, label='test_loss', alpha=0.6)
plt.title('Learning curve')
plt.xlabel("Epoch")
plt.ylabel("MSE")
plt.ylim(0, 1)
plt.legend()
plt.grid()
```

From the figure above, we can see that the test loss curve starts to deviate from the train loss curve. This means the model starts to overfit the training data; therefore, training should be stopped before the test loss deviates too far from the train loss. Such overfitting is especially likely when the dataset is small.
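One common remedy is early stopping: keep the model from the epoch with the lowest test loss and stop once the test loss has not improved for a while. A framework-agnostic sketch of the bookkeeping over a recorded loss curve such as test_curve (the patience value is an arbitrary choice):

```python
# Patience-based early stopping over a list of per-epoch losses.
def best_epoch(losses, patience=10):
    """Return the 1-based epoch with the lowest loss, stopping the scan
    once `patience` epochs pass without any improvement."""
    best, best_i = float("inf"), 0
    for i, loss in enumerate(losses, start=1):
        if loss < best:
            best, best_i = loss, i
        elif i - best_i >= patience:
            break  # no improvement for `patience` epochs: stop here
    return best_i

# e.g. best_epoch(test_curve) gives the epoch whose weights to keep.
```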

### Compare the actual and predicted counts of users

Next, we compare the actual data with the predicted data.

```
In [10]:
```

```
# predict test value
y_pred = sequential(X_test)
casual_true = y_test[:,:1].reshape(-1, 1) * col_std[11] + col_mean[11]
casual_pred = y_pred[:,:1] * col_std[11] + col_mean[11]
registered_true = y_test[:,1:2].reshape(-1, 1) * col_std[12] + col_mean[12]
registered_pred = y_pred[:,1:2] * col_std[12] + col_mean[12]
plt.figure(figsize=(8, 8))
plt.plot([5, 8000], [5, 8000], c='k', alpha=0.6, label = 'diagonal line') # diagonal line
plt.scatter(casual_true, casual_pred,label='casual')
plt.scatter(registered_true, registered_pred,label='registered')
plt.xlim(0, 8000)
plt.ylim(0, 8000)
plt.xlabel('actual count of users', fontsize=16)
plt.ylabel('predicted count of users', fontsize=16)
plt.legend()
plt.grid()
```

The graph’s x-axis is the actual count of users and its y-axis is the predicted count of users. The black line is the diagonal (y = x); the closer the points are to it, the better the prediction. From the graph, we can see that counts between roughly 1000 and 3000 users are not predicted well. To predict more accurately, it would be necessary to separate the well-predicted data from the poorly predicted data and investigate the cause.
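As a first step in that investigation, the per-day absolute error can be computed and the worst days inspected directly. A minimal NumPy sketch on stand-in arrays (the 1000-user threshold is an arbitrary choice):

```python
import numpy as np

# Stand-ins for the de-standardized arrays computed above.
registered_true = np.array([500.0, 1500.0, 2500.0, 4000.0])
registered_pred = np.array([600.0, 2600.0, 2400.0, 3900.0])

abs_err = np.abs(registered_true - registered_pred)
bad = abs_err > 1000            # boolean mask of poorly predicted days
bad_days = np.flatnonzero(bad)  # indices to look up in the original frame
```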

### Root mean squared error

The root mean squared error (RMSE) represents the typical deviation between the original data and the predicted data, in the original units. In this case, our model is off by about 208 casual users per day and about 377 registered users per day on average.
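The quantity itself is easy to compute with plain NumPy, which is useful as a sanity check on the framework's result (a minimal sketch):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error, in the original units of the data."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```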

```
In [11]:
```

```
print("Root mean squared error:{}".format(np.sqrt(rm.mse(casual_true, casual_pred))))
print("Root mean squared error:{}".format(np.sqrt(rm.mse(registered_true, registered_pred))))
```

```
Root mean squared error:208.06089782714844
Root mean squared error:377.4617004394531
```