Airline Passengers Prediction

An introduction to LSTM regression.

LSTMs are mainly used for classification and regression. This tutorial applies an LSTM to one-dimensional regression: predicting the number of international airline passengers. You can download the dataset from http://datamarket.com/data/list/?q=provider:tsdl [Box & Jenkins (1976)]. Using the passenger counts of the previous three months, we predict the passenger count of the next month. In this case, the input and output data are as follows.

- input: the period used for prediction (example: three months)
- output: the month we want to predict (example: next month)

In [1]:
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error

import renom as rm
from renom.optimizer import Adam
from renom.cuda import set_cuda_active
# Set True to use the GPU; otherwise set False.
set_cuda_active(False)

Create dataset

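We will create the dataset from a one-dimensional time series array. Each sample consists of look_back consecutive values as the input and the value that follows them as the target.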

In [2]:
def create_dataset(ds, look_back=1):
    X, y = [], []
    for i in range(len(ds)-look_back):
        X.append(ds[i : i+look_back])
        y.append(ds[i+look_back])
    X = np.reshape(np.array(X), [-1, look_back, 1])
    y = np.reshape(np.array(y), [-1, 1])
    return X, y
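
As a quick check of the windowing, the sketch below runs create_dataset on a toy six-point series. The function is repeated so the block runs on its own, and the toy values are illustrative, not taken from the airline data.

```python
import numpy as np

def create_dataset(ds, look_back=1):
    # Same logic as the function above, repeated to keep this block self-contained.
    X, y = [], []
    for i in range(len(ds) - look_back):
        X.append(ds[i : i + look_back])
        y.append(ds[i + look_back])
    X = np.reshape(np.array(X), [-1, look_back, 1])
    y = np.reshape(np.array(y), [-1, 1])
    return X, y

# Toy series 0, 1, ..., 5 shaped (6, 1), like df.values for a single column.
toy = np.arange(6, dtype="float32").reshape(-1, 1)
X_toy, y_toy = create_dataset(toy, look_back=3)

print(X_toy.shape, y_toy.shape)  # (3, 3, 1) (3, 1)
print(X_toy[:, :, 0])            # each row is one three-step input window
print(y_toy[:, 0])               # the value that follows each window
```

A series of n points yields n - look_back samples, which is consistent with the 141 samples reported later for look_back = 3.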

Split data

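We will split the dataset into training data and test data. The split is chronological: the first part of the series is used for training and the remainder for testing.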

In [3]:
def split_data(X, y, test_size=0.1):
    pos = int(round(len(X) * (1-test_size)))
    X_train, y_train = X[:pos], y[:pos]
    X_test, y_test = X[pos:], y[pos:]
    return X_train, y_train, X_test, y_test
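
The following sketch illustrates the split on a toy array, with split_data repeated so the block is self-contained. The toy arrays are illustrative. The last two lines reproduce the split position for 141 samples with test_size=0.33, the values used later in this notebook.

```python
import numpy as np

def split_data(X, y, test_size=0.1):
    # Same logic as above: the first part is for training, the rest for test.
    pos = int(round(len(X) * (1 - test_size)))
    return X[:pos], y[:pos], X[pos:], y[pos:]

X_toy = np.arange(10)
y_toy = np.arange(10) + 100
X_tr, y_tr, X_te, y_te = split_data(X_toy, y_toy, test_size=0.3)
print(X_tr)  # the first seven items, in their original order
print(X_te)  # the last three items

# Split position for 141 samples with test_size=0.33.
pos_141 = int(round(141 * (1 - 0.33)))
print(pos_141, 141 - pos_141)  # 94 47
```

Because the data is never shuffled, every test sample lies later in time than every training sample.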

Load data from csv file

We will load the data and apply some adjustments to it. We normalize the data to the range [0, 1] so that training is stable.

In [4]:
df = pd.read_csv("international-airline-passengers.csv",usecols=[1])
ds = df.values.astype("float32")
plt.figure(figsize=(8,8))
plt.plot(ds)
plt.show()
plt.clf()
v_min = np.min(np.abs(ds))
v_max = np.max(np.abs(ds))
ds -= v_min
ds /= v_max - v_min


look_back = 3
X, y = create_dataset(ds, look_back)
print("X:{},y:{}".format(X.shape, y.shape))
X_train, y_train, X_test, y_test = split_data(X, y, 0.33)
print("X_train:{},y_train:{},X_test:{},y_test:{}".format(X_train.shape, y_train.shape, X_test.shape, y_test.shape))
../../../_images/notebooks_time_series_lstm-regression_notebook_7_0.png
X:(141, 3, 1),y:(141, 1)
X_train:(94, 3, 1),y_train:(94, 1),X_test:(47, 3, 1),y_test:(47, 1)
<matplotlib.figure.Figure at 0x7fd4430ddf60>
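
The normalization in the cell above is a min-max scaling. A minimal sketch of the transform and its inverse follows; the input values are illustrative and are not the airline data.

```python
import numpy as np

# Illustrative values only, not the airline data.
raw = np.array([[100.0], [150.0], [300.0], [600.0]], dtype="float32")

v_min = np.min(np.abs(raw))
v_max = np.max(np.abs(raw))

scaled = (raw - v_min) / (v_max - v_min)     # maps the series into [0, 1]
restored = scaled * (v_max - v_min) + v_min  # inverse transform

print(scaled.ravel())
print(np.allclose(restored, raw))
```

The inverse transform is the same expression used later in the notebook to convert predictions back to passenger counts. Note that the notebook computes v_min and v_max over the whole series, test portion included; computing them from the training portion only is a common alternative.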

Model definition

In [5]:
sequential = rm.Sequential([
    rm.Lstm(15),
    rm.Dense(1)
])

Train loop

First, we split the training data into batches. T is the length of the input period, that is, the number of time steps fed to the model for one prediction. After feeding one time sequence, we have to call sequential.truncate() to truncate the propagation through time. The parameters are updated by l.grad().update(optimizer).

l_test = 0
for t in range(T):
    z = sequential(X_test[:, t, :])
    l_test += rm.mse(z, y_test)
sequential.truncate()
test_loss += l_test.as_ndarray()

The part above computes the test loss at each epoch so that we can monitor how training is progressing.
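
To make the role of the per-step loop and of truncate more concrete, here is a forward-only NumPy sketch of a stateful LSTM layer. It is not ReNom's implementation: the class TinyLstm, its reset method, and the helper run_sequence are hypothetical names, and reset is only an assumed analogue of truncate, based on how the notebook calls it once after each sequence.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyLstm:
    """Hypothetical forward-only LSTM layer in NumPy (not ReNom's implementation)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        self.hidden_size = hidden_size
        # One weight matrix for the four gates, applied to the concatenation [x, h].
        self.W = rng.normal(0.0, 0.5, (input_size + hidden_size, 4 * hidden_size))
        self.b = np.zeros(4 * hidden_size)
        self.reset()

    def reset(self):
        # Assumed analogue of truncate(): forget the state carried between steps.
        self.h = None
        self.c = None

    def __call__(self, x):
        if self.h is None:
            self.h = np.zeros((x.shape[0], self.hidden_size))
            self.c = np.zeros((x.shape[0], self.hidden_size))
        gates = np.concatenate([x, self.h], axis=1) @ self.W + self.b
        i, f, o, g = np.split(gates, 4, axis=1)
        self.c = sigmoid(f) * self.c + sigmoid(i) * np.tanh(g)
        self.h = sigmoid(o) * np.tanh(self.c)
        return self.h

def run_sequence(layer, batch):
    # Feed one time step at a time, as in the loop above; keep the last output.
    for t in range(batch.shape[1]):
        out = layer(batch[:, t, :])
    return out

batch = np.array([[[0.1], [0.2], [0.3]],
                  [[0.5], [0.4], [0.3]]])  # shape (2, 3, 1): 2 samples, T = 3
layer = TinyLstm(input_size=1, hidden_size=4)

first = run_sequence(layer, batch)
layer.reset()
after_reset = run_sequence(layer, batch)  # state was cleared before this pass
no_reset = run_sequence(layer, batch)     # state leaked in from the previous pass

print(first.shape)
print(np.allclose(first, after_reset), np.allclose(first, no_reset))
```

Only the output of the last time step has seen the whole three-month window. If the state is not cleared between sequences, the same input produces a different output, which is why each sequence in the notebook ends with a truncate call.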

In [6]:
batch_size = 15
epoch = 10000
N = len(X_train)
T = X_train.shape[1]

learning_curve = []
test_learning_curve = []
optimizer = Adam(lr=0.001)
for i in range(epoch):
    loss = 0
    test_loss = 0
    for j in range(N//batch_size):
        train_batch = X_train[j*batch_size : (j+1)*batch_size]
        response_batch = y_train[j*batch_size : (j+1)*batch_size]
        l = 0
        with sequential.train():
            for t in range(T):
                z = sequential(train_batch[:, t, :])
                l += rm.mse(z, response_batch)
            sequential.truncate()
        l.grad().update(optimizer)
        loss += l.as_ndarray()
    l_test = 0
    for t in range(T):
        z = sequential(X_test[:, t, :])
        l_test += rm.mse(z, y_test)
    sequential.truncate()
    test_loss += l_test.as_ndarray()
    if i % 500 == 0:
        print("epoch:{:04d} loss:{:.5f} test_loss:{:.5f}".format(i, loss, test_loss))
    learning_curve.append(loss)
    test_learning_curve.append(test_loss)
epoch:0000 loss:0.31358 test_loss:0.30123
epoch:0500 loss:0.07190 test_loss:0.07348
epoch:1000 loss:0.06732 test_loss:0.06736
epoch:1500 loss:0.06288 test_loss:0.06179
epoch:2000 loss:0.05899 test_loss:0.05701
epoch:2500 loss:0.05565 test_loss:0.05300
epoch:3000 loss:0.05281 test_loss:0.04970
epoch:3500 loss:0.05043 test_loss:0.04703
epoch:4000 loss:0.04846 test_loss:0.04492
epoch:4500 loss:0.04687 test_loss:0.04331
epoch:5000 loss:0.04559 test_loss:0.04210
epoch:5500 loss:0.04460 test_loss:0.04125
epoch:6000 loss:0.04383 test_loss:0.04066
epoch:6500 loss:0.04325 test_loss:0.04030
epoch:7000 loss:0.04282 test_loss:0.04009
epoch:7500 loss:0.04250 test_loss:0.03999
epoch:8000 loss:0.04228 test_loss:0.03997
epoch:8500 loss:0.04212 test_loss:0.03999
epoch:9000 loss:0.04201 test_loss:0.04003
epoch:9500 loss:0.04193 test_loss:0.04009
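
The batch slicing in the training loop can be checked in isolation. The sketch below uses the 94 training samples and the batch size of 15 from this notebook and shows that integer division leaves the last few samples out of every epoch.

```python
N = 94           # number of training samples reported above
batch_size = 15

n_batches = N // batch_size
slices = [(j * batch_size, (j + 1) * batch_size) for j in range(n_batches)]
used = n_batches * batch_size

print(n_batches)   # 6
print(slices[-1])  # (75, 90)
print(N - used)    # 4 samples at the end are never used for training
```

Whether to keep this behaviour is a design choice. Iterating over one extra, smaller batch would cover the remaining samples.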

Evaluate the model and show the learning curve

for t in range(T):
    train_predict = sequential(X_train[:,t,:])
sequential.truncate()

This part predicts the number of passengers for the next month on the training data.

for t in range(T):
    test_predict = sequential(X_test[:,t,:])
sequential.truncate()

This part predicts the number of passengers for the next month on the test data.

In [7]:
for t in range(T):
    test_predict = sequential(X_test[:, t, :])
sequential.truncate()

y_test_raw = y_test * (v_max - v_min) + v_min
test_predict = test_predict * (v_max - v_min) + v_min

print("Root mean squared error:{}".format(np.sqrt(mean_squared_error(y_test_raw, test_predict))))

plt.figure(figsize=(8,8))
plt.title("predictions")
plt.ylim(0, v_max+100)
plt.grid(True)
plt.plot(y_test_raw, ".", label ="original")
plt.plot(test_predict, ".", label="predict")

plt.plot(y_test_raw)
plt.plot(test_predict)

idx = 0
for true, pred in zip(y_test_raw, test_predict):
    plt.plot([idx,idx],[true,pred], c="k", lw=1, alpha=0.3)
    idx += 1
plt.legend()
plt.show()
Root mean squared error:53.55954360961914
../../../_images/notebooks_time_series_lstm-regression_notebook_13_1.png

The root mean squared error (RMSE) measures the typical size of the difference between the original data and the predicted data. In this case, the model's prediction for each month is off by roughly 53.6, in the units of the original data.
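
As a small worked example of the metric, the sketch below computes the RMSE by hand with NumPy. The function rmse and all numbers are illustrative, not the model's actual predictions. It also shows that an RMSE measured on min-max scaled data converts back to original units by multiplying by v_max - v_min.

```python
import numpy as np

def rmse(y_true, y_pred):
    # Square root of the mean of the squared differences.
    y_true = np.asarray(y_true, dtype="float64")
    y_pred = np.asarray(y_pred, dtype="float64")
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Illustrative values only.
y_true = np.array([[400.0], [450.0], [500.0]])
y_pred = np.array([[430.0], [410.0], [500.0]])
raw_error = rmse(y_true, y_pred)

# The same error measured on min-max scaled values (v_min, v_max are assumed here).
v_min, v_max = 100.0, 600.0
scaled_error = rmse((y_true - v_min) / (v_max - v_min),
                    (y_pred - v_min) / (v_max - v_min))

print(raw_error)
print(scaled_error * (v_max - v_min))
```

Because the differences are squared before averaging, a few large misses raise the RMSE more than many small ones.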

The final figure illustrates the difference between the original data and the predicted data. Blue points represent the original data, orange points represent the predicted data, and black lines show the error between them.