Airline Passegers Prediction

An introduction of LSTM regression.

We can mainly use LSTM for classification and regression.
This provides an example of applying LSTM for one dimension regression, and predict the number of passengers of international airline.
You can download the dataset from
[Box & Jenkins (1976)].
By using the number of passengers of three months data, we will predict the number of passengers of next month.
In this case, input data and output data is as follow.
  • input:period needed for prediction
    • exmaple:three months
  • output:month we want to predict
    • example:next month

Required Libraries

  • matplotlib 2.0.2
  • numpy 1.12.1
  • scikit-learn 0.18.2
  • pandas 0.20.3
In [1]:
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error

import renom as rm
from renom.optimizer import Adam
from renom.cuda import set_cuda_active
# if you would like to use GPU, set True, otherwise you should be set to False

Create dataset

We will create the dataset from one-dimension time series array.

In [2]:
def create_dataset(ds, look_back=1):
    X, y = [], []
    for i in range(len(ds)-look_back):
        X.append(ds[i : i+look_back])
    X = np.reshape(np.array(X), [-1, look_back, 1])
    y = np.reshape(np.array(y), [-1, 1])
    return X, y

Split data

We will split the data for training and test from dataset.

In [3]:
def split_data(X, y, test_size=0.1):
    pos = int(round(len(X) * (1-test_size)))
    X_train, y_train = X[:pos], y[:pos]
    X_test, y_test = X[pos:], y[pos:]
    return X_train, y_train, X_test, y_test

Load data from csv file

We will load the data and some adjustments to data. Nomalizing the data to learn stably.

In [4]:
df = pd.read_csv("./international-airline-passengers.csv",usecols=[1],header=None,skiprows=1,skipfooter=3,engine="python")
ds = df.values.astype("float32")
data = []
for i in range(ds.shape[0]-1):
data = np.array(data)
v_min = np.min(np.abs(data))
v_max = np.max(np.abs(data))
data -= v_min
data /= v_max - v_min

look_back = 3
X, y = create_dataset(data, look_back)
print("X:{},y:{}".format(X.shape, y.shape))
X_train, y_train, X_test, y_test = split_data(X, y, 0.33)
print("X_train:{},y_train:{},X_test:{},y:test{}".format(X_train.shape, y_train.shape, X_test.shape, y_test.shape))
X:(140, 3, 1),y:(140, 1)
X_train:(94, 3, 1),y_train:(94, 1),X_test:(46, 3, 1),y:test(46, 1)
<matplotlib.figure.Figure at 0x7fe78bca5b00>

Model definition

In [5]:
sequential = rm.Sequential([

Train loop

First, we will make the batch data for training. T is period needed for prediction, and we have to write sequential.truncate to truncate the propagation for one time sequence. We can update the parameters at l.grad().update(optimizer)

for t in range(T):
    z = sequential(X_test[:,t,:])
    l_test = rm.mse(zm response_batch)
test_loss += l_test.as_ndarray()

Above part is for calculating test loss for confirmation of learning state each epoch.

In [6]:
batch_size = 15
epoch = 800
N = len(X_train)
T = X_train.shape[1]

learning_curve = []
test_learning_curve = []
optimizer = Adam(lr=0.001)
for i in range(epoch):
    loss = 0
    test_loss = 0
    perm = np.random.permutation(N)
    for j in range(N//batch_size):
        train_batch = X_train[perm[j*batch_size : (j+1)*batch_size]]
        response_batch = y_train[perm[j*batch_size : (j+1)*batch_size]]
        l = 0
        with sequential.train():
            for t in range(T):
                z = sequential(train_batch[:, t, :])
                l = rm.mse(z, response_batch)
        loss += l.as_ndarray()
    l_test = 0
    for t in range(T):
        z = sequential(X_test[:, t, :])
        l_test = rm.mse(z, y_test)
    test_loss += l_test.as_ndarray()
    if i % 100 == 0:
        print("epoch:{:04d} loss:{:.5f} test_loss:{:.5f}".format(i, loss, test_loss))
epoch:0000 loss:0.16352 test_loss:0.11648
epoch:0100 loss:0.11747 test_loss:0.10974
epoch:0200 loss:0.08164 test_loss:0.20705
epoch:0300 loss:0.07782 test_loss:0.13734
epoch:0400 loss:0.06581 test_loss:0.08912
epoch:0500 loss:0.04596 test_loss:0.11666
epoch:0600 loss:0.03076 test_loss:0.12018
epoch:0700 loss:0.03054 test_loss:0.10680

Evaluate the model and show the learning curve

for t in range(T):
    train_predict = sequential(X_train[:,t,:])

This part predicts the number of passengers of next month on traning data.

for t in range(T):
    test_predict = sequential(X_test[:,t,:])

This part predicts the number of passengers of next month on test data.

In [7]:
for t in range(T):
    test_predict = sequential(X_test[:, t, :])

y_test_raw = y_test * (v_max - v_min) + v_min
test_predict = test_predict * (v_max - v_min) + v_min

print("Root mean squared error:{}".format(np.sqrt(mean_squared_error(y_test_raw, test_predict))))

plt.plot(y_test_raw, marker=".", label ="original")
plt.plot(test_predict, marker=".", label="predict")

Root mean squared error:45.258628845214844

The root mean squared error represents the average loss between original data and the predicted data. In this case, our prediction model has averagely about 45 error in each month pridiction.

Final figure illustrate the difference between original data and preicted data. Blue points represent origina data, orange points represent predicted data, black lines show the error between original data and predicted data.