Airline Passengers Prediction

An introduction to LSTM regression.

LSTM networks are mainly used for classification and regression.
This tutorial applies an LSTM to one-dimensional regression: predicting the number of international airline passengers.
You can download the dataset from
[Box & Jenkins (1976)].
Using the passenger counts of three consecutive months, we predict the count for the following month.
In this case, the input and output data are as follows.
  • input: the period used for prediction
    • example: three months
  • output: the month we want to predict
    • example: next month
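
As a minimal sketch of this input/output pairing, each run of three values is matched with the value that follows it. The toy series and the helper name make_windows are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def make_windows(series, look_back=3):
    """Pair each run of `look_back` values with the value that follows it."""
    inputs, targets = [], []
    for i in range(len(series) - look_back):
        inputs.append(series[i:i + look_back])   # the three "input" months
        targets.append(series[i + look_back])    # the "next month" to predict
    return np.array(inputs), np.array(targets)

# Hypothetical monthly counts, used only for illustration.
toy_series = np.array([10, 20, 30, 40, 50, 60])
windows, next_values = make_windows(toy_series)
print(windows)       # rows: [10 20 30], [20 30 40], [30 40 50]
print(next_values)   # [40 50 60]
```

A series of length n yields n - look_back pairs, which is why six values give three windows here.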

Required Libraries

  • matplotlib 2.0.2
  • numpy 1.12.1
  • scikit-learn 0.18.2
  • pandas 0.20.3
In [1]:
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error

import renom as rm
from renom.optimizer import Adam
from renom.cuda import set_cuda_active
# Set to True to use the GPU; otherwise set to False.
set_cuda_active(False)

Create dataset

We will create the dataset from a one-dimensional time series array.

In [2]:
def create_dataset(ds, look_back=1):
    X, y = [], []
    for i in range(len(ds)-look_back):
        # Input: look_back consecutive values; target: the value that follows them.
        X.append(ds[i : i+look_back])
        y.append(ds[i+look_back])
    X = np.reshape(np.array(X), [-1, look_back, 1])
    y = np.reshape(np.array(y), [-1, 1])
    return X, y

Split data

We will split the dataset into training data and test data.

In [3]:
def split_data(X, y, test_size=0.1):
    pos = int(round(len(X) * (1-test_size)))
    X_train, y_train = X[:pos], y[:pos]
    X_test, y_test = X[pos:], y[pos:]
    return X_train, y_train, X_test, y_test
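
As a quick check of the split arithmetic, the sketch below applies the same slicing to placeholder arrays with the shapes used later in this notebook. The name split_chronologically and the zero-filled arrays are assumptions made for illustration. Note that the split is by position, without shuffling, so the test set is the most recent part of the series.

```python
import numpy as np

def split_chronologically(X, y, test_size=0.1):
    """Same slicing as split_data: the first part trains, the last part tests."""
    pos = int(round(len(X) * (1 - test_size)))
    return X[:pos], y[:pos], X[pos:], y[pos:]

# Placeholder arrays with the shapes produced later in the notebook.
X_demo = np.zeros((140, 3, 1))
y_demo = np.zeros((140, 1))

X_tr, y_tr, X_te, y_te = split_chronologically(X_demo, y_demo, 0.33)
print(X_tr.shape, X_te.shape)   # (94, 3, 1) (46, 3, 1)
```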

Load data from csv file

We will load the data and apply some adjustments to it. The data is normalized so that training is stable.
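
The scaling used in the cell below, and its inverse used later to recover raw values, can be sketched on a toy array. The values here are hypothetical. Because the minimum and maximum are taken over absolute values, as in the notebook, negative inputs can map below zero.

```python
import numpy as np

# Hypothetical values, including a negative one.
values = np.array([-2.0, 1.0, 4.0, 6.0], dtype="float32")

v_min = np.min(np.abs(values))   # 1.0
v_max = np.max(np.abs(values))   # 6.0

# Forward scaling, written as in the notebook.
scaled = (values - v_min) / (v_max - v_min)

# Inverse scaling, as applied to the predictions before computing the error.
restored = scaled * (v_max - v_min) + v_min

print(scaled)
print(restored)
```

The largest value maps to 1, and applying the inverse returns the original array.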

In [4]:
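df = pd.read_csv("./international-airline-passengers.csv", usecols=[1], header=None,
                 skiprows=1, skipfooter=3, engine="python")
ds = df.values.astype("float32")
data = []
for i in range(ds.shape[0]-1):
    # The loop body is missing from the source; month-to-month differences are an assumption.
    data.append(ds[i+1] - ds[i])
data = np.array(data)
# Scale with the minimum and maximum of the absolute values.
v_min = np.min(np.abs(data))
v_max = np.max(np.abs(data))
data -= v_min
data /= v_max - v_min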

look_back = 3
X, y = create_dataset(data, look_back)
print("X:{},y:{}".format(X.shape, y.shape))
X_train, y_train, X_test, y_test = split_data(X, y, 0.33)
print("X_train:{},y_train:{},X_test:{},y_test:{}".format(X_train.shape, y_train.shape, X_test.shape, y_test.shape))
X:(140, 3, 1),y:(140, 1)
X_train:(94, 3, 1),y_train:(94, 1),X_test:(46, 3, 1),y_test:(46, 1)

Model definition

In [5]:
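# The layer list is truncated in the source; these layers and sizes are an assumption.
sequential = rm.Sequential([rm.Lstm(50), rm.Dense(1)])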

Train loop

First, we build the mini-batches for training. T is the number of time steps used for prediction. After feeding one full sequence, we call sequential.truncate() to truncate the propagation through time. The parameters are updated with l.grad().update(optimizer).

l_test = 0
for t in range(T):
    z = sequential(X_test[:, t, :])
    l_test += rm.mse(z, y_test)
test_loss += l_test.as_ndarray()

The part above computes the test loss, which lets us check the state of learning at each epoch.
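
The mini-batch indexing used in the training cell that follows can be sketched with NumPy alone. The sizes are taken from this notebook; the fixed seed is an assumption added only to make the sketch repeatable.

```python
import numpy as np

N, batch_size = 94, 15              # training set size and batch size from the notebook
rng = np.random.RandomState(0)      # seeded only for repeatability
perm = rng.permutation(N)           # shuffled sample indices for one epoch

# One index array per mini-batch; the integer division drops the remainder.
batches = [perm[j * batch_size:(j + 1) * batch_size] for j in range(N // batch_size)]
used = np.concatenate(batches)

print(len(batches), len(used), N - len(used))   # 6 90 4
```

With these sizes each epoch runs six batches and leaves four samples unused; a new permutation is drawn every epoch, so different samples are left out each time.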

In [6]:
batch_size = 15
epoch = 800
N = len(X_train)
T = X_train.shape[1]

learning_curve = []
test_learning_curve = []
optimizer = Adam(lr=0.001)
for i in range(epoch):
    loss = 0
    test_loss = 0
    perm = np.random.permutation(N)
    for j in range(N//batch_size):
        train_batch = X_train[perm[j*batch_size : (j+1)*batch_size]]
        response_batch = y_train[perm[j*batch_size : (j+1)*batch_size]]
        l = 0
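        with sequential.train():
            for t in range(T):
                z = sequential(train_batch[:, t, :])
                l += rm.mse(z, response_batch)
            # Truncate the propagation once the whole sequence has been fed.
            sequential.truncate()
        # Compute the gradients and update the parameters.
        l.grad().update(optimizer)
        loss += l.as_ndarray()
    l_test = 0
    for t in range(T):
        z = sequential(X_test[:, t, :])
        l_test += rm.mse(z, y_test)
    sequential.truncate()
    test_loss += l_test.as_ndarray()
    if i % 100 == 0:
        print("epoch:{:04d} loss:{:.5f} test_loss:{:.5f}".format(i, loss, test_loss))
    # Record the losses of this epoch for the learning curve.
    learning_curve.append(loss)
    test_learning_curve.append(test_loss)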
epoch:0000 loss:0.49338 test_loss:0.11832
epoch:0100 loss:0.46984 test_loss:0.10772
epoch:0200 loss:0.40060 test_loss:0.10393
epoch:0300 loss:0.39566 test_loss:0.10658
epoch:0400 loss:0.37653 test_loss:0.09266
epoch:0500 loss:0.33101 test_loss:0.09584
epoch:0600 loss:0.31224 test_loss:0.10852
epoch:0700 loss:0.28301 test_loss:0.12889

Evaluate the model and show the learning curve

for t in range(T):
    train_predict = sequential(X_train[:,t,:])

This part predicts the number of passengers for the next month on the training data.

for t in range(T):
    test_predict = sequential(X_test[:,t,:])

This part predicts the number of passengers for the next month on the test data.
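
The loops above feed one time step per call and keep only the output of the last step. The sketch below illustrates that pattern, and why the internal state has to be reset between sequences, with a toy stateful cell. ToyRecurrentCell is a hypothetical stand-in written for this illustration; it is not ReNom's LSTM and does not reflect its implementation.

```python
import numpy as np

class ToyRecurrentCell:
    """Toy stateful layer (an assumption for illustration, not ReNom's LSTM)."""

    def __init__(self):
        self.state = None

    def __call__(self, x):
        if self.state is None:
            self.state = np.zeros_like(x)
        # The new state mixes the previous state with the current input.
        self.state = 0.5 * self.state + x
        return self.state

    def truncate(self):
        # Forget the state so the next sequence starts fresh.
        self.state = None

def run_sequence(cell, X):
    """Feed every time step in order and return the output of the last one."""
    for t in range(X.shape[1]):
        out = cell(X[:, t, :])
    return out

X_demo = np.array([[[1.0], [2.0], [3.0]]])   # one sequence, three steps, one feature

cell = ToyRecurrentCell()
first = run_sequence(cell, X_demo)       # state: 1.0 -> 2.5 -> 4.25
carried = run_sequence(cell, X_demo)     # state leaks over from the previous run
cell.truncate()
fresh = run_sequence(cell, X_demo)       # same result as the first run

print(first[0, 0], carried[0, 0], fresh[0, 0])   # 4.25 4.78125 4.25
```

Without the reset, the second pass over the same input gives a different result because it starts from the state left by the first pass.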

In [7]:
for t in range(T):
    test_predict = sequential(X_test[:, t, :])
sequential.truncate()

# Undo the normalization so the error is expressed in the original units.
y_test_raw = y_test * (v_max - v_min) + v_min
test_predict = test_predict * (v_max - v_min) + v_min

print("Root mean squared error:{}".format(np.sqrt(mean_squared_error(y_test_raw, test_predict))))

plt.plot(y_test_raw, marker=".", label="original")
plt.plot(test_predict, marker=".", label="predict")
plt.legend()
plt.show()

Root mean squared error:53.07077407836914

The root mean squared error (RMSE) summarizes the typical difference between the original data and the predicted data. In this case, the model's predictions are off by about 53 for each month.
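
As a worked example of how this metric is computed, the sketch below evaluates the RMSE by hand on hypothetical values and compares it with the mean absolute error. The numbers are illustrative only and are not results from this notebook.

```python
import numpy as np

# Hypothetical original and predicted values.
actual = np.array([100.0, 110.0, 120.0, 130.0])
predicted = np.array([90.0, 115.0, 120.0, 150.0])

errors = predicted - actual               # [-10, 5, 0, 20]
rmse = np.sqrt(np.mean(errors ** 2))      # sqrt((100 + 25 + 0 + 400) / 4)
mae = np.mean(np.abs(errors))             # (10 + 5 + 0 + 20) / 4

print("RMSE: {:.3f}, MAE: {:.3f}".format(rmse, mae))
```

Squaring weights large errors more heavily, so the RMSE is never smaller than the mean absolute error; a single large miss raises it noticeably.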

The final figure illustrates the difference between the original data and the predicted data. Blue points represent the original data, orange points represent the predicted data, and black lines show the error between them.