# Building energy efficiency prediction ¶

Energy efficiency prediction model using fully conected neural network.

In this section, we'll construct simple fully-connected neural networks to building energy efficiency analysis. Here we predict heating-load of from each building features, such as wall-area or glazing-area. Heating/Cooling load is defined by how much energy our air conditioners need to maintain indoor temperature (unit: kWh). The more difficult it is to keep indoor temperature, the bigger Heating/CoolingLoad become. To give an example, the size of room or building material pervious to heat (it means the building easily exchange heat with outdoor) can lead to bigger load. Please download the free data from UCI website in advance ( https://archive.ics.uci.edu/ml/datasets/Energy+efficiency ).

## Required libraries ¶

• matplotlib 2.0.2
• numpy 1.12.1
• scikit-learn 0.18.2
• pandas 0.20.3
• xlrd 1.1.0
In [1]:

%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

import renom as rm
from renom import Sequential
from renom import Dense, Tanh, Relu


## Load & preprocess the data ¶

In [2]:

columns = ["RelativeCompactness", "SurfaceArea", "WallArea", "RoofArea", "OverallArea",

Out[2]:

0 0.98 514.5 294.0 110.25 7.0 2 0.0 0 15.55 21.33
1 0.98 514.5 294.0 110.25 7.0 3 0.0 0 15.55 21.33
2 0.98 514.5 294.0 110.25 7.0 4 0.0 0 15.55 21.33
3 0.98 514.5 294.0 110.25 7.0 5 0.0 0 15.55 21.33
4 0.90 563.5 318.5 122.50 7.0 2 0.0 0 20.84 28.28

Now, we standardize the data in each column and convert it to a numpy array.

In [3]:

df_s = df.copy()

for col in df.columns:
v_std = df[col].std()
v_mean = df[col].mean()
df_s[col] = (df_s[col] - v_mean) / v_std

In [4]:

df_s.head()

Out[4]:

0 2.040447 -1.784712 -0.561586 -1.469119 0.999349 -1.340767 -1.7593 -1.813393 -0.669679 -0.342443
1 2.040447 -1.784712 -0.561586 -1.469119 0.999349 -0.446922 -1.7593 -1.813393 -0.669679 -0.342443
2 2.040447 -1.784712 -0.561586 -1.469119 0.999349 0.446922 -1.7593 -1.813393 -0.669679 -0.342443
3 2.040447 -1.784712 -0.561586 -1.469119 0.999349 1.340767 -1.7593 -1.813393 -0.669679 -0.342443
4 1.284142 -1.228438 0.000000 -1.197897 0.999349 -1.340767 -1.7593 -1.813393 -0.145408 0.388113
In [5]:

X, y = np.array(df_s.iloc[:, :8]), np.array(df_s.iloc[:, 8:])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)


## Definition of a nural network with sequential model ¶

In [6]:

model = Sequential([
Dense(8),
Relu(),
Dense(8),
Relu(),
Dense(6),
Relu(),
Dense(1)
])


## Define an optimizer function ¶

In [7]:

optimizer = rm.Adam()


## Training loop for heating-load regression ¶

In the training loop, we recommend to watch not only train loss but also test loss to prevent overfitting. After forward propagation, we calculate mean squared error (MSE) loss between actual and predicted heating load. Our aim is to fit the NN model by reducing the MSE loss.

In [8]:

# parameters
EPOCH = 3000 # Number of epochs
BATCH =128 # Mini-batch size

# Learning curves
learning_curve = []
test_curve = []

# Training loop
for i in range(1, 1+EPOCH):

N = X_train.shape[0] # Number of records in training data
perm = np.random.permutation(N)
train_loss = 0

for j in range(N//BATCH):
# Make mini-batch
index = perm[j*BATCH:(j+1)*BATCH]
train_batch_x = X_train[index]
train_batch_y = y_train[index]

# Forward propagation
with model.train():
z = model(train_batch_x)
loss = rm.mean_squared_error(z, train_batch_y)

# Backpropagation

# Update

train_loss += loss.as_ndarray()

# calculate mean squared error for training data
train_loss = train_loss / (N // BATCH)
learning_curve.append(train_loss)

# calculate mean squared error for testidation data
y_test_pred = model(X_test)
test_loss = rm.mean_squared_error(y_test_pred, y_test).as_ndarray()
test_curve.append(test_loss)

# print training progress
if i % 100 == 0:
print("Epoch %d - loss: %f - test_loss: %f" % (i, train_loss, test_loss))

print('Finished!')

Epoch 100 - loss: 0.113990 - test_loss: 0.130233
Epoch 200 - loss: 0.086719 - test_loss: 0.101008
Epoch 300 - loss: 0.080777 - test_loss: 0.097986
Epoch 400 - loss: 0.074606 - test_loss: 0.092612
Epoch 500 - loss: 0.071284 - test_loss: 0.085902
Epoch 600 - loss: 0.066842 - test_loss: 0.081911
Epoch 700 - loss: 0.065506 - test_loss: 0.080158
Epoch 800 - loss: 0.062914 - test_loss: 0.077236
Epoch 900 - loss: 0.059501 - test_loss: 0.074397
Epoch 1000 - loss: 0.041378 - test_loss: 0.046624
Epoch 1100 - loss: 0.030633 - test_loss: 0.032956
Epoch 1200 - loss: 0.025489 - test_loss: 0.028105
Epoch 1300 - loss: 0.022370 - test_loss: 0.024761
Epoch 1400 - loss: 0.019271 - test_loss: 0.021640
Epoch 1500 - loss: 0.018819 - test_loss: 0.020465
Epoch 1600 - loss: 0.018496 - test_loss: 0.019925
Epoch 1700 - loss: 0.018561 - test_loss: 0.019719
Epoch 1800 - loss: 0.018252 - test_loss: 0.019707
Epoch 1900 - loss: 0.018131 - test_loss: 0.019602
Epoch 2000 - loss: 0.017597 - test_loss: 0.019875
Epoch 2100 - loss: 0.017724 - test_loss: 0.019799
Epoch 2200 - loss: 0.017274 - test_loss: 0.019823
Epoch 2300 - loss: 0.017453 - test_loss: 0.019761
Epoch 2400 - loss: 0.017629 - test_loss: 0.019617
Epoch 2500 - loss: 0.017617 - test_loss: 0.019620
Epoch 2600 - loss: 0.018240 - test_loss: 0.019802
Epoch 2700 - loss: 0.018037 - test_loss: 0.019727
Epoch 2800 - loss: 0.018023 - test_loss: 0.019688
Epoch 2900 - loss: 0.017270 - test_loss: 0.019696
Epoch 3000 - loss: 0.017458 - test_loss: 0.019748
Finished!


## Model evaluation ¶

Let's evaluate the fitted model!

### Plot learning curve ¶

In [9]:

plt.figure(figsize=(10, 4))
plt.plot(learning_curve, label='loss')
plt.plot(test_curve, label='test_loss', alpha=0.6)
plt.title('Learning curve')
plt.xlabel("Epoch")
plt.ylabel("MSE")
plt.ylim(0, 0.2)
plt.legend()
plt.grid()


### Compare the actual and predicted heating-load ¶

In [12]:

# predict test value
y_pred = model(X_test)

# convert standardized heating-load to its original unit
heating_load_true = y_test[:, 0].reshape(-1, 1) * v_std + v_mean
heating_load_pred = y_pred * v_std + v_mean

plt.figure(figsize=(8, 8))
plt.plot([5, 50], [5, 50], c='k', alpha=0.6) # diagonal line
plt.xlim(5, 50)
plt.ylim(5, 50)

In [11]:

print(max(abs(heating_load_pred - heating_load_true)))

[ 1.57366776]