# Boston House Price Mapping

An introduction to mapping the Boston house price dataset with ReNom TDA.

In this tutorial, we visualize the Boston house price dataset. You will learn the following:

- How to analyze a topology.

## Requirements

```
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import load_boston
from renom.tda.topology import Topology
from renom.tda.lens import PCA
```

## Import the Boston house price dataset

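Next, we load the Boston house price data. To do this, we use the `load_boston` function included in the scikit-learn package. Note that `load_boston` was deprecated in scikit-learn 1.0 and removed in 1.2, so this tutorial requires an older scikit-learn release.

The Boston house price dataset consists of 506 samples, each with 13 feature columns.

The 13 columns and the target value are as follows.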

- CRIM - per capita crime rate by town
- ZN - proportion of residential land zoned for lots over 25,000 sq.ft.
- INDUS - proportion of non-retail business acres per town
- CHAS - Charles River dummy variable (1 if tract bounds river; 0 otherwise)
- NOX - nitric oxides concentration (parts per 10 million)
- RM - average number of rooms per dwelling
- AGE - proportion of owner-occupied units built prior to 1940
- DIS - weighted distances to five Boston employment centres
- RAD - index of accessibility to radial highways
- TAX - full-value property-tax rate per $10,000
- PTRATIO - pupil-teacher ratio by town
- B - 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
- LSTAT - lower status of the population
- target - median value of owner-occupied homes

```
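bos = load_boston()
target = bos.target
# Append the target as a 14th column so it also takes part in the mapping.
data = np.concatenate([bos.data, bos.target.reshape(-1, 1)], axis=1)
# Standardize each column to zero mean and unit variance.
data = (data - np.mean(data, axis=0)) / np.std(data, axis=0)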
```
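The standardization step above can be checked on a small example. The sketch below uses a hypothetical toy array (not the Boston data) to show that the same formula leaves every column with zero mean and unit standard deviation, so that features on very different scales contribute comparably to distances.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 2 features on very different scales.
toy = np.array([[1.0, 100.0],
                [2.0, 300.0],
                [3.0, 500.0],
                [4.0, 700.0]])

# Same formula as in the tutorial: subtract the column mean,
# then divide by the column standard deviation.
scaled = (toy - np.mean(toy, axis=0)) / np.std(toy, axis=0)

col_means = scaled.mean(axis=0)
col_stds = scaled.std(axis=0)
print(col_means, col_stds)
```

A column with constant values has a standard deviation of zero, so this formula would divide by zero for it; such columns would need to be dropped or handled separately.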

## Create topology instance

```
topology = Topology()
```

## Create point cloud

```
metric = None
lens = [PCA(components=[0,1])]
topology.fit_transform(data, metric=metric, lens=lens)
```

```
projected by PCA.
finish fit_transform.
```
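To give an idea of what a PCA lens with `components=[0,1]` produces, the sketch below projects a point cloud onto its first two principal axes using plain NumPy. It illustrates PCA in general and is not ReNom TDA's implementation; the helper `pca_project` and the random toy data are assumptions made for the example.

```python
import numpy as np

def pca_project(x, components=(0, 1)):
    """Project the rows of x onto the selected principal axes (illustrative helper)."""
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[list(components)].T

# Hypothetical toy data: 50 samples with 5 features.
rng = np.random.default_rng(0)
toy = rng.normal(size=(50, 5))

projected = pca_project(toy)
print(projected.shape)  # (50, 2)
```

Each row of `projected` is the 2-D position of one sample, which plays the role of the point cloud that the later mapping step divides into overlapping regions.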

## Mapping to topological space

```
clusterer = DBSCAN(eps=25, min_samples=1)
topology.map(resolution=25, overlap=0.5, clusterer=clusterer)
```

```
mapping start, please wait...
created 304 nodes.
calculating cluster coordination.
calculating edge.
created 870 edges.
```
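The parameters `resolution`, `overlap`, and `clusterer` suggest the general Mapper construction: cover the projected space with overlapping cells, cluster the original data inside each cell, and connect clusters that share points. The sketch below is a minimal NumPy version of that idea and not ReNom TDA's code. The helpers `eps_components` and `simple_mapper` are hypothetical, and reading `resolution` as the number of cells per axis and `overlap` as the fraction by which each cell is widened is an assumption. With `min_samples=1`, DBSCAN reduces to the connected components of the `eps`-neighbourhood graph, which is what `eps_components` computes.

```python
import numpy as np
from itertools import combinations, product

def eps_components(points, eps):
    """Connected components of the eps-neighbourhood graph
    (what DBSCAN yields when min_samples=1)."""
    n = len(points)
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            dist = np.linalg.norm(points - points[i], axis=1)
            for j in np.where((dist <= eps) & (labels == -1))[0]:
                labels[j] = current
                stack.append(j)
        current += 1
    return labels

def simple_mapper(data, lens, resolution, overlap, eps):
    """Return (nodes, edges); each node is a set of sample indices."""
    lo, hi = lens.min(axis=0), lens.max(axis=0)
    step = (hi - lo) / resolution
    nodes = []
    for cell in product(range(resolution), repeat=lens.shape[1]):
        cell = np.array(cell)
        # Widen each grid cell so that neighbouring cells overlap.
        start = lo + cell * step - overlap * step
        end = lo + (cell + 1) * step + overlap * step
        inside = np.where(np.all((lens >= start) & (lens <= end), axis=1))[0]
        if len(inside) == 0:
            continue
        # Cluster the original data restricted to this cell.
        labels = eps_components(data[inside], eps)
        for lab in np.unique(labels):
            nodes.append(set(inside[labels == lab].tolist()))
    # Two nodes are linked when they share at least one sample.
    edges = [(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]]
    return nodes, edges

# Hypothetical toy data: two well-separated 2-D blobs, used as their own lens.
rng = np.random.default_rng(0)
toy = np.vstack([rng.normal(0, 0.1, size=(30, 2)),
                 rng.normal(5, 0.1, size=(30, 2))])
nodes, edges = simple_mapper(toy, lens=toy, resolution=4, overlap=0.5, eps=1.0)
print(len(nodes), len(edges))
```

In this toy run no node mixes samples from the two blobs, so the resulting graph falls into at least two disconnected pieces, one per blob.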

## Color topology & show

```
print("colored by target value.")
topology.color(target, dtype="numerical", ctype="rgb")
topology.show(fig_size=(10, 10), node_size=5, edge_width=1, mode=None, strength=None)

# Repeat the coloring for each of the 13 feature columns.
for i in range(len(bos.feature_names)):
    print("colored by %s." % bos.feature_names[i])
    topology.color(data[:, i], dtype="numerical", ctype="rgb")
    topology.show(fig_size=(10, 10), node_size=5, edge_width=1, mode=None, strength=None)
```

```
colored by target value.
colored by CRIM.
colored by ZN.
colored by INDUS.
colored by CHAS.
colored by NOX.
colored by RM.
colored by AGE.
colored by DIS.
colored by RAD.
colored by TAX.
colored by PTRATIO.
colored by B.
colored by LSTAT.
```

Each printed line is followed by a plot of the topology colored by that variable.
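One plausible way to turn a numerical column into node colors is to average the column over the samples in each node and map the result onto a color scale. The sketch below shows that idea only. It is an assumption about how such coloring can work and is not taken from ReNom TDA; the helper `node_colors`, the blue-to-red scale, and the toy nodes are hypothetical.

```python
import numpy as np

def node_colors(nodes, values):
    """Average `values` over each node's members and map the mean to an RGB triple."""
    means = np.array([values[sorted(members)].mean() for members in nodes])
    span = means.max() - means.min()
    # Rescale node means to [0, 1]; fall back to 0 if all means are equal.
    t = (means - means.min()) / span if span > 0 else np.zeros_like(means)
    # Linear blend from blue (low values) to red (high values).
    return np.stack([t, np.zeros_like(t), 1.0 - t], axis=1)

# Hypothetical toy graph: three nodes over four samples.
nodes = [{0, 1}, {1, 2}, {3}]
values = np.array([10.0, 20.0, 30.0, 50.0])

colors = node_colors(nodes, values)
print(colors)
```

Under this scheme, two variables that rise and fall together across the nodes produce similar color patterns on the same graph, which is what the side-by-side plots are meant to reveal.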

## Conclusion

These graphs show that the Boston house price is correlated with RM, PTRATIO, and LSTAT.
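A visual impression like this can be cross-checked numerically with Pearson correlation coefficients. The sketch below ranks features by the absolute value of their correlation with a target. The helper `correlation_with_target` is hypothetical, and the demonstration uses synthetic data rather than the Boston dataset.

```python
import numpy as np

def correlation_with_target(features, target, names):
    """Return (name, Pearson r) pairs sorted by decreasing |r|."""
    pairs = [(name, float(np.corrcoef(features[:, i], target)[0, 1]))
             for i, name in enumerate(names)]
    return sorted(pairs, key=lambda p: abs(p[1]), reverse=True)

# Synthetic example: the target depends strongly on A and not on B.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
toy_features = np.column_stack([a, b])
toy_target = 3.0 * a + 0.1 * rng.normal(size=200)

ranking = correlation_with_target(toy_features, toy_target, ["A", "B"])
for name, r in ranking:
    print("%s: %.3f" % (name, r))
```

With the tutorial's variables, the equivalent call would be `correlation_with_target(bos.data, bos.target, bos.feature_names)`.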