How to show Point Cloud

An introduction to displaying point cloud data.

In this tutorial, we visualize the iris dataset as a point cloud.
In ReNom TDA, a point cloud is the data after dimensionality reduction.
  • How to show a point cloud.

Requirements

In [1]:
import numpy as np

from sklearn.datasets import load_iris

from renom_tda.topology import Topology
from renom_tda.lens import PCA, TSNE, MDS, Isomap

Dataset

First, we load the iris dataset. To do this, we use the load_iris function included in the scikit-learn package.

The iris dataset consists of 150 samples, each with 4 numeric columns.

In [2]:
iris = load_iris()

data = iris.data
target = iris.target

setosa = ["setosa"] * 50
versicolor = ["versicolor"] * 50
virginica = ["virginica"] * 50
species = np.array(setosa + versicolor + virginica).reshape(-1, 1)

text_data_columns = ["species"]
number_data_columns = ["sepal length", "sepal width", "petal length", "petal width"]
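
Hard-coded label lists are easy to mistype. As a minimal alternative sketch, the labels can be derived from integer class codes by indexing an array of names. The `target` array below is a stand-in with the same layout as `iris.target` (50 samples per class, in order), and `target_names` mirrors `iris.target_names`; with scikit-learn loaded, those attributes could be used directly instead.

```python
import numpy as np

# Stand-ins for iris.target and iris.target_names (assumed layout:
# 50 samples per class, ordered setosa, versicolor, virginica).
target = np.repeat([0, 1, 2], 50)
target_names = np.array(["setosa", "versicolor", "virginica"])

# Indexing with the class codes maps each code to its name; reshape to a
# column so the result has the (n_samples, 1) shape used for text data.
species = target_names[target].reshape(-1, 1)

print(species.shape)  # (150, 1)
```

Because the names come from a single array, a spelling mistake can only occur in one place.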

Define topology instance

Next, we create a Topology instance.

In [3]:
topology = Topology()

Load data

Next, we load the data into the topology instance with the load_data function.

In [4]:
topology.load_data(data, number_data_columns=number_data_columns, text_data=species, text_data_columns=text_data_columns)
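
Before loading, it can help to confirm that the arrays and the column-name lists agree in shape. The helper below is a hypothetical sketch, not part of ReNom TDA; the name `check_inputs` and the zero-filled stand-in arrays are assumptions made for illustration.

```python
import numpy as np


def check_inputs(number_data, number_data_columns, text_data, text_data_columns):
    """Hypothetical helper (not part of ReNom TDA): validate array shapes."""
    number_data = np.asarray(number_data)
    text_data = np.asarray(text_data)
    if number_data.ndim != 2 or text_data.ndim != 2:
        raise ValueError("both arrays must be 2-D (n_samples, n_columns)")
    if number_data.shape[0] != text_data.shape[0]:
        raise ValueError("numeric and text data must have the same number of rows")
    if number_data.shape[1] != len(number_data_columns):
        raise ValueError("one name is needed per numeric column")
    if text_data.shape[1] != len(text_data_columns):
        raise ValueError("one name is needed per text column")
    return number_data.shape[0]


# Small stand-in arrays with the same layout as the iris inputs.
numbers = np.zeros((150, 4))
texts = np.array(["setosa"] * 150).reshape(-1, 1)
n_rows = check_inputs(
    numbers,
    ["sepal length", "sepal width", "petal length", "petal width"],
    texts,
    ["species"],
)
print(n_rows)  # 150
```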

Create point cloud

Next, we create a point cloud by projecting the data onto a 2- or 3-dimensional space.

We use the fit_transform function to project the data. It takes two parameters, metric and lens.

The metric defines how the distance between data points is measured. The lens defines the axes of the projected space.

This tutorial uses metric None and several lenses.
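
To make the role of a metric concrete, the sketch below computes a Euclidean pairwise distance matrix with NumPy. It only illustrates what measuring the distance between data points means; the random data and the helper name `pairwise_euclidean` are assumptions, and the sketch does not reproduce how ReNom TDA handles metrics internally.

```python
import numpy as np


def pairwise_euclidean(x):
    """Return the (n, n) matrix of Euclidean distances between rows of x."""
    # Broadcasting builds all row differences: shape (n, n, d).
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))  # 5 hypothetical samples with 4 features
dist = pairwise_euclidean(x)

print(dist.shape)  # (5, 5)
```

The matrix is symmetric with zeros on the diagonal, since every point is at distance zero from itself.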

In [5]:
metric = None
lens = [PCA(components=[0, 1])]
topology.fit_transform(metric=metric, lens=lens)
projected by PCA.
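
As a rough picture of what a PCA projection onto components 0 and 1 produces, the sketch below implements generic PCA with a NumPy SVD. It assumes that `components=[0, 1]` selects the first two principal components; it is not ReNom TDA's implementation, which may also rescale its output. The function name `pca_project` and the random data are hypothetical.

```python
import numpy as np


def pca_project(x, components=(0, 1)):
    """Project rows of x onto the selected principal components."""
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[list(components)].T


rng = np.random.default_rng(0)
# Hypothetical data: 150 samples, 4 features with different spreads.
x = rng.normal(size=(150, 4)) * np.array([3.0, 2.0, 1.0, 0.5])
points = pca_project(x, components=(0, 1))

print(points.shape)  # (150, 2)
```

Each row of `points` is one sample in the projected 2-D space, which is the kind of array a point cloud plot draws.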

Colorize point cloud

Next, we colorize the point cloud using the color_point_cloud function.

In [6]:
topology.color_point_cloud(target, normalize=True)
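
The sketch below shows one plausible reading of `normalize=True`: the target values are rescaled linearly to the range [0, 1] before being mapped to colors. Both the min-max interpretation and the blue-to-red ramp in `to_hex` are assumptions for illustration; the color map that ReNom TDA actually uses may differ.

```python
import numpy as np


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        return np.zeros_like(values)  # constant input: avoid division by zero
    return (values - values.min()) / span


def to_hex(t):
    """Hypothetical blue-to-red ramp: t=0 gives blue, t=1 gives red."""
    red = int(round(255 * float(t)))
    blue = 255 - red
    return "#{:02x}00{:02x}".format(red, blue)


target = np.repeat([0, 1, 2], 50)  # stand-in for iris.target
normalized = min_max_normalize(target)
colors = [to_hex(t) for t in normalized]

print(colors[0], colors[-1])  # #0000ff #ff0000
```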

Show point cloud

In [7]:
topology.show_point_cloud(fig_size=(10, 10), node_size=10)
[Figure: point cloud projected by PCA, colored by target]

Using other lenses

In [8]:
metric = None
lens = [TSNE(components=[0, 1])]
topology.fit_transform(metric=metric, lens=lens)
topology.color_point_cloud(target, normalize=True)
topology.show_point_cloud(fig_size=(10, 10), node_size=10)
projected by TSNE.
[Figure: point cloud projected by TSNE, colored by target]

In [9]:
metric = None
lens = [MDS(components=[0, 1])]
topology.fit_transform(metric=metric, lens=lens)
topology.color_point_cloud(target, normalize=True)
topology.show_point_cloud(fig_size=(10, 10), node_size=10)
projected by MDS.
[Figure: point cloud projected by MDS, colored by target]

In [10]:
metric = None
lens = [Isomap(components=[0, 1])]
topology.fit_transform(metric=metric, lens=lens)
topology.color_point_cloud(target, normalize=True)
topology.show_point_cloud(fig_size=(10, 10), node_size=10)
projected by Isomap.
[Figure: point cloud projected by Isomap, colored by target]
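
The lenses above differ in which structure of the data they try to preserve. As one concrete illustration, the sketch below implements classical (Torgerson) MDS in NumPy, which places points so that their Euclidean distances are kept as well as a low-dimensional linear embedding allows. This is a deliberately simpler stand-in: it is not the iterative SMACOF-based MDS found in scikit-learn, and it is not ReNom TDA's code. The function name `classical_mds` and the random data are hypothetical.

```python
import numpy as np


def classical_mds(x, n_components=2):
    """Embed rows of x with classical (Torgerson) MDS on Euclidean distances."""
    # Squared Euclidean distance matrix.
    diff = x[:, None, :] - x[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    # Double centering turns squared distances into a Gram matrix.
    n = d2.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ d2 @ j
    # The top eigenpairs of the Gram matrix give the embedding coordinates.
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


rng = np.random.default_rng(0)
x = rng.normal(size=(30, 4))  # hypothetical data
embedding = classical_mds(x, n_components=2)

print(embedding.shape)  # (30, 2)
```

When all four dimensions are kept, the pairwise distances of the embedding match those of the input; keeping only two dimensions trades some of that fidelity for a plot that fits on a plane.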

Search point cloud

In [11]:
metric = None
lens = [PCA(components=[0, 1])]
topology.fit_transform(metric=metric, lens=lens)
topology.color_point_cloud(target, normalize=True)
projected by PCA.

In [12]:
search_dicts = [{
    "data_type": "number",
    "operator": "=",
    "column": "target",
    "value": 1
}, {
    "data_type": "number",
    "operator": ">",
    "column": "sepal length",
    "value": 6.0
}]

In [13]:
node_index = topology.search_point_cloud(search_dicts=search_dicts, target=target, search_type="column")
topology.show_point_cloud(fig_size=(10, 10), node_size=10)
[Figure: point cloud projected by PCA after the search]
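
To show how search conditions of this form can be evaluated, the sketch below turns each dictionary into a boolean mask and keeps the rows that satisfy all of them. It assumes the conditions are combined with a logical AND, which may not match every `search_type` in ReNom TDA. The helper `search_rows`, the `OPERATORS` table, and the five-row table are hypothetical.

```python
import operator

import numpy as np

# Map the operator strings used in search_dicts to Python comparisons.
OPERATORS = {"=": operator.eq, ">": operator.gt}


def search_rows(columns, search_dicts):
    """Hypothetical helper: indices of rows satisfying every condition."""
    n_rows = len(next(iter(columns.values())))
    mask = np.ones(n_rows, dtype=bool)
    for cond in search_dicts:
        compare = OPERATORS[cond["operator"]]
        mask &= compare(columns[cond["column"]], cond["value"])
    return np.flatnonzero(mask)


# Tiny made-up table: five rows with a class label and a sepal length.
columns = {
    "target": np.array([0, 1, 1, 2, 1]),
    "sepal length": np.array([5.1, 6.4, 5.9, 6.3, 7.0]),
}
search_dicts = [
    {"data_type": "number", "operator": "=", "column": "target", "value": 1},
    {"data_type": "number", "operator": ">", "column": "sepal length", "value": 6.0},
]
print(search_rows(columns, search_dicts))  # [1 4]
```

Rows 1 and 4 are the only ones with target 1 and a sepal length above 6.0, so only their indices are returned.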