How to use ReNom TDA

An introduction to using ReNom TDA.

In this tutorial, we visualize the iris dataset. You will learn the following:

  • How to create a topology using ReNom TDA.

Requirements

In [1]:
from sklearn.datasets import load_iris

from renom_tda.topology import Topology
from renom_tda.lens import PCA

Load dataset

Next, we load the iris dataset. To do this, we use the load_iris function included in the scikit-learn package.

The iris dataset consists of 150 samples, and each sample has 4 columns.

In [2]:
iris = load_iris()

data = iris.data
target = iris.target
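
To confirm what was loaded, the shapes and label counts can be checked directly. This is a small sanity check that uses only NumPy and the scikit-learn loader shown above; the variable name `counts` is introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
data = iris.data      # feature matrix: one row per flower
target = iris.target  # integer class label per row

print(data.shape)    # (150, 4)
print(target.shape)  # (150,)

# Count how many samples carry each label (0, 1, 2).
counts = np.bincount(target)
print(dict(zip(iris.target_names, counts)))
```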

Define topology instance

Next, we define a Topology instance.

In [3]:
topology = Topology()

Load data

Next, we load the data.

We use the load_data function to load the data into the topology instance.

In [4]:
topology.load_data(data)

Create point cloud

Next, we create a point cloud projected onto a 2- or 3-dimensional space.

We use the fit_transform function to project the data, with two parameters: metric and lens.

Metric defines how the distance between data points is measured. Lens defines the axes of the projected space.

This tutorial uses metric None and lens PCA, which means dimensionality reduction with ordinary PCA.

In [5]:
metric = None
lens = [PCA(components=[0, 1])]
topology.fit_transform(metric=metric, lens=lens)
projected by PCA.
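
For intuition about what a PCA lens on components 0 and 1 produces, the projection can be sketched with plain NumPy. This is a generic PCA via SVD on a small made-up array, not ReNom TDA's internal code; the function name `pca_project` and the sample values are assumptions for illustration.

```python
import numpy as np

def pca_project(x, components=(0, 1)):
    """Project the rows of x onto the selected principal components."""
    centered = x - x.mean(axis=0)  # PCA operates on mean-centered data
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[list(components)].T

# Six hypothetical samples with four features each.
x = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.4, 3.2, 4.5, 1.5],
    [6.3, 3.3, 6.0, 2.5],
    [5.8, 2.7, 5.1, 1.9],
])

point_cloud = pca_project(x)
print(point_cloud.shape)  # (6, 2)
```

Each row of `point_cloud` is one sample expressed in the two directions of largest variance, which is the kind of 2-dimensional point cloud the mapping step works on.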

Mapping to topological space

Next, we create the topology.

We use the map function to map the point cloud to a topological space.

We set four parameters: resolution, overlap, eps, and min_samples.

Resolution is the number of divisions. It affects the number of nodes.

Overlap controls how easily nodes connect to each other.

Eps and min_samples are used by the clustering method applied to the data in each node.
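
The roles of resolution and overlap can be sketched in one dimension. The code below is a generic Mapper-style cover, not ReNom TDA's implementation: the helper names `cover_intervals` and `bins_containing` are hypothetical, and the way the overlap fraction widens each bin is an assumption made for illustration.

```python
def cover_intervals(low, high, resolution, overlap):
    """Split [low, high] into `resolution` equal bins, each widened by `overlap`."""
    width = (high - low) / resolution
    pad = width * overlap  # extra reach on each side of a bin
    return [
        (low + i * width - pad, low + (i + 1) * width + pad)
        for i in range(resolution)
    ]

def bins_containing(value, intervals):
    """Return the indices of all intervals that contain `value`."""
    return [i for i, (lo, hi) in enumerate(intervals) if lo <= value <= hi]

intervals = cover_intervals(0.0, 1.0, resolution=4, overlap=0.5)

# A point near a bin boundary falls into two overlapping bins.
print(bins_containing(0.3, intervals))  # [0, 1]

# Without overlap, the same point belongs to a single bin.
no_overlap = cover_intervals(0.0, 1.0, resolution=4, overlap=0.0)
print(bins_containing(0.3, no_overlap))  # [1]
```

A larger resolution yields more bins and therefore more candidate nodes, while a larger overlap lets more points sit in two bins at once.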

In [6]:
topology.map(resolution=15, overlap=0.5, eps=0.1, min_samples=3)
created 70 nodes.
created 192 edges.
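
The edge count can be understood through the usual Mapper convention, in which two nodes are connected when they share at least one data point. Assuming ReNom TDA follows that convention, the sketch below shows the idea on made-up nodes; the node names, their members, and the helper `build_edges` are hypothetical.

```python
from itertools import combinations

# Hypothetical nodes: each node holds the indices of the data points in it.
nodes = {
    "a": {0, 1, 2},
    "b": {2, 3},
    "c": {3, 4, 5},
    "d": {6, 7},
}

def build_edges(nodes):
    """Connect every pair of nodes that has at least one point in common."""
    return [
        (u, v)
        for u, v in combinations(sorted(nodes), 2)
        if nodes[u] & nodes[v]
    ]

edges = build_edges(nodes)
print(edges)  # [('a', 'b'), ('b', 'c')]
```

Node "d" shares no points with the others, so it stays disconnected, which is how separate clusters appear in the final graph.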

Color topology

Next, we color the topology using the color function.

In this tutorial, the topology is colored by the iris label values.

For color_method we can select "mean" or "mode", and for color_type we can select "rgb" or "gray".

In [7]:
topology.color(target, color_method="mode", color_type="rgb")
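
The difference between "mean" and "mode" can be illustrated by reducing the labels inside each node to a single value. This is only a sketch of the idea using the standard library, not the library's own code; the dictionary `node_labels` and the helper `node_values` are hypothetical.

```python
from statistics import mean, mode

# Hypothetical labels of the data points that fall into each node.
node_labels = {
    "node_0": [0, 0, 0],
    "node_1": [1, 1, 2, 2, 2],
    "node_2": [1, 2, 2],
}

def node_values(node_labels, color_method="mean"):
    """Reduce each node's labels to a single value used for coloring."""
    reducer = mean if color_method == "mean" else mode
    return {name: reducer(labels) for name, labels in node_labels.items()}

by_mean = node_values(node_labels, "mean")
by_mode = node_values(node_labels, "mode")

print(by_mean["node_1"])  # 1.6
print(by_mode["node_1"])  # 2
```

With "mode", a node takes the value of its most frequent label, which suits categorical targets such as the iris classes; "mean" blends the labels and is more natural for continuous values.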

Show topology

Finally, we show the topology.

The graph shows that the dataset contains two clusters.

The small cluster consists of a single label, while the large cluster contains two different labels.

In [8]:
topology.show(fig_size=(10, 10), node_size=10, edge_width=2)
[Figure: output of topology.show, the colored topology of the iris dataset]