Convert Excel to CSV

An introduction to converting Excel data to a CSV file.

In this tutorial, we create a CSV file for ReNom TDA GUI from an Excel file.

Requirements

xlrd==1.1.0

In [1]:
import pandas as pd

Load the Excel file

In [2]:
xls_file = "test.xlsx"
sheet_name = "Sheet1"
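# Recent pandas versions spell this keyword sheet_name (sheetname was deprecated and later removed).
xls_data = pd.read_excel(xls_file, sheet_name=sheet_name)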
In [3]:
xls_data
Out[3]:
  category1 category2  category3  number1  number2
0      テスト1         A          0      0.0        1
1      テスト2         A          0      0.2        2
2      テスト3         B          0      0.4        3
3      テスト4         B          1      0.6        4
4      テスト5         B          1      0.8        5

Check data type of columns

We have to check the data type of each column.

ReNom TDA GUI uses columns of type int64 or float64 to calculate the topology.

Columns of type object can be used as categorical data to search nodes.

In [4]:
xls_data.dtypes
Out[4]:
category1     object
category2     object
category3      int64
number1      float64
number2        int64
dtype: object
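
The dtype rule above can also be checked programmatically. The following is a minimal sketch, not part of the original notebook: it rebuilds the sample table by hand (so it runs without the Excel file) and uses `select_dtypes` to split the columns into numeric and non-numeric groups. Note that `"number"` matches every numeric dtype, which is slightly broader than the int64 and float64 types named above.

```python
import pandas as pd

# Hand-built copy of the sample table, so the sketch runs without test.xlsx.
df = pd.DataFrame({
    "category1": ["テスト1", "テスト2", "テスト3", "テスト4", "テスト5"],
    "category2": ["A", "A", "B", "B", "B"],
    "category3": [0, 0, 0, 1, 1],
    "number1": [0.0, 0.2, 0.4, 0.6, 0.8],
    "number2": [1, 2, 3, 4, 5],
})

# Numeric columns: candidates for the topology calculation.
numeric_columns = df.select_dtypes(include="number").columns.tolist()

# Everything else: candidates for categorical search.
categorical_columns = df.select_dtypes(exclude="number").columns.tolist()

print("numeric:", numeric_columns)
print("categorical:", categorical_columns)
```

Here `category3` lands in the numeric group even though it is meant as a category, which is the situation the next step addresses.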

Change data type of columns

If a categorical column is stored as integers, we should change its data type from integer to object.

In [5]:
xls_data["category3"] = xls_data["category3"].astype(object)
In [6]:
xls_data.dtypes
Out[6]:
category1     object
category2     object
category3     object
number1      float64
number2        int64
dtype: object
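
When several integer-coded columns are really categories, they can be converted in one call. This is a small sketch under assumptions: the list `categorical_int_columns` is a hypothetical name introduced here, and the frame is built by hand rather than read from Excel. Passing a dict to `astype` converts only the named columns and returns a new frame.

```python
import pandas as pd

# Hand-built sample with one integer-coded categorical column.
df = pd.DataFrame({
    "category3": [0, 0, 0, 1, 1],
    "number1": [0.0, 0.2, 0.4, 0.6, 0.8],
    "number2": [1, 2, 3, 4, 5],
})

# Hypothetical list of integer columns that should be treated as categories.
categorical_int_columns = ["category3"]

# astype with a dict changes only the listed columns; df itself is unchanged.
converted = df.astype({name: object for name in categorical_int_columns})

print(converted.dtypes)
```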

Select columns

If we don’t need all of the columns, we select only the ones to use.

In [8]:
use_columns = ["category1", "category3", "number1", "number2"]
output_data = xls_data[use_columns]
In [9]:
output_data
Out[9]:
  category1 category3  number1  number2
0      テスト1         0      0.0        1
1      テスト2         0      0.2        2
2      テスト3         0      0.4        3
3      テスト4         1      0.6        4
4      テスト5         1      0.8        5
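
A misspelled column name in the selection list raises a `KeyError` from pandas. As an optional safeguard, the selection can be wrapped in a small helper that reports every missing name at once. The function `select_columns` below is a hypothetical helper written for this illustration, not part of pandas or ReNom TDA.

```python
import pandas as pd


def select_columns(frame, columns):
    """Return a copy of frame restricted to columns, or fail with a clear message."""
    missing = [name for name in columns if name not in frame.columns]
    if missing:
        raise KeyError(f"Columns not found: {missing}")
    # copy() makes the result independent of the original frame.
    return frame[columns].copy()


# Hand-built sample so the sketch runs without the Excel file.
df = pd.DataFrame({
    "category1": ["a", "b"],
    "category2": ["A", "B"],
    "number1": [0.0, 0.2],
})

selected = select_columns(df, ["category1", "number1"])
print(selected)
```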

Output the CSV file

Finally, write the data to a CSV file.

In [10]:
output_data.to_csv("output.csv", index=False)
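
One point worth keeping in mind is that a CSV file stores only text, so the object dtype set earlier is not recorded in the file. The sketch below illustrates this with pandas alone, writing to an in-memory buffer instead of a file: reading the CSV back with default settings infers an integer type for `category3` again, while passing `dtype` keeps it as object. How ReNom TDA GUI itself infers column types on load is not covered in this tutorial, so treat this only as a check of what pandas does.

```python
import io

import pandas as pd

# Hand-built sample with category3 already converted to object.
df = pd.DataFrame({
    "category1": ["テスト1", "テスト2", "テスト3", "テスト4", "テスト5"],
    "category3": pd.Series([0, 0, 0, 1, 1], dtype=object),
    "number1": [0.0, 0.2, 0.4, 0.6, 0.8],
})

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Default read: pandas infers the type from the text.
inferred = pd.read_csv(io.StringIO(csv_text))

# Explicit dtype: category3 is kept as object.
restored = pd.read_csv(io.StringIO(csv_text), dtype={"category3": object})

print(inferred.dtypes)
print(restored.dtypes)
```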