リレーショナルデータベース(Postgresql)をAI開発に活用

企業内のマスターデータの保管先であるリレーショナルデーターベースを有効活用し、AI開発を活性化させる。

企業内で、AI開発を目指す方に役立つ情報を発信する為に、自らが気付いたテクニカルなポイントに対して、備忘ログとして残すとともに、活用いただく為に公開します。

キーワード:PostgreSQL, pyhon, SQL, AI development, psycopg2.

Overview of the notebook

企業内のマスターデータの保管先であるリレーショナルデーターベースを有効活用し、AI開発を活性化させる。

General AI development

・課題に応じて、マスターデータの保管先であるリレーショナルデータベースからデータ抽出し、CSVファイル化

・作成したCSVファイルを、AI開発用環境上へ配置

・Pythonのノウハウを使ってAI開発

Conventional issue

開発を行う場合、効率を考えDockerを利用するが、データ管理が複雑

・データ抽出や、CSVファイルの配置などが手間

・Pythonの勉強をして、データの前処理をプログラミングする必要

Solution of the conventional problem

リレーショナルデータベース上のデータを、Python上のプログラムから直接IOし、課題を解決

・DockerコンテナのPythonプログラムから直接IOでき簡単

・ネットワーク経由でIOする事で、Dockerコンテナーに関する複雑性を排除

・従来のSQL文の知識を使って、レコード抽出、項目選択、欠損データ保管処理等が実施可能

Precondition

リレーショナルデータベースとしては、OSSであり、私の周りでAI開発における利用実績が多い、Postgresqlの使い紹介

・各ソフトウエアのバージョンは、普段利用しているAIソフトウエアGRID社の"ReNom"を参考に決定

・ドライバーに関しては、Webを検索した際の利用動向から独断で決定

environment

・kaggleのタイタニック号の生存者に関する情報をサンプルとして利用します。Dataタグから "train.csv" をダウンロード

[データソース] ( https://www.kaggle.com/c/titanic )

・"titanic_train" という名前で、ネットワーク経由でアクセスできるPostgresql上にテーブルを作成し、データを配置する。

・ダウンロードしたデータを事前にインポートしておく

コンテナ環境

・ubunts 16.04 LTS ・Python:3.6 ・pandas:0.20.3 ・psycopg2:2.7.3.2( https://qiita.com/takahi/items/c9b7fc01cdccd4b7f661 )

sample

In [1]:
# Install "psycopg2" which is common as a driver for Postgresql.
In [2]:
#!/usr/bin/env python
# encoding:utf-8

# Import a library "psycopg 2".
import psycopg2

# Pandas library for relational database prepare.
import pandas.io.sql as psql
In [3]:
# Connect to the Postgresql server via the network.
# (when using Docker, there is no need to be aware of the folder of the source data.)

con = psycopg2.connect("host=(IP) port=(Port) dbname=(DB Name) user=(User) password=(Pass)")
con.get_backend_pid()
cur = con.cursor()
In [4]:
# Of the survivor data of Kaggle's Titanic, download training data in advance and insert it
# into the Postgresql database with the table titled "titanic_train".
In [5]:
# Select in the data frame of Pandas with no prepation.
sql = """select * from titanic_train;"""
df_train_org = psql.read_sql(sql, con)

df_train_org
Out[5]:
passengerid survived pclass name sex age sibsp parch ticket fare cabin embarked
0 1.0 0.0 3.0 Braund, Mr. Owen Harris male 22.0 1.0 0.0 A/5 21171 7.2500 None S
1 2.0 1.0 1.0 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1.0 0.0 PC 17599 71.2833 C85 C
2 3.0 1.0 3.0 Heikkinen, Miss. Laina female 26.0 0.0 0.0 STON/O2. 3101282 7.9250 None S
3 4.0 1.0 1.0 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1.0 0.0 113803 53.1000 C123 S
4 5.0 0.0 3.0 Allen, Mr. William Henry male 35.0 0.0 0.0 373450 8.0500 None S
5 6.0 0.0 3.0 Moran, Mr. James male NaN 0.0 0.0 330877 8.4583 None Q
6 7.0 0.0 1.0 McCarthy, Mr. Timothy J male 54.0 0.0 0.0 17463 51.8625 E46 S
7 8.0 0.0 3.0 Palsson, Master. Gosta Leonard male 2.0 3.0 1.0 349909 21.0750 None S
8 9.0 1.0 3.0 Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) female 27.0 0.0 2.0 347742 11.1333 None S
9 10.0 1.0 2.0 Nasser, Mrs. Nicholas (Adele Achem) female 14.0 1.0 0.0 237736 30.0708 None C
10 11.0 1.0 3.0 Sandstrom, Miss. Marguerite Rut female 4.0 1.0 1.0 PP 9549 16.7000 G6 S
11 12.0 1.0 1.0 Bonnell, Miss. Elizabeth female 58.0 0.0 0.0 113783 26.5500 C103 S
12 13.0 0.0 3.0 Saundercock, Mr. William Henry male 20.0 0.0 0.0 A/5. 2151 8.0500 None S
13 14.0 0.0 3.0 Andersson, Mr. Anders Johan male 39.0 1.0 5.0 347082 31.2750 None S
14 15.0 0.0 3.0 Vestrom, Miss. Hulda Amanda Adolfina female 14.0 0.0 0.0 350406 7.8542 None S
15 16.0 1.0 2.0 Hewlett, Mrs. (Mary D Kingcome) female 55.0 0.0 0.0 248706 16.0000 None S
16 17.0 0.0 3.0 Rice, Master. Eugene male 2.0 4.0 1.0 382652 29.1250 None Q
17 18.0 1.0 2.0 Williams, Mr. Charles Eugene male NaN 0.0 0.0 244373 13.0000 None S
18 19.0 0.0 3.0 Vander Planke, Mrs. Julius (Emelia Maria Vande... female 31.0 1.0 0.0 345763 18.0000 None S
19 20.0 1.0 3.0 Masselmani, Mrs. Fatima female NaN 0.0 0.0 2649 7.2250 None C
20 21.0 0.0 2.0 Fynney, Mr. Joseph J male 35.0 0.0 0.0 239865 26.0000 None S
21 22.0 1.0 2.0 Beesley, Mr. Lawrence male 34.0 0.0 0.0 248698 13.0000 D56 S
22 23.0 1.0 3.0 McGowan, Miss. Anna "Annie" female 15.0 0.0 0.0 330923 8.0292 None Q
23 24.0 1.0 1.0 Sloper, Mr. William Thompson male 28.0 0.0 0.0 113788 35.5000 A6 S
24 25.0 0.0 3.0 Palsson, Miss. Torborg Danira female 8.0 3.0 1.0 349909 21.0750 None S
25 26.0 1.0 3.0 Asplund, Mrs. Carl Oscar (Selma Augusta Emilia... female 38.0 1.0 5.0 347077 31.3875 None S
26 27.0 0.0 3.0 Emir, Mr. Farred Chehab male NaN 0.0 0.0 2631 7.2250 None C
27 28.0 0.0 1.0 Fortune, Mr. Charles Alexander male 19.0 3.0 2.0 19950 263.0000 C23 C25 C27 S
28 29.0 1.0 3.0 O'Dwyer, Miss. Ellen "Nellie" female NaN 0.0 0.0 330959 7.8792 None Q
29 30.0 0.0 3.0 Todoroff, Mr. Lalio male NaN 0.0 0.0 349216 7.8958 None S
... ... ... ... ... ... ... ... ... ... ... ... ...
861 862.0 0.0 2.0 Giles, Mr. Frederick Edward male 21.0 1.0 0.0 28134 11.5000 None S
862 863.0 1.0 1.0 Swift, Mrs. Frederick Joel (Margaret Welles Ba... female 48.0 0.0 0.0 17466 25.9292 D17 S
863 864.0 0.0 3.0 Sage, Miss. Dorothy Edith "Dolly" female NaN 8.0 2.0 CA. 2343 69.5500 None S
864 865.0 0.0 2.0 Gill, Mr. John William male 24.0 0.0 0.0 233866 13.0000 None S
865 866.0 1.0 2.0 Bystrom, Mrs. (Karolina) female 42.0 0.0 0.0 236852 13.0000 None S
866 867.0 1.0 2.0 Duran y More, Miss. Asuncion female 27.0 1.0 0.0 SC/PARIS 2149 13.8583 None C
867 868.0 0.0 1.0 Roebling, Mr. Washington Augustus II male 31.0 0.0 0.0 PC 17590 50.4958 A24 S
868 869.0 0.0 3.0 van Melkebeke, Mr. Philemon male NaN 0.0 0.0 345777 9.5000 None S
869 870.0 1.0 3.0 Johnson, Master. Harold Theodor male 4.0 1.0 1.0 347742 11.1333 None S
870 871.0 0.0 3.0 Balkic, Mr. Cerin male 26.0 0.0 0.0 349248 7.8958 None S
871 872.0 1.0 1.0 Beckwith, Mrs. Richard Leonard (Sallie Monypeny) female 47.0 1.0 1.0 11751 52.5542 D35 S
872 873.0 0.0 1.0 Carlsson, Mr. Frans Olof male 33.0 0.0 0.0 695 5.0000 B51 B53 B55 S
873 874.0 0.0 3.0 Vander Cruyssen, Mr. Victor male 47.0 0.0 0.0 345765 9.0000 None S
874 875.0 1.0 2.0 Abelson, Mrs. Samuel (Hannah Wizosky) female 28.0 1.0 0.0 P/PP 3381 24.0000 None C
875 876.0 1.0 3.0 Najib, Miss. Adele Kiamie "Jane" female 15.0 0.0 0.0 2667 7.2250 None C
876 877.0 0.0 3.0 Gustafsson, Mr. Alfred Ossian male 20.0 0.0 0.0 7534 9.8458 None S
877 878.0 0.0 3.0 Petroff, Mr. Nedelio male 19.0 0.0 0.0 349212 7.8958 None S
878 879.0 0.0 3.0 Laleff, Mr. Kristo male NaN 0.0 0.0 349217 7.8958 None S
879 880.0 1.0 1.0 Potter, Mrs. Thomas Jr (Lily Alexenia Wilson) female 56.0 0.0 1.0 11767 83.1583 C50 C
880 881.0 1.0 2.0 Shelley, Mrs. William (Imanita Parrish Hall) female 25.0 0.0 1.0 230433 26.0000 None S
881 882.0 0.0 3.0 Markun, Mr. Johann male 33.0 0.0 0.0 349257 7.8958 None S
882 883.0 0.0 3.0 Dahlberg, Miss. Gerda Ulrika female 22.0 0.0 0.0 7552 10.5167 None S
883 884.0 0.0 2.0 Banfield, Mr. Frederick James male 28.0 0.0 0.0 C.A./SOTON 34068 10.5000 None S
884 885.0 0.0 3.0 Sutehall, Mr. Henry Jr male 25.0 0.0 0.0 SOTON/OQ 392076 7.0500 None S
885 886.0 0.0 3.0 Rice, Mrs. William (Margaret Norton) female 39.0 0.0 5.0 382652 29.1250 None Q
886 887.0 0.0 2.0 Montvila, Rev. Juozas male 27.0 0.0 0.0 211536 13.0000 None S
887 888.0 1.0 1.0 Graham, Miss. Margaret Edith female 19.0 0.0 0.0 112053 30.0000 B42 S
888 889.0 0.0 3.0 Johnston, Miss. Catherine Helen "Carrie" female NaN 1.0 2.0 W./C. 6607 23.4500 None S
889 890.0 1.0 1.0 Behr, Mr. Karl Howell male 26.0 0.0 0.0 111369 30.0000 C148 C
890 891.0 0.0 3.0 Dooley, Mr. Patrick male 32.0 0.0 0.0 370376 7.7500 None Q

891 rows × 12 columns

In [6]:
# Change the SQL statement only for items necessary for analysis and read.
sql = """select passengerid, pclass, sex, age, sibsp, parch, ticket, fare, cabin, embarked from titanic_train;"""
df_train_01 = psql.read_sql(sql, con)

df_train_01
Out[6]:
passengerid pclass sex age sibsp parch ticket fare cabin embarked
0 1.0 3.0 male 22.0 1.0 0.0 A/5 21171 7.2500 None S
1 2.0 1.0 female 38.0 1.0 0.0 PC 17599 71.2833 C85 C
2 3.0 3.0 female 26.0 0.0 0.0 STON/O2. 3101282 7.9250 None S
3 4.0 1.0 female 35.0 1.0 0.0 113803 53.1000 C123 S
4 5.0 3.0 male 35.0 0.0 0.0 373450 8.0500 None S
5 6.0 3.0 male NaN 0.0 0.0 330877 8.4583 None Q
6 7.0 1.0 male 54.0 0.0 0.0 17463 51.8625 E46 S
7 8.0 3.0 male 2.0 3.0 1.0 349909 21.0750 None S
8 9.0 3.0 female 27.0 0.0 2.0 347742 11.1333 None S
9 10.0 2.0 female 14.0 1.0 0.0 237736 30.0708 None C
10 11.0 3.0 female 4.0 1.0 1.0 PP 9549 16.7000 G6 S
11 12.0 1.0 female 58.0 0.0 0.0 113783 26.5500 C103 S
12 13.0 3.0 male 20.0 0.0 0.0 A/5. 2151 8.0500 None S
13 14.0 3.0 male 39.0 1.0 5.0 347082 31.2750 None S
14 15.0 3.0 female 14.0 0.0 0.0 350406 7.8542 None S
15 16.0 2.0 female 55.0 0.0 0.0 248706 16.0000 None S
16 17.0 3.0 male 2.0 4.0 1.0 382652 29.1250 None Q
17 18.0 2.0 male NaN 0.0 0.0 244373 13.0000 None S
18 19.0 3.0 female 31.0 1.0 0.0 345763 18.0000 None S
19 20.0 3.0 female NaN 0.0 0.0 2649 7.2250 None C
20 21.0 2.0 male 35.0 0.0 0.0 239865 26.0000 None S
21 22.0 2.0 male 34.0 0.0 0.0 248698 13.0000 D56 S
22 23.0 3.0 female 15.0 0.0 0.0 330923 8.0292 None Q
23 24.0 1.0 male 28.0 0.0 0.0 113788 35.5000 A6 S
24 25.0 3.0 female 8.0 3.0 1.0 349909 21.0750 None S
25 26.0 3.0 female 38.0 1.0 5.0 347077 31.3875 None S
26 27.0 3.0 male NaN 0.0 0.0 2631 7.2250 None C
27 28.0 1.0 male 19.0 3.0 2.0 19950 263.0000 C23 C25 C27 S
28 29.0 3.0 female NaN 0.0 0.0 330959 7.8792 None Q
29 30.0 3.0 male NaN 0.0 0.0 349216 7.8958 None S
... ... ... ... ... ... ... ... ... ... ...
861 862.0 2.0 male 21.0 1.0 0.0 28134 11.5000 None S
862 863.0 1.0 female 48.0 0.0 0.0 17466 25.9292 D17 S
863 864.0 3.0 female NaN 8.0 2.0 CA. 2343 69.5500 None S
864 865.0 2.0 male 24.0 0.0 0.0 233866 13.0000 None S
865 866.0 2.0 female 42.0 0.0 0.0 236852 13.0000 None S
866 867.0 2.0 female 27.0 1.0 0.0 SC/PARIS 2149 13.8583 None C
867 868.0 1.0 male 31.0 0.0 0.0 PC 17590 50.4958 A24 S
868 869.0 3.0 male NaN 0.0 0.0 345777 9.5000 None S
869 870.0 3.0 male 4.0 1.0 1.0 347742 11.1333 None S
870 871.0 3.0 male 26.0 0.0 0.0 349248 7.8958 None S
871 872.0 1.0 female 47.0 1.0 1.0 11751 52.5542 D35 S
872 873.0 1.0 male 33.0 0.0 0.0 695 5.0000 B51 B53 B55 S
873 874.0 3.0 male 47.0 0.0 0.0 345765 9.0000 None S
874 875.0 2.0 female 28.0 1.0 0.0 P/PP 3381 24.0000 None C
875 876.0 3.0 female 15.0 0.0 0.0 2667 7.2250 None C
876 877.0 3.0 male 20.0 0.0 0.0 7534 9.8458 None S
877 878.0 3.0 male 19.0 0.0 0.0 349212 7.8958 None S
878 879.0 3.0 male NaN 0.0 0.0 349217 7.8958 None S
879 880.0 1.0 female 56.0 0.0 1.0 11767 83.1583 C50 C
880 881.0 2.0 female 25.0 0.0 1.0 230433 26.0000 None S
881 882.0 3.0 male 33.0 0.0 0.0 349257 7.8958 None S
882 883.0 3.0 female 22.0 0.0 0.0 7552 10.5167 None S
883 884.0 2.0 male 28.0 0.0 0.0 C.A./SOTON 34068 10.5000 None S
884 885.0 3.0 male 25.0 0.0 0.0 SOTON/OQ 392076 7.0500 None S
885 886.0 3.0 female 39.0 0.0 5.0 382652 29.1250 None Q
886 887.0 2.0 male 27.0 0.0 0.0 211536 13.0000 None S
887 888.0 1.0 female 19.0 0.0 0.0 112053 30.0000 B42 S
888 889.0 3.0 female NaN 1.0 2.0 W./C. 6607 23.4500 None S
889 890.0 1.0 male 26.0 0.0 0.0 111369 30.0000 C148 C
890 891.0 3.0 male 32.0 0.0 0.0 370376 7.7500 None Q

891 rows × 10 columns

In [7]:
# If there is an important missing item and you want to skip that item, correct the SQL sentence and skip it.

sql = """select * from titanic_train where age <> 'NaN';"""
df_train_02 = psql.read_sql(sql, con)

df_train_02
Out[7]:
passengerid survived pclass name sex age sibsp parch ticket fare cabin embarked
0 1.0 0.0 3.0 Braund, Mr. Owen Harris male 22.0 1.0 0.0 A/5 21171 7.2500 None S
1 2.0 1.0 1.0 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1.0 0.0 PC 17599 71.2833 C85 C
2 3.0 1.0 3.0 Heikkinen, Miss. Laina female 26.0 0.0 0.0 STON/O2. 3101282 7.9250 None S
3 4.0 1.0 1.0 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1.0 0.0 113803 53.1000 C123 S
4 5.0 0.0 3.0 Allen, Mr. William Henry male 35.0 0.0 0.0 373450 8.0500 None S
5 7.0 0.0 1.0 McCarthy, Mr. Timothy J male 54.0 0.0 0.0 17463 51.8625 E46 S
6 8.0 0.0 3.0 Palsson, Master. Gosta Leonard male 2.0 3.0 1.0 349909 21.0750 None S
7 9.0 1.0 3.0 Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) female 27.0 0.0 2.0 347742 11.1333 None S
8 10.0 1.0 2.0 Nasser, Mrs. Nicholas (Adele Achem) female 14.0 1.0 0.0 237736 30.0708 None C
9 11.0 1.0 3.0 Sandstrom, Miss. Marguerite Rut female 4.0 1.0 1.0 PP 9549 16.7000 G6 S
10 12.0 1.0 1.0 Bonnell, Miss. Elizabeth female 58.0 0.0 0.0 113783 26.5500 C103 S
11 13.0 0.0 3.0 Saundercock, Mr. William Henry male 20.0 0.0 0.0 A/5. 2151 8.0500 None S
12 14.0 0.0 3.0 Andersson, Mr. Anders Johan male 39.0 1.0 5.0 347082 31.2750 None S
13 15.0 0.0 3.0 Vestrom, Miss. Hulda Amanda Adolfina female 14.0 0.0 0.0 350406 7.8542 None S
14 16.0 1.0 2.0 Hewlett, Mrs. (Mary D Kingcome) female 55.0 0.0 0.0 248706 16.0000 None S
15 17.0 0.0 3.0 Rice, Master. Eugene male 2.0 4.0 1.0 382652 29.1250 None Q
16 19.0 0.0 3.0 Vander Planke, Mrs. Julius (Emelia Maria Vande... female 31.0 1.0 0.0 345763 18.0000 None S
17 21.0 0.0 2.0 Fynney, Mr. Joseph J male 35.0 0.0 0.0 239865 26.0000 None S
18 22.0 1.0 2.0 Beesley, Mr. Lawrence male 34.0 0.0 0.0 248698 13.0000 D56 S
19 23.0 1.0 3.0 McGowan, Miss. Anna "Annie" female 15.0 0.0 0.0 330923 8.0292 None Q
20 24.0 1.0 1.0 Sloper, Mr. William Thompson male 28.0 0.0 0.0 113788 35.5000 A6 S
21 25.0 0.0 3.0 Palsson, Miss. Torborg Danira female 8.0 3.0 1.0 349909 21.0750 None S
22 26.0 1.0 3.0 Asplund, Mrs. Carl Oscar (Selma Augusta Emilia... female 38.0 1.0 5.0 347077 31.3875 None S
23 28.0 0.0 1.0 Fortune, Mr. Charles Alexander male 19.0 3.0 2.0 19950 263.0000 C23 C25 C27 S
24 31.0 0.0 1.0 Uruchurtu, Don. Manuel E male 40.0 0.0 0.0 PC 17601 27.7208 None C
25 34.0 0.0 2.0 Wheadon, Mr. Edward H male 66.0 0.0 0.0 C.A. 24579 10.5000 None S
26 35.0 0.0 1.0 Meyer, Mr. Edgar Joseph male 28.0 1.0 0.0 PC 17604 82.1708 None C
27 36.0 0.0 1.0 Holverson, Mr. Alexander Oskar male 42.0 1.0 0.0 113789 52.0000 None S
28 38.0 0.0 3.0 Cann, Mr. Ernest Charles male 21.0 0.0 0.0 A./5. 2152 8.0500 None S
29 39.0 0.0 3.0 Vander Planke, Miss. Augusta Maria female 18.0 2.0 0.0 345764 18.0000 None S
... ... ... ... ... ... ... ... ... ... ... ... ...
684 857.0 1.0 1.0 Wick, Mrs. George Dennick (Mary Hitchcock) female 45.0 1.0 1.0 36928 164.8667 None S
685 858.0 1.0 1.0 Daly, Mr. Peter Denis male 51.0 0.0 0.0 113055 26.5500 E17 S
686 859.0 1.0 3.0 Baclini, Mrs. Solomon (Latifa Qurban) female 24.0 0.0 3.0 2666 19.2583 None C
687 861.0 0.0 3.0 Hansen, Mr. Claus Peter male 41.0 2.0 0.0 350026 14.1083 None S
688 862.0 0.0 2.0 Giles, Mr. Frederick Edward male 21.0 1.0 0.0 28134 11.5000 None S
689 863.0 1.0 1.0 Swift, Mrs. Frederick Joel (Margaret Welles Ba... female 48.0 0.0 0.0 17466 25.9292 D17 S
690 865.0 0.0 2.0 Gill, Mr. John William male 24.0 0.0 0.0 233866 13.0000 None S
691 866.0 1.0 2.0 Bystrom, Mrs. (Karolina) female 42.0 0.0 0.0 236852 13.0000 None S
692 867.0 1.0 2.0 Duran y More, Miss. Asuncion female 27.0 1.0 0.0 SC/PARIS 2149 13.8583 None C
693 868.0 0.0 1.0 Roebling, Mr. Washington Augustus II male 31.0 0.0 0.0 PC 17590 50.4958 A24 S
694 870.0 1.0 3.0 Johnson, Master. Harold Theodor male 4.0 1.0 1.0 347742 11.1333 None S
695 871.0 0.0 3.0 Balkic, Mr. Cerin male 26.0 0.0 0.0 349248 7.8958 None S
696 872.0 1.0 1.0 Beckwith, Mrs. Richard Leonard (Sallie Monypeny) female 47.0 1.0 1.0 11751 52.5542 D35 S
697 873.0 0.0 1.0 Carlsson, Mr. Frans Olof male 33.0 0.0 0.0 695 5.0000 B51 B53 B55 S
698 874.0 0.0 3.0 Vander Cruyssen, Mr. Victor male 47.0 0.0 0.0 345765 9.0000 None S
699 875.0 1.0 2.0 Abelson, Mrs. Samuel (Hannah Wizosky) female 28.0 1.0 0.0 P/PP 3381 24.0000 None C
700 876.0 1.0 3.0 Najib, Miss. Adele Kiamie "Jane" female 15.0 0.0 0.0 2667 7.2250 None C
701 877.0 0.0 3.0 Gustafsson, Mr. Alfred Ossian male 20.0 0.0 0.0 7534 9.8458 None S
702 878.0 0.0 3.0 Petroff, Mr. Nedelio male 19.0 0.0 0.0 349212 7.8958 None S
703 880.0 1.0 1.0 Potter, Mrs. Thomas Jr (Lily Alexenia Wilson) female 56.0 0.0 1.0 11767 83.1583 C50 C
704 881.0 1.0 2.0 Shelley, Mrs. William (Imanita Parrish Hall) female 25.0 0.0 1.0 230433 26.0000 None S
705 882.0 0.0 3.0 Markun, Mr. Johann male 33.0 0.0 0.0 349257 7.8958 None S
706 883.0 0.0 3.0 Dahlberg, Miss. Gerda Ulrika female 22.0 0.0 0.0 7552 10.5167 None S
707 884.0 0.0 2.0 Banfield, Mr. Frederick James male 28.0 0.0 0.0 C.A./SOTON 34068 10.5000 None S
708 885.0 0.0 3.0 Sutehall, Mr. Henry Jr male 25.0 0.0 0.0 SOTON/OQ 392076 7.0500 None S
709 886.0 0.0 3.0 Rice, Mrs. William (Margaret Norton) female 39.0 0.0 5.0 382652 29.1250 None Q
710 887.0 0.0 2.0 Montvila, Rev. Juozas male 27.0 0.0 0.0 211536 13.0000 None S
711 888.0 1.0 1.0 Graham, Miss. Margaret Edith female 19.0 0.0 0.0 112053 30.0000 B42 S
712 890.0 1.0 1.0 Behr, Mr. Karl Howell male 26.0 0.0 0.0 111369 30.0000 C148 C
713 891.0 0.0 3.0 Dooley, Mr. Patrick male 32.0 0.0 0.0 370376 7.7500 None Q

714 rows × 12 columns

In [8]:
# Easy to read the superviser item into another data frame.

sql = """select survived from titanic_train order by passengerid;"""
df_train_03_y =  psql.read_sql(sql, con)

df_train_03_y
Out[8]:
survived
0 0.0
1 1.0
2 1.0
3 1.0
4 0.0
5 0.0
6 0.0
7 0.0
8 1.0
9 1.0
10 1.0
11 1.0
12 0.0
13 0.0
14 0.0
15 1.0
16 0.0
17 1.0
18 0.0
19 1.0
20 0.0
21 1.0
22 1.0
23 1.0
24 0.0
25 1.0
26 0.0
27 0.0
28 1.0
29 0.0
... ...
861 0.0
862 1.0
863 0.0
864 0.0
865 1.0
866 1.0
867 0.0
868 0.0
869 1.0
870 0.0
871 1.0
872 0.0
873 0.0
874 1.0
875 1.0
876 0.0
877 0.0
878 0.0
879 1.0
880 1.0
881 0.0
882 0.0
883 0.0
884 0.0
885 0.0
886 0.0
887 1.0
888 0.0
889 1.0
890 0.0

891 rows × 1 columns