
[Python] Data visualization and correlation analysis: drawing a heat map and a pairplot

우주먼지의하루 2020. 3. 22. 19:48

The data comes from the Boston Marathon dataset on Kaggle.

 

https://www.kaggle.com/rojour/boston-results

 

Finishers Boston Marathon 2015, 2016 & 2017

This data has the names, times and general demographics of the finishers


 

 

python(heatmap)
In [12]:
# Tistory-related display code (not needed for the analysis)
from IPython.display import display, HTML
display(HTML("<style>.container {width:90% !important;}</style>"))
In [1]:
import pandas as pd
In [2]:
marathon_2017 = pd.read_csv("C:/Users/User/Desktop/boston-results/marathon_results_2017.csv")
In [10]:
marathon_2017.head()
Out[10]:
   Unnamed: 0  Bib              Name  Age  M/F           City  State  Country  Citizen  Unnamed: 9  ...      25K      30K      35K      40K     Pace  Proj Time  Official Time  Overall  Gender  Division
0           0   11   Kirui, Geoffrey   24    M       Keringet    NaN      KEN      NaN         NaN  ...  1:16:59  1:33:01  1:48:19  2:02:53  0:04:57          -        2:09:37        1       1         1
1           1   17       Rupp, Galen   30    M       Portland     OR      USA      NaN         NaN  ...  1:16:59  1:33:01  1:48:19  2:03:14  0:04:58          -        2:09:58        2       2         2
2           2   23     Osako, Suguru   25    M   Machida-City    NaN      JPN      NaN         NaN  ...  1:17:00  1:33:01  1:48:31  2:03:38  0:04:59          -        2:10:28        3       3         3
3           3   21  Biwott, Shadrack   32    M  Mammoth Lakes     CA      USA      NaN         NaN  ...  1:17:00  1:33:01  1:48:58  2:04:35  0:05:03          -        2:12:08        4       4         4
4           4    9    Chebet, Wilson   31    M       Marakwet    NaN      KEN      NaN         NaN  ...  1:16:59  1:33:01  1:48:41  2:05:00  0:05:04          -        2:12:35        5       5         5

5 rows × 25 columns

In [3]:
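# numeric_only=True (pandas 1.5+) limits the correlation to numeric columns;
# pandas 2.0+ raises an error on the string columns without it
marathon_2017.corr(numeric_only=True)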
Out[3]:
            Unnamed: 0        Age    Overall     Gender   Division
Unnamed: 0    1.000000   0.259623   1.000000   0.902079   0.449813
Age           0.259623   1.000000   0.259626   0.368879  -0.579735
Overall       1.000000   0.259626   1.000000   0.902077   0.449812
Gender        0.902079   0.368879   0.902077   1.000000   0.403068
Division      0.449813  -0.579735   0.449812   0.403068   1.000000
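
Only five columns show up in the correlation table because the split and finish times are stored as strings such as "2:09:37". Also, "Unnamed: 0" correlates perfectly with "Overall" (1.000000), so it looks like a saved row index rather than a real feature. The sketch below is one possible way to deal with both, not something the original notebook did. It assumes pandas 1.5 or later, and `sample` is a hypothetical stand-in built from a few columns of the five rows shown by head() above, so it runs without the CSV file.

```python
import pandas as pd

# Hypothetical stand-in for marathon_2017: a few columns of the five head() rows.
sample = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, 3, 4],
    "Age": [24, 30, 25, 32, 31],
    "M/F": ["M", "M", "M", "M", "M"],
    "40K": ["2:02:53", "2:03:14", "2:03:38", "2:04:35", "2:05:00"],
    "Official Time": ["2:09:37", "2:09:58", "2:10:28", "2:12:08", "2:12:35"],
    "Overall": [1, 2, 3, 4, 5],
})

# Drop the column that appears to be a saved row index.
sample = sample.drop(columns=["Unnamed: 0"])

# Convert "H:MM:SS" strings to seconds so they count as numeric columns.
# errors="coerce" turns unparseable entries (such as "-") into missing values.
for col in ["40K", "Official Time"]:
    sample[col + " (s)"] = pd.to_timedelta(sample[col], errors="coerce").dt.total_seconds()

# Correlate numeric columns only; the string columns are skipped.
corr = sample.corr(numeric_only=True)
print(corr.round(3))
```

On the full dataset the same loop could be applied to every split column (5K through 40K), after which the times would appear in the heat map as well.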
In [4]:
import seaborn as sns
In [8]:
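# numeric_only=True (pandas 1.5+) keeps the string columns out of the correlation
sns.heatmap(marathon_2017.corr(numeric_only=True), annot=True)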
Out[8]:
<matplotlib.axes._subplots.AxesSubplot at 0x211329c91d0>
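
The default heat map repeats every value twice because the matrix is symmetric, and its color scale depends on the data range. The sketch below shows one way to make it easier to read. The styling choices (`coolwarm`, a fixed -1 to 1 scale, masking the upper triangle, the title text) are my own assumptions, not part of the original notebook. The numbers are the correlation values from Out[3], typed in by hand with the "Unnamed: 0" column left out so the example runs on its own.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Correlation values copied from Out[3], without the "Unnamed: 0" column.
cols = ["Age", "Overall", "Gender", "Division"]
corr = pd.DataFrame(
    [
        [1.000000, 0.259626, 0.368879, -0.579735],
        [0.259626, 1.000000, 0.902077, 0.449812],
        [0.368879, 0.902077, 1.000000, 0.403068],
        [-0.579735, 0.449812, 0.403068, 1.000000],
    ],
    index=cols,
    columns=cols,
)

# True marks cells to hide: everything strictly above the diagonal.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)

fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(
    corr,
    mask=mask,        # hide the duplicated upper triangle
    annot=True,       # write the value in each cell
    fmt=".2f",        # two decimals are enough to read
    cmap="coolwarm",  # diverging colors: negative vs. positive correlation
    vmin=-1, vmax=1,  # fix the scale to the full correlation range
    square=True,
    ax=ax,
)
ax.set_title("Correlation of numeric columns (illustrative styling)")
fig.tight_layout()
```

With the real data, `corr` would simply be `marathon_2017.corr(numeric_only=True)`.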
In [11]:
sns.pairplot(marathon_2017, hue="M/F")
Out[11]:
<seaborn.axisgrid.PairGrid at 0x211339f3668>