The data comes from the Boston Marathon results dataset on Kaggle.
https://www.kaggle.com/rojour/boston-results
In [29]:
import pandas as pd
pd.set_option('display.max_columns',500) # display up to 500 columns without truncation
In [63]:
# Tistory-specific display code (not needed otherwise)
from IPython.display import display, HTML
display(HTML("<style>.container {width:90% !important;}</style>"))
In [41]:
marathon_2015 = pd.read_csv("C:/Users/User/Desktop/boston-results/marathon_results_2015.csv")
marathon_2016 = pd.read_csv("C:/Users/User/Desktop/boston-results/marathon_results_2016.csv")
marathon_2017 = pd.read_csv("C:/Users/User/Desktop/boston-results/marathon_results_2017.csv")
line plot
matplotlib
In [31]:
from matplotlib import pyplot as plt
In [32]:
# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
pd.concat([marathon_2017.head(), marathon_2017.tail()])
Out[32]:
In [42]:
# Convert using pandas to_timedelta method
marathon_2017['5K'] = pd.to_timedelta(marathon_2017['5K'])
marathon_2017['10K'] = pd.to_timedelta(marathon_2017['10K'])
marathon_2017['15K'] = pd.to_timedelta(marathon_2017['15K'])
marathon_2017['20K'] = pd.to_timedelta(marathon_2017['20K'])
marathon_2017['Half'] = pd.to_timedelta(marathon_2017['Half'])
marathon_2017['25K'] = pd.to_timedelta(marathon_2017['25K'])
marathon_2017['30K'] = pd.to_timedelta(marathon_2017['30K'])
marathon_2017['35K'] = pd.to_timedelta(marathon_2017['35K'])
marathon_2017['40K'] = pd.to_timedelta(marathon_2017['40K'])
marathon_2017['Pace'] = pd.to_timedelta(marathon_2017['Pace'])
marathon_2017['Official Time'] = pd.to_timedelta(marathon_2017['Official Time'])
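The cell above repeats the same call once per column. As a minimal sketch of a shorter form, the conversion can be written as a loop over a list of column names. The `sample` frame and its values here are hypothetical stand-ins for `marathon_2017`, and only three of the time columns are shown.

```python
import pandas as pd

# Hypothetical two-row sample that mimics the split-time columns.
sample = pd.DataFrame({
    "5K": ["0:15:25", "0:15:24"],
    "10K": ["0:30:28", "0:30:27"],
    "Official Time": ["2:09:37", "2:09:58"],
})

# Columns holding "H:MM:SS" strings that should become timedeltas.
time_columns = ["5K", "10K", "Official Time"]

# Convert each column with pd.to_timedelta instead of repeating the line.
for col in time_columns:
    sample[col] = pd.to_timedelta(sample[col])

print(sample.dtypes)
```

With the real data, `time_columns` would list all eleven columns converted in the cell above.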
In [26]:
import numpy as np
In [43]:
# Convert time to seconds value using astype method
marathon_2017['5K'] = marathon_2017['5K'].astype('m8[s]').astype(np.int64)
marathon_2017['10K'] = marathon_2017['10K'].astype('m8[s]').astype(np.int64)
marathon_2017['15K'] = marathon_2017['15K'].astype('m8[s]').astype(np.int64)
marathon_2017['20K'] = marathon_2017['20K'].astype('m8[s]').astype(np.int64)
marathon_2017['Half'] = marathon_2017['Half'].astype('m8[s]').astype(np.int64)
marathon_2017['25K'] = marathon_2017['25K'].astype('m8[s]').astype(np.int64)
marathon_2017['30K'] = marathon_2017['30K'].astype('m8[s]').astype(np.int64)
marathon_2017['35K'] = marathon_2017['35K'].astype('m8[s]').astype(np.int64)
marathon_2017['40K'] = marathon_2017['40K'].astype('m8[s]').astype(np.int64)
marathon_2017['Pace'] = marathon_2017['Pace'].astype('m8[s]').astype(np.int64)
marathon_2017['Official Time'] = marathon_2017['Official Time'].astype('m8[s]').astype(np.int64)
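The double `astype` cast above works, but it is not obvious at a glance that the result is in seconds. A possibly more readable alternative is the `dt.total_seconds()` accessor, sketched here on a small hypothetical Series rather than on `marathon_2017` itself.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values; the notebook applies this to marathon_2017 columns.
splits = pd.Series(pd.to_timedelta(["0:15:25", "1:04:35"]))

# dt.total_seconds() returns float seconds; cast to int64 for whole seconds.
seconds = splits.dt.total_seconds().astype(np.int64)

print(seconds.tolist())  # [925, 3875]
```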
In [44]:
marathon_2017.head()
Out[44]:
In [56]:
plt.figure(figsize = (30,10))
plt.plot(marathon_2017.index, marathon_2017['5K'],label = '5K')
plt.plot(marathon_2017.index, marathon_2017['10K'], label = '10K')
plt.plot(marathon_2017.index, marathon_2017['Half'], label = 'Half')
plt.title("Split times in seconds")
plt.ylabel('time (seconds)', fontsize=14)
plt.xlabel('row index', fontsize=14)
plt.legend(loc='upper right')
plt.show()
seaborn
In [59]:
import seaborn as sns
In [61]:
plt.figure(figsize=(30,10))
sns.lineplot(x=marathon_2017.index, y=marathon_2017['5K'] )
Out[61]:
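The seaborn cell above draws only the 5K column. To put several splits on one seaborn plot, as in the matplotlib version, one common approach is to reshape the data into long form first. The sketch below shows only that reshape with `DataFrame.melt`. The `wide` frame and the `runner`, `split`, and `seconds` names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical seconds values standing in for marathon_2017 after conversion.
wide = pd.DataFrame({
    "5K": [925, 924, 925],
    "10K": [1828, 1827, 1829],
    "Half": [3875, 3875, 3876],
})

# Keep the row index as an explicit column, then stack the split columns
# so that each row holds one (runner, split, seconds) observation.
long_df = (
    wide.reset_index()
        .rename(columns={"index": "runner"})
        .melt(id_vars="runner", var_name="split", value_name="seconds")
)

print(long_df.head())

# long_df could then be passed to seaborn, for example:
# sns.lineplot(data=long_df, x="runner", y="seconds", hue="split")
```

The `hue` argument would then give each split its own line and legend entry.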