Notes on the Kaggle Tutorials – Python量化投资

Compiled and adapted from https://www.kaggle.com/learn/overview

pandas

  • import
import pandas as pd
  • read data
# save filepath to variable for easier access
melbourne_file_path = '../input/melbourne-housing-snapshot/melb_data.csv'
# read the data and store data in DataFrame titled melbourne_data
melbourne_data = pd.read_csv(melbourne_file_path) 
# print a summary of the data in Melbourne data
melbourne_data.describe()
  • show columns
melbourne_data.columns
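  • choose features and prediction target
melbourne_features = ['Rooms', 'Bathroom', 'Landsize', 'Lattitude', 'Longtitude']
X = melbourne_data[melbourne_features]
# Prediction target used as y below (assumes the dataset's 'Price' column)
y = melbourne_data['Price']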
  • quickly review the data
X.describe()
X.head()
X.tail()
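
The CSV path above only exists inside a Kaggle notebook, so here is a minimal sketch of the same select-and-review steps on a tiny in-memory DataFrame. The rows are made up, the names `toy_data`, `X_toy` and `y_toy` are hypothetical, and treating `Price` as the prediction target is an assumption. The column spellings 'Lattitude' and 'Longtitude' are copied from the feature list above.

```python
import pandas as pd

# Hypothetical stand-in for melb_data.csv: a few made-up rows that reuse
# the column names from the feature list above.
toy_data = pd.DataFrame({
    "Rooms": [2, 3, 4, 3],
    "Bathroom": [1, 2, 1, 2],
    "Landsize": [202.0, 156.0, 134.0, 120.0],
    "Lattitude": [-37.80, -37.81, -37.79, -37.80],
    "Longtitude": [144.99, 144.99, 145.00, 144.98],
    "Price": [1480000.0, 1035000.0, 1465000.0, 1600000.0],
})

toy_features = ["Rooms", "Bathroom", "Landsize", "Lattitude", "Longtitude"]
X_toy = toy_data[toy_features]   # DataFrame: one column per feature
y_toy = toy_data["Price"]        # Series: the (assumed) prediction target

print(X_toy.shape)        # (rows, columns)
print(X_toy.head(2))      # first two rows
print(X_toy.describe())   # count, mean, std, min, quartiles, max per column
```

Selecting with a list of column names returns a DataFrame, while selecting a single name returns a Series, which is the shape scikit-learn expects for features and target respectively.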

scikit-learn

  1. DecisionTreeRegressor
  • define
from sklearn.tree import DecisionTreeRegressor
# Define model. Specify a number for random_state to ensure same results each run
melbourne_model = DecisionTreeRegressor(random_state=1)
  • fit model
# Fit model
melbourne_model.fit(X, y)
  • predict
melbourne_model.predict(X.head())
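  • Mean Absolute Error (MAE)
    error = actual − predicted; MAE is the mean of the absolute errors
from sklearn.metrics import mean_absolute_error
# Note: these are in-sample predictions (same data the model was fit on)
predicted_home_prices = melbourne_model.predict(X)
mean_absolute_error(y, predicted_home_prices)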
  2. RandomForest
  • define
from sklearn.ensemble import RandomForestRegressor
forest_model = RandomForestRegressor(random_state=1)
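
The notes define the random forest but stop before fitting it, and the MAE shown earlier is computed on the same rows the model was trained on. The following is a minimal sketch of how both models could be fit and compared on held-out data. It assumes synthetic data from `make_regression` rather than the Melbourne dataset, and the names `X_demo`, `y_demo`, `val_mae` and `manual_mae` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption: not the Melbourne housing data)
X_demo, y_demo = make_regression(
    n_samples=300, n_features=5, noise=10.0, random_state=1
)

# Hold out part of the data so the error is measured on unseen rows
train_X, val_X, train_y, val_y = train_test_split(X_demo, y_demo, random_state=1)

models = {
    "decision tree": DecisionTreeRegressor(random_state=1),
    "random forest": RandomForestRegressor(random_state=1),
}

val_mae = {}     # MAE from sklearn.metrics
manual_mae = {}  # MAE computed by hand: mean of |actual - predicted|
for name, model in models.items():
    model.fit(train_X, train_y)
    preds = model.predict(val_X)
    val_mae[name] = mean_absolute_error(val_y, preds)
    manual_mae[name] = np.mean(np.abs(val_y - preds))
    print(f"{name}: validation MAE = {val_mae[name]:.2f}")
```

The hand-computed value matches `mean_absolute_error`, which makes the definition concrete. Which model scores better depends on the data, so the sketch prints both rather than assuming a winner.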

seaborn

  • import
import matplotlib.pyplot as plt
# Notebook-only magic: render figures inline (omit in a plain .py script)
%matplotlib inline
import seaborn as sns
  • plot
# Set the width and height of the figure
plt.figure(figsize=(16,6))
# Add title
plt.title("FIFA rankings")
# Line chart showing how FIFA rankings evolved over time 
sns.lineplot(data=fifa_data)
  • plot subset
# Set the width and height of the figure
plt.figure(figsize=(14,6))
# Add title
plt.title("Daily Global Streams of Popular Songs in 2017-2018")
# Line chart showing daily global streams of 'Shape of You'
sns.lineplot(data=spotify_data['Shape of You'], label="Shape of You")
# Line chart showing daily global streams of 'Despacito'
sns.lineplot(data=spotify_data['Despacito'], label="Despacito")
# Add label for horizontal axis
plt.xlabel("Date")

https://www.jianshu.com/p/ad87b279fad6
