Python for Data Science

Chapter 3 - Regression Models

Segment 1 - Simple linear regression

Linear Regression

Linear regression is a statistical machine learning method you can use to quantify, and make predictions based on, relationships between numerical variables.

  • Simple linear regression
  • Multiple linear regression

Linear Regression Use Cases

  • Sales Forecasting
  • Supply Cost Forecasting
  • Resource Consumption Forecasting
  • Telecom Services Lifecycle Forecasting

Linear Regression Assumptions

  • All variables are continuous numeric, not categorical
  • Data is free of missing values and outliers
  • There's a linear relationship between predictors and predictand
  • All predictors are independent of each other
  • Residuals (or prediction errors) are normally distributed
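
Some of the assumptions above can be checked quickly before fitting a model. The following is a minimal sketch, assuming synthetic data generated in the same way as in this walkthrough; the names `df`, `missing_counts`, and `pearson_r` are hypothetical and chosen only for illustration. It counts missing values and computes the Pearson correlation as a rough indicator of a linear relationship.

```python
import numpy as np
import pandas as pd

# Hypothetical synthetic data, mirroring how this walkthrough builds rooms and price
rng = np.random.default_rng(0)
rooms = 2 * rng.random(100) + 3
price = 265 + 6 * rooms + np.abs(rng.standard_normal(100))

df = pd.DataFrame({"rooms": rooms, "price": price})

# Assumption check: the data should contain no missing values
missing_counts = df.isnull().sum()
print(missing_counts)

# Assumption check: a Pearson r close to +1 or -1 suggests a linear relationship
pearson_r = df["rooms"].corr(df["price"])
print(pearson_r)
```

A correlation close to 1 is only suggestive; a scatter plot of the data is still worth inspecting.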
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sklearn

from pylab import rcParams
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import scale
%matplotlib inline
rcParams['figure.figsize'] = 10,8
rooms = 2*np.random.rand(100,1)+3
rooms[1:10]
array([[3.24615481],
       [4.86219627],
       [3.17742366],
       [3.03114054],
       [3.73270016],
       [3.58047146],
       [3.23240264],
       [4.63462537],
       [3.91227449]])
price = 265 + 6*rooms + abs(np.random.randn(100,1))
price[1:10]
array([[285.23677074],
       [294.79616144],
       [284.85274605],
       [284.40046371],
       [288.07421652],
       [286.60487136],
       [284.55567969],
       [293.27121913],
       [289.12143579]])
plt.plot(rooms,price,'r^')
plt.xlabel("# of Rooms, 2019 Average")
plt.ylabel("2019 Average Home Price, 1000s USD")
plt.show()

[Figure: scatter plot of 2019 average home price (1000s USD) against number of rooms]

X = rooms
y = price

LinReg = LinearRegression()
LinReg.fit(X,y)
print(LinReg.intercept_, LinReg.coef_)
[266.13626468] [[5.9306674]]

Simple Algebra

  • y = mx + b
  • b = intercept = 266.1
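
To see how y = mx + b relates to the fitted model, the sketch below computes a prediction by hand from the slope and intercept and compares it with `predict`. It assumes freshly generated synthetic data with a fixed seed, so the fitted values will differ slightly from the ones printed above; the names `model`, `new_rooms`, `manual`, and `predicted` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical synthetic data, mirroring how this walkthrough builds rooms and price
rng = np.random.default_rng(0)
rooms = 2 * rng.random((100, 1)) + 3
price = 265 + 6 * rooms + np.abs(rng.standard_normal((100, 1)))

model = LinearRegression().fit(rooms, price)

m = model.coef_[0, 0]    # slope: estimated price change per additional room
b = model.intercept_[0]  # intercept: estimated price when rooms = 0

# Predict the price of a 4-room home two ways
new_rooms = np.array([[4.0]])
manual = m * new_rooms[0, 0] + b            # y = mx + b by hand
predicted = model.predict(new_rooms)[0, 0]  # the same value from scikit-learn
print(manual, predicted)
```

Both numbers should agree, since `predict` applies the same linear equation.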

Estimated Coefficients

  • LinReg.coef_ = [[5.93]]: the estimated coefficient (the slope m) for each predictor in the linear regression model.
print(LinReg.score(X,y))
0.961246701242803
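
The value returned by `score` is the coefficient of determination, R². As a rough sketch of what that number means, the code below recomputes it from the residuals. It assumes freshly generated synthetic data with a fixed seed, so the value will not match the output above exactly; the names `model`, `ss_res`, `ss_tot`, and `r_squared` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical synthetic data, mirroring how this walkthrough builds rooms and price
rng = np.random.default_rng(0)
rooms = 2 * rng.random((100, 1)) + 3
price = 265 + 6 * rooms + np.abs(rng.standard_normal((100, 1)))

model = LinearRegression().fit(rooms, price)
predictions = model.predict(rooms)

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum((price - predictions) ** 2)
ss_tot = np.sum((price - price.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(r_squared, model.score(rooms, price))
```

An R² close to 1 means the fitted line explains most of the variance in price.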
Original article: https://www.cnblogs.com/keepmoving1113/p/14317781.html