
Simple Linear Regression (SLR)

Simple Linear Regression

The purpose of regression:
- Describe functional relationships between variables
- Control
- Prediction of outcomes

Simple Linear Regression

The basic concept of regression: describe statistical relationships between variables. A statistical relation has two essential ingredients:
- A tendency of the response variable Y to vary with the predictor variable X: there is a probability distribution of Y for each level of X, and the means of these probability distributions vary in some systematic fashion with X.
- A scattering of points around the curve of the statistical relationship.

Example (diamonds.csv)

Variables:
- Response variable: price in Singapore dollars (Y)
- Explanatory variable: weight of the diamond in carats (X)

Goal: predict the sale price of a 0.43-carat diamond ring.

What are the two ingredients of the statistical relationship between price and weight?
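As a concrete illustration, here is a minimal R sketch of this example. It assumes diamonds.csv sits in the working directory with columns named price and weight; those column names are an assumption, so adjust them to match the actual file.

    # Load the diamond data (assumed columns: price, weight)
    diamonds <- read.csv("diamonds.csv")

    # Fit the simple linear regression of price on weight
    fit <- lm(price ~ weight, data = diamonds)

    # Predict the sale price of a 0.43-carat ring
    predict(fit, newdata = data.frame(weight = 0.43))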

Scatter plot

The means of the price distributions increase linearly with the weight. For any given weight, the distribution of price varies; we will see later that the distribution is Normal (the bell-shaped distribution).

[Scatter plot of price against weight, with price distributions shown at X = 0.15, 0.17, 0.25, and 0.32, and the line: mean price = intercept + slope × (weight)]
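A quick way to reproduce a plot like this, continuing the hedged sketch above (same assumed diamonds.csv columns):

    diamonds <- read.csv("diamonds.csv")          # assumed columns: price, weight
    fit <- lm(price ~ weight, data = diamonds)

    # Scatter plot of price against weight, with the fitted LS line overlaid
    plot(diamonds$weight, diamonds$price,
         xlab = "Weight (carats)", ylab = "Price (SGD)")
    abline(fit)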

Notation for Simple Linear Regression (SLR)

We observe a pair of variables (explanatory and response) on each of i = 1, 2, ..., n samples. Each pair (Xi, Yi) is often called a case or a data point.
- Yi is the value of the response for the i-th case.
- Xi is the value of the explanatory variable for the i-th case.

Simple Linear Regression Model

  Yi = β0 + β1·Xi + εi,  for i = 1, 2, ..., n

Model parameters:
- β0 is the intercept.
- β1 is the slope.
- The εi are independent, normally distributed random errors with mean 0 and variance σ²: εi ~ iid N(0, σ²).

Features of Simple Linear Regression Model

Individual observations: Yi = β0 + β1·Xi + εi. Since the εi are random, the Yi are also random, with

  E(Yi) = β0 + β1·Xi + E(εi) = β0 + β1·Xi
  Var(Yi) = 0 + 0 + Var(εi) = σ²

Since εi is Normally distributed, Yi ~ N(β0 + β1·Xi, σ²).
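These two facts are easy to check by simulation. The sketch below draws many responses at a single fixed X and compares the empirical mean and variance to β0 + β1·X and σ²; the parameter values are made up purely for illustration.

    set.seed(1)
    beta0 <- 100; beta1 <- 3000; sigma <- 30   # illustrative values, not from the data
    x <- 0.25                                  # one fixed level of X

    # Simulate many Yi = beta0 + beta1*x + eps, with eps ~ N(0, sigma^2)
    y <- beta0 + beta1 * x + rnorm(1e5, mean = 0, sd = sigma)

    mean(y)    # ~ beta0 + beta1*x = 850
    var(y)     # ~ sigma^2 = 900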

Fitted Regression Equation and Residuals

The parameters β0, β1, and σ² are unknown and must be estimated from the data:

  Ŷ = b0 + b1·X

- b0 estimates β0 (the intercept).
- b1 estimates β1 (the slope).
- Ŷi = b0 + b1·Xi gives the estimated mean of Y when the predictor is Xi. (The "hat" symbol denotes a point estimate.)
- The residual for the i-th case is ei = Yi − Ŷi = Yi − (b0 + b1·Xi).
- s² = Var(ei) estimates the error variance σ².

The residual (in one sample) is NOT the same as the error (in the population)!
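In R, the fitted values and residuals come straight from the fitted model object; continuing the earlier sketch (same assumed diamonds.csv columns):

    diamonds <- read.csv("diamonds.csv")          # assumed columns: price, weight
    fit <- lm(price ~ weight, data = diamonds)

    y_hat <- fitted(fit)                          # Yhat_i = b0 + b1*X_i
    e     <- resid(fit)                           # e_i = Y_i - Yhat_i
    all.equal(e, diamonds$price - y_hat, check.attributes = FALSE)   # TRUE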

Estimating the parameters with Least Squares (LS) Solution

We want to find the "best" estimates, b0 and b1. Minimize the sum of the squared residuals, Σ ei² (sum over i = 1, ..., n), i.e., find

  argmin over (b0, b1) of Σ [Yi − (b0 + b1·Xi)]²

How? Calculus! Take derivatives with respect to b0 and with respect to b1, set the equations equal to zero, and solve for both b0 and b1.

Estimating the parameters with Least Squares (LS) Solution

The best estimates of β1 and β0 given the data (X, Y) are:

  b1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)² = SSXY / SSX
  b0 = Ȳ − b1·X̄

(SS stands for "sum of squares".)

This estimate is the "best" because it:
- is unbiased (its expected value is equal to the true value);
- has minimum variance.
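The closed-form solution is easy to verify numerically; a hedged sketch, again assuming diamonds.csv has columns price and weight:

    diamonds <- read.csv("diamonds.csv")
    x <- diamonds$weight; y <- diamonds$price

    # LS estimates from the closed-form formulas
    ssxy <- sum((x - mean(x)) * (y - mean(y)))    # SSXY
    ssx  <- sum((x - mean(x))^2)                  # SSX
    b1 <- ssxy / ssx
    b0 <- mean(y) - b1 * mean(x)

    c(b0, b1)
    coef(lm(y ~ x))                               # should match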

Estimate the parameters with Maximum Likelihood Estimation (MLE)

Our model says that Yi ~ N(β0 + β1·Xi, σ²). Given Xi, the probability (density) of data point i is

  fi = (1 / √(2πσ²)) · exp( −(Yi − β0 − β1·Xi)² / (2σ²) )

β0 and β1 are unknown, but the likelihood of the proposed values (β0*, β1*) given the data is

  L(β0*, β1* | X, Y) = f1 × f2 × ... × fn

L is maximized when β0* = b0 and β1* = b1. Thus, the LS estimates, b0 and b1, are also the estimated parameter values that are most (probabilistically) consistent with the data!
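One way to see this equivalence numerically is to maximize the Normal log-likelihood directly and compare the answer to the LS fit; a sketch under the same assumed diamonds.csv columns (the starting values are arbitrary but reasonably scaled, and agreement is up to optimizer tolerance):

    diamonds <- read.csv("diamonds.csv")
    x <- diamonds$weight; y <- diamonds$price

    # Negative log-likelihood of (b0, b1, log sigma) under Yi ~ N(b0 + b1*Xi, sigma^2)
    negloglik <- function(par) {
      -sum(dnorm(y, mean = par[1] + par[2] * x, sd = exp(par[3]), log = TRUE))
    }

    # Numerically minimize, starting from a roughly sensible point
    mle <- optim(c(mean(y), 0, log(sd(y))), negloglik)

    mle$par[1:2]                                  # MLE of (beta0, beta1)
    coef(lm(y ~ x))                               # should agree with the LS estimates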

Estimation of stochastic variance, σ²

We estimate σ² as the sum of the squared residuals, SSE, divided by the degrees of freedom:

  s² = MSE = SSE / DFE = Σ ei² / (n − 2)

- DFE stands for "degrees of freedom of error"
- MSE stands for "mean squared error"
- SSE stands for "sum of squares error"

E{MSE} = σ², i.e., MSE is an unbiased estimator of σ².

s = √MSE is the residual standard error, which estimates the residual standard deviation (σ).

MSE measures variability around the fitted regression line; a smaller MSE is preferred, and MSE is often used as a criterion for model selection.
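A hedged numerical check of this formula against R's built-in residual standard error (same assumed diamonds.csv columns):

    diamonds <- read.csv("diamonds.csv")
    fit <- lm(price ~ weight, data = diamonds)

    n   <- nrow(diamonds)
    sse <- sum(resid(fit)^2)                      # SSE
    mse <- sse / (n - 2)                          # MSE = SSE / DFE

    sqrt(mse)                                     # s, the residual standard error
    summary(fit)$sigma                            # should agree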

A comment on the notation

We will also estimate variances for other quantities. These will also be denoted s², but will have a subscript to identify them, e.g., s²{b1}. Without any subscript, s² refers to the estimated variance of the residuals, and s refers to the standard error of the residuals.
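For example, a standard SLR result gives the estimated variance of the slope as s²{b1} = MSE / SSX; a sketch comparing its square root with the standard error R reports (assumed columns as before):

    diamonds <- read.csv("diamonds.csv")
    fit <- lm(price ~ weight, data = diamonds)

    mse <- sum(resid(fit)^2) / (nrow(diamonds) - 2)
    ssx <- sum((diamonds$weight - mean(diamonds$weight))^2)

    sqrt(mse / ssx)                                     # s{b1}, std. error of b1
    summary(fit)$coefficients["weight", "Std. Error"]   # should agree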

Identifying statistics and estimates in the R output

[Slide shows annotated R regression output for the diamond data, refit after removing 1 observation]
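Output of this kind comes from summary(); a sketch of how the quantities defined above map onto what R prints:

    diamonds <- read.csv("diamonds.csv")          # assumed columns: price, weight
    fit <- lm(price ~ weight, data = diamonds)

    summary(fit)
    # Coefficients table: "Estimate" gives b0 (Intercept row) and b1 (weight row);
    #   "Std. Error" gives s{b0} and s{b1}
    # "Residual standard error" is s = sqrt(MSE), on n - 2 degrees of freedom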

Residual plots

The residuals show a random pattern.
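A standard residual plot in R, continuing the same hedged setup:

    diamonds <- read.csv("diamonds.csv")
    fit <- lm(price ~ weight, data = diamonds)

    # Residuals versus fitted values; a random scatter around 0 supports the model
    plot(fitted(fit), resid(fit),
         xlab = "Fitted values", ylab = "Residuals")
    abline(h = 0)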

Properties of the LS Line

The least-squares line always passes through the point (X̄, Ȳ).

The residuals always sum to zero:

  Σ ei = Σ [Yi − (b0 + b1·Xi)]
       = Σ Yi − n·b0 − b1·Σ Xi
       = n·Ȳ − n·b0 − n·b1·X̄
       = n[(Ȳ − b1·X̄) − b0]
       = 0

(The last step uses b0 = Ȳ − b1·X̄.)
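Both properties are easy to confirm numerically with the fitted diamond model (same assumed columns):

    diamonds <- read.csv("diamonds.csv")
    fit <- lm(price ~ weight, data = diamonds)

    sum(resid(fit))                               # ~ 0 (up to floating-point error)

    # The line passes through (Xbar, Ybar): the prediction at the mean weight
    # equals the mean price
    predict(fit, newdata = data.frame(weight = mean(diamonds$weight)))
    mean(diamonds$price)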