
Multiple Linear Regression

The discussion of weighted least squares showed the need for a method to fit Y to more than one X. More generally, the response variable Y is commonly related to several regressor variables simultaneously. To obtain a valid description of the relationship between Y and any one of these regressor variables, all of them must be considered. Excluding an important regressor variable will also adversely affect predictions of Y. In general, the equation to be considered becomes

Y = b0 + b1 X1 + b2 X2 + . . . + bK XK

The Xs may be any relevant regressor variables. Often one X is a (nonlinear) transformation of another. For example, X2 = ln (X1).

When dealing with multiple linear regression, fits to data are no longer lines. For example, with K = 2, the resulting fit describes a plane in three-dimensional space with "slopes" bhat1 and bhat2 that intersects the Y axis at bhat0. Beyond K = 2 the resulting fit becomes difficult to visualize. The term regression surface is often used to describe a multiple linear regression fit. Figure 9 shows data and the surface generated by a multiple linear regression relating yield to age and site index.
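The K = 2 case can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is available; the function name fit_plane and the data are made up for the example and do not come from the text.

```python
import numpy as np

def fit_plane(x1, x2, y):
    """Least squares fit of y = b0 + b1*x1 + b2*x2.

    Returns the array (bhat0, bhat1, bhat2). Hypothetical helper for illustration.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: a column of ones for the intercept, then one column per regressor.
    X = np.column_stack([np.ones_like(x1), x1, x2])
    bhat, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    return bhat

# Made-up data lying exactly on the plane y = 1 + 2*x1 - 3*x2 (no error term).
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0])
y = 1.0 + 2.0 * x1 - 3.0 * x2

bhat = fit_plane(x1, x2, y)
print(bhat)  # intercept first, then the two "slopes"
```

Because the made-up data contain no error, the fit should recover the intercept and the two "slopes" of the plane up to rounding.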

The surface is not a plane because the fitted equation used transformed variables:

ln (Yield) = b0 + b1/Age + b2 ln (Site Index)
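An equation of this form is still linear in b0, b1 and b2, so it can be fit by ordinary least squares once the variables are transformed. The sketch below assumes NumPy; the function names, the coefficient values and the age and site index data are all invented for illustration and are not the data behind Figure 9.

```python
import numpy as np

def fit_log_yield(age, site_index, yield_):
    """Fit ln(yield) = b0 + b1/age + b2*ln(site_index) by least squares."""
    age = np.asarray(age, dtype=float)
    site_index = np.asarray(site_index, dtype=float)
    yield_ = np.asarray(yield_, dtype=float)
    # The regressors are transformations of the measured variables.
    X = np.column_stack([np.ones_like(age), 1.0 / age, np.log(site_index)])
    bhat, _, _, _ = np.linalg.lstsq(X, np.log(yield_), rcond=None)
    return bhat

def predict_yield(bhat, age, site_index):
    """Back-transform the fitted ln(yield) to the original yield scale."""
    age = np.asarray(age, dtype=float)
    site_index = np.asarray(site_index, dtype=float)
    return np.exp(bhat[0] + bhat[1] / age + bhat[2] * np.log(site_index))

# Synthetic, error-free data generated from assumed coefficients.
true_b = np.array([2.0, -20.0, 1.2])
age = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
site_index = np.array([50.0, 60.0, 70.0, 55.0, 65.0, 75.0])
yield_ = np.exp(true_b[0] + true_b[1] / age + true_b[2] * np.log(site_index))

bhat = fit_log_yield(age, site_index, yield_)
pred = predict_yield(bhat, age, site_index)
```

The fit is a plane in the transformed coordinates (1/Age, ln (Site Index), ln (Yield)); plotted against Age and Site Index in the original units, the same fit is a curved surface.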

Assumptions required for application of least squares methodology to multiple linear regression equations are similar to those cited for the simple linear case. For example, the true relationship between Y and the various Xs must be as given by the linear equation and the spread of the errors must be constant across values of all Xs. Also, a limit exists to the number of Xs that can be considered. Specifically, K + 1 must be less than or equal to the sample size n for a unique set of bhats to be found.
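The sample size condition can be checked numerically. The sketch below assumes NumPy, and has_unique_fit is a hypothetical helper rather than a library function. It also tests one further requirement that is standard but not stated above: even when K + 1 is less than or equal to n, no regressor may be an exact linear combination of the others.

```python
import numpy as np

def has_unique_fit(X):
    """Return True when the design matrix X (n rows, K + 1 columns,
    including the column of ones) determines a unique set of bhats."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    # Need at least as many observations as coefficients,
    # and linearly independent columns (full column rank).
    return bool(n >= p and np.linalg.matrix_rank(X) == p)

# K = 2 regressors plus an intercept gives K + 1 = 3 coefficients.
X_too_few = np.array([[1.0, 2.0, 5.0],
                      [1.0, 3.0, 4.0]])        # n = 2 < 3

X_enough = np.array([[1.0, 2.0, 5.0],
                     [1.0, 3.0, 4.0],
                     [1.0, 4.0, 7.0],
                     [1.0, 5.0, 1.0]])         # n = 4 >= 3

X_collinear = np.array([[1.0, 2.0, 4.0],
                        [1.0, 3.0, 6.0],
                        [1.0, 4.0, 8.0],
                        [1.0, 5.0, 10.0]])     # third column = 2 * second

print(has_unique_fit(X_too_few), has_unique_fit(X_enough), has_unique_fit(X_collinear))
```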

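In theory, least squares estimates of b0, . . ., bK are found just as in the simple linear case. The estimates bhat0, . . ., bhatK are the values of b0, . . ., bK that minimize the sum of squared deviations

SUM over i = 1, . . ., n of (Yi - b0 - b1 X1i - b2 X2i - . . . - bK XKi)^2

where Yi and X1i, . . ., XKi are the values of Y and the Xs for observation i.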

The description of the resulting equations and associated summary statistics is best made using matrix algebra. The computations are best carried out using a computer.
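As a rough sketch of what the matrix formulation looks like in practice, write the data as a design matrix X (a column of ones followed by one column per regressor) and a response vector Y. The least squares estimates then satisfy the standard normal equations (X'X) bhat = X'Y. The code below assumes NumPy; the function name ols_matrix, the choice of summary statistics and the data are illustrative assumptions, not taken from the text.

```python
import numpy as np

def ols_matrix(X, y):
    """Solve the normal equations (X'X) bhat = X'y and return a few
    summary statistics. X must already include the column of ones."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape                      # p = K + 1 coefficients
    bhat = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ bhat
    sse = float(resid @ resid)          # residual sum of squares
    sst = float(np.sum((y - y.mean()) ** 2))
    # Mean squared error uses n - (K + 1) degrees of freedom.
    mse = sse / (n - p) if n > p else float("nan")
    return {"bhat": bhat, "sse": sse, "mse": mse, "r2": 1.0 - sse / sst}

# Made-up data with K = 2 regressors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9])
X = np.column_stack([np.ones_like(x1), x1, x2])

fit = ols_matrix(X, y)
print(fit["bhat"], fit["mse"], fit["r2"])
```

Solving the normal equations directly keeps the code close to the algebra. For badly conditioned X, a routine such as np.linalg.lstsq is usually the safer choice.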

