Excel - A Regression Example

Simple linear regression analysis can be easily completed using Excel. The Data Analysis... option of the Tools menu provides what's necessary. If that option is not available in the Excel you are running, you can add it by selecting the Add-Ins... options of the Tools menu.

The example we'll use is from pages 27-30 of Avery and Burkhart, 5th Edition. It will be helpful to follow along in the text as you proceed though this Excel example. An Excel file with the data for the example is available. There are 62 observations of basal area growth and crown volume. We wish to see if basal area growth can be predicted from crown volume based on a simple linear relationship. A portion of the data as it appears in an Excel worksheet appears below.

To plot "BA_GRO" versus "CRWN_VOL" select the data (including labels - cells A1:B63) and choose Chart... under the Insert menu to start up the Chart Wizard. Choose the XY (Scatter) type. Select various options in the Chart Wizard steps to format the graph so it looks like:

From an analysis standpoint you should use this plot to examine whether the assumptions of the technique you intend to use are satisifed for these data. For simple linear regression those assumptions include linearity and constant variance.

Given you are satisfied regarding an initial appraisal of assumptions, you can fit a linear regression to the data by choosing Data Analysis... under the Tools menu and subsequently selecting the Regression Analysis Tool. You will be presented with the following dialog:

Specify the BA_GRO data and label for "Input Y Range:" (B1:B63) and the CRWN_VOL data and label for "Input X Range:" (A1:A63). Check the "Labels" box (since you included data labels in your input ranges), provide a new worksheet name under "Output options" (I used the name Results), and check the Standardized Residuals box. You should obtain the following results (subset shown):

R Square is called coefficient of determination in your text. Intercept is the b0 term and CRWN_VOL is the slope or b1 term. The Analysis of Variance table reports the Regression (Reduction in your text) sum-of-squares and the Residual sum-of-squares, the numerator of the residual mean square whose square root is the Standard error of estimate (Standard Error in the Regression Statistics table).

To conduct a residual analysis we want to plot predicted values versus standardized residuals from the output. Select the Predicted BA_GRO label and data and then, holding down on the Ctrl key (Apple key on a Macintosh), select the Standard Residuals label and data:

As before, choose Chart... under the Insert menu to start up the Chart Wizard. Choose the XY (Scatter) type. Once you select various formatting options and insert the chart into your worksheet you will need to complete one extra step. By default the x-axis labels appear in the middle of a residual graph (below a Y value of 0). To move the labels to the bottom of the chart: 1. double click the Y-axis of your chart and 2. under the Scale tab of the resulting dialog specify the axis crosses at the smallest value of Y (-2.5 in the example). Your chart should look like:

The standardized residuals appear to be fairly evenly scattered throughout a horizontal band around 0.0. Hence we might be satisifed regards the assumptions of simple linear regression.

It is possible that an intercept-less equation might be sensible for the "BA_GRO" on "CRWN_VOL" regression. To see this, a new regression would need to be run (checking the "Constant is Zero" box in the Regression dialog box - Excel, in an unfortunate choice of words, chooses to call the intercept parameter "Constant") and the resulting residuals reanalyzed. If an intercept is critical there will be a linear (non-horizontal) trend to the residuals. The two regressions, if both are sound from a residual analysis perspective, can be compared using the Standard Error statistic.