The labor statistics data set of Longley (1967) is noted for being ill-conditioned. Both the ORTHOREG and GLM procedures are applied for comparison (only portions of the PROC GLM results are shown).
Note: The results from this example vary from machine to machine, depending on floating-point configuration.
The following statements read the data into the SAS data set Longley:
title 'PROC ORTHOREG used with Longley data';
data Longley;
   input Employment Prices GNP Jobless Military PopSize Year;
   datalines;
60323  83.0 234289 2356 1590 107608 1947
61122  88.5 259426 2325 1456 108632 1948
60171  88.2 258054 3682 1616 109773 1949
61187  89.5 284599 3351 1650 110929 1950
63221  96.2 328975 2099 3099 112075 1951
63639  98.1 346999 1932 3594 113270 1952
64989  99.0 365385 1870 3547 115094 1953
63761 100.0 363112 3578 3350 116219 1954
66019 101.2 397469 2904 3048 117388 1955
67857 104.6 419180 2822 2857 118734 1956
68169 108.4 442769 2936 2798 120445 1957
66513 110.8 444546 4681 2637 121950 1958
68655 112.6 482704 3813 2552 123366 1959
69564 114.2 502601 3931 2514 125368 1960
69331 115.7 518173 4806 2572 127852 1961
70551 116.9 554894 4007 2827 130081 1962
;
The data set contains one dependent variable, Employment (total derived employment), and six independent variables: Prices (GNP implicit price deflator normalized to the value 100 in 1954), GNP (gross national product), Jobless (unemployment), Military (size of armed forces), PopSize (noninstitutional population aged 14 and over), and Year (year).
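The ill-conditioning mentioned above can be examined directly before any model is fit. The following is a minimal sketch, not part of the original example, that assumes the Longley data set has been created as shown and uses the VIF and COLLIN options of the MODEL statement in PROC REG to request variance inflation factors and collinearity diagnostics. Only the linear terms are included here, which is an assumption made to keep the diagnostic step simple.

```sas
/* Illustrative diagnostic step (not part of the original example). */
/* VIF requests variance inflation factors; COLLIN requests         */
/* eigenvalue-based collinearity diagnostics for the regressors.    */
proc reg data=Longley;
   model Employment = Prices GNP Jobless Military PopSize Year
         / vif collin;
run;
quit;
```

Large variance inflation factors or large condition indices in this output would be consistent with the near-linear dependence among the regressors that makes the data set a demanding numerical test.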
The following statements use the ORTHOREG procedure to model the Longley data by using a quadratic model in each independent variable, without interaction:
proc orthoreg data=Longley;
   model Employment = Prices  Prices*Prices
                      GNP     GNP*GNP
                      Jobless Jobless*Jobless
                      Military Military*Military
                      PopSize PopSize*PopSize
                      Year    Year*Year;
run;
Figure 66.1 shows the resulting analysis.
Figure 66.1: PROC ORTHOREG Results
PROC ORTHOREG used with Longley data |
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
---|---|---|---|---|---|
Model | 12 | 184864508.5 | 15405375.709 | 320.24 | 0.0003 |
Error | 3 | 144317.49568 | 48105.831895 | ||
Corrected Total | 15 | 185008826 |
Root MSE | 219.33041717 |
---|---|
R-Square | 0.9992199426 |
Parameter | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| |
---|---|---|---|---|---|
Intercept | 1 | 186931078.640216 | 154201839.66 | 1.21 | 0.3122 |
Prices | 1 | 1324.50679362506 | 916.17455832 | 1.45 | 0.2440 |
Prices**2 | 1 | -6.61923922845539 | 4.7891445654 | -1.38 | 0.2609 |
GNP | 1 | -0.12768642156232 | 0.0738897784 | -1.73 | 0.1824 |
GNP**2 | 1 | 3.1369569286212E-8 | 8.7167753E-8 | 0.36 | 0.7428 |
Jobless | 1 | -4.35507653558708 | 1.3851792402 | -3.14 | 0.0515 |
Jobless**2 | 1 | 0.00022132944101 | 0.0001763541 | 1.26 | 0.2983 |
Military | 1 | 4.91162014560828 | 1.826715856 | 2.69 | 0.0745 |
Military**2 | 1 | -0.00113707146734 | 0.0003539971 | -3.21 | 0.0489 |
PopSize | 1 | -0.0303997234299 | 5.9272538242 | -0.01 | 0.9962 |
PopSize**2 | 1 | -1.212511414607E-6 | 0.0000237262 | -0.05 | 0.9625 |
Year | 1 | -194907.139041839 | 157739.28757 | -1.24 | 0.3045 |
Year**2 | 1 | 50.8067603538501 | 40.279878943 | 1.26 | 0.2963 |
The estimates in Figure 66.1 compare very well with the best estimates available; for additional information, see Longley (1967) and Beaton, Rubin, and Barone (1976).
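For readers who want to compare against the model that Longley (1967) studied, which uses only the linear terms in the same six regressors, the following sketch drops the squared terms. It is an aside rather than part of the original example, and no output is shown for it here.

```sas
/* Illustrative variation (not part of the original example):   */
/* the linear model in the six regressors, without the squares. */
proc orthoreg data=Longley;
   model Employment = Prices GNP Jobless Military PopSize Year;
run;
```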
The following statements request the same analysis from the GLM procedure:
proc glm data=Longley;
   model Employment = Prices  Prices*Prices
                      GNP     GNP*GNP
                      Jobless Jobless*Jobless
                      Military Military*Military
                      PopSize PopSize*PopSize
                      Year    Year*Year;
   ods select OverallANOVA
              FitStatistics
              ParameterEstimates
              Notes;
run;
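Figure 66.2 contains the overall ANOVA table and the parameter estimates produced by PROC GLM. Notice that the PROC ORTHOREG fit achieves a somewhat smaller root mean square error (219.33 versus 233.33) and also that the GLM procedure detects spurious singularities: it reports 11 model degrees of freedom instead of 12 and marks the Intercept, PopSize*PopSize, and Year estimates with the letter 'B'.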
Figure 66.2: Partial PROC GLM Results
PROC ORTHOREG used with Longley data |
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
---|---|---|---|---|---|
Model | 11 | 184791061.6 | 16799187.4 | 308.58 | <.0001 |
Error | 4 | 217764.4 | 54441.1 | ||
Corrected Total | 15 | 185008826.0 |
R-Square | Coeff Var | Root MSE | Employment Mean |
---|---|---|---|
0.998823 | 0.357221 | 233.3262 | 65317.00 |
Parameter | Estimate | | Standard Error | t Value | Pr > |t| |
---|---|---|---|---|---|
Intercept | -3598851.899 | B | 1327335.652 | -2.71 | 0.0535 |
Prices | 523.802 | | 688.979 | 0.76 | 0.4894 |
Prices*Prices | -2.326 | | 3.507 | -0.66 | 0.5434 |
GNP | -0.138 | | 0.078 | -1.76 | 0.1526 |
GNP*GNP | 0.000 | | 0.000 | 0.24 | 0.8218 |
Jobless | -4.599 | | 1.459 | -3.15 | 0.0344 |
Jobless*Jobless | 0.000 | | 0.000 | 1.14 | 0.3183 |
Military | 4.994 | | 1.942 | 2.57 | 0.0619 |
Military*Military | -0.001 | | 0.000 | -3.15 | 0.0346 |
PopSize | -4.246 | | 5.156 | -0.82 | 0.4565 |
PopSize*PopSize | 0.000 | B | 0.000 | 0.81 | 0.4655 |
Year | 0.000 | B | . | . | . |
Year*Year | 1.038 | | 0.419 | 2.48 | 0.0683 |
Note: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.
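One common way to reduce this kind of numerical difficulty is to center and scale the regressors before forming the squared terms. The following is a minimal sketch under that assumption, not part of the original example. It uses PROC STDIZE with METHOD=STD to standardize the six regressors, and the output data set name LongleyStd is hypothetical.

```sas
/* Illustrative remedy (not part of the original example).        */
/* Standardize each regressor to mean 0 and standard deviation 1. */
/* Employment is not listed in the VAR statement, so it is left   */
/* unchanged. LongleyStd is a hypothetical data set name.         */
proc stdize data=Longley out=LongleyStd method=std;
   var Prices GNP Jobless Military PopSize Year;
run;

/* Refit the same quadratic model on the standardized regressors. */
proc glm data=LongleyStd;
   model Employment = Prices  Prices*Prices
                      GNP     GNP*GNP
                      Jobless Jobless*Jobless
                      Military Military*Military
                      PopSize PopSize*PopSize
                      Year    Year*Year;
run;
quit;
```

A quadratic in the standardized variables spans the same column space as the original quadratic terms, so the fitted values are unchanged in exact arithmetic, although the coefficients are expressed on a different scale. Whether PROC GLM still reports singularities after this change is not shown here and, like the other results in this example, might vary from machine to machine.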