For convenience, we assume that X'X and X'Y are in the correlation form. Methods of scaling X'X and X'Y to the correlation form are discussed by Draper and Smith (1966). It is well known that β̂ is the best linear unbiased estimate of β. However, when the predictor variables are highly correlated, the average distance of β̂ to β is large. In particular, the expected squared distance E[(β̂ − β)'(β̂ − β)] is large.
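As a concrete illustration of the scaling step (not part of the original program), the usual correlation transform can be sketched in numpy: each column of X is centered and divided by the square root of its corrected sum of squares, so the cross-product matrices come out in correlation form. The data below are purely illustrative.

```python
import numpy as np

# Illustrative data only: 20 observations on 3 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)

# Center each column, then divide by the square root of its
# corrected sum of squares.
Z = (X - X.mean(axis=0)) / np.sqrt(((X - X.mean(axis=0)) ** 2).sum(axis=0))
w = (y - y.mean()) / np.sqrt(((y - y.mean()) ** 2).sum())

XtX = Z.T @ Z   # X'X in correlation form
XtY = Z.T @ w   # X'Y in correlation form

# The diagonal of a correlation matrix is all ones.
print(np.allclose(np.diag(XtX), 1.0))  # True
```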



Hoerl and Kennard (1970a) suggested that the estimator

β̂* = (X'X + kI)⁻¹X'Y, k > 0 [2]

be used when the independent variables are highly correlated. The estimator β̂* is called the ridge estimator. If β'β is bounded, there exists a value of k > 0 such that E[(β̂* − β)'(β̂* − β)] < E[(β̂ − β)'(β̂ − β)]. The ridge estimator has the property that, as k increases, the variance of β̂* decreases, but the bias increases. The best ridge estimates β̂* are those that are stable and have a small mean-square error.
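A minimal numpy sketch of equation [2] (the data and the deliberately collinear predictors are illustrative, not from the source) shows the estimator and its shrinkage effect:

```python
import numpy as np

# Illustrative data with two nearly collinear predictors.
rng = np.random.default_rng(1)
x1 = rng.normal(size=30)
X = np.column_stack([x1, x1 + 0.01 * rng.normal(size=30), rng.normal(size=30)])
y = X @ np.array([1.0, 1.0, 0.5]) + 0.1 * rng.normal(size=30)

def ridge(X, y, k):
    """Ridge estimator of equation [2]: (X'X + kI)^(-1) X'Y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

beta_ls = ridge(X, y, 0.0)    # k = 0 recovers ordinary least squares
beta_star = ridge(X, y, 0.1)  # a ridge estimate

# For k > 0 the coefficient vector is shrunk toward zero.
print(np.linalg.norm(beta_star) < np.linalg.norm(beta_ls))  # True
```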



To calculate the ridge estimator β̂* from equation [2], one would have to invert the p×p matrix (X'X + kI) for each value of k. This sequence of matrix inversions could be time-consuming even with a high-speed computer. The ridge estimator can be expressed in a form that may be better for computing purposes.



We know from matrix theory that, because X'X is symmetric, there exists an orthogonal matrix A and a diagonal matrix D such that A'X'XA = D and A'A = I. The matrix A is the matrix of eigenvectors of X'X, and the matrix D is the diagonal matrix of eigenvalues of X'X. Adding kI to both sides of A'X'XA = D gives

A'X'XA + kI = D + kI. [3]



Multiplying the second term on the left-hand side of equation [3] by A'A = I gives

A'X'XA + kA'A = D + kI, [4]



which can be written as 



A'(X'X + kI)A = D + kI. [5]



Premultiplying both sides of equation [5] by (A')⁻¹ and postmultiplying by A⁻¹ gives

X'X + kI = (A')⁻¹(D + kI)A⁻¹. [6]



Taking the inverse of both sides yields 



(X'X + kI)⁻¹ = A(D + kI)⁻¹A'. [7]
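The identity in equation [7] is easy to confirm numerically; the following sketch (numpy, with an illustrative matrix) checks that the direct inverse agrees with the eigenvector form:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(15, 4))
XtX = X.T @ X
k = 0.25

# eigh returns the eigenvalues d and an orthogonal matrix A of
# eigenvectors, so that A' X'X A = D with D = diag(d).
d, A = np.linalg.eigh(XtX)

lhs = np.linalg.inv(XtX + k * np.eye(4))   # left side of equation [7]
rhs = A @ np.diag(1.0 / (d + k)) @ A.T     # right side of equation [7]

print(np.allclose(lhs, rhs))  # True
```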



Substituting the results of equation [7] in equation [2], we find that the ridge estimator can be written

β̂* = A(D + kI)⁻¹A'X'Y. [8]



This form of the ridge estimator may be efficient for computing in problems with a large number of independent variables. The matrix (D + kI) is diagonal, and the elements of its inverse are the reciprocals of the diagonal elements. The matrix of eigenvectors A and the matrix of eigenvalues D need to be calculated only once. However, the algorithm for computing the eigenvalues is iterative, and the solution may occasionally take more time than calculating the inverses of (X'X + kI).
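The computational point above can be sketched as follows (numpy, illustrative data): the eigendecomposition is computed once, after which each additional value of k costs only elementwise reciprocals of d + k.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(25, 5))
y = rng.normal(size=25)

# One-time work: eigendecomposition of X'X and the rotated vector A'X'Y.
d, A = np.linalg.eigh(X.T @ X)
AtXtY = A.T @ (X.T @ y)

def ridge_eigen(k):
    """Ridge estimator via A (D + kI)^(-1) A' X'Y; the diagonal
    matrix is inverted by elementwise reciprocals."""
    return A @ (AtXtY / (d + k))

# Agreement with direct inversion for several values of k.
for k in (0.01, 0.1, 1.0):
    direct = np.linalg.solve(X.T @ X + k * np.eye(5), X.T @ y)
    assert np.allclose(ridge_eigen(k), direct)
print("eigenvector form matches direct inversion")
```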



The estimates of the ridge coefficients at k = 0 are the least squares estimates. If the least squares regression is significant, then different values of k should be explored.



The ridge trace, which is a plot of the ridge coefficients for different values of k, is an important part of ridge regression. The sum of squares of residuals should also be plotted. The ridge trace is examined for trends in the ridge coefficients as k is changed. The best estimates of the ridge coefficients are those where the trace shows that the coefficients have stabilized and the sum of squares of residuals is still small (Marquardt and Snee 1975).
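The quantities behind the ridge trace can be tabulated as below (numpy; the data are illustrative and the plotting itself is omitted): ridge coefficients and residual sums of squares over a grid of k values, ready to be plotted against k.

```python
import numpy as np

# Illustrative data with a nearly collinear pair of predictors.
rng = np.random.default_rng(4)
x1 = rng.normal(size=40)
X = np.column_stack([x1, x1 + 0.05 * rng.normal(size=40), rng.normal(size=40)])
y = X @ np.array([2.0, -1.0, 0.5]) + 0.2 * rng.normal(size=40)

ks = np.linspace(0.0, 1.0, 11)
trace, rss = [], []
for k in ks:
    b = np.linalg.solve(X.T @ X + k * np.eye(3), X.T @ y)
    trace.append(b)                       # one row of the ridge trace
    rss.append(((y - X @ b) ** 2).sum())  # residual sum of squares

# Least squares (k = 0) minimizes the residual sum of squares, so the
# RSS curve starts at its minimum and rises as k grows.
print(np.isclose(min(rss), rss[0]))  # True
```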



Hoerl and Kennard (1970b) discuss the use of the ridge trace to eliminate variables with the least predictive power. Thus, ridge regression can be used as a guide for selecting the best subset of variables; that is, ridge regression is an alternative to stepwise regression.



Program Ridge 



Program RIDGE is written in ASA Fortran IV for the IBM 370/168 computer. Information needed for the control cards is listed in the appendix. A variable format statement is used to input the data. The dependent variable is positioned by the program; hence, special arrangement of the data is not necessary. A maximum of 19 independent variables is allowed for program RIDGE. This capacity may






