Linear Discriminant Analysis 
Robert H. Riffenburgh 1 and Charles W. Clunies-Ross 2 
The problem to be considered here is that of 
identifying, or of classifying, an observed individual
as being a member of one of two
"populations." This problem arises in some form
in most sciences. A recent example is the prob- 
lem, associated with certain international ten- 
sions, of classifying salmon caught in the North 
Pacific fishery as having arisen from the Asiatic 
or American salmon populations. 
The populations are to be considered as giving 
rise to observable individuals each of which 
may be (partially) characterized by a set of 
k measurements. The measurements of individ- 
uals from either population are distributed as 
if they were independent observations on a 
multivariate distribution of probability. These 
distributions are assumed to be multivariate 
normal, with known parameters, for each pop- 
ulation. 
1. Statement of the Problem 
When an individual is misclassified, there may 
or may not be loss functions associated with the 
misclassification. For the problems of this paper 
explicit results are not obtainable for general 
loss functions; we shall assume loss functions to 
be constants. Let us designate as α the loss associated
with misclassification of an individual
from population I and as β the loss associated
with misclassification of an individual from
population II; α, β > 0. Also, there is the question
of whether or not anything is known about
the mixed population from which the individual 
to be classified is drawn; in particular, whether 
or not there are known a priori probabilities,
under a random drawing, that an individual belongs
to either of the parent populations. Let us
designate the prior probabilities as p for
population I and q = 1 - p for population II.
1 Present address: Department of Mathematics, University
of Hawaii. This paper is a portion of a dissertation
submitted in partial fulfillment of the Ph.D.
degree at the Virginia Polytechnic Institute; research
was in part sponsored by the National Cancer Institute
of the U. S. Public Health Service. Manuscript received
June 8, 1959.
2 Virginia Polytechnic Institute, Blacksburg, Virginia.
Research was sponsored by the National Science
Foundation under grant NSF-G-1858.
It follows that there are four levels of the 
classificatory problem to be considered: 
(1.1) (a) with loss functions and prior probabilities
(1.2) (b) with prior probabilities only 
(1.3) (c) with loss functions only 
(1.4) (d) with neither 
Misclassifications are undesirable; however, 
there are no adequate common units in which 
the "undesirability" can be measured for all of
the above levels. At each level there are two 
quantities for which some form of joint mini- 
mization is desired, viz.: 
(1.5) (a) αpP_I, βqP_II
(1.6) (b) pP_I, qP_II
(1.7) (c) αP_I, βP_II
(1.8) (d) P_I, P_II
where P_I is the probability that a random individual
of population I is classified as having
arisen from II, and P_II is the probability that
a random individual of II is classified as having
arisen from I.
These four pairs of quantities will be referred 
to indiscriminately as "error quantities."
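As a minimal illustration of the four pairs (1.5) to (1.8), the following Python sketch computes each pair from given losses, prior probability, and misclassification probabilities. The function name `error_quantities` and the numerical values are assumptions chosen for illustration; they do not come from the paper.

```python
def error_quantities(alpha, beta, p, P_I, P_II):
    """Return the pairs of error quantities for levels (a) to (d).

    alpha, beta: constant losses for misclassifying an individual
                 from population I and population II, respectively.
    p:           prior probability of population I (q = 1 - p for II).
    P_I, P_II:   probabilities of misclassifying an individual
                 from population I and population II, respectively.
    """
    q = 1.0 - p
    return {
        "a": (alpha * p * P_I, beta * q * P_II),  # losses and priors
        "b": (p * P_I, q * P_II),                 # priors only
        "c": (alpha * P_I, beta * P_II),          # losses only
        "d": (P_I, P_II),                         # neither
    }


# Hypothetical values, for illustration only.
pairs = error_quantities(alpha=2.0, beta=1.0, p=0.25, P_I=0.1, P_II=0.2)
for level, (first, second) in pairs.items():
    print(level, first, second)
```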
Now either error quantity of a pair may be 
reduced to zero, but not both jointly. Thus, joint 
minimization of the error quantities is, to a 
certain extent, arbitrary. While various specifi- 
cations of joint minimization can be formulated, 
the more reasonable are those which have al- 
ready been proposed elsewhere in the literature, 
viz.: 
(i) joint minimization may be specified as
that which minimizes the sum of error
quantities; let us denote this criterion as
"minisum";
(ii) joint minimization may be specified as
that which minimizes the larger of the
error quantities; let us denote this criterion
as "minimax."
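The two criteria can be compared on a deliberately simplified case. The sketch below is an assumption-laden illustration, not the paper's method: it takes a single measurement (k = 1), two normal populations with hypothetical known parameters, a rule that classifies an individual into population I when the measurement falls below a cutoff c, and level (d) error quantities P_I and P_II. The cutoff is found by a plain grid search; the names `normal_cdf`, `error_probs`, and `best_cutoff` are introduced here for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def error_probs(c, mu1, s1, mu2, s2):
    """Return (P_I, P_II) for the rule 'classify as I when x < c'.

    Assumes population I is N(mu1, s1^2) and population II is
    N(mu2, s2^2), with mu1 < mu2.
    """
    P_I = 1.0 - normal_cdf((c - mu1) / s1)  # individual of I sent to II
    P_II = normal_cdf((c - mu2) / s2)       # individual of II sent to I
    return P_I, P_II


def best_cutoff(criterion, cutoffs, params):
    """Grid search for the cutoff minimizing criterion((P_I, P_II))."""
    return min(cutoffs, key=lambda c: criterion(error_probs(c, *params)))


# Hypothetical parameters (mu1, s1, mu2, s2), for illustration only.
params = (0.0, 1.0, 3.0, 2.0)
cutoffs = [i / 1000.0 for i in range(-5000, 8001)]

c_minisum = best_cutoff(sum, cutoffs, params)  # minimizes P_I + P_II
c_minimax = best_cutoff(max, cutoffs, params)  # minimizes max(P_I, P_II)
print(c_minisum, c_minimax)
```

With these assumed parameters the variances differ, so the two criteria select different cutoffs: minimax settles where P_I and P_II are equal, while minisum accepts a larger P_II in exchange for a smaller P_I. Pushing the cutoff far to either side drives one error probability toward zero and the other toward one, which is the sense in which either error quantity, but not both, can be reduced to zero.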
