26 W. A. SANDS 



Canonical variate analysis has similar purposes to principal component analysis, 

 except that it requires all individuals to be assigned to taxa and each taxon to be 

 represented by more than one specimen. The weighting of variables is then directed 

 to those providing the best discrimination between the taxa. 



Both types of analysis treat individuals or individual taxa as points in a hyper- 

 space, their positions defined by the numerical values of all their measured variables. 

 Both seek new sets of orthogonal (uncorrelated) co-ordinates corresponding to 

 successive axes of maximum variation of the scatter-cloud of points. The difference 

 between the two is that where principal component analysis is concerned with the 

 dispersion of individuals, canonical variate analysis measures the dispersion of the 

 ends of the mean vectors of the taxa. Thus the characters weighted by the two 

 analyses will not necessarily be the same. However, in a large body of data the two 

 sets of weighted characters are likely to coincide. Both analyses call for the extraction of 

 the latent roots and vectors of a matrix. The vectors provide weighting coefficients 

 by which the transformation of the variables (characters) to the new set of co- 

 ordinates is achieved. In principal components either the variance-covariance 

 matrix or the correlation matrix is used. In the latter case the variables are 

 standardized, being expressed in standard deviation units with a variance of 1. 

 This is the commoner procedure, and was employed here. In canonical variate 

 analysis the 'between-taxa' and 'within-taxa' dispersion matrices are together used 

 to compute a further matrix, of which the latent vectors give the required multiple 

 discriminant functions. 
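
The canonical variate computation described above can be sketched as follows. This is only an illustration under stated assumptions, not the program used for the present study: the 'further matrix' is taken to be W^-1 B (within-taxa dispersion inverted, multiplied by between-taxa dispersion), the function name `canonical_variates` and the simulated measurements are hypothetical, and W is assumed to be positive definite.

```python
import numpy as np

def canonical_variates(X, labels):
    """Latent roots and vectors of W^-1 B for specimens X grouped by taxon labels."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    B = np.zeros((p, p))  # 'between-taxa' dispersion matrix
    W = np.zeros((p, p))  # 'within-taxa' dispersion matrix
    for taxon in np.unique(labels):
        Xt = X[labels == taxon]
        mean_t = Xt.mean(axis=0)
        d = (mean_t - grand_mean)[:, None]
        B += len(Xt) * (d @ d.T)
        centred = Xt - mean_t
        W += centred.T @ centred
    # Solve the symmetric form L^-1 B L^-T (with W = L L^T); it has the same
    # latent roots as W^-1 B and avoids complex round-off.
    L_inv = np.linalg.inv(np.linalg.cholesky(W))
    roots, V = np.linalg.eigh(L_inv @ B @ L_inv.T)
    order = np.argsort(roots)[::-1]
    # Back-transform so that the columns are latent vectors of W^-1 B.
    return roots[order], L_inv.T @ V[:, order]

# Hypothetical data: two taxa of 20 specimens, three measurements each,
# separated mainly on the first measurement.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 3)),
               rng.normal(0.0, 1.0, (20, 3)) + [3.0, 0.0, 0.0]])
labels = np.array([0] * 20 + [1] * 20)
roots, vectors = canonical_variates(X, labels)
scores = X @ vectors  # specimens expressed on the canonical variates
```

With only two taxa there is a single non-zero latent root, since the number of useful canonical variates cannot exceed the number of taxa minus one.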



The total number of latent roots and vectors produced is the same as the number 

 of original variables. The size of successive latent roots indicates the proportion 

 of the total variance of the matrix taken up by each of the new co-ordinates in turn. 

 The number of roots, and hence the corresponding vectors, considered significant 

 depends on their relative size. One convention recommended by Kaiser (1960) 

 and Harman (1960) is to disregard roots smaller than 1.0. However, when using 

 the analyses mainly for descriptive purposes, as here, it seemed more appropriate 

 to examine the elements of the vectors to determine the point at which large weight- 

 ing coefficients cease to be attached to new characters. This would suggest that 

 little further significant information was being extracted. 
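
As a minimal sketch of the points above, the following extracts the latent roots and vectors of a correlation matrix, expresses each root as a proportion of the total variance, and flags the roots that the 1.0 convention would retain. The function name `correlation_pca` and the simulated measurements are hypothetical and serve only as illustration.

```python
import numpy as np

def correlation_pca(X):
    """Principal components from the correlation matrix of X (rows = individuals)."""
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)
    roots, vectors = np.linalg.eigh(R)   # R is symmetric
    order = np.argsort(roots)[::-1]      # largest latent root first
    roots, vectors = roots[order], vectors[:, order]
    proportion = roots / roots.sum()     # share of total variance per component
    return roots, vectors, proportion

# Hypothetical data: three measurements sharing a common factor, one independent.
rng = np.random.default_rng(1)
common = rng.normal(size=(50, 1))
X = np.hstack([common + 0.1 * rng.normal(size=(50, 3)),
               rng.normal(size=(50, 1))])

roots, vectors, proportion = correlation_pca(X)
kept = roots >= 1.0                      # convention: disregard roots below 1.0

# Standardized variables (variance 1) transformed to the new co-ordinates.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
scores = Z @ vectors
```

Because every standardized variable has a variance of 1, the latent roots sum to the number of variables, and the variance of each column of `scores` equals the corresponding root. Inspecting the columns of `vectors` shows which characters carry the large weighting coefficients on each component.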



It would also have been possible to carry out a principal component ('R'-type) 

 analysis of a correlation matrix based on the coded character data described earlier. 

 However, Gower (1966) pointed out that the 'Q'-type approach of principal co- 

 ordinate analysis based on a similarity matrix is mathematically equivalent to 

 the 'R'-type, but is computationally simpler and statistically more appropriate 

 when many qualitative variates are included. 
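
The equivalence noted by Gower can be illustrated numerically. The sketch below assumes Euclidean distances between individuals, the case in which the 'Q'-type and 'R'-type results agree exactly; the function name `principal_coordinates` and the simulated data are hypothetical, and a similarity coefficient suited to qualitative characters would replace the distance matrix in practice.

```python
import numpy as np

def principal_coordinates(D):
    """Principal co-ordinates from a matrix D of distances between individuals."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centring matrix
    G = -0.5 * J @ (D ** 2) @ J               # double-centred matrix
    roots, vectors = np.linalg.eigh(G)
    order = np.argsort(roots)[::-1]
    roots, vectors = roots[order], vectors[:, order]
    keep = roots > 1e-10 * roots[0]           # drop numerically zero roots
    return vectors[:, keep] * np.sqrt(roots[keep]), roots[keep]

# Hypothetical data: six individuals, three quantitative characters.
rng = np.random.default_rng(2)
X = rng.normal(size=(6, 3))
n = X.shape[0]

# 'Q'-type: start from distances between individuals.
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
coords, pco_roots = principal_coordinates(D)

# 'R'-type: start from the variance-covariance matrix of the characters.
Xc = X - X.mean(axis=0)
pca_roots, pca_vectors = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(pca_roots)[::-1]
pca_roots, pca_vectors = pca_roots[order], pca_vectors[:, order]
scores = Xc @ pca_vectors
```

The two sets of co-ordinates agree apart from an arbitrary reversal of sign on each axis, and the principal co-ordinate roots are (n - 1) times the principal component roots.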



In order to arrive at an objective assessment of the taxonomic value of measure- 

 ments to be used, several principal component and canonical variate (multiple 

 discriminant) analyses were undertaken. More measurements of both imago and 

 worker castes were made than were likely to be put to practical use, in the expecta- 

 tion that the analyses would pick out the most valuable. Some of those suggested 

 by Roonwal were rejected because they are those of parts easily altered by distortion 



