MACY: CLASSIFYING LONG-FINNED SQUID INTO SEXUAL MATURITY STAGES 



Statistical analyses were performed using the Biomed Computer Programs P-Series (BMDP) (Brown 1977) on an Itel AS-5 R computer² of the University of Rhode Island Academic Computer Center. The initial data matrix consisted of the 20 variables listed in Table 1, from 675 males and 693 females randomly selected from the 1976 Cryos offshore and 1976 Narragansett Bay inshore samples. After standardization of the variables, principal components analysis (BMDP program 4M) (Morrison 1976) was employed to group variables and to determine their importance in accounting for observed variance. Cluster analysis (BMDP 2M) was then used with the Euclidean-distance metric as the amalgamation algorithm to group cases (Anderberg 1973). Finally, stepwise linear discriminant analysis (BMDP 7M) (Anderson 1958) was used with different variable combinations to generate a series of functions which best discriminated between the groups identified in the cluster-analysis stage. A goal of 95% or better overall correct classification of individuals was set.
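The standardization step and the Euclidean distance used for grouping cases can be illustrated with a minimal sketch. It is not the BMDP implementation: the measurement values and the names `standardize`, `euclidean`, and `squid` are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption.

```python
import math

def standardize(rows):
    """Return a z-scored copy of rows (a list of equal-length lists).

    Assumes every variable actually varies (nonzero standard deviation)
    and uses the sample standard deviation (n - 1 denominator).
    """
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def euclidean(a, b):
    """Euclidean distance between two cases measured on the same variables."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical cases: [mantle length (mm), gonad length (mm)]
squid = [[100.0, 10.0], [120.0, 14.0], [200.0, 40.0], [220.0, 44.0]]
z = standardize(squid)

# After standardization, the two small squid lie closer to each other
# than either does to the large squid.
print(euclidean(z[0], z[1]) < euclidean(z[0], z[2]))
```

Standardizing first keeps variables measured on large scales from dominating the distance between cases.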



RESULTS 



Development of the Discriminant Functions



The initial cluster analysis revealed only two major groupings for each sex. Further examination, however, suggested that the major clusters consisted of different size-based groupings of mature and immature individuals. Spent squid (using the Vovk (1972) scale) did not group together. The weight variables (WW, GW, and GI; Table 1) were then dropped from the data matrix because it was known that length measures correlate well with their respective weight counterparts and because principal components analysis had not indicated any particular advantage to using one variable type or the other. Cluster analysis was then rerun using the remaining 17 variables. Four clusters of developmental stages could then be recognized, corresponding to "ripe/spent," "nearly mature," "advanced immature," and "immature" (barely sexable). Several size-based subgroups were still evident within the major clusters. After scoring the squid on a 1-4 scale based on the cluster results, subsequent discriminant analysis produced moderately good separation of the four groups.
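The grouping of cases into four clusters can be illustrated with a small agglomerative sketch. It is an assumption-laden stand-in, not the BMDP 2M procedure: clusters are merged by the Euclidean distance between their centroids (a linkage rule chosen here for simplicity), and the data points and the names `centroid` and `agglomerate` are hypothetical.

```python
import math

def centroid(points):
    """Mean position of a list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def agglomerate(points, k):
    """Merge the two clusters with the closest centroids until k remain.

    Returns a list of clusters, each a sorted list of case indices.
    Centroid linkage is an assumption made for this illustration.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca = centroid([points[i] for i in clusters[a]])
                cb = centroid([points[i] for i in clusters[b]])
                d = math.dist(ca, cb)  # Euclidean distance (Python 3.8+)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical standardized cases forming four tight pairs
cases = [[0.0, 0.1], [0.1, 0.0], [1.0, 1.1], [1.1, 1.0],
         [2.0, 2.1], [2.1, 2.0], [3.0, 3.1], [3.1, 3.0]]
groups = agglomerate(cases, 4)
print(groups)
```

Each resulting group could then be given a provisional score of 1 to 4 before the discriminant analysis.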



Efforts were then focused on improving class separation and reducing the number of variables required. First, those cases which were suspected to be misclassified based on posterior probability and Mahalanobis D² statistics (Lachenbruch and Mickey 1968) were corrected. By this time a rather clear picture of the characteristics of each stage had been formed, and thus inspection of the raw data was often sufficient to determine if reclassification was warranted. Using different combinations of variables in the discriminant analyses, the number of variables was further reduced by retaining only those which improved classification accuracy, as indicated by a pseudojackknife test (see BMDP documentation; Lachenbruch and Mickey 1968).
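How a Mahalanobis D² statistic can flag a suspect case may be sketched for two variables. All numbers here (group means, the pooled covariance matrix, and the measured case) are invented for illustration, and `mahalanobis_d2` is a hypothetical helper, not a BMDP routine.

```python
def mahalanobis_d2(x, mean, cov):
    """Squared Mahalanobis distance of x from mean for two variables.

    cov is a 2x2 covariance matrix given as nested lists or tuples.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Inverse of a 2x2 matrix
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mean[0], x[1] - mean[1]]
    return (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
            + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))

# Hypothetical group means for two indices, keyed by maturity stage
stage_means = {2: (0.30, 0.20), 3: (0.50, 0.40)}
# Hypothetical pooled within-group covariance matrix
pooled_cov = [[0.01, 0.004], [0.004, 0.01]]

case = (0.48, 0.37)   # a squid provisionally scored as stage 2
d2 = {s: mahalanobis_d2(case, m, pooled_cov) for s, m in stage_means.items()}
nearest = min(d2, key=d2.get)
print(nearest)  # a nearest stage other than 2 marks the case for inspection
```

A case whose smallest D² belongs to a stage other than its assigned one is a candidate for the kind of raw-data inspection described above.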



Best results were obtained with the following input variables: MW and MWI, SPL or AGL, API or AGI, TLI or NGI, and ASP or AEOV (Tables 1, 2). In males, 94.6% correct classification (Table 2) was obtained using only the three most important variables, SPI, MWI, and TLI (determined by their order of entry into the stepwise analysis), while 96.6% of the females were correctly classified using the first four variables: AEO, AGI, MWI, and NGI. Stage 2 squid were misclassified in 19% of the cases in males and 10.7% in females, but squid of the other stages were correctly grouped in at least 90% of the cases. A plot of the



Table 2.— Classification of Loligo pealei into stages of sexual maturity using linear discriminant functions. The variables to be measured are listed below with their coefficients or weighting factors for each maturity stage. To classify an individual, construct four linear equations, one for each stage, using the measured values and the appropriate coefficients and constant and solve. The equation resulting in the largest value indicates the stage into which the individual has been assigned.
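The classification rule in the caption can be sketched as follows. The coefficients and constants below are placeholders invented for illustration and are NOT the values of Table 2; only the variable names SPI, MWI, and TLI come from the text, and `classify` and `FUNCTIONS` are hypothetical names.

```python
def classify(measurements, functions):
    """Evaluate one linear function per stage and return the best stage.

    functions maps stage -> (constant, {variable name: coefficient}).
    Returns (stage with the largest value, dict of all values).
    """
    scores = {}
    for stage, (constant, coefs) in functions.items():
        scores[stage] = constant + sum(
            coefs[name] * measurements[name] for name in coefs)
    best = max(scores, key=scores.get)
    return best, scores

# Placeholder coefficients (hypothetical, not from Table 2)
FUNCTIONS = {
    1: (-2.0, {"SPI": 2.0, "MWI": 4.0, "TLI": 1.0}),
    2: (-4.0, {"SPI": 6.0, "MWI": 6.0, "TLI": 3.0}),
    3: (-7.0, {"SPI": 12.0, "MWI": 8.0, "TLI": 6.0}),
    4: (-12.0, {"SPI": 18.0, "MWI": 10.0, "TLI": 10.0}),
}

# Hypothetical measured indices for one male squid
stage, scores = classify({"SPI": 0.5, "MWI": 0.3, "TLI": 0.2}, FUNCTIONS)
print(stage, scores)
```

In practice the constants and coefficients would be taken from the table for the sex in question, and the individual assigned to the stage whose equation gives the largest value.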



² Reference to trade names does not imply endorsement by the National Marine Fisheries Service, NOAA.






