


data sets that include only phylogenetically informative characters (see Sanderson 1989; 

 Kluge & Wolf 1993) that are sufficient in number and adequately distributed to reconstruct 

 all portions of the true phylogeny. Given the view that true homoplasy does not exist [as 

 homoplasy merely represents inadequately or improperly described features (see Hennig 

 1966)], each set of phylogenetically informative characters should yield the true phylogeny 

 (or at least very close to it) under these ideal conditions. 



Ignoring any potential flaws in the logic of the cladistic method, the main problem is that 

 we cannot a priori discriminate between characters that have been shaped by evolution 

 via common descent (i.e., are phylogenetically informative) and those that have been 

 influenced by a host of other processes. The variable inclusion of these latter, 

 phylogenetically misinformative, characters will, when they conflict with the informative 

 characters, deflect us away from the true phylogeny to varying extents. This, undoubtedly, 

 is the cause of the many conflicting systematic hypotheses for a given group present 

 throughout the literature. Thus, our data sets probably possess biased estimates of the 

 actual distribution, and the various tests that aim to place confidence intervals on the 

 distribution implied by the data are, in most cases, placing confidence intervals on this 

 biased distribution (but see Felsenstein & Kishino 1993; Hillis & Bull 1993). This is 

 unwittingly illustrated for the bootstrap in Fig. 1 of Hillis & Bull (1993). The bootstrap 

 pseudosamples (= replicates here) are one step too far removed to be able to estimate the 

 true phylogeny (without the additional assumption that all the characters are 

 phylogenetically informative). 
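
The point can be illustrated with a minimal sketch, which is not the analysis performed in this study. It assumes a hypothetical three-taxon problem (taxa A, B, C with an all-zero ancestor), in which each informative binary character supports exactly one pairing and the most parsimonious tree is the pairing supported by the most characters. The character counts, the function names, and the choice of "AB" as the true clade are all invented for illustration.

```python
import random
from collections import Counter


def best_grouping(chars):
    """Return the pairing supported by the most characters, or None on a tie.

    Each entry of `chars` names the pair of taxa that share the derived
    state for one character (e.g. "AB"). With three ingroup taxa and an
    all-zero ancestor, this pairing is the most parsimonious tree.
    """
    counts = Counter(chars)
    top = max(counts.values())
    winners = [group for group, n in counts.items() if n == top]
    return winners[0] if len(winners) == 1 else None


def bootstrap_support(chars, clade, replicates=1000, seed=0):
    """Fraction of bootstrap pseudosamples whose best tree contains `clade`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        # Resample characters with replacement, keeping the matrix size fixed.
        pseudosample = [rng.choice(chars) for _ in chars]
        if best_grouping(pseudosample) == clade:
            hits += 1
    return hits / replicates


# Hypothetical matrix: "AB" is assumed to be the true clade, but only
# 6 characters reflect it; 14 misinformative characters support "BC".
chars = ["AB"] * 6 + ["BC"] * 14

support_true = bootstrap_support(chars, "AB")
support_false = bootstrap_support(chars, "BC")
print(f"support for true clade AB:  {support_true:.2f}")
print(f"support for false clade BC: {support_false:.2f}")
```

Under these assumptions the false clade receives high bootstrap support simply because the resampling draws from the observed (biased) set of characters; nothing in the procedure has access to the true phylogeny.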



Yet, Hillis & Bull (1993) indicate that, under certain circumstances, the bootstrap actually 

 provides a conservative estimate of the probability that an indicated group is also found in the known true 

 phylogeny. (The phylogeny was known in this instance as it was computer generated or 

 created in the laboratory using viruses.) But, if this is the case, then how does one explain 

 equally high (and sufficiently high so as to indicate the reality of the clade with some 

 confidence) bootstrap frequencies in conflicting solutions? To illustrate this point, we have 

 run bootstrap analyses equivalent to the one performed here for the "rival" hypotheses of 

 Wyss (1987, 1988a), Wyss & Flynn (1993), and Berta & Wyss (1994). [Where possible, 

 the data matrices were analyzed as indicated in the respective study. The only changes we 

 made were to include all-zero state ancestors for Wyss (1987, 1988a) to polarize the 

 characters, and to change state 9 ("known, but not described") to a question mark for Wyss 

 (1987). This coding more properly reflects that the data are really missing, whereas Wyss's 

 (1987) coding implies that the act of not having a known state described is a putative 

 homology. These changes did not result in a different most parsimonious solution for either 

 study.] In each case, the bootstrap generally supported the findings of the respective 

 conflicting parsimony analyses with bootstrap frequencies about on a par with those 

 observed here (Figs. 6 and 8 respectively). This apparently anomalous result of equally 

 (and sufficiently) supported, but highly contrasting solutions supports our contention that 

 at least the bootstrap, and probably most of the remaining tests, are merely elucidating 

 how strong the underlying, potentially biased distribution is in each set of characters, and 

 not how well each data matrix estimates the actual phylogeny. 
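
The two matrix adjustments described in the bracketed note above (recoding state 9 as missing and adding an all-zero ancestor) can be sketched as follows. This is only an illustration under stated assumptions: the matrix is held as a dictionary of single-character state strings, and the function name, taxon labels, and toy data are hypothetical rather than taken from any of the cited matrices.

```python
def prepare_matrix(matrix, recode_state="9", add_ancestor=True):
    """Return a copy of `matrix` with `recode_state` scored as missing ("?").

    `matrix` maps taxon names to strings of single-character states.
    If `add_ancestor` is true, a hypothetical all-zero ancestor is added
    so that the characters can be polarized.
    """
    n_chars = len(next(iter(matrix.values())))
    prepared = {
        taxon: "".join("?" if state == recode_state else state for state in states)
        for taxon, states in matrix.items()
    }
    if add_ancestor:
        prepared["ancestor"] = "0" * n_chars
    return prepared


# Toy data (invented): "9" stands for "known, but not described".
toy = {"taxon_1": "0190", "taxon_2": "1109", "taxon_3": "1011"}
prepared = prepare_matrix(toy)
for taxon, states in prepared.items():
    print(taxon, states)
```

Scoring the state as "?" keeps an undescribed condition from being treated as a shared derived state, which is the rationale given for the recoding.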



