THE WILSON JOURNAL OF ORNITHOLOGY • Vol. 123, No. 1, March 2011 
default settings of RAVEN 1.3 (Charif et al. 
2008), except the display was set to smooth, 
overlap was adjusted from 50 to 93.7% depending 
on recording quality, and contrast was adjusted 
according to recording intensity with care taken to 
retain all elements of the vocalization. Cursor 
measurements were typically at scales of 0.07 sec/cm
and 0.6 kHz/cm. A concern that voices of
males and females might differ as in some other 
thamnophilid species (e.g., Isler et al. 2002, 
2007a) dictated that the analysis initially distinguished
recordings of males and females. Unfortunately,
individuals in the Willisornis complex
often vocalize from beneath dense cover, and
many recordings did not identify the vocalizing
individual as either male or female. Consequently, the analysis proceeded in
an iterative fashion, aggregating samples when 
results did not indicate differences between males 
and females or between those identified and 
unidentified to gender. For example, we compared 
samples of male loudsongs of populations (except 
gutturalis, whose recording inventory of male
vocalizations was insufficient) before adding 
samples of recordings of females and those 
unidentified to gender. Sample sizes cited reflect 
number of individuals, not number of vocalizations
measured.
Diagnostic differences had to be discrete, non-overlapping
character states that have the potential
for unambiguous signal recognition (Isler et
al. 1998, 1999). Ranges of samples of continuous 
variables could not overlap, and the likelihood 
that ranges would not overlap with larger sample 
sizes was estimated by requiring the means ($\bar{x}$)
and standard deviations (SD) of the population
with the smaller set of measurements (a) and the
population with the larger set of measurements (b)
to meet the test:
$$\bar{x}_a + t_a \mathrm{SD}_a \le \bar{x}_b - t_b \mathrm{SD}_b$$
where $t_i$ (i = a or b) = the t-score at the 97.5 percentile of the
t distribution for n − 1 degrees of freedom.
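
The test above can be sketched in a few lines of Python. This is only an illustration, not the authors' software: the function names (`t_quantile`, `diagnosable`) are hypothetical, the sample standard deviation (n − 1 denominator) is assumed, and the t-score is obtained by numerical integration so that the sketch needs only the standard library.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=1000):
    """CDF for x >= 0: 0.5 plus a Simpson's-rule integral of the density on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile for 0.5 < p < 1, found by bracketing and then bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:
        lo, hi = hi, hi * 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def diagnosable(sample_a, sample_b):
    """Apply the two requirements stated in the text to two samples.

    1. The observed ranges must not overlap.
    2. mean_a + t_a * SD_a <= mean_b - t_b * SD_b, where a is the
       population with the smaller measurements (assumed here to be
       the one with the smaller mean).
    """
    a, b = sorted([list(sample_a), list(sample_b)], key=mean)
    if max(a) >= min(b):
        return False
    t_a = t_quantile(0.975, len(a) - 1)
    t_b = t_quantile(0.975, len(b) - 1)
    return mean(a) + t_a * stdev(a) <= mean(b) - t_b * stdev(b)
```

Each sample needs at least two measurements so that a standard deviation and n − 1 degrees of freedom are defined. Small samples give large t-scores, so the criterion becomes stricter as sample size falls.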
A similar test could not be used for ratios, which
were not normally distributed. Thus, we used a
non-parametric bootstrap simulation to examine
statistical significance. We compared the Difference
Between Means (DBM) of the two taxa being
analyzed and two groups of generated data of the 
same sample sizes. The method generated 10,000 
sample population pairs, with replacement, and 
compared the DBM between the two compared 
species to the distribution of DBMs of the 
simulated populations. The result was distributed 
normally, and significance was assigned according
to the rules of this distribution.
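
A minimal sketch of such a bootstrap comparison follows. The text does not say which pool the simulated pairs were drawn from or which tail rule was used, so two assumptions are made here and should be read as such: simulated pairs are resampled with replacement from the pooled measurements of both taxa, and significance is a two-sided normal approximation. The function name `bootstrap_dbm` and its parameters are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def bootstrap_dbm(sample_a, sample_b, n_sims=10_000, seed=0):
    """Compare the observed Difference Between Means (DBM) with simulated DBMs.

    Returns (observed DBM, z-score, two-sided p-value).
    """
    rng = random.Random(seed)
    observed = mean(sample_a) - mean(sample_b)

    # Assumption: simulated populations are drawn from the pooled data,
    # keeping the original sample sizes.
    pooled = list(sample_a) + list(sample_b)
    sims = []
    for _ in range(n_sims):
        sim_a = rng.choices(pooled, k=len(sample_a))  # sampling with replacement
        sim_b = rng.choices(pooled, k=len(sample_b))
        sims.append(sum(sim_a) / len(sim_a) - sum(sim_b) / len(sim_b))

    # Assumption: the simulated DBMs are treated as normally distributed,
    # so the observed DBM is converted to a z-score and a two-sided p-value.
    z = (observed - mean(sims)) / stdev(sims)
    p = math.erfc(abs(z) / math.sqrt(2))
    return observed, z, p
```

A call such as `bootstrap_dbm(ratios_taxon_1, ratios_taxon_2)` would return the observed DBM with its z-score and p-value. The fixed seed only makes the sketch repeatable.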
We recommend species status under the 
Biological Species Concept (BSC) for populations 
that differed diagnostically in both vocalizations 
and morphology. We accepted current subspecies 
definitions as reflecting diagnostic morphological 
differences described in the literature (Cory and 
Hellmayr 1924, Zimmer 1934, Ridgely and Tudor 
1994, Zimmer and Isler 2003) after finding them 
to be consistent in large series of specimens 
examined at major museums. Vocal differences 
were considered diagnostic if the analysis revealed
three or more diagnostic characters
following the “yardstick” developed by Isler et 
al. (1998). For brevity, we use subspecies names 
to reference populations. 
RESULTS 
Subspecies differed from their geographic neighbors
by at least one diagnostic plumage character in
every instance (100% diagnosable) with the
exception of duidae and lepidonotus, and apparent
hybrids between poecilinotus and duidae. We
examine the biogeography of parapatric populations
after reporting the results of vocal analyses.
Vocalizations 
Vocal repertoires of Willisornis populations
include five principal vocal types: (1) loudsongs,
(2) contact calls, (3) chirrs, (4) raspy series, and
(5) other calls. Soft songs were recorded too
infrequently to be useful in the analysis. 
Loudsongs. —All subspecies deliver a series of 
long, upslurred notes separated by shorter intervals,
the series generally rising in pitch (Fig. 2).
Individual variation of loudsong characteristics 
within populations was high. For example, of 37 
loudsongs analyzed for poecilinotus, nine contained
3-5 notes, 11 contained 6-8 notes, 13
contained 9-11 notes, and four contained 12-15 
notes. The unusually large variability could not be 
related to gender or age in our samples. No 
diagnosable differences in loudsongs were found 
between populations as a consequence of the large 
within-population variability with one exception. 
Notes of nigrigula and vidua were frequency-modulated
in an even pattern, whereas such
modulation was erratic or lacking in loudsong 
notes of the other subspecies (Fig. 2). Differences 
in note shape allowed perfect allocation of 
loudsong recordings to the two groups, and 
