DNA barcoding for Nee Soon flora and fauna 
all individuals from specimen-rich bulk samples with cost-effective high-throughput 
pipelines also allows for presorting using DNA barcodes and reduces downstream 
morphological work on presorted units (Wang et al., 2018). DNA barcodes have the 
additional advantage of enabling associations between different life history stages 
(e.g. larvae and adults; see Yeo et al., in press), and the identification of animal and 
plant parts that are otherwise not diagnostic. For example, DNA remnants in faecal 
matter can be used to carry out a diet analysis (Srivathsan et al., 2015, 2016), 
while free-floating DNA in water can be used to assess which animals 
have been present in that water (Lim et al., 2016). 
Advantages aside, the use of DNA barcodes in species identification comes with 
several caveats that we need to bear in mind; some stem from the nature of species, 
while others are essentially technical. For example, DNA barcoding uses genes that 
are not functionally involved in speciation (Kwong et al., 2012b). Instead, the 
species-specific signatures in barcode genes arise because most sister species 
are old enough to be distinguishable by the genetic differences 
that accumulated over evolutionary time through a mixture of genetic drift and natural 
selection (Meier, 2008). Predictably, recently diverged species pairs can share DNA 
barcodes; i.e., they cannot be distinguished based on these barcodes. Ten 
years of experience with barcoding suggest that this is fairly rare in animals: about 
90% of all species have their own signature in COI sequences. A bigger problem, 
more technical in nature, is that a large proportion of animal species have not yet 
been barcoded, which interferes with the use of DNA barcodes for species identification 
(Kwong et al., 2012a). This is unfortunate because many environmental problems can 
be diagnosed using DNA barcodes, e.g. the presence of invasive species (Collins et 
al., 2012; Ng et al., 2016). As for plants, their genes evolve more slowly, so there is a 
larger proportion of closely related species that are indistinguishable based on DNA 
barcodes (Hollingsworth, 2008; Hollingsworth et al., 2011). This means that DNA 
barcodes can often only distinguish plant genera. One solution to this problem, which 
was also pursued in this study, is sequencing multiple genes or whole chloroplast 
genomes (see “genome skimming”; Straub et al., 2012). 
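The caveats above can be illustrated with a minimal sketch of barcode-based identification. It is not the pipeline used in this study: the reference sequences are invented toy data, the function names p_distance and identify are hypothetical, and the 3% distance threshold is an assumption chosen only for illustration. The sketch computes the uncorrected pairwise distance between aligned sequences and shows the two failure modes discussed in the text: recently diverged species that share a barcode, and species that are missing from the reference library.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected pairwise distance between two aligned, equal-length sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def identify(query, reference, threshold=0.03):
    """Return (matching species names, best distance).

    The list is empty when no reference barcode lies within the threshold,
    and holds several names when species share the closest barcode.
    The default threshold is an illustrative assumption.
    """
    distances = {name: p_distance(query, seq) for name, seq in reference.items()}
    best = min(distances.values())
    if best > threshold:
        return [], best
    matches = sorted(name for name, d in distances.items() if d == best)
    return matches, best


# Toy reference library (invented sequences, not real COI data).
REFERENCE = {
    "Species A": "ACGTACGTACGTACGTACGT",
    "Species B": "ACGTACGTACGTACGTACGT",  # recently diverged: identical to A
    "Species C": "ACGTTCGTACGAACGTACCT",  # older lineage: 3 differences from A
}

shared = identify("ACGTACGTACGTACGTACGT", REFERENCE)
unique = identify("ACGTTCGTACGAACGTACCT", REFERENCE)
unknown = identify("TTTTTTTTTTTTTTTTTTTT", REFERENCE)
```

In this toy library, the first query matches both Species A and Species B and therefore cannot be assigned to a single species, the second matches only Species C, and the third returns no match because nothing similar has been barcoded.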
Other technical problems with DNA barcodes are mostly related to cost and 
time. In particular, traditional Sanger-based DNA barcodes are very expensive 
in terms of both consumables and manpower. Fortunately, we recently developed Next-Generation 
Sequencing (NGS)-based DNA barcodes that circumvent these problems (Meier et 
al., 2016). This is why we were able to barcode a large number of Nee Soon 
specimens and use this information for species discovery. Another technical problem 
is the large amount of Polymerase Chain Reaction (PCR) inhibitors in DNA extracts of 
plants, which interferes with amplifying plant barcodes. We addressed this issue through 
the use of different extraction techniques and by using genome skimming to obtain 
chloroplast genomes. The latter has fewer amplification problems and yields more data 
at roughly the same cost because the cost per base pair of DNA is much lower for NGS 
than for Sanger sequencing. 
Identification of specimens via DNA barcodes is slower than identification via 
morphology for those species with obvious diagnostic morphological features. For 
