Corresponding author: Jeremy deWaard (dewaardj@uoguelph.ca)
© Valerie Levesque-Beaudin, Meredith Miller, Torsten Dikow, Scott Miller, Sean Prosser, Evgeny Zakharov, Jaclyn McKeown, Jayme Sones, Niamh Redmond, Jonathan Coddington, Bernardo Santos, Jessica Bird, Jeremy deWaard. This is an open access preprint distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Levesque-Beaudin V, Miller ME, Dikow T, Miller SE, Prosser SWJ, Zakharov EV, McKeown JTA, Sones JE, Redmond NE, Coddington JA, Santos BF, Bird J, deWaard J (2022) A workflow for expanding DNA barcode reference libraries through ‘museum harvesting’ of natural history collections. ARPHA Preprints. https://doi.org/10.3897/arphapreprints.e84304
Developing an efficient and effective protocol for capturing biological data held in natural history collections is critically important for many emergent projects in biodiversity, such as the construction of a validated, global DNA barcode reference library. To this end, we developed and streamlined a workflow for ‘museum harvesting’ of authoritatively identified Diptera specimens from the Smithsonian National Museum of Natural History (USNM). Our detailed workflow includes both on-site and off-site processing, covering specimen selection, labeling, imaging, tissue sampling, databasing, and DNA barcoding. This approach was tested by harvesting and DNA barcoding 941 voucher specimens, representing 32 families, 819 genera, and 695 identified species collected from 100 countries. We recovered 867 sequences (> 0 base pairs) with a sequencing success of 88.8% (727 of 819 sequenced genera gained a barcode > 300 base pairs). While Sanger-based methods were more effective for recently collected specimens, methods employing next-generation sequencing recovered barcodes for specimens over a century old. The utility of the newly generated reference barcodes is demonstrated by the subsequent taxonomic assignment of nearly 5000 specimen records in the Barcode of Life Data System.