ARPHA Preprints, doi: 10.3897/arphapreprints.e82955
Research Infrastructure Contact Zones: a framework and dataset to characterise the activities of major biodiversity informatics initiatives
Vincent Stuart Smith, Lisa French§, Sarah Vincent§, Matt Woodburn§, Wouter Addink|, Christos Arvanitidis#¤, Olaf Bánki«|, Ana Casino», Francois Dusoulier˄, Falko Glöckler˅, Donald Hobern¦, Martin R. Kalfatovicˀ, Dimitrios Koureas|, Patricia Mergenˁ, Joe Miller, Leif Schulman, Aino Juslén
‡ The Natural History Museum, London, United Kingdom
§ Natural History Museum, London, United Kingdom
| Naturalis Biodiversity Center, Leiden, Netherlands
¶ Distributed System of Scientific Collections - DiSSCo, Leiden, Netherlands
# LifeWatch ERIC, Seville, Spain
¤ Institute of Marine Biology, Biotechnology and Aquaculture, Heraklion, Crete, Greece
« Catalogue of Life, Amsterdam, Netherlands
» Consortium of European Taxonomic Facilities, Brussels, Belgium
˄ Muséum national d'histoire naturelle, Paris, France
˅ Museum für Naturkunde Berlin, Leibniz Institute for Evolution and Biodiversity Science, Berlin, Germany
¦ International Barcode of Life, Guelph, Canada
ˀ Smithsonian Institution Libraries and Archives / Biodiversity Heritage Library, Washington, United States of America
ˁ Meise Botanic Garden, Meise, Belgium
₵ Royal Museum for Central Africa, Tervuren, Belgium
ℓ GBIF, Copenhagen, Denmark
₰ Finnish Environment Institute, Helsinki, Finland
₱ University of Helsinki, Helsinki, Finland
₳ Finnish Museum of Natural History, Helsinki, Finland
Open Access
Abstract

The landscape of biodiversity data infrastructures and organisations is complex and fragmented. Many occupy specialised niches representing narrow segments of the multidimensional biodiversity informatics space, while others operate across a broad front but differ from one another in the data type(s) they handle, their geographic scope and the life cycle phase(s) of the data they support. To characterise these dimensions of the biodiversity informatics landscape, we developed a framework and dataset that survey them for ten organisations (DiSSCo, GBIF, iBOL, Catalogue of Life, iNaturalist, Biodiversity Heritage Library, GeoCASe, LifeWatch, eLTER, ELIXIR), relative to both their current activities and long-term strategic ambitions.

The survey assessed the contact between the infrastructure organisations by capturing the breadth of activities for each infrastructure across five categories (data, standards, software, hardware and policy), for nine types of data (specimens, collection descriptions, opportunistic observations, systematic observations, taxonomies, traits, geological data, molecular data, and literature), and for seven phases of activity (creation, aggregation, access, annotation, interlinkage, analysis, and synthesis). This generated a dataset of 6,300 verified observations, which were scored and validated by leading members of each infrastructure organisation. The resulting data allow high-level questions about the overall biodiversity informatics landscape to be addressed, including where the greatest gaps and the most contact between organisations lie.
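
The size of the dataset can be reproduced from the dimensions listed above. The following is a minimal sketch, not the authors' code: it assumes that each combination of organisation, category, data type and phase was scored twice, once for current activities and once for long-term strategic ambitions, which is one reading of the abstract that yields 6,300. The constant names, the `TIMEFRAMES` labels and the `survey_grid` function are hypothetical.

```python
from itertools import product

# Dimensions as listed in the abstract.
ORGANISATIONS = [
    "DiSSCo", "GBIF", "iBOL", "Catalogue of Life", "iNaturalist",
    "Biodiversity Heritage Library", "GeoCASe", "LifeWatch", "eLTER", "ELIXIR",
]
CATEGORIES = ["data", "standards", "software", "hardware", "policy"]
DATA_TYPES = [
    "specimens", "collection descriptions", "opportunistic observations",
    "systematic observations", "taxonomies", "traits", "geological data",
    "molecular data", "literature",
]
PHASES = [
    "creation", "aggregation", "access", "annotation",
    "interlinkage", "analysis", "synthesis",
]
# Assumption: one score for current activity, one for strategic ambition.
TIMEFRAMES = ["current", "strategic"]


def survey_grid():
    """Enumerate every cell of the (hypothetical) survey grid."""
    return list(product(ORGANISATIONS, TIMEFRAMES, CATEGORIES, DATA_TYPES, PHASES))


grid = survey_grid()
per_organisation = len(grid) // len(ORGANISATIONS)
print(len(grid), per_organisation)  # 6300 630
```

Under this assumption each organisation contributes 5 × 9 × 7 × 2 = 630 observations, and the ten organisations together give 6,300.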

Keywords
methods, data visualisation, coordination, alignment, community, biodiversity informatics