ARPHA Preprints, doi: 10.3897/arphapreprints.e107169
Deliverable D1.3 Best practice manual for findability, re-use and accessibility of infrastructures
Wouter Addink§, Niki Kyriakopoulou§, Lyubomir Penev|, David Fichtmueller, Ben Benjamin Norton#, David Peter Shorthouse¤
‡ Distributed System of Scientific Collections - DiSSCo, Leiden, Netherlands
§ Naturalis Biodiversity Center, Leiden, Netherlands
| Pensoft Publishers & Institute of Biodiversity and Ecosystem Research, Bulgarian Academy of Sciences, Sofia, Bulgaria
¶ Botanic Garden and Botanical Museum Berlin, Freie Universität Berlin, Berlin, Germany
# North Carolina Museum of Natural Sciences, Raleigh, North Carolina, United States of America
¤ Agriculture & Agri-Food Canada, Ottawa, Canada
Open Access
Abstract
Coordinated efforts by biodiversity data infrastructures are needed to bring together diverse forms of data from many scientific areas. Biodiversity data are most valuable when they form a network of knowledge that can be seamlessly integrated and presented to various audiences, promoting both research and education. The Biodiversity Community Integrated Knowledge Library (BiCIKL) project seeks to maximise the potential of integrated data sources by connecting fragmented data derived from biological, paleontological and geological specimens and collections with the information derived from them, such as literature (taxonomic treatments, research papers, etc.), taxonomic information and molecular sequences provided by these infrastructures. It does so under common digital practices and policies for curation, data sharing and open data access across scientific fields. One of the main goals of BiCIKL is to create bi-directional links between data types, a process enabled by: a) the adoption of globally unique and persistent identifiers, agreed upon by all stakeholders, that link to digital specimen objects, collections, taxonomic treatments, people, sequence data and taxa, and b) the implementation of best practices for the generation, management and curation of interlinked data by the host infrastructures. At the same time, infrastructures should be readily discoverable and accessible by end users and should provide data that can be re-used. In this manual we give an overview of best practices and associated recommendations for infrastructures on making the most of their services and data, establishing a network of knowledge with other infrastructures, and serving researchers, data providers and other end users.
These guidelines were developed in collaboration with the infrastructures through Technical RI Forum meetings organised in the context of the BiCIKL project. Practices and recommendations are divided into six categories: 1) modalities of access, 2) building communities and trust, 3) technology and standards, 4) versioning of APIs and their data, 5) bi-directional linking between infrastructures and 6) API design patterns and naming conventions. A second division by three user groups (infrastructures; data providers; and users, e.g. researchers, developers and citizen scientists) is presented in Appendix I.
Keywords
BiCIKL; Research Infrastructures