<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//TaxonX//DTD Taxonomic Treatment Publishing DTD v0 20100105//EN" "../../nlm/tax-treatment-NS0.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:tp="http://www.plazi.org/taxpub" article-type="research-article" dtd-version="3.0" xml:lang="en">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">102</journal-id>
      <journal-id journal-id-type="index">urn:lsid:arphahub.com:pub:73abe0ce-d97c-5d7c-bee5-b8e6e6fe6a17</journal-id>
      <journal-title-group>
        <journal-title xml:lang="en">ARPHA Preprints</journal-title>
        <abbrev-journal-title xml:lang="en">preprints</abbrev-journal-title>
      </journal-title-group>
      <publisher>
        <publisher-name>Pensoft Publishers</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.3897/arphapreprints.e114920</article-id>
      <article-id pub-id-type="publisher-id">114920</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Project Report</subject>
        </subj-group>
        <subj-group subj-group-type="scientific_subject">
          <subject>Computer &amp; Information sciences</subject>
        </subj-group>
        <subj-group subj-group-type="sdg">
          <subject>Life on land</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Milestone MS32 The design and prototype of a workflow integrating Wikidata into validation and linking</article-title>
      </title-group>
      <contrib-group content-type="authors">
        <contrib contrib-type="author" corresp="yes">
          <name name-style="western">
            <surname>Dillen</surname>
            <given-names>Mathias</given-names>
          </name>
          <email xlink:type="simple">mathias.dillen@plantentuinmeise.be</email>
          <uri content-type="orcid">https://orcid.org/0000-0002-3973-1252</uri>
          <xref ref-type="aff" rid="A1">1</xref>
        </contrib>
        <contrib contrib-type="author" corresp="no">
          <name name-style="western">
            <surname>Plank</surname>
            <given-names>Andreas</given-names>
          </name>
          <xref ref-type="aff" rid="A2">2</xref>
        </contrib>
      </contrib-group>
      <aff id="A1">
        <label>1</label>
        <addr-line content-type="verbatim">Meise Botanic Garden, Meise, Belgium</addr-line>
        <institution>Meise Botanic Garden</institution>
        <addr-line content-type="city">Meise</addr-line>
        <country>Belgium</country>
      </aff>
      <aff id="A2">
        <label>2</label>
        <addr-line content-type="verbatim">Botanical Garden and Botanical Museum, Berlin, Germany</addr-line>
        <institution>Botanical Garden and Botanical Museum</institution>
        <addr-line content-type="city">Berlin</addr-line>
        <country>Germany</country>
      </aff>
      <author-notes>
        <fn fn-type="corresp">
          <p>Corresponding author: Mathias Dillen (<email xlink:type="simple">mathias.dillen@plantentuinmeise.be</email>).</p>
        </fn>
      </author-notes>
      <pub-date pub-type="collection">
        <year>2023</year>
      </pub-date>
      <pub-date pub-type="epub">
        <day>31</day>
        <month>10</month>
        <year>2023</year>
      </pub-date>
      <volume>4</volume>
      <uri content-type="arpha" xlink:href="http://openbiodiv.net/BD194349-841D-5B69-B7ED-A5BB2BDCDBB1">BD194349-841D-5B69-B7ED-A5BB2BDCDBB1</uri>
      <history>
        <date date-type="received">
          <day>30</day>
          <month>10</month>
          <year>2023</year>
        </date>
        <date date-type="accepted">
          <day>30</day>
          <month>10</month>
          <year>2023</year>
        </date>
      </history>
      <permissions>
        <copyright-statement>Mathias Dillen, Andreas Plank</copyright-statement>
        <license license-type="creative-commons-attribution" xlink:href="http://creativecommons.org/licenses/by/4.0/" xlink:type="simple">
          <license-p>This is an open access preprint distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</license-p>
        </license>
      </permissions>
      <abstract>
        <label>Abstract</label>
        <p>The aim of this task is to develop a workflow that facilitates linking collector name strings to persistent identifiers (PIDs) for those collectors. Such a workflow should scale up the number of links being made, make the process more efficient and reuse existing work and infrastructures as much as possible, so as not to reinvent the wheel. The work can be roughly split into three subtasks:<br/>- Make existing linking workflows easier to implement in other contexts and by other infrastructures. This includes finding ways for such workflows to produce links that can easily be published, i.e. in a standardised format compatible with existing infrastructure. The suitability of different infrastructures for making established links available should also be assessed.<br/>- Establish, document and improve the comprehensiveness, findability and interoperability of the content in PID-minting resources, in particular Wikidata, as it can be edited openly.<br/>- Refine the decision-making process for establishing links by implementing and improving the methods used to validate potential links.</p>
        <p>This document focuses on linking people. We propose a workflow to 'roundtrip' links established through the Bionomia platform back to the collections holding the attributed specimens, and to make those links available for use by other BiCIKL infrastructures. We will also refine existing automated linking workflows and pilot the new functionalities on the (botanical) collections of the task partners. These refinements will be informed by an assessment of the current state of Wikidata, investigated through shape expressions constructed from commonly used queries and from Wikidata records that have been linked in previous efforts such as the Botany Pilot, Bionomia and specimen data published to GBIF.</p>
      </abstract>
    </article-meta>
  </front>
</article>
