Corresponding author: Rebecca Dikow (dikowr@si.edu)
Corresponding author: Corey DiPietro (dipietroc@si.edu)
© Rebecca Dikow, Corey DiPietro, Michael Trizna, Hanna BredenbeckCorp, Madeline Bursell, Jenna Ekwealor, Richard Hodel, Nilda Lopez, William Mattingly, Jeremy Munro, Richard Naples, Candace Oubre, Drew Robarge, Sara Snyder, Jennifer Spillane, Melinda Tomerlin, Luis Villanueva, Alexander White. This is an open access preprint distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Dikow R, DiPietro C, Trizna MG, BredenbeckCorp H, Bursell MG, Ekwealor JTB, Hodel RGJ, Lopez N, Mattingly WJB, Munro J, Naples RM, Oubre C, Robarge D, Snyder S, Spillane JL, Tomerlin MJ, Villanueva LJ, White AE (2023) Developing responsible AI practices at the Smithsonian Institution. ARPHA Preprints. https://doi.org/10.3897/arphapreprints.e113335
Applications of artificial intelligence (AI) and machine learning (ML) have become pervasive in our everyday lives. These applications range from the mundane (asking ChatGPT to write a thank-you note) to high-end science (predicting future weather patterns in the face of climate change), but because they rely on human-generated or human-mediated data, they also have the potential to perpetuate systemic oppression and racism. For museums and other cultural heritage institutions, there is great interest in automating the kinds of tasks at which AI and ML can excel, in order to make digital collections and archives discoverable, searchable, and appropriately tagged. These include computer vision tasks such as image segmentation and object recognition (labeling or identifying objects in an image), and natural language processing tasks such as named-entity recognition, topic modeling, and generation of word and sentence embeddings.
A coalition of staff, fellows, and interns working in digital spaces at the Smithsonian Institution, who are either engaged in research using AI or ML tools or working closely with digital data in other ways, came together to discuss the promise and potential peril of applying AI and ML at scale. This work is the result of those conversations. Here we present the process that led to the development of an AI Values Statement and an implementation plan, including the release of datasets with accompanying documentation (dataset cards) so that these data can be used with improved context and reproducibility. We plan to continue releasing dataset cards and, for AI and ML applications, model cards, in order to enable informed usage of Smithsonian data and research products.