Building and Maintaining Metadata Aggregation Workflows Using Apache Airflow
Finnigan, Leanne ; Toner, Emily
Genre
Journal article
Date
2021-09-22
DOI
http://dx.doi.org/10.34944/dspace/6936
Abstract
PA Digital is a Pennsylvania network that serves as the state’s service hub for the Digital Public Library of America (DPLA). The group developed a homegrown aggregation system in 2014, used to harvest digital collection records from contributing institutions, validate and transform their metadata, and deliver aggregated records to the DPLA. Since our initial launch, PA Digital has expanded significantly, harvesting from an increasing number of contributors with a variety of repository systems. With each new system, our highly customized aggregator software became more complex and difficult to maintain. By 2018, PA Digital staff had determined that a new solution was needed. From 2019 to 2021, a cross-functional team implemented a more flexible and scalable approach to metadata aggregation for PA Digital, using Apache Airflow for workflow management and Solr/Blacklight for internal metadata review. In this article, we will outline how we use this group of applications and the new workflows adopted, which afford our metadata specialists more autonomy to contribute directly to the ongoing development of the aggregator. We will discuss how this work fits into our broader sustainability planning as a network and how the team leveraged shared expertise to build a more stable approach to maintenance.
Citation
Finnigan, L., & Toner, E. (2021). Building and Maintaining Metadata Aggregation Workflows Using Apache Airflow. Code4Lib, 52.
Available at: https://journal.code4lib.org/articles/16171
Has part
Code4Lib, Iss. 52