dc.creator: Finnigan, Leanne
dc.creator: Toner, Emily
dc.date.accessioned: 2021-09-24T13:25:44Z
dc.date.available: 2021-09-24T13:25:44Z
dc.date.issued: 2021-09-22
dc.identifier.citation: Finnigan, L., & Toner, E. (2021). Building and Maintaining Metadata Aggregation Workflows Using Apache Airflow. Code4Lib, 52.
dc.identifier.citation: Available at: https://journal.code4lib.org/articles/16171
dc.identifier.issn: 1940-5758
dc.identifier.uri: http://hdl.handle.net/20.500.12613/6955
dc.description.abstract: PA Digital is a Pennsylvania network that serves as the state’s service hub for the Digital Public Library of America (DPLA). The group developed a homegrown aggregation system in 2014, used to harvest digital collection records from contributing institutions, validate and transform their metadata, and deliver aggregated records to the DPLA. Since our initial launch, PA Digital has expanded significantly, harvesting from an increasing number of contributors with a variety of repository systems. With each new system, our highly customized aggregator software became more complex and difficult to maintain. By 2018, PA Digital staff had determined that a new solution was needed. From 2019 to 2021, a cross-functional team implemented a more flexible and scalable approach to metadata aggregation for PA Digital, using Apache Airflow for workflow management and Solr/Blacklight for internal metadata review. In this article, we will outline how we use this group of applications and the new workflows adopted, which afford our metadata specialists more autonomy to contribute directly to the ongoing development of the aggregator. We will discuss how this work fits into our broader sustainability planning as a network and how the team leveraged shared expertise to build a more stable approach to maintenance.
dc.format.extent: 6 pages
dc.language: English
dc.language.iso: eng
dc.relation.ispartof: Temple University Libraries
dc.relation.haspart: Code4Lib, Iss. 52
dc.rights: Attribution CC BY
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/us/
dc.subject: Digital Public Library of America
dc.subject: Metadata
dc.title: Building and Maintaining Metadata Aggregation Workflows Using Apache Airflow
dc.type: Text
dc.type.genre: Journal article
dc.contributor.group: PA Digital
dc.relation.doi: http://dx.doi.org/10.34944/dspace/6936
dc.ada.note: For Americans with Disabilities Act (ADA) accommodation, including help with reading this content, please contact scholarshare@temple.edu
dc.description.schoolcollege: Temple University. Libraries
dc.temple.creator: Finnigan, Leanne
dc.temple.creator: Toner, Emily
refterms.dateFOA: 2021-09-24T13:25:44Z


Files in this item

Name: FinniganToner-JournalArticle-2 ...
Size: 436.1Kb
Format: PDF

Except where otherwise noted, this item's license is described as Attribution CC BY