EAC-CPF at the AMNH, Part 1
One year has passed since we began planning and creating descriptive records for the AMNH expeditions and the people who participated in them. The project team set out to encode the descriptions of these entities in the EAC-CPF metadata schema. EAC-CPF stands for Encoded Archival Context – Corporate bodies, Persons, and Families and was endorsed by the Society of American Archivists (SAA) as a technical standard in 2011. The schema itself was envisioned in 2001, when a group of archivists got together to create a high-level model and draft a strategy for implementation and testing. Now, in 2014, there are still few tools for archivists to create and manage EAC-CPF records. We have a plan to dive into these tools, but more on that later… In the current landscape, this lack of resources could hold back CPF adoption across the archival community. But we can’t let that stand in our way. We have a grant-funded mission, and deliverables to produce, and… Excel.
Excel is our go-to program for everything from capturing general inventories to cataloging minimal-level records, and it has facilitated data gathering with little overhead. In fact, prior to this project, the Research Library already had spreadsheets of about 2,000 names recorded in two separate Excel files: one for personal names and the other for expeditions. These descriptions are based on the Special Collections vertical files, a heavily used resource for our library visitors. The spreadsheets capture bare-bones information that can be found in the vertical files, such as dates of existence, summaries generalized into one or two sentences, and affiliations with the AMNH. As EAC-CPF began entering our daily conversations at the Library, we realized that the spreadsheet headers could be mapped to CPF elements. Hence, the vision of this current effort!
Excel laid the groundwork for these so-called “skinny” CPF records. It made sense to us to keep building on it as a tool for richer descriptions of people and expeditions. Barbara advocated for a modular system of descriptions: rich CPF descriptions for creators of archival materials on the one hand, and EAD finding aids detailing the contents of archival collections on the other. As the team discussed the various ways to create rich entity records, our path of least resistance always led back to Excel.
During our project planning last year, the tools available to create EAC-CPF records were limited. ArchivesSpace was in beta testing, and the other CPF projects we were aware of were using a custom FileMaker database or coding directly in XML. ICA-AtoM had implemented CPF in its system, but the open-source software was, at that point, new to us, and we were hesitant to adopt it when a direct migration from Archivists’ Toolkit to ArchivesSpace was being developed. Corey Harper at NYU suggested we look at xEAC, an XForms-based application for EAC-CPF records, but we argued for a system that could support both EAD and EAC-CPF (though we are now strongly considering implementing it for CPF creation and management). It was evident that there was no clear solution for this fledgling schema. However, as a result of weighing the pros and cons of the programs out there, we decided the next step would be to define our needs for an archival content management system. I’ll be writing more on this later. A draft of our functional requirements can be read here in the meantime.
Back to Excel, back to the basics, back to where it started, back to simple tables with headers, a field and a value, a place to put data until a more elegant solution is reached. After all, what is the point of all this data management without the actual content? We evolved our “skinny” master spreadsheets into a single worksheet for a single entity, which could then be pushed into a traditional table layout to support multiple records.
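To make the "single worksheet pushed into a table" idea concrete, here is a minimal sketch in Python. It is not the workbook we actually used: the function names, field names, and sample values are all hypothetical, and it assumes each entity worksheet has already been read out as a list of (field, value) pairs.

```python
def worksheet_to_row(pairs):
    """Collapse one entity's vertical (field, value) worksheet into a flat record."""
    row = {}
    for field, value in pairs:
        row[field.strip()] = value.strip()
    return row


def rows_to_table(rows):
    """Combine several records into a header row plus one row per entity."""
    headers = []
    for row in rows:
        for field in row:
            if field not in headers:
                headers.append(field)
    table = [headers]
    for row in rows:
        # Fields missing from a record become empty cells.
        table.append([row.get(header, "") for header in headers])
    return table


# Hypothetical sample worksheets, one per entity.
sheet_a = [("ID", "EXP-001"), ("Expedition Name", "Example Expedition A")]
sheet_b = [("ID", "EXP-002"), ("Expedition Name", "Example Expedition B"),
           ("Purpose", "Collect specimens.")]

table = rows_to_table([worksheet_to_row(sheet_a), worksheet_to_row(sheet_b)])
for line in table:
    print(line)
```

The vertical layout leaves room for long values while a cataloger is writing, and the flat table is what a later export step would read.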
With our original “skinny” headers transposed to stand upright, we could open up the cells to carry more descriptive information: whole paragraphs for the Biographical and Historical notes! We thought that this new view could greatly enhance data gathering without losing the utility of mapping field headings to CPF metadata tags: “ID” for <recordId>; “Expedition Name” for <nameEntry/part>; “Purpose” for <biogHist>; etc. However, the hierarchical architecture of the schema (also operative in EAD) pushed our descriptions out of the flat box and into new tables held in separate sheets of the Excel file. See a table for <cpfRelation> below.
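As an illustration of how such a header-to-tag mapping could drive XML output, here is a small Python sketch using only the standard library. The three headers come from the mapping above; the full element paths (for example, placing <recordId> under <control>, and wrapping the note text in a <p> inside <biogHist>) are my reading of the EAC-CPF schema, and the function name and sample values are hypothetical. It builds a fragment only, not a schema-valid record.

```python
import xml.etree.ElementTree as ET

# Namespace of the EAC-CPF schema.
EAC_NS = "urn:isbn:1-931666-33-4"

# Spreadsheet header -> element path (paths are assumptions, see above).
HEADER_TO_PATH = {
    "ID": "control/recordId",
    "Expedition Name": "cpfDescription/identity/nameEntry/part",
    "Purpose": "cpfDescription/description/biogHist/p",
}


def build_record(row):
    """Build a partial eac-cpf element from one spreadsheet row (a dict)."""
    # Writing xmlns as a plain attribute declares the default namespace
    # when the tree is serialized.
    root = ET.Element("eac-cpf", {"xmlns": EAC_NS})
    for header, path in HEADER_TO_PATH.items():
        value = row.get(header)
        if not value:
            continue  # skip empty cells rather than emit empty elements
        node = root
        for tag in path.split("/"):
            child = node.find(tag)
            if child is None:
                child = ET.SubElement(node, tag)
            node = child
        node.text = value
    return root


# Hypothetical sample row.
record = build_record({
    "ID": "EXP-001",
    "Expedition Name": "Example Expedition",
    "Purpose": "Collect specimens.",
})
print(ET.tostring(record, encoding="unicode"))
```

Keeping the mapping in one dictionary means a new column can be supported by adding a line, without touching the code that builds the tree.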
One of the (dare I say) sexy aspects of EAC-CPF is the relations element. Recording associated names and their roles creates a virtual network of entities. An expedition record is enhanced by its member components, each of which has its own network of family, colleagues, expeditions and institutions. From a focused center, the netting expands. Not only that, but attaching a URI to a name opens up that network to other connections in the Linked Open Data (LOD) landscape. The Smithsonian generated CPF records for their scientific expeditions; no doubt our institutions hosted the same scientists and artists. A virtual link can be drawn seamlessly, if the metadata supports it. While there are a lot of unknowns and possibilities in the linked data landscape, we grounded ourselves in knowing that the best thing to do is to prepare our data for the LOD environment. So our spreadsheet grew legs, in the form of Timeline and Relationship tables.
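For readers who want to see what a relationship row might turn into, here is a hedged Python sketch. The column names ("Name", "Role", "URI", "Relation Type") are hypothetical stand-ins rather than our actual headers, the sample names and URI are placeholders, and recording the role in a <descriptiveNote> is just one possible choice. The element and attribute names (<relations>, <cpfRelation>, <relationEntry>, cpfRelationType, xlink:href) are taken from the EAC-CPF schema.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK_NS)


def build_relations(rows):
    """Turn relationship rows (dicts) into a <relations> element."""
    relations = ET.Element("relations")
    for row in rows:
        # Default to a generic association when no type is recorded.
        attrs = {"cpfRelationType": row.get("Relation Type", "associative")}
        uri = row.get("URI")
        if uri:
            # The URI is what lets other linked-data sources point at the same entity.
            attrs["{%s}type" % XLINK_NS] = "simple"
            attrs["{%s}href" % XLINK_NS] = uri
        relation = ET.SubElement(relations, "cpfRelation", attrs)
        ET.SubElement(relation, "relationEntry").text = row["Name"]
        role = row.get("Role")
        if role:
            note = ET.SubElement(relation, "descriptiveNote")
            ET.SubElement(note, "p").text = role
    return relations


# Hypothetical sample rows; the URI is a placeholder.
relations = build_relations([
    {"Name": "Example Person", "Role": "Expedition leader",
     "URI": "http://example.org/entity/1"},
    {"Name": "Another Person"},
])
print(ET.tostring(relations, encoding="unicode"))
```

A row without a URI still produces a usable relation, so names can be recorded first and linked later.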
Suddenly our basic workform was getting complicated and we had to rethink our approach. Our deliverables specified EAC-CPF records, not multi-level Excel files.
…to be continued: Part 2 will uncover the next big evolutionary leap for our Excel worksheet. Stay tuned!
4 Responses to EAC-CPF at the AMNH, Part 1
I’m in the process of migrating EADitor to bootstrap, and when I release the next beta of it in the next month or two, you’ll be able to hook it up to a xEAC installation for linking archival materials to corporate bodies, people, and families. Ultimately, I plan to implement an optional SPARQL endpoint publishing mechanism in both EADitor and xEAC which will link content in the two systems more seamlessly. EADitor is currently being beta tested by a New York archival consortium pilot group, as well as being used by the ANS in production. The idea is that EADitor and xEAC can be standalone applications in their own right (xEAC has a lot of scholarly potential outside of the traditional archival implementation of EAC-CPF), but can be hooked together with minimal effort.
Thanks, Ethan! We were wondering if EADitor worked with xEAC. I’ll keep an eye out for the update. This is very timely as we plan to test some content management systems this summer. Out of curiosity, are you implementing the experimental “relations” tag in EADitor, or does the SPARQL endpoint address the overlap of the elements in both schemas? We definitely want to represent the relationships in both finding aids and entity records, but do not want to duplicate efforts if the “bioghist” links back to the entity record. It will be interesting to see how this develops and gets used in the archival community.
As of right now, EADitor implements EAD 2002, not EAD 3, but I will be migrating to EAD 3 this summer. I’ve already written the XForms stuff for handling the control element, but it has not been incorporated into the main editing form. I haven’t given much thought to handling relations in EAD.
[…] month I described how we grew our Excel worksheet to support gathering descriptions for AMNH expeditions and personnel in data fields mapped to […]