ARCHIVES 2015 has ended
Tuesday, August 18 • 4:00pm - 4:30pm
Research Forum Session 9: Enabling Access - A Detailed Analysis of Three Million MARC Records for Archival Materials

Archivists have used the MARC format for description since the 1980s, but deep analysis of the corpus of records has never been done. As of February 2014, OCLC’s WorldCat database included 3,000,000 archival records. This research establishes a detailed profile of data element occurrences, providing a view of 30+ years of practice. The data challenge some common assumptions.

MARC provides no straightforward way to extract all records for “archival materials.” We therefore scoped a filter to extract the archival subset. A simplified description of the filter: it pulls in records for “unpublished” materials in any format (e.g., text, visual, moving image, sound recording) held by a single institution. It excludes records for published materials, theses and dissertations, and bibliographies. The filter itself suggests a rich discussion question: What are the characteristics of “archival material” in the context of the MARC format?
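
A minimal sketch of how a filter of this general shape could be expressed is shown below. It is not the filter used in the study, whose actual rules are not given here. The record layout (a dict with `leader`, `tags`, and `holdings`), the helper names, and the "unpublished" heuristic are all assumptions made for illustration. Only the MARC 21 positions themselves are standard: Leader/06 (type of record), Leader/08 (type of control, where `a` means archival), and field 502 (dissertation note). The exclusion of bibliographies is left out of the sketch.

```python
# Hypothetical record shape: {"leader": str, "tags": set of field tags, "holdings": int}
# Leader/06 codes for manuscript or mixed materials in MARC 21:
# t = manuscript language material, d = manuscript notated music,
# f = manuscript cartographic material, p = mixed materials.
MANUSCRIPT_TYPES = {"t", "d", "f", "p"}


def make_leader(record_type, bib_level, control=" "):
    """Build a 24-character MARC leader with the three bytes of interest set."""
    return f"00000n{record_type}{bib_level}{control} 2200000 a 4500"


def looks_unpublished(record):
    """Assumed heuristic: manuscript or mixed type, archival control, or no imprint field."""
    leader = record["leader"]
    has_imprint = bool({"260", "264"} & record["tags"])
    return leader[6] in MANUSCRIPT_TYPES or leader[8] == "a" or not has_imprint


def is_archival_candidate(record):
    """Apply the simplified rules: single holder, not a thesis, apparently unpublished."""
    if record["holdings"] != 1:   # must be held by a single institution
        return False
    if "502" in record["tags"]:   # 502 = dissertation note, so exclude theses
        return False
    return looks_unpublished(record)


if __name__ == "__main__":
    papers = {"leader": make_leader("p", "c", "a"), "tags": {"245", "520"}, "holdings": 1}
    book = {"leader": make_leader("a", "m"), "tags": {"245", "260"}, "holdings": 250}
    print(is_archival_candidate(papers), is_archival_candidate(book))
```

Any real filter would need far more rules than this; the point is only that "archival" has to be inferred from a combination of coded values rather than read from a single field.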

I analyzed the dataset from numerous perspectives in order to address questions such as these: In what significant ways do descriptions differ from one type of material to another? To what extent does use of the archival control byte successfully encompass the universe of archival descriptions? Is it true that archivists usually describe materials at the collection level? How often is DACS used as the content standard?

Some high-level findings:

  • About 50% of records are coded as “mixed materials,” while textual manuscripts and visual materials each account for about 25%.

  • 28% are coded as being under “archival control.”

  • 58% describe collections, 42% describe single items.

  • 85% include one or more indexed creator names.

  • 75% include one or more indexed subject terms.

  • The fields used vary significantly from one type of material to the next.
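
Each figure above is the share of records that meet some condition. As a rough sketch of how such a profile could be tallied, the example below counts records by coded value and field presence. The mapping from each finding to a MARC position is an assumption, not something stated in the abstract: "mixed materials" is read as Leader/06 = `p`, "archival control" as Leader/08 = `a`, collection-level description as Leader/07 = `c`, creator names as 1XX/7XX name fields, and subject terms as any 6XX field. The record layout and function names are hypothetical.

```python
# Hypothetical record shape: {"leader": 24-character str, "tags": set of field tags}
# Assumed stand-in for "indexed creator names": personal, corporate, and meeting names.
CREATOR_TAGS = {"100", "110", "111", "700", "710", "711"}


def share(records, predicate):
    """Return the percentage of records for which predicate(record) is true."""
    if not records:
        return 0.0
    matches = sum(1 for record in records if predicate(record))
    return 100.0 * matches / len(records)


def profile(records):
    """Tally a few leader bytes and field groups across a set of records."""
    return {
        "mixed_materials": share(records, lambda r: r["leader"][6] == "p"),
        "archival_control": share(records, lambda r: r["leader"][8] == "a"),
        "collection_level": share(records, lambda r: r["leader"][7] == "c"),
        "has_creator": share(records, lambda r: bool(CREATOR_TAGS & r["tags"])),
        "has_subject": share(records, lambda r: any(t.startswith("6") for t in r["tags"])),
    }
```

Running `profile` separately on the records for each Leader/06 value would give one way to compare field usage across types of material.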

I also explored implications for effective discovery, including those relative to the findings of the Bron et al. study of the 120,000 EAD instances in the ArchiveGrid database.

About the Author:

Jackie Dooley is a Program Officer in OCLC Research, where she undertakes projects to address current challenges faced by archives and special collections libraries in research libraries. Past projects have included detailed surveys of special collections and archives in the US/Canada and the UK/Ireland, the data from which have helped OCLC define its work agenda for the past five years.

In previous positions Dooley worked with archival, visual, and rare book collections at the University of California at Irvine, the Getty Research Institute, the University of California at San Diego, and the Library of Congress.

Dooley has extensive experience working with descriptive standards, including as a member of the original EAD development team. She has held a variety of positions within SAA, including serving as President (2012-2013).

Dooley’s latest publication is The Archival Advantage: Integrating Archival Expertise into Management of Born-Digital Library Materials (OCLC Research, 2015), which describes ten areas of archival knowledge essential for managing digital materials, such as research data, that are often managed by library units outside the archives.

Jackie Dooley

Retired from OCLC Research
Jackie Dooley retired in 2018 from her position as Program Officer in OCLC Research. She is an SAA Fellow and a past president of the Society.

Tuesday August 18, 2015 4:00pm - 4:30pm EDT
Room 26A Cleveland Convention Center, 300 Lakeside Avenue, Cleveland, OH 44114