Concordia University - Digital Preservation and Access With Avalon

The following is a guest post by Tomasz Neugebauer of Concordia University.


In early December 2016, participants in the Literary Audio Symposium at Concordia University explored the literary-historical study and digital development of collections of spoken recordings, as well as critical and pedagogical engagement with them. Many recordings are already accessible online through repositories such as PennSound and the Cylinder Archive Project, but the potential benefits of collaborative and coordinated development of digitized spoken-audio archives for scholars, teachers and the public are only beginning to be realized.


Jared Wiercinski, Tim Walsh, and Tomasz Neugebauer gave a presentation at the symposium about digital preservation and access with Avalon and Archivematica; a summary of that presentation is provided below. Concordia Library’s interest in digital preservation of and access to audio/video collections was primarily inspired by an ongoing discussion with Concordia’s Center for Oral History and Digital Storytelling (COHDS). In 2010, COHDS developed a custom software solution called Stories Matter for managing audio/visual oral history data, but the software has faced challenges with sustainability of development, insufficient digital preservation functionality, and lack of integration with a web browser.


We began our needs analysis for supporting audio/visual research collections by looking to support the functionality developed in Stories Matter, including:

  • Managing the structure of oral history metadata, which is divided into projects, interviewees, sessions, and clips
  • Setting access permissions at the item level
  • Attaching transcripts, additional documents, and interviewer observations
  • Saving, editing, and browsing playlists

We also wanted specialized research workflow functionality that interoperates with the rest of the library platform. For example, it should be possible to export data from the repository in formats that can be ingested into research tools that generate tag clouds and network visualizations of the data.
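As a rough illustration of that kind of export-and-analyze workflow, the sketch below derives tag cloud and network inputs from exported records. The record layout (a list of clips, each with a `tags` list) and the sample values are assumptions made for this example; they do not reflect the actual export format of Stories Matter, Avalon, or any other system.

```python
from collections import Counter
from itertools import combinations

# Hypothetical export: one record per clip, each with a list of subject tags.
records = [
    {"clip": "clip-001", "tags": ["migration", "family", "work"]},
    {"clip": "clip-002", "tags": ["migration", "work"]},
    {"clip": "clip-003", "tags": ["family", "school"]},
]


def tag_frequencies(records):
    """Count how many clips carry each tag (the weights for a tag cloud)."""
    counts = Counter()
    for record in records:
        # set() ensures a tag repeated within one clip is counted once.
        counts.update(set(record["tags"]))
    return counts


def cooccurrence_edges(records):
    """Count how often two tags share a clip (weighted edges for a network graph)."""
    edges = Counter()
    for record in records:
        # Sorting gives each unordered pair one canonical (a, b) key.
        for pair in combinations(sorted(set(record["tags"])), 2):
            edges[pair] += 1
    return edges


frequencies = tag_frequencies(records)
edges = cooccurrence_edges(records)
```

The two counters can then be handed to whichever visualization tool a researcher prefers: the frequencies as word weights, the edge counts as a weighted edge list.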


In an ideal world, digital preservation and access for all formats would be accomplished within the same system, but in reality, different access systems exist for different formats and audiences. Avalon Media System is designed to provide access to large collections of digital audio and video, and it is supported by a community of educational, media, and open technology institutions. Because audio/video collections are of interest to Concordia Library, Avalon is highly appealing, but it also presents a challenge, since we need to provide access to text and image objects as well. Ensuring the enduring usability, authenticity, and discoverability of these materials requires a dedicated digital preservation system that supports a wide range of document types. Archivematica was selected as the open source digital preservation solution that will be used for preservation planning, identification, characterization, and migration tasks.


Our rationale for selecting Avalon as a top choice centers on our need for a system for qualitative research data in audio/video format. The strengths of Avalon include its evolution out of the successful Variations Digital Music Library project from Indiana University Bloomington, and its strong development community with a proven record of feature-packed releases. The key features of Avalon that make it a top choice for an access system include:

  • Hierarchical structure of objects, with units, collections, items, sections, and time-stamped start and end points
  • Sophisticated access and permission controls at collection and item level
  • Robust metadata and faceted discovery
  • Playlists including items and sections, public or private
  • Captions and subtitles
  • Integration with HydraDam2 and the Spotlight exhibition tool
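To make the hierarchy and playlist features concrete, here is a minimal sketch of units, collections, items, and time-stamped sections. It is not Avalon's actual data model or API; every class name, field, and sample value below is hypothetical, and playlists are simplified to hold sections only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Section:
    label: str
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording

    def __post_init__(self):
        # Reject sections that start before zero or end before they start.
        if not 0 <= self.start < self.end:
            raise ValueError("section must satisfy 0 <= start < end")

    @property
    def duration(self) -> float:
        return self.end - self.start


@dataclass
class Item:
    title: str
    sections: List[Section] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Unit:
    name: str
    collections: List[Collection] = field(default_factory=list)


@dataclass
class Playlist:
    title: str
    public: bool = False  # private unless explicitly shared
    entries: List[Section] = field(default_factory=list)

    def total_duration(self) -> float:
        return sum(section.duration for section in self.entries)


# Hypothetical sample data: one interview split into two sections.
interview = Item("Sample interview", [
    Section("Introduction", 0.0, 95.0),
    Section("Main narrative", 95.0, 410.5),
])
unit = Unit("Library", [Collection("Oral histories", [interview])])
playlist = Playlist("Teaching excerpts", entries=[interview.sections[1]])
```

Because sections carry their own start and end points, a playlist can reference an excerpt of a recording without copying or re-cutting the underlying media file.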


Additionally, while Avalon’s excellent support for video is a key requirement for oral history, the fact that Avalon acknowledges audio as a distinct format with unique requirements is important for spoken word content and other audio types. Audio research data is important for scholarly research in many disciplines; Mark R. Roosa’s (2015) “Sound and Audio Archives” (in M. V. Cloonan (Ed.), Preserving our heritage: perspectives from antiquity to the digital age (pp. 278-287). London: Facet Publishing) lists multiple audio types such as Linguistic, Folklore, Oral history, Ethnographic, Dialectic, Music, Ethnomusicology, Bioacoustics, and Spoken word (p. 279). Although commercial tools like SoundCloud and Spotify offer robust functionality, they leave a large gap in how audio can be used within institutional repositories. Avalon helps to fill this gap.


The fact that an upcoming version of Avalon will integrate the Spotlight exhibition tool developed at Stanford University is particularly promising. Spotlight extends the repository ecosystem by providing a means for reusing digital content in other scholarly websites, allowing for the possibility of pulling content out of Avalon and into new research contexts.


Our concerns about Avalon, in addition to its limited capabilities for non-audio/video documents, include the fact that it is composed of a complex set of software components that can be difficult to install and maintain. There are also missing features in Avalon that we hope will be added, such as administrative, technical and provenance metadata; transcription integration; the Oral History Metadata Synchronizer integration mentioned in the development roadmap; a specialized research workflow interface (comparable to a platform like Databrary); and user annotations. Community sustainability is also a concern, as the Andrew W. Mellon Foundation grant that secured its development is ending in January 2017. However, we were delighted to learn about the recent IMLS grant to support the sustainability of Avalon through, among other developments, the integration of Archivematica. Ideally, in the long term, a workflow would be developed for ingesting content from Archivematica’s Dissemination Information Packages (i.e., access files optimized for the Web) into Avalon. In the short term, it is possible to keep the workflows separate and rely on Avalon directly for all transcoding during ingest.


The presentation at the Literary Audio Symposium raised questions that we continue to consider. From a software architecture point of view, what conditions are necessary to facilitate the development of a sustainable, feature-rich access platform for audio/video content? Does audio/video content need to be accessible in a separate system, or can it be accommodated in the same digital asset management system as all other special collections formats? Given the general lack of awareness around digital preservation issues, how do we promote the development of sustainable and responsible preservation planning for audio and video? Should we continue to build a wide variety of niche repositories, or aim towards a strategy of using centralized repositories? Ultimately, governments and cultural institutions including libraries, archives, and museums have the responsibility of preserving digital research and cultural content, as well as making it accessible.