This summer I found myself thinking a lot about a hypothetical web app for collections managers, archivists, and anyone else who uses the ArchivesSpace assessment module and wants a simple way to get both a bird's-eye view and a record-by-record account of their surveyed collections. It would take a report exported from ASpace, detect different categories of variables based on data type and column name, and spit out an interactive visualization and table.
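The category-detection step could be as simple as inspecting each column's name and dtype with pandas. This is a minimal sketch of the idea, not the app's actual code; the column names and matching patterns are illustrative guesses, not the real ASpace export schema.

```python
import pandas as pd

def detect_categories(df: pd.DataFrame) -> dict:
    """Group report columns into rough categories by column name and dtype.

    The name patterns ("rating", "extent") are hypothetical stand-ins
    for whatever the actual assessment report headers look like.
    """
    categories = {"ratings": [], "extents": [], "other": []}
    for col in df.columns:
        name = col.lower()
        if pd.api.types.is_numeric_dtype(df[col]) and "rating" in name:
            categories["ratings"].append(col)
        elif "extent" in name:
            categories["extents"].append(col)
        else:
            categories["other"].append(col)
    return categories

# Tiny mock report to exercise the function
report = pd.DataFrame({
    "record": ["Papers A", "Papers B"],
    "housing_quality_rating": [5, 2],
    "surveyed_extent": [10.0, 3.5],
})
print(detect_categories(report))
```

Each detected category could then be wired to a different widget or chart type downstream.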

I spent some time playing around with Streamlit, a framework built for ML engineers that lets users turn Python scripts into interactive web apps. I’m not an ML engineer or data scientist, and I’ve never built a web app, but I read their launch post and building an app seemed pretty straightforward. (“If you know how to write Python scripts, you can write Streamlit apps.”) A few late nights later I had a working prototype hosted on Streamlit Sharing. (Sharing requires your script to live in a public GitHub repository, and you need an invitation to Sharing, but I had no problem getting one.)

screenshot_overview

The prototype app features a heatmap for assessment ratings variables. The dark box in the "Housing Quality" row indicates that a large share of collections in this sample have very poor housing, posing preservation risks.
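The data behind a heatmap like this is just a table of shares: one row per ratings variable, one column per rating level, each cell the fraction of surveyed collections at that level. Here is a hedged sketch of that aggregation with pandas; the variable names, the 1-5 scale, and its direction (higher = worse) are assumptions for illustration.

```python
import pandas as pd

# Mock ratings on an assumed 1-5 scale (5 = poorest condition);
# these column names are made up for the example.
ratings = pd.DataFrame({
    "housing_quality": [5, 5, 5, 4, 2],
    "physical_condition": [1, 2, 2, 3, 1],
})

# Rows = ratings variables, columns = rating levels,
# cells = share of collections at that level. A cell near 1.0
# would render as a dark box in the heatmap.
shares = pd.concat(
    {col: ratings[col].value_counts(normalize=True) for col in ratings},
    axis=1,
).T.fillna(0)
print(shares)
```

A matrix in this shape can be handed straight to most plotting libraries' heatmap functions.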


I had to make some sacrifices to the original concept. First, I used mock data instead of letting users upload their own assessment reports. Second, I limited the filters to variables for ratings, surveyed extent, and survey purpose. Trying to accommodate all (or even most) variations of the assessment report was more than I wanted to tackle on my own before I knew whether there was even a need for something like this. At the same time, since everyone can use the assessment module differently (adding custom fields and new ratings variables, using different units to measure extents, and so on), it would be a slippery slope from making an app that accommodates the many variations to reinventing Tableau for collections managers. Differences among the versions of ASpace people may be running complicate the issue further.

Nevertheless, the prototype lets users filter by surveyed extent and assessment ratings, and visualize key variables related to special formats, administrative documentation, and intellectual control. Users can then download a CSV of records matching their criteria, selecting only the columns they need.
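The filter-then-download flow reduces to a few lines of pandas once the widget values are in hand. This is a sketch under assumed column names and thresholds; in the app itself, values like these would come from Streamlit widgets such as st.slider and st.multiselect, and the CSV bytes would feed st.download_button.

```python
import pandas as pd

# Mock assessment records; column names are illustrative.
records = pd.DataFrame({
    "title": ["Papers A", "Papers B", "Papers C"],
    "surveyed_extent": [12.0, 3.0, 40.0],
    "housing_quality": [5, 2, 4],
})

# Hard-coded stand-ins for values that widgets would supply.
min_rating = 4
extent_range = (5.0, 50.0)
wanted_cols = ["title", "housing_quality"]

# Keep rows matching both criteria, then only the chosen columns.
matches = records[
    records["housing_quality"].ge(min_rating)
    & records["surveyed_extent"].between(*extent_range)
]
csv_bytes = matches[wanted_cols].to_csv(index=False).encode("utf-8")
# csv_bytes is what a download button would serve to the user.
print(csv_bytes.decode("utf-8"))
```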

screenshot_results

Filter ratings, select columns, and download the results.


With this prototype, I’m trying to address what I believe is a growing need - a need within a niche within a niche, but a need nevertheless - for an approach to analyzing and drilling down into assessment data that foregrounds the aggregate picture and enables data-informed decisions not only in collections management and preservation, but in how archives advocate for themselves and advance their missions as related to the collections in their care. The code is on GitHub, so anyone can play around with it and adapt the script (and improve it, no doubt!) to suit their situation.

I should add that I felt inspired by some examples of how archives are using assessments to address their backlogs and inform collections management, development, and preservation decisions. One was an article by Tressa Grave, "Assessing Audiovisual Materials: Tips from Ohio State University Libraries," in the May/June 2021 issue of Archival Outlook; another was a webinar from OCLC in which staff from New York University Libraries discuss a survey they carried out at NYU Special Collections. Both of these resources provide great insights into the challenges and benefits of designing and conducting collections assessments, and show the pros and cons of the survey tools the authors used (PSAP and a custom database based on PACSL, respectively).

Both of those tools have built-in capabilities for analyzing assessments in the aggregate and drilling down into assessment data. But what about my old friend, the ASpace assessment module? I see this thing I made as another potential way to approach it. If you check it out, please let me know what you think.

The thing: Collections Assessment App

The code: collections-assessment-app