In our excitement to develop products for the future, do we neglect the past? Wendy Hagenmaier (Georgia Tech) talks with Henry about the importance of maintaining our history, especially in software itself. They chat all about archival: what it is, what should concern an archivist, differences between physical and digital, artifacts and process, the value and worth of things to preserve, struggles, places where archival can happen (personal, libraries, companies, museums), and our shared responsibility and knowledge. (42 min)
Conversations may be edited for clarity.
Henry: Today, I have with me, Wendy. She's the digital collections archivist at Georgia Tech Library. I thought that it would be cool to talk with her because archival in a way it seems to relate a lot to maintenance, especially with digital archiving, it's related to software. So yeah, thanks for joining with me today.
Wendy: Yeah. Thanks so much, Henry. I originally emailed you because I have been thinking a lot about software preservation and software archives lately. And at the same time, a lot of archivists have been talking about and getting involved with maintainer communities. And one of my archivist colleagues saw one of your tweets about your maintainers podcast and passed along your contact. So yeah, I'm really excited to talk with you today.
Henry: Yeah. I think it might be good just to start with talking about archival and being archivist in general. I think I have basic ... I don't have a lot of knowledge in that area, and in a sense it's like I didn't even know that was a full time job in itself. Yeah.
Wendy: I think you're not alone. Probably most people would never think of it as a job. Honestly, I didn't really either until I ended up in library school, didn't know what I wanted to focus on. There was a professor, I was at UT Austin, and the professor there, Dr. David Gracy, was at the orientation and in this thick Texas accent was talking about the importance of records and archives, and how they're evidence that you were born and they're evidence that you died, and they're with you throughout your life. It had never really struck me, the power that's sort of built into these artifacts, whether they're physical or digital, that we create and interact with through the course of our lives and our societies. Since then, I've become very immersed in the archives world, but until that point, I never thought of it either.
Wendy: And I guess, yeah, the way I tend to ... Here at Georgia Tech, we have students come in, mostly undergraduate students for classes, and we talk to them a lot about, how do you use the word archive? Because it's become quite common, right, like in software applications, archive this email, but not in the same sense that as professional archivists we tend to think of it, which is sort of: the archive can be either the records, which are not just old stuff but very frequently things that are being created right now, or it can also refer to sort of the place where the records are preserved and made available. And that can either be a physical place or a digital place.
Wendy: I also like to tell students that with the archives as records thing, unlike a library that has awesome stuff, but it's stuff that exists in a lot of other libraries, so there's redundancy there, in archives, it's often materials that only exist in this single place, in this single instance. So it's that uniqueness and the rare quality of things that makes them particularly special. And it's also things that somebody has like the unique stuff that somebody has decided is worth saving for the long term. And that considering who that somebody is and what their motivations are is really important as an archivist, but also as a user of archives.
Wendy: Yeah. I studied English as an undergraduate and always loved literature and film and sort of these cultural artifacts and cultural spaces, particularly things like classic movies like old Hollywood, and just sort of had this fascination with family history and places, yeah, worlds from the past. I never really thought about library school. I was working for a publisher in San Francisco and I had a friend there who decided to go to library school. And that's really what put the idea in my head, and the more I read about it and how you could be someone working in the maintenance of these cultural worlds, I was like, this is what I should do. This totally fits with what I love. But yeah, it's not, growing up ... Maybe if you go to the public library you might consider, oh, that's a career path, but it's not something like, you could be a lawyer, you could be a doctor or an archivist. It doesn't come up very often. Yeah. The origin stories of how people got into the field tend to be pretty interesting.
Henry: Yeah. Because I think about in tech, people are like, oh, I want to work at Google or Facebook, that kind of thing. But then it's like, I don't know, I guess with being an archivist, it's like ... I guess maybe ... I don't know if you have a particular end goal where you want to be in, or is it just that I could work anywhere kind of thing?
Wendy: Yeah. Yeah. That's really interesting. I think for me in my experience, most people tend to think about sort of what category or what type of archives they might like to work in rather than a specific institution although I'm sure there's people with those kind of aspirations. So I work in an academic archives or academic library, which is kind of its own universe, and that Georgia Tech largely collects and documents its own history. That's sort of the mandate as a public institution. But we also have special collections like science fiction materials that relate to what students here are interested in as well as architecture and design and that kind of thing.
Wendy: So academic archives, as an academic archivist, there's sort of a little more emphasis on research. It's not like you're not a professor. But there's a little bit more research and teaching that goes on. So that's like some people like that, some people don't. There are so many other pathways like working for state archives, National Archives, corporate archives, like Coca-Cola has an archives here across the street. Once you start looking for them, archives, they're sort of everywhere. We are sort of unified as a profession, but the individual daily work that we do is quite diverse.
Henry: Yeah. Because you were saying if it's a corporate thing, it's a specific information for them. I think it's interesting because I guess even for academic stuff, it's not like that data necessarily has to be private though, right? You can still share it with everyone. Because you mentioned it being unique versus it might be redundant across different libraries or something like that.
Wendy: Right. Yeah. And I think that this is a sort of stereotype of the archive as this closed place that's mostly about preservation, and like, don't come use our materials, or wash your hands first and don't come in here. Most archivists are much more ... I mean we care about preservation and security, absolutely, but there's no point in securing or preserving anything unless it's going to be available and made accessible and open. So yeah, as you know, I work at a public institution. We have a mandate to serve our users and provide access, and there's definitely movements in archives to open data, to open metadata as much as possible, to be explicit about sort of what the end user is legally allowed to do with the intellectual property and sort of making transparent these materials as evidence of the activities through which they were created as sort of materials with integrity and things that can be trusted. So yeah, openness is definitely quite a mandate for archivists, I think, in general.
Henry: Yeah. I think that relates pretty well to open source, where the desire is that anyone should be able to use all that code, freely available, at least to consume. My next question is kind of what do archivists think about in terms of preservation? What are the concerns that are relevant?
Wendy: Yeah. Yeah. I mean I think that is one of the hardest things that archives grapple with, because as maintainers, we don't always have a choice about that. We have sort of the mandates of our employers. If we're talking about sort of archivists working within institutions, most archives have collection development policies that sort of spell out the scope of what they collect and try to be transparent about that. Archives are really interesting in that they tend to try to collect, to sort of think, more holistically.
Wendy: Certainly you want your institution's collection to be something valuable that researchers will want to use. But if we learn about some collection that's absolutely unrelated to other materials that we've already collected, we might refer that donation to another archives. So there's sort of this collaborative ... We're in this all together for a larger purpose impulse about collecting and collection development that exists in archives that's pretty unique, I think, compared to other more sort of for-profit activities.
Wendy: And I think there's more and more discussion about we're not the only archivist. You could say that everyone is an archivist of their own materials. In fact, I would encourage everyone to feel like that, that you're creating important evidence of your life every day and to be sort of aware of those choices you're making about where you're storing your materials, who you're sharing them with, what companies you're sharing them with. Do you control them? Do you want to delete them? Can you? All of those questions about what's worth preserving that the individual archivist makes or the community archives makes on their own. But yeah, in the archives world, we tend to sort of talk about the term appraisal as this evaluation of what is worth collecting, but also what's worth saving. Yeah.
Henry: Yeah. I guess that sounds like triage if you're trying to think about what's priority and what's important. And I guess maybe in some sense I feel like there is an impulse to just want to save everything, and then there's another impulse where it's like if there's too much, then the one's ... what you're going to say like, no one's going to come look at it anymore because its [inaudible 00:13:47] is so much, there's no value.
Wendy: Right. Yeah. Or how to design systems that help people navigate to what's relevant to them to be used again. And certainly, I mentioned software preservation, it's something that we're thinking a lot about in archives and sort of what is it about, thinking a lot about what is it about software that's essential to preserve. Is it the source code? Is it screenshots? Is it evidence of the community that created the software? And I guess, yeah, I'm curious from your perspective working on software and open source software, what do you see as worth, or the most essential thing to preserve of your own work?
Henry: Yeah. That's really interesting because I don't know if I've thought about it in the way of using the word preservation. We tend to use a word like sustainability, but that's more about the future. But then, a lot of the future is based on the past. Yeah. I think the obvious thing is saving code, but then if we're using open source and version control, that's kind of built in. And the things that we miss are kind of like the metadata that is lost in the code, right?
Henry: It's funny you brought up screenshots even. I think in certain ways, if you have the code you should be able to reproduce what something looks like by running the code. But then you would have to ... As long as you have the source code, you should be able to rebuild everything and see what it looks like. A lot of times people think it's cool to like, say, you go to a website and then you see how it's changed over time, maybe saving the screenshots would be nice. I don't know if you would save all of them for every [inaudible 00:15:45] or every version. I don't know. There's probably a lot of things that are worth preserving, but we don't save them in a way that it could be preserved, say like people just have conversations or meetings and we don't record them or we don't take notes. Those aren't there anymore.
Henry: It's kind of like maybe the problem with preservation of the software is that as developers, we're not putting enough thinking into preservation in the first place and so there's not that much to preserve, because the thing that's left is the code. And if all you had was that, you wouldn't really understand any of the context and like, oh, why did they come up with this decision, things like that.
Wendy: Yeah. That's a really interesting point in that it's a huge problem that there's disconnect between archivists and the rest of the world, or in this case software developers. We need to be talking if anything is going to be sustainable, right, or preservable. And there is this movement in archives to take this sort of proactive approach to partnering with the creator of the records as the records are created to ensure that they can be preservable. At the same time, you don't want to bias the creation process by being present or dictating the way it's created or the way it's documented. So there's sort of this balance between being active and being sort of neutral which you're not. You're never neutral. Yeah.
Wendy: But I think sustainability is such a key word for archives as well because I feel like a lot of what I do is sort of sustainability work that's surrounding either project management here at work or coordinating sort of social, professional infrastructure like community work more broadly here in Georgia and the Society of American Archivists. And it's that kind of building that infrastructure both social and technical that archivists are very much concerned with, because we know that it's absolutely essential for sustainability of our work in the future, particularly as the stuff we're preserving becomes more complex.
Henry: Yeah. I think with open source, we're kind of doing the same thing where ... I mean even in my work, I am mostly doing a lot of nontechnical work too, whether it's, yeah, program management or fundraising and all the social stuff, because I think a lot of people are realizing we need program managers and all these different kinds of people rather than just people that write code because things are more complicated. A lot more people are involved. The scale of things is way bigger and it's hard to manage things, just thinking about the pure code or pure data kind of thing.
Wendy: And I think, yeah, on the flip side and in archives, in libraries, that there is maybe this impulse to organize the community or be sort of organizing projects or whatever, managing projects, but less ... When I went through library school, there wasn't a whole lot of emphasis on programming skills, technical skills. That has, I think, even changed since I was in school. It's changing all the time and becoming more sort of current with the job market.
Wendy: But one of the things I think about a lot going forward is as archivists will we be able to sort of be developing and sustaining our own technical infrastructure in addition to sort of social infrastructure for the work that we do? And how are our values of openness and providing access and ensuring the integrity, ensuring trust between the record and the user, the archivist and the user are so dependent on our technical infrastructure, but we can't sustain ... We don't have necessarily all the skills we need, or we're dependent on so many things for us sustaining that infrastructure. So realizing that we may need more technical infrastructure going-
Henry: Yeah. It feels like for an archive, it's like the whole technology and just the whole digital thing probably changed it so much because it was so physical before and now it's like people are trying to digitize things that were physical. And then we have all these different kinds of operating systems or computers and formats, and it's hard to think about how you would even manage that. It's funny because you're mentioning unity and it's like, well, I wonder if there are standards in how people format the archives, or people are just kind of doing their own thing and then kind of have to ... It's kind of like when you're doing research and you're googling things, you have to go through all these different things and synthesizing yourself.
Wendy: Yeah. Yeah. And I definitely feel like people are working together, at least. I mentioned how archives and libraries are kind of unique in the collaboration impulse that we have. I think also impulses to create standards, and there's too many standards probably. But we do have sort of professional organizations and working groups. For example, I'm working on software preservation stuff as part of an organization and sort of a movement called the Software Preservation Network. So there's these sort of grassroots efforts that become more formal entities of archivists with common concerns who want to work together in order to sustain. What is the best practice for creating metadata about source code? I don't know. But let's work together to figure it out so that we're not reinventing the wheel, and so that our metadata will be interoperable and can be aggregated. Yeah.
Henry: Yeah. It's interesting because ... Yeah. Especially in tech, we have standards, but I think a lot of those standards are mostly run through people working at companies, or just maybe a library just like universities and that kind of thing.
Wendy: Yeah. And I think that increasingly sort of underlying everything we do is sort of these companies that create the technologies that we're preserving or that we're using to communicate about preservation and where that will go in the future, especially in academic archives and public libraries, whatever, this sort of public ownership or the commons is interesting to think about what that will look like going forward.
Henry: Yeah. Because you brought up the commons, like the other podcast we talked about ... Well, there's Elinor Ostrom and she does a lot of work on that. She had a work called Governing the Commons. And I think it's funny because thinking about all this is interesting because there's kind of this dichotomy between not having any infrastructure and you kind of just want people to kind of do whatever they want, like a grassroots kind of deal, but then that has its own issues. And the opposite is this top down thing where you tell people what to do, or like, here, here's the standard. It seems like both of those have issues, and this other approach of trying to figure out how to work together with ... It's almost like the least amount of governance possible or the right amount. How do you figure out what that optimal amount is? It's really a difficult problem to think about.
Wendy: Right. Yeah. Yeah. And I think the organizations will shift. The Software Preservation Network will take on a different shape in the future depending on sort of the needs of its members and the importance of being flexible about governance and being able to change according to the needs or the pain points of the members. Definitely.
Henry: Yeah. I wonder now, so with open source, technically you could be just one person and you put a project on GitHub and now you are an open source maintainer. Or you could work at a company that has a huge project just for that company. Or it could be like Linux where there's multiple companies and foundations. So there's a huge range of possible projects and they all deal with problems in different ways. I'm curious if you see that with archivists or libraries. I mean maybe there's different sizes of those and the problems that they face.
Wendy: Yeah, for sure. Yeah. I feel like I'm ... My thinking is really influenced by another interview that I participated in this week with a researcher named Roger Schonfeld from a research organization called Ithaka S+R, which is sort of ... they study the field of libraries and museums and scholarly communication. The topic was how many different kinds of collaboration vehicles or sort of governance models there are in this field. There's membership organizations, and then there's what he calls trust networks and these grassroots things. And then there's regional collaboration just because you're near each other and it's fun to meet up. Yeah. Or like LISTSERVs, Google groups, Slack channels. Yeah. I think those are just sort of professional models of professional governance.
Wendy: But then if you look at something like digital repository systems which is also really important to what I do and what other archivists do, there's a lot of different models for how ... There's a lot of open source work going on there to create the technology underlying it, but then also to sort of govern that technology. And then there's also closed source solutions or technologies, and there's vendors who provide the open source. I guess that's definitely a point of intersection where your world meets the archives' world, or archives' world meets the open source world. Yeah, those different models.
Henry: It's interesting too because you brought up ... Well, the first episode that we did for this podcast with Eric about speedrunning, he brought up that ... I forgot what exactly I asked, but he was saying how before everyone was talking on forums and LISTSERVs and those kind of things. And now everyone has a Slack or a Discord. And the problem is that all that information, all the people talking is lost because it's in this proprietary thing. And I think that is true for open source and probably a lot of communities because the forum kind of died in a way unless they use something like Reddit, I suppose.
Wendy: Yeah. Like the openness of your documentation or again ... I feel like that ties into sort of my thinking about the choices that you make as an individual over how and where you're documenting your work or your personal activities, and whether knowing, well, it might disappear because I'm storing it here, but that's okay with me. Or I want this to live forever, so I'm going to use this tool. Thinking about that in advance is not ... It's become a really complicated thing for individuals or organizations to think about versus 100 years ago they would have had a ledger book or a diary for recording things and activities, and that would be just so much more straightforward. You wouldn't want to put it in your attic or your frozen basement. But other than that, there was one way of recording stuff.
Henry: That's really interesting, because ... yeah. Even personally, just taking notes and thinking about all the different ways, it's like there are the physical notes, I have Google Keep and Notion and all these different products, and I have notes in every single one of them. And so I don't even know where I wrote something anymore. Yeah. It's really hard to kind of consolidate all that.
Wendy: Yeah. Yeah. A lot of sort of personal archives training materials that I've seen before tend to say, start with just thinking through what you have, what you create, which seems so obvious, but yeah, do you know where it is? Do you know what it is? We create so many things in so many places these days. It's hard to know what you're creating and where.
Henry: And even just thinking through ... Like with the browser, I feel like I used to take a lot of bookmarks and then now I don't do that as much, and there's a lot of information there. And even looking back at what you used to do is really funny. I remember hearing about ... I forgot where I saw it from, but someone was talking about how they would look at their old Wi-Fi access points and it would remind them of the places they've been just by looking at the name, it reminds them of something.
Wendy: Yeah. Yeah. Yeah. One of the things I think is really interesting is that we tend to think of artifacts or physical papers as carrying so much sort of emotional heft, like there's this sort of materiality or this aura surrounding those things, like magic kind of sense, and that people sometimes say, well, let's just ... With digital, it's like, it's just a Word document. No one will care in the future. It doesn't carry that kind of weight.
Wendy: But what I observe in the story you just told, and here at Tech in the library, we have this small lab space where we have some older computers and older video game consoles set up for people to use. And the whole idea is that when people see these things, even when they see the 1980s typewriter that's in that lab space, they get super emotional. And a lot of these people are highly technical people. That's what they do for their work. But they don't come in and say, oh, I know how that works. They say, I feel like I'm five, and I remember seeing that in my grandparents' house. And that emotional reaction to technology, whether it's a digital file or a piece of software or a piece of hardware or whatever, is really fascinating. It makes me hopeful, I think, for the future of how people will value their archives. They will see the value in the Word document in a way that we can't understand right now.
Henry: I remember when I was a lot younger, I wanted to make my own system of bookmarking, which I never implemented. I was like, I thought that all this metadata around it would be a lot more interesting than just the URL, but it's like where were you when you decided to favorite this thing, and what time, and what place, what people did you see? I thought that all that stuff was really important to remembering, but it's like that stuff is not there, and it's like, kind of the metadata around data is just as important, but I kind of ... Yeah. Just like what you said, makes it complicated and just more stuff to think about.
Wendy: Yeah. That's really definitely interesting. I feel like that ties into how archivists tend to really emphasize context: there's the records, and they're evidence of the context in which they were created. That sort of phrasing is used a lot, and the metadata about your bookmarks, the emotion that's tied to or the circumstances tied to the creation of some piece of code. And that's definitely ... Here at Tech, when we're talking about software preservation, we're just as much interested in preserving the code and maybe providing access to it through emulation as we are in doing oral histories or documenting the stories that help to contextualize why and where and how these records were created.
Henry: Yeah. I mean I even think about people always ... when you become friends, everyone's always like, oh, when did I meet you? Or how did I meet you? That kind of thing. And people are like, oh, I forgot. The other person reminds them this whole story. Yeah. It's interesting.
Henry: Yeah. I mean it makes me think that the thing that we want to preserve isn't just that artifact itself, and not even just metadata, but almost the ... especially for code or something like the process, the culture of that project. And I kind of wanted ... Even the Notre Dame, restoring that, it's like, do you want to preserve the original thing or are you going to add something new to it? And what exactly are we trying to preserve? Is it trying to make it new or the same exact thing as before?
Wendy: Yeah. Yeah. It's a really interesting question. Certainly different traditions or impulses would exist in something like historic preservation as a profession or architecture, or even preserving an artwork or restoring an artwork, and there's sort of preservation versus conservation debates that happen. And yeah, it's like, what are sort of ... We talk about what are the significant properties of the file or the building or whatever that you think are essential to preserve, but how do you decide what those are? Is it that people can walk into the cathedral, that its use is the most important thing? Or do you want it to look exactly the same? Is accessibility as important as preserving the exact look and feel? Who's determining those requirements? Yeah. It's probably the people funding the restoration.
Henry: I mean it reminds me of open source too, because we had similar issues when we were running security. There's the Heartbleed incident and all these other things that come up, where these huge things that affect the whole society come up and then people start getting scared of the things that we've not been thinking about. But it's like in the day-to-day, this stuff happens all the time, but in the general public, no one cares, right?
Wendy: Yes. Yeah. Preservation is important, of course. Security is important, of course, but you don't think about it until it impacts you. Yeah. And I often think that, yeah, there has to be an immediate value proposition to archives. It can't just be about preservation. And I think that's with access. It's providing daily access to things that you need. It's like utility. It's the fact that it's relevant in the current moment. But also this storytelling value and this emotional connection value that can be persistent. It doesn't have to only arise when crisis reminds you. It's kind of like ...
Wendy: An example that pops to mind is like Facebook reminding you what you posted five years ago is creating value out of your archive, not in the way that I would want to, but it's an example of sort of reminding you that your archive is valuable without destroying it, or you didn't lose something and then ... yeah.
Henry: Yeah. I guess it's just really hard to be aware of that in the moment. You were talking about it: it's almost that the more aware I am that I am creating history, the more I start to get scared, or not be able to just do the work, kind of thing, right?
Wendy: Yeah. Yeah. Like I was saying about being ... this balance between being proactive versus neutral. You don't want to ... You want to just be able to do your work and not be influenced by the voice in your head saying, thinking too much about the future and what you should be documenting, and how much weight there is on whether your code is good or whatever.
Wendy: I guess maybe if we could refer people to the software preservation cohort, the fostering community of practice, that might be good. People could get from there to wherever if they want to know more about our specific organizations.
Henry: Thanks for listening. Check out our website, maintainersanonymous.com, for show notes and transcripts. If you have any feedback, ideas or guest suggestions, you can reach me on Twitter @left_pad. If you'd like to support the show, you can visit patreon.com/henryzhu.