George Oates: The Need for Human Intervention
Published on June 4th, 2016
* * * * * * * * * * * *
In 2015, Dutch-based architecture and urban design practice MVRDV (founded in 1993) decided to present its digital archive to Het Nieuwe Instituut. MVRDV’s digital archive is used by the institute’s conservators as a case study to explore aspects of digital archiving and preservation. As part of an attempt to look for future preservation efforts, George Oates was asked to reflect and think about ways of highlighting relations between documents, files, and people in digital archives.
Interpretations of the Archive #4: Bringing out subjective relationships. Relations of technics, concepts and actors in digital archives
The archive is not one and the same as forms of remembrance, or as history. Manifesting itself in the form of traces, it contains the potential to fragment and destabilize either remembrance as recorded, or history as written.
Charles Merewether, The Archive (2006:10)
For the fourth commission of New Archive Interpretations the focus is on the practical work that takes – or could take – place at Het Nieuwe Instituut. When it moved to a new office space in 2015, Dutch-based architecture and urban design practice MVRDV (founded in 1993) decided to present its archive to Het Nieuwe Instituut. MVRDV is on Het Nieuwe Instituut’s list of ‘key archives’: architecture agencies that occupy a central position in the history of Dutch architecture. As such, these archives have priority in the active collection policy of the institute. Besides the various models, more than 80% of the archive consists of digital material: text documents, various types of images, videos and technical drawings. MVRDV’s digital archive is used by the institute’s conservators as a case study to explore aspects of digital archiving and preservation. Rather than focusing on traditional methods of selection and deletion, the aim will be to capture the various relations between different files, documents and people that could aid future preservation efforts.
In general an archive consists of documents. At first these documents had to be written or printed on paper (Hirtle 2000), but already at the beginning of the 20th century this assumption was challenged and a wider adoption of what a document could be became current. In 1951 documentalist Suzanne Briet even argued that an archive could include all kinds of objects, or even any physical or symbolic sign – her famous example being the antelope in a zoo – that were “intended to represent, to reconstruct, or to demonstrate a physical or conceptual phenomenon” (1951:7). In the past decades several court cases tried to argue (unsuccessfully) to include ‘instant messages’ (a type of online chat which offers real-time text transmission over the Internet) or other social networking software in evidence as documents (Hirtle 2000 and Balasubramani 2011b). The reason for the dismissal of these ‘documents’ in court is that they cannot easily, or more importantly with accuracy, be traced back to one person, or one specific computer for that matter. In another case, e-mail was said to be sufficiently authenticated ‘based on surrounding evidence’ (Balasubramani 2011a). What this means is that digital information is valued based on additional information and the relations attached to the main ‘document’.
Although it could be argued that contextual information has always been important to assess the value of documents (Levy 1994), a digital environment is inherently relational (Fuller 2005). As early as 1970 computer scientist Edgar F. Codd conceived of a digital database that was based on a relational model of data, in which different connections could be made between different tables. Even though relations and the value of looking at the environment in which something takes place are recognised, in most preservation methods attention is given to digital preservation of files, software, source code and documents, and little attention goes to the relations between these and other elements that are important for preserving a work, project or digital archive. Some of these relations can be significant, meaning that it is important to preserve them too. In other cases the relation should be maintained but single elements might be changed, for example, an image needs to be opened in Photoshop, but this could also be another programme as long as it performs in the same way. However, in other cases Photoshop might be an important reference for the work. But where are the people who created the documents, the software, the code, the projects and the archive for that matter? As rightly argued by Jardine and Kelty, “archives are never just about representation or preservation—they also perform, create, and remake collectives. They participate in governing just as much as they represent some reality or object of study”. How can we trace the relations between the ‘human hardware’ in and with the digital documents? In other words, how are people being reimagined and remade in the creation of archives and databases?
We asked designer, entrepreneur and expert on digital archives George Oates to reflect and think about ways of highlighting relations between documents, files, and people in digital archives. George has worked on and around the web since 1996, mainly focusing on the front-end of the web(sites). Of particular interest to this commission is her previous work creating web software with a human voice and inventing The Commons on Flickr, a program to help public institutions like The Library of Congress and the Smithsonian share photographs from their collections with the Flickr community. Oates maintains a blog, on which she also posts about her commission for Het Nieuwe Instituut.
The Need for Human Intervention
by Annet Dekker (May 2016)
“Imagine if you could view corporate archives through a feeling of human activity and depth, and not just a set of static documents”.
Australian-born designer and entrepreneur George Oates has worked on and around the web since 1996, mainly focusing on the front-end of the web(sites). She is best known for being the first designer of the photo-sharing website Flickr and creating the Flickr Commons program. Since 2007 she has worked in the cultural heritage sector and is increasingly regarded as a go-to expert on digital archives. In 2009 she started work as director of the Open Library project at the Internet Archive. In her time there she also designed new interfaces for the Book Reader, the Wayback Machine, and the 9/11 Archive. In 2011, Oates was appointed a Research Associate at Smithsonian Libraries. She is also a non-executive director of Postal Heritage Services, a subsidiary of The Postal Museum, and is on the advisory board of the British Library Labs initiative, a Mellon Foundation-funded program to increase access to the library’s collections. She also wrote the book If Only The Grimms Had Known Alice, a retelling of Grimms’ fairy tales to include female characters. In 2014 she launched her own company called Good, Form & Spectacle, which has completed projects for institutions like The British Museum, The Victoria and Albert Museum and Wellcome Library.
Could you briefly tell us something about your background? Is there a golden line or a fragmented line full of cuts? And when did you become interested in archives?
What an interesting way to phrase it… You might have to ask someone else what they think, but from my point of view, I’ve always been opportunistic, hard working and curious. I was lucky to fall into the web way back in 1996, when it was really a green field, and particularly appealing to liberal arts students like me.
Even though the Internet has become a much more structured space in just those twenty years or so, there’s still plenty of room for exploration and experiments alongside entrepreneurialism. I think I’ve always been sociologically minded, even from a pretty young age. I was one of those kids that adored fairy tales and Greek goddesses and gods. My lovely Dad gave me a subscription to National Geographic when I was very young, and that instilled in me a deep love of culture that persists today.
I’d say it was in 2008, when I was developing the Flickr Commons program, that I really had my first proper taste of the cultural sector. After about twelve years being an interaction designer and producer type, I started to meet the folks who look after our collective cultural histories, and, well, I fell in love. Since then I’ve stayed very close, but on the periphery. (So perhaps it’s a small, slender gold line, or maybe a sparkly orbit, and not a fragmented line full of cuts.) And now, with my new business, I’ve planted a flag in the sand that declares my interest in this territory. I feel excited that my skills in design and software development — with a hearty dose of the commercial sector drive and scepticism — can make interesting and fun things happen in cultural tech and design. So, we’ll see!
As for archives, I just love the humans who decide to look after our stuff for us, and I love all the stuff we’ve made over the centuries to express our joy and pain and ingenuity and beauty. Most of us collect and classify things pretty naturally, and that can express motive, interest, and power, which I find endlessly fascinating. It’s not just the stuff though, because a museum is also a forum for ideas, and we need lots of those, so I’m trying to help support them and possibly even make more.
The HNI invited you to work with the archive of MVRDV, as a way to think what the possibilities could be when thinking about integrating a fully born-digital archive in their collection. Can you describe how your process started and how it progressed?
It was a very pleasant surprise to hear from you through Heather Corcoran at Rhizome! That correspondence was the genesis of the project, so I’ve included those initial few emails in my own archive for this project.
This is a new style of work for me. I’ve never had a commission from a curator before, asking me to simply think about a thing. You might recall at the beginning of the project I asked you, ‘what do you want me to make for you’? Designers are essentially problem solvers. I suppose I recognised the challenge of this particular digital archive fairly quickly; that it is going to be very difficult to reproduce, or even emulate, some of the really early ‘datascapes’ and other visualisations that the MVRDV crew were doing early on. One thing I learned working at the Internet Archive is that there’s a big digital black hole from about 1960-2000 or so, for lots of different reasons, and MVRDV was formed in the midst of that.
With the MVRDV archive, I took a crash course in outstanding Dutch architecture, and then came to Rotterdam to meet you and visit their office. It was great to visit, and see the physical archive downstairs. I enjoyed the organised chaos, and I’m sorry I couldn’t spend much more time there. MVRDV have used a stable identifier for all their projects since they began: TP000, where 000 is the number of the project. Even something as simple as having a spine like that is hugely helpful, and I wish I’d thought of that as a young designer getting her first job. What if I’d kept track of every project I’d ever made since then using some similar tracking number? I wonder how many projects I’ve done?! I have no idea. Must be hundreds by now. I digress.
Already during the visit, I found myself curious about the people in the company. The partners were absent, but the studio was relatively full, mostly, at a glance, of junior architects and administrative staff. What were they working on? How do they save their work? How many staff are there? How many projects does each person work on? How does the firm develop work? When do they decide to enter a competition? What’s the process for closing a project?
If you look at the public presence of MVRDV (and just about every other design-related agency on the Internet), it reads like a well-manicured garden. You are shown exactly what the company wants you to see. Having worked in an agency or two, I know full well that this version is like the proverbial duck: calm on top of the water, but paddling like mad underneath. The work was also all being represented at the project level. While that’s understandable, it’s not a true reflection of the what, how, and whom of the company. The folks at MVRDV were happy to give me a spreadsheet that listed all the TP123 style projects. The data told me number, name, country, built/not built, type, and a few other facets of each project. It’s probably obvious to say, but the company revolves around the projects it makes, yet the output – or at least the public output – doesn’t particularly suggest the project dynamics, and that’s what I found myself drawn to. For such a creative, productive, indeed prolific studio, I wanted to hear the conversations that generated the radical typologies MVRDV is famous for.
One of the main issues with collecting archives is that it’s done posthumously. It misses the voices of the creators; we’re left with assumptions, even very qualified ones. The huge opportunity here is that the people who made this work are still alive. So, I started by sending Jan and Isabel a short survey that basically asked a single question: What are the four most important projects of MVRDV, and why? We got back about 25 responses (thank you), and, as Jan suspected, they largely reflected the best-known, most popular projects.
What does your ideal archive look like?
I like that it always takes time to get to know an archive. They are places that require exploration and rummaging.
As Ross Parry notes in his brilliant book, Recoding the Museum, as our information experts were deploying computers across the institution to help organise and maintain their vast information systems, that work actually ended up destroying a bunch of the human, creative, messy descriptions of things. The individual expression previously encoded in the daybooks of museum registrars practically disappeared in the march towards rationalisation and standardisation. Archives further still, perhaps, because a lot of the description of archives is narrative.
One of my favourite archives so far is the Historic Photographs and Ship Plans archive of the National Maritime Museum in Greenwich. I had the pleasure of spending a few days there back in 2008 when I was asked to curate a selection of photographs for the museum to add to the Flickr Commons. Heaven! Apart from the stunning photographs and learning the word “thrumming”, my favourite thing about it was Jerry, the head archivist. It was as if he’d looked in every single nook and cranny in the archive, remembered its contents exactly, and then taken great delight from then until now in interpreting random general questions from visitors and retrieving the perfect selection of materials. Actually, I don’t recall using a computer there at all.
So, what do you think will happen when archives all become digital? How will we be able to ‘rummage’ through them?
I think primarily we need to be conscious as data creators of the objects and information we’re putting all over the web or even on our hard drives, and make an effort to describe them at that moment of creation. I was curious to hear about a distinct role and job function called ‘Studio Mirror’ turning up at a design shop in Los Angeles, ‘a single person to own the documentation process’. If only we were all able to afford one of those, but it’s an interesting role I think, and as the article claims, an investment of the company in itself, because it’ll be helpful for new team members to get up to speed, and also for the company to remember itself later.
Companies today don’t have a filing cabinet with all the things to be kept. They use hundreds or thousands of additional services owned by other people to keep their most precious information. That’s a big hairy problem that, for example, Slack, the ‘real-time messaging, archiving and search for modern teams’, is trying to solve. This software-as-a-service allows teams to connect all the various files from other services into Slack, and has spent great time and effort integrating hundreds of these other services into Slack, and then putting a search engine on top. It’s fun to use, and the company is growing really fast. The export functionality looks good, although you have to pay, but I guess that’s understandable since they provide a commercial service. I should mention that I know the four co-founders – we used to work together at Flickr – but I still think it’s an interesting new model, in particular that deep integration instead of a walled garden piece. Probably bears saying, though, that this very important archive is held by someone other than you. Perhaps that’s why the company is valued so highly.
As a consumer of a web service, it’ll be worth figuring out how easy it is to get your stuff out again, otherwise, you’re just throwing a bit of your digital existence into the air.
Going back to MVRDV, the company is well known for their subversive architecture, not only the final result, but also in the development and documentation of their projects. In what way has their approach influenced your thinking about a potential future archive strategy?
Actually, I was happily surprised to feel a lot of resonance with the MVRDV style and approach. The projects I’ve looked at are about malleable use, handsome exceptions and modularity. The designs are always considered as a part of an existing urban system. I’m also impressed with the sheer volume of output of the company. As well as all their buildings and not built designs, there’s a bunch of thick manifestos, and the Internet is covered in their imagery if you know the right things to search for. They’re certainly experts in the simulation of idyllic, sensitive, possible realities, which is something we should all be aiming for.
A lot of MVRDV’s work, at least to me, is about symbiosis with the dynamism of an urban environment, recognising that constant change and moving with it. Perhaps MVRDV might just be radical enough to make private public. The public facade is a performance, but what if the real history of the company is represented through lots of different dynamics? What precursors brought about the recent flurry of new partners in the firm? It’s a sign of success and growth, to be sure. Imagine if you could look around all the correspondence that surrounded those promotions. When did the conversation begin? Who started it? How long did it take? What did it cost? How many meetings did the team have about it? Where were they, and when? The way we work has changed a great deal since the web landed. Our work is recorded in real time. That’s a very different kind of archive than a classic ‘bag of files’ and some photographs.
You mentioned your interest in the human involvement of and in archiving. What are you trying to bring out?
A lot of my research over the last 18 months or so has focussed on another type of ‘datascape’, designed to illustrate the contents of a big database. I’ve been looking for new representations of these catalogues, because I’ve spotted that a lot of the object-level data in these big cultural systems is really dry, and miles away from the curatorial richness borne out in exhibition catalogues and academic writing.
As I mentioned before, if only every archive could also contain the voice of the creator describing the work and the dynamics of an organisation. That’s difficult, and it’d take a brave company (or larger corporation) to be radically open about the work archives. How much were projects worth? What was the worst, most difficult aspect? Why did you fire that person? How did you know that was a success? What happens when someone wants a pay rise? What was the e-mail you should have said in person? What’s the contract you shouldn’t have taken? Now that I’ve had a peek at the digital files that HNI has of MVRDV, this challenge of representing company dynamics digitally is made much more explicit. It’s hard to spot joyous moments or difficult meetings in a ‘bucket o’ files’ for each project. You need witnesses, not just information.
Talking about born-digital archives, in this case that’s not just CAD files and photographs, but a truckload of other types of documentation that speak to the dynamics of the company itself. In a way, something along these lines can be seen in software. Lots of software is described by a README file, where the software creator(s) document the program, and ideally, how to run it elsewhere. That’s something I’ve started to do as I make projects. In each project folder, I write a simple README, which is a note from me about what the project is and how it came about.
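[Editor's illustration] The README habit described above could be sketched roughly as follows. This is a minimal sketch in Python, not a tool used by Oates or MVRDV: the helper name `write_readme`, the file name `README.txt` and the two section headings are assumptions made for this example only.

```python
from datetime import date
from pathlib import Path


def write_readme(project_dir, title, what, origin, written_on=None):
    """Write a short plain-text README into project_dir and return its path.

    Hypothetical helper: the layout below is one possible shape for
    'a note about what the project is and how it came about'.
    """
    folder = Path(project_dir)
    folder.mkdir(parents=True, exist_ok=True)  # create the project folder if needed
    written_on = written_on or date.today()
    lines = [
        title,
        "=" * len(title),
        "",
        f"Written: {written_on.isoformat()}",
        "",
        "What this project is",
        what.strip(),
        "",
        "How it came about",
        origin.strip(),
        "",
    ]
    readme = folder / "README.txt"
    readme.write_text("\n".join(lines), encoding="utf-8")
    return readme
```

Calling `write_readme("TP000-example", "Example project", "A short description.", "Started after an email exchange.")` would leave a dated note in that folder, so the creator's own voice travels with the files.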
Looking at some of your previous work (Flickr, Internet Archive, British Museum, V&A), you tend to work with large databases/archives and find ways to make them smaller… almost trying to avoid the large gestures by bringing it back to the human level. What interests you in the tension between ‘big data’ and small translation of it? Or, how do you balance between these, which may at first seem like extremes?
Perhaps this approach is a product of my experience designing Flickr. It was there that we made a ‘collection management system’ that now holds billions of images. We were never prescriptive about how participants should ‘catalogue’ their ‘collections’. That made it all the more amazing and brilliant when, over time, collections began to emerge in front of us because of a shared, small, social language sometimes, or gathering around a newsworthy event, or self-selection of a social group or theme. We used to call the people who used it ‘the Flickr community’ but as it grew, it became very obvious that there was no single community, but millions of smaller ones, each with their own protocols and shared language. I wonder if it was (is) one of the world’s first truly variegated and networked cultural catalogues.
These days, I’ve been enjoying working with specific collections in recent R&D, instead of the utopian ‘union catalogue’ that is continually promised through the beneficent ideal of linked data. By looking only and directly at a single collection, one can discover its contours and peaks, and can represent it as a unique thing again.
It’s idyllic to suggest that we put our keyword in one place and look into every catalogue on the planet, but, what we’re seeing in reality, today, is that the result of that is the very definition of bland. The various linked data repositories tend towards aggregation and not linkage, so they’re just sort of carrying around each other’s big data instead of understanding it or improving it or doing a really great job of intelligent, delicate interconnection. It’s this intelligent, delicate interconnection that historians and other researchers often do manually over the course of years as they study archives. We need human intervention.
Erica Scourti, who did one of the earlier commissions, mentioned that one of the perils of digital archives is that we won’t be able to find things (the warnings of the dark archive); at the same time it is more difficult to preserve things. What is your take on loss – or at least loss of control – in relation to archives?
I’m not convinced the difficulty of finding things is unique to digital archives. I’ve heard stories of archivists making discoveries from within their own collections quite often!
In some ways loss is a steady state in archives, even for digital things. Loss is something like invisibility in both cases (except in the case of loss through destruction or deaccession). You might say the challenges of long-term retrieval are quite the same: good descriptions and an overall information system of some sort are needed in both cases. As you say, when the ‘archiving process starts with creation’, there is a curious difference with digital archives, where it’s possible to have more simple hooks captured automatically as we create digital materials. Digital things have myriad date stamps, or there’s EXIF data in a digital photograph, for example. It’s like an equivalent of a dated handwritten letter, but for each object. (Provided you can access it in the first place!) Thanks to the incredible pervasiveness of cell phones, many of us are logging our lives day by day, perhaps without particularly thinking about it too hard. That volume and breadth of record is new.
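[Editor's illustration] The 'simple hooks captured automatically' can be made concrete with the date stamps a filesystem already keeps for every file. The sketch below uses only Python's standard library; the function name `describe_file` and the dictionary keys are assumptions for this example, and reading EXIF data is left out because it needs a third-party library.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def describe_file(path):
    """Return the date stamps the filesystem already holds for one file."""
    info = os.stat(path)

    def iso(timestamp):
        # Render a POSIX timestamp as an ISO 8601 string in UTC.
        return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()

    return {
        "name": Path(path).name,
        "size_bytes": info.st_size,
        "modified": iso(info.st_mtime),
        "accessed": iso(info.st_atime),
        # st_ctime is the metadata-change time on Unix and the creation time on Windows.
        "changed_or_created": iso(info.st_ctime),
    }
```

Running `describe_file` over a project folder at the moment of deposit would record these stamps before copying or migration alters them, which is one modest way of letting archiving start at creation.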
I’m curious about how we might preserve this kind of networked data well. I check into places on Foursquare, I have five Twitter accounts (and three defunct ones I kept at previous jobs, and two or three I wrote representing organisations), I use Flickr, Instagram, Facebook, etc. etc. I’m sure my phone knows better where I am sometimes than I do, especially in London! When I started the Flickr Commons, I became much more aware of how a folder of photographs falls so short of the richness of information that surrounds each photograph. In addition to my own tags and other basic metadata, these objects live in a dynamic system with practically infinite actors. When do you stop crawling that network of actors and their actions? When do you stop considering their interaction with the materials as archive-worthy? The right answer is probably that you don’t, if you want a true record. And there’s the rub. Even the biggest national institutions would surely struggle to maintain a service of the scale of a Flickr, or any of the other giant networked platforms. The good part is, big systems like this are all fantastically well-defined information systems, built on code and logical design. Perhaps they’re simpler and quicker to understand than 300 boxes of random bits of paper left to you by someone you never met or who died quite a while ago.
It’s interesting you equate ‘findability’ with a loss of control in your previous question. The more I work with this stuff, the more I’m convinced that what you want to encourage is many, many descriptions of things, especially in a digital archive because they can handle it in ways that physical archives (or index cards or whatever) can’t. Contemporary software expects this networking quite natively now, and yet a lot of cultural search systems still expect you to know what you’re looking for, what it will be called in that system, and that you only want that thing. Instead, you want to give your archive objects a kind of ‘surface area’ so they don’t sink into invisibility. The more points of entry you can give people, the more surface area the objects will have. I love how the ‘surface area’ gives rise to projects like the brilliant Google Poetics, which shows us our own inner workings, and only possible as a result of massive data entry across the planet.
At the same time, I’ve seen more than one digital metadata record that describes something like ‘Untitled, 19—’ and that is not going to help anyone, in particular, the keepers of such materials. It seems now like every day there’s a new project where the masses are released on a database to help describe things, for hugely positive results, and a new, intimate kind of engagement with the materials. I haven’t heard a single case of individual ‘foul play’ in eight years. There has been corporate destruction of services though, to be sure. I’ve also met researchers who do very detailed analysis of cultural materials, creating their own supplemental databases for it, but have no way of feeding this new work back to the institution. It would be good to fix that.
This reminds me of something Jill Sterrett, Head of Collections at SFMOMA, once said; when referring to the preservation of contemporary artworks, instead of rigid solutions of standardisation she proposes to think of ‘planting finds’. In a way similar to how archaeologists connect value to fragments they find (i.e. documents with information value), which account for the variables that are present in the presentation and conservation of many contemporary artworks. This could lead to a new situation where museums would need to reassess their finds each time from a new context. What are your thoughts on this position, and how can we insert (or incorporate) such finds in a digital environment?
I like what Johanna Drucker says about this sort of question, that humanities methods are fundamentally interpretive. We should continually acknowledge that we’re dealing with partial evidence, situated knowledge, and uncertainty. Catalogues in institutions are constantly tweaked, massaged, and even wholly redeveloped over time (although it can also be very difficult to do this easily if you’re constrained by inflexible centrally controlled software systems). I’d like to hear from ‘The Professionals’ whether this ‘reassessment’ happens quite naturally as staff circulate in and out of museums. Perhaps that’s even something that could be part of an archive of an institution. What are the sedimentary layers of research and classification that build up in museums?
I’d equate ‘finds’ with public input too. We’ve seen more and more examples of description and presentation of cultural objects happening outside the institution, but we’re still figuring out ways to bring this into official channels. In the first year of the Flickr Commons, it was hugely exciting to see some of the research and commentary from the Flickr community make its way into the official catalogues of institutions like the Library of Congress. That was amazing! In the digital realm, the more people that access something, the more it is preserved (in our collective memories), and that’s practically the opposite of the way preservation of old physical things works, where we need to be careful about physical things like light and dirt and bugs.
This was one of the reasons I became interested in an archaeological approach to current work and artefacts, because archaeologists are always looking to determine how something from the past may have actually been used. They also look to reveal social systems around objects, and get a sense of the people who made them. Today, in our digital work, we can see in near real time how people are making things together, but we’re a fair way away from capturing that in a non-corporate archive. While we’re trying to figure that out, perhaps we’d all do well to consider a bit more how a stranger might approach our work or life and what they might find when they arrive.
 At the time, in the 1970s and 80s, the ‘database debates’ took place, discussing the difference and value between relational and networked databases. For more information see, for example, Castelle (2013) and for a historical reference of early developments in storage technology and its relation to data management, see Haigh (2009).
Balasubramani, Venkat. 2011a. “Massachusetts Supreme Court Finds Email Sufficiently Authenticated Based on Surrounding Evidence — Commonwealth v. Purdy.” Technology & Marketing Law Blog, 21 May.
Balasubramani, Venkat. 2011b. “Connecticut Court of Appeals Tackles Authentication of Facebook Messages — State v. Eleck.” Technology & Marketing Law Blog, 19 August.
Briet, Suzanne. 2006. What is documentation? (Qu’est-ce que la documentation?). Translated and edited by Ronald E. Day and Laurent Martinet with Hermina G.B. Anghelescu. Lanham, MD: Scarecrow Press.
Castelle, Michael. 2013. “Relational and Non-Relational Models in the Entextualization of Bureaucracy.” Computational Culture. A Journal of Software Studies, Issue three.
Codd, Edgar F. 1970. “A Relational Model of Data for Large Shared Data Banks.” Communications of the ACM. Vol. 13, Nr. 6, pp. 377–387.
Fuller, Matthew. 2005. Media Ecologies. Materialist Energies in Art and Technoculture. Cambridge, MA: The MIT Press.
Haigh, Thomas. 2009. “How the Data Got its Base: Information Storage Software in the 1950s and 1960s.” IEEE Annals of the History of Computing, October-December, pp. 7–25.
Hirtle, Peter B. 2000. “Archival Authenticity in a Digital Age”. In Authenticity in a Digital Environment, edited by Abby Smith. Washington: Council on Library and Information Resources, pp. 8–23.
Jardine, Boris and Christopher Kelty. 2016. “Preface: The Total Archive.” Limn, The Total Archive, Issue 6, March.
Levy, David M. 1994. “Fixed or Fluid? Document Stability and New Media.” European Conference on Hypertext Technology 1994 Proceedings. New York, NY: Association for Computing Machinery, pp. 24–31.
Photo: MVRDV, office November 2015. Photo credits: George Oates