Michael Murtaugh: Portable SKOR

Published on May 19th, 2012


* * * * * * * * * * * *


An e-mail conversation between Michael Murtaugh (MM) and Annet Dekker (AD)

part of NetArtWorks: Online Archives

SKOR, May 2012

 

Online archives are challenging the dynamics of the database by trying to liberate content from the categories that bind it in order to re-aggregate it into new forms.

This act of sorting, arranging and displaying relationships can be traced throughout history. The German art historian Aby Warburg created one of the most interesting examples of an alternative display. In 1924 he presented his Mnemosyne, a display combining photographs, reproductions from books and visual materials from newspapers and magazines. With it, Warburg wanted to stress the contextual constellation of history as a visual process presented in three dimensions: type, history and location. To this day people struggle with this kind of mapping and visualisation, now aided by multimedia interfaces, to make sense of large collections of data. Although there are many interesting examples around, most experiments fail to transcend the limitations of their own database structures and focus only on making the available data accessible.

What would happen if we could go beyond simply ‘providing access’ to data, to supporting the circulation of material, including new productions, created from the material flowing in and out of the archive? In other words, instead of holding on to data, what if we distributed and shared it by making it publicly available? So-called distributed archives or peer-to-peer practices, also referred to by the technical term ‘seeding’, appear to be very sustainable, and online archives organised in this way thrive on redistribution and reuse.

Moving from one idea to the next, Michael Murtaugh developed Portable SKOR. While he was exploring a way to visualise all the different paths in the SKOR website, another possibility surfaced. When the site’s data was converted, it became possible to capture all the content and transfer it to a USB stick. With a portable SKOR archive that can be transferred to any server, the website can be re-worked and reused by anyone in any way. A distributed network of keepers and users can continue to disseminate SKOR’s knowledge and expertise online. This is not merely about nostalgia and clinging to cultural heritage; it primarily signals the possibility of resurfacing, adding to, and recreating.

 

Annet Dekker (AD): While you studied at MIT in the mid-1990s, you experimented with interactive documentaries. Could you describe an example and explain what you wanted to achieve?

Michael Murtaugh (MM): I was part of the Interactive Cinema group led by Glorianna Davenport between 1994 and 1996. At that time, we talked about the ‘Evolving Documentary’, the idea of producing documentaries with multiple compositional possibilities that could grow over time and follow an ongoing or long-term story. Besides shooting and editing video, I worked on technical models for supporting a kind of ‘editor in software’, where editing decisions are made ‘on-the-fly’ based on viewer selections and the history of what footage has been watched. I also worked on an early online video biography about Jerome Wiesner, director of the Research Laboratory of Electronics (from 1946) and president of MIT (1971–80). I find it interesting that this last project is actually still online and, despite the relatively poor video quality, is basically working as well as it did in 1995. This is partly because of the formats we used out of necessity at the time, i.e., HTML [HyperText Markup Language], frames and Java. Many of my projects since then haven’t fared as well; I’ve made a number of Flash and database-driven projects that have stopped working and eventually been taken down completely, as they are difficult to keep running (and I can no longer salvage them because I lost the permission and the ability to access the databases). This experience has sensitised me to the issue of how to keep resources available for a longer term (in this case only 10 years) and the consequences that can arise from the (technical) choices one makes as a programmer/designer.

AD: You are currently working with Constant Association for Art and Media in Belgium on Active Archives, among other projects. What is your interest in (online) archives?

MM: Active Archives (AA) was initiated by Nicolas Malevé as a Constant project in 2006, in which we worked with cultural institutions on evaluating the tools and platforms they use as integral parts of their (digital) archive, the idea being to go beyond simply ‘providing access’ to files to supporting the circulation of material, including new productions, with the material flowing in and out of the archive. Contrary to a conventional physical/analogue archive where preservation often involves controlling and restricting access to the objects (boxes in the basement), digital archives thrive on redistribution and reuse.

The project involved a series of workshops that brought together different kinds of practitioners: archivists, designers, artists and academics researching the semantic web. Ultimately, AA is very much concerned with workflows and with how tools and systems can be designed to support flexibility. Ideally, archiving is a continuous process, from planning new events (a symposium, for example), to the actual event itself, and then afterwards, of course, when recorded materials can form the basis of future works. The trick in all of this has been to design software without trying to offer a particular monolithic system or ‘solution’, and thus become prescriptive. We work with Free Software because the licenses and the related toolsets support this flexibility and multi-directionality.

AD: When I approached you for the commission for SKOR NetArtWorks, the idea initially appeared to be quite simple: I asked you to think of a way to strengthen the website’s community database, drawing on your activities with Active Archives. But what did this mean for you? Can you describe your process? What problems did you encounter?

MM: I started by crawling the material on the current website, using scripts to automatically visit the public pages and extract information, recreating the documents on my laptop and adding mark-up to enable later re-indexing of the documents. Later, I was given an SQL [Structured Query Language; a standard language for a relational database management system] dump of the MySQL database [a popular open source database application supporting SQL], and worked directly with the structure of the data behind the website, but I think it was important to first start with a ‘public-facing’ view of the collection.
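By way of illustration, here is a minimal sketch in Python of the kind of crawl-and-capture script described above: it visits the public pages by following same-domain links and saves each one to disk for later re-indexing. The start URL, output directory and the use of the requests and BeautifulSoup libraries are illustrative assumptions, not a reconstruction of the actual scripts used.

```python
# Sketch of a breadth-first crawl that saves each public page to disk so
# the site can later be re-indexed offline. Start URL and output
# directory are illustrative assumptions.
import os
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

START = "http://www.skor.nl/"   # illustrative start URL
OUTDIR = "skor-mirror"          # illustrative output directory

def crawl(start=START, outdir=OUTDIR, limit=500):
    os.makedirs(outdir, exist_ok=True)
    seen, queue = set(), [start]
    while queue and len(seen) < limit:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            response = requests.get(url, timeout=10)
        except requests.RequestException:
            continue
        if "text/html" not in response.headers.get("Content-Type", ""):
            continue
        # Save the page under a filename derived from its path.
        name = urlparse(url).path.strip("/").replace("/", "_") or "index"
        with open(os.path.join(outdir, name + ".html"), "w", encoding="utf-8") as f:
            f.write(response.text)
        # Queue further links on the same host.
        soup = BeautifulSoup(response.text, "html.parser")
        for a in soup.find_all("a", href=True):
            link = urljoin(url, a["href"]).split("#")[0]
            if urlparse(link).netloc == urlparse(start).netloc:
                queue.append(link)

if __name__ == "__main__":
    crawl()
```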

The SQL was interesting to look at, as it was quite revealing: it displayed problems inherent to SKOR’s set-up that are also representative of issues common to content management systems (CMS). Instead of using individual tables to represent the various kinds of information (a Work table, an Artist table, an Organisation table), SKOR’s database uses a single table called ‘Item’ in which rows can represent any of these categories (an Artist, a Work, an Image). Each Item then has a ‘Type’ column to distinguish what it is. A secondary table, ‘ItemLinks’, represents the relations between items (relations being central to the logic of a relational database such as MySQL). This is in fact the most populated table in the SKOR database, with over 20,000 entries (rows), half of which connect items to related images. Each link/relation also has a notion of a ‘Type’, enabling the expression of information such as ‘the Stedelijk Museum CS is the location of Project X’. In some ways this is a ‘modern’ take on using SQL, as it moves towards a more abstract ‘web’ style of data as documents (or ‘nodes’) with links (relations) between them. However, representing it this way in SQL can result in the worst of both worlds: you lose the content-specificity of having different table structures for different kinds of data, while retaining the rigidity of structure and abstract brittleness of numerically keyed cross-references.
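As an illustration of this pattern, the following is a simplified, in-memory reconstruction, with SQLite standing in for MySQL. The ‘Item’ and ‘ItemLinks’ table names and the ‘Type’ column come from the description above; the remaining column names and the sample data are assumptions.

```python
# Simplified, in-memory reconstruction of the generic Item/ItemLinks
# pattern described above (SQLite standing in for MySQL; column names
# other than Item, Type and ItemLinks are illustrative assumptions).
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE Item     (id INTEGER PRIMARY KEY, Type TEXT, Name TEXT);
CREATE TABLE ItemLinks(FromItem INTEGER, ToItem INTEGER, Type TEXT);
""")

db.executemany("INSERT INTO Item VALUES (?, ?, ?)", [
    (1, "project",  "Project X"),
    (2, "partner",  "Stedelijk Museum CS"),   # same institution, one row per role
    (3, "location", "Stedelijk Museum CS"),
])
db.executemany("INSERT INTO ItemLinks VALUES (?, ?, ?)", [
    (1, 2, "partner"),
    (1, 3, "location"),
])

# Everything is a join through the generic link table: the relation type
# is stored both on the link and on the target item's Type column.
for row in db.execute("""
    SELECT p.Name, l.Type, t.Name
    FROM Item p
    JOIN ItemLinks l ON l.FromItem = p.id
    JOIN Item t      ON t.id = l.ToItem
    WHERE p.Name = 'Project X'
"""):
    print(row)   # ('Project X', 'partner', 'Stedelijk Museum CS') ...
```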

The abstract nature of the SQL helped to obscure a more serious problem of representation that became clear as I started following the links and putting the pieces together. The relation Types (like ‘partner’) are redundantly encoded in the relation and in the type of Item that is referred to. So in the case of ‘Project X’ where the Stedelijk Museum CS is both a ‘partner’ and a ‘location’, there are two links, which is normal, but to two different Stedelijk Museum CS ‘Items’ (one with Type partner, and a completely different one with the same name but with Type location). In addition, translated texts are represented as ‘parallel’ items. So in this case, there would be two parallel projects (‘Project X’ in English and ‘Project X’ in Dutch) linked to four different ‘Stedelijk Museum CS’ items (partner/Dutch, partner/English, location/Dutch, location/English).
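A toy sketch of the parallel rows this produces: one real-world entity ends up as four ‘Item’ rows, one per combination of role and language, all carrying the same name. The ids and the way language is recorded on a row are invented for the example.

```python
# Illustrative sketch of the parallel rows described above; ids and the
# 'lang' field are invented for the example.
items = [
    {"id": 101, "Type": "partner",  "lang": "nl", "Name": "Stedelijk Museum CS"},
    {"id": 102, "Type": "partner",  "lang": "en", "Name": "Stedelijk Museum CS"},
    {"id": 103, "Type": "location", "lang": "nl", "Name": "Stedelijk Museum CS"},
    {"id": 104, "Type": "location", "lang": "en", "Name": "Stedelijk Museum CS"},
]

# Re-assembling the entity means grouping rows purely by shared name,
# the 'join by name' problem mentioned below.
by_name = {}
for item in items:
    by_name.setdefault(item["Name"], []).append(item)
print(len(by_name["Stedelijk Museum CS"]))   # 4 rows for one institution
```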

Beyond being a database designer’s nightmare (as it is now necessary to ‘join’ disparate rows of the database only by virtue of their having the same names), it is clear that such a structure could only come about with the support of a content management system that can automatically generate and maintain the complexity of all the parallel relationships in a way that no human editor ever could.

Another problem relates to the entries that appear in Dutch and in English. Often the two texts for an Item (Dutch and English) are in fact duplicates (such as ‘Stedelijk Museum CS’ in Dutch / ‘Stedelijk Museum CS’ in English), because the text really exists only once, in one language. The duplication is understandable because it follows the working of the front end of the site, which shows only the text matching the selected language. In other words, you paste the English text into the Dutch field when there is no Dutch equivalent; otherwise Dutch visitors would see nothing. The consequence of this duplication, however, is that from a representational point of view there’s no way of knowing the actual original language of duplicated texts (was it Dutch pasted into English, as with ‘Stedelijk Museum CS’, or vice versa?). Changing the front end so that the language can be set as a preference (if a text is not available in the selected language, then show the translation) would help to solve a problem that the structure of the current database helps to create.
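A minimal sketch of the fallback suggested here: display the text in the preferred language if it exists, otherwise show the other language, so that an untranslated text no longer needs to be pasted into both fields. The field names are illustrative.

```python
# Minimal sketch of the language fallback suggested above; field names
# (text_nl, text_en) are illustrative assumptions.
def display_text(item, preferred="nl"):
    fallback = "en" if preferred == "nl" else "nl"
    return item.get(f"text_{preferred}") or item.get(f"text_{fallback}", "")

# With such a fallback, a text that exists only in English can be left
# empty in the Dutch field, so the original language stays recoverable.
print(display_text({"text_en": "Stedelijk Museum CS"}, preferred="nl"))
```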

I proceeded by making a script to ‘hypertextualise’ the database, merging all the parallel entries back into single documents (one project, one document) incorporating both languages (Dutch and English texts co-existing side by side, anticipating an in-browser language filter to hide/show them as appropriate). Structure that was previously the database’s job to maintain is now represented as RDF(a) mark-up in the HTML documents [RDFa, Resource Description Framework in Attributes, adds a set of attribute-level extensions to XHTML for embedding rich metadata within a Web document]. In other words, the text ‘Stedelijk Museum CS’ can be marked up as a link with multiple relations (‘partner’ and ‘location’). A separate indexing script can then be run that crawls all the documents and records an index of the relational links contained within them. Javascript is used to provide a ‘dynamic site map’ interface based on the resulting index. In the future, documents may be edited, removed or added (following the same style of mark-up), and the indexing scripts can be rerun to update the interface. Lastly, the conversion and indexing scripts and the resulting static HTML and Javascript documents are stored in a git repository [a distributed revision control and source code management system] so that the tools and content can be distributed, and separately edited and used. Git tracks changes in a way that supports the eventual merging of two divergent repositories later, if desired.
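A rough sketch of the kind of indexing pass described above: it collects links carrying RDFa-style attributes from a folder of static HTML documents into a single JSON index that client-side Javascript could read. The attribute names, file layout and output format are assumptions rather than the project’s actual mark-up.

```python
# Rough sketch of an indexing pass over static HTML documents: collect
# links that carry RDFa-style rel/resource attributes into a JSON index
# that a client-side 'dynamic site map' could use. Attribute names,
# file layout and output format are assumptions.
import glob
import json

from bs4 import BeautifulSoup

def build_index(pattern="site/*.html", out="index.json"):
    index = []
    for path in glob.glob(pattern):
        with open(path, encoding="utf-8") as f:
            soup = BeautifulSoup(f.read(), "html.parser")
        for a in soup.find_all("a", attrs={"rel": True}):
            index.append({
                "document": path,
                "relation": " ".join(a.get("rel", [])),   # e.g. 'partner location'
                "target": a.get("resource") or a.get("href"),
                "label": a.get_text(strip=True),
            })
    with open(out, "w", encoding="utf-8") as f:
        json.dump(index, f, indent=2)
    return index

if __name__ == "__main__":
    build_index()
```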

AD: Now there is a Portable SKOR website that everyone can host and use. What are the practical implications and, from a more ‘philosophical’ point of view, what are the consequences of such shared responsibility?

MM: One consequence is having to consider what it actually means to ‘share’: one must be prepared to receive something back and not just transmit something. As a programmer, I’ve experienced this when moving from working on my own project to one that invites other developers to participate. I’ve realised that I have to resist a ‘let’s throw everything away and start from scratch’ approach, and recognise the value of sometimes spending more time thinking about ‘standards’, or adapting my style of working to someone else’s. There are rewards to this sharing, but it often requires time and ‘social’ skills that aren’t always effortless.

AD: What do you think the future will bring in terms of online archives? Whereas HTML and HyperText were the norm in the 1990s, in the 2000s databases and CMS systems became the preferred modus operandi for dealing with online archives and information structures. How do you see this (hierarchical) shift in attention? What are the pros and cons?

MM: Frameworks and web services such as WordPress and YouTube have been instrumental in scaling the experiences of early net publishing up to a kind of critical mass. At the same time, many of the core advantages of HyperText (its decentralised nature, its loose and writerly structure) have been obscured. In many practical situations I now suggest using wiki software to replace CMS-based websites, and people are amazed by the ease with which they can create new pages simply by referring to them, and easily write text outside the fields of a form.
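The page-creation behaviour mentioned here can be reduced to a few lines: any bracketed title in a page’s text implicitly refers to another page, and pages that do not yet exist are simply created when first referred to. The double-bracket syntax and in-memory store are assumptions, loosely modelled on common wiki conventions.

```python
# Small sketch of the wiki behaviour described above: a [[Bracketed
# Title]] refers to another page, and missing pages are created the
# moment they are referred to. Syntax and in-memory store are assumptions.
import re

pages = {"Home": "See [[Active Archives]] and [[Portable SKOR]]."}

def save(title, text):
    pages[title] = text
    # Create empty stubs for every page the text refers to.
    for target in re.findall(r"\[\[(.+?)\]\]", text):
        pages.setdefault(target, "")

save("Home", pages["Home"])
print(sorted(pages))   # ['Active Archives', 'Home', 'Portable SKOR']
```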

In terms of databases, there is a lot of interest in the tech community in increasingly ‘lightweight’ solutions that can be intermixed and laterally combined more easily than is possible within a traditional framework. Graph databases and a variety of so-called ‘NoSQL’ projects are providing alternatives to the traditional rigid table structure of SQL. The internal IDs of SQL are being replaced by ‘public-facing’ URLs that make merging and cross-referencing different data sets possible.
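A toy illustration of the contrast being drawn here: records keyed by internal numeric IDs collide when two datasets are merged, while records keyed by public URLs can simply be combined. All data in the sketch is invented.

```python
# Toy illustration: numeric IDs from two databases collide on merge (one
# record silently overwrites the other), while URL-keyed records combine
# cleanly. All data here is invented.
dataset_a = {1: {"name": "Project X"}, 2: {"name": "Stedelijk Museum CS"}}
dataset_b = {1: {"name": "Department of Reading"}}   # id 1 is already taken

merged_by_id = {**dataset_a, **dataset_b}
print(len(merged_by_id))   # 2 -- 'Project X' has been overwritten

url_a = {"http://example.org/project-x": {"name": "Project X"},
         "http://example.org/stedelijk-cs": {"name": "Stedelijk Museum CS"}}
url_b = {"http://example.net/reading": {"name": "Department of Reading"}}

merged_by_url = {**url_a, **url_b}
print(len(merged_by_url))  # 3 -- nothing needs renumbering
```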

AD: One of your (other) interests is in tools for new forms of online reading and writing and interactive documentaries. Do you think open source publishing has a future, and if so what would be the (preferred) characteristics? What would be necessary to make it work?

MM: What’s most interesting to me about FLOSS [Free/Libre/Open-Source Software] is the way it necessitates active choice-making. It’s important to see software and other technical systems not as neutral or pre-determined, but rather as the result of many decisions, some technical, but usually social, political, economic and aesthetic in nature. Understanding that these decisions play a major part in determining what such a system then allows (what is ‘easy’ or ‘natural’ to do) is crucial to informing the choices an institution or an individual makes.

AD: Could you describe an example of these new forms of online reading/writing? How do you think this might influence offline behaviour? For example, would an online git repository also have offline effects or impact?

MM: Many people equate the digital with 24/7 global accessibility. Besides this being false (full global access has yet to be achieved), it’s questionable whether it’s even desirable in its current form. As digital librarians and companies like Google know all too well, data centres have enormous resource/energy costs, so allowing data to sometimes be offline and decentralised and portable can have a definite ecological impact. From a social perspective, one can imagine a near future where people rediscover ‘privacy’, and sharing again becomes an active aspect of one’s social life rather than something continuously extracted via ‘frictionless’ APIs [the application programming interface is a specification intended to be used as an interface by software components to communicate with each other]. A future where data is literally carried on the body and people literally are the network (as opposed to simply absorbing the Wifi radiation).

One project I occasionally work on (over the course of many years) is the Department of Reading, initiated by Sönke Hallman. In this work, Sönke organises readings where people gather, mostly physically, some online, to ‘read’ a text. Participants read and annotate a text on individual laptops over the course of an evening. Along with Tsila Hassine, we designed a system that links a text chat with an online wiki, so that, for instance, when a wiki text is quoted in the chat, the chat message is added as an annotation to the original. Later a book was made based on the contents of the wiki. For me this project is an example of interesting ‘hybrid’ workflows, where writing takes place in a variety of forms and time scales.
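A very reduced sketch of the linking behaviour described above: when a chat message quotes a passage from the wiki text, the message is attached to that passage as an annotation. The matching rule and data structures are assumptions, not the actual Department of Reading system.

```python
# Very reduced sketch of the chat/wiki link described above: when a chat
# message quotes a passage from the wiki text, attach the message to that
# passage as an annotation. Matching rule and structures are assumptions.
wiki_text = "Reading is a way of writing. Annotation makes the reading public."
annotations = []   # (quoted passage, chat message)

def on_chat_message(message, quote_marker='"'):
    # Treat anything between straight quotes as a candidate quotation.
    parts = message.split(quote_marker)
    for candidate in parts[1::2]:
        if candidate in wiki_text:
            annotations.append((candidate, message))

on_chat_message('I keep returning to "Annotation makes the reading public." tonight')
print(annotations)
```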

 

Michael Murtaugh is the founder of automatist.org, a new media design firm specialising in community databases, interactive documentary, and tools for new forms of reading and writing online. He also teaches in the Master Media Design and Communication programme at the Piet Zwart Institute. He is a member of Constant, a Brussels-based collective engaged in the fields of free and open source software, feminism, copyright alternatives, and collaborative networks. With Constant he is currently working on Active Archives, a platform for diverse material ranging from texts to images and video. Conceived as both a technical and a cultural project, the system facilitates re-use of material while enriching content through metadata, vocabularies and taxonomies. Murtaugh did his Master’s study (1994–1996) in the Interactive Cinema Group at MIT.


* * * * * * * * * * * *
