Notes from a talk about archiving by Simon Pockley. 17th July 2013
Part of: 108 547 Duplication, Digitisation and Archiving - Melb University
Primary proposition:

The practice of digital archiving is inextricably linked to continuity of access or 'liveness'. Liveness is aligned to principles that support appropriation, reuse, re-presentation, open access and proliferation. For the archivist, digital content is best approached as a continuous, mutating and evolving stream.

Machine understanding and the poetics of the network

In the electronic world, markup languages lend intelligence to digital objects in order that machines can understand what they are and what to do with them. If machines are to exchange information, then they need to be able to understand the meta-languages that are being used to describe the information. Markup languages are inherently meta-languages. A structured set of terms (elements and attributes) is called metadata. When groups of people agree to share metadata, they create standards. The Dublin Core metadata element set is a standard for cross-domain information resource description. Here an information resource is defined to be anything that has identity.
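As a rough illustration of what shared metadata looks like to a machine, here is a minimal sketch that writes a few Dublin Core elements as XML. The namespace is the published one for the Dublin Core element set; the `record` wrapper element, the function name and the choice of field values are assumptions made for this example only, not a prescribed record format.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set (version 1.1).
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build one wrapper element with a dc:* child for each (name, value) pair."""
    # The wrapper name "record" is an assumption; Dublin Core does not define it.
    record = ET.Element("record")
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


# Illustrative values only, drawn from the case study named in these notes.
fields = [
    ("title", "The Flight of Ducks"),
    ("creator", "Simon Pockley"),
    ("type", "Collection"),
]

record = dublin_core_record(fields)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because both sides agree on what `dc:title` and `dc:creator` mean, a harvesting program can read the record without knowing anything else about the collection it came from.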

Clay Shirky divides his time between consulting, teaching, and writing on the social and economic effects of Internet technologies. His consulting practice is focused on the rise of decentralized technologies such as peer-to-peer, web services, and wireless networks that provide alternatives to the wired client/server infrastructure that characterizes the Web.

Today I want to talk about categorization, and I want to convince you that a lot of what we think we know about categorization is wrong. In particular, I want to convince you that many of the ways we're attempting to apply categorization to the electronic world are actually a bad fit, because we've adopted habits of mind that are left over from earlier strategies. What I think is coming instead are much more organic ways of organizing information than our current categorization schemes allow, based on two units - the link, which can point to anything, and the tag, which is a way of attaching labels to links. The strategy of tagging - free-form labeling, without regard to categorical constraints - seems like a recipe for disaster, but as the Web has shown us, you can extract a surprising amount of value from big messy data sets.
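The two units described here, the link and the tag, can be sketched as a very small data structure. This is only an illustration under stated assumptions: the class name `TagIndex`, its methods and the example.org addresses are all hypothetical, and lower-casing the labels is a choice made for this sketch rather than part of the argument above.

```python
from collections import Counter, defaultdict


class TagIndex:
    """Free-form labels attached to links, with no fixed category scheme."""

    def __init__(self):
        self._links_by_tag = defaultdict(set)
        self._tags_by_link = defaultdict(set)

    def tag(self, link, *labels):
        """Attach any number of labels to a link."""
        for label in labels:
            label = label.strip().lower()  # normalisation is an assumption of this sketch
            self._links_by_tag[label].add(link)
            self._tags_by_link[link].add(label)

    def links_for(self, label):
        """Return every link carrying the label, in sorted order."""
        return sorted(self._links_by_tag.get(label.strip().lower(), set()))

    def related_tags(self, label):
        """Return (tag, count) pairs for tags that co-occur with the label."""
        label = label.strip().lower()
        counts = Counter()
        for link in self._links_by_tag.get(label, set()):
            counts.update(self._tags_by_link[link] - {label})
        return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


index = TagIndex()
# Hypothetical links and labels, for illustration only.
index.tag("https://example.org/ducks", "archive", "history", "Australia")
index.tag("https://example.org/memex", "history", "hypertext")
index.tag("https://example.org/xanadu", "hypertext", "archive")

print(index.links_for("history"))
print(index.related_tags("history"))
```

Nothing here declares in advance which labels are allowed or how they relate; any grouping comes from counting which tags turn up together on the same links.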

A case study - The Flight of Ducks:

part history, part novel, part data-base, part postcard, part diary, part museum, part poem, part conversation, part shed.


How Flight of Ducks came into being

The Flight of Ducks began in 1990 when my father died. I extracted from his belongings a collection of artefacts, a pile of photographs, and journals relating to a camel expedition into Central Australia in 1933. I'd grown up with stories from this trip and I felt a duty to protect the integrity of this collection......herein lies the significance of the title, the Flight of Ducks. It refers to a song at the heart of the expedition journal, to imaginative flight, and to the shape and form that the project began to assume after I found that I had lost all my typing to a corrupted hard drive and began to use the World Wide Web as a space to hold the story....


The social dimensions of information

I hadn't anticipated that other people would find the material and talk back to it. My central story and research rapidly became encrusted with other people's stories and observations. This inclusive aspect makes Flight of Ducks one of the first blogs. It is easy to forget that in late 1994, any understanding of a web-enabled poetic was drawn more from the vision of such prescient thinkers as Vannevar Bush and Ted Nelson than from actual experience.

Vannevar Bush was an American engineer and science administrator. In 1945 he wrote an article for the Atlantic Monthly called "As We May Think", in which he showed remarkable prescience by imagining the memex - a networked machine that anticipated the concept of the World Wide Web.

Consider a future device for individual use, which is a sort of mechanized private file and library. It needs a name, and, to coin one at random, "memex" will do. A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.

It consists of a desk, and while it can presumably be operated from a distance, it is primarily the piece of furniture at which he works. On the top are slanting translucent screens, on which material can be projected for convenient reading. There is a keyboard, and sets of buttons and levers. Otherwise it looks like an ordinary desk...

Theodor Holm Nelson has been called both a genius and a madman. He is also a designer and generalist, has been a software designer and theorist since 1960 and a software consultant since 1967. His principal design work includes Project Xanadu and xanalogical systems, the transcopyright system, and the theory of virtuality design. Nelson has written several books, the most recent being The Future of Information (1997), as well as numerous articles, lectures, and presentations. He is best known for discovering the hypertext concept and for coining various words which have become popular, such as "hypertext," "hypermedia," "cybercrud," "softcopy," "electronic visualization," "dildonics," "technoid," "docuverse," and "transclusion."

...So now I want to tell you about another identic relationship. I am calling it transclusion. Think of it as hypersharing if you like. What it is, is this. There is only one copy, one master copy of anything. Let's call it a cosmic original. Every other copy you see is a manifestation of this cosmic original. I use these terms because I don't believe they are currently in use. So when you see the Lord Shiva over the road, is it a copy of Lord Shiva? Of course not, it is the real guy. And so it should be with all text. We should never have to type anything twice. So this repurposes the entire computer system into a box which maintains the connections between all of the transitory and cached pieces whose identity is maintained with its original.
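The idea of a single original with many manifestations can be sketched very loosely in code. This is not how Project Xanadu works; it is a minimal illustration, assuming a plain dictionary of originals and a hypothetical `Ref` marker that points at an original instead of copying its text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """A pointer to an original by key, standing in for a copy of its text."""
    key: str


def render(originals, parts):
    """Assemble a document from literal strings and references to originals."""
    return "".join(
        originals[part.key] if isinstance(part, Ref) else part
        for part in parts
    )


# One stored original; the key and wording are illustrative only.
originals = {"phrase": "a flight of ducks"}

# Two documents that include the same original without retyping it.
essay = ["The title refers to ", Ref("phrase"), "."]
caption = ["Caption: ", Ref("phrase")]

before = render(originals, essay)

# Changing the original changes every document that includes it.
originals["phrase"] = "the flight of ducks"
after_essay = render(originals, essay)
after_caption = render(originals, caption)

print(before)
print(after_essay)
print(after_caption)
```

The text is typed once; each document holds only a connection to it, so the manifestations cannot drift apart from their source.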

See also Nelson, T. H., Embedded Markup Considered Harmful, October 2, 1997.

Practical things:

File naming conventions:

Rights Management:

When Marshall McLuhan articulated his prescient insight that 'new media makes old media content' he did more than anticipate the ease with which old media (e.g. film) would be transferred to new (at that time, video). He provided an insight into the way in which resources could be absorbed by a technology and re-purposed beyond the scope of the licenses and agreements that governed their use.

Usability and Jakob Nielsen

Jakob Nielsen is regarded as a leading authority on web usability. Engineering-oriented and emphatically not a graphic designer, he is noted for his harsh criticisms of popular websites, contending that many concentrate too heavily on design features which (like Edward Tufte) he views as unnecessary and gimmicky. He regards animation, Flash and graphics as window dressing at the expense of usability, particularly for disabled visitors. Nielsen has written extensively on the subject of web design.


Tim Berners-Lee is regarded as one of the founders of the World Wide Web. He proposed the system in 1989 and wrote the first browser/editor and server software in 1990 while working at the European Organization for Nuclear Research (CERN).