
Artifacts and Infrastructure

Dragan Espenschied, 29 Jun 2021

For the preservation of artworks, Rhizome handles four types of digital artifacts: static files, Linux containers, layered disk images, and web archives. Each type requires infrastructure for management and reperformance. Rhizome uses only free/libre open source software for preservation purposes, and engages with the communities maintaining that software where this is effective.[1]

Static files

Static files are non-executables that have to be served over the network as-is, via the HTTP protocol family. Rhizome uses standard web servers for this.

Open source framework: Standard web servers (Apache, nginx, …)
Fungibility: Very high (there is hardly any performance difference between web servers that would be relevant to the artistic integrity of an artwork; even serving from object storage is an option)
Stewards: Multiple open source actors, commercial and non-profit[2]
Longevity: Very solid, industry-standard tools used on millions of servers
Risks: Almost none
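
To illustrate what serving files "as-is" means, the following minimal sketch uses Python's built-in `http.server` module as a stand-in for a production web server such as Apache or nginx. It is not Rhizome's setup; the function names `serve_directory` and `fetch` are hypothetical and chosen only for this example.

```python
import functools
import http.client
import http.server
import threading


def serve_directory(directory: str, port: int = 0) -> http.server.ThreadingHTTPServer:
    """Start a background HTTP server that serves files from `directory` unchanged.

    Port 0 asks the operating system for any free port.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(server: http.server.ThreadingHTTPServer, path: str) -> bytes:
    """Request `path` from the server and return the response body."""
    host, port = server.server_address[:2]
    connection = http.client.HTTPConnection(host, port, timeout=5)
    try:
        connection.request("GET", "/" + path.lstrip("/"))
        return connection.getresponse().read()
    finally:
        connection.close()
```

Because the server does not execute or transform anything, the bytes returned by `fetch` are identical to the bytes stored on disk. That property is what makes one web server largely interchangeable with another for this artifact type.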

Web archives

Web archives are logs of interactions between a web client (typically a browser) and an arbitrary number of web servers, stored in WARC format. Such a log can be created via a single session of a user manually browsing websites, via actions enacted by some sort of automation, via a “web crawl” in which a robot follows links according to a rule-set, or via a combination of these. From these logs, an access system can restage interactive websites. Rhizome uses the set of tools produced by the Webrecorder project to create web archives and make them accessible, and to offer the free web archiving service Conifer.

Open source framework: Webrecorder tools
Fungibility: Medium (In principle, the creation of web archives and making them accessible is a well-understood process with many tools available. In practice, the Webrecorder toolkit offers advanced functions and tweaks for technically complex websites that would be very difficult to reproduce.)
Stewards: Webrecorder LLC
Longevity: Webrecorder tools are used by leading web archiving institutions; modular design of the toolset allows for targeted development and flexible implementation.
Risks: Developer capacity (a small number of developers relative to the project’s usage)
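
The idea of a web archive as a replayable log can be sketched in a few lines. The example below is a simplified illustration using only the Python standard library, not the Webrecorder implementation: it writes uncompressed WARC/1.1 `response` records, reads them back, and builds a lookup table from URL to captured response. The function names are hypothetical, and real tools also handle request records, per-record compression, indexing, and URL rewriting, all of which are left out here.

```python
import io
import uuid
from datetime import datetime, timezone


def make_response_record(target_uri: str, http_response: bytes) -> bytes:
    """Serialize one captured HTTP response as a WARC/1.1 'response' record."""
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.1",
        "WARC-Type: response",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {timestamp}",
        f"WARC-Target-URI: {target_uri}",
        "Content-Type: application/http; msgtype=response",
        f"Content-Length: {len(http_response)}",
    ]
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    # A record is: header lines, blank line, content block, two CRLFs.
    return head + http_response + b"\r\n\r\n"


def read_records(stream):
    """Yield (headers, content) for each record in an uncompressed WARC stream."""
    while True:
        version_line = stream.readline()
        if not version_line:
            return  # end of stream
        if not version_line.strip():
            continue  # blank lines separating records
        headers = {}
        while True:
            line = stream.readline().rstrip(b"\r\n")
            if not line:
                break  # blank line ends the header block
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip()] = value.strip()
        content = stream.read(int(headers["Content-Length"]))
        yield headers, content


def build_replay_index(warc_bytes: bytes) -> dict:
    """Map each captured URL to its recorded HTTP response (last capture wins)."""
    return {
        headers["WARC-Target-URI"]: content
        for headers, content in read_records(io.BytesIO(warc_bytes))
        if headers.get("WARC-Type") == "response"
    }
```

An access system built on such an index answers a browser's request for a URL with the recorded response instead of contacting the original server, which is the core of restaging a website from its log.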

Linux containers

Linux containers are built on core features of the Linux kernel that allow multiple isolated operating system environments to run under a single, shared kernel. A container can be optimized to run a single application and contain only the dependencies that application requires (as is common with Docker), or it can contain a full Linux distribution such as Ubuntu, CentOS, or Arch. Multiple frameworks are available to manage Linux containers, each with different design goals. Rhizome uses LXC/LXD to preserve legacy servers.

Open source framework: LXC/LXD
Fungibility: Medium (in principle, Linux containers can be run without any particular framework, or be moved into emulation)
Stewards: Canonical
Longevity: LXC/LXD is a core service offering from Canonical, the makers of Ubuntu Linux, with many full-time developers working on it, and a vibrant community.
Risks: Low (strategy change at Canonical rather unlikely at this point); containers could be moved to another framework, or into emulation
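
The kernel mechanism behind container isolation can be observed directly. Container frameworks rely on kernel namespaces (together with control groups) to give each container its own view of processes, network, mounts, and so on. As a small illustration that is independent of LXC/LXD, the sketch below lists the namespaces of a process by reading `/proc`; the function name `list_namespaces` is hypothetical, and on systems without a Linux-style `/proc` it simply returns an empty result.

```python
import os


def list_namespaces(pid: str = "self") -> dict:
    """Map namespace type (such as 'pid', 'net', 'mnt') to its kernel identifier."""
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    if not os.path.isdir(ns_dir):
        return namespaces  # not Linux, or /proc is not mounted
    for name in sorted(os.listdir(ns_dir)):
        try:
            # Each entry is a symlink whose target names the namespace instance.
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry not readable with the current privileges
    return namespaces


if __name__ == "__main__":
    for ns_type, identifier in list_namespaces().items():
        print(f"{ns_type:20} {identifier}")
```

Processes inside the same container report the same identifiers, while processes in different containers report different ones for every namespace the framework has separated, even though all of them run on one kernel.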

Layered disk images

Layered disk images are bitstream copies of digital media, such as hard disks, SSDs, floppy disks, or CD-ROMs, in which changes to the media’s contents are tracked. For instance, a disk image might represent a snapshot of a full operating system where a file resides on the desktop. A preservationist might mount this disk image, delete the file on the desktop, and take a new snapshot. This would generate a new disk image “layer” which stores data representing the difference between the original disk image and its altered state. Ideally, the preservationist would annotate the layer, describing that the file on the desktop was deleted, and why. Creating disk image layers rather than creating a full version of the disk image every time a change is made creates a chain of proof of the modifications applied to an artifact, and easily allows returning to earlier versions.
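
The layering described above can be modeled with a toy data structure. The sketch below is a conceptual illustration only, not the on-disk format used by any emulator or by Emulation as a Service: each layer stores just the blocks that differ from its parent, plus an annotation, and reads fall through the chain until a layer holding the block is found. The names `Layer`, `base_from_bytes`, and `BLOCK_SIZE` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

BLOCK_SIZE = 4  # tiny blocks keep the example readable; real formats use far larger ones


@dataclass
class Layer:
    """One snapshot: the blocks that differ from the parent, plus a note."""

    annotation: str
    parent: Optional["Layer"] = None
    blocks: Dict[int, bytes] = field(default_factory=dict)

    def read_block(self, index: int) -> bytes:
        """Return the block from the nearest layer in the chain that stores it."""
        layer: Optional[Layer] = self
        while layer is not None:
            if index in layer.blocks:
                return layer.blocks[index]
            layer = layer.parent
        return bytes(BLOCK_SIZE)  # blocks never written read as zeros

    def write_block(self, index: int, data: bytes) -> None:
        """Record a change in this layer only; parent layers stay untouched."""
        if len(data) != BLOCK_SIZE:
            raise ValueError(f"block must be exactly {BLOCK_SIZE} bytes")
        self.blocks[index] = data

    def history(self) -> List[str]:
        """Annotations from the base image up to this layer."""
        notes = []
        layer: Optional[Layer] = self
        while layer is not None:
            notes.append(layer.annotation)
            layer = layer.parent
        return list(reversed(notes))


def base_from_bytes(image: bytes, annotation: str) -> Layer:
    """Build the bottom layer from a full bitstream copy, zero-padded to whole blocks."""
    padded = image + bytes(-len(image) % BLOCK_SIZE)
    blocks = {
        offset // BLOCK_SIZE: padded[offset:offset + BLOCK_SIZE]
        for offset in range(0, len(padded), BLOCK_SIZE)
    }
    return Layer(annotation=annotation, blocks=blocks)


if __name__ == "__main__":
    base = base_from_bytes(b"OS..FILE", "Imaged system as received")
    cleaned = Layer("Deleted file on desktop", parent=base)
    cleaned.write_block(1, bytes(BLOCK_SIZE))  # overwrite the block holding the file
    print(cleaned.history())
```

In this model, returning to an earlier version means reading from an earlier layer, and the list of annotations serves as the record of which modifications were applied and in what order.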

Rhizome uses four types of disk images:

  1. Base images contain fresh, unmodified installs of stock operating systems, for instance Windows 98 or Mac OS 7.5. They serve as the foundation to create synthetic images.
  2. Synthetic images are built on top of base images and are constructed to meet the needs of a class of artworks. For instance, a Windows 98 system is enriched with browser software and a version of the Flash plugin to serve as an environment fitting a certain period of net art works that use Flash.
  3. Imaged media are bitstream copies of CD-ROMs, floppy disks, or similar media, that contain artworks, games, utilities, or any kind of software.
  4. Imaged systems are bitstream copies of media from computer systems as they have been used. These could for instance be old servers, artist workstations, or computers prepared for exhibitions.

Rhizome uses Emulation as a Service (EaaS) to manage layered disk images and to connect them with emulators.

Open source framework: Emulation as a Service
Fungibility: Medium (In principle, EaaS orchestrates open source emulators such as qemu, Basilisk, etc., and stores all metadata required for usage independent of the framework. In practice, EaaS offers lots of advanced management and access functionality that would be very difficult to reproduce.)
Stewards: University of Freiburg, Open SLX
Longevity: Computer science research project with international partnerships in digital preservation and reproducible science; service company with development being undertaken in the library sector
Risks: Developer capacity (a small number of developers relative to the project’s usage)

  1. Engaging with projects that already enjoy wide IT industry support or are embedded into large foundations is not as impactful as supporting less well-resourced, emerging, or specialized open source projects.

  2. Apache is supported by the Apache Software Foundation, nginx by F5, Inc.