MoMA on GitHub

The Museum of Modern Art has followed in the footsteps of Tate and Cooper Hewitt and published its collection data on GitHub.


As I’m currently in the final phase of my PhD, I have to dedicate more time to writing and less to doing. Even so, I can’t let MoMA’s datasets go by unnoticed.

The above screenshot is from a timeline tool I developed for visually analysing large cultural collections. I imported the MoMA dataset and visualised the object records along their production dates. We can see the timeframe the collection spans, with the earliest pieces from the late 1700s and – obviously – a focus on twentieth-century and contemporary items.


The block shape around 1820 and the rectangular spike at 1900 represent large numbers of items that have the same, or very similar, production dates. Such anomalies can stand for series of items in the collection, for traces of curatorial decisions in cataloguing, or for mistakes in dating.
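Finding such spikes can be sketched in a few lines. This is a minimal, hypothetical version of the idea, assuming the MoMA CSV (`Artworks.csv`) with a free-text `Date` column – the field name is taken from MoMA’s published files, but treat the details here as assumptions rather than the tool’s actual implementation:

```python
import csv
import re


def production_years(path):
    """Count records per four-digit production year, extracted
    from a free-text date field such as 'c. 1900' or '1898-1902'."""
    years = {}
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            m = re.search(r"\b(1[5-9]\d\d|20\d\d)\b", row.get("Date") or "")
            if m:
                y = int(m.group(1))
                years[y] = years.get(y, 0) + 1
    return years


def spikes(years, window=5, factor=4):
    """Flag years whose record count dwarfs the average of their
    neighbouring years -- candidates for the anomalies described above."""
    flagged = []
    for y, n in years.items():
        neighbours = [years.get(y + d, 0)
                      for d in range(-window, window + 1) if d != 0]
        baseline = max(1.0, sum(neighbours) / len(neighbours))
        if n > factor * baseline:
            flagged.append((y, n))
    return sorted(flagged)
```

On a toy histogram like `{1898: 3, 1899: 3, 1900: 40, 1901: 2, 1902: 2}`, only 1900 stands out against its neighbourhood; the `window` and `factor` thresholds are arbitrary knobs, not values from the tool.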

I inspected a few records in the 1900 spike and encountered several photographs, which gave me the idea that the spike could represent a larger series of photographs – this would explain the high production output in such a short timeframe. The tool allows me to colour records according to a field value, so I gave it a try and coloured all photographs green:
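The colouring step amounts to a simple rule over one field. As a sketch – assuming the `Classification` field from MoMA’s CSV, with hex colours chosen here purely for illustration, not the tool’s actual palette:

```python
def colour_for(record, field="Classification", highlight="Photograph"):
    """Colour rule for the visual check: the highlighted
    classification in green, everything else in neutral grey."""
    return "#2e8b57" if (record.get(field) or "") == highlight else "#cccccc"


def share_of(records, field="Classification", value="Photograph"):
    """Fraction of records carrying the given field value --
    a quick numeric check on what the colouring shows visually."""
    if not records:
        return 0.0
    hits = sum(1 for r in records if (r.get(field) or "") == value)
    return hits / len(records)
```

Running `share_of` over only the records inside the spike would quantify the hunch: if the value is close to 1.0, the spike really is dominated by photographs.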


Beautiful Data at Harvard University

Beautiful Data, a summer institute for telling stories with open art collections, brings together museum professionals, scholars and technologists to work on new ways of making use of the growing amount of digital collection data that is becoming accessible.


Image: The workshop will take place at Arts @ 29 Garden, the creative project space of Harvard University.

During the next two weeks, I will take part in this summer institute, organised by Harvard’s metaLAB and funded by The Getty Foundation. I have been invited as one of 22 professionals and academics from the fields of museums, archives and digital humanities to work on concepts and practical solutions for new forms of art-historical storytelling using open digital collections, and to critically discuss the ethical, curatorial and intellectual challenges of digital media in a cultural context.

I’m not sure how much of the programme I’m allowed to give away here, but I’m impressed by the number of high-profile speakers and participants the organisers have managed to gather for what can only be an exciting, stimulating and challenging workshop. Watch this space for updates and outcomes.

Tree of Time: ChronoZoom

I have sacrificed parts of my Christmas break to develop something for Visualizing.org’s Visualizing Time challenge: a hierarchical time tree visualisation that offers new insights into the dataset that powers the original ChronoZoom interface.


In 2009, Walter Alvarez was looking for a way to communicate the enormous time frames that make up the history of the universe and conceived the idea of ChronoZoom, which was released as an early prototype in 2010. Since then, the visualisation has evolved quite a bit and by now contains a rich collection of data. What I only found out recently is that all of this data is also accessible separately from the visualisation, through a dedicated API.

Recently, the makers of ChronoZoom launched a contest in collaboration with the platform Visualizing.org in order to, I’d guess, promote the use of this API and at the same time tackle some of the problems they have encountered.

Behind the Scenes: ChronoZoom

In this post I will try to reproduce the steps that led to my visualisation of ChronoZoom timelines. I tried to save the important milestones as individual files, and you can find them at the beginning of this post. It is fairly technical and in a way written more as a record for myself than for a general audience. So bear with me should you decide to read it, and feel free to ask questions in the comment section.



v0.1, v0.2, v0.3, v0.4, v0.5, v0.6, v0.7, v0.8, v0.9, v0.10, v0.11, v0.12, v0.13, v0.14, v0.15, v0.16, v0.17, v0.18, v0.19, v0.20, v0.21, v0.22, v0.23, v0.24, v0.25, v0.26, v0.27, v0.28, v0.29, v0.30, v0.31, v0.32, v1.0b, Final


Visualizing.org has partnered with the people behind ChronoZoom. ChronoZoom is both a dataset containing curated timelines of the history of the cosmos and the world, and a visual interface for exploring those timelines. Much in the tradition of some of the earliest timelines, which aimed to map all of time – from the Creation to the Last Judgement – ChronoZoom covers events from the Big Bang up to the present. If you somehow haven’t come across it yet, you should give it a try here.
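The hierarchical part is the key to the dataset: timelines nest inside timelines. A minimal sketch of walking such a structure – note that the field names (`title`, `start`, `end`, `timelines`) are a hypothetical schema for illustration, not the actual shape of the ChronoZoom API response:

```python
def walk(timeline, depth=0):
    """Depth-first traversal of nested timelines, yielding one
    (depth, title, start, end) tuple per timeline encountered."""
    yield depth, timeline["title"], timeline["start"], timeline["end"]
    for child in timeline.get("timelines", []):
        yield from walk(child, depth + 1)


# A toy two-level tree in the assumed schema (years relative to now):
cosmos = {
    "title": "Cosmos", "start": -13.8e9, "end": 0,
    "timelines": [
        {"title": "Earth", "start": -4.5e9, "end": 0, "timelines": []},
    ],
}
```

Flattening the tree this way, with an explicit depth per row, is exactly what a hierarchical layout needs as input: each timeline becomes one bar, indented by its depth and spanning its start/end interval.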

Challenges for Time as Digital Data

I have recently been invited to present my research at the Herrenhausen Conference on Digital Humanities. The Volkswagen Foundation, which organised the event, offered travel grants for young researchers to present their research topic in a short talk and a poster. Instead of presenting my research as a whole (something we PhD students have to do over and over again), I chose to talk about only one aspect of it: the problem of representing time digitally.


Read on for the paper on which my talk was based. I presented it, along with this poster, at the Herrenhausen Conference “(Digital) Humanities Revisited – Challenges and Opportunities in the Digital Age” at Herrenhausen Palace, Hanover, Germany, December 5–7, 2013.

In digital humanities, there is usually a gap between the collected data and the digitally stored data. The gap is due to the (structural) changes that data needs to undergo in order to be stored within a digital structure. This gap may be small in the case of natively digital data such as a message on Twitter: a tweet can be stored close to its ‘original’ format, but it still loses a lot of its frame of reference (the potential audience at a specific point in time, the actual audience, the potential triggers of the message, etc.). In the digital humanities this distance may become so large that some researchers argue the term data should be largely abandoned and replaced with capta. Capta, the taken, in contrast to data, the given, emphasises the interpretative and observer-dependent nature of data in the humanities [1]. This problem relates to all kinds of data, whether categorical, quantitative, spatial or temporal; I will, however, focus only on the last type. Time and temporal expressions are particularly susceptible to multiple modes of interpretation and (unintended) modifications, due to limitations in digital data structures, but also due to the ambiguous and subjective nature of time itself.
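A concrete instance of this gap: catalogue dates like “c. 1900” or “1890s” carry deliberate vagueness that a single timestamp field destroys. One way to preserve it is to store an interval instead of a point. The sketch below is an illustration of that idea only – the patterns and the ±5-year reading of “circa” are my own assumptions, not an established convention (standards such as EDTF address this far more thoroughly):

```python
import re


def to_interval(expr):
    """Map a free-text date expression to a (start, end) year interval,
    keeping the uncertainty instead of forcing a single point in time."""
    expr = expr.strip().lower()
    m = re.fullmatch(r"(\d{4})", expr)
    if m:                                    # "1900" -> exact year
        y = int(m.group(1))
        return (y, y)
    m = re.fullmatch(r"c(?:a|irca)?\.?\s*(\d{4})", expr)
    if m:                                    # "c. 1900" -> assume +/- 5 years
        y = int(m.group(1))
        return (y - 5, y + 5)
    m = re.fullmatch(r"(\d{3})0s", expr)
    if m:                                    # "1890s" -> the whole decade
        y = int(m.group(1)) * 10
        return (y, y + 9)
    m = re.fullmatch(r"(\d{4})\s*[-–]\s*(\d{4})", expr)
    if m:                                    # "1898-1902" -> explicit range
        return (int(m.group(1)), int(m.group(2)))
    return None                              # unparsed: keep the gap visible
```

Returning `None` for anything unrecognised, rather than a guessed year, keeps the interpretative step explicit: the record stays undated until a human decides how to read it.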


[1] Johanna Drucker, “Humanities Approaches to Graphical Display”, Digital Humanities Quarterly 5(1), 2011