Visualising Cultural Data


I started this blog in October 2012 to document my PhD on timeline visualisations for cultural data. The results of my research – captured in my thesis – are now available online:

Kräutli, Florian, 2016. Visualising Cultural Data: Exploring Digital Collections Through Timeline Visualisations. PhD Thesis. London: Royal College of Art. Available at: http://researchonline.rca.ac.uk/1774/


Time to Submit

There is something about seeing the results of one’s work materialised in print. Next stop on my PhD journey: delivering this pile of pages for examination. See below for the thesis abstract.



This thesis explores the ability of data visualisation to enable knowledge discovery in digital collections. Its emphasis lies on time-based visualisations, such as timelines.

Although timelines are among the earliest examples of graphical renderings of data, they are often used merely as devices for linear storytelling and not as tools for visual analysis. Investigating this type of visualisation reveals the particular challenges of digital timelines for scholarly research. In addition, the intersection between the key issues of time-wise visualisation and digital collections acts as a focal point. Departing from authored temporal descriptions in collections data, the research examines how curatorial decisions influence collections data and how these decisions may be made manifest in timeline visualisations.

The thesis contributes a new understanding of the knowledge embedded in digital collections and provides practical and conceptual means for making this knowledge accessible and usable.

The case is made that digital collections are not simply representations of physical archives. Digital collections record not only what is known about the content of an archive. Collections data contains traces of institutional decisions and curatorial biases, as well as data related to administrative procedures. Such ‘hidden data’ – information that has not been explicitly recorded, but is nevertheless present in the dataset – is crucial for drawing informed conclusions from digitised cultural collections and can be exposed through appropriately designed visualisation tools.

The research takes a practice-led and collaborative approach, working closely with cultural institutions and their curators. Functional prototypes address issues of visualising large cultural datasets and the representation of uncertain and multiple temporal descriptions that are typically found in digital collections.

The prototypes act as means towards an improved understanding of and a critical engagement with the time-wise visualisation of collections data. Two example implementations put the design principles that have emerged into practice and demonstrate how such tools may assist in knowledge discovery in cultural collections.

Calls for new visualisation tools that are suitable for the purposes of humanities research are widespread in the scholarly community. However, the present thesis shows that gaining new insights into digital collections does not only require technological advancement, but also an epistemological shift in working with digital collections. This shift is expressed in the kind of questions that curators have started seeking to answer through visualisation. Digitisation requires and affords new ways of interrogating collections that depart from putting the collected artefact and its creator at the centre of humanistic enquiry. Instead, digital collections need to be seen as artefacts themselves. Recognising this leads curators to address self-reflective research questions that seek to study the history of an institution and the influence that individuals have had on the holdings of a collection; questions that so far escaped their areas of research.

MoMA on GitHub

The Museum of Modern Art has followed in the footsteps of Tate and Cooper Hewitt and published its collections data on GitHub.


As I’m currently in the final phase of my PhD, I have to dedicate more time to writing and less to doing. Even so, I can’t let MoMA’s datasets go by unnoticed.

The above screenshot is from a timeline tool I developed for visually analysing large cultural collections. I imported the MoMA dataset and visualised the object records along their production dates. We can see the timeframe the collection spans, with the earliest pieces from the late 1700s and – obviously – a focus on twentieth-century and contemporary items.


The block shape around 1820 and the rectangular spike at 1900 represent large numbers of items that have the same, or very similar, production dates. Such anomalies can stand for series of items in the collection, for traces of curatorial decisions in cataloguing, or for mistakes in dating.
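As a rough illustration of how spikes like these could be found without looking at the timeline, the sketch below counts records per production year and flags years that hold far more items than is typical. It is not the code behind the timeline tool. The record layout (`year` and `classification` keys), the function names and the threshold `factor` are assumptions made for this example, and the data is a toy sample, not the MoMA dataset.

```python
from collections import Counter
from statistics import median


def find_spikes(records, factor=5):
    """Return {year: count} for years whose record count exceeds
    `factor` times the median yearly count (an arbitrary threshold)."""
    counts = Counter(r["year"] for r in records if r.get("year") is not None)
    if not counts:
        return {}
    typical = median(counts.values())
    return {year: n for year, n in sorted(counts.items()) if n > factor * typical}


def breakdown(records, year, field="classification"):
    """Count the values of one field among the records dated to `year`."""
    return Counter(r.get(field, "unknown") for r in records if r.get("year") == year)


# Toy records (hypothetical): many items share the year 1900,
# while the other years hold a single item each.
records = (
    [{"year": 1900, "classification": "Photograph"}] * 40
    + [{"year": 1900, "classification": "Print"}] * 2
    + [{"year": y, "classification": "Painting"} for y in range(1950, 1960)]
)

spikes = find_spikes(records)
for year, n in spikes.items():
    print(year, n, breakdown(records, year).most_common(3))
```

Breaking a flagged year down by a field such as the classification gives a first hint of whether the spike is a coherent series of items or a cataloguing artefact. The visual inspection in the timeline remains the better way to judge what it means.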

I inspected a few records in the 1900 spike and encountered a few photographs, which gave me the idea that the spike could represent a larger series of photographs – this would explain the high production output in a short timeframe. The tool allows me to colour records according to a field value, so I gave it a try and coloured all photographs in green:


Backstory: 13 years of HIV/AIDS on Wikipedia

For the 25th Day With(out) Art, an interactive timeline I created will occupy the website of the ICA Philadelphia on World AIDS Day, 1 December 2014. Visitors to the site will be able to explore the history of HIV and AIDS as captured by more than 8,000 revisions of the HIV/AIDS Wikipedia article.


The visualisation is based on BackStory, a tool I designed during the Beautiful Data workshop at Harvard metaLab. For Day With(out) Art, a campaign organised by Visual AIDS, I have expanded it into an interactive timeline, which lets users explore the revision history of the HIV/AIDS Wikipedia article.

In the words of Becky Huff Hunter from the ICA Philadelphia:

BackStory: 13 years of HIV/AIDS on Wikipedia is an online visualization tool which allows viewers to explore a subjective, contested, and constantly expanding history of HIV/AIDS, through a chronology of revisions to Wikipedia articles on this topic.

When you read the article about HIV/AIDS on Wikipedia, what you see is just the latest version of a document that has undergone 13 years of collaborative writing and editing. This revision history, which is exposed through this visualisation, reflects the changing views and discourses around HIV/AIDS.

Over 8,000 versions of the HIV/AIDS Wikipedia article form the basis for this visualisation. They have been curated around three chosen keywords: Condoms, Viral Load and Safe Sex.

Visit www.icaphila.org on December 1st to see the project live.

See also the announcement on the ICA’s website.

BackStory is an online visualisation tool for exploring the history of Wikipedia articles. It lets you access and navigate through their past revisions.

The Search Is Over! – Day 1

This is a long overdue blog post about a workshop I co-organised on the topic of exploring Cultural Collections with visualisation: The Search is Over! It was conceived by Marian Dörk, Mitchell Whitelaw and Stephen Drucker, and took place during DL2014, 11-12 September 2014 at the City University in London.


The line-up of speakers promised two exciting days, and it was matched by an engaged and enterprising group of participants. In this post I summarise the two keynotes of the first day, given by Lizzy Jongma (Rijksmuseum) and Aaron Straup Cope (Cooper Hewitt).
