This thesis explores the ability of data visualisation to enable knowledge discovery in digital collections. Its emphasis lies on time-based visualisations, such as timelines.
Although timelines are among the earliest examples of graphical renderings of data, they are often used merely as devices for linear storytelling and not as tools for visual analysis. Investigating this type of visualisation reveals the particular challenges of digital timelines for scholarly research. In addition, the intersection between the key issues of time-wise visualisation and digital collections acts as a focal point. Departing from authored temporal descriptions in collections data, the research examines how curatorial decisions influence collections data and how these decisions may be made manifest in timeline visualisations.
The thesis contributes a new understanding of the knowledge embedded in digital collections and provides practical and conceptual means for making this knowledge accessible and usable.
The case is made that digital collections are not simply representations of physical archives. They record not only what is known about the content of an archive: collections data also contains traces of institutional decisions and curatorial biases, as well as data related to administrative procedures. Such ‘hidden data’ – information that has not been explicitly recorded, but is nevertheless present in the dataset – is crucial for drawing informed conclusions from digitised cultural collections and can be exposed through appropriately designed visualisation tools.
The research takes a practice-led and collaborative approach, working closely with cultural institutions and their curators. Functional prototypes address issues of visualising large cultural datasets and the representation of uncertain and multiple temporal descriptions that are typically found in digital collections.
The prototypes act as means towards an improved understanding of and a critical engagement with the time-wise visualisation of collections data. Two example implementations put the design principles that have emerged into practice and demonstrate how such tools may assist in knowledge discovery in cultural collections.
Calls for new visualisation tools that are suitable for the purposes of humanities research are widespread in the scholarly community. However, the present thesis shows that gaining new insights into digital collections requires not only technological advancement, but also an epistemological shift in working with digital collections. This shift is expressed in the kind of questions that curators have started seeking to answer through visualisation. Digitisation requires and affords new ways of interrogating collections that depart from putting the collected artefact and its creator at the centre of humanistic enquiry. Instead, digital collections need to be seen as artefacts themselves. Recognising this leads curators to address self-reflective research questions that seek to study the history of an institution and the influence that individuals have had on the holdings of a collection; questions that have so far escaped their areas of research.
As I’m currently in the final phase of my PhD, I have to dedicate more time to writing and less to doing. Even so, I can’t let MoMA’s datasets go by unnoticed.
The above screenshot is from a timeline tool I developed for visually analysing large cultural collections. I imported the MoMA dataset and visualised the object records along their production dates. We can see the timeframe the collection spans, with earliest pieces from the late 1700s and – obviously – a focus on twentieth century and contemporary items.
The block shape around 1820 and the rectangular spike at 1900 represent large numbers of items that have the same, or very similar, production dates. Such anomalies can stand for series of items in the collection, traces of curatorial decisions in cataloguing, or mistakes in dating, among other things.
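To make the idea of such spikes concrete, here is a minimal Python sketch of how they could be detected in collections data. It is not the code behind my timeline tool: the field names (`year`, `classification`), the sample records and the threshold of five times the median are all assumptions made for illustration.

```python
from collections import Counter
from statistics import median


def count_by_year(records, date_field="year"):
    """Count records per production year, skipping records without an integer year."""
    return Counter(
        r[date_field] for r in records if isinstance(r.get(date_field), int)
    )


def find_spikes(counts, factor=5):
    """Return the years whose record count exceeds `factor` times the median count."""
    if not counts:
        return []
    typical = median(counts.values())
    return sorted(year for year, n in counts.items() if n > factor * typical)


def values_in_year(records, year, field="classification", date_field="year"):
    """Count the values of `field` among the records dated to `year`."""
    return Counter(r[field] for r in records if r.get(date_field) == year)


# Hypothetical sample data: a series of 40 photographs dated 1900,
# one print per year from 1890 to 1909, and one undated drawing.
records = (
    [{"title": f"Photo {i}", "year": 1900, "classification": "Photograph"} for i in range(40)]
    + [{"title": "Print", "year": y, "classification": "Print"} for y in range(1890, 1910)]
    + [{"title": "Undated", "year": None, "classification": "Drawing"}]
)

counts = count_by_year(records)
spikes = find_spikes(counts)
for year in spikes:
    print(year, counts[year], dict(values_in_year(records, year)))
```

A fixed multiple of the median is a crude test and will miss broader anomalies such as the block shape around 1820. It does show how a spike can be located in the data before looking at which field values dominate it.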
I inspected a few records in the 1900 spike and encountered a few photographs, which gave me the idea that the spike could represent a larger series of photographs – this would explain the high production output in a short timeframe. The tool allows me to colour records according to a field value, so I gave it a try and coloured all photographs in green:
Registration is still open for the inaugural event on 25th/26th of April 2015, which will be followed by a ten-week period for the participants to work on their projects, and culminate in a final event and presentations on the 5th of July.
Find more information and how to register on the organiser’s website
The House of World Cultures in Berlin. Image by Avda
The organisers Shintaro Miyazaki and Jamie Allen write:
Media archaeology is an academic method, but also an artistic practice and material inquiry. Playful, ironic aesthetics and critically historical approaches to media cultures and their technologies is gaining increased attention. We live in an archive of the media technological storage and of regurgitation of bygone times — such a situation requires artistic reactions and interventions.
In this context I will present my research and will focus on two recent projects on mining and visualising Wikipedia article revisions. A Wikipedia article, as commonly accessed through a browser, only represents the most recent version of that article. Underneath the surface are often thousands of earlier revisions of the same article. Wikipedia is not only an encyclopaedia, but also a history of an encyclopaedia and a reflection of changing knowledge, beliefs, concerns and social issues. Through my work I try to expose these hidden layers and mine the cultural archive of Wikipedia.
The visualisation is based on Backstory, a tool I designed during the Beautiful Data workshop at Harvard metaLab. For Day With(out) Art, a campaign organised by Visual AIDS, I have expanded it into an interactive timeline, which lets users explore the revision history of the HIV/AIDS Wikipedia article.
In the words of Becky Huff Hunter from the ICA Philadelphia:
BackStory: 13 years of HIV/AIDS on Wikipedia is an online visualization tool which allows viewers to explore a subjective, contested, and constantly expanding history of HIV/AIDS, through a chronology of revisions to Wikipedia articles on this topic.
When you read the article about HIV/AIDS on Wikipedia, what you see is just the latest version of a document that has undergone 13 years of collaborative writing and editing. This revision history, which is exposed through this visualisation, reflects the changing views and discourses around HIV/AIDS.
Over 8,000 versions of the HIV/AIDS Wikipedia article form the basis for this visualisation. They have been curated around three chosen keywords: Condoms, Viral Load and Safe Sex.
Visit www.icaphila.org on December 1st to see the project live.
See also the announcement on the ICA’s website.
BackStory is an online visualisation tool for exploring the history of Wikipedia articles. It lets you access and navigate through past revisions of an article.
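The idea of following chosen keywords through an article's revision history can be sketched in a few lines of Python. This is not BackStory's implementation: the revision structure (`timestamp`, `text`), the function name and the sample texts are hypothetical, and the sketch assumes that the revisions have already been retrieved.

```python
def keyword_timeline(revisions, keywords):
    """Count case-insensitive keyword occurrences per revision, oldest first.

    Each revision is assumed to be a dict with an ISO 8601 `timestamp`
    string and the full `text` of the article at that point in time.
    """
    ordered = sorted(revisions, key=lambda rev: rev["timestamp"])
    timeline = []
    for rev in ordered:
        text = rev["text"].lower()
        row = {"timestamp": rev["timestamp"]}
        for keyword in keywords:
            # Plain substring count; no stemming or word-boundary handling.
            row[keyword] = text.count(keyword.lower())
        timeline.append(row)
    return timeline


# Hypothetical placeholder revisions, deliberately out of order.
revisions = [
    {"timestamp": "2005-06-01T00:00:00Z",
     "text": "A section mentions condoms. Another mentions viral load."},
    {"timestamp": "2001-11-01T00:00:00Z",
     "text": "An early stub without any of the keywords."},
    {"timestamp": "2012-03-01T00:00:00Z",
     "text": "Safe sex, including condoms. Condoms and viral load."},
]

timeline = keyword_timeline(revisions, ["Condoms", "Viral Load", "Safe Sex"])
for row in timeline:
    print(row)
```

Each row can then be placed on a time axis, so that the appearance, growth or disappearance of a keyword becomes visible across the life of the article.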