YYYY-MM-DD

Time/Data/Visualisation

Behind the Scenes: ChronoZoom

In this post I will try to reproduce the steps that led to my visualisation of ChronoZoom timelines. I saved the important milestones as individual files, which you can find at the beginning of this post. It is fairly technical and in a way written more as a record for myself than for a general audience. So bear with me should you decide to read this, and feel free to ask questions in the comment section.

evolution

Versions

v0.1, v0.2, v0.3, v0.4, v0.5, v0.6, v0.7, v0.8, v0.9, v0.10, v0.11, v0.12, v0.13, v0.14, v0.15, v0.16, v0.17, v0.18, v0.19, v0.20, v0.21, v0.22, v0.23, v0.24, v0.25, v0.26, v0.27, v0.28, v0.29, v0.30, v0.31, v0.32, v1.0b, Final

ChronoZoom

Visualizing.org has partnered up with the people behind ChronoZoom. ChronoZoom is both a dataset containing curated timelines of the history of the cosmos and the world, and a visual interface for exploring those timelines. Much in the same tradition as some of the earliest timelines, which aimed to map all of time (from the Creation to the Last Judgement), ChronoZoom contains events from the Big Bang up to our current times. If you somehow haven’t come across it yet, you should give it a try here.

Screen Shot 2014-01-03 at 16.33.42

The platform has been around for a while, but it seems that they are now making an effort to develop it further and make it a valuable learning and research tool. Hence the contest, and hence the very clear briefing for it, which points at issues with the current visualisation that they have discovered themselves.

Such restricted contests always leave a bit of a bad aftertaste, as one can’t help but suspect that they are more about institutions looking for a cheap way to gather a lot of ideas they could not come up with themselves than about really contributing to the community. But let’s not go into politics.

At least with the visualizing.org contests, all entries are publicly visible. When the competition was first advertised, I was immediately looking forward to seeing diverse ways of representing time, so it was obvious I had to contribute myself.

Entries are possible in three categories addressing three distinct issues with any visual timeline based on large amounts of data:

Causality: how events are caused and influenced by previous events
Affinity: how events relate to or are viewed by different groups or entities
Context: how to give a clear overview of a dataset

I have often struggled with the last one myself: how can we give a meaningful overview of a dataset that contains thousands or millions of entries, especially on a timeline? If everything is represented, it easily becomes too complex, and if we summarise, we have to decide on a meaningful abstraction.

What distinguishes the ChronoZoom dataset from others I have worked with is its hierarchical nature. It is basically a set of curated nested timelines, all contained in the Cosmos timeline, which is the topmost level. From there, one can travel to the history of humanity by zooming into “Earth & Solar System” and subsequently “Life”. However, the hierarchies are not very strict, and especially on the finer-grained levels they translate more to subject categories (Industrial Revolution, Roman History, etc.).

The ChronoZoom visualisation emphasises the scale of these events: one has to zoom in very far to finally be able to discover the tiny smudge that is the history of humanity. In communicating these huge differences in time scale, it is very effective. Yet it is very hard to get an impression of the richness of the ChronoZoom dataset as a whole in this way.

Retrieving the data

As a first step, I therefore wanted to get a better idea of the content. It is possible to access the content directly via a REST API, which is very well documented. I didn’t want to look at anything specific, I just wanted to look at everything. Calling getTimelines without any arguments returns all the timelines in the collection as a JSON object. As the plain text format is pretty much unreadable, I copied the returned string into an online JSON viewer.

JsonViewer Timelines

This reveals the basic structure of the root node in the dataset, the Cosmos timeline. I expanded the property “timelines”, which is an array that contains the next three timelines. All timelines follow the same schema with the same set of properties. Start and end denote the extent of the timeline in years, while an end year of 9999 simply means the timeline does not have a (known) end. Height seems to be a property used for the visual rendering of the timeline in the ChronoZoom visualisation. Exhibits is an array that contains a range of content items, usually images and videos, that refer to a date and event inside the timeline. In the ChronoZoom visualisation they are rendered as bubbles and their content can be viewed by zooming in.

ChronoZoom Exhibits

In the JSON data they contain a title, description, source, URL and type. I haven’t seen any other types apart from “Picture” and “Video”, but maybe I haven’t looked hard enough.

Exhibit JSON data
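For a first feel of the numbers, a few lines of plain JavaScript are enough. The following is a minimal sketch, assuming the property names described above (timelines, exhibits, start, end, type) are spelled exactly like this in the response. The sample object is made up for illustration: the titles come from the dataset, but the years are placeholders and the helper name summarise is hypothetical.

```javascript
// Hypothetical miniature of the response; years are placeholders, not real dates.
var sample = {
  title: "Cosmos", start: -13000, end: 9999,
  exhibits: [{ title: "Example picture", type: "Picture" }],
  timelines: [
    { title: "Earth & Solar System", start: -4000, end: 9999,
      exhibits: [{ title: "Example video", type: "Video" }],
      timelines: [
        { title: "Life", start: -3000, end: 9999, exhibits: [], timelines: [] }
      ] }
  ]
};

// Recursively visit every timeline and collect simple statistics.
function summarise(timeline, depth, stats) {
  stats = stats || { timelines: 0, exhibits: 0, openEnded: 0, maxDepth: 0, types: {} };
  depth = depth || 0;
  stats.timelines += 1;
  stats.maxDepth = Math.max(stats.maxDepth, depth);
  if (timeline.end === 9999) stats.openEnded += 1; // 9999 marks "no known end"
  (timeline.exhibits || []).forEach(function (e) {
    stats.exhibits += 1;
    stats.types[e.type] = (stats.types[e.type] || 0) + 1; // count per exhibit type
  });
  (timeline.timelines || []).forEach(function (child) {
    summarise(child, depth + 1, stats);
  });
  return stats;
}

var stats = summarise(sample);
console.log(stats);
```

Run against the full response instead of the sample, the types object would also answer the question of whether anything other than “Picture” and “Video” occurs.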

Now I knew what the data looked like on a structural level, but exploring the dataset as a whole in this format was cumbersome. It was time to turn to visualisation.

Visualise it

Usually I use Excel or Tableau to make some initial charts and scatter plots to get an idea of a dataset. But Excel doesn’t handle hierarchical data well. Tableau is better, but I figured the quickest way to build a hierarchical visualisation is by using d3.js and one of Mike Bostock’s excellent examples.

I adapted the Tree layout example, although I hardly had to change anything to make it work with the ChronoZoom data. After pointing the script at the right URL and changing the children accessor, I was presented with a clean tree representation of the entire dataset.


var tree = d3.layout.tree()
    .size([height, width - 160])
    // ChronoZoom nests child timelines under "timelines", not "children"
    .children(function(d) {
         return d.timelines;
    });

The children() function of the tree layout (and any other hierarchical d3 layout) specifies how the child elements of a dataset are accessed. In this dataset, the children are in the timelines array, so I had to point there.
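What such an accessor does can be illustrated without d3. The following is a minimal sketch and not d3’s actual implementation: a hypothetical flatten helper receives the accessor as an argument and uses it to find the children of each node, recording depth and parent along the way.

```javascript
// Flatten a nested structure into a list of nodes in depth-first order.
// `childrenOf` plays the role of the accessor passed to tree.children().
function flatten(root, childrenOf) {
  var nodes = [];
  (function visit(datum, parent, depth) {
    var node = { data: datum, parent: parent, depth: depth };
    nodes.push(node);
    (childrenOf(datum) || []).forEach(function (child) {
      visit(child, node, depth + 1);
    });
  })(root, null, 0);
  return nodes;
}

// Made-up miniature of the dataset, using titles mentioned in the post.
var data = {
  title: "Cosmos",
  timelines: [
    { title: "Earth & Solar System", timelines: [{ title: "Life" }] }
  ]
};

// With the ChronoZoom accessor, all three timelines are found.
var nodes = flatten(data, function (d) { return d.timelines; });

// With an accessor that looks for "children", only the root is found.
var rootOnly = flatten(data, function (d) { return d.children; });

nodes.forEach(function (n) {
  console.log(new Array(n.depth + 1).join("  ") + n.data.title);
});
```

The second call shows why the accessor had to be changed: looking for a children property in this data finds only the root node.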

Screen Shot 2014-01-03 at 12.01.41

The most prominent branch is “Humanity”, which contains a great number of timelines. But also less obvious timelines are easily discoverable in such a map. I have spent some time looking at this representation and then finding the corresponding timeline in the ChronoZoom visualisation.

I liked how well one could get an overview of the data in such a tree visualisation, but of course all reference to time and time spans is lost. Well, the solution is simple then; just plot the timelines according to their span and location in time, but keep the hierarchical tree layout.

I won’t reproduce the napkin sketch that followed this insight, but I’ll include an Illustrator sketch below.

Sketch

Very simple: the timelines are plotted as bars and connected to their children. Deeper levels are hidden and can be expanded at will. I found the connecting lines hard to read, as it was not obvious which timeline should be the parent when several child timelines start around the same date. Therefore I decided to have the lines switch sides at every other level. I also added some dots that would represent the exhibits in a timeline. I didn’t like how one has to zoom in all the way in the original ChronoZoom visualisation in order to see them, so I imagined them to be displayed in a separate panel or window.

Sketch

Now all I had to do was to change the original tree visualisation to horizontal orientation and add the visual bars. This was a very bad idea. It was one of those moments where you don’t see the most obvious solution and carry on down the wrong track for a while.

When I added the bars to the tree layout, they overlapped. So I had another bad idea and stacked the bars vertically on top of each other between the nodes. Here is the result.

Crappy Tree

Still I didn’t realise how bad my idea was, and first started reading about the Reingold-Tilford Tree algorithm (which powers d3’s tree layout) to find out whether I could somehow recreate and modify it to produce the visual image I wanted.

And at last the penny dropped! My tree layout was much simpler. The horizontal position of every node was given by its timeframe, and the vertical position was simply below the previous one.

In fact, rendering the data as an HTML list already produces the vertical ordering that I wanted. By rendering the list items as bars and adding some jQuery magic to handle the collapsing, I already had half of what I needed.

Screen Shot 2013-12-19 at 16.25.22

I still used d3 to generate the HTML code, except I used the higher-level hierarchy layout instead of the tree layout to get rid of unnecessary overhead.

jQuery and the standard HTML flow handled most of the positioning and interaction I needed. Even zooming and panning would be possible to realise in this way, but drawing the connecting lines would be a pain in the a..bsolute positioning.

Therefore I went back to drawing with SVG elements. Overall, this was more flexible. The visual appearance was not much different, but now I could easily implement zooming and panning via d3’s zoom behaviour.

I had never used this before, but once more the examples were helpful. Nevertheless, I ran into many problems on the way, so I will include only the code snippets that I used in the final version.

First: one needs to define d3 scales which are controlled by the zoom behaviour.

var x = d3.scale.linear()
    .domain([0, params.viewport.width])
    .range([0, params.viewport.width]);

var y = d3.scale.linear()
    .domain([0, params.viewport.height])
    .range([0, params.viewport.height]);

The d3 zoom behaviour takes two d3 scales and will manipulate their domain according to the user’s interaction.

Initially, these are 1:1 mappings between, in my case, pixel coordinates on the screen. The zoom behaviour will later modify their domain to reflect zoom and pan operations. Therefore, these scales also need to control the position and size of the elements on the screen.
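How a zoom factor and a pan offset end up in the domain of such a scale can be shown in plain JavaScript. This is only a sketch under assumptions: linearScale is a hypothetical stand-in for a d3 linear scale, rescale approximates what the zoom behaviour does rather than reproducing its code, and the viewport width of 800 is made up.

```javascript
// Minimal stand-in for a linear scale with a fixed domain and range.
function linearScale(domain, range) {
  function scale(v) {
    var t = (v - domain[0]) / (domain[1] - domain[0]);
    return range[0] + t * (range[1] - range[0]);
  }
  scale.invert = function (px) {
    var t = (px - range[0]) / (range[1] - range[0]);
    return domain[0] + t * (domain[1] - domain[0]);
  };
  scale.domain = function () { return domain.slice(); };
  scale.range = function () { return range.slice(); };
  return scale;
}

// Return a new scale whose domain reflects zoom factor k and pan offset tx.
// The range (the screen) stays the same; only the visible domain changes.
function rescale(base, k, tx) {
  var domain = base.range().map(function (px) {
    return base.invert((px - tx) / k);
  });
  return linearScale(domain, base.range());
}

var width = 800;                               // assumed viewport width
var x0 = linearScale([0, width], [0, width]);  // the initial 1:1 mapping
var x = rescale(x0, 2, -400);                  // zoom in 2x, keeping the centre fixed

console.log(x.domain());                       // [200, 600]
console.log(x(200), x(400), x(600));           // 0 400 800
```

After zooming, the scale maps the 1:1 coordinates 200 to 600 onto the full width of the screen, which is why every element has to be positioned through the scale rather than with its raw coordinates.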

I have played around with how to best implement this through the evolution of the code. Finally I settled with storing the x and y coordinates as well as the sizes of every timeline in 1:1 scale as their data attribute and passing this value to the updated scales.

function updateNodeValues() {
     nodes.forEach(function(d, i) {
          // horizontal position: start year mapped to pixels
          d.x = params.scale.yearToPx(d.start);
          // vertical position: one row per visible timeline, offset by the top padding
          d.y = params.elements.height * i + i * params.elements.margin + params.viewport.padding;
          d.width = params.scale.yearToPx(d.end) - params.scale.yearToPx(d.start);
     });
}

This function defines the x and y coordinates as well as the width for every visible timeline in 1:1 scale.
The y position stays consistent during zooming and is calculated based on a set height and the number of visible timelines (index i) plus a set value to create a margin from the top edge of the window.

params.scale.yearToPx = d3.scale.linear()
          .domain([root.start, thisYear])
          .range([params.viewport.padding, params.viewport.width - params.viewport.padding]);

The x position and the width use a predefined scale function, which maps year numbers to pixel values. The output range is the available visible space, minus padding on the left and right edge of the screen.
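To make the arithmetic concrete, here is a small worked example in plain JavaScript. It is a sketch under assumptions: the parameter values, the three timelines and their years are placeholders, and makeYearToPx is a hypothetical replacement for the d3 linear scale so that the example runs on its own.

```javascript
// Assumed layout parameters; the real values are not given in the post.
var params = {
  viewport: { width: 1000, padding: 20 },
  elements: { height: 10, margin: 5 }
};

// Plain-JavaScript replacement for the d3 linear scale from years to pixels.
function makeYearToPx(startYear, endYear) {
  var left = params.viewport.padding,
      right = params.viewport.width - params.viewport.padding;
  return function (year) {
    return left + (year - startYear) / (endYear - startYear) * (right - left);
  };
}

// Visible timelines in display order (placeholder years).
var nodes = [
  { title: "A", start: 0, end: 2000 },
  { title: "B", start: 500, end: 1500 },
  { title: "C", start: 1000, end: 2000 }
];

var yearToPx = makeYearToPx(0, 2000);

nodes.forEach(function (d, i) {
  d.x = yearToPx(d.start);
  // one row per visible timeline: bar height plus margin, below the top padding
  d.y = params.elements.height * i + i * params.elements.margin + params.viewport.padding;
  d.width = yearToPx(d.end) - yearToPx(d.start);
});

nodes.forEach(function (d) {
  console.log(d.title, d.x, d.y, d.width);
});
// A 20 20 960
// B 260 35 480
// C 500 50 480
```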

Next up was the code that takes care of drawing the links. I went back to using d3’s tree layout instead of the more basic hierarchy layout, as I could use the built-in links function, which creates an array of node pairs that are connected. All I needed to do then was to write some code that creates an SVG path between the two timelines:


function linkPath(d) {
     var dist = 5,
          sourceX = x(d.source.x),
          sourceY = d.source.y + params.elements.height/2 + y(0),
          targetX = x(d.target.x),
          targetY = d.target.y + params.elements.height/2 + y(0),
          m = sourceX + "," + sourceY,
          l1 = sourceX - dist + "," + sourceY,
          l2 = sourceX - dist + "," + targetY,
          l3 = targetX + "," + targetY;
         
     return "M" + m +"L" + l1 + "L" + l2 + "L" + l3;
}

The initial code just draws an orthogonal connection between the target and the source, not yet switching sides.
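Switching sides could then be driven by the depth of the parent node. The following is one possible way to do it, not the code from the final version: alternatingLinkPath is a hypothetical variant in which parents at even depths attach their links on the left edge of the bars and parents at odd depths on the right edge. It assumes each node carries x, y, width and depth in final screen pixels, so no zoom scales are involved.

```javascript
// Hypothetical variant of linkPath: the side on which the link attaches
// depends on the depth of the source (parent) timeline.
function alternatingLinkPath(link, barHeight, dist) {
  var s = link.source,
      t = link.target,
      right = s.depth % 2 === 1,                      // odd depths use the right edge
      sourceX = right ? s.x + s.width : s.x,
      targetX = right ? t.x + t.width : t.x,
      elbowX = right ? sourceX + dist : sourceX - dist,
      sourceY = s.y + barHeight / 2,                  // vertical centre of the bars
      targetY = t.y + barHeight / 2;

  // out from the source, vertically to the target row, then in to the target
  return "M" + sourceX + "," + sourceY +
         "L" + elbowX + "," + sourceY +
         "L" + elbowX + "," + targetY +
         "L" + targetX + "," + targetY;
}

// Made-up nodes for illustration.
var parent = { x: 100, y: 0, width: 200, depth: 0 };
var child = { x: 150, y: 20, width: 50, depth: 1 };
var grandchild = { x: 160, y: 40, width: 20, depth: 2 };

var leftPath = alternatingLinkPath({ source: parent, target: child }, 10, 5);
var rightPath = alternatingLinkPath({ source: child, target: grandchild }, 10, 5);

console.log(leftPath);   // M100,5L95,5L95,25L150,25
console.log(rightPath);  // M200,25L205,25L205,45L180,45
```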

Now for some more interactivity. I wanted to only have the first two levels of timelines visible at first, and expand and collapse lower parts of the timeline tree at will. Again, Bostock provides an example of a collapsible tree layout. I borrowed the part that manages the collapsing. When initialising, all lower level children are moved from the array children to a new array _children. That way, they are out of sight for the tree layout.


function collapse(d) {
  if (d.children) {
    d._children = d.children;
    d._children.forEach(collapse);
    d.children = null;
  }
}

root.children.forEach(collapse);

Move the content of the children array to a new array, to hide it from the tree layout.

A similar piece of code handles the expanding and collapsing of the tree by switching the content of children and _children and subsequently updating the tree:

function click(d) {
  if (d.children) {
    d._children = d.children;
    d.children = null;
  } else {
    d.children = d._children;
    d._children = null;
  }
  update(d);
}

Toggling children is achieved by moving the content back and forth between d.children and d._children, updating after each call.
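The effect of both functions can be checked without d3 or a browser. In this minimal sketch, toggle is the click handler from above without the call to update, visible is a hypothetical helper that lists what a layout reading only children would see, and the small tree is made up for illustration.

```javascript
function collapse(d) {
  if (d.children) {
    d._children = d.children;
    d._children.forEach(collapse);
    d.children = null;
  }
}

// Same logic as click(d), minus the redraw.
function toggle(d) {
  if (d.children) {
    d._children = d.children;
    d.children = null;
  } else {
    d.children = d._children;
    d._children = null;
  }
}

// Collect the titles a layout would see, i.e. only what hangs off `children`.
function visible(d, out) {
  out = out || [];
  out.push(d.title);
  (d.children || []).forEach(function (c) { visible(c, out); });
  return out;
}

// Made-up nesting for illustration.
var root = { title: "Cosmos", children: [
  { title: "Earth & Solar System", children: [
    { title: "Life", children: [{ title: "Humanity" }] }
  ] }
] };

root.children.forEach(collapse);
var before = visible(root);   // ["Cosmos", "Earth & Solar System"]

toggle(root.children[0]);     // expand "Earth & Solar System"
var after = visible(root);    // ["Cosmos", "Earth & Solar System", "Life"]

console.log(before, after);
```

Because collapse recurses before hiding, expanding a timeline reveals only its direct children; deeper levels stay collapsed until they are clicked themselves.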

This also worked for my code, except that there was some unexpected behaviour when updating the timelines. Child timelines should be inserted below the clicked one and existing timelines should be moved downwards. Instead, existing timelines were updated with new data and new timelines were inserted below the tree. What I needed to do was tell d3 how to identify the timelines in order to distinguish new from existing ones. This is done fairly easily by passing the unique identifier with the data function.

var timelines = params.elements.timelines.selectAll("g.timeline")
          .data(nodes, function(d) {return d.id});

data() accepts a key function as its second argument, which is used to uniquely identify each piece of data.
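The difference between matching by position and matching by key can be reproduced in a few lines. This is a sketch of the idea only, not d3’s implementation: joinByKey and joinByIndex are hypothetical helpers, and the ids are made up.

```javascript
// Compare old and new data by key and sort them into enter/update/exit,
// mimicking what a keyed data join decides.
function joinByKey(oldData, newData, key) {
  var oldKeys = {}, newKeys = {};
  oldData.forEach(function (d) { oldKeys[key(d)] = true; });
  newData.forEach(function (d) { newKeys[key(d)] = true; });
  return {
    enter: newData.filter(function (d) { return !oldKeys[key(d)]; }),
    update: newData.filter(function (d) { return oldKeys[key(d)]; }),
    exit: oldData.filter(function (d) { return !newKeys[key(d)]; })
  };
}

// Without a key, elements are matched to data by position in the array.
function joinByIndex(oldData, newData) {
  var n = Math.min(oldData.length, newData.length);
  return {
    enter: newData.slice(n),
    update: newData.slice(0, n),
    exit: oldData.slice(n)
  };
}

var before = [{ id: "a" }, { id: "b" }, { id: "c" }];
// "a" was expanded, so its child "a1" now sits between "a" and "b".
var after = [{ id: "a" }, { id: "a1" }, { id: "b" }, { id: "c" }];

var byId = function (d) { return d.id; };
var keyed = joinByKey(before, after, byId);
var indexed = joinByIndex(before, after);

console.log(keyed.enter.map(byId));    // ["a1"]
console.log(indexed.enter.map(byId));  // ["c"]
```

Matched by position, the three existing elements are handed the data of a, a1 and b, and a new element is appended at the end for c. Matched by key, only a1 counts as new.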

Screen Shot 2014-01-06 at 16.44.33

Now the timelines knew where to appear, but not yet where to disappear. As every piece of data also contains the data of its parent element, I could simply use this for the disappearing transition:

var timelinesExit = timelines.exit()
     .transition()
     .duration(params.duration)
     .style("opacity", 0)
     .attr("transform", function(d, i) {
          if (d.parent) {
               return "translate(" + x(d.x) + "," + (d.parent.y + y(0)) + ")";
          } else {
               return "translate(" + x(0) + "," + y(0) + ")";
          }
     })
     .remove();

Timelines fade out and move to the position of their parent, before getting removed.

I’ll jump ahead a bit at this point. Making the links appear and disappear at the right position also caused me some headaches, but ultimately it all worked. The next challenge was to have the visualisation zoom in to the desired timeline when it is clicked. I already had a function which handles updating the visualisation when zooming manually via scroll wheel. Now I just had to make a function that sets the desired zoom scale and range programmatically. For simplicity, I again include the function in its final form:

function focus(from, to) {
     // scale so that the interval [from, to] spans the viewport width
     var zoomFactor = params.viewport.width / (to - from);

     // zoom.scale() expects a number, not an array
     zoom.scale(zoomFactor);
     // shift horizontally so that `from` lands on the left edge; keep the vertical offset
     zoom.translate([-from * zoomFactor, zoom.translate()[1]]);

     doZoom(zoom.scale());
}

It takes two arguments, from and to, which it receives from the selected timeline and which correspond to its start and end positions expressed in 1:1 scale pixel values. I then updated the function that handles zooming to support transitions.
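The arithmetic behind this can be verified in isolation. The sketch below assumes that a 1:1 pixel coordinate px ends up on screen at px * scale + translate, which is how a zoom transform is usually applied; focusTransform and toScreen are hypothetical helpers and the numbers are example values.

```javascript
// Compute the zoom scale and horizontal translation that make the pixel
// interval [from, to] (in 1:1 coordinates) fill the viewport exactly.
function focusTransform(from, to, viewportWidth) {
  var k = viewportWidth / (to - from);
  return { scale: k, translateX: -from * k };
}

// Where a 1:1 pixel coordinate ends up on screen under that transform.
function toScreen(px, t) {
  return px * t.scale + t.translateX;
}

var t = focusTransform(600, 700, 1000);            // assumed example values
console.log(t.scale);                              // 10
console.log(toScreen(600, t), toScreen(700, t));   // 0 1000
```

The start of the timeline lands on the left edge of the viewport and its end on the right edge, with no padding on either side.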

Next I added a visual axis to display the year numbers. Linking the axis to the zoom function took a bit of persuasion, but ultimately it worked by passing the (rescaled) domain of the zoomed x scale to the scale that drives the axis:

params.scale.axis.domain([params.scale.pxToYear(x.domain()[0]), params.scale.pxToYear(x.domain()[1])]) 

Updating the domain of the axis with the domain of the zoomed scale (converted from pixels to years).
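The pxToYear scale is simply the inverse of yearToPx. A minimal sketch of that conversion in plain JavaScript, with assumed viewport values, a placeholder year range and a made-up visible pixel interval:

```javascript
var padding = 20, width = 1000;          // assumed viewport values
var startYear = -4000, endYear = 2000;   // placeholder year range

// Years to 1:1 pixel positions, leaving padding on both sides.
function yearToPx(year) {
  return padding + (year - startYear) / (endYear - startYear) * (width - 2 * padding);
}

// Inverse mapping: from a 1:1 pixel position back to a year.
function pxToYear(px) {
  return startYear + (px - padding) / (width - 2 * padding) * (endYear - startYear);
}

// Pixel interval currently visible, as the zoomed x scale would report it.
var visiblePx = [260, 740];
var axisDomain = visiblePx.map(pxToYear);

console.log(axisDomain);                 // [-2500, 500]
```

With d3 linear scales the inverse does not have to be written by hand, since they provide an invert() method.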

A bit nasty but it works. Now what was left was supporting exhibits and making it all look a bit nicer. You can follow the evolution in the screenshots below. Click on an example to see it and examine the code.

Screen Shot 2014-01-06 at 14.16.47

Screen Shot 2014-01-06 at 14.16.51

Screen Shot 2014-01-06 at 14.17.08

Screen Shot 2014-01-06 at 14.17.19

Screen Shot 2014-01-06 at 14.18.00

2 thoughts on “Behind the Scenes: ChronoZoom”

  1. Kim

    really detailed insight into your process. Thanks!
    I have to say somehow I really like the d3 tree layout structure version.

    All the best,

    Kim

    • Florian

      Yes I was awed how easy it was to implement and yet how elegant it looked. My goal was to have this + time information. But the elegance and simplicity of this tree is hard to beat.

