Great gallery of science visualisations here. The above image came with the following caption.
Photograph: Kai-hung Fung, Pamela Youde Nethersole Eastern Hospital (Hong Kong) /2012 Science/NSF International Science & Engineering Visualization Challenge
Note: click on the image above to see the full-text PDF version of the paper.
yEd is a pretty cool diagramming app that runs on all platforms (thanks to Java). It’s free of charge but not open-source. The automatic layout feature is very funky! The company behind it, yWorks, seems to have a neat selection of diagramming tools for specialist needs.
I fiddled around with visualising hyponym hierarchies of various English words last September as per this post. The script I used back then has undergone slight modification in the interim (which can’t really be called an improvement). The following image is the hyponym hierarchy of “science”.
And the next one is for “physics”.
And the next one is for “biology”.
And the next one is for “art”.
So these images are, in essence, just quick ways of visualising the interconnectedness of the meaning of a word as related to its corresponding synonym sets. The darkest node in each of the graphs above is the corresponding starting word that I used (i.e., science, physics, biology and art respectively). As you might have guessed, the science graph contains both of the graphs corresponding to physics and biology.
Unfortunately I can’t print labels alongside the nodes as yet (not correctly, that is). So they look cryptic and lifeless without any kind of annotation. But if you imagine that each node has a word associated with it, which is a more specific term (a hyponym) under the starting word, then the density of the graph tells you something about how generic the meaning is. Hence, science is denser than physics and biology, while biology is clearly denser than physics.
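"Density" above is a visual impression, so here is a rough sketch of numbers one might compare instead. It assumes the hierarchy is available as a list of (parent, child) pairs; the function name `graph_stats` and the toy edge lists are made up for illustration and are not WordNet output.

```python
def graph_stats(edges):
    """Summarise a directed graph given as (parent, child) pairs."""
    nodes = {name for edge in edges for name in edge}
    n, m = len(nodes), len(edges)
    # Formal density of a directed graph: edges present / edges possible.
    density = m / (n * (n - 1)) if n > 1 else 0.0
    return {"nodes": n, "edges": m, "density": density}

# Hypothetical toy hierarchies (not real WordNet data).
small = [("physics", "optics"), ("physics", "mechanics")]
large = small + [("science", "physics"), ("science", "biology"),
                 ("biology", "botany"), ("biology", "zoology")]

print(graph_stats(small))
print(graph_stats(large))
```

Note that the formal density often shrinks as a tree-like graph grows, so the plain node and edge counts are probably closer to what the eye reads as "dense" in the images.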
For me, the real attraction of graphics like this is in browsing the thesaurus and the dictionary. I’m working toward (very very slowly) a browsing tool that lets people click on a node in order to bring up the hidden hierarchy inside it. That would be a more stimulating way to use the thesaurus + dictionary than the old-fashioned tedious way. We’ll see how things go; this is a mock concept for now.
The following images (in the same order: science, physics, biology and art) have labels printed over each node. This is what I meant when I said that my annotation code is currently incorrect: the graphs are so dense that the labels just turn into junk :( Unfortunately, I think there’s some complicated maths involved in automatically spacing out and resizing the labels according to some metric like the distance of the word from the root word :\ Hmm…this will take time :/
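One simple starting point, short of the complicated maths, might be to shrink labels linearly with their distance from the root and hide the deepest ones altogether. The sketch below assumes a dictionary mapping each node to its depth; the function name `label_style`, its parameters and the sample depths are all hypothetical.

```python
def label_style(depth_by_node, base_size=14.0, min_size=4.0, max_labelled_depth=2):
    """Pick a font size and visibility for each node from its depth."""
    deepest = max(depth_by_node.values()) or 1  # avoid dividing by zero
    styles = {}
    for node, depth in depth_by_node.items():
        # Linear interpolation: root gets base_size, deepest node gets min_size.
        size = base_size - (base_size - min_size) * depth / deepest
        styles[node] = {"size": size, "show": depth <= max_labelled_depth}
    return styles

# Hypothetical depths (root word at depth 0).
depths = {"science": 0, "physics": 1, "optics": 2}
for node, style in label_style(depths, max_labelled_depth=1).items():
    print(node, style)
```

The sizes could then be applied by drawing each visible label individually, for example with Matplotlib's `text` and its `fontsize` argument. This does nothing about overlapping labels; it only thins them out.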
Finally, just for some context and contrast - the following graphs correspond to “man” and “woman” respectively :) No points for guessing which one is more complicated :P
UPDATE: Just to ensure that I’m complying with citation requirements - the lexical database that the above images are based on is called WordNet from Princeton University. And the main Python toolkit that I’m using is called NLTK, which ships with a convenient WordNet wrapper that makes it easy to browse this extensive database.
A hyponym and its corresponding hypernym are linguistic jargon that I will allow Wikipedia to explain. What matters for the sake of this post is that two or more visually different words can have semantic relationships of some kind. For instance, the statement “red is a colour” asserts that “red” and “colour” are related in some specific manner (usually called an is-a relationship in Computer Science). Similarly, a thesaurus is quite useful because it exposes a relationship between different words that have the same meaning (i.e., synonyms).
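To make the is-a idea concrete, here is a minimal sketch that follows "more general term" links upward until it either finds the target or runs out of links. The dictionary is hypothetical toy data, not WordNet, and it assumes each word has a single hypernym (in WordNet a synset can have several).

```python
# Hypothetical toy data: each word maps to its more general term (hypernym).
TOY_HYPERNYM = {
    "crimson": "red",
    "red": "colour",
    "colour": "visual_property",
}

def is_a(word, ancestor, hypernym_of):
    """Return True if `ancestor` is reached by following hypernym links upward."""
    seen = set()
    current = word
    while current in hypernym_of and current not in seen:
        seen.add(current)  # guard against accidental cycles in the data
        current = hypernym_of[current]
        if current == ancestor:
            return True
    return False

print(is_a("red", "colour", TOY_HYPERNYM))      # red is a colour
print(is_a("crimson", "colour", TOY_HYPERNYM))  # the relation is transitive
print(is_a("colour", "red", TOY_HYPERNYM))      # but not symmetric
```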
Anyhow, it turns out that you can elegantly visualise these relationships as a network (or in strict mathematical terms, as a graph containing vertices and edges) using the NLTK, NetworkX and Matplotlib libraries for the Python programming language. NLTK ships with a lexical database called WordNet, which contains 155,287 English words and 117,659 synonym sets. These sets are organised into hierarchies with a root word, for example “car”. More specific terms (hyponyms) then branch out (and merge) from that root, and their own hyponyms branch out from them, and so on. Thus you have this idea of a network or graph. So, I plotted quite a few such hyponym/hypernym hierarchies and here’s the graph corresponding to “fear”.
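The branching-and-merging structure can be sketched with a plain breadth-first walk. This is not the script behind the plots; it uses a hypothetical toy dictionary in place of WordNet so that it runs without any downloads. With NLTK installed, the `children_of` callback could presumably be something like `lambda s: s.hyponyms()` applied to a synset from `wordnet.synsets(...)`.

```python
from collections import deque

# Hypothetical toy stand-in for WordNet (not real WordNet output).
TOY_HYPONYMS = {
    "science": ["natural_science", "mathematics"],
    "natural_science": ["physics", "biology"],
    "physics": ["optics", "mechanics"],
    "biology": ["botany", "zoology"],
    "mathematics": ["mechanics"],  # shared child: branches can merge
}

def build_hierarchy(root, children_of):
    """Breadth-first walk from root; returns (depth_by_node, edges)."""
    depth = {root: 0}
    edges = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children_of(node):
            edges.append((node, child))
            if child not in depth:  # expand each node only once
                depth[child] = depth[node] + 1
                queue.append(child)
    return depth, edges

depth, edges = build_hierarchy("science", lambda w: TOY_HYPONYMS.get(w, []))
print(len(depth), "nodes,", len(edges), "edges")
```

The edge list can be handed to NetworkX (for example via `DiGraph.add_edges_from`) for layout and drawing, and the depths are handy for colouring the root darkest.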
Unfortunately I haven’t, as yet, figured out a way to align the labels and edges correctly so that they don’t overlap (which obviously makes the graph hard to read). Nevertheless, the network structure is quite interesting. Here’s the actual list of words that NLTK printed out as the synonym set of “fear”.
It appears that WordNet actually includes phrases like “cold feet” and defines them as words. Not sure that I agree entirely with this idea, but there certainly is a synonymous relationship in this particular instance, nonetheless.
Well, I’ve got tonnes of plots like the one above and there simply isn’t enough space to put them here :( The best way to see them is to start using these tools yourself and generate your own plots. I’ve used a slightly modified chunk of code from the NLTK book available here.