Alif Wahid

Visualising Hyponym/Hypernym Hierarchies

Hyponym and hypernym are linguistic jargon that I will allow Wikipedia to explain. What matters for the sake of this post is that two or more differently spelled words can be semantically related. For instance, the statement “red is a colour” asserts that “red” and “colour” are related in a specific manner (usually called an is-a relationship in Computer Science): “red” is the hyponym and “colour” is its hypernym. A thesaurus is useful for a similar reason: it exposes a relationship between different words that have the same meaning (i.e., synonyms).

Anyhow, it turns out that you can elegantly visualise these relationships as a network (or, in strict mathematical terms, as a graph containing vertices and edges) using the NLTK, NetworkX and Matplotlib libraries for the Python programming language. NLTK ships with a lexical database called WordNet, which contains 155,287 English words and 117,659 synonym sets. These sets are organised into hierarchies below a root word, for example “car”. More specific terms (hyponyms) branch out (and merge) from that root, their own hyponyms branch out from them, and so on. That is where the idea of a network or graph comes from. So, I plotted quite a few such hyponym/hypernym hierarchies, and here’s the graph corresponding to “fear”.

Unfortunately I haven’t yet figured out a way to align the labels and edges so that they don’t overlap, which obviously makes the graph hard to read. Nevertheless, the network structure is quite interesting. Here’s the actual list of words that NLTK printed out for the hierarchy below “fear”.

horror
hysteria
intimidation
apprehension
trepidation
gloom
foreboding
presage
shadow
chill
suspense
alarm
timidity
shyness
diffidence
hesitance
unassertiveness
cold_feet
panic
swivet
scare
frisson
creeps
stage_fright

It appears that WordNet actually includes phrases like “cold feet” and treats them as words. I’m not sure that I agree entirely with this idea, but the phrase is certainly related in meaning to “fear” in this particular instance.
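A list like the one above can be produced by walking everything below a root and printing one name per entry. The following is only a minimal sketch under a few assumptions: the helper names first_seen_names and wordnet_hyponym_names are made up for illustration, "fear.n.01" is assumed to be the relevant sense of “fear”, and taking the first lemma name of each synset is a guess at how the list was generated. WordNet joins multi-word lemma names with underscores, which is why “cold feet” shows up as cold_feet.

```python
def first_seen_names(root, children, name):
    """Depth-first walk below `root`, returning each node's name once.

    `children` maps a node to a list of its direct hyponyms and
    `name` maps a node to the label to print (both are caller-supplied).
    """
    names, seen = [], set()
    stack = list(reversed(children(root)))
    while stack:
        node = stack.pop()
        label = name(node)
        if label in seen:          # branches can merge, so skip repeats
            continue
        seen.add(label)
        names.append(label)
        stack.extend(reversed(children(node)))
    return names


def wordnet_hyponym_names(synset_name="fear.n.01"):
    """Hypothetical wrapper around NLTK's WordNet interface.

    The import is local so the walk above can be tried without NLTK.
    Assumes the corpus was fetched with nltk.download("wordnet").
    """
    from nltk.corpus import wordnet as wn

    return first_seen_names(
        wn.synset(synset_name),
        lambda s: s.hyponyms(),
        lambda s: s.lemma_names()[0],
    )


# Toy, hand-written hierarchy for illustration only (not read from WordNet).
TOY = {"car": ["cab", "jeep"], "cab": ["minicab"]}
toy_names = first_seen_names("car", lambda w: TOY.get(w, []), lambda w: w)
for word in toy_names:
    print(word)
```

Calling wordnet_hyponym_names() with NLTK installed should give a list in the same spirit as the one above, although the exact order and contents depend on the WordNet version.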

Well, I’ve got tonnes of plots like the one above and there simply isn’t enough space to put them all here :( The best way to see them is to start using these tools yourself and generate your own plots. I used a slightly modified chunk of code from the NLTK book, available here.
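To give a flavour of what such code involves, here is a minimal sketch of the general approach. It is not the exact code behind these plots, and it is not the NLTK book's version either. The function names hyponym_edges and wordnet_hyponym_graph are invented for illustration, and "fear.n.01" is an assumed choice of sense. The idea is to walk from a root to its hyponyms, record each parent/child pair as an edge, and hand the edges to NetworkX.

```python
from collections import deque


def hyponym_edges(root, children):
    """Breadth-first walk from `root`, returning (parent, child) edges.

    `children` is any callable mapping a node to its direct hyponyms.
    """
    edges = []
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children(node):
            edges.append((node, child))
            if child not in seen:  # branches can merge: expand each node once
                seen.add(child)
                queue.append(child)
    return edges


def wordnet_hyponym_graph(synset_name="fear.n.01"):
    """Hypothetical helper: hyponym hierarchy of one synset as a graph.

    Imports are local so the traversal above runs without NLTK or NetworkX.
    Assumes the corpus was fetched with nltk.download("wordnet").
    """
    import networkx as nx
    from nltk.corpus import wordnet as wn

    root = wn.synset(synset_name)
    graph = nx.DiGraph()
    graph.add_node(root.name())
    for parent, child in hyponym_edges(root, lambda s: s.hyponyms()):
        graph.add_edge(parent.name(), child.name())
    return graph


# Toy, hand-written hierarchy for illustration only (not read from WordNet).
TOY = {"colour": ["red", "blue"], "red": ["crimson"], "blue": ["navy"]}
toy_edges = hyponym_edges("colour", lambda w: TOY.get(w, []))
print(toy_edges)
```

With all three libraries installed, passing the result of wordnet_hyponym_graph() to nx.draw(graph, with_labels=True) and then calling matplotlib.pyplot.show() should display a plot of the hierarchy, overlapping labels and all.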