monolingual similarities (synonyms)


The following pictures show semantic similarities between German words. The graph was created as a by-product of my diploma thesis. It is visualised with Christian Biemann's Chinese Whispers program, which also does the layout and the clustering (indicated by different colors). It is important to mention that the edges have weights between 0.4 and 1.0 that are not visible in the graph. These values describe the degree of similarity. Distances between nodes reflect the degree of similarity only roughly, because edge weights are not taken into consideration. Nevertheless, this is a really nice form of visualisation.
The graph has been calculated from the Europarl corpus; how this is done will be described in my thesis. Please note that not all nodes have been expanded. The central word is given below each picture.
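
For readers who have not met Chinese Whispers before, the clustering idea can be sketched roughly as follows. This is only a minimal illustration of the label-propagation scheme as it is commonly described, not the code of Biemann's program and not the method of the thesis. The function name, the parameters and the toy edges (nodes "a" to "d" with made-up weights) are assumptions chosen for the example; only the 0.4 lower bound on edge weights is taken from the text above. The sketch uses the weights when voting; setting all weights to 1 gives an unweighted variant.

```python
import random
from collections import defaultdict


def chinese_whispers(edges, min_weight=0.4, iterations=20, seed=0):
    """Cluster a weighted, undirected graph; returns {node: cluster label}.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Build an adjacency map; edges below the threshold are dropped.
    # Nodes that lose all their edges do not appear in the result.
    neighbours = defaultdict(dict)
    for u, v, w in edges:
        if w >= min_weight:
            neighbours[u][v] = w
            neighbours[v][u] = w

    # Every node starts in a cluster of its own.
    nodes = sorted(neighbours)
    labels = {node: i for i, node in enumerate(nodes)}

    for _ in range(iterations):
        rng.shuffle(nodes)  # visit nodes in random order
        for node in nodes:
            # Sum the edge weights per label among the neighbours.
            score = defaultdict(float)
            for other, w in neighbours[node].items():
                score[labels[other]] += w
            # Adopt the strongest label; ties are broken at random.
            best = max(score.values())
            candidates = sorted(l for l, s in score.items() if s == best)
            labels[node] = rng.choice(candidates)
    return labels


# Toy graph with made-up weights (not taken from the Europarl data).
edges = [("a", "b", 0.9), ("c", "d", 0.8), ("b", "c", 0.3)]
labels = chinese_whispers(edges)
```

In this toy graph the edge between "b" and "c" falls below 0.4 and is dropped, so "a" and "b" end up in one cluster and "c" and "d" in another. Such a cluster label is what a color in the pictures would correspond to.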

Below the graphs there are some plain tables showing concrete similarities. These tables are calculated based on other text material, so they differ from the graphs above.
Please note that similarity doesn't depend on word frequency (at least if the frequency exceeds 5 or 10); see the nice typos for "Kommission". If "lang2" is visible at the top of a table, only words with a frequency of at least 10 are shown. As the tables show, the method is not based on string matching ;) .
Although the examples are in German, the method can be applied to (nearly) every language. By the way, no linguistic information (e.g. no POS tags, no lemmatizer etc.) has been used yet, so there is a lot of room left for improvements.

Europa
"Europa"

Drogen
"Drogen"

verbunden
"verbunden"

lernen
"lernen"

akzeptieren
"akzeptieren"

misst
"misst"

genanntes
"genanntes"

erster
"erster"

schon
"schon"

schon
"erster"

Bauern2
"Bauern"

Bauern1
all "Bauern"

Kommission
"Kommission"

verstehen
"verstehen"

behaupten
"behaupten"

uebermitteln
übermitteln