DanNet

From Brede Wiki

DanNet is the Danish version of WordNet. In July 2011 it had 65,000 synsets, of which around 2,000 were connected to the English (Princeton) WordNet.[1]

OWL files for DanNet are available for download from:

http://wordnet.dk

A web service with a search interface for words in DanNet is available at:

http://andreord.dk

Relations

$ awk -F'@' '{print $2}' relations.csv | sort | uniq -c | sort -n
      4 usedForQualifiedBy
     10 meronymOf
     31 hypernymOf
     41 locationHolonymOf
     51 eqHyponymOf
     68 nearAntonymOf
     72 eqHypernymOf
    156 madeofMeronymOf
    282 involvedInstrument
    371 involvedPatient
    458 rolePatient
    569 nearSynonymOf
   1048 memberMeronymOf
   1309 instanceOf
   2074 memberHolonymOf
   2286 xposNearSynonymOf
   2939 locationMeronymOf
   3192 concerns
   4263 madeofHolonymOf
   4875 eqSynonymOf
   5872 partMeronymOf
   6938 usedForObject
  11007 involvedAgent
  11741 madeBy
  13745 partHolonymOf
  23190 usedFor
  34225 roleAgent
  41332 domain
  64712 hyponymOf
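
The same tally can be sketched in Python with only the standard library. This is a minimal sketch, not part of the DanNet distribution: the function name `count_relation_names` is hypothetical, and the sample rows are made up to stand in for relations.csv.

```python
import csv
import io
from collections import Counter


def count_relation_names(lines):
    """Count the second '@'-separated field of each row, like the awk pipeline."""
    reader = csv.reader(lines, delimiter="@", quoting=csv.QUOTE_NONE)
    counts = Counter(row[1] for row in reader if len(row) > 1)
    # Sort ascending by count, matching `sort -n` on the `uniq -c` output.
    return sorted(counts.items(), key=lambda item: item[1])


# Made-up rows standing in for relations.csv (hypothetical identifiers).
sample = io.StringIO(
    "s1@hyponymOf@hyponym_of@s2@\n"
    "s3@hyponymOf@hyponym_of@s2@\n"
    "s1@usedFor@used_for@s4@\n"
)
for name, count in count_relation_names(sample):
    print(f"{count:7d} {name}")
```

For the real file, pass an open file object instead of the sample, for example `open('relations.csv', encoding='iso-8859-1', newline='')`.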

Python

Comma-separated data files

Read comma-separated values files from the distribution:

import pandas as pd


def read_csv(filename, names):
    """Read an '@'-separated DanNet file into a DataFrame."""
    # The extra 'end' column (presumably from a trailing separator) is dropped.
    # Copy the list so the caller's `names` is not mutated.
    columns = list(names) + ['end']
    # header=0 replaces the first row of the file with `columns`.
    df = pd.read_csv(filename, sep='@', encoding='iso-8859-1',
                     header=0, names=columns)
    return df.drop(columns='end')


words = read_csv('words.csv', names=['word_id', 'form', 'pos'])
relations = read_csv('relations.csv', names=['synset_id', 'name', 'name2', 'value'])
wordsenses = read_csv('wordsenses.csv', names=['wordsense_id', 'word_id', 'synset_id', 'register'])

# Some values in synsets.csv contain the separator character, so this fails:
# synsets = read_csv('synsets.csv', names=['id', 'label', 'gloss', 'ontological_type'])
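
One possible workaround for synsets.csv is to split each row by hand. The sketch below rests on assumptions the source does not confirm: that each row has the layout `id@label@gloss@ontological_type@`, and that only the gloss can contain a stray `@`. The function name `parse_synset_line` and the sample row are hypothetical.

```python
def parse_synset_line(line):
    """Split one synsets.csv row whose gloss may contain '@'.

    Assumed layout (not confirmed by the source):
    id@label@gloss@ontological_type@
    """
    body = line.rstrip("\r\n")
    if body.endswith("@"):
        body = body[:-1]  # drop the trailing separator
    # The first two fields are taken from the left ...
    synset_id, label, rest = body.split("@", 2)
    # ... and the last field from the right; what remains is the gloss.
    gloss, ontological_type = rest.rsplit("@", 1)
    return {
        "id": synset_id,
        "label": label,
        "gloss": gloss,
        "ontological_type": ontological_type,
    }


# Made-up row with a separator inside the gloss.
row = parse_synset_line("42@{hund_1}@dyr @ hjemmet@Animal@\n")
print(row["gloss"])
```

Splitting from both ends keeps the outer fields intact however many separators the gloss contains; it would give wrong results if another field could also contain `@`.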

Near synonyms

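from pprint import pprint

# `dannet` is assumed to be an already constructed object whose db.query()
# returns a pandas DataFrame; it is not defined on this page.
query = """
    select s1.label, s2.label
    from relations r, synsets s1, synsets s2
    where r.name = 'nearSynonymOf'
      and r.synset_id = s1.synset_id
      and r.value = s2.synset_id;
"""
pprint(dannet.db.query(query).values.tolist())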

This query returns 558 pairs of near synonyms. These are mostly nouns, though there are a few adjectives, e.g., halvtosset/småtosset.
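
The join behind the near-synonym query can be tried on a toy database with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the table layout (`synsets` with `synset_id` and `label`, `relations` with `synset_id`, `name` and `value`) is inferred from the query text, and the identifiers and rows are made up.

```python
import sqlite3

# In-memory toy database; the table layout is inferred from the query
# and the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE synsets (synset_id TEXT PRIMARY KEY, label TEXT);
    CREATE TABLE relations (synset_id TEXT, name TEXT, value TEXT);
    INSERT INTO synsets VALUES
        ('s1', 'halvtosset'),
        ('s2', 'småtosset'),
        ('s3', 'hund');
    INSERT INTO relations VALUES
        ('s1', 'nearSynonymOf', 's2'),
        ('s3', 'hyponymOf', 's1');
""")

# Each relation row is joined to the synsets table twice: once for the
# source synset (synset_id) and once for the target synset (value).
QUERY = """
    SELECT s1.label, s2.label
    FROM relations r
    JOIN synsets s1 ON r.synset_id = s1.synset_id
    JOIN synsets s2 ON r.value = s2.synset_id
    WHERE r.name = 'nearSynonymOf'
"""
pairs = conn.execute(QUERY).fetchall()
print(pairs)
```

Because both joins are inner joins, a relation whose source or target synset is missing from the synsets table produces no row in the result.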

References

  1. http://wordnet.dk/dannet/dannetspecifikationer_v2.pdf