The lexicon is the part of the grammatical dictionary that contains the following information:
1. Morphological categories and attributes.
2. Parts of speech, e.g. adverb.
3. Word entries.
4. Phrase entries.
All lexicon data is accessible through the C-style API or through the ORM library.
A word entry consists of a head and a body.
The head of a word entry is a name plus a part-of-speech reference. In most cases this pair is unique among all word entries in the lexicon. The rare exceptions are homonyms: approximately 250 pairs among more than 100,000 entries in the Russian lexicon.
The name of a word entry is usually the base form of the word: the nominative singular for nouns, the infinitive for verbs, and so on.
The entry body has two main components: 1) the list of grammatical attributes and 2) the set of grammatical forms.
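The head/body split described above can be pictured with a small in-memory model. This is only an illustrative sketch: the class names `EntryHead` and `WordEntry`, the attribute names, and the sample data are all hypothetical and are not taken from the SDK or its ORM library.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class EntryHead:
    """Hypothetical head of a word entry: base form plus part of speech."""
    name: str            # base form, e.g. nominative singular or infinitive
    part_of_speech: str  # e.g. "NOUN", "VERB", "ADVERB"


@dataclass
class WordEntry:
    """Hypothetical word entry: a head and a body (attributes + forms)."""
    head: EntryHead
    # Body, part 1: grammatical attributes that hold for the whole entry.
    attributes: Dict[str, str] = field(default_factory=dict)
    # Body, part 2: grammatical forms, each with its own attribute values.
    forms: List[Tuple[str, Dict[str, str]]] = field(default_factory=list)


# Made-up sample entry, for illustration only.
cat = WordEntry(
    head=EntryHead("cat", "NOUN"),
    attributes={"countability": "countable"},
    forms=[
        ("cat", {"number": "singular"}),
        ("cats", {"number": "plural"}),
    ],
)

# Heads are usually unique, but homonyms share one head, so an index
# keyed by head stores a list of entries rather than a single entry.
index: Dict[EntryHead, List[WordEntry]] = {}
index.setdefault(cat.head, []).append(cat)
```

Keeping the head immutable and hashable lets it serve directly as a dictionary key, while the list value leaves room for the rare homonym pairs.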
More about word entries in the grammatical dictionary ...
The Lexicon utility is included in the Grammatical Dictionary SDK for Windows and Linux. It provides the following options:
Lexicon lookup: searching the lexicon for word and phrase entries given any word form.
Thesaurus lookup: showing all related words and phrases for a given word or phrase.
N-gram database lookup: showing all N-grams containing the given word.
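As a rough illustration of the last option, the sketch below counts N-grams in a tiny made-up token list and then returns those containing a given word. The helper names `build_ngrams` and `ngrams_containing` are hypothetical; the utility itself queries a prebuilt N-gram database whose format is not described here.

```python
from collections import Counter


def build_ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngrams_containing(ngrams, word):
    """Return (N-gram, count) pairs that include the word, most frequent first."""
    hits = [(gram, count) for gram, count in ngrams.items() if word in gram]
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Made-up sample text, for illustration only.
tokens = "the cat sat on the mat and the cat slept".split()
bigrams = build_ngrams(tokens, 2)
result = ngrams_containing(bigrams, "cat")
# result == [(("the", "cat"), 2), (("cat", "sat"), 1), (("cat", "slept"), 1)]
```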
Start the program and choose [1] in the startup menu to enter lexicon lookup mode. In this mode the program displays the word forms and phrases matching the input pattern.
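Finding an entry from any of its word forms amounts to a reverse index from form to entry head. The following is a minimal sketch of that idea over made-up English data; `LEXICON`, `build_form_index`, and `find_entries` are hypothetical names, and the utility's internal data structures may well differ.

```python
from collections import defaultdict

# Toy lexicon: (base form, part of speech) -> list of word forms.
# Sample data is invented for illustration.
LEXICON = {
    ("go", "VERB"): ["go", "goes", "went", "gone", "going"],
    ("goose", "NOUN"): ["goose", "geese"],
    ("saw", "NOUN"): ["saw", "saws"],
    ("see", "VERB"): ["see", "sees", "saw", "seen", "seeing"],
}


def build_form_index(lexicon):
    """Map every lowercased word form to the heads of the entries that own it."""
    index = defaultdict(list)
    for head, forms in lexicon.items():
        for form in forms:
            index[form.lower()].append(head)
    return index


FORM_INDEX = build_form_index(LEXICON)


def find_entries(word_form):
    """Return the heads of all entries having this word form, sorted."""
    return sorted(FORM_INDEX.get(word_form.lower(), []))


# find_entries("went") == [("go", "VERB")]
# find_entries("saw")  == [("saw", "NOUN"), ("see", "VERB")]
```

A single form can belong to several entries, as "saw" does here, which is why each index value is a list.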
Wildcards are also allowed, so a glob pattern can be used to find the matching word forms.
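Glob matching over a list of word forms can be sketched with Python's standard `fnmatch` module. This assumes the usual glob wildcards (`*` for any run of characters, `?` for exactly one character); the exact wildcard set supported by the utility is not stated here, and `WORD_FORMS` and `glob_search` are hypothetical names.

```python
import fnmatch

# Made-up list of word forms, for illustration only.
WORD_FORMS = ["read", "reads", "reader", "reading", "bread", "red"]


def glob_search(pattern, word_forms):
    """Return the word forms matching a glob pattern, in their original order."""
    # fnmatchcase is used so matching stays case-sensitive on every platform.
    return [w for w in word_forms if fnmatch.fnmatchcase(w, pattern)]


# glob_search("read*", WORD_FORMS) == ["read", "reads", "reader", "reading"]
# glob_search("?read", WORD_FORMS) == ["bread"]
```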
A tilde in front of the input word enables fuzzy search.
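The matching method behind the fuzzy search is not described here, so the sketch below substitutes plain Levenshtein edit distance as a stand-in, only to show what a tilde-prefixed query might return. The names `edit_distance` and `fuzzy_search`, the `max_distance` threshold, and the sample words are all assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (two-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def fuzzy_search(query, word_forms, max_distance=1):
    """Return word forms within max_distance edits of the query, closest first."""
    # A leading tilde only marks the query as fuzzy; drop it before comparing.
    word = query[1:] if query.startswith("~") else query
    scored = sorted((edit_distance(word, w), w) for w in word_forms)
    return [w for distance, w in scored if distance <= max_distance]


# Made-up list of word forms, for illustration only.
WORD_FORMS = ["grammar", "grammars", "glamour", "hammer", "program"]

# fuzzy_search("~grammer", WORD_FORMS) == ["grammar"]
```

Raising `max_distance` widens the net: with a threshold of 2 the same query also returns "grammars" and "hammer".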
The following is a short list of C-style API functions for accessing the lexicon:
sol_FindEntry - search for the word entry.
sol_FindPhrase - search for the phrase entry.
WordEntry class in ORM library
© Козиев Илья 2019
changed 05-Feb-12