Parts of speech in the lexicon

A part of speech is a group of words with similar syntactic and morphological behaviour. English verbs and Russian adjectives are examples of parts of speech. Each word entry in the lexicon belongs to one of the declared parts of speech.

The list of available parts of speech is not hardcoded. It is read from the dictionary source files during compilation and stored in the dictionary database files. A developer can use an API call such as sol_FindClass to access this list of parts of speech.

Part of speech declaration

A part of speech declaration consists of the following elements.

1. Language reference. The dictionary compiler uses it to validate word entry definitions: the entry name and its forms must not contain characters outside the alphabet of that language. Morphological analysis also uses the language reference in the part of speech declaration to guess the language of a sentence.

2. List of morphological attributes that must be defined for every word entry belonging to this part of speech. Gender for Russian nouns is an example of such an attribute. The dictionary compiler checks that each word entry definition contains all required morphological attributes. The API function sol_GetCoordType returns 0 for such attributes.

3. List of optional morphological attributes, which can be defined for some word entries. The modality flag is an example of such an attribute. The API function sol_GetCoordType returns 2 for this kind of attribute.

4. List of morphological attributes that describe word inflection, e.g. case, number and gender for Russian adjectives. The API function sol_GetCoordType returns 1 for such morphological attributes.
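
The four elements above can be pictured as a small data structure. The following Python sketch is only an illustration, not the engine's actual implementation: the class PartOfSpeechDecl, its fields and its methods are hypothetical names, and the attribute lists chosen for the Russian noun are assumptions made for the example. Only the codes 0, 1 and 2 are taken from the description of sol_GetCoordType above.

```python
from dataclasses import dataclass

# Codes follow the values the text gives for sol_GetCoordType.
REQUIRED, FLEXION, OPTIONAL = 0, 1, 2


@dataclass
class PartOfSpeechDecl:
    """Hypothetical in-memory model of a part of speech declaration."""
    name: str
    language: str
    alphabet: frozenset                  # characters allowed in entry names
    required: frozenset = frozenset()    # must be set on every word entry
    optional: frozenset = frozenset()    # may be set on some word entries
    flexion: frozenset = frozenset()     # vary across the forms of an entry

    def coord_type(self, attribute):
        """Return 0, 1 or 2 for a declared attribute, or None if unknown."""
        if attribute in self.required:
            return REQUIRED
        if attribute in self.flexion:
            return FLEXION
        if attribute in self.optional:
            return OPTIONAL
        return None

    def validate_entry(self, entry_name, attributes):
        """Return a list of problems, imitating the two compiler checks."""
        problems = []
        bad = sorted(set(entry_name.lower()) - self.alphabet)
        if bad:
            problems.append("characters outside the alphabet: " + "".join(bad))
        missing = sorted(self.required - set(attributes))
        if missing:
            problems.append("missing required attributes: " + ", ".join(missing))
        return problems


# Illustrative declaration; the attribute lists are assumptions.
RUSSIAN_ALPHABET = frozenset("абвгдеёжзийклмнопрстуфхцчшщъыьэюя")
noun_ru = PartOfSpeechDecl(
    name="существительное",
    language="Russian",
    alphabet=RUSSIAN_ALPHABET,
    required=frozenset({"gender"}),
    flexion=frozenset({"case", "number"}),
)

print(noun_ru.validate_entry("делание", {"gender": "neuter"}))
print(noun_ru.validate_entry("doing", {}))
```

The first call reports no problems, while the second reports both a character check failure and a missing required attribute.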

Parts of speech in a multilingual dictionary

A grammatical dictionary can contain word entries from two or more languages. The coexistence of several languages causes no trouble, because each part of speech declaration is linked to its language. For this reason, similar parts of speech in different languages are declared completely independently. For example, nouns in Russian and English have different names (существительное and noun, respectively) and completely different descriptions, that is, different lists of morphological attributes.
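
As a rough illustration of this independence, the sketch below keeps two noun declarations side by side and selects them by language. The dictionary layout, the function name parts_of_speech_for and the attribute lists are assumptions made for the example; they do not reproduce the real dictionary format.

```python
# Illustrative declarations; layout and attribute lists are assumptions.
declarations = {
    "существительное": {
        "language": "Russian",
        "attributes": {"gender", "case", "number"},
    },
    "noun": {
        "language": "English",
        "attributes": {"number"},
    },
}


def parts_of_speech_for(language):
    """Return the names of all parts of speech linked to one language."""
    return sorted(
        name
        for name, decl in declarations.items()
        if decl["language"] == language
    )


print(parts_of_speech_for("Russian"))
print(parts_of_speech_for("English"))
```

Because every declaration carries its own language link and attribute list, adding a third language would not require touching the existing two entries.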

Parts of speech in the Russian language

The main parts of speech in the Russian section of the grammatical dictionary include:

существительное (noun): делание (doing)

глагол (verb): делать (to do)

деепричастие (adverbial participle): делая (by doing something)

причастие (participle): делающий, делавший (one who is doing, one who was doing)

прилагательное (adjective): быстрый (quick)

наречие (adverb): быстро (quickly)

местоимение (pronoun): мы (we, us)

Accessing the parts of speech

The complete list of parts of speech can be accessed via the Dictionary class in the ORM library. The PartOfSpeech class represents a part of speech description, namely its ID, name, language reference and related grammatical attributes.

Each part of speech has a unique name and a numerical ID. The IDs are defined as C/C++ constants in the API file _sg_api.h, for example:

NOUN_ru - ID of Russian nouns

VERB_en - ID of English verbs

The list of constants for part of speech IDs is also available to .NET developers. The constants are defined in the gren_consts.dll assembly, in the namespace SolarixGrammarEngineNET.GrammarEngineAPI.

Another way to get a part of speech ID is to use the API function sol_FindClass. The function sol_GetClassName returns the name of the part of speech with the specified ID.
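
The two directions of lookup (name to ID and ID to name) can be modelled with an enumeration. This Python sketch does not call the real API: find_class and get_class_name are hypothetical stand-ins, the numeric values are placeholders rather than the values in _sg_api.h, the lookup uses the constant names rather than the part of speech names stored in the dictionary, and the return values for unknown input (-1 and an empty string) are assumptions.

```python
from enum import IntEnum


class PartOfSpeechId(IntEnum):
    # Placeholder values: the real numbers are defined in _sg_api.h.
    NOUN_ru = 1
    VERB_en = 2


def find_class(name):
    """Return the ID for a constant name, or -1 if the name is unknown."""
    try:
        return int(PartOfSpeechId[name])
    except KeyError:
        return -1


def get_class_name(class_id):
    """Return the constant name for an ID, or "" if the ID is unknown."""
    try:
        return PartOfSpeechId(class_id).name
    except ValueError:
        return ""


noun_id = find_class("NOUN_ru")
print(noun_id, get_class_name(noun_id))
```

Looking an ID up by name at run time avoids depending on the numeric values themselves, which belong to the compiled dictionary rather than to the calling code.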

The SQL dictionary makes it possible to enumerate all available parts of speech with an SQL query against the SG_CLASS table.
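
A query of this kind might look like the sketch below. It uses an in-memory SQLite table as a stand-in for a real SQL dictionary. Only the table name SG_CLASS comes from the text; the column names id and name, the sample rows and the function list_parts_of_speech are assumptions made for the example and should be checked against the actual schema.

```python
import sqlite3


def list_parts_of_speech(conn):
    """Return (id, name) pairs for every row of SG_CLASS, ordered by ID."""
    cursor = conn.execute("SELECT id, name FROM sg_class ORDER BY id")
    return cursor.fetchall()


# Stand-in for a real SQL dictionary: an in-memory table, placeholder rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sg_class (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO sg_class (id, name) VALUES (?, ?)",
    [(2, "noun"), (1, "существительное")],
)

for class_id, name in list_parts_of_speech(conn):
    print(class_id, name)
```

With a real dictionary, the connection would point at the exported database instead of the in-memory table, and the same SELECT would enumerate every declared part of speech.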

Additional information

changed 05-Feb-12