Monday, February 7, 2011

Grammatical Dictionary v.9.10

The changelog for this Grammatical Dictionary release is as follows:

1. The order of entries in search results has been changed: the best-matching entries are now placed at the top of the list.

2. English nouns, verbs, adjectives and adverbs are shown with stress marks when stress information is available.

3. English lexicon update: ~50 noun-verb pairs in which the noun is stressed on the first syllable and the verb on the second, e.g. áccent-accént.
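
The stress marking described in items 2 and 3 can be illustrated with a minimal Python sketch. It is not the dictionary's implementation: the `STRESS` table, its layout (index of the stressed vowel letter per part of speech) and the `with_stress` helper are all hypothetical names introduced for this example. It places a combining acute accent after the stressed vowel and falls back to the plain word when no stress information is available.

```python
import unicodedata

# Hypothetical mini-lexicon (not the real dictionary data format):
# word -> {part of speech: index of the stressed vowel letter}.
STRESS = {
    "accent": {"noun": 0, "verb": 3},
    "record": {"noun": 1, "verb": 3},
    "permit": {"noun": 1, "verb": 4},
}

def with_stress(word, pos):
    """Return the word with an acute accent on its stressed vowel.

    If no stress information is known for (word, pos), the word is
    returned unchanged.
    """
    index = STRESS.get(word, {}).get(pos)
    if index is None:
        return word
    # Insert U+0301 (combining acute accent) right after the stressed vowel,
    # then compose it into a single precomposed character where one exists.
    marked = word[: index + 1] + "\u0301" + word[index + 1 :]
    return unicodedata.normalize("NFC", marked)

if __name__ == "__main__":
    for pos in ("noun", "verb"):
        print(pos, with_stress("accent", pos))
```

Storing a letter index rather than a pre-accented string keeps the plain spelling usable as the lookup key, so the same entry can be displayed with or without stress marks.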



The program installer for MS Windows is available for testing (~32 MB).

Monday, December 27, 2010

Sunday, December 5, 2010

SQL and Non-SQL dictionary components

The following components are stored in relational DBs:

1. Morphology - definition of parts of speech, grammatical categories and so on.
2. Word and phrase entries (lexicon)
3. Thesaurus
4. Lemmatization engine
5. N-grams
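
To make the idea of relational storage more concrete, here is a rough Python sketch using the standard `sqlite3` module. The table and column names (`part_of_speech`, `entry`, `bigram`) are assumptions made up for this illustration and do not reflect the actual SQL Dictionary schema; the sketch only shows how morphology, lexicon and N-gram data could sit in related tables.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a hypothetical, simplified layout."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        -- Morphology: parts of speech (assumed table name).
        CREATE TABLE part_of_speech (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        -- Lexicon: word entries, each linked to a part of speech.
        CREATE TABLE entry (
            id     INTEGER PRIMARY KEY,
            name   TEXT NOT NULL,
            pos_id INTEGER NOT NULL REFERENCES part_of_speech(id)
        );
        -- N-grams: here only 2-grams with a frequency counter.
        CREATE TABLE bigram (
            word1 TEXT NOT NULL,
            word2 TEXT NOT NULL,
            freq  INTEGER NOT NULL,
            PRIMARY KEY (word1, word2)
        );
        """
    )
    con.executemany(
        "INSERT INTO part_of_speech (id, name) VALUES (?, ?)",
        [(1, "noun"), (2, "verb")],
    )
    con.executemany(
        "INSERT INTO entry (name, pos_id) VALUES (?, ?)",
        [("accent", 1), ("accent", 2), ("record", 1)],
    )
    con.execute(
        "INSERT INTO bigram (word1, word2, freq) VALUES (?, ?, ?)",
        ("strong", "accent", 12),
    )
    con.commit()
    return con

def entries_named(con, name):
    """Return (entry name, part of speech) rows for a given entry name."""
    cursor = con.execute(
        "SELECT e.name, p.name FROM entry AS e "
        "JOIN part_of_speech AS p ON p.id = e.pos_id "
        "WHERE e.name = ? ORDER BY p.name",
        (name,),
    )
    return cursor.fetchall()

if __name__ == "__main__":
    db = build_demo_db()
    print(entries_named(db, "accent"))
```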

The following dictionary parts are not stored in an SQL DB:

1. Alphabet
2. Rules for morphological and syntactic analysis, text segmentation and translation
3. Stemmer


Read more about SQL Dictionary and Persistent Dictionary ORM

Getting started with grammatical dictionary SDK

The grammatical dictionary's C-style API comprises 180+ functions, counting the wide and utf8 versions as distinct; see sol_GetEntryName for an example.
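
The reason for paired wide and utf8 variants can be seen from how the same string looks in the two C representations. The Python sketch below uses only the standard `ctypes` module and does not call the SDK; the real signature of sol_GetEntryName is not reproduced here, and the sample string is just an illustration.

```python
import ctypes

name = "\u00e1ccent"  # "áccent": 6 characters, one of them non-ASCII

# utf8 variant: a NUL-terminated char buffer holding UTF-8 bytes.
utf8_buf = ctypes.create_string_buffer(name.encode("utf-8"))

# wide variant: a NUL-terminated wchar_t buffer, one element per character
# for characters in the Basic Multilingual Plane.
wide_buf = ctypes.create_unicode_buffer(name)

if __name__ == "__main__":
    print("utf8 elements (incl. NUL):", len(utf8_buf))
    print("wide elements (incl. NUL):", len(wide_buf))
    print("wchar_t size on this platform:", ctypes.sizeof(ctypes.c_wchar))
```

The buffer lengths differ because the accented letter takes two bytes in UTF-8 but a single `wchar_t` element, so a caller has to size buffers according to the variant it uses.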

It can be difficult for a developer to choose the right function. The sample programs and their source code can help - read more about them.

Saturday, December 4, 2010

How to add new word entries to the dictionary

The Grammatical Dictionary SDK contains everything needed to extend the dictionary:

2. Basic Russian dictionary (more than 120,000 word entries)
3. Sample text files with word entry definitions
4. Shell script that loads the basic dictionary, parses the word entry definition file, merges the two and stores the new dictionary data files in .../bin-linux.