Daoulagad — A Celto-Slavic OCR Dictionary
Dmitri Hrapof
Abstract
In this paper we present Daoulagad [dɔwˈlaːgat], a mobile Celtic-Russian dictionary with Optical Character Recognition (OCR) support. The dictionary provides Cymraeg↔Русский, Cymraeg↔English, Cymraeg↔Gaeilge, Cymraeg↔Brezhoneg, Gaeilge↔Русский, Gaeilge↔English, Brezhoneg↔Русский and English↔Русский translations. It handles initial consonant mutations and supports both the ‘Item and Arrangement’ and the ‘Word and Paradigm’ morphological models. OCR makes it possible to use the camera of an iPhone or Android phone as an input device. OCR errors are corrected using trigram frequencies calculated over an extensive corpus. Also supported are Belarusian, Bulgarian, Croatian/Serbian, Czech, Polish, Slovak, Slovenian and Ukrainian (as well as English, French, German, Italian, Latin, Portuguese, Spanish, Thai, Arabic and hanzi/kanji).
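To illustrate the idea of trigram-based OCR correction, here is a minimal sketch in Python. It is not the implementation used in Daoulagad. It assumes character-level trigrams (the abstract does not say whether the trigrams are over characters or words), add-one smoothing, a toy six-word Welsh corpus, and a hypothetical glyph confusion table; the names `CONFUSIONS`, `train`, `score`, `candidates` and `correct` are all invented for this example.

```python
import math
from collections import Counter
from itertools import product

# Hypothetical confusion table: an OCR glyph and what it may really have been.
CONFUSIONS = {"0": "o", "1": "l", "5": "s", "c": "e", "e": "c"}


def trigrams(word):
    """Return the character trigrams of a word, padded with boundary markers."""
    padded = f"^^{word}$"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def train(corpus_words):
    """Count character trigrams over a list of corpus words."""
    counts = Counter()
    for word in corpus_words:
        counts.update(trigrams(word))
    return counts


def score(word, counts):
    """Sum of add-one-smoothed log relative frequencies of the word's trigrams."""
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen trigrams
    return sum(math.log((counts[t] + 1) / (total + vocab)) for t in trigrams(word))


def candidates(word):
    """Yield every spelling reachable by swapping confusable glyphs.

    All candidates have the same length, so their scores are comparable.
    The number of candidates grows exponentially with the number of
    confusable glyphs, which is acceptable for short dictionary words.
    """
    options = [ch + CONFUSIONS.get(ch, "") for ch in word]
    return ("".join(chars) for chars in product(*options))


def correct(word, counts):
    """Pick the candidate spelling with the highest trigram score."""
    return max(candidates(word), key=lambda c: score(c, counts))


# Toy corpus, assumed for illustration only.
corpus = ["llyfr", "lle", "llaw", "lleol", "ysgol", "golau"]
counts = train(corpus)
print(correct("11yfr", counts))  # prints "llyfr"
```

In this sketch the misread string "11yfr" is restored to "llyfr" because the trigrams of the corrected spelling all occur in the corpus, while those containing the digit "1" do not. A real system would estimate frequencies from a much larger corpus and would likely weight substitutions by how probable each OCR confusion is.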