Albanian is a synthetic-analytic language with highly developed inflection and, consequently, a rich system of grammatical forms. Computational models of morphological forms are essential for building spelling and grammar checkers and for many other natural language applications, including parsing, lemmatization, corpus annotation, text generation, machine translation, document retrieval, and e-learning. For Albanian, building such models presents many difficulties and challenges. In this paper, we describe the creation of a computational morphological model of the Albanian verbal system. The Albanian verb has the grammatical categories of person, number, tense, mood, and diathesis. These categories are expressed by a very large number of grammatical forms, which are built by different means: personal endings, phonetic changes of the verb stem, formative suffixes, suppletion, and combinations of these. To create digital morphological models of Albanian verbs and to assign morphological labels and lemmas, we prepared formulas based on the different stems of each verb; these formulas generate all verb forms for every mood, tense, person, number, etc. The result is a small group of inductive, representative models that, despite the structural complexity and the diverse means of forming verbs, complete the forms of each Albanian verb automatically and as accurately as possible.
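The idea of stem-based formulas can be illustrated with a minimal Python sketch. It is not the model described in this paper: the names `STEMS`, `FORMULAS`, `generate`, and `analyze`, the label layout, and the formula format are all assumptions made for illustration. The sketch covers only three active indicative tenses of a single verb, punoj ('to work'), and uses a second stem for the aorist plural to show a phonetic change of the stem.

```python
# Hypothetical stem inventory: each lemma lists the stems its formulas refer to.
STEMS = {
    "punoj": {"base": "puno", "aor_pl": "punua"},
}

# Person/number slots, in the order used by every formula below.
SLOTS = [("1", "sg"), ("2", "sg"), ("3", "sg"),
         ("1", "pl"), ("2", "pl"), ("3", "pl")]

# Hypothetical formulas: (mood, tense) -> one (stem key, ending) pair per slot.
FORMULAS = {
    ("indicative", "present"): [
        ("base", "j"), ("base", "n"), ("base", "n"),
        ("base", "jmë"), ("base", "ni"), ("base", "jnë"),
    ],
    ("indicative", "imperfect"): [
        ("base", "ja"), ("base", "je"), ("base", "nte"),
        ("base", "nim"), ("base", "nit"), ("base", "nin"),
    ],
    ("indicative", "aorist"): [
        ("base", "va"), ("base", "ve"), ("base", "i"),
        ("aor_pl", "m"), ("aor_pl", "t"), ("aor_pl", "n"),
    ],
}


def generate(lemma):
    """Return every form the formulas produce for a lemma, with its labels."""
    stems = STEMS[lemma]
    forms = []
    for (mood, tense), cells in FORMULAS.items():
        for (person, number), (stem_key, ending) in zip(SLOTS, cells):
            forms.append({
                "form": stems[stem_key] + ending,
                "lemma": lemma,
                "mood": mood,
                "tense": tense,
                "person": person,
                "number": number,
            })
    return forms


def analyze(word):
    """Return all analyses (lemma plus labels) whose generated form equals word."""
    return [entry for lemma in STEMS
            for entry in generate(lemma) if entry["form"] == word]


if __name__ == "__main__":
    for entry in generate("punoj"):
        print(entry["form"], entry["mood"], entry["tense"],
              entry["person"] + entry["number"])
```

Running the generator in reverse, as `analyze` does, yields the lemma and labels for a given word form; an ambiguous form such as punon returns more than one analysis, because the second and third person singular of the present coincide.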