Innovation in Language Learning

Edition 18

Accepted Abstracts

Automatic Generation of a Graded Reader in Old Church Slavonic

Iglika Nikolova-Stoupak, Sorbonne Université (France)

Gaël Lejeune, Sorbonne Université (France)

Eva Schaeffer-Lacroix, Sorbonne Université (France)

Abstract

Comprehensible input, and graded readers as a case study of its application, have been highly valued in language education over the past few decades. Graded readers have even been extended to the so-called “dead” or classical languages, typically represented by Latin and Greek. Immersive reading of, and listening to, adapted texts in these languages has been shown to increase students’ proficiency, independence and motivation when used either in isolation or in combination with more traditional approaches, such as the grammar-translation method [4, 9, 12, 14]. However, both the associated resources and the range of classical languages represented remain scarce. The present study will investigate the current potential for automatic generation of adapted classical-language readers, focusing on Old Church Slavonic. Dating from the 9th century, Old Church Slavonic is considered to be the earliest written Slavic language, which later branched into the modern-day West, East and South Slavic languages. It is written in the Cyrillic script and, more rarely and in earlier sources, in the Glagolitic script. Major challenges in the acquisition and interpretation of the language include its occasional borrowings from Greek, in both vocabulary and grammar, and the large geographical territory over which it was used, which gave rise to dialects that sometimes differ significantly [7, 8]. The study will proceed in three steps: 1) The linguistic characteristics of professionally produced classical-language readers at several levels will be analysed, with a focus on atomic readability-based characteristics [1, 6]. 2) Automatic generation of adapted Old Church Slavonic text will be attempted using a strong multilingual LLM such as GPT-4 in a one-shot setting [11]. Fine-tuning a model on a large amount of Old Church Slavonic text will also be considered [3, 5]. 3) The quality of the generated text will be assessed through both human evaluation and a comparison of its textual characteristics with those of the professional texts analysed in step 1). The primary texts to be considered include the Latin Lingua Latina per se Illustrata (Ørberg, 1994), the Greek Logos (Martínez, 2023) and the Hebrew Jonah (Moranville, 2025), as well as the following unabridged Old Church Slavonic texts: the Biblical story of Adam and Eve, “The Life of Saint Clement of Ohrid”, and “The Passion of Saint George”.
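The comparison of textual characteristics described in steps 1) and 3) can be illustrated with a minimal Python sketch. The abstract does not specify which readability features are used, so the three surface features below (average sentence length, average word length, type-token ratio), the function names and the Latin sample strings are all assumptions made for illustration, not the authors' method.

```python
import re

def surface_features(text: str) -> dict:
    """Compute a few atomic, surface-level readability features.

    The feature set is an illustrative assumption, not the study's own.
    """
    # Crude sentence split on sentence-final punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # \w+ is Unicode-aware in Python 3, so Cyrillic letters are matched.
    # Combining marks are not, so real Old Church Slavonic text would
    # need a more careful tokeniser.
    words = re.findall(r"\w+", text.lower())
    if not sentences or not words:
        raise ValueError("text contains no sentences or words")
    return {
        "sentences": len(sentences),
        "words": len(words),
        "avg_sentence_len": len(words) / len(sentences),
        "avg_word_len": sum(len(w) for w in words) / len(words),
        "type_token_ratio": len(set(words)) / len(words),
    }

def feature_gap(reference: dict, candidate: dict) -> dict:
    """Absolute difference per feature between two feature dictionaries."""
    keys = ("avg_sentence_len", "avg_word_len", "type_token_ratio")
    return {k: abs(reference[k] - candidate[k]) for k in keys}

# Hypothetical placeholder strings, not taken from any of the readers.
reference_text = "Puer librum legit. Puella cantat."
generated_text = "Puer librum magnum in schola legit et puella cantat."

gap = feature_gap(surface_features(reference_text),
                  surface_features(generated_text))
print(gap)
```

Smaller gaps would suggest that a generated passage is closer to the reference level on these particular features; such a measure would complement the human evaluation named in step 3) and could not replace it.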

 

Keywords

classical languages, graded readers, large language models (LLMs)

 

REFERENCES

[1] Biber, D. (1988). Variation across Speech and Writing. Cambridge: Cambridge University Press.

[2] Carbonell, S., Gallego, Á. L., & del Río, M. (2016). Lingua Latina per se Illustrata: propuestas para un enfoque didáctico comunicativo. Thamyris, nova series, 7. https://www.revistas.uma.es/index.php/thamyris/article/view/18648

[3] Chowdhery, A., Narang, S., Devlin, J., Bosma, M., Mishra, G., Roberts, A., … & Fiedel, N. (2022). PaLM: Scaling Language Modeling with Pathways. arXiv. https://arxiv.org/abs/2204.02311

[4] Diller, K. C., & Walsh, T. M. (1978). “Living” and “Dead” Languages: A Neurolinguistic Distinction. In J. G. Savard & L. Laforge (Eds.), Actes du 5e Congrès de l’Association Internationale de Linguistique Appliquée. Montreal: Les Presses de l’Université Laval.

[5] Dodge, J., Ilharco, G., Schwartz, R., Farhadi, A., Hajishirzi, H., & Smith, N. (2020). Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping. arXiv. https://arxiv.org/abs/2002.06305

[6] DuBay, W. H. (2007). The Classic Readability Studies. ERIC Clearinghouse.

[7] Duridanov, I. (Ed.). (1991). Gramatika na Starobalgarskiya Ezik. Sofia: Bulgarian Academy of Sciences.

[8] Lunt, H. G. (2001). Old Church Slavonic Grammar (7th ed.). Mouton de Gruyter.

[9] McMenamin, C. (2022). Greek Club: Resurrecting Dead Languages in Secondary Schools. Journal of Classics Teaching, 23(46), 121–123. https://doi.org/10.1017/S2058631022000058

[10] Nation, P., & Wang, K. (1999). Graded Readers and Vocabulary. Reading in a Foreign Language, 12(2), 355–379.

[11] Nikolova-Stoupak, I., Lejeune, G., & Schaeffer-Lacroix, E. (2024). Contemporary LLMs and Literary Abridgement: An Analytical Inquiry. In Proceedings of the Sixth International Conference on Computational Linguistics in Bulgaria (CLIB 2024) (pp. 39–57). Sofia, Bulgaria: Department of Computational Linguistics, Institute for Bulgarian Language, Bulgarian Academy of Sciences.

[12] Philips, F. C. (1988). The Language Laboratory and the Teaching of “Dead” Languages. The Classical World, 82(2), 105–108. https://doi.org/10.2307/4350305

[13] Thongsan, N. C. (2023). Vocabulary uptake and retention from reading a graded reader. LEARN Journal: Language Education and Acquisition Research Network, 16(2), 154–167.

[14] Venditti, E. (2021). Using Comprehensible Input in the Latin Classroom to Enhance Language Proficiency. Journal of Classics Teaching, 22(43), 22–28. https://doi.org/10.1017/S2058631021000039

[15] Wan-a-rom, U. (2008). Comparing the vocabulary of different graded-reading schemes. Reading in a Foreign Language, 20(1), 43–69.

[16] Waring, R., & Takaki, M. (2003). At what rate do learners learn and retain new vocabulary from reading a graded reader? Reading in a Foreign Language, 15, 130–163.

[17] Yang, Z. (2024). Enhancing the Comprehension: Text Simplification Approaches and the Role of Large Language Models (Doctoral dissertation, Temple University).

 
