The Future of Education

Edition 14

Accepted Abstracts

Challenges in Compiling Expert Corpora for Academic Writing Support

Roxana Rogobete, West University of Timișoara (Romania)

Mădălina Chitez, West University of Timișoara (Romania)

Valentina Mureșan, West University of Timișoara (Romania)

Bogdan Damian, West University of Timișoara (Romania)

Adrian Duciuc, West University of Timișoara (Romania)

Claudiu Gherasim, West University of Timișoara (Romania)

Ana-Maria Bucur, West University of Timișoara / University of Bucharest (Romania)


This paper explores a series of challenges faced by Romanian scholars in building discipline-specific expert corpora for academic writing. Such corpora are useful for teaching and researching disciplinary writing in L1 Romanian and L2 English. Many study programs in Romania are taught in English (IT, Political Science and Economics, for instance), and English has long been regarded as the main academic lingua franca (Mauranen & Ranta 2008). As a result, most of the academic articles relevant to many disciplines are published in English, and papers written in English also have a broader impact. The study is based on a bilingual comparable corpus compiled within the DACRE project (Discipline-specific expert academic writing in Romanian and English: corpus-based contrastive analysis models), which started in 2021 and is financed by the Romanian Executive Unit for Financing Higher Education, Research, Development and Innovation (UEFISCDI). Within the project, we aim to popularise corpora in higher education and to create digital instruments and methodological models useful to the national and international language-related research community. The project seeks to identify salient linguistic and rhetorical features specific to each discipline (see Boettger 2016) and each language variety (Romanian, English L1 and L2), as extracted from peer-reviewed scientific articles. At the initial stage of corpus compilation, when assessing the linguistic resources to be included in the corpus, a multitude of challenges emerge. For example, the linguistic level of these resources is not consistent (see Yilmaz and Römer 2020). Other difficulties we encountered were data availability (open-access or subscription-based sources), the lack of recent resources for certain corpus batches, “multi-authorship” when determining L1 texts, and, most importantly, legal aspects (i.e. copyright).
By describing, comparing and analyzing data collection barriers, we propose a model for expert corpus building in English vs in low-resource languages such as Romanian. 

Keywords: Romanian vs English academic writing, bilingual expert corpora, discipline-specific writing, Romanian expert corpus, DACRE corpus.


  • Boettger, R. (2016). Using corpus-based instruction to explore writing variation across the disciplines: A case history in a graduate-level technical editing course. Across the Disciplines 13(1): 1-21. Accessed: 22 May 2021.
  • Yilmaz, S., and Römer, U. (2020). A corpus-based exploration of constructions in written academic English as a lingua franca, in Römer, U., Cortes, V., and Friginal, E. (eds.). Advances in Corpus-based Research on Academic Writing. Effects of discipline, register, and writer expertise. Amsterdam/Philadelphia: John Benjamins: 59-88.
  • Mauranen, A., and Ranta, E. (2008). English as an Academic lingua franca - the ELFA project. Nordic Journal of English Studies 7(3): 199-202.
