Corpus tools, whether automatic or semi-automatic, can produce a list of all repeated word combinations in a corpus (2-grams, 3-grams, etc.). This method has been used over the last twenty years to analyze lexical bundles in different genres and registers (Paquot & Granger 2012: 135). The results of these analyses are of significant benefit to both linguistic research and foreign language teaching.
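As a rough illustration of what such a list of repeated word combinations involves, the following minimal Python sketch counts the n-grams that occur more than once in a tokenized text. The function name `repeated_ngrams`, the whitespace tokenization, and the sample sentence are assumptions made for this example; they are not taken from the UniCorpus or from any particular corpus tool.

```python
from collections import Counter


def repeated_ngrams(tokens, n, min_count=2):
    """Return the n-grams that occur at least min_count times, with their counts."""
    # Slide a window of length n over the token list.
    grams = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(grams)
    return {gram: c for gram, c in counts.items() if c >= min_count}


# Invented sample sentence (assumption), split naively on whitespace.
text = "sono andato a casa e poi sono andato a scuola"
tokens = text.lower().split()

bigrams = repeated_ngrams(tokens, 2)
trigrams = repeated_ngrams(tokens, 3)

for gram, count in sorted(bigrams.items()):
    print(" ".join(gram), count)
```

A real analysis would need proper tokenization and, for verb-centred bundles, lemmatization or part-of-speech tagging, which this sketch leaves out.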
The present paper uses an enriched version of a corpus that is also part of a diachronic project. The UniCorpus, totalling 22,206 words, was compiled over the last four years from texts written by students of Italian in the Department of Italian Language and Literature of a Greek university. The written productions selected for this analysis belong to the narrative genre. The corpus was analyzed for the n-grams containing the most frequent verbs and verb forms. The purpose is to observe whether specific lexical bundles, which are directly linked to patterns of verb complementation, are overused or underused.
A quantitative analysis of the results shows that the students persist in repetitive patterns. A possible explanation is the confidence they feel when using the same phrases (Nesselhauf 2005: 69), which are probably the ones they know best. Moreover, it becomes apparent that the students' errors, few but recurrent, are related to interference from their mother tongue (Paquot & Granger 2012: 136) in learning Italian.
The analysis of n-grams of this kind offers a technique for identifying linguistic patterns related to one or more parts of speech (in our case, verbs), starting from those common to the whole corpus and proceeding to those that distinguish one student's linguistic variety from another's (Aarts & Granger 2014: 140). This finding can be applied directly in teaching, to improve both the students' performance and the targeting and evaluation of the course.
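The progression from patterns shared by the whole corpus to patterns specific to one student can be sketched as a set comparison. The example below is only an illustrative approximation: it works on plain word n-grams rather than part-of-speech patterns, and the function `shared_and_distinctive`, the student labels, and the sample sentences are all hypothetical.

```python
def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))


def shared_and_distinctive(texts_by_student, n):
    """Split n-grams into those used by every student and those used by only one."""
    per_student = {
        student: set(ngrams(text.lower().split(), n))
        for student, text in texts_by_student.items()
    }
    # N-grams that appear in every student's text.
    shared = set.intersection(*per_student.values())
    distinctive = {}
    for student, grams in per_student.items():
        # N-grams found in any other student's text.
        others = set().union(
            *(g for other, g in per_student.items() if other != student)
        )
        distinctive[student] = grams - others
    return shared, distinctive


# Invented sample texts (assumption), not drawn from the UniCorpus.
samples = {
    "student_a": "ho visto un film",
    "student_b": "ho visto un amico",
}
shared, distinctive = shared_and_distinctive(samples, 2)
print(shared)
print(distinctive)
```

With real learner data, raw set membership would likely be replaced by relative frequencies, so that overuse and underuse can be measured rather than only presence or absence.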
Keywords: Italian language teaching, learner corpora
References
[1] Aarts, J., & Granger, S. (1998). Tag sequences in learner corpora: A key to interlanguage grammar and discourse. In Learner English on computer (pp. 132-141). Routledge. Available from https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=Aarts+granger+2014&btnG=
[2] Anthony, L. (2022). AntConc (Version 4.1.4) [Computer Software]. Tokyo, Japan: Waseda University. Available from https://www.laurenceanthony.net/software
[3] Nesselhauf, N. (2005). Collocations in a Learner Corpus. Amsterdam, the Netherlands: John Benjamins. doi:10.1075/scl.14
[4] Paquot, M., & Granger, S. (2012). Formulaic language in learner corpora. Annual Review of Applied Linguistics, 32, 130–149. doi:10.1017/S0267190512000098. Available from https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=paquot+granger+2012&btnG=