Advantages and limits of text mining software for analysis of students' satisfaction
Zdena Lustigova, ass. prof., Charles University in Prague (Czech Republic)
Abstract
This article examines the possibilities that statistical software packages with data mining and text mining tools offer for the analysis of unstructured text available from various Open Educational Resources.
Information about satisfaction is often multilingual and hidden in user reviews, chat rooms, tea rooms, and other unspecified and unstructured forms of feedback. The authors do not cover the full range of user feedback; they focus on one part of it to illustrate the basic problems researchers may encounter when working with commercial software packages such as Statistica. The unstructured text they processed, consisting of user remarks and reviews, was written and published in 32 languages. The Statistica software did not prove very beneficial, especially for multilingual information processing. Despite all efforts, more than half of the languages used by the users could not be processed. The second major problem was incomplete, poor-quality stoplists, even for common and widely used languages such as English and German.