Investigations into the information structure of scientific and technical texts have become particularly topical with the introduction of new methods of text analysis based on corpora and text processing software.
The concept ‘information’ is closely related to such notions as knowledge, meaning, comprehension, constraint, perception, representation, and communication. Following Shannon, Weaver (cf. 1998) proposed analyzing information at three levels: (1) technical problems associated with the quantification of information; (2) semantic problems relating to meaning; and (3) problems concerning the impact and effectiveness of information on human behavior.
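The first of these levels, the quantification of information, can be illustrated with Shannon entropy computed over the word frequencies of a text. The following is a minimal sketch only: the function name `shannon_entropy`, the whitespace tokenization, and the sample sentence are illustrative assumptions and are not taken from the texts analyzed in this article.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Return the Shannon entropy of a token sequence, in bits per token.

    Assumption: each token is treated as an independent draw from the
    empirical frequency distribution of the sequence (a unigram model).
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty text carries no information
    # H = -sum(p * log2(p)) over the relative frequency p of each distinct token
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical sample sentence, split on whitespace for simplicity.
sample = "the signal carries the message and the noise".split()
print(f"{shannon_entropy(sample):.3f} bits per token")
```

A measure of this kind addresses only level (1): it depends on how often words occur, not on what they mean, so two texts with identical frequency profiles but different content receive the same value.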
Given the general advancement of information technologies in all fields of human activity, text processing tools should be able to perform multiple functions, including classifying texts according to genre and function, distinguishing intra- and cross-disciplinary polysemic items, and decoding different models of meaning extension. Such software can process long texts and/or separate sentences in order to obtain relevant information in a user-friendly way. The challenges of processing information and extracting it from a text are rooted in the fact that even the most advanced statistical methods are incapable of performing many tasks unless they are combined with methods of cognitive analysis. The combination of both approaches is aimed at fast and efficient extraction of value from volume (cf. Scarfe & Shortland 1995:1-4).
The issues addressed in the present article include the information structure of popular scientific and technical texts, their hierarchical organization, and the problems of decoding meaning at different levels during information processing and extraction.
It should be kept in mind that in linguistics “…the term ‘information’ is not meant to be restricted to cognitive knowledge, but includes any possible item which is somehow present in the mental world of individuals, including their preconceptions and prejudices” (Dik 1997:10). Therefore, in order to establish the theoretical framework of the research, semantic, pragmatic, cognitive, and textual analyses of texts on Telecommunications, Architecture, and Civil Engineering have been performed.