During the three years of the project, we want to show which standardizable methods can be defined to ensure this reproducibility. Attempts at a systematic exploration, evaluation and determination of interpretation patterns and possible errors have been strongly criticized up to now, because there are too many possible combinations of data and procedures. The "classical" steps involved in the analysis of text corpora are: reading the texts, normalization, tokenization, counting, measuring (in the case of artificial neural networks: abstraction of causal pairs) and clustering. Each of these steps carries a large amount of information, and consequently a risk of losing that information through an inappropriate procedure. The right choice of tools requires a kind of instruction manual that moves away from the usual, mostly mathematical, attempts at explanation. This manual will include a way of validating intermediate results to extend the dominant error measures, and it will give better insight into the process in order to facilitate interpretation. We intend to work with ancient Greek and Latin text corpora, as these two languages are particularly suitable for understanding the general structure of language. Both are well suited to this task because they offer closed corpora that do not change over time. In addition, unlike English, neither language has a fixed word order, which allows for a much more flexible combination of words, so that we do not have to worry about word order criteria.
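To make the idea of validating intermediate results more concrete, the first "classical" steps (normalization, tokenization, counting, measuring) can be sketched as a small pipeline that records a summary after each stage. This is only an illustrative sketch using the Python standard library, not the project's method: the function names, the choice of normalization (lowercasing and stripping diacritics) and the report fields are all assumptions made for the example.

```python
import re
import unicodedata
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and strip diacritics.

    Assumption for illustration: this is one possible normalization.
    It deliberately discards accent and case information.
    """
    decomposed = unicodedata.normalize("NFD", text.lower())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def tokenize(text: str) -> list[str]:
    """Split text into runs of Unicode word characters."""
    return re.findall(r"\w+", text)


def relative_frequencies(counts: Counter) -> dict[str, float]:
    """A simple 'measuring' step: relative frequency of each word form."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def run_pipeline(raw: str) -> tuple[dict[str, float], dict[str, int]]:
    """Run the steps and record intermediate summaries (hypothetical report)."""
    report = {}

    # Distinct word forms before normalization, kept for comparison.
    report["raw_types"] = len(set(tokenize(raw)))

    tokens = tokenize(normalize(raw))
    report["tokens"] = len(tokens)

    counts = Counter(tokens)
    # Fewer types than "raw_types" means normalization merged distinct forms.
    report["normalized_types"] = len(counts)

    freqs = relative_frequencies(counts)
    # Intermediate check: relative frequencies must sum to 1.
    assert abs(sum(freqs.values()) - 1.0) < 1e-9

    return freqs, report


if __name__ == "__main__":
    sample = "Gallia est omnis divisa in partes tres. Gallia EST."
    freqs, report = run_pipeline(sample)
    print(report)
```

Comparing `raw_types` with `normalized_types` shows how many distinct word forms a given normalization has merged, which is one simple way of making the information lost at a single step visible before any clustering is applied.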