What is Corpus Presenter?

Quick orientation
Before you start . . .

  1. Corpus Presenter is a suite of programs which pursues two main goals.

    (1) the presentation of corpora in tree form. The program will show the internal arrangement of the files of a corpus as a tree with branches, similar to the folders on your hard disk as seen in Windows Explorer. Just how the files are presented depends on the tree which a user chooses for display. A tree can be designed very quickly using the supplied utility Corpus Presenter Make Tree (updated for Version 14) or you can generate a tree directly from files on your computer from within Corpus Presenter.

    (2) the retrieval of information from the files of a corpus. The manner in which information is found in files and returned to the user depends on the way in which the files of a corpus are presented. There are a number of options here. Assuming that you have a tree display for your files, you can search the entire tree or just a branch of it. You can also search through individual files, whether they are contiguous in the tree or not. By having more than one tree for a single set of files you can carry out selective searches which return highly accurate information. Remember that it is not necessary to have a corpus to start with: you can search through any texts you might load into Corpus Presenter for linguistically interesting structures.

    Corpus Presenter also has all the other options of standard corpus software, i.e. it can generate concordances, word lists, reverse dictionaries of words in texts, etc. It does not require that texts be prepared in any way, e.g. by indexing them in advance.

  2. Corpus Presenter has powerful retrieval options. You can search not just for words, using wildcards if you like, but also for ‘syntactic frames’ of the form [Word_1 . . . Word_2]. Here you can specify searches for whole words or just a specific part of a word and specify the number of intervening items. You can use lists of input word forms for searches instead of single words.

  3. Corpus Presenter can generate various types of word lists, from single files, groups of texts or selections determined by the user. The word lists can be returned in various ways and can be stored in different formats and reloaded if necessary. It is also possible (in Version 14) to use different word lists to determine ‘keyness’ in texts, i.e. to see to what extent a text or set of texts differs in style from another set of texts and thus to determine stylistic differences between authors, for example.

  4. Corpus Presenter has flexible options for storing returns. All search results can be stored to disk (or clipboard) in a variety of user-specifiable ways ensuring that the returns can be processed further, e.g. in your own word processor. Returns can be reloaded and used again. Clicking on any return will open the text in which it was found and highlight the find in this text.

  5. Corpus Presenter is especially powerful with corpora which have an internal structure, e.g. those which are arranged by text type or period or author, to mention just a few common arrangements for linguistic corpora. Corpus Presenter can adopt the folder structure of a computer’s disk or you can design/alter a structure if you like. Needless to say, you can start working with Corpus Presenter straight away with any files on your computer: just select and load them.

  6. To use Corpus Presenter, you or your library should have bought the book Corpus Presenter published by John Benjamins, Amsterdam in 2003.
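
For readers who would like a concrete picture of the ‘syntactic frame’ search mentioned in point 2 above, the following is a minimal sketch in Python. It is not Corpus Presenter's own code or interface: the function names tokenize and find_frames, the parameter max_gap and the wildcard syntax (* and ?, as in file name patterns) are all assumptions made for illustration. The sketch looks for a first word form followed by a second one, with at most a given number of intervening items.

```python
import re
from fnmatch import fnmatchcase


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def find_frames(tokens, first, second, max_gap=3):
    """Return (i, j) index pairs for frames [first ... second].

    `first` and `second` are wildcard patterns (hypothetical syntax: * and ?).
    A frame is reported when a token matching `first` is followed by a token
    matching `second` with at most `max_gap` tokens in between. Only the
    nearest match for each starting token is kept.
    """
    hits = []
    for i, tok in enumerate(tokens):
        if not fnmatchcase(tok, first):
            continue
        # Candidate positions: i+1 up to i+1+max_gap (inclusive).
        last = min(i + 2 + max_gap, len(tokens))
        for j in range(i + 1, last):
            if fnmatchcase(tokens[j], second):
                hits.append((i, j))
                break
    return hits


tokens = tokenize("She has never been there and he had not yet gone home")
print(find_frames(tokens, "ha*", "*en", max_gap=2))  # prints [(1, 3)]
```

Here the pattern ha* matches both ‘has’ and ‘had’, but only ‘has ... been’ is returned, because no word ending in -en occurs within two items after ‘had’. As with any such search, the returns still have to be inspected by the user before conclusions are drawn from them.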

The different sections of this website will explain what you can do with Corpus Presenter. For quick orientation, the two most important steps are summarised here, namely loading files and doing searches with them. Click on the following links to get short, concise advice about how to do this.

  How to load text files
  Search through texts

Corpus processing software will help you find information quickly and accurately in any set of texts. Just how accurate the information you gain is depends on the nature of a search and the extent to which you use the options of your software. Ultimately, it is up to the user to decide about the linguistic significance of the returns which corpus processing software yields. This means that the returns will always have to be viewed and assessed by the user before making any statement about language use or language change based on these returns. This decision cannot be made by any computer program and it would be wrong to expect this of your software.

A word about tagging . . .

Users are often interested in finding out whether it is possible to tag texts with available software. This certainly is possible with the supplied utility Corpus Presenter Text Tool. Please consult the section Corpus Presenter Text Tool for more information.
    Tagging involves making decisions in advance about how forms in texts are to be classified. Not all end-users will agree with these decisions, indeed some may positively disagree. This can throw into doubt the value of such tagging in the first place. Furthermore, bear in mind that no matter how good a tagging system might be, the final decision about its usefulness must be made by end-users.

Note. Corpus Presenter is designed for use on PCs. If you have an Apple Macintosh on which Microsoft Windows runs normally, then Corpus Presenter will also run normally. The program cannot be used with the Linux operating system, I’m afraid. Please ensure that your computer is using Windows XP / Windows Vista before loading Corpus Presenter. These newer versions of the Windows operating system are the most stable and yield the best results.

Version 14 of Corpus Presenter is fully compatible with Windows XP / Windows Vista / Windows 7 / Windows 8.1.