corpus

Etymology

Borrowed from Latin corpus (“body”). Doublet of corpse, corps, and riff.

noun

  1. A collection of writings, often on a specific topic, of a specific genre, from a specific demographic or a particular author, etc.
    No one suggests that Browning intended to mean vagina when he wrote “owls and bats, / Cowls and twats,” because the context does not allow for it, nor does the greater context of the Browning corpus. 2011, Patrick Spedding, James Lambert, “Fanny Hill, Lord Fanny, and the Myth of Metonymy”, in Studies in Philology, volume 108, number 1, page 113
  2. (specifically, linguistics) Such a collection in the form of an electronic database used for linguistic analyses.
    Text corpora are being used in most current lexicographic projects. Applied linguistic research is another field where text corpora are welcome as an inexhaustible source of empirical information, a polygon for testing various linguistic tools – spell-checkers, OCRs, machine translation systems, NLP systems, etc. 2007, Mihail Mihailov, Hannu Tommola, “Compiling Parallel Text Corpora: Towards Automation of Routine Procedures”, in Wolfgang Teubert, editor, Text Corpora and Multilingual Lexicography (Benjamins Current Topics; 8), Amsterdam: John Benjamins Publishing Company, page 60
    Comparable corpora are made up of texts in different languages that may be related in various ways, but are not translations of each other. They may have nothing in common at all, or be on the same subject, of the same genre, or from the same chronological period, etc. 2008, Anabel Borja, “Corpora for Translators in Spain. The CDJ-GITRAD Corpus and the GENITT Project.”, in Gunilla [M.] Anderman, Margaret Rogers, editors, Incorporating Corpora: The Linguist and the Translator, Clevedon, North Somerset: Multilingual Matters, page 248
    The Lancaster/IBM Spoken English Corpus began in September 1984 as part of a research project into the automatic assignment of intonation […] The original design of the corpus was determined by the need to provide data for research into speech synthesis. As a result, unlike most other corpora currently being used in the computational linguistics field, the SEC exists in several forms. […] However, whatever the original motivation for compiling a corpus, it quickly becomes an object of interest in its own right. New users find it valuable for applications for which it was not designed. 2013, “Introduction”, in Gerry Knowles, Briony Williams, L[ita] Taylor, editors, A Corpus of Formal British English Speech: The Lancaster/IBM Spoken English Corpus, Abingdon, Oxon., New York, N.Y.: Routledge, page 1
    A corpus approach is a useful methodology for observing, describing and interpreting the stylistic features of language in literary and non-literary texts. 2014, Giuseppina Balossi, “Corpus Approaches to the Study of Language and Literature”, in A Corpus Linguistic Approach to Literary Language and Characterization: Virginia Woolf's The Waves (Linguistic Approaches to Literature; 18), Amsterdam: John Benjamins Publishing Company, page 41
    Today, computer databases and corpora infinitely increase the ease of this type of research, but the collecting process remains essentially the same. 2018, James Lambert, “A multitude of ‘lishes’: The nomenclature of hybridity”, in English World-Wide, page 4
  3. (uncommon) A body, a collection.
    About a hundred years ago in Germany, the publishing of corpuses of the ancient Greek coinages was started. […] The significance of those, and some other corpuses is exclusive, because they allowed an enormous amount of numismatic material kept in museum and private collections all over the world, to be studied and systematized. 1998, Dimitǎr Draganov, “New Coin Types of Hadrianopolis”, in Ulrike Peter, editor, Stephanos Nomismatikos: Edith Schönert-Geiss zum 65. Geburtstag (Griechisches Münzwerk), Berlin: Akademie Verlag, page 221
    An assessment in 1991 proposed publication of the results of this work in three stages: […] secondly, a corpus of the Roman pottery to present the type series and to discuss the fabrics and forms recovered, […] 2014, Margaret Darling, Barbara Precious, “Introduction”, in A Corpus of Roman Pottery from Lincoln (Lincoln Archaeological Studies; 6), Oxford: Oxbow Books, page 1

Attribution / Disclaimer

All definitions come directly from Wiktionary via the Wiktextract library. We do not edit or curate the definitions for any words. If you feel the definition listed is incorrect or offensive, please suggest modifications directly to the source (wiktionary/corpus); any changes made to the source will update on this page periodically.