Subject: Lexicography
Course: Corpus lexicography
ECTS-credits: 3
Language: Croatian (English)
Duration: 1 semester
Status: elective
Method of teaching: 2 hours of seminar/lectures
Prerequisite: Introduction to lexicography
Assessment: seminar project, class participation
Course description:
The course covers the scope, key concepts and methods of corpus lexicography. It explains the differences between traditional and corpus-based lexicography, surveys various language corpora and illustrates their use in the compilation of dictionaries. The advantages and disadvantages of corpus use in lexicography are discussed, together with the nature of corpora and their design and development. Existing lexical resources and the main language technologies used to process text corpora are presented. Corpus lexicography methods are illustrated using the Croatian National Corpus and other corpora.
Course objectives:
On completing this course, students will be able to appreciate the value of corpus resources for lexicographic purposes, understand corpus-based methods for finding various kinds of lexical data, interpret corpus material, and evaluate language technologies for corpus analysis and use. They will produce sample dictionary entries based on corpus analysis.
Quality check and success of the course: The quality and success of the course will be assessed through a combination of internal and external evaluation. Internal evaluation will be carried out by teachers and students by means of a survey at the end of the semester. External evaluation will be carried out by colleagues attending the course, who will monitor and assess it.
Reading list:
1. Atkins, B. T. S. (1994) A corpus-based dictionary. In: Oxford-Hachette French Dictionary (Introductory section). Oxford: Oxford University Press, xix-xxxii.
2. Bratanić, M. (1998) Korpusna lingvistika na kraju 20. stoljeća i implikacije za suvremenu hrvatsku leksikografiju. Filologija 30-31, Zagreb, 171-177.
3. Bratanić, M. (1997) Od intuicije do opservacije i nazad (Višejezična leksikografija i paralelni korpusi). Suvremena lingvistika 43-44, Zagreb, 1-12.
4. Clear, J. (1994) I can't see the sense in a large corpus. In: Kiefer, F., Kiss, G. & Pajzs, J. (eds.) Papers in Computational Lexicography: COMPLEX '94. Research Institute for Linguistics, Hungarian Academy of Sciences.
5. Ooi, V. B. Y. (1998) Computer Corpus Lexicography. Edinburgh: Edinburgh University Press.
6. Pearson, J. (2002) Working with Specialized Language. A practical guide to using corpora. London/New York: Routledge.
Additional reading list:
1. Altenberg, B. (ed.) (2002) Lexis in Contrast: Corpus-based Approaches. Amsterdam.
2. Boguraev, B. and T. Briscoe (1989) Computational Lexicography for Natural Language Processing. London: Longman.
3. Bratanić, M. (1992) Korpusna lingvistika ili sretan susret. Radovi Zavoda za slavensku filologiju 27, Zagreb, 145-159.
4. Fillmore, C. J. and Atkins, B. T. S. (1994) Starting where the dictionaries stop: the challenge of corpus lexicography. In: Atkins, B. T. S. and Zampolli, A. (eds.) Computational Approaches to the Lexicon, 350-393.
5. Garside, R., G. Leech and A. McEnery (eds.) (1997) Corpus Annotation: Linguistic Information from Computer Text Corpora. London: Longman.
6. McEnery, T. & Wilson, A. (1996) Corpus Linguistics. Edinburgh.
7. Mair, C. (ed.) (2000) Corpus Linguistics and Linguistic Theory. Amsterdam.
8. Sinclair, J. (1991) Corpus, Concordance, Collocation. Oxford: Oxford University Press.
9. Sinclair, J. (2004) Trust the Text: Language, Corpus and Discourse. London: Routledge.
10. Tadić, M. (2003) Jezične tehnologije i hrvatski jezik. Zagreb: Ex libris.