Course name: Grammar-Driven Language Models

Instructor: Kristina Kocijan, PhD, Assistant Professor

ECTS credits: 6

Status: elective

Form: 1 h lecture + 1 h seminar + 2 h lab

Prerequisites: 'Introduction to NLP' and 'Introduction to formal languages and automata'

Exam: written, practical, seminar

Content: Introduction. Finite-State Automata (FSA), Recursive Transition Networks (RTNs), Enhanced Recursive Transition Networks (ERTNs), Context-Free Grammars (CFGs) and Context-Sensitive Grammars (CSGs). Regular Expressions. Building grammars with graphs and rules (local grammars, inflectional, derivational, lexical, orthographical, morphological, terminological, syntactic, semantic and translation grammars). Lexical analysis. Syntax analysis (chunkers and parsers). Disambiguation. Evaluation of analysis systems. Concordances. Language processing in the context of Big Data.


Objectives: Upon completion of the course, the student should be able to:
-define and recognize automata and finite-state transducers,

-define, explain and use grammars built with rules or graphs,

-independently build, explain and use regular expressions in grammars and pattern recognition,

-independently build simple and complex queries on text using regular expressions and syntactic grammars,

-independently and/or as part of a team, build, explain and use grammars built with graphs,

-independently and/or as part of a team, build a system for the analysis of written text in any language,

-evaluate an existing or new system for text analysis.

Recommended reading:

1. Steven Abney: Parsing by Chunks, in Principle-Based Parsing, (eds.) R. Berwick, S. Abney, C. Tenny, Kluwer Academic Publishers, 257-278, 1991.
2. Steven Abney: Partial Parsing via Finite-State Cascades, in Workshop on Robust Parsing, (eds.) J. Carroll, ESSLLI'96, 8-15, 1996.
3. Steven Abney: Part-of-Speech Tagging and Partial Parsing, in Corpus-Based Methods in Language and Speech, (eds.) K. Church, S. Young, G. Bloothooft, Kluwer Academic Publishers, Dordrecht, 1996.
4. James Allen: Natural Language Understanding, 2nd edition, The Benjamin Cummings Publishing Company, Inc., Redwood City, 1995. (available in the library)
5. Kenneth R. Beesley, Lauri Karttunen: Finite State Morphology, CSLI Publications, Stanford, 2003. (available in the library)
6. John Carroll: Parsing, in The Oxford Handbook of Computational Linguistics, Ruslan Mitkov (ed.), Oxford University Press, Oxford, 233-248, 2003. (available in the library)
7. David Clemenceau: Finite-State Morphology: Inflections and Derivations in a Single Framework Using Dictionaries and Rules, in Finite-State Language Processing, (eds.) E. Roche, Y. Schabes, The MIT Press, London, 67-98, 1997.
8. Zdravko Dovedan: Formalni jezici: sintaksna analiza, Zavod za informacijske studije, 2003.
9. Maurice Gross: Local Grammars and their representation by finite automata, in Data, Description, Discourse: Papers on the English Language in honour of John McH Sinclair, (ed.) M. Hoey, 26-38, 1993.
10. Maurice Gross: The Construction of Local Grammars, in Finite-State Language Processing, (eds.) E. Roche, Y. Schabes, The MIT Press, London, 329-354, 1997.
11. Dick Grune, Ceriel Jacobs: Parsing Techniques: A Practical Guide, Ellis Horwood Limited, West Sussex, 1998.
12. Udo Hahn, Geert Adriaens: Parallel Natural Language Processing: Background and Overview, in Parallel Natural Language Processing, (eds.) G. Adriaens, U. Hahn, Ablex Publishing Corporation, New Jersey, 1-134, 1994.
13. James E. Hoard: Language understanding and the emerging alignment of linguistics and natural language processing, in Using Computers in Linguistics: A Practical Guide, (eds.) J. Lawler, H. Aristar Dry, Routledge, London, 197-230, 1998. (available in the library)
14. Daniel Jurafsky, James H. Martin: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Prentice Hall, New Jersey, 2000. (available in the library)
15. Lauri Karttunen: Finite-State Technology, in The Oxford Handbook of Computational Linguistics, Ruslan Mitkov (ed.), Oxford University Press, Oxford, 339-357, 2003. (available in the library)
16. Emmanuel Roche: Parsing with Finite-State Transducers, in Finite-State Language Processing, (eds.) E. Roche, Y. Schabes, The MIT Press, London, 241-282, 1997.
17. Max D. Silberztein: NooJ, 2009.
18. Atro Voutilainen: Designing a (Finite-State) Parsing Grammar, in Finite-State Language Processing, (eds.) E. Roche, Y. Schabes, The MIT Press, London, 283-310, 1997.
19. Kristina Vučković, Marko Tadić, Zdravko Dovedan: Rule Based Chunker for Croatian, in Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008), Marrakech: ELRA, 2008.
20. Kristina Vučković, Nives Mikelić Preradović, Zdravko Dovedan: Verb Valency Enhanced Croatian Lexicon, in Proceedings of NooJ 2008, Budapest, Hungary, 2008.