Subject: Socio-humanistic informatics
Course: Socio-humanistic informatics
ECTS credits: 3
Language: Croatian/English
Duration: 1 semester
Status: elective
Method of teaching: blended course
Prerequisites: none
Assessment: completion of all weekly writing tasks and a final paper
Course description: Socio-humanistic informatics covers the application of information and communication technologies in the social sciences and humanities, as well as text and language processing. Course topics include advanced text formatting, text indexing, extractive summarization, regular expressions for extracting collocations, and basic statistical text analysis.
Course objectives: Students need to understand the basic principles of text processing and the core concepts of text and language processing: token, type, lemma, index term, word distribution, collocation, abstract and extract. Practical work consists of weekly assignments during the semester. Students will learn to apply the theoretical knowledge through a series of interrelated, project-oriented tasks: creating and applying styles and templates in an MS Word document, advanced find and replace using regular expressions, creating a word frequency list, creating an index in MS Word, basic statistical text analysis in MS Excel, and generating an extract of a document.
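Illustration: the concepts of token, type, frequency list and collocation candidate named in the objectives can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the course itself works with MS Word and MS Excel, and the choice of Python, the sample sentence and the function names (tokenize, frequency_list, bigram_counts) are hypothetical and introduced for this example only.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of letters, optionally followed by an
# apostrophe and more letters. [^\W\d_] matches any Unicode letter,
# so Croatian characters such as č, š, ž are handled as well.
WORD_PATTERN = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)?")


def tokenize(text):
    """Lowercase the text and return its word tokens in order."""
    return WORD_PATTERN.findall(text.lower())


def frequency_list(tokens):
    """Return (word, count) pairs, most frequent first."""
    return Counter(tokens).most_common()


def bigram_counts(tokens):
    """Count adjacent word pairs, a simple source of collocation candidates."""
    return Counter(zip(tokens, tokens[1:])).most_common()


# Hypothetical sample text, not taken from the course materials.
sample = "The cat sat on the mat. The cat slept."

tokens = tokenize(sample)   # every running word counts as a token
types = set(tokens)         # each distinct word form counts as one type

print("tokens:", len(tokens))
print("types:", len(types))
print("most frequent word:", frequency_list(tokens)[0])
print("most frequent bigram:", bigram_counts(tokens)[0])
```

The sketch does not lemmatize: "sat" and "sit" would be counted as different types, which is exactly the distinction between type and lemma that the course introduces.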
Reading list:
Ignatow, G., & Mihalcea, R. (2017). Text Mining: A Guidebook for the Social Sciences. Thousand Oaks, CA: SAGE Publications, Inc. doi: 10.4135/9781483399782 (selected chapters)