IS 661: Text Analytics

Contents
In the digital age, techniques for automatically processing textual content have become ubiquitous. Given the speed at which people produce and consume text online (e.g., on micro-blogging services and other collaborative Web platforms such as wikis and forums), there is an ever-increasing need for systems that automatically understand human language, answer natural language questions, translate text, and so on. This class provides a comprehensive introduction to state-of-the-art principles and methods of Natural Language Processing (NLP). The main focus is on statistical techniques and their application to a wide variety of problems. Statistics and NLP are now closely intertwined: many NLP problems can be formulated as problems of statistical inference, and statistical methods are, in turn, the de facto standard way to solve many, if not the majority, of NLP problems.

Learning outcomes
Students will acquire knowledge of state-of-the-art principles and methods of Natural Language Processing, with a specific focus on the application of statistical methods to human language technologies.
Successful participants will be able to understand state-of-the-art methods for Natural Language Processing and to select, apply, and evaluate the most appropriate techniques for a variety of practical, application-oriented scenarios.

Necessary prerequisites

Recommended prerequisites
Basic knowledge of programming concepts and methods, practical programming skills (Python).

Forms of teaching and learning    Contact hours    Independent study time
Lecture                           2 SWS            4 SWS
Exercise class                    2 SWS            4 SWS
ECTS credits: 6
Graded: yes
Workload: 180 h
Language: English
Form of assessment: Written exam (90 min)
Restricted admission: yes
Further information
Examiner: Prof. Dr. Markus Strohmaier
Frequency of offering: Fall semester
Duration of module: 1 semester
Range of application: M.Sc. MMM, M.Sc. Wirt. Inf., MMDS
Preliminary course work: Students must pass at least 50% of the written assignments in the exercise class in order to take the final exam.
Literature:
Chris Manning and Hinrich Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.
Dan Jurafsky and James H. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, 2nd edition, Prentice Hall, 2009.