IS 616: Large Scale Data Analysis and Visualization

Contents
This course teaches students the principles of scientific data visualization using the R and Python programming languages. It starts with introductory large-scale data handling and the basics of visualization, then moves on to more advanced visualization methods. Libraries and frameworks that are essential for data analysis and visualization are introduced along the way.

Learning outcomes
On completion of the course, students should be familiar with R and Python libraries that enable them to create professional scientific visualizations. This includes applying those libraries, handling large datasets, and knowing many examples of how challenges in scientific visualization were overcome and which creative solutions were found.

Skills:

Knowledge of how to include scientific visualization in research projects
Independently choosing how to prepare large-scale data so that visualization methods can be applied to a given problem
Knowledge of different libraries and their advantages and disadvantages
Data preprocessing, analysis, organisation and visualization

Necessary prerequisites
–

Recommended prerequisites
Basic knowledge of statistics and either 1) basic knowledge of both R and Python or 2) intermediate knowledge of one of the two languages and willingness to learn the other

Forms of teaching and learning	Contact hours	Independent study time
Lecture	2 SWS	10 SWS
Exercise class	2 SWS	7 SWS

ECTS credits	6
Graded	yes
Workload	180h
Language	English
Form of assessment	Written exam (90 min)
Restricted admission	yes
Further information	https://www.bwl.uni-mannheim.de/strohmaier/teaching/

Examiner	Prof. Dr. Markus Strohmaier
Performing lecturer	M. Strohmaier & M. Pellert

Frequency of offering	Fall semester
Duration of module	1 semester
Range of application	M.Sc. MMM, M.Sc. VWL, M.Sc. Wirt. Inf., MMDS
Preliminary course work	Successful completion of the corresponding exercises
Program-specific Competency Goals	CG 2
Literature	Claus Wilke: Fundamentals of Data Visualization (https://clauswilke.com/dataviz/); Roger D. Peng & Elizabeth Matsui: The Art of Data Science (https://bookdown.org/rdpeng/artofdatascience/); Julia Silge & David Robinson: Text Mining with R: A Tidy Approach (https://www.tidytextmining.com/); Robin Lovelace, Jakub Nowosad & Jannes Muenchow: Geocomputation with R (https://r.geocompx.org/); Kieran Healy: Data Visualization: A Practical Introduction (https://socviz.co/index.html); Winston Chang: ggplot2 cookbook (http://www.cookbook-r.com/Graphs/); Jake VanderPlas: Python Data Science Handbook (https://jakevdp.github.io/PythonDataScienceHandbook/); BBC Data Journalism team: How the BBC Visual and Data Journalism team works with graphics in R (https://medium.com/bbc-visual-and-data-journalism/how-the-bbc-visual-and-data-journalism-team-works-with-graphics-in-r-ed0b35693535)
Course outline	The course starts with the fundamental concepts of working with data in R and Python and then progresses to methods for working with large data sets. In parallel, basic concepts of visualization are introduced. After that, we study selected examples of scientific visualizations, from historical ones to the present day. Together, we reconstruct the problem that scientists faced when creating these visualizations and study their creative solutions in order to learn by example. While we provide theoretical background where necessary, the focus is strongly on implementations that solve practical problems.


