MKT 624: Data Scraping for Analytics and AI using R
Contents
Online platforms such as Twitter, Amazon, LinkedIn, TikTok, or Airbnb are invaluable for social science research, offering extensive datasets well suited to analysis and predictive modeling. This course guides you through extracting, storing, and refining this data so that you are equipped for statistical analysis, predictive modeling, and AI applications. You’ll explore the crucial role of data science in the social sciences and AI, then advance to using R to build web scrapers with libraries such as rvest, httr, and RSelenium.
The training covers advanced R techniques, interpreting web formats such as HTML, CSS, JSON, and XML, using regular expressions, and managing diverse data types. You’ll learn to store data in relational databases with (My)SQL and to extract data efficiently through APIs from platforms like Twitter and Yelp. The course also briefly covers extracting features and embeddings from text and images to enrich your datasets for detailed analysis and AI model development.
A special focus is on raising your R skills to an advanced level and teaching you the basics of building programs, from simple functional programs to Shiny apps, so that you can create interactive web applications that showcase your scraped data.
Learning outcomes
Upon successful completion of this course, students will have the proficiency
- … to identify key online data sources,
- … to develop sophisticated scrapers,
- … to process data for analytical and AI applications, and
- … to present their findings through an app.
Necessary prerequisites
–
Recommended prerequisites
basics in statistics and/or
basics in R and/or
basics in statistical analysis with R
Forms of teaching and learning | Contact hours | Independent study time |
---|---|---|
Seminar | 2 SWS | 9 SWS |
ECTS credits | 4 |
Graded | yes |
Workload | 120h |
Language | English |
Form of assessment | oral exam (presentation at the end of the seminar) |
Restricted admission | yes |
Further information | student portal |
Examiner | Prof. Dr. Florian Stahl |
Performing lecturer | Prof. Dr. Reto Hofstetter & Prof. Dr. Florian Stahl |
Frequency of offering | Spring semester |
Duration of module | 1 semester |
Range of application | M.Sc. MMM, M.Sc. Bus. Edu., M.Sc. Econ., M.Sc. Bus. Inf. |
Preliminary course work | – |
Program-specific Competency Goals | CG 1 |
Literature | |
Course outline | Will be announced at the beginning of the course. |