Seminar Data-Science I (Methods)
CS 721 Master Seminar (M.Sc. Wirt. Inf., M.Sc. MMDS, Lehramt für Gymnasien)
Lecturer | Prof. Dr. Markus Strohmaier, Marlene Lutz, Maximilian Kräutner |
Course Format | Seminar |
Offering | HWS |
Credit Points | 4 ECTS |
Language | English |
Grading | Written report with oral presentations |
Examination date | See schedule below |
Information for Students | The course is limited to 12 participants. Please register centrally via Portal2. |
Course Information
Course Description
In this seminar, students perform scientific research, either in the form of a literature review, a small experiment, or a mixture of both, and prepare a written report on the results. Topics of interest center on a variety of problems and tasks from the fields of Data Science, Network Science, and Text Mining.
Previous participation in the courses “Network Science” and “Text Analytics” is recommended.
Objectives
Expertise: Students will acquire a deep understanding of their research topic. They are expected to describe and summarize the topic in detail in their own words, as well as to judge the contribution of the assigned papers to ongoing research.
Methodological competence: Students will develop methods and skills to find relevant literature for their topic, to write a well-structured scientific paper and to present their results.
Topics
This seminar is split into two main topic blocks. Every student will be assigned a research paper from only one of these blocks to work on. Nevertheless, students are expected to participate actively in the discussion of papers from the other topic block after they have been presented.
The two topics we are going to discuss in HWS 2024 are:
- Adversarial Attacks. Adversarial attacks on Large Language Models (LLMs) represent a critical area of research due to their implications for security, trust, and the reliability of AI systems. These attacks involve manipulating inputs to deceive LLMs into producing incorrect or harmful outputs, posing significant risks in applications where accuracy and safety are important. We will look at different methods to attack LLMs, as well as ways to defend LLMs against such attacks.
- Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment
- Efficient Adversarial Training in LLMs with Continuous Attacks
- Adversarial Tuning: Defending Against Jailbreak Attacks for LLMs
- Self-Evaluation as a Defense Against Adversarial Attacks on LLMs
- Certifying LLM Safety against Adversarial Prompting
- An LLM can Fool Itself: A Prompt-Based Adversarial Attack
- Model Editing. Pretrained LLMs serve as the backbone of many downstream applications. As such, we often want to refine them to tailor performance to specific downstream tasks, mitigate bias, or update the model with new information. However, the growing size of language models has made traditional fine-tuning costly, leading to increased interest in alternative refinement methods that avoid gradient updates. We will investigate different methods to edit LLMs with regard to, e.g., knowledge, bias, or linguistic style.
- Word Embeddings Are Steers for Language Models
- Time is Encoded in the Weights of Finetuned Language Models
- The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse
- Mitigating Biases for Instruction-following Language Models via Bias Neurons Elimination
- Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic
- Cross-Lingual Knowledge Editing in Large Language Models