Content
What is this course about?
Large language models are rapidly reshaping machine learning research and practice, yet many questions remain about how they work, how to ensure they behave as intended, and how to build reliable systems on top of them. This course dives into three core areas at the frontier of LLM research: interpretability, alignment, and agents. Students will learn to analyze circuits and internal representations, probe the geometry of model features through sparse autoencoders and linear representations, and reason about scalable oversight and emergent misalignment. The course will also cover how LLMs are deployed as autonomous agents for software engineering and scientific research, how they are used to simulate human behavior, and how they can complement human decision-making. This is an advanced course and assumes familiarity with transformers and language modeling. We will read and discuss recent publications, with importance placed on analyzing, interpreting, and making arguments from necessarily incomplete empirical evidence. Students will get hands-on experience through assignments and a quarter-long research project that pushes into open problems in the field.
Prerequisites
- CMSC 25700/CMSC 35100: Natural Language Processing
- DATA 37712/CMSC 37712: Foundations of Machine Learning II: Generative Models
You are expected to understand the transformer architecture and to have experience training and analyzing language models. Research experience is also preferred.
Coursework
Grading
- Quizzes: 20%
- Roast or Toast: 10%
- Assignments: 20%
- Project: 50%
Quizzes
Short quizzes will be held at the beginning of the lecture to assess understanding of the readings.
Roast or Toast
Students will either critically analyze (roast) a paper or propose (imagine) an extension or question from the course readings.
Assignments
There will be three assignments throughout the quarter.
Project
- Project Proposal: Due Friday, March 28
- Proposal Revision: Due Friday, April 4
- Weekly Blog Entries: April 10, April 17, April 24, May 1, May 15
- First Draft: Due Thursday, May 8
- Final Report: Due Thursday, May 22
Compute
Modal has generously offered compute to each student. See details on Ed.
Textbook
There is no required textbook. Reading materials for each week will be a combination of technical papers and online resources.
Honor Code
We expect students not to look at solutions or implementations online. Like all other classes at UChicago, we take academic honesty very seriously. Please make sure to read the UChicago Academic Honesty page.
Collaboration policy
For individual assignments, collaboration with fellow students is encouraged as long as it is properly disclosed in each submission. However, you should not share any written work or code for your assignments. After discussing a problem with others, you should write the solution by yourself. For final projects, you are expected to work in groups of 1-2, preferably 1.
AI tools policy
Using generative AI tools such as Claude Code and ChatGPT is allowed as long as their use is properly disclosed in each submission. You are encouraged to use AI (e.g., NeuriCo) heavily for the project.
Additional course policies can be found on Canvas.
Submitting Coursework
- All coursework should be submitted via Gradescope by the deadline.
Late Days
- Each student has 3 late days to use throughout the quarter for assignments. No late submissions will be accepted for project-related work.