Course roadmap

Interpretability

Lectures 1–8

  • Attention
  • MLPs & Factual Recall
  • Transformer Circuits
  • Geometry of Representations
  • Superposition & SAEs
  • Chain of Thought
  • Interpretability for Science

Alignment

Lectures 9–13

  • The Alignment Problem
  • Scalable Oversight
  • Emergent Misalignment
  • Sycophancy
  • Finding Novel Behavior

Agents

Lectures 14–18

  • Agents & Agentic RL
  • LLM-based Simulation
  • Scientific Discovery
  • Frontiers

Science is not inherently valuable.

How AI changes the balance

[Figure labels: Selection, Production, Evaluation]

AI facilitates production. Scientists must invest more in selection and evaluation.