Graduate course at UNC on the statistical and probabilistic principles of Generative AI
Instructor: Shankar Bhamidi
Meeting time: Tue–Thu, 11:00–12:15
Location: Hanes 107
Note on course logistics: Official course materials, announcements, submissions, and grades will be handled via UNC Canvas.
This public website provides a lightweight “front door” with links to slides/notes/readings (e.g., via Dropbox) and a running schedule.
This course explores Generative AI through the lens of probability and statistics, with a particular emphasis on identifying—or questioning—the existence of coherent statistical and probabilistic principles underlying modern generative models.
The course title GASPp (Generative AI: Statistical and Probabilistic principles) reflects this goal:
This is an intentionally exploratory, research-oriented course. Roughly two-thirds of the semester will focus on building foundational understanding of:
The course is not implementation-heavy; the emphasis is on understanding, intuition, reading research papers critically, and identifying research directions.
More details can be found here
I can only convey my own understanding of this material, much of which I learnt from the items listed in the Resources. In particular, I want to mention the wonderful resources developed by Prof. Cosma Shalizi and Prof. Ambuj Tewari; I am largely following in the shadow of these giants.
By the end of the semester, students should:
Official announcements/submissions are via Canvas. This part of the page is the “living” schedule with links (e.g., Dropbox) to slides, notes, and readings. If the links below do not work, this folder link to all the lectures should hopefully be more stable.
| Date | Topic | Materials |
|---|---|---|
| 2026-01-08 (Thu) | Intro to course. Class expectations. BPE tokenization. | Slides: Lecture 1, slides 1-49. Reading: notes |
| 2026-01-13 (Tue) | Vector semantics and representation learning. Classification and the inevitability of logistic regression. | Slides: Lecture 1, slides 49-114. Reading: notes |
| 2026-01-15 (Thu) | Word2vec and node2vec. Toy models. Start of information theory. | Slides: Lecture 1, slides 114-end; Lecture 2, slides 1-9. Reading: notes |
| 2026-01-20 (Tue) | Basics of information theory. | Slides: Lecture 2, slides 9-41. Reading: notes |
| 2026-01-22 (Thu) | Data processing inequality; information theory, MLE, and large deviations (LDP). Entropy rates for stochastic processes. | Slides: Lecture 2, associated slides |
| 2026-01-27 (Tue) | Class cancelled owing to snow day. | |
| 2026-01-29 (Thu) | Ergodic sequences and entropy. Convexity of KL. Information geometry (entropy maximization, ELBO). | Slides: Lecture 2, associated slides. Reading: notes |
| 2026-02-03 (Tue) | N-gram models and their use. | Slides: Lecture 3 |
| 2026-02-05 (Thu) | Finished N-gram models. Started neural networks: origin story. | Slides: Lecture 3; Lecture 4, slides 1-20 |
| 2026-02-10 (Tue) | Neural networks, Kolmogorov and Arnold. Baby implementation in logistic regression. | Slides: Lecture 4, slides 21-45 |
| 2026-02-12 (Thu) | Neural networks and stochastic gradient descent; backpropagation. Noise contrastive estimation. | Slides: Lecture 4, slides 45-86 |
| 2026-02-17 (Tue) | Basics of stochastic gradient descent. | Slides: Lecture 4, slides 95-115 |
| 2026-02-19 (Thu) | Almost-sure convergence of stochastic gradient descent; state of the art for high-dimensional SGD. | Slides: Lecture 4, to the end |
| 2026-02-24 (Tue) | Enter the transformer. | Slides: Lecture 5 |
| 2026-02-26 (Thu) | ||
| 2026-03-03 (Tue) | ||
| 2026-03-05 (Thu) | ||
| 2026-03-10 (Tue) | ||
| 2026-03-12 (Thu) |
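For readers skimming the schedule, the BPE tokenization covered in the first lecture can be sketched in a few lines. This is my own toy illustration (not the course's code): repeatedly merge the most frequent adjacent pair of symbols in the corpus.

```python
from collections import Counter

def bpe_merges(corpus, num_merges):
    """Learn byte-pair-encoding merges from a toy whitespace-split corpus.

    Each word is a tuple of symbols; at every step the most frequent
    adjacent pair across the corpus is merged into one new symbol.
    """
    vocab = Counter(tuple(word) for word in corpus.split())
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

merges = bpe_merges("low low low lower lowest", 2)
# first two merges fuse 'l','o' and then 'lo','w'
```

Real tokenizers work at the byte level and add end-of-word markers, but the merge loop is the same idea.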
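Similarly, the "baby implementation in logistic regression" trained by stochastic gradient descent (Lecture 4) can be sketched as follows. This is a hypothetical minimal version, assuming 1-D inputs and labels in {0, 1}; one uniformly sampled example per step gives an unbiased gradient estimate.

```python
import math
import random

def sgd_logistic(data, lr=0.1, epochs=200, seed=0):
    """Plain SGD on the logistic (cross-entropy) loss for 1-D inputs.

    data: list of (x, y) pairs with y in {0, 1}.
    Returns the fitted weight and bias.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs * len(data)):
        x, y = rng.choice(data)                    # one random sample
        p = 1.0 / (1.0 + math.exp(-(w * x + b)))   # sigmoid prediction
        g = p - y                                  # d(loss)/d(logit)
        w -= lr * g * x
        b -= lr * g
    return w, b

# Toy separable data: label is 1 exactly when x > 0.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
w, b = sgd_logistic(data)
```

On this data the fitted slope is positive, so the model predicts probability above 1/2 for positive x and below 1/2 for negative x.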
Machine Learning Street Talk podcast with Chris Bishop. Some key takeaways:
Andrej Karpathy on the Dwarkesh podcast: a deep thinker and expert in this general area. One thing I am still wrapping my head around is his view on reinforcement learning and its role in training LLMs; I know very little about this general area. His take is roughly as follows.

Karpathy argues that reinforcement learning plays only a limited and deeply flawed role in the development of large language models. His core critique is that RL relies on an extremely weak learning signal: a single scalar reward assigned after a long trajectory of actions, which is then indiscriminately propagated backward across every step. This creates noisy credit assignment, where incorrect reasoning steps are reinforced as long as the final outcome happens to be correct. He describes this as “sucking supervision through a straw” and views it as a statistically inefficient and conceptually misguided approach. In contrast to humans, who reflect selectively, revise mistakes, and attribute credit unevenly, RL blindly upweights entire trajectories. As a result, RL does not capture how reasoning or learning actually works and produces brittle improvements rather than genuine gains in intelligence.

At the same time, Karpathy emphasizes that imitation learning and pretraining were far more surprising and impactful breakthroughs than reinforcement learning. Fine-tuning pretrained models on human demonstrations quickly transformed autocomplete systems into useful assistants while preserving their knowledge. Reinforcement learning, as in RLHF, provides incremental benefits such as preference shaping and modest hill climbing when correct answers exist, but it does not solve reasoning, planning, or continual learning. Attempts to improve RL through process-based supervision run into severe problems because automated judges are gameable and lead to adversarial behavior, where models exploit reward models without improving correctness.

Karpathy believes progress will require fundamentally new training paradigms involving reflection, selective credit assignment, memory, and distillation, rather than simply scaling RL, and he does not see reinforcement learning as the main path toward general intelligence. I guess only time will tell where we stand on these debates. While the podcast covers many other topics, on reinforcement learning specifically it seems to respond to a previous podcast with another great researcher, Richard Sutton. I have not had the time to dive into that second podcast yet.
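The "single scalar reward broadcast to every step" critique is easy to see in code. A toy sketch of vanilla REINFORCE-style credit assignment (my own illustration, not from the podcast): every per-step log-probability gradient is weighted by the same trajectory return, so intermediate mistakes get reinforced whenever the final answer is right.

```python
def reinforce_step_weights(log_prob_grads, trajectory_return):
    """Vanilla REINFORCE credit assignment: one scalar return
    multiplies the gradient contribution of EVERY step alike."""
    return [trajectory_return * g for g in log_prob_grads]

# Three steps of a trajectory; suppose the middle step was a flawed
# reasoning move.  With a final return of +1, all three steps are
# upweighted uniformly: the flawed step gets the same credit.
weights = reinforce_step_weights([0.5, -1.0, 0.3], trajectory_return=1.0)
# weights == [0.5, -1.0, 0.3]
```

Process-based supervision tries to replace the single scalar with per-step rewards, which is exactly where the reward-hacking concerns above come in.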
| Date | Topic | Materials |
|---|---|---|
| 2026-01-08 (Thu) | Intro to course. Class expectations. | Slides: link · Notes: link |
| 2026-01-13 (Tue) | BPE / N-grams / vector semantics / representation learning | Slides: link · Reading: link |
| 2026-01-15 (Thu) | Information theory: cross-entropy loss and MLE | Slides: link · Reading: link |
| 2026-01-20 (Tue) | AEP, ergodic theory, Shannon–McMillan–Breiman | Slides: link · Reading: link |
| 2026-01-22 (Thu) | Markov chains for text; stochastic gradient descent | Slides: link · Reading: link |
| 2026-01-27 (Tue) | SGD in low and high dimensions | Slides: link · Reading: link |
| 2026-01-29 (Thu) | SGD continued: neural networks | Slides: link · Reading: link |
| 2026-02-03 (Tue) | Transformers, attention | Slides: link · Reading: link |
| 2026-02-05 (Thu) | Transformers, attention (cont.) | Slides: link · Reading: link |
| 2026-02-10 (Tue) | Transformers and interacting particle systems | Slides: link · Reading: link |
| 2026-02-12 (Thu) | Prompting as conditioning; RLHF | Slides: link · Reading: link |
| 2026-02-17 (Tue) | Influence functions; superconcentration; propagation of chaos | Slides: link · Reading: link |
| 2026-02-19 (Thu) | Interlude: graph rep learning; spectral clustering | Slides: link · Reading: link |
| 2026-02-24 (Tue) | Interlude: graph neural networks | Slides: link · Reading: link |
| 2026-02-26 (Thu) | Latent variables I: K-means and EM | Slides: link · Reading: link |
| 2026-03-03 (Tue) | Latent variables II: VI & ELBO; score matching | Slides: link · Reading: link |
| 2026-03-05 (Thu) | GANs and autoencoders | Slides: link · Reading: link |
| 2026-03-10 (Tue) | GANs and autoencoders II | Slides: link · Reading: link |
| 2026-03-12 (Thu) | Diffusion models I | Slides: link · Reading: link |