Resources
There is no single required textbook. We will draw from textbooks, surveys, papers, and lecture notes.
Core references (anchors for the early part of the course)
- Christopher M. Bishop, Pattern Recognition and Machine Learning (PRML)
- Daniel Jurafsky & James H. Martin, Speech and Language Processing
Additional important references
- Thomas M. Cover & Joy A. Thomas, Elements of Information Theory
- Elliot Paquette, lecture notes on stochastic gradient descent
- Sourav Chatterjee, Superconcentration and Related Topics
Additional papers and notes will be posted as the course evolves (primarily via Canvas).