Lately I’ve been troubled by how little I actually knew about how Bayesian inference really worked. I could explain to you many other machine learning techniques, but with Bayesian modelling… well, there’s a model (which is basically the likelihood, I think?), and then there’s a prior, and then, um…
What actually happens when you run a sampler? What makes inference “variational”? And what is this automatic differentiation doing in my variational inference? Cue long sleepless nights, contemplating my own ignorance.
So to celebrate the new year^{1}, I compiled a list of things to read: blog posts, journal papers, books, anything that would help me understand (or at least, appreciate) the math and computation that happens when I press the Magic Inference Button™. To be clear, this reading list isn’t focused on how to use Bayesian modelling for a specific use case^{2}; it’s focused on how modern computational methods for Bayesian inference work in general.
So without further ado…
Markov Chain Monte Carlo
For the uninitiated
 MCMC Sampling for Dummies by Thomas Wiecki. A basic introduction to MCMC with accompanying Python snippets. The Metropolis sampler is used as an introduction to sampling.
 Introduction to Markov Chain Monte Carlo by Charles Geyer. The first chapter of the aptly named Handbook of Markov Chain Monte Carlo.
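To make the Metropolis sampler mentioned above concrete, here is a minimal sketch of a random-walk Metropolis sampler in plain Python. It is not taken from either reading: the function names, the standard normal target and the step size are all assumptions chosen for illustration.

```python
import math
import random


def metropolis(log_density, initial, n_samples, step_size=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1-D target (illustrative sketch)."""
    rng = random.Random(seed)
    current = initial
    current_logp = log_density(current)
    samples = []
    for _ in range(n_samples):
        # Propose a move by adding symmetric Gaussian noise to the current state.
        proposal = current + rng.gauss(0.0, step_size)
        proposal_logp = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_logp - current_logp))
        if rng.random() < accept_prob:
            current, current_logp = proposal, proposal_logp
        # On rejection the chain stays put and the current state is recorded again.
        samples.append(current)
    return samples


def standard_normal_logp(x):
    # Assumed toy target: log density of N(0, 1), up to an additive constant.
    return -0.5 * x * x


samples = metropolis(standard_normal_logp, initial=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean ~ {mean:.2f}, variance ~ {variance:.2f}")
```

Note that the sampler only ever needs the log density up to an additive constant, which is why the normalizing constant of the posterior never has to be computed.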
Hamiltonian Monte Carlo and the No-U-Turn Sampler
 Hamiltonian Monte Carlo explained. A visual and intuitive explanation of HMC: great for starters.
 A Conceptual Introduction to Hamiltonian Monte Carlo by Michael Betancourt. An excellent paper for a solid conceptual understanding and principled intuition for HMC.
 The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo by Matthew Hoffman and Andrew Gelman. The original NUTS paper.
 MCMC Using Hamiltonian Dynamics by Radford Neal.
 Hamiltonian Monte Carlo in PyMC3 by Colin Carroll.
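As a rough companion to the readings above, here is a bare-bones HMC sketch with a leapfrog integrator. It is an illustration under assumptions, not the implementation from any of these papers or from PyMC3: the target is a 1-D standard normal, the gradient is written by hand, and the step size and number of leapfrog steps are fixed (choosing the path length adaptively is the part that NUTS adds).

```python
import math
import random


def leapfrog(position, momentum, grad_log_density, step_size, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator."""
    # Half step for momentum, then alternate full steps, then a final half step.
    momentum = momentum + 0.5 * step_size * grad_log_density(position)
    for i in range(n_steps):
        position = position + step_size * momentum
        if i < n_steps - 1:
            momentum = momentum + step_size * grad_log_density(position)
    momentum = momentum + 0.5 * step_size * grad_log_density(position)
    return position, momentum


def hmc(log_density, grad_log_density, initial, n_samples,
        step_size=0.2, n_steps=10, seed=0):
    """HMC with a fixed path length and unit mass (illustrative sketch)."""
    rng = random.Random(seed)
    position = initial
    samples = []
    for _ in range(n_samples):
        # Resample the auxiliary momentum from N(0, 1) at every iteration.
        momentum = rng.gauss(0.0, 1.0)
        new_position, new_momentum = leapfrog(
            position, momentum, grad_log_density, step_size, n_steps)
        # Hamiltonian = potential energy (-log density) + kinetic energy.
        current_h = -log_density(position) + 0.5 * momentum ** 2
        new_h = -log_density(new_position) + 0.5 * new_momentum ** 2
        # Metropolis correction for the integrator's discretization error.
        if rng.random() < math.exp(min(0.0, current_h - new_h)):
            position = new_position
        samples.append(position)
    return samples


def standard_normal_logp(x):
    # Assumed toy target: N(0, 1), up to an additive constant.
    return -0.5 * x * x


def standard_normal_grad(x):
    # Hand-derived gradient of the log density above.
    return -x


samples = hmc(standard_normal_logp, standard_normal_grad, 0.0, 5000)
mean = sum(samples) / len(samples)
print(f"mean ~ {mean:.2f}")
```

The leapfrog integrator is used because it is reversible and nearly conserves the Hamiltonian, which keeps the acceptance rate high even for long trajectories.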
Sequential Monte Carlo and particle filters
 An Introduction to Sequential Monte Carlo Methods by Arnaud Doucet, Nando de Freitas and Neil Gordon. This chapter from the authors’ textbook on SMC provides motivation for using SMC methods, and gives a brief introduction to a basic particle filter.
 Sequential Monte Carlo Methods & Particle Filters Resources by Arnaud Doucet. A list of resources on SMC and particle filters: way more than you probably ever need to know about them.
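For a feel of what a basic particle filter does, here is a minimal sketch of a bootstrap particle filter with multinomial resampling. The state-space model (a 1-D linear-Gaussian model with coefficient `a`), the function names and every parameter value are assumptions made up for this example, not something taken from the resources above.

```python
import math
import random


def simulate(n_steps, a=0.9, process_std=1.0, obs_std=0.5, seed=1):
    """Simulate the assumed toy model: x_t = a * x_{t-1} + noise, y_t = x_t + noise."""
    rng = random.Random(seed)
    x, states, observations = 0.0, [], []
    for _ in range(n_steps):
        x = a * x + rng.gauss(0.0, process_std)
        states.append(x)
        observations.append(x + rng.gauss(0.0, obs_std))
    return states, observations


def bootstrap_particle_filter(observations, n_particles=500, a=0.9,
                              process_std=1.0, obs_std=0.5, seed=0):
    """Return the filtered posterior mean of the state at each time step."""
    rng = random.Random(seed)
    particles = [0.0] * n_particles
    filtered_means = []
    for y in observations:
        # 1. Propagate each particle through the transition model.
        particles = [a * x + rng.gauss(0.0, process_std) for x in particles]
        # 2. Weight each particle by the likelihood of the new observation.
        log_w = [-0.5 * ((y - x) / obs_std) ** 2 for x in particles]
        max_log_w = max(log_w)  # subtract the max for numerical stability
        weights = [math.exp(lw - max_log_w) for lw in log_w]
        total = sum(weights)
        filtered_means.append(
            sum(w * x for w, x in zip(weights, particles)) / total)
        # 3. Resample in proportion to the weights to fight weight degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return filtered_means


states, observations = simulate(50)
means = bootstrap_particle_filter(observations)
mse = sum((m - x) ** 2 for m, x in zip(means, states)) / len(states)
print(f"mean squared error of the filtered means ~ {mse:.2f}")
```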
Other sampling methods
 Chapter 11 (Sampling Methods) of Pattern Recognition and Machine Learning by Christopher Bishop. Covers rejection, importance, Metropolis-Hastings, Gibbs and slice sampling. Perhaps not as rampantly useful as NUTS, but good to know nevertheless.
 The Markov-chain Monte Carlo Interactive Gallery by Chi Feng. A fantastic library of visualizations of various MCMC samplers.
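Of the methods listed above, importance sampling is probably the quickest to sketch. The snippet below is a minimal illustration of self-normalized importance sampling, not code from Bishop's chapter: the target N(0, 1), the wider proposal N(0, 2^2), and the function names are all assumptions picked for the example.

```python
import math
import random


def normal_logpdf(x, mean, std):
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std) - 0.5 * math.log(2.0 * math.pi))


def importance_estimate(f, target_logpdf, n_samples=20000,
                        proposal_std=2.0, seed=0):
    """Estimate E_target[f(x)] using samples from a N(0, proposal_std^2) proposal."""
    rng = random.Random(seed)
    weighted_sum, weight_total = 0.0, 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, proposal_std)
        # Importance weight = target density / proposal density.
        log_w = target_logpdf(x) - normal_logpdf(x, 0.0, proposal_std)
        w = math.exp(log_w)
        weighted_sum += w * f(x)
        weight_total += w
    # Dividing by the total weight means the target may be unnormalized.
    return weighted_sum / weight_total


def target_logpdf(x):
    # Assumed toy target: N(0, 1), for which E[x^2] = 1.
    return normal_logpdf(x, 0.0, 1.0)


estimate = importance_estimate(lambda x: x * x, target_logpdf)
print(f"estimated E[x^2] ~ {estimate:.3f}")
```

The proposal is deliberately wider than the target: a proposal with lighter tails than the target can produce a few enormous weights that dominate the estimate.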
Variational Inference
For the uninitiated
 Deriving Expectation-Maximization by Will Wolf. The first blog post in a series that builds from EM all the way to VI. Also check out Deriving Mean-Field Variational Bayes.
 Variational Inference: A Review for Statisticians by David Blei, Alp Kucukelbir and Jon McAuliffe. A high-level overview of variational inference: the authors go over one example (performing VI on GMMs) in depth.
 Chapter 10 (Approximate Inference) of Pattern Recognition and Machine Learning by Christopher Bishop.
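The one identity that all of these readings build on is the decomposition of the log evidence into the evidence lower bound (ELBO) and a KL divergence. The notation here is an assumption for illustration (x for observed data, z for latent variables, q for the approximating distribution); each reading uses its own symbols.

```latex
\log p(x)
  = \underbrace{\mathbb{E}_{q(z)}\big[\log p(x, z)\big]
      - \mathbb{E}_{q(z)}\big[\log q(z)\big]}_{\mathrm{ELBO}(q)}
  + \mathrm{KL}\big(q(z) \,\|\, p(z \mid x)\big)
```

Since the KL term is non-negative and the left-hand side does not depend on q, maximizing the ELBO over q is equivalent to minimizing the KL divergence from q to the true posterior. That is what turns inference into an optimization problem.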
Automatic differentiation variational inference (ADVI)
 Automatic Differentiation Variational Inference by Alp Kucukelbir, Dustin Tran et al. The original ADVI paper.
 Automatic Variational Inference in Stan by Alp Kucukelbir, Rajesh Ranganath, Andrew Gelman and David Blei.
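To give a flavor of the gradient-based optimization at the heart of ADVI, here is a toy sketch that fits a Gaussian q(z) = N(mu, exp(omega)^2) to a 1-D target by stochastic gradient ascent on the ELBO, using the reparameterization z = mu + exp(omega) * eps. This is emphatically not the ADVI algorithm as published: the target N(3, 0.5^2) is an assumption, there is no transformation of constrained parameters, the step size is fixed, and the gradient of the log density is derived by hand where ADVI would use automatic differentiation.

```python
import math
import random


def target_grad_logp(z, target_mean=3.0, target_std=0.5):
    # Assumed toy target: N(3, 0.5^2). Hand-derived gradient of its log
    # density; ADVI would obtain this by automatic differentiation instead.
    return -(z - target_mean) / target_std ** 2


def fit_gaussian_vi(grad_logp, n_steps=2000, n_mc=10,
                    learning_rate=0.01, seed=0):
    """Fit q(z) = N(mu, exp(omega)^2) by stochastic gradient ascent on the ELBO."""
    rng = random.Random(seed)
    mu, omega = 0.0, 0.0
    for _step in range(n_steps):
        sigma = math.exp(omega)
        grad_mu, grad_omega = 0.0, 0.0
        for _draw in range(n_mc):
            # Reparameterize: z is a deterministic function of (mu, omega)
            # and parameter-free noise eps, so gradients pass through z.
            eps = rng.gauss(0.0, 1.0)
            z = mu + sigma * eps
            g = grad_logp(z)
            grad_mu += g / n_mc                    # dz/dmu = 1
            grad_omega += g * sigma * eps / n_mc   # dz/domega = sigma * eps
        # The Gaussian entropy is omega + const, so its omega-gradient is 1.
        grad_omega += 1.0
        mu += learning_rate * grad_mu
        omega += learning_rate * grad_omega
    return mu, math.exp(omega)


mu, sigma = fit_gaussian_vi(target_grad_logp)
print(f"fitted mean ~ {mu:.2f}, fitted std ~ {sigma:.2f}")
```

Because the toy target is itself Gaussian, the best-fitting q is the target itself, so the fitted mean and standard deviation should land close to 3 and 0.5.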
Open-Source Software for Bayesian Inference
There are many open-source software libraries for Bayesian modelling and inference, and it is instructive to look into the inference methods that they do (or do not!) implement.
Further Topics
Bayesian inference doesn’t stop at MCMC and VI: there is bleeding-edge research being done on other methods of inference. While they aren’t ready for real-world use, it is interesting to see what they are.
Approximate Bayesian computation (ABC) and likelihood-free methods
 Likelihood-free Monte Carlo by Scott Sisson and Yanan Fan.
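The simplest likelihood-free method is ABC rejection sampling: draw a parameter from the prior, simulate data from it, and keep the parameter only if the simulated data look enough like the observed data. Here is a minimal sketch. The model (Gaussian data with unknown mean and unit standard deviation), the Uniform(-5, 5) prior, the sample mean as summary statistic and the tolerance are all assumptions for illustration, not details from the reading above.

```python
import random


def abc_rejection(observed_mean, n_obs, n_draws=20000, tolerance=0.1, seed=0):
    """ABC rejection sampler for the mean of a unit-variance Gaussian (sketch)."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        # 1. Draw a candidate parameter from the (assumed) uniform prior.
        theta = rng.uniform(-5.0, 5.0)
        # 2. Simulate a dataset from the model; no likelihood is evaluated.
        simulated = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        simulated_mean = sum(simulated) / n_obs
        # 3. Keep theta if the summary statistic is close to the observed one.
        if abs(simulated_mean - observed_mean) < tolerance:
            accepted.append(theta)
    return accepted


accepted = abc_rejection(observed_mean=1.5, n_obs=20)
posterior_mean = sum(accepted) / len(accepted)
print(f"accepted {len(accepted)} draws, posterior mean ~ {posterior_mean:.2f}")
```

The tolerance trades accuracy for efficiency: shrinking it brings the accepted draws closer to the true posterior, but fewer candidates survive.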
Expectation propagation
 Expectation propagation as a way of life: A framework for Bayesian inference on partitioned data by Aki Vehtari, Andrew Gelman, et al.
Operator variational inference (OPVI)
 Operator Variational Inference by Rajesh Ranganath, Jaan Altosaar, Dustin Tran and David Blei. The original OPVI paper.
(I’ve tried to include as many relevant and helpful resources as I could find, but if you feel like I’ve missed something, drop me a line!)

If that’s what you’re looking for, check out my Bayesian modelling cookbook or Michael Betancourt’s excellent essay on a principled Bayesian workflow.