Statistical Rethinking (2nd Ed) with Tensorflow Probability
This repository provides Jupyter notebooks that port various R code fragments found in the chapters of Statistical Rethinking 2nd Edition by Professor Richard McElreath to Python using the TensorFlow Probability (TFP) framework.
Note - These notebooks are based on the 8th December 2019 draft. I will update the notebooks once the book is released.
There are two main reasons why I chose to do this exercise in TFP.
For production use, I strongly recommend the higher-level libraries, i.e. NumPyro, PyMC3 or PyMC4.
What worked? Well, of course this book is the best there is in this area. The community is also great. I got quick responses from the TensorFlow Probability team whenever I asked questions on the TFP Google group.
What was hard? This may be a tad subjective, because I struggle with manipulating shapes (high-dimensional arrays). I find NumPy difficult, and TensorFlow is even harder when it comes to working with multi-dimensional arrays. This is one of the main problems I have faced and continue to face. Another problem is that the stack traces generated by TFP can be really difficult to understand. This is mostly a side effect of graphs, which make debugging difficult. Quite often things would work as long as I used only one chain, but working with multiple chains requires that you pay special attention to the shapes/batches of the various tensors and distributions.
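To make the single-chain versus multi-chain shape issue concrete, here is a minimal sketch. It uses plain NumPy rather than TFP, and the function `normal_log_prob` and the data are made up for illustration. TFP distributions follow the same NumPy-style broadcasting rules, so the same kind of fix (adding an axis to the per-chain parameter) tends to apply there as well.

```python
import numpy as np


def normal_log_prob(x, loc, scale):
    """Elementwise log density of Normal(loc, scale) at x (hypothetical helper)."""
    return -0.5 * np.log(2.0 * np.pi) - np.log(scale) - 0.5 * ((x - loc) / scale) ** 2


rng = np.random.default_rng(0)
data = rng.normal(size=50)  # observations, shape (50,)

# One chain: the parameter is a scalar, so broadcasting against the data is trivial.
mu_single = 0.1
lp_single = normal_log_prob(data, mu_single, 1.0).sum()  # scalar log-likelihood

# Four chains: one parameter value per chain, shape (4,).
mu_chains = np.array([-0.5, 0.0, 0.1, 0.5])

# Naive use fails: shapes (50,) and (4,) cannot be broadcast together.
try:
    normal_log_prob(data, mu_chains, 1.0)
    naive_failed = False
except ValueError:
    naive_failed = True

# Fix: give the chain axis its own dimension, (4, 1) against (50,) -> (4, 50),
# then sum over the data axis to get one log-likelihood per chain.
lp_chains = normal_log_prob(data, mu_chains[:, np.newaxis], 1.0).sum(axis=-1)

print(naive_failed, lp_chains.shape)
```

The third chain uses the same parameter value as the single-chain case, so its log-likelihood matches `lp_single`; only the bookkeeping of axes has changed.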
Visualization: I have made use of arviz, and in order to do that I converted the output of the various sampling procedures to the format/structure it requires. This made me learn and discover xarray. It was really worth doing and made it easy to plot the graphs.
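As a rough illustration of the kind of conversion involved (not the code used in the notebooks): `tfp.mcmc.sample_chain` returns samples with the draw axis first and the chain axis second, while arviz expects the chain axis first. The helper `to_chain_draw_layout` and the random "samples" below are hypothetical stand-ins, and only NumPy is used so the sketch runs without TFP or arviz installed.

```python
import numpy as np


def to_chain_draw_layout(samples):
    """Reorder a (draw, chain, ...) array into (chain, draw, ...)."""
    samples = np.asarray(samples)
    if samples.ndim < 2:
        raise ValueError("expected at least two leading axes: (draw, chain)")
    return np.swapaxes(samples, 0, 1)


num_draws, num_chains = 500, 4
rng = np.random.default_rng(1)

# Stand-ins for sampler output: draw axis first, chain axis second.
fake_samples = {
    "alpha": rng.normal(size=(num_draws, num_chains)),    # scalar parameter
    "beta": rng.normal(size=(num_draws, num_chains, 3)),  # vector parameter
}

# Dictionary of chain-first arrays, one entry per parameter.
posterior = {name: to_chain_draw_layout(value) for name, value in fake_samples.items()}

for name, value in posterior.items():
    print(name, value.shape)
```

A dictionary laid out like `posterior` is the sort of structure that can then be handed to arviz's dictionary-based converter, which wraps it in xarray datasets for plotting.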
There are a few code cells in various notebooks that are still not working. I do plan to investigate and fix/finish them. Chapter 14 in particular is not working. Any help is appreciated.
In the majority of the chapters, the book uses quadratic approximation (quap), whereas I have used HMC everywhere. I plan to change this as well by implementing the quadratic/Laplace approximation.
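For readers unfamiliar with the idea, here is a minimal sketch of what a quadratic (Laplace) approximation does: find the posterior mode, then use the curvature of the log posterior at the mode to get the standard deviation of a Gaussian approximation. It is pure Python, not TFP, and not the implementation planned for these notebooks. The model (6 successes in 9 binomial trials with a flat prior) and all function names are assumptions chosen for illustration.

```python
import math


def log_posterior(p, successes=6, failures=3):
    """Unnormalised log posterior: binomial likelihood with a flat prior on p."""
    return successes * math.log(p) + failures * math.log(1.0 - p)


def first_derivative(f, x, h=1e-5):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def laplace_approximation(f, x0, steps=50):
    """Return (mode, sd) of a Gaussian approximation to exp(f) around its mode."""
    x = x0
    for _ in range(steps):
        # Newton's method on the log posterior to locate the mode.
        x = x - first_derivative(f, x) / second_derivative(f, x)
    # The variance of the approximating Gaussian is -1 / f''(mode).
    sd = math.sqrt(-1.0 / second_derivative(f, x))
    return x, sd


mode, sd = laplace_approximation(log_posterior, x0=0.5)
print(f"{mode:.2f} {sd:.2f}")  # prints "0.67 0.16"
```

For this one-parameter case the curvature is a single number. With several parameters the same idea uses the Hessian matrix of the log posterior, and the approximation becomes a multivariate normal.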
If you prefer a read-only view of the notebooks (HTML pages), use this link - https://ksachdeva.github.io/rethinking-tensorflow-probability/
If you want to run the notebooks locally -
    # install the requirements
    pip install -r requirements.txt
    # install jupyter in your virtual environment
    pip install -r requirements-extra.txt
Clicking on the links will open the notebooks in Google Colab
My immense gratitude goes to Professor Richard McElreath for writing such a wonderful book. His method of teaching has made the somewhat difficult subject of Bayesian statistics approachable, interesting and, to some extent, fun as well. We need more educators like you, Sir!
Another person I want to thank is Du Phan (https://github.com/fehiepsi). He is the main author of NumPyro, a great framework for Bayesian analysis. He has ported Statistical Rethinking (2nd Ed) to NumPyro, and his notebooks were not only inspirational but also of great help to me in creating graphs. I borrowed most of his code fragments when it came to plotting the figures using matplotlib.