Materials from Stan conferences
This repository contains materials from StanCon and links to StanCon content hosted elsewhere. Unless otherwise noted, the text in this repository is distributed under the CC BY 4.0 License and code is distributed under the New BSD License. Copyright to the authors.
2020 Virtual Conference
2019 Cambridge (UK)
2018 Helsinki
2018 Asilomar
2017 NYC
StanCon’s version of conference proceedings is a collection of contributed talks based on interactive notebooks. Every submission is peer reviewed by at least two reviewers. The reviewers are members of the Stan Conference Organizing Committee and the Stan Developmemt Team. This repository contains all of the accepted notebooks as well as any supplementary materials required for building the notebooks. The slides presented at the conference are also included.
Solving ODEs in the wild: Scalable pharmacometrics with Stan
Pharmacometric modeling involves nonlinear hierarchical models, which are most naturally expressed as forced ordinary differential equations (ODEs). These class of models lead to a number of challenges which complicate a practical modeling work-flow in Stan mostly due to long model execution times. This contribution demonstrates at the example of the drug Warfarin how forced ODEs can be written efficiently in Stan leading to a doubling in model evaluation speed for the presented example. Finally, it is demonstrated how the new
map_rectfacility in Stan can be used to make models scalable to large data sets leading to substantial speedups in model evaluation time and most importantly this enables to scale Stan's performance as needed.
Links:
Analysis of repeated measures data in RStan
We illustrate the analysis of repeated measures data in the Bayesian framework using RStan. In addition to the modelling itself, we further show how to make inference on the primary effect based on a probability of success, and how to predict the longitudinal profile of a future patient, two difficult (if not impossible) tasks from a frequentist perspective.
Links:
Define custom response distributions with brms
The brms package (Bürkner, 2017a, 2017b) implements Bayesian regression models using the probabilistic programming language Stan (Carpenter et al., 2016) behind the scenes. It has grown to be one of the most flexible R packages when it comes to multilevel regression modelling. A wide range of response distributions are supported, allowing users to fit -- among others -- linear, robust linear, count data, survival, response times, ordinal, zero-inflated, hurdle, and even self-defined mixture models all in a multilevel context. Predictor terms can be specified with a simple yet powerful formula syntax. Thanks to Stan, even very complex models tend to converge well in a reasonable amount of time. While brms comes with a lot of built-in response distributions (usually called families in R), there is still a long list of distributions which are not natively supported. The present notebook will explain how to specify such custom families in brms. By doing that, users can benefit from the modeling flexibility and post-processing options of brms even when applying self-defined response distributions.
Links:
Are shots predictive of soccer results?
Predicting the outcome of a soccer match is the subject of much debate, and several models based on different assumptions have been proposed for modeling the numbers of goals scored by two competing teams. In this case study we adopt a different perspective and propose a Bayesian hierarchical model consisting of three nested multiple outcomes: number of scores, number of shots on target and number of total shots. We model the number of scores and the number of shots on target with two binomial distributions respectively, whereas the total shots follow a negative binomial distributon. Our dataset consists of nine seasons of the English Premier League (EPL): eight seasons---from 2008/2009 to 2015/2016, 3040 matches---represent the train set, whereas the nineth season, 2016/2017 (with the remaining 380 matches), is our test set, used for out-of sample prediction.
Links:
Flexible models of holiday lift
We develop a flexible, portable, and interpretable Bayesian model of cyclical holiday effects on time series. Our model uses five parameters for each possible holiday that capture the general shape, magnitude, and peak location offset of each holiday effect. Choice of priors prevents the model from overfitting while still achieving considerable flexibility. We experiment on simulated and real data from Google Trends and demonstrate the model's performance on held-out data.
Links:
Artificial turf advantage and predictive accuracy in Dutch football
This submission uses Stan to learn about the so-called artificial turf advantage in Dutch football. Two model variants are used to model match outcomes. One of the models is the model from Milad Kharratzadeh presented at Stancon 2017, the other is new to Stan (a dynamic Skellam model). I use out-of-sample forecasts together with the Ranked Probability Score (a proper scoring rule) to learn whether including the artificial turf advantage increases predictive accuracy. Bookmakers odds are used as a benchmark to check the quality of our forecasts.
Links:
ODE model of gene regulation
In this notebook we fit time series of gene expression data with a non-linear ODE-based model. Splines are used to model noisily observed regulator expression. The ODE is not solved explicitly, it is instead transformed to a definite integral and integrated via the trapezoid rule. Some interesting reparametrizations are introduced to make the model well identified.
Links:
Predicting New York City school enrollment
We propose a Bayesian hierarchical Age-Period-Cohort model to predict elementary school enrollment in New York City. We demonstrate this model using student enrollment data for grades K-5 in each Census Tract of Brooklyn's 20th School District over the 2001-02 to 2010-11 school years. Specifically, our model disaggregates enrollment into grade (age), year (period), and cohort effects so that each can be interpreted and extrapolated over the 2011-12 to 2017-18 school years. We find this approach ideal for incorporating spatial information indicative of the socioeconomic forces that determine school enrollment in New York City.
Links:
Analyzing brain taxonomy trees
This talk and notebook introduce the idea of analyzing brain anatomy as a taxonomy of structures, defined by containment, using Stan. This taxonomy imposes dependence between model coefficients for structures and their enclosing structure. The slides and notebook compare this approach to other hierarchical modelling strategies using the mouse brain for illustration.
Links:
Getting more out of Stan: some ideas from the Haskell bindings
We present draft bindings to Stan in Haskell, a purely functional programming language. Unlike in most bindings, our models are encoded as a data type the host language. We show how this can be used to widen the range of computations that can be done based on the Stan model definition. For instance, predictions, posterior predictive checks and residual calculations can be done based on a single model definition.
Links:
Dose-finding clinical trial designs in Stan with trialr
My notebook illustrates two different methods for conducting dose-finding clinical trials. The first, CRM, escalates dose according to toxicity outcomes only, assuming implicitly that higher doses are more likely to benefit the patient. The second, EffTox, escalates dose according to efficacy and toxicity outcomes, thus addressing the potential that higher may not always mean better. These models are implemented in the trialr R-package using Stan.
Links:
"The implementation of a model of choice: the (truncated) linear ballistic accumulator.
It is very common in cognitive science and psychology to use experimental tasks that involve making a fast choice among a restricted number of alternatives. In this notebook, I focus on one influential and relatively simple model that belongs to the class of sequential-sampling models: the linear ballistic accumulator with a drift rate drawn from a normal distribution (restricted to positive values) (S. D. Brown and Heathcote 2008; Heathcote and Love 2012). First, I discuss the motivation for fitting this model using the Stroop task (Stroop 1935) as a case study. Then, I discuss the challenges in the implementation of the model in (R)Stan (Stan Development Team 2017), which might also apply to other hierarchical models with complex likelihood functions. Finally, I show some results that exemplify how the linear ballistic accumulator can be used for examining individual differences.
Links:
Hierarchical Ornstein-Uhlenbeck type t-processes
This work investigates probabilistic time series models that are motivated by applications in statistical ecology. In particular, we investigate variants of the mean-reverting and stochastic Ornstein-Uhlenbeck (OU) process. We provide a hierarchical extension for joint analysis of multiple (short) time series, validate the model, and analyze its performance with simulations. The works extends the recent Stan implementation of the OU process (A 2018), where parameter estimates of a Student-t type OU process are obtained based on a single (long) time series. We have added a level of hierarchy, which allows joint inference of the model parameters across multiple time series.
Links:
GPU optimized math routines in the Stan math library
The Stan Math library's Hamilton Monte Carlo (HMC) sampler has computationally expensive draws while usually searching the target distribution more efficiently than alternative MCMC methods with fewer iterations. The bottleneck within draws makes Stan a prime candidate for GPU optimizations within samples. This project implements GPU optimizations for the Cholesky decomposition and it's derivative in the Stan Math library [@stanmath2015]. This work is the first known open source implementation of the Cholesky decomposition with a GPU in an HMC setting. Furthermore, the GPU kernels use OpenCL which allows the use of these methods across any brand of GPU. While results show that GPU optimizations are not optimal for small $N\times M$ matrices, large matrices can see speedups of 7.8x while retaining the same precision as models run purely on a CPU.
Links:
Relating Disparate Measures of Coagulapathy Using Unorthodox Data: A Hybrid Mechanistic-Statistical Approach
Links:
Modeling the Effects of Nutrition with Mixed-Effect Bayesian Network
This work proposes Mixed-Effect Bayesian Network (MEBN) as a method for modeling the effects of nutrition. It allows identifying both typical and personal correlations between nutrients and their bodily responses. Predicting a personal network of nutritional reactions would allow interesting applications at personal diets and in understanding this complex system. Brief theory of MEBN is first given, followed by the implementation in R and Stan. A real life dataset from a nutritional study (Sysdimet) is then analyzed with this method and the results are visualized with a responsive JavaScript-visualization.
Links:
Using counterfactual queries to improve models for decision-support
In this extended abstract, we generalize active learning to tasks where a human has to choose which action a to take for a target after observing its covariates x ̃ and predicted outcomes p( ̃y|x, a ̃ ). An example case is personalized medicine and the decision of which treatment to give to a pa- tient. We show that standard active learning, which is not aware of the final task, would be very inefficient, and we introduce a new problem of decision-making-aware active learning. We for- mulate the problem as finding the query with the highest information gain for the specific decision- making task, assuming a rational decision-maker. The problem can be solved particularly efficiently assuming an expert able to answer queries about counterfactuals. We demonstrate the effective- ness of the proposed method in a binary outcome decision-making task using simulated data, and in a continuous-valued outcome task on the medical dataset IHDP with synthetic treatment outcomes. The outcomes are predicted using Gaussian processes.
Links:
Hierarchical modelling of galaxy clusters for cosmology
Bad data, big models & statistical methods for studying evolution
Identifying the effect of public holidays on daily demand for gas
Esther Williams in the Harold Holt Memorial Swimming Pool
Basics of Bayesian inference and Stan
Hierarchical models
Stan C++ development: adding a new function to Stan
Ordinary differential equation (ODE) models in Stan
Productization of Stan
Instructor: Eric Novik Productization panel discussion: Markus Ojala (Smartly), Tom Nielsen (Tweag.io), Anna Kircher (Lendable), Eric Novik (Generable) * Video
Model assessment and selection
StanCon’s version of conference proceedings is a collection of contributed talks based on interactive notebooks. Every submission is peer reviewed by at least two reviewers. The reviewers are members of the Stan Conference Organizing Committee and the Stan Developmemt Team. This repository contains all of the accepted notebooks as well as any supplementary materials required for building the notebooks. The slides presented at the conference are also included.
Does the New York City Police Department rely on quotas?
This submission investigates whether the New York City Police Department (NYPD) uses productivity targets or quotas to manage officers in contravention of New York State Law. The analysis is presented in three parts. First, the NYPD's employee evaluation system is introduced, and the criticism that it constitutes a quota is summarized. Secondly, a publically available dataset of traffic tickets issued by NYPD officers in 2014 and 2015 is described. Finally, a generative model to describe how officers write traffic tickets is proposed. The fitted model is consistent with the criticism that police officers substantially alter their ticket writing to coincide with departmental targets. The submission concludes by discussing the implication of these findings and offering directions for further research.
Links:
Diagnosing Alzheimer’s the Bayesian way
Alzheimer's Disease is one the most debilitating diseases, but how do we diagnose it accurately? Researchers have been trying to answer this question by building generative models to describe how patient biomarkers, such as MRI scans, psychological tests, and lab tests relate over time to the underlying brain deterioration that's present in Alzheimer's Disease. In this notebook we show how we translated these models to the Bayesian framework in Stan and how this allowed for several model improvements that can ultimately improve our understanding of Alzheimer's and help physicians in diagnosis. In particular, we describe how we hierarchically model patient disease trajectories to obtain stable estimates for patients who lack data. We describe how fitting in Stan yields uncertainties on these disease trajectories, and why that is important for weighing the pros and cons of risky treatment. Lastly, we describe a new method for Bayesian modeling of these monotonic disease trajectories in Stan using I-Splines.
Links:
Joint longitudinal and time-to-event models via Stan
The joint modelling of longitudinal and time-to-event data has received much attention in the biostatistical literature in recent years. In this notebook (and talk), we describe the implementation of a shared parameter joint model for longitudinal and time-to-event data in Stan. The methods described in the notebook are a simplified version of those underpinning the
stan_jmmodeling function that has recently been contributed to the rstanarm R package.
Links:
A tutorial on Hidden Markov Models using Stan
We implement a standard Hidden Markov Model (HMM) and the Input-Output Hidden Markov Model for unsupervised learning of time series dynamics in Stan. We begin by reviewing three commonly-used algorithms for inference and parameter estimation, as well as a number of computational techniques and modeling strategies that make full Bayesian inference practical. For both models, we demonstrate the effectiveness of our proposed approach in simulations. Finally, we give an example of embedding a HMM within a larger model using an example from the econometrics literature.
Links:
Student Ornstein-Uhlenbeck models served three ways (with applications for population dynamics data)
Ornstein-Uhlenbeck (OU) processes are a mean reverting process and is used to model dynamics in biology, physics, and finance. I fit an extension of the OU process that is driven by a Lévy process with Student's t-marginals rather than Brownian motion with Gaussian marginals, which allows for heavy-tailed increments. I implement four formulations of the Student-t OU-type model in Stan and compare the sampling performance on both real and simulated population dynamic data.
Links:
SlicStan: a blockless Stan-like language
We present SlicStan — a probabilistic programming language that compiles to Stan and uses static analysis techniques to allow for more abstract and flexible models. SlicStan is novel in two ways: (1) it allows variable declarations and statements to be automatically shredded into different components needed for efficient Hamiltonian Monte Carlo inference, and (2) it introduces more flexible user-defined functions that allow for new model parameters to be declared as local variables. This work demonstrates that efficient automatic inference can be the result of the machine learning and programming languages communities joint efforts.
Links:
idealstan: an R package for ideal point modeling with Stan
Item-response theory (IRT) ideal-point scaling/dimension reduction methods that incorporate additional response categories and missing/censored values, including absences and abstentions, for roll call voting data (or any other kind of binary or ordinal item-response theory data). Full and approximate Bayesian inference is done via Stan.
Links:
Computing steady states with Stan’s nonlinear algebraic solver
Stan’s numerical algebraic solver can be used to solve systems of nonlinear algebraic equations with no closed form solutions. One of its key applications in scientific and engineering fields is the computation of equilibrium states (equivalently steady states). This case study illustrates the use of the algebraic solver by applying it to a problem in pharmacometrics. In particular, I show the algebraic system we solve can be quite complex and embed, for instance, numerical solutions to ordinary differential equations. The code in R and Stan are provided, and a Bayesian model is fitted to simulated data.
Links:
Bayesian estimation of mechanical elastic constants
This outlines a Bayesian approach to resonance ultrasound spectroscopy (RUS), a technique for estimating elastic constants of a material from a sample's measured resonance modes. The notebook includes an example of how to take advantage of custom automatic differentiation in specialized Stan models (either for numerical or efficiency reasons).
Links:
Aggregate random coefficients logit — a generative approach
This notebook illustrates how to fit aggregate random coefficient logit models in Stan, using Bayesian techniques. It’s far easier to learn and implement than the standard BLP algorithm, and has the benefits of being robust to mismeasurement of market shares, and giving limited-sample posterior uncertainty of all parameters (and demand shocks). This comes at the cost of modeling firms’ price-setting process, including how unobserved product-market demand shocks affect prices.
Links:
The threshold test: Testing for racial bias in vehicle searches by police
We develop a new statistical test to detect bias in decision making — the threshold test—that mitigates the problem of infra-marginality by jointly estimating decision thresholds and risk distributions.
Links:
Assessing the safety of Rosiglitazone for the treatment of type II diabetes
A Bayesian paradigm for making drug approval decisions. Case study in the treatment of Diabetes (Type 2).
Links:
Causal inference with the g-formula in Stan
The potential outcomes framework often uses one or more parametric outcome models to learn about underlying causal processes. In Stan, parameter estimation using observed data takes place in the model block, while simulation-based estimation of causal parameters using the g-formula can be done separately with generated quantities. Bayesian estimation allows for data-driven sensitivity analysis regarding the assumption of no unmeasured confounding. This presentation shows some simple causal models, then outlines a basic sensitivity analysis using prior information derived from an external data source.
Links:
Bayesian estimation of ETAS models with Rstan
Earthquake modeling with Stan. Applied to seismic recurrence in Ecuador in 2016.
Links:
Predictive information criteria in hierarchical Bayesian models for clustered data
ScalaStan
Stan applications in physics: Testing quantum mechanics and modeling neutrino masses
Forecasting at scale: How and why we developed Prophet for forecasting at Facebook
Stan applications in human genetics: Prioritizing genetic mutations that protect individuals from human disease
Statistics using geometry to show uncertainties and integrate graph information
A brief history of Stan
Model assessment, model selection and inference after model selection
Spatial models in Stan: intrinsic auto-regressive models for areal data
Some problems I'd like to solve in Stan, and what we'll need to do to get there
StanCon’s version of conference proceedings is a collection of contributed talks based on interactive notebooks. Every submission is peer reviewed by at least two reviewers. The reviewers are members of the Stan Conference Organizing Committee and the Stan Developmemt Team. This repository contains all of the accepted notebooks as well as any supplementary materials required for building the notebooks. The slides presented at the conference are also included.
Twelve Cities: Does lowering speed limits save pedestrian lives?
We investigate whether American cities can expect to achieve a meaningful reduction in pedestrian deaths by lowering the posted speed limit. We find some evidence that a lower speed limit does in fact reduce fatality rates, and our estimated causal effect is similar to the traditional before-after analysis espoused by policy analysts. Nevertheless, we conclude that adjusting the posted speed limit in urban environments does not correspond with a reliable reduction in pedestrian fatalities.
Links:
Hierarchical Bayesian Modeling of the English Premier League
In this case study, we provide a hierarchical Bayesian model for the English Premier League in the season of 2015/2016. The league consists of 20 teams and each two teams play two games with each other (home and away games). So, in total, there are 38 weeks, and 380 games. We model the score difference (home team goals − away team goals) in each match. The main parameters of the model are the teams’ abilities which is assumed to vary over the course of the 38 weeks. The initial abilities are determined by performance in the previous season plus some variation.
Links:
Advertising Attribution Modeling in the Movie Industry
We present a Bayesian method for inferring advertising platform effectiveness as applied to the movie industry, and show some possibilities for drawing inferences by analyzing model parameters at different levels of the hierarchy. In addition, we show some common ways to check model efficacy, and possibilities for comparing between different models.
Links:
hBayesDM: Hierarchical Bayesian modeling of decision-making tasks
hBayesDM (hierarchical Bayesian modeling of Decision-Making tasks) is a user-friendly R package that offers hierarchical Bayesian analysis of various computational models on an array of decision-making tasks.
Links:
Differential Equation Based Models in Stan
Differential equations can help us model sophisticated processes in biology, physics, and many other fields. Over the past year, the Stan team has developed many tools to tackle models based on differential equations.
Links:
How to Test IRT Models Using Simulated Data
This notebook explains how to code some IRT models using Stan and test whether they can recover input parameters when given simulated data.
Links:
Models of Retrieval in Sentence Comprehension
This work presents an evaluation of two well-known models of retrieval processes in sentence comprehension, the activation-based model and the direct-access model. We implemented these models in a Bayesian hierarchical framework and showed that some aspects of the data can be explained better by the direct access model. Specifically, the activation-based cannot predict that, on average, incorrect retrievals would be faster than correct ones. More generally, our work leverages the capabilities of Stan to provide a powerful framework for flexibly developing computational models of competing theories of retrieval, and demonstrates how these models’ predictions can be compared in a Bayesian setting.
Links:
Hierarchical Gaussian Processes in Stan
Stan’s library has been expanded with functions that facilitate adding Gaussian processes (GPs) to Stan models. I will share the best practices for coding GPs in Stan, and demonstrate how GPs can be added as one component of a larger model.
Links:
Modeling the Rate of Public Mass Shootings with Gaussian Processes
We have used Stan to develop a new model for the annualized rate of public mass shootings in the United States based on a Gaussian process with a time-varying mean function. This design yields a predictive model with the full non-parametric flexibility of a Gaussian process, while retaining the direct interpretability of a parametric model for long-term evolution of the mass shooting rate. We apply this model to the Mother Jones database of public mass shootings and explore the posterior consequences of different prior choices and of correlations between hyperparameters. We reach conclusions about the long term evolution of the rate of public mass shootings in the United States and short-term periods deviating from this trend.
Links: