Bayesetta Workshop, June 27-28, 2022

Bayesian Statistical Modeling Workshop for Rosetta-based Modeling

Registration open through May 20th

Workshop Goals

An integral part of protein structural modeling with Rosetta is evaluating the quality of predictions against experimental observations. The community effort to develop Scientific Benchmarks has been instrumental in tracking and measuring modeling progress. However, because of the complexity and uncertainty in both the model predictions and the biochemical data, it can be challenging for biochemical and computational researchers without a strong statistical background to rigorously test prediction accuracy for their specific modeling tasks. To improve the scientific quality of Rosetta-based modeling, we are holding a hands-on two-day workshop that teaches researchers in the Rosetta community practical best practices for analyzing their data with rigorous Bayesian statistical modeling.

Workshop Schedule Outline

On the first day, we will teach the basics of “tidy data” analysis in R and the BayesPharma analysis workflow built on Stan and brms. On the second day, we will guide participants as they work together to apply the statistical modeling methods to their own data analysis problems.

Day 1 Morning: tidy data analysis in R

Preparing the RStudio environment: installing and loading packages, R Markdown
tidyverse tools: pipe operator, data I/O, manipulating tables
Grammar of graphics plotting with ggplot2

Day 1 Afternoon: Bayesian model workflow

Bayesian theory: Priors + Data => Posterior
brms/Stan Bayesian regression modeling
Model evaluation using simulation-based calibration
Model interpretation and hypothesis testing
BayesPharma dose-response modeling case study

Day 2: Bring Your Own Data for hands-on analysis

Work together in pairs (on one partner's data in the morning, the other's in the afternoon)

Prerequisites

Basic familiarity with R packages, functions, and data structures
A Bring-Your-Own-Data regression task: an experimental or modeling data set from your own research to work on

Software

RStudio
RStan
C++ Toolchain for Windows, Mac, and Linux