
Doing Bayesian Data Analysis

A Tutorial Introduction with R and BUGS

There is an explosion of interest in Bayesian statistics, primarily because recently created computational methods have finally made Bayesian analysis tractable and accessible to a wide audience. Doing Bayesian Data Analysis: A Tutorial Introduction with R and BUGS is written for first-year graduate students and advanced undergraduates. It takes an accessible approach: all mathematics is explained intuitively and with concrete examples, and it assumes only algebra and ‘rusty’ calculus. Unlike other textbooks, this book begins with the basics, including essential concepts of probability and random sampling, and gradually climbs all the way to advanced hierarchical modeling methods for realistic data. The text provides complete examples with the R programming language and BUGS software (both freeware). It begins with basic programming examples and works up gradually to complete programs for complex analyses and presentation graphics. These templates can be easily adapted by a wide variety of students to their own research needs. The textbook bridges students from their undergraduate training into modern Bayesian methods.

First-year Graduate Students and Advanced Undergraduate Students in Statistics, Psychology, Cognitive Science, Social Sciences, Clinical Sciences and Consumer Sciences in Business.

Hardbound, 672 Pages

Published: October 2010

Imprint: Academic Press

ISBN: 978-0-12-381485-2


  • “I think it fills a gaping hole in what is currently available, and will serve to create its own market as researchers and their students transition towards the routine application of Bayesian statistical methods.” -Prof. Michael Lee, University of California, Irvine, and president of the Society for Mathematical Psychology

    “Kruschke’s text covers a much broader range of traditional experimental designs…has the potential to change the way most cognitive scientists and experimental psychologists approach the planning and analysis of their experiments” -Prof. Geoffrey Iverson, University of California, Irvine, and past president of the Society for Mathematical Psychology

    “John Kruschke has written a book on Statistics. It’s better than others for reasons stylistic. It also is better because it is Bayesian. To find out why, buy it -- it’s truly amazin’!” -James L. (Jay) McClelland, Lucie Stern Professor & Chair, Dept. of Psychology, Stanford University

    "In a December article in The New Yorker, Jonah Lehrer pointed out that some phenomena in the psychology literature are not always repeatable. One reason for this failure to replicate results comes from the kinds of statistics often used in Psychology. We use a procedure called Null Hypothesis Testing that was developed over 100 years ago. More recently, statisticians and psychologists have been working to create a new form of statistical testing based on Bayesian statistics. These methods may help us to avoid publishing studies that are not likely to replicate. John Kruschke published a nice tutorial on how to use these methods." -2010’s top ten advances in psychology on Psychology Today’s blog

    "The intended audience for this book is a first-year graduate student or advanced undergraduate in the social or biological sciences, but one whose mathematical background is sufficient for them to not be put off by occasional references to calculus… Kruschke also provides a comprehensive solution manual for the exercises in each chapter. He says he has worked on his book for six years and it shows, not least because it has few typographical errors and is well-presented. In summary, this book has several features that could make it preferable to its competitors…it is impressive that Kruschke is able to quickly bring readers up to speed on techniques such as robust regression and repeated-measures regression that would be considered ‘‘advanced’’ in the conventional NHST curriculum. His extensions from linear regression to logistic, ordinal probit and Poisson regression are very clearly articulated and will outfit students with a very adaptable statistical toolbox… This is the best introductory textbook on Bayesian MCMC techniques I have read, and the most suitable for psychology students. It fills a gap I described in my recent review of six other introductory Bayesian method texts (Smithson, 2010). I look forward to using it in my own teaching, and I recommend it to anyone wishing to introduce graduate or advanced undergraduate students to the emerging Bayesian revolution."--Journal of Mathematical Psychology

    "In sum, this is a new kind of textbook to teach a kind of statistical analysis that will be new to its audience. It uses a tutorial approach and instills in its students the tools of the trade: coding, debugging, simulating, and plotting. Though some will surely look down on its folksy tone, its extended analogies and cautious commenting, these measures will probably do much more good than harm. The text has the potential to change the methodological toolbox of a new generation of social scientists, bringing them up to a level of computation, modeling, and analysis that they might not have thought to be within their grasp. Where past approaches to teaching statistics to those in psychology and economics have not led to widespread insight, this tutorial approach might."--Journal of Economic Psychology

    "I would describe this book as revolutionary, at least in the context of psychology. It is, to my knowledge, the first book of its kind in this field to provide a general introduction to exclusively Bayesian statistical methods. In addition, it does so almost entirely by way of Monte Carlo simulation methods. While reasonable minds may disagree, it is arguable that both the general Bayesian framework advocated here, and the heavy use of Monte Carlo simulations, are destined to be the future of all data-analysis, whether in psychology or elsewhere…the ideas and methods presented here will eventually be seen as the foundations for new approaches to statistics that will become commonplace in the near future."--British Journal of Mathematical and Statistical Psychology

    "There are quite a few books on Bayesian statistics, but what makes Doing Bayesian Data Analysis: A Tutorial With R and BUGS stand out for me is the author’s focus of the book: writing for real people with real data. From the very first chapter, the engaging writing style will get readers excited about this topic, a comment one can rarely make about statistical books. Clearly a master teacher, the author, John Kruschke, uses plain language to explain complex ideas and concepts. A comprehensive website is associated with the book and provides program codes, examples, data, and solutions to the exercises. If the book is used to teach a statistics course, this set of materials will be necessary and helpful for students as they go through the materials in the book step by step."--PsycCritiques


  • 1.) This Book’s Organization: Read Me First!

    1.1 Real People Can Read This Book

    1.2 Prerequisites

    1.3 The Organization of This Book

    1.3.1 What Are the Essential Chapters?

    1.3.2 Where’s the Equivalent of Traditional Test X in This Book?

    1.4 Gimme Feedback (Be Polite)

    1.5 Acknowledgments

    Part 1.) The Basics: Parameters, Probability, Bayes’ Rule, and R

    2.) Introduction: Models We Believe In

    2.1 Models of Observations and Models of Beliefs

    2.1.1 Prior and Posterior Beliefs

    2.2 Three Goals for Inference from Data

    2.2.1 Estimation of Parameter Values

    2.2.2 Prediction of Data Values

    2.2.3 Model Comparison

    2.3 The R Programming Language

    2.3.1 Getting and Installing R

    2.3.2 Invoking R and Using the Command Line

    2.3.3 A Simple Example of R in Action

    2.3.4 Getting Help in R

    2.3.5 Programming in R

    2.4 Exercises

    3.) What Is This Stuff Called Probability?

    3.1 The Set of All Possible Events

    3.1.1 Coin Flips: Why You Should Care

    3.2 Probability: Outside or Inside the Head

    3.2.1 Outside the Head: Long-Run Relative Frequency

    3.2.2 Inside the Head: Subjective Belief

    3.2.3 Probabilities Assign Numbers to Possibilities

    3.3 Probability Distributions

    3.3.1 Discrete Distributions: Probability Mass

    3.3.2 Continuous Distributions: Rendezvous with Density

    3.3.3 Mean and Variance of a Distribution

    3.3.4 Variance as Uncertainty in Beliefs

    3.3.5 Highest Density Interval (HDI)

    3.4 Two-Way Distributions

    3.4.1 Marginal Probability

    3.4.2 Conditional Probability

    3.4.3 Independence of Attributes

    3.5 R Code

    3.5.1 R Code for Figure 3.1

    3.5.2 R Code for Figure 3.3

    3.6 Exercises

    4.) Bayes’ Rule

    4.1 Bayes’ Rule

    4.1.1 Derived from Definitions of Conditional Probability

    4.1.2 Intuited from a Two-Way Discrete Table

    4.1.3 The Denominator as an Integral over Continuous Values

    4.2 Applied to Models and Data

    4.2.1 Data Order Invariance

    4.2.2 An Example with Coin Flipping

    4.3 The Three Goals of Inference

    4.3.1 Estimation of Parameter Values

    4.3.2 Prediction of Data Values

    4.3.3 Model Comparison

    4.3.4 Why Bayesian Inference Can Be Difficult

    4.3.5 Bayesian Reasoning in Everyday Life

    4.4 R Code

    4.4.1 R Code for Figure 4.1

    4.5 Exercises

    Part 2.) All the Fundamentals Applied to Inferring a Binomial Proportion

    5.) Inferring a Binomial Proportion via Exact Mathematical Analysis

    5.1 The Likelihood Function: Bernoulli Distribution

    5.2 A Description of Beliefs: The Beta Distribution

    5.2.1 Specifying a Beta Prior

    5.2.2 The Posterior Beta

    5.3 Three Inferential Goals

    5.3.1 Estimating the Binomial Proportion

    5.3.2 Predicting Data

    5.3.3 Model Comparison

    5.4 Summary: How to Do Bayesian Inference

    5.5 R Code

    5.5.1 R Code for Figure 5.2

    5.6 Exercises

    6.) Inferring a Binomial Proportion via Grid Approximation

    6.1 Bayes’ Rule for Discrete Values of θ

    6.2 Discretizing a Continuous Prior Density

    6.2.1 Examples Using Discretized Priors

    6.3 Estimation

    6.4 Prediction of Subsequent Data

    6.5 Model Comparison

    6.6 Summary

    6.7 R Code

    6.7.1 R Code for Figure 6.2 and the Like

    6.8 Exercises

    7.) Inferring a Binomial Proportion via the Metropolis Algorithm

    7.1 A Simple Case of the Metropolis Algorithm

    7.1.1 A Politician Stumbles on the Metropolis Algorithm

    7.1.2 A Random Walk

    7.1.3 General Properties of a Random Walk

    7.1.4 Why We Care

    7.1.5 Why It Works

    7.2 The Metropolis Algorithm More Generally

    7.2.1 "Burn-in," Efficiency, and Convergence

    7.2.2 Terminology: Markov Chain Monte Carlo

    7.3 From the Sampled Posterior to the Three Goals

    7.3.1 Estimation

    7.3.2 Prediction

    7.3.3 Model Comparison: Estimation of p(D)

    7.4 MCMC in BUGS

    7.4.1 Parameter Estimation with BUGS

    7.4.2 BUGS for Prediction

    7.4.3 BUGS for Model Comparison

    7.5 Conclusion

    7.6 R Code

    7.6.1 R Code for a Home-Grown Metropolis Algorithm

    7.7 Exercises

    8.) Inferring Two Binomial Proportions via Gibbs Sampling

    8.1 Prior, Likelihood, and Posterior for Two Proportions

    8.2 The Posterior via Exact Formal Analysis

    8.3 The Posterior via Grid Approximation

    8.4 The Posterior via Markov Chain Monte Carlo

    8.4.1 Metropolis Algorithm

    8.4.2 Gibbs Sampling

    8.5 Doing It with BUGS

    8.5.1 Sampling the Prior in BUGS

    8.6 How Different Are the Underlying Biases?

    8.7 Summary

    8.8 R Code

    8.8.1 R Code for Grid Approximation (Figures 8. and 8.2)

    8.8.2 R Code for Metropolis Sampler (Figure 8.3)

    8.8.3 R Code for BUGS Sampler (Figure 8.6)

    8.8.4 R Code for Plotting a Posterior Histogram

    8.9 Exercises

    9.) Bernoulli Likelihood with Hierarchical Prior

    9.1 A Single Coin from a Single Mint

    9.2 Multiple Coins from a Single Mint

    9.2.1 Posterior via Grid Approximation

    9.2.2 Posterior via Monte Carlo Sampling

    9.2.3 Outliers and Shrinkage of Individual Estimates

    9.2.4 Case Study: Therapeutic Touch

    9.2.5 Number of Coins and Flips per Coin

    9.3 Multiple Coins from Multiple Mints

    9.3.1 Independent Mints

    9.3.2 Dependent Mints

    9.3.3 Individual Differences and Meta-Analysis

    9.4 Summary

    9.5 R Code

    9.5.1 Code for Analysis of Therapeutic-Touch Experiment

    9.5.2 Code for Analysis of Filtration-Condensation Experiment

    9.6 Exercises

    10.) Hierarchical Modeling and Model Comparison

    10.1 Model Comparison as Hierarchical Modeling

    10.2 Model Comparison in BUGS

    10.2.1 A Simple Example

    10.2.2 A Realistic Example with "Pseudopriors"

    10.2.3 Some Practical Advice When Using Transdimensional MCMC with Pseudopriors

    10.3 Model Comparison and Nested Models

    10.4 Review of Hierarchical Framework for Model Comparison

    10.4.1 Comparing Methods for MCMC Model Comparison

    10.4.2 Summary and Caveats

    10.5 Exercises

    11.) Null Hypothesis Significance Testing

    11.1 NHST for the Bias of a Coin

    11.1.1 When the Experimenter Intends to Fix N

    11.1.2 When the Experimenter Intends to Fix z

    11.1.3 Soul Searching

    11.1.4 Bayesian Analysis

    11.2 Prior Knowledge about the Coin

    11.2.1 NHST Analysis

    11.2.2 Bayesian Analysis

    11.3 Confidence Interval and Highest Density Interval

    11.3.1 NHST Confidence Interval

    11.3.2 Bayesian HDI

    11.4 Multiple Comparisons

    11.4.1 NHST Correction for Experimentwise Error

    11.4.2 Just One Bayesian Posterior No Matter How You Look at It

    11.4.3 How Bayesian Analysis Mitigates False Alarms

    11.5 What a Sampling Distribution Is Good For

    11.5.1 Planning an Experiment

    11.5.2 Exploring Model Predictions (Posterior Predictive Check)

    11.6 Exercises

    12.) Bayesian Approaches to Testing a Point ("Null") Hypothesis

    12.1 The Estimation (Single Prior) Approach

    12.1.1 Is a Null Value of a Parameter among the Credible Values?

    12.1.2 Is a Null Value of a Difference among the Credible Values?

    12.1.3 Region of Practical Equivalence (ROPE)

    12.2 The Model-Comparison (Two-Prior) Approach

    12.2.1 Are the Biases of Two Coins Equal?

    12.2.2 Are Different Groups Equal?

    12.3 Estimation or Model Comparison?

    12.3.1 What Is the Probability That the Null Value Is True?

    12.3.2 Recommendations

    12.4 R Code

    12.4.1 R Code for Figure 12.5

    12.5 Exercises

    13.) Goals, Power, and Sample Size

    13.1 The Will to Power

    13.1.1 Goals and Obstacles

    13.1.2 Power

    13.1.3 Sample Size

    13.1.4 Other Expressions of Goals

    13.2 Sample Size for a Single Coin

    13.2.1 When the Goal Is to Exclude a Null Value

    13.2.2 When the Goal Is Precision

    13.3 Sample Size for Multiple Mints

    13.4 Power: Prospective, Retrospective, and Replication

    13.4.1 Power Analysis Requires Verisimilitude of Simulated Data

    13.5 The Importance of Planning

    13.6 R Code

    13.6.1 Sample Size for a Single Coin

    13.6.2 Power and Sample Size for Multiple Mints

    13.7 Exercises

    Part 3.) Applied to the Generalized Linear Model

    14.) Overview of the Generalized Linear Model

    14.1 The Generalized Linear Model (GLM)

    14.1.2 Scale Types: Metric, Ordinal, Nominal

    14.1.3 Linear Function of a Single Metric Predictor

    14.1.4 Additive Combination of Metric Predictors

    14.1.5 Nonadditive Interaction of Metric Predictors

    14.1.6 Nominal Predictors

    14.1.7 Linking Combined Predictors to the Predicted

    14.1.8 Probabilistic Prediction

    14.1.9 Formal Expression of the GLM

    14.1.10 Two or More Nominal Variables Predicting Frequency

    14.2 Cases of the GLM

    14.3 Exercises

    15.) Metric Predicted Variable on a Single Group

    15.1 Estimating the Mean and Precision of a Normal Likelihood

    15.1.1 Solution by Mathematical Analysis

    15.1.2 Approximation by MCMC in BUGS

    15.1.3 Outliers and Robust Estimation: The t Distribution

    15.1.4 When the Data Are Non-normal: Transformations

    15.2 Repeated Measures and Individual Differences

    15.2.1 Hierarchical Model

    15.2.2 Implementation in BUGS

    15.3 Summary

    15.4 R Code

    15.4.1 Estimating the Mean and Precision of a Normal Likelihood

    15.4.2 Repeated Measures: Normal Across and Normal Within

    15.5 Exercises

    16.) Metric Predicted Variable with One Metric Predictor

    16.1 Simple Linear Regression

    16.1.1 The Hierarchical Model and BUGS Code

    16.1.2 The Posterior: How Big Is the Slope?

    16.1.3 Posterior Prediction

    16.2 Outliers and Robust Regression

    16.3 Simple Linear Regression with Repeated Measures

    16.4 Summary

    16.5 R Code

    16.5.1 Data Generator for Height and Weight

    16.5.2 BRugs: Robust Linear Regression

    16.5.3 BRugs: Simple Linear Regression with Repeated Measures

    16.6 Exercises

    17.) Metric Predicted Variable with Multiple Metric Predictors

    17.1 Multiple Linear Regression

    17.1.1 The Perils of Correlated Predictors

    17.1.2 The Model and BUGS Program

    17.1.3 The Posterior: How Big Are the Slopes?

    17.1.4 Posterior Prediction

    17.2 Hyperpriors and Shrinkage of Regression Coefficients

    17.2.1 Informative Priors, Sparse Data, and Correlated Predictors

    17.3 Multiplicative Interaction of Metric Predictors

    17.3.1 The Hierarchical Model and BUGS Code

    17.3.2 Interpreting the Posterior

    17.4 Which Predictors Should Be Included?

    17.5 R Code

    17.5.1 Multiple Linear Regression

    17.5.2 Multiple Linear Regression with Hyperprior on Coefficients

    17.6 Exercises

    18.) Metric Predicted Variable with One Nominal Predictor

    18.1 Bayesian Oneway ANOVA

    18.1.1 The Hierarchical Prior

    18.1.2 Doing It with R and BUGS

    18.1.3 A Worked Example

    18.2 Multiple Comparisons

    18.3 Two-Group Bayesian ANOVA and the NHST t Test

    18.4 R Code

    18.4.1 Bayesian Oneway ANOVA

    18.5 Exercises

    19.) Metric Predicted Variable with Multiple Nominal Predictors

    19.1 Bayesian Multifactor ANOVA

    19.1.2 The Hierarchical Prior

    19.1.3 An Example in R and BUGS

    19.1.4 Interpreting the Posterior

    19.1.5 Noncrossover Interactions, Rescaling, and Homogeneous Variances

    19.2 Repeated Measures, a.k.a. Within-Subject Designs

    19.2.1 Why Use a Within-Subject Design? And Why Not?

    19.3 R Code

    19.3.1 Bayesian Two-Factor ANOVA

    19.4 Exercises

    20.) Dichotomous Predicted Variable

    20.1 Logistic Regression

    20.1.1 The Model

    20.1.2 Doing It in R and BUGS

    20.1.3 Interpreting the Posterior

    20.1.4 Perils of Correlated Predictors

    20.1.5 When There Are Few 1’s in the Data

    20.1.6 Hyperprior Across Regression Coefficients

    20.2 Interaction of Predictors in Logistic Regression

    20.3 Logistic ANOVA

    20.3.1 Within-Subject Designs

    20.4 Summary

    20.5 R Code

    20.5.1 Logistic Regression Code

    20.5.2 Logistic ANOVA Code

    20.6 Exercises

    21.) Ordinal Predicted Variable

    21.1 Ordinal Probit Regression

    21.1.1 What the Data Look Like

    21.1.2 The Mapping from Metric x to Ordinal y

    21.1.3 The Parameters and Their Priors

    21.1.4 Standardizing for MCMC Efficiency

    21.1.5 Posterior Prediction

    21.2 Some Examples

    21.2.1 Why Are Some Thresholds Outside the Data?

    21.3 Interaction

    21.4 Relation to Linear and Logistic Regression

    21.5 R Code

    21.6 Exercises

    22.) Contingency Table Analysis

    22.1 Poisson Exponential ANOVA

    22.1.1 What the Data Look Like

    22.1.2 The Exponential Link Function

    22.1.3 The Poisson Likelihood

    22.1.4 The Parameters and the Hierarchical Prior

    22.2 Examples

    22.2.1 Credible Intervals on Cell Probabilities

    22.3 Log Linear Models for Contingency Tables

    22.4 R Code for the Poisson Exponential Model

    22.5 Exercises

    23.) Tools in the Trunk

    23.1 Reporting a Bayesian Analysis

    23.1.1 Essential Points

    23.1.2 Optional Points

    23.1.3 Helpful Points

    23.2 MCMC Burn-in and Thinning

    23.3 Functions for Approximating Highest Density Intervals

    23.3.1 R Code for Computing HDI of a Grid Approximation

    23.3.2 R Code for Computing HDI of an MCMC Sample

    23.3.3 R Code for Computing HDI of a Function

    23.4 Reparameterization of Probability Distributions

    23.4.1 Examples

    23.4.2 Reparameterization of Two Parameters



