Bayesian Core

A Practical Approach to Computational
Bayesian
Statistics

J.-M. Marin & Ch. P. Robert

Springer-Verlag [site], New York, 2007, ISBN 0-387-38979-2


Goals - Format - Schedule - Places - Contact - Slides, codes & datasets - Typos - Reviews


Goals

The purpose of this book is to provide a self-contained entry to practical & computational Bayesian Statistics using generic examples from the most common models, for a class duration of about 7 blocks that roughly corresponds to 12 to 14 weeks of teaching (with 3 hours of lectures per week), depending on the intended level & the prerequisites imposed on the students. (That timing does not include practice, i.e. programming labs, since those may have a variable duration, also depending on the students' involvement & their programming abilities.) The emphasis on practice is a strong feature of this book in that its primary audience consists of graduate students who need to use (Bayesian) statistics as a tool to analyse their experiments and/or datasets. The book should also appeal to scientists in all fields, given the versatility of the Bayesian tools. It can also be used for a more classical Statistics audience, for instance to teach a quick entry to Bayesian Statistics at the end of an undergraduate program. (Obviously, it can supplement another textbook on Data Analysis at the graduate level.) The minimal prerequisites for this course are a mastery of basic Probability theory for discrete and continuous variables and of basic Statistics (MLE, sufficient statistics).

Format

The format of the book is a somewhat sketchy coverage of the topics, always backed by a motivating problem & a corresponding dataset (available on this website), & a detailed resolution of the inference procedures pertaining to this problem, sometimes including commented R programs. Additional cases are also proposed as exercises. The current format is therefore self-contained & can thus serve as a unique textbook for a service course for scientists aiming at analyzing data the Bayesian way, or as an introductory course on Bayesian Statistics. Contrary to usual practice, the exercises are interspersed within the chapters rather than postponed until the end of each chapter. There are two reasons for this stylistic choice: first, the results or developments contained in those exercises are often relevant for upcoming points in the chapter. Second, they signal to the student (or to any reader) that some pondering over the previous pages may be useful before moving to the following topic & so may act as self-checking gateways.

Schedule & expectations

A course corresponding to the book has been taught since 2003 in a second-year Master program for students aiming at a professional degree in Data processing & Statistics (at Université Paris Dauphine). The first half of the book was used in a seven-week (intensive) program & students were tested on both the exercises (meaning, all exercises) & their (practical) mastering of the datasets, the stated expectation being that they should go further than a mere reproduction of the R outputs presented in the book. While the students found that the amount of work required by this course was rather beyond their usual standards (!), we observed that their understanding & mastery of Bayesian techniques was much deeper & more ingrained than in the more formal courses their counterparts had taken in previous years. In short, they started to think about the purpose of a Bayesian statistical analysis rather than about the contents of the final test, & they ended up building a true intuition about what the results should look like, an intuition that for instance helped them to detect modelling & programming errors! For most students, working on Bayesian Statistics from this perspective developed a genuine interest in the approach, & several of them continued using it in later courses or, even better, in their jobs.

In order to check the progress of the students, we collected reports at the end of each chapter that contained both the resolution of most exercises & an original analysis of the datasets supporting the corresponding chapter, including commented R programs. A solution manual with solutions to all exercises is available here as well as on the Springer webpage [here] for any interested reader, not only for instructors as was previously the case.

Places where the book is used

Besides Paris Dauphine, this course has been taught by the authors abroad, namely at the University of Canterbury, Christchurch, New Zealand, in the summer of 2006, & at the Universidad Central de Venezuela, Caracas, Venezuela, in the fall of 2006. The book has already been used as a textbook at the University of British Columbia, Vancouver, & at the Université de Montréal, both in Canada, as well as at the Queensland University of Technology, Brisbane, Australia. It is now used at the University of Bristol, the University of Canterbury, and the University of Massachusetts, and [as a reference] at the University of Maryland and the University of Wyoming.

Contact details

The authors can be contacted for general comments or questions & in particular for typos, which will be posted as soon as they are brought to the authors' attention. We however cannot reply to all questions about exercises & R programs, especially now that a complete solution manual is freely and unrestrictedly available. The same lack of support applies to the LaTeX files, provided along with the PDF files only for instructors' convenience. (Here is the complete tarred+gzipped file for recompiling the slides, along with figures and macros!)

                  Christian Robert, Université Paris Dauphine & CREST-INSEE
                  email xian [at] ceremade.dauphine.fr

Slides, codes & datasets


For each chapter: topics covered, slides (pdf, with LaTeX source), datasets, R functions, and R commands.

Chapter 2: Normal models
    Topics: conditional distributions, priors, posteriors, improper priors, conjugate priors, exponential families, tests, Bayes factors, decision theory, importance sampling (see the R sketch after this table)
    Slides: #2 (tex)
    Datasets: normaldata, CMBdata, 90cntycr.wk1
    R commands: #2.txt

Chapter 3: Regression and variable selection
    Topics: G-priors, noninformative priors, Gibbs sampling, variable selection
    Slides: #3 (tex)
    Datasets: caterpillar
    R functions: #3.R
    R commands: #3.txt

Chapter 4: Generalised linear models
    Topics: probit, logit and log-linear models, Metropolis–Hastings algorithms, model choice
    Slides: #4 (tex)
    Datasets: bank, airquality
    R functions: #4.R
    R commands: #4.txt

Chapter 5: Capture–recapture experiments
    Topics: sampling models, open populations, accept–reject algorithm, Arnason–Schwarz model
    Slides: #5 (tex)
    Datasets: northernpintail, eurodipabc, EuroDipper
    R functions: #5.R
    R commands: #5.txt

Chapter 6: Mixture models
    Topics: completion, variable dimensional models, label switching, tempering, reversible jump MCMC
    Slides: #6 (tex)
    Datasets: license
    R commands: #6.txt

Chapter 7: Dynamic models
    Topics: AR, MA and ARMA models, state-space representation, hidden Markov models, forward–backward algorithm
    Slides: #7 (tex)
    Datasets: Eurostoxx50, Dnadataset
    R commands: #7.txt

Chapter 8: Image analysis
    Topics: k-nearest-neighbor, supervised classification, segmentation, Markov random fields, Potts model
    Slides: #8 (tex)
    Datasets: vision, bank, Menteith
    R functions: #8.R
    R commands: #8.txt

Typos

Despite reading and re-reading the manuscript, there unfortunately remain errors of different magnitudes, from "invisible" typos to modelling mistakes: please contact one of us if you think you have detected an error. Typos that have already been corrected in the second printing are listed on this page of corrected typos from the first printing. Here is the page for the typos remaining in the second printing.

Book reviews

Reviews of the book have appeared in several outlets, and authors of personal or journal reviews are welcome to send us their reviews for posting. Here is, for instance, an alternative proposal from Xiaohui Chen (UBC) on the projection priors in Chapter 3 (pdf file).

Last updated April 8, 2008. © Christian P. Robert