Probabilistic modeling and Bayesian inference using BUGS
Probabilistic generative models provide a powerful framework for analyzing and thinking about complex, noisy systems. Once the underlying process that generated the data set of interest has been modeled, many problems can be framed as an inversion of that process using Bayesian inference. In practice, the challenge lies in finding convenient modeling abstractions and tractable inference schemes for computing with them. BUGS (Bayesian inference Using Gibbs Sampling) is a useful language for writing hierarchical probabilistic models and for sampling from the posterior distributions they define using Gibbs sampling. (Gibbs sampling is one kind of Markov chain Monte Carlo (MCMC) technique, a family of methods that revolutionized the way Bayesian inference is implemented.)
To get started with probabilistic modeling in BUGS, Michael Lee and E.-J. Wagenmakers have written a fantastic book on Bayesian Graphical Modeling (available freely in PDF form), with detailed examples coded in WinBUGS, a modern implementation of BUGS. The book's examples are geared toward cognitive scientists, though the kinds of models presented are widely applicable. For general reading on Bayesian inference, check out Tom Griffiths' reading list on Bayesian methods.
Below are some issues/questions related to WinBUGS and their solutions, geared in part towards running WinBUGS on Mac.
Working with WinBUGS
Running WinBUGS on Mac
WinBUGS is a Windows program, but fortunately it can run smoothly on Mac under two Windows compatibility layers: Darwine (a Mac port of WINE) and CrossOver. CrossOver is available commercially from CodeWeavers, while Darwine is open source and free. Both can run WinBUGS on Intel-based Macs without the need to install Windows.
WinBUGS with a Matlab/R interface on Mac
There are interfaces available for using WinBUGS through Matlab (MATBUGS) or R (rbugs). Samples from models written in WinBUGS, as well as information about chain convergence, can then be read into these packages for analysis and plotting. However, both interfaces rely on calling WinBUGS from the command line. This poses a problem for Mac users, since calling Windows programs that run under CrossOver or Darwine from the command line can be difficult. A way to do this using CrossOver is as follows:
- Create a shell script, say run_winbugs.sh, that takes the path of a BUGS model file as its argument and runs WinBUGS on it.
- The shell script must set the environment variables necessary for running CrossOver programs. To find out what the proper CrossOver environment is on your machine, open CrossOver > go to Programs > Run command > click on Open Shell. Copy the set of export statements that appear in your terminal window and paste them at the top of run_winbugs.sh.
- Next, find out what the Windows-style path of your WinBUGS14.exe executable is, and add the following line to run_winbugs.sh to call it with an argument:
wine C:\\Program\ Files\\WinBUGS14\\winbugs14.exe -- /PAR $1;
In my case, the Windows-style path is C:\Program Files\WinBUGS14\winbugs14.exe, though inside a script its backslashes and spaces must be backslash-escaped as shown above. The /PAR tells WinBUGS that an argument is coming, and $1 is the variable that holds the first passed-in argument in bash scripts. The -- serves as a separator to distinguish arguments to the shell from arguments to the command being executed (wine, in this case).
- Edit the Matlab or R interface script so that it uses run_winbugs.sh to execute WinBUGS.
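As a rough illustration of the steps above, here is a small Python sketch that assembles the text of run_winbugs.sh. The helper name build_run_winbugs_script and the placeholder comment line are hypothetical; the real export statements are machine-specific and must come from CrossOver's Open Shell window. The sketch quotes $1, an assumption made here so that file paths containing spaces are passed as one argument.

```python
def build_run_winbugs_script(export_lines,
                             exe_path=r"C:\Program Files\WinBUGS14\winbugs14.exe"):
    """Return the text of a run_winbugs.sh script (hypothetical helper)."""
    # Escape backslashes first, then spaces, so the Windows path survives the shell.
    escaped = exe_path.replace("\\", "\\\\").replace(" ", "\\ ")
    lines = ["#!/bin/bash"]
    # CrossOver environment: the export statements copied from "Open Shell".
    lines.extend(export_lines)
    # /PAR announces the argument; "$1" is the first argument given to the script.
    lines.append(f'wine {escaped} -- /PAR "$1"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder only: replace with the export statements from your machine.
    placeholder = ["# paste the export statements from CrossOver's Open Shell here"]
    print(build_run_winbugs_script(placeholder), end="")
```

Writing the returned text to a file and marking it executable gives a script that the Matlab or R interface can call in place of the WinBUGS executable.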
I'm not aware of any way to run WinBUGS from the command line via Darwine. If you find a way to do this, please let me know.
WinBUGS language and syntax issues
Are there if-then (conditionals) in WinBUGS? WinBUGS has no built-in syntax for conditionals. However, they can be written using multiplication and assignment, provided the condition x is coded as an indicator that takes the value 1 (true) or 0 (false). For example, 'if x then y = foo else y = bar' can be written as:
y <- x*foo + (1-x)*bar;
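The multiplication trick can be checked outside of WinBUGS. The Python sketch below (function names are hypothetical, chosen for illustration) compares the arithmetic form with an ordinary if-else for an indicator x, and shows what happens if x is not strictly 0 or 1.

```python
def arithmetic_if(x, foo, bar):
    """Mimic the BUGS expression y <- x*foo + (1-x)*bar."""
    return x * foo + (1 - x) * bar


def plain_if(x, foo, bar):
    """Ordinary conditional: foo when x is 1, bar otherwise."""
    return foo if x == 1 else bar


# For an indicator x in {0, 1} the two forms agree.
for x in (0, 1):
    assert arithmetic_if(x, 3.5, -2.0) == plain_if(x, 3.5, -2.0)

# If x is not an indicator, the arithmetic form blends the two branches.
blended = arithmetic_if(0.5, 4.0, 2.0)
print(blended)
```

The last call is a reminder that the trick only behaves like a conditional when x is guaranteed to be 0 or 1.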
Stochastic versus logical nodes. WinBUGS distinguishes stochastic nodes, whose values are defined directly by a distribution, from logical nodes, whose values are computed deterministically from other (potentially stochastic) nodes. The trouble is that WinBUGS only allows tracking of stochastic nodes during sampling, but sometimes the variable of interest is computed deterministically from other stochastic nodes (making its value random, in spite of its technical status as a logical node). One way around this is to wrap the logical node in a stochastic node with effectively zero variance, so that the extra distribution adds no randomness in practice to the deterministically computed value. For example, suppose our variable of interest x is the sum of two Gaussian random variables, g1 and g2, and we want to track x's values through sampling:
g1 ~ dnorm(0, 2);
g2 ~ dnorm(0, 2);
x <- g1 + g2;
Setting a mark on x to track its value will not work as written, since x is a logical node. The solution is to redefine x as follows:
y <- g1 + g2;
x ~ dnorm(y, 1000);
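Now x is a stochastic node whose value stays very close to g1 + g2, because the precision parameter to dnorm is large (recall that precision is equal to 1/variance). A precision of 1000 corresponds to a standard deviation of about 0.03, so x is not exactly g1 + g2; use a larger precision if that spread matters at the scale of your variables.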
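To get a feel for how much noise the high-precision wrapper adds, here is a minimal Python sketch. It is a plain forward simulation of the model, not Gibbs sampling and not WinBUGS itself, and the function name sample_once is hypothetical. It converts each BUGS precision to a standard deviation via sd = 1/sqrt(precision).

```python
import math
import random


def sample_once(rng, precision=1000.0):
    """Draw g1, g2, y = g1 + g2 and the wrapped node x ~ Normal(y, precision)."""
    sd_g = 1.0 / math.sqrt(2.0)  # dnorm(0, 2) has precision 2
    g1 = rng.gauss(0.0, sd_g)
    g2 = rng.gauss(0.0, sd_g)
    y = g1 + g2
    x = rng.gauss(y, 1.0 / math.sqrt(precision))  # dnorm(y, 1000)
    return x, y


rng = random.Random(0)
diffs = [x - y for x, y in (sample_once(rng) for _ in range(20000))]

# Root-mean-square gap between x and g1 + g2, next to the theoretical value.
rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
print(rms, 1.0 / math.sqrt(1000.0))
```

The gap between x and g1 + g2 is small next to the spread of g1 + g2 itself (standard deviation 1 in this example), which is why the wrapper is harmless here.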
Working with uniform distributions. WinBUGS supports uniform distributions through dunif. However, sampling with dunif can occasionally fail with precision/numerical errors. Sometimes this can be fixed by using dbeta(1, 1), which is uniform on the interval from 0 to 1, and rescaling to the interval of choice. (Thanks to Michael Lee for this tip.)
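The rescaling step is just a linear map: if u is Beta(1, 1), then low + (high - low)*u is uniform on the interval from low to high. The Python sketch below illustrates this with the standard library; the names scaled_beta_uniform, low and high are hypothetical, and the code stands in for the corresponding BUGS statements rather than reproducing them.

```python
import random


def scaled_beta_uniform(rng, low, high):
    """Draw from Uniform(low, high) by rescaling a Beta(1, 1) draw."""
    u = rng.betavariate(1.0, 1.0)  # Beta(1, 1) is uniform on [0, 1]
    return low + (high - low) * u


rng = random.Random(1)
draws = [scaled_beta_uniform(rng, -2.0, 5.0) for _ in range(20000)]

# The sample mean should sit near the midpoint of the interval, (low + high) / 2.
mean = sum(draws) / len(draws)
print(min(draws), max(draws), mean)
```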