By contrast, the posterior distributions of the mean and of the effect size are very stable despite changing the vague prior on the standard deviation.
Although I caution against using Bayes factors (BFs) to routinely test null hypotheses (for many reasons; see Ch. 12 of DBDA2E, or this article, or Appendix D of this article), there might be times when you want to give them a try. A nice way to approximate a BF for a null hypothesis test is with the Savage-Dickey method (again, see Ch. 12 of DBDA2E and references cited there, specifically pp. 352-354). Basically, to test the null hypothesis for a parameter, we consider a narrow region around the null value and see how much of the distribution is in that narrow region, for the prior and for the posterior. The ratio of the posterior to prior probabilities in that zone is the BF for the null hypothesis.
Consider a batch of data randomly sampled from a normal distribution, with N=43. We standardize the data and shift them up by 0.5, so the data have a mean of 0.5 and an SD of 1.0. Figure 1, below, shows the posterior distribution on the parameters
|Figure 1. Posterior when using unif(0,1000) prior on sigma, shown in Fig's 2 and 3.|
To determine Bayes factors (BFs) for mu and effect size, we need to consider the prior distribution in more detail. It has a broad normal prior on mu with an SD of 100 and a broad uniform prior on sigma from near 0 to 1000, as shown in Figures 2 and 3:
|Figure 2. Prior with unif(0,1000) on sigma. Effect size is shown better in Figure 3.|
The implied prior on the effect size, in the lower right above, is plotted badly because of a few outliers in the MCMC chain, so I replot it below in more detail:
|Figure 3. Implied prior on effect size for unif(0,1000) prior on sigma.|
The BF for a test of the null hypothesis on mu is the probability mass inside the ROPE for the posterior relative to the prior. In this case, the BF is 0.7% / 0.1% (rounded in the displays) which equals about 7. That is, the null hypothesis is 7 times more probable in the posterior than in the prior (or, more carefully stated, the data are 7 times more probable under the null hypothesis than under the alternative hypothesis). Thus, the BF for mu decides in favor of the null hypothesis.
The BF for a test of the null hypothesis on the effect size is the analogous ratio of probabilities in the ROPE for the effect size. The BF is 0.8% / 37.2% which indicates a strong preference against the null hypothesis. Thus, the BF for mu disagrees with the BF for the effect size.
Now we use a different vague prior on sigma, namely unif(0,10), but keeping the same vague prior on mu:
|Figure 4. Prior with unif(0,10) on sigma. Effect size is replotted in Figure 5. Compare with Figure 2.|
|Figure 5. Effect size replotted from Figure 4, using unif(0,10) on sigma.|
The resulting posterior distribution looks like this:
|Figure 6. Posterior when using unif(0,10) prior on sigma.|
But the BF for effect size is rather different than before. Now it is 0.8% / 0.4%, which is to say that the probability of the null hypothesis has gone up, i.e., this is a BF that leans in favor of the null hypothesis. Thus, a less vague prior on sigma has affected the implied prior on the effect size, which, of course, strongly affected the BF on effect size.
To summarize so far, a change in the breadth of the prior on sigma had essentially no effect on the HDIs of the posterior distribution, but had a big effect on the BF for the effect size while having no effect on the BF for mu.
Proponents of BFs will quickly point out that the priors used here are not well calibrated, i.e., they are too wide, too diluted. Instead, an appropriate use of BFs demands a well calibrated prior. (Proponents of BFs might even argue that an appropriate use of BFs would parameterize differently, focusing on effect size and sigma instead of mu and sigma.) I completely agree that the alternative prior must be meaningful and appropriate (again, see Ch. 12 of DBDA2E, or this article, or Appendix D of this article) and that the priors used here might not satisfy those requirements for a useful Bayes factor.
But there are still two take-away messages:
First, the BF for the mean (mu) need not lead to the same conclusion as the BF for the effect size unless the prior is set up just right.
Second, the posterior distribution on mu and effect size is barely affected at all by big changes in the vagueness of the prior, unlike the BF.