First, a recap:
I'm doing Monte Carlo simulation of sampling distributions to compute p values. For example, consider the slope parameter, β1, in simple linear regression. I want to find out if p < .05 for a null hypothesis that β1=0. I'm working with two different ways to compute a p value, and am finding that the results do not agree with each other. Here are the two ways:
- Consider the maximum likelihood estimate (MLE) of β1 from the actual data, denoted β1MLEactual, and see where it falls in a sampling distribution of β1MLEnull for simulated data from the null hypothesis.
- Consider the likelihood ratio statistic, G2 = -2log(LR) where LR is the ratio of the maximum likelihood of the restricted model with β1=0 over the maximum likelihood of the full model with β1 free. We see where G2actual falls in the sampling distribution of G2null for simulated data from the null hypothesis.
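To make the two procedures concrete, here is a minimal sketch in Python with NumPy. It is not the code behind the results in this post. The data, the sample size, the number of simulations, and the choice to simulate the null from the restricted-model MLEs of the intercept and noise SD are all assumptions made for illustration. It relies on the fact that, with normal errors, G2 reduces to n*log(RSS_null/RSS_full).

```python
import numpy as np

def slope_and_g2(x, y):
    """Return (MLE of slope, G2) for simple linear regression with normal errors."""
    n = len(y)
    xc = x - x.mean()
    yc = y - y.mean()
    b1 = (xc @ yc) / (xc @ xc)              # MLE of the slope (same as least squares)
    rss_null = yc @ yc                      # residual sum of squares with slope fixed at 0
    rss_full = np.sum((yc - b1 * xc) ** 2)  # residual sum of squares with slope free
    g2 = n * np.log(rss_null / rss_full)    # -2*log(LR) with sigma at its MLE in each model
    return b1, g2

rng = np.random.default_rng(0)

# Made-up "actual" data (assumption, for illustration only).
x = np.linspace(0.0, 1.0, 20)
y = 1.0 + 0.8 * x + rng.normal(0.0, 1.0, size=x.size)
b1_actual, g2_actual = slope_and_g2(x, y)

# Null hypothesis: slope = 0. Assumption: intercept and noise SD are set to
# their MLEs under the restricted (zero-slope) model.
b0_null, sigma_null = y.mean(), y.std()     # np.std defaults to ddof=0, the MLE
n_sim = 5000
null_stats = np.array([
    slope_and_g2(x, b0_null + rng.normal(0.0, sigma_null, size=x.size))
    for _ in range(n_sim)
])
b1_null, g2_null = null_stats[:, 0], null_stats[:, 1]

# Way 1: where does the actual slope fall among null slopes (one-tailed)?
p_slope = np.mean(b1_null >= b1_actual)
# Way 2: where does the actual G2 fall among null G2 values?
p_g2 = np.mean(g2_null >= g2_actual)
print(f"p (slope MLE) = {p_slope:.4f},  p (G2) = {p_g2:.4f}")
```

Note that the slope comparison here is one-tailed, whereas G2 ignores the sign of the slope, so the two proportions need not agree even in this simple setup.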
Now the news: Arash Khodadadi, an advanced graduate student in Psychological & Brain Sciences at Indiana University, read the post and pointed out to me that the simulated samples that have G2null > G2actual are not all the same samples that have β1MLEnull > β1MLEactual. Essentially he was saying that I really should be looking at the joint sampling distribution. So I made a plot, and here it is:
The points contributing to the p value for β1MLE are quite different than the points contributing to the p value for G2. It is conceivable that the numbers of red and blue points would exactly equal each other mathematically (perhaps in a two-tailed version), but I doubt it.
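The comparison of which simulated samples land in which tail can also be tallied directly. The sketch below is an illustration under assumptions, not the code that produced the plot: the data are made up, the null is simulated from the restricted-model MLEs, and the slope tail is one-sided.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_sim = 20, 5000                        # assumed sample size and simulation count
x = np.linspace(0.0, 1.0, n)               # hypothetical predictor values
xc = x - x.mean()

def slope_and_g2_rows(Y):
    """Slope MLE and G2 for each row of Y (one data set per row)."""
    Yc = Y - Y.mean(axis=1, keepdims=True)
    b1 = Yc @ xc / (xc @ xc)
    rss_null = np.sum(Yc ** 2, axis=1)
    rss_full = np.sum((Yc - np.outer(b1, xc)) ** 2, axis=1)
    return b1, n * np.log(rss_null / rss_full)

# Made-up "actual" data (assumption).
y_actual = 1.0 + 0.8 * x + rng.normal(0.0, 1.0, size=n)
b1_row, g2_row = slope_and_g2_rows(y_actual[None, :])
b1_actual, g2_actual = b1_row[0], g2_row[0]

# Simulated data from the zero-slope null (assumption: restricted-model MLEs).
Y_null = y_actual.mean() + rng.normal(0.0, y_actual.std(), size=(n_sim, n))
b1_null, g2_null = slope_and_g2_rows(Y_null)

# Which simulated samples contribute to each p value?
in_slope_tail = b1_null >= b1_actual
in_g2_tail = g2_null >= g2_actual
n_both = int(np.sum(in_slope_tail & in_g2_tail))
n_slope_only = int(np.sum(in_slope_tail & ~in_g2_tail))
n_g2_only = int(np.sum(~in_slope_tail & in_g2_tail))

print("in both tails:     ", n_both)
print("slope tail only:   ", n_slope_only)
print("G2 tail only:      ", n_g2_only)
print("p (slope MLE) =", in_slope_tail.mean(), " p (G2) =", in_g2_tail.mean())
```

Any nonzero count in the "only" rows corresponds to simulated samples that contribute to one p value but not the other.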
Conclusion: This exercise leads me to conclude that the p values are different because they are referring to different questions about the null hypothesis. Which one is more meaningful? For me, the sampling distribution of β1MLE makes more direct intuitive contact with what people want to know (in a frequentist framework), namely, Is the observed magnitude of the slope very probable under the null hypothesis? The sampling distribution of G2 is less intuitive, as it is asking, Is the observed ratio of the probability of the data under the zero-slope model to the probability of the data under the free-slope model very probable under the null hypothesis? Both questions are meaningful, but the first one asks directly about the magnitude of the slope, whereas the second one asks about the relative probabilities of the data under the two model structures.
Now the difficult semantic question: When the p values conflict, as they do in this example (i.e., p < .05 for β1MLE, while p > .05 for G2 [and if you prefer two-tailed tests, then we could contrive a case in which the p values conflict there, too]), is the slope "significantly" non-zero? The answer: It depends! It depends on the question you are asking. I think it makes intuitive sense to ask the question about the magnitude of the slope and, therefore, to say in this case that the slope is significantly non-zero. But if you specifically want to ask the model comparison question, then you need the model-comparison p value, and you would conclude that the free-slope model is not significantly better than the zero-slope model.
Addendum: SEE THE FOLLOW-UP POST.