

Wednesday, April 27, 2016

A simple proof that the p-value distribution is uniform when the null hypothesis is true


Thanks to Mark Andrews for correcting some crucial typos (I hope I got it right this time!).

Thanks also to Andrew Gelman for pointing out that the proof below holds only when the null hypothesis is a point null H_0: \mu = 0, and the dependent measure is continuous, such as reading time in milliseconds, or EEG responses.

Someone asked this question in my linear modeling class: why does the p-value have a uniform distribution when the null hypothesis is true? The proof is remarkably simple; the underlying result is known as the probability integral transform.

First, notice that when a random variable Z comes from a Uniform(0,1) distribution, the probability that Z is less than or equal to some value z in [0,1] is exactly z: P(Z\leq z)=z.
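As a quick numerical illustration of this property, here is a minimal Python sketch that draws from Uniform(0,1) and compares the empirical proportion of draws at or below z with z itself. The helper name `empirical_cdf_at`, the seed, and the number of draws are illustrative choices, not part of the original argument.

```python
import random


def empirical_cdf_at(draws, z):
    """Return the fraction of draws that are less than or equal to z."""
    return sum(d <= z for d in draws) / len(draws)


rng = random.Random(42)  # fixed seed so the run is reproducible
draws = [rng.random() for _ in range(100_000)]  # Uniform(0,1) draws

# For a Uniform(0,1) variable, each proportion should be close to z.
for z in (0.05, 0.25, 0.5, 0.9):
    print(f"z = {z:.2f}  empirical P(Z <= z) = {empirical_cdf_at(draws, z):.3f}")
```

The agreement is only approximate because the proportions are estimated from a finite number of draws.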

Next, we prove the following proposition:

Proposition:
If T is a continuous random variable with CDF F, and Z=F(T), then Z \sim Uniform(0,1).

Note here that the p-value is a random variable, call it Z. The p-value is computed by calculating the probability of seeing a t-statistic or something more extreme under the null hypothesis. The t-statistic comes from a random variable T that is a transformation of the random variable \bar{X}: T=(\bar{X}-\mu)/(s/\sqrt{n}), where s is the sample standard deviation. This random variable T has a CDF F. For a lower-tailed test the p-value is Z=F(T); for an upper-tailed test it is 1-F(T), which is Uniform(0,1) whenever F(T) is.
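To make the definition concrete, here is a minimal Python sketch that computes a lower-tail p-value as F(T) for a single simulated sample. To stay within the standard library it assumes the population standard deviation is known, so that the statistic is standard normal under the null (a z-statistic rather than a t-statistic). The function name `lower_tail_p_value` and the parameters `mu0` and `sigma` are hypothetical names introduced for this illustration.

```python
import math
import random
from statistics import NormalDist, fmean


def lower_tail_p_value(sample, mu0, sigma):
    """Return (T, F(T)) for H_0: mu = mu0, assuming sigma is known."""
    n = len(sample)
    t_stat = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
    # Assumption: with known sigma, T ~ Normal(0, 1) under H_0,
    # so F is the standard normal CDF.
    return t_stat, NormalDist().cdf(t_stat)


rng = random.Random(1)
sample = [rng.gauss(0.0, 1.0) for _ in range(30)]  # data generated under H_0
t_stat, p = lower_tail_p_value(sample, mu0=0.0, sigma=1.0)
print(f"T = {t_stat:.3f}, lower-tail p-value = {p:.3f}")
```

Because the sample is random, T is random, and so is the p-value F(T); rerunning with a different seed gives a different p-value.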

So, if we can prove the above proposition, we have shown that the p-value's distribution under the null hypothesis is Uniform(0,1).

Proof:

Let Z=F(T), and assume F is strictly increasing, so that the inverse F^{-1} exists. Then, for any z in (0,1):

P(Z\leq z) = P(F(T)\leq z) = P(F^{-1} F(T) \leq F^{-1}(z) ) = P(T \leq F^{-1} (z) ) = F(F^{-1}(z))= z.

Since P(Z\leq z)=z, Z is uniformly distributed, that is, Uniform(0,1).
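The result can also be checked by simulation. The following Python sketch repeatedly generates data under a point null, computes Z=F(T) each time, and tabulates the resulting p-values into ten equal-width bins. As a simplifying assumption it uses a known standard deviation, so F is the standard normal CDF; the function name `simulate_p_values`, its parameters, and the seed are illustrative.

```python
import math
import random
from statistics import NormalDist, fmean


def simulate_p_values(n_sims=20_000, n=20, mu0=0.0, sigma=1.0, seed=2016):
    """Simulate lower-tail p-values F(T) with data generated under H_0."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    p_values = []
    for _ in range(n_sims):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        # Assumption: sigma is known, so T is standard normal under H_0.
        t_stat = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
        p_values.append(std_normal.cdf(t_stat))
    return p_values


p_values = simulate_p_values()

# If the p-values are Uniform(0,1), each of the ten bins should hold
# roughly 10% of them.
bins = [0] * 10
for p in p_values:
    bins[min(int(p * 10), 9)] += 1
proportions = [count / len(p_values) for count in bins]

for i, prop in enumerate(proportions):
    print(f"[{i / 10:.1f}, {(i + 1) / 10:.1f}): {prop:.3f}")

# Under H_0, about 5% of the p-values should fall at or below 0.05.
frac_below_05 = sum(p <= 0.05 for p in p_values) / len(p_values)
print(f"P(p <= 0.05) is approximately {frac_below_05:.3f}")
```

A roughly flat table of proportions is what the proposition predicts; generating the data with a mean different from `mu0` would instead pile the p-values up near 0 or 1.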





2 comments:

Unknown said...

Am I missing a trick or is there a typo in the third term in the equation?
I.e. should P(Z <= F^-1(z)) be P(T <= F^-1(z))?

Here's how I reason out the proof (sorry about ascii math):

Let Z = F(T)

Pr { Z <= z }
= Pr {F(T) <= z } # By definition of Z = F(T).
= Pr {F^-1(F(T)) <= F^-1(z)} # Apply inverse of F to both sides.
= Pr {T <= F^-1(z)} # F and F^-1 cancel on lhs.
= F(F^-1(z)) # Because previous step defines cumulative dist function.
= z # And F and F^-1 cancel again.

Shravan Vasishth said...

One should show this to people who say/think that P( statistic | Null True ) \approx P( Null True ).