
 March 2, 2007 


Null Distribution
One natural test statistic is the distance between the closest pair of points, one from the first set and
one from the second set. When this distance gets small enough it provides evidence to reject
the Null hypothesis. But how small is small? Setting the threshold requires knowing
the distribution of the closest-pair distance under the Null hypothesis. To estimate this distribution,
we will perform a Monte Carlo experiment which has 10,000 trials. In each trial we generate two
point data sets, each of whose points are independently and uniformly distributed on the unit square,
and we calculate the distance between the closest pair of points. This results in 10,000 closest
distances. We sort these distances from smallest to largest and take the value of the
100th smallest distance, which is .005513.
Under the Null hypothesis the probability that the closest distance will be as small as
or smaller than the 100th distance is 100/10,000 = 1/100, the significance level of the test.
The critical region for the test statistic is the interval [0, .005513].
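
The Monte Carlo procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the text does not give the number of points in each set, so the sizes n1 and n2 are hypothetical parameters, and the function names, the seed, and the use of NumPy are choices made for this sketch. Because the point-set sizes and random numbers differ, the critical value it produces should not be expected to match .005513.

```python
import numpy as np


def closest_cross_distance(a, b):
    """Smallest Euclidean distance between a point of `a` and a point of `b`.

    `a` and `b` are arrays of shape (n, 2) holding points in the plane.
    """
    diff = a[:, None, :] - b[None, :, :]          # shape (len(a), len(b), 2)
    return np.sqrt((diff ** 2).sum(axis=2).min())


def null_closest_distances(n1, n2, trials=10_000, seed=0):
    """Sorted closest-pair distances from `trials` simulated Null data sets.

    n1 and n2 are ASSUMED point-set sizes; the source text does not state them.
    """
    rng = np.random.default_rng(seed)
    distances = np.empty(trials)
    for t in range(trials):
        a = rng.random((n1, 2))                   # uniform on the unit square
        b = rng.random((n2, 2))                   # independent of `a`
        distances[t] = closest_cross_distance(a, b)
    return np.sort(distances)                     # smallest to largest


def critical_value(sorted_distances, k=100):
    """Value of the k-th smallest distance (k is 1-based, as in the text)."""
    return sorted_distances[k - 1]


if __name__ == "__main__":
    d = null_closest_distances(n1=20, n2=20)      # hypothetical sizes
    c = critical_value(d, k=100)
    print("critical region: [0, %.6f]" % c)
    print("estimated significance level:", 100 / len(d))
```

With 10,000 trials and k = 100, the interval from 0 to the returned value plays the role of the critical region at significance level 1/100.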
This is a 50-bin histogram of the smallest interpoint distances between point set 1 and point set 2,
obtained from a 10,000-trial Monte Carlo experiment in which the point sets are
uniformly distributed and independent of each other and therefore satisfy the Null hypothesis.
Each bin has width .005.
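
A binning like the one in the histogram could be computed along these lines. This is a sketch, not the code that produced the figure: it assumes the bins start at 0, so that 50 bins of width .005 cover the interval [0, .25], and the function name is hypothetical. Any distance larger than .25 falls outside the bins and is not counted.

```python
import numpy as np


def histogram_fixed_width(distances, n_bins=50, bin_width=0.005):
    """Count distances in `n_bins` equal-width bins starting at 0 (assumed)."""
    edges = np.arange(n_bins + 1) * bin_width     # 0, .005, .010, ..., .250
    counts, _ = np.histogram(distances, bins=edges)
    return counts, edges
```

Passing the 10,000 simulated closest distances to this function gives the bin counts; the critical region [0, .005513] lies almost entirely inside the first bin and extends only slightly into the second.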



