About

This package contains a Matlab implementation of a kernel-based statistical hypothesis test for independence, as described in [GreEtAl08a] and [GreEtAl08b].

We propose to test whether random variables X and Y are independent, based on a sample of observed pairs (x_i,y_i). Our test statistic is the Hilbert-Schmidt norm of the covariance operator between RKHS mappings of X and Y, which is called the Hilbert-Schmidt Independence Criterion (HSIC). The population HSIC is zero at independence, so a large empirical HSIC is evidence that X and Y are dependent. The test software returns both the empirical HSIC and a threshold, where the threshold is a user-specified quantile of the null distribution of the empirical HSIC (that is, its distribution when X and Y are independent). When HSIC exceeds this threshold, we reject the independence hypothesis. Aside from the papers mentioned above, a more intuitive explanation of HSIC and the associated test may be found in these talk slides.
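As an illustration of the statistic described above, the following is a minimal Python sketch of the biased empirical HSIC, (1/n^2) trace(KHLH), where K and L are the kernel matrices on the x and y samples and H = I - (1/n) 1 1^T is the centring matrix. It is not the package's Matlab code. The function names, the restriction to scalar samples, the Gaussian kernel with fixed widths sigma_x and sigma_y, and the 1/n^2 scaling are all assumptions made for this sketch; the Matlab functions may scale the statistic and choose the kernel width differently.

```python
import math


def rbf_gram(values, sigma):
    """Gaussian kernel matrix for scalar samples (hypothetical helper):
    K[i][j] = exp(-(v_i - v_j)^2 / (2 * sigma^2))."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in values]
            for a in values]


def center(gram):
    """Double-centre a Gram matrix: returns H K H with H = I - (1/n) 1 1^T."""
    n = len(gram)
    row_mean = [sum(row) / n for row in gram]
    col_mean = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [[gram[i][j] - row_mean[i] - col_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic_biased(xs, ys, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC, (1/n^2) * trace(K H L H).

    By cyclicity of the trace this equals (1/n^2) * sum_ij (HKH)_ij * L_ij,
    because L is symmetric.
    """
    n = len(xs)
    k_centered = center(rbf_gram(xs, sigma_x))
    l_gram = rbf_gram(ys, sigma_y)
    return sum(k_centered[i][j] * l_gram[i][j]
               for i in range(n) for j in range(n)) / n ** 2


if __name__ == "__main__":
    xs = [0.1 * i for i in range(20)]
    dependent = [x ** 2 for x in xs]   # y is a function of x
    constant = [1.0] * len(xs)         # y carries no information about x
    print("HSIC, y = x^2:     ", hsic_biased(xs, dependent))
    print("HSIC, y = constant:", hsic_biased(xs, constant))
```

When y is constant, every entry of L equals one and the centred K sums to zero, so the statistic vanishes up to rounding error. Any dependence that the kernels can detect pushes it above zero.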

Two strategies are used to calculate the test threshold: a resampling procedure, and a two-parameter Gamma approximation to the null distribution.

Code

Code may be downloaded here. The zip file contains three Matlab files: hsicTestBoot.m uses a resampling procedure to obtain the test threshold, hsicTestGamma.m uses a two-parameter Gamma approximation to the null distribution to obtain the test threshold, and rbf_dot.m computes the kernel matrices.
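The resampling strategy can be sketched as follows, again in Python rather than Matlab. This is not the code in hsicTestBoot.m. It assumes that the resampling is done by randomly permuting the y sample, which breaks any dependence on x while keeping both marginals, and that the threshold is the (1 - alpha) quantile of the statistic over those permutations. All function and parameter names (permutation_test, alpha, n_shuffles, sigma, seed) are hypothetical, and the scalar samples, fixed kernel width and 1/n^2 scaling are assumptions of the sketch.

```python
import math
import random


def rbf_gram(values, sigma):
    """Gaussian kernel matrix for scalar samples (hypothetical helper)."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in values]
            for a in values]


def center(gram):
    """Double-centre a symmetric Gram matrix: H K H, H = I - (1/n) 1 1^T."""
    n = len(gram)
    row_mean = [sum(row) / n for row in gram]  # equals column means (symmetry)
    total_mean = sum(row_mean) / n
    return [[gram[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic_permuted(k_centered, l_gram, perm):
    """(1/n^2) * trace(HKH L_perm), where L_perm is L with the y sample
    reordered by perm."""
    n = len(k_centered)
    return sum(k_centered[i][j] * l_gram[perm[i]][perm[j]]
               for i in range(n) for j in range(n)) / n ** 2


def permutation_test(xs, ys, alpha=0.05, n_shuffles=200, sigma=1.0, seed=0):
    """Return (statistic, threshold, reject) for a level-alpha test."""
    rng = random.Random(seed)
    n = len(xs)
    k_centered = center(rbf_gram(xs, sigma))
    l_gram = rbf_gram(ys, sigma)
    identity = list(range(n))
    statistic = hsic_permuted(k_centered, l_gram, identity)

    # Approximate the null distribution by recomputing the statistic on
    # randomly permuted y samples.
    null_values = []
    for _ in range(n_shuffles):
        perm = identity[:]
        rng.shuffle(perm)
        null_values.append(hsic_permuted(k_centered, l_gram, perm))
    null_values.sort()

    # Empirical (1 - alpha) quantile of the permuted statistics.
    index = int(math.ceil((1.0 - alpha) * n_shuffles)) - 1
    index = max(0, min(n_shuffles - 1, index))
    threshold = null_values[index]
    return statistic, threshold, statistic > threshold


if __name__ == "__main__":
    data_rng = random.Random(1)
    xs = [data_rng.gauss(0.0, 1.0) for _ in range(40)]
    ys = [x + 0.1 * data_rng.gauss(0.0, 1.0) for x in xs]  # strongly dependent
    print(permutation_test(xs, ys))
```

Computing the kernel matrices once and only re-indexing L for each permutation avoids rebuilding them on every shuffle. The Gamma approximation avoids the resampling loop entirely by fitting a two-parameter Gamma distribution to the null distribution, as described in the references.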

References

[GreEtAl08a] Gretton, A., K. Fukumizu, C.-H. Teo, L. Song, B. Schoelkopf and A. Smola: A Kernel Statistical Test of Independence. NIPS 21, 2007.
[GreEtAl08b] Gretton, A., K. Fukumizu, C.-H. Teo, L. Song, B. Schoelkopf and A. Smola: A Kernel Statistical Test of Independence. MPI Technical Report 168, 2008.

Contact

arthur@tuebingen.mpg.de