4th and 5th lecture: Kernels, PCA, and kernel PCA
PCA: Principal Component Analysis is a linear technique for reducing the
dimensionality of data. The main idea is to find the directions in the (high-dimensional)
space along which the data varies most and to ignore all other directions.
We discuss two different ways to derive PCA: as the projection that minimizes the
squared reconstruction error, and as the projection that maximizes the variance
of the projected data.
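To make this concrete, here is a minimal NumPy sketch of PCA via the
eigendecomposition of the sample covariance matrix (an illustration only, not
the course demo code; the function name pca and its interface are made up for
this sketch):

    import numpy as np

    def pca(X, k):
        # Project the rows of the (n, d) data matrix X onto the k
        # directions of highest variance; returns the (n, k) projected
        # data and the (d, k) matrix of principal directions.
        Xc = X - X.mean(axis=0)                 # PCA assumes centered data
        cov = Xc.T @ Xc / (len(Xc) - 1)         # sample covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues ascending
        directions = eigvecs[:, ::-1][:, :k]    # top-k eigenvectors
        return Xc @ directions, directions

Both derivations lead to exactly these directions: the top-k eigenvectors of
the covariance matrix simultaneously minimize the squared reconstruction error
and maximize the projected variance.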
Literature on PCA: Classical PCA is covered in many statistics books:
- A complete book on PCA is Jolliffe: Principal Component Analysis. Springer,
2002.
- Chapter 8 in Mardia, Kent, Bibby: Multivariate Analysis. Academic Press,
1979. A classic.
Kernels: very convenient similarity functions that implicitly come with an
embedding of the data into a high-dimensional feature space.
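As an illustration, here is a sketch of one widely used kernel, the Gaussian
(RBF) kernel; the Gram matrix it produces is all that kernel algorithms need
from the data (the function name and the bandwidth parameter gamma are choices
made for this sketch):

    import numpy as np

    def rbf_kernel(X, Y, gamma=1.0):
        # Gram matrix K with K[i, j] = exp(-gamma * ||X[i] - Y[j]||^2),
        # computed via ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y
        sq_dists = (np.sum(X**2, axis=1)[:, None]
                    + np.sum(Y**2, axis=1)[None, :]
                    - 2.0 * X @ Y.T)
        return np.exp(-gamma * np.maximum(sq_dists, 0.0))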
Kernel PCA: combines the kernel trick with PCA.
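A minimal sketch of that combination, assuming the (n, n) Gram matrix K has
already been computed (for example with the rbf_kernel sketch above): PCA in
the implicit feature space reduces to an eigendecomposition of the centered
Gram matrix.

    import numpy as np

    def kernel_pca(K, k):
        # Returns an (n, k) matrix whose row i holds the projections of
        # training point i onto the top-k principal components in feature space.
        n = len(K)
        one = np.ones((n, n)) / n
        Kc = K - one @ K - K @ one + one @ K @ one  # centering in feature space
        eigvals, eigvecs = np.linalg.eigh(Kc)       # ascending order
        eigvals = eigvals[::-1][:k]
        eigvecs = eigvecs[:, ::-1][:, :k]
        # For a unit-norm eigenvector alpha of Kc with eigenvalue lambda,
        # the projection of point i is sqrt(lambda) * alpha[i].
        return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))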
Literature on kernel PCA:
- Chapter 14.2 of Schölkopf and Smola: Learning with Kernels. MIT Press,
2002.
- Chapter 6.2 of Shawe-Taylor and Cristianini: Kernel Methods for Pattern
Analysis. Cambridge University Press, 2004.
- The original article: B. Schölkopf, A. Smola, and K.-R. Müller. Kernel
Principal Component Analysis. In B. Schölkopf, C. J. C. Burges, and A. J.
Smola, editors, Advances in Kernel Methods -- Support Vector Learning, pages
327-352. MIT Press, Cambridge, MA, 1999.
Demos:
- PCA first demo: this demo shows the projections
produced by a simple PCA application, step by step. Call it for example
with demo_pca(3,2).
- PCA second demo: shows how PCA
can be applied to handwritten digits; call it for example with demo_pca_usps(300).
For this demo you also need the USPS (US Postal Service) handwritten digits
data set.
- Kernel PCA demo: shows how
kernel PCA works.