Lecture 2: Similarity and Dissimilarity Functions, Embeddings, and Multidimensional Scaling
Keywords and background reading (note that some of the references are quite advanced!):
General definitions:
- Similarities, dissimilarities, and their properties (summary pp. 13-16 in Luxburg04, and references therein)
- Metrics, norms, scalar products, kernels (see any functional analysis book)
- Metric spaces, Banach spaces, Euclidean spaces, Hilbert spaces (see any functional analysis book)
Examples of similarity and dissimilarity functions:
[Here are some more examples I did not talk about in the lecture:
- distances between graphs or other structured objects: convolution kernels (Haussler, link to follow), label sequence kernels (KashimaTsudaInokuchi04)
- distances between probability distributions: Kullback-Leibler divergence (any information theory book; see the sketch after this list), correlation coefficient (any statistics book)
]
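As a concrete instance of a dissimilarity that is not a metric, here is a rough matlab sketch of the Kullback-Leibler divergence between two discrete distributions (the function name is mine, and I assume q > 0 wherever p > 0):

    % Kullback-Leibler divergence KL(p || q) of two discrete probability
    % vectors p and q. Note it is not a metric: it is not symmetric and
    % does not satisfy the triangle inequality.
    function d = kl_divergence(p, q)
        idx = p > 0;                               % terms with p(i) = 0 contribute 0
        d = sum(p(idx) .* log(p(idx) ./ q(idx)));
    end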
Multidimensional Scaling: embedding of data points in a Euclidean space.
There are at least two complete books on this topic: Cox and Cox: Multidimensional Scaling. Chapman & Hall, 2001; Borg and Groenen: Modern Multidimensional Scaling. Springer, 2005.
Here are some matlab demos to see how it works:
- a very simple implementation: demo_mds.m
- matlab's own implementation (subject to copyright!!!): cmdscale.m
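If you cannot run the demo files, here is a minimal sketch of the classical MDS computation (my own toy version, not the demo_mds.m above; save it as mds_sketch.m):

    % Classical MDS sketch: embed n points in R^k, given an n x n
    % (Euclidean) distance matrix D.
    function X = mds_sketch(D, k)
        n = size(D, 1);
        J = eye(n) - ones(n) / n;              % centering matrix
        S = -0.5 * J * (D .^ 2) * J;           % recover the scalar product matrix
        [V, L] = eig((S + S') / 2);            % symmetrize against round-off
        [lambda, idx] = sort(diag(L), 'descend');
        V = V(:, idx);
        % scale the top k eigenvectors; the eigenvalues are nonnegative
        % exactly when D is Euclidean (cf. the puzzles below)
        X = V(:, 1:k) * diag(sqrt(max(lambda(1:k), 0)));
    end

Calling X = mds_sketch(D, 2), for example, gives a 2-dimensional configuration whose pairwise distances approximate the entries of D.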
[Other embeddings we did not talk about in the lecture:
- Embedding finite metric spaces into L_p spaces: check out the papers on the homepage of Jiri Matousek
- an infinite metric space into a Hilbert space: Schönberg's results (the original papers by Schönberg: On certain metric spaces arising from Euclidean spaces by a change of metric and their embedding in Hilbert space. Ann. of Math. 38(4), 787-793, 1937. And: Metric Spaces and Positive Definite Functions. TAMS 44, 522-536, 1938.)
- an arbitrary metric space into a Banach space: Kuratowski embedding (see HeinBousquet03) and Arens-Eells embedding (see LuxburgBousquet03)
- a theorem similar to Johnson-Lindenstrauss, but for arbitrary metrics: Bourgain's theorem
]
Puzzles and exercises:
- We have seen that MDS only works if the distance matrix is Euclidean. How can we test whether a distance matrix is Euclidean? (A numerical starting point is sketched after this list.)
- What goes wrong in the MDS algorithm if the distance matrix is not Euclidean?
- What can we do to approximate MDS in case the matrix is not Euclidean?
- Prove that in general we embed into an (n-1)-dimensional space (where n is the number of data points). Why not n-dimensional?
- If the original data lies in a k-dimensional space (k < n), the image will also be a k-dimensional subspace of R^n. Why? How can we see this in the algorithm?
- We have seen that we are free to choose the origin of the coordinate system, and many people choose the center of the data as the origin, that is, 0 = (1/n) sum_i x_i. Let S = (<x_i, x_j>)_{ij} be the scalar product matrix of n points x_1, ..., x_n in R^n. Now shift the coordinate system so that the center of the data is the origin. What does S look like now?
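For experimenting with the first puzzle, here is a rough matlab sketch of one possible test (the function name and the tolerance handling are my own choices):

    % Test whether an n x n distance matrix D is Euclidean. This holds
    % if and only if S = -0.5 * J * (D.^2) * J is positive semidefinite,
    % where J is the centering matrix from the MDS algorithm.
    function isEuc = is_euclidean(D, tol)
        if nargin < 2, tol = 1e-10; end    % slack for round-off errors
        n = size(D, 1);
        J = eye(n) - ones(n) / n;
        S = -0.5 * J * (D .^ 2) * J;
        lambda = eig((S + S') / 2);        % eigenvalues of the symmetrized S
        isEuc = all(lambda >= -tol);       % positive semidefinite up to tolerance
    end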