Title: On learning with similarity functions (with extensions to clustering)
ABSTRACT
========
There has been substantial work in both theory and practice on
learning with {\em kernel} functions. However, there is a notable
disconnect between theory and practice. In practice, kernels are
viewed as measures of similarity, but the theory talks instead about
margins of separators in implicit, high dimensional spaces defined by
mappings that one may not even be able to calculate. In this talk I
will discuss work on developing an alternative theory of learning with
{\em similarity} functions (i.e., sufficient conditions for a
similarity function to allow one to learn well) that (a) does not
require reference to any implicit spaces, (b) does not require the
function to be positive semi-definite, and (c) generalizes the
standard theory in that any good kernel function in the standard sense
is also a good similarity function in the sense defined here. I will
then talk about some preliminary work attempting to extend this to
clustering: that is, what conditions on a similarity function would be
sufficient to allow one to cluster well, and how "clustering well"
should itself be defined.
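To give one concrete flavor of such a condition (this is a paraphrase of the kind of definition used in the Balcan--Blum framework; the notation $(\epsilon,\gamma)$ and $\ell$ below is illustrative, not necessarily that of the talk): a similarity function $K$ could be called {\em good} if all but an $\epsilon$ fraction of examples $x$ are, on average, at least $\gamma$ more similar to random examples of their own label than to random examples of the other label:

```latex
% For at least a (1 - \epsilon) probability mass of examples x
% drawn from the distribution P, with label \ell(x):
\mathbb{E}_{x' \sim P}\bigl[K(x,x') \,\big|\, \ell(x') = \ell(x)\bigr]
  \;\ge\;
\mathbb{E}_{x' \sim P}\bigl[K(x,x') \,\big|\, \ell(x') \ne \ell(x)\bigr]
  \;+\; \gamma
```

Note that a condition of this form refers only to averages of $K$ over labeled examples: it requires no implicit feature space and no positive semi-definiteness, which is exactly the point of properties (a) and (b) above.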
This is joint work with Nina Balcan.