Organizers: Shai Ben-David, Ulrike von Luxburg, John Shawe-Taylor and Naftali Tishby
Background:
Clustering is one of the most widely used techniques for exploratory data analysis.
Across all disciplines, from social sciences over biology to computer science,
people try to get a first intuition about their data by identifying meaningful
groups among the data points. In the past five decades, a wide variety of clustering
algorithms have been developed and applied to a wide range of practical problems.
Despite this large number of algorithms and applications, the goal of clustering
and its proper interpretation remains fuzzy and vague. There are in fact many
different problems that are clustered together under this single term, from
quantization with low distortion for compression, through various techniques
for graph partitioning whose goals are not fully specified, to methods for revealing
hidden structure and unobserved features in complex data. We are clearly not
talking about a single well defined problem. Moreover, the theoretical foundations
of clustering seem to be distressingly meager, covering only some sub-domains
and failing to address some of the most basic general aspects of the area. There
is not even an agreement among the researchers on the correct questions to pose,
let alone which tools and analysis techniques should be used to answer those
questions. In our opinion there is an urgent need to initiate a concerted discussion
on these issues, in order to move towards a consolidation of the theoretical
basis for - at least some of the aspects of - clustering. One prospective benefit
of building a theoretical framework for clustering may come from enabling the
transfer of tools developed in other related domains, such as machine learning
and information theory, where the usefulness of having a general mathematical
framework have been impressively demonstrated. We have the impression that recently
many researchers have become aware of this need and agree on the importance
of these issues.
Questions we wish to address:
Structure of the workshop:
Our one-day workshop will consist of invited talks, contributed talks, and extensive discussions.
Confirmed invited speakers:
Call for contributions:
We invite presentations addressing one or several of the questions raised above. To keep the workshop lively we intend to keep the individual presentations short, at most 15 minutes.We welcome presentations about work in all "stages of completion", ranging from completed work over work in progress to discussing potential directions of future research. In particular we encourage position papers. We would like to stress that the focus of this workshop is on foundations of clustering. We are not interested in contributions about "yet another ad-hoc clustering algorithm".
Please submit an extended abstract (at most two pages) summarizing your potential contribution to clustering_workshop@ipsi.fraunhofer.de.
*** The deadline is the 30th October 2005. ***
The organizers will review all submissions. You will be notified by November 11 whether your contribution is accepted.