The Rate Adapting Poisson model - RAP
This website provides MATLAB code and some details about the Rate Adapting Poisson model. The model was published at ICML 2006. You can find the publication here.
The Rate Adapting Poisson (RAP) model is an undirected probabilistic graphical model suited to learning latent structure in count data. It can be used to find reduced-dimensionality representations of such data, which can subsequently be used by classification or retrieval algorithms. The ICML paper shows that, on some benchmark datasets, the RAP model generates better representations for these tasks than its directed counterpart, Probabilistic Latent Semantic Analysis (pLSI). For more details and a description of the algorithm, please have a look at the paper.
Below you can see a picture showing a two-dimensional projection of text data.
Code
You can download my MATLAB implementation of the RAP model:
Matlab source for RAP
Just unzip the package and have a look at sampleRun.m to see how to use the code. The function har_learn.m also has extensive help text that may be useful. Please note that you need to install Tom Minka's Lightspeed package. Make sure you compile all files inside it!
Besides the implementation of this model, I also have other implementations which might be helpful. These can be found on my code website.
pLSI - probabilistic latent semantic analysis, including a version of the tempered EM algorithm.
ePCA - exponential family PCA.
And of course there is the wonderful Spider, with implementations of NMF, etc.
For the evaluation of the model we used several benchmark datasets for information retrieval. All of them were obtained from the website of Alessandro Moschitti. For more details and the original sources, please have a look at this website. These corpora were processed using the Rainbow toolbox from Andrew McCallum. In the version offered on this website, switches such as stemming and pruning are ignored without warning. I replaced the file lex-simple.c with this version to fix this. You can download the preprocessed MATLAB data and the scripts used to generate it:
Caltech 4
Caltech 101
The Rate Adapting Poisson (RAP) model for Information Retrieval and Object Recognition - Peter V. Gehler, Alex D. Holub and Max Welling, ICML 2006
Exponential Family Harmoniums with an Application to Information Retrieval - Max Welling, Michal Rosen-Zvi and Geoffrey Hinton, NIPS 2004