The Spider Tutorial - Part II
Tell me how SVMs and KERNELS work
Some explanation of kernel objects is important because quite a few objects use them. A kernel maps input data to output data, where the output data is a matrix of inner products. It is often stored in the member object 'child'. Example: in SVMs

    a=svm; a.child=kernel('rbf',1)

makes an svm with an rbf kernel with sigma=1. One can also do a.kerparam=0.5 to change sigma. One could also have created the same kernel with

    a=svm(kernel('rbf',1))

as kernels are automatically detected as hyperparameters. Kernels are created using kernel([name],[params]). See help kernel for other types of kernel. One other particularly useful kernel is the custom kernel. This allows you to specify a matrix of inner products of your own choice, e.g.:

    d=gen(toy); a=svm; a.child=kernel('custom',d.X*d.X');
    loss(train(cv(a),d))

is equivalent to a vanilla linear svm:

    d=gen(toy); loss(train(cv(svm),d))
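To make the "matrix of inner products" concrete, here is a minimal sketch in plain Matlab, without any Spider objects. The matrix X is made-up illustration data (one example per row); the point is only that X*X' holds the inner product of every pair of examples, which is why passing d.X*d.X' to a custom kernel behaves like a linear kernel.

```matlab
% Made-up data: three examples (rows), two features (columns).
X = [1 2; 3 4; 5 6];

% Linear-kernel Gram matrix: K(i,j) is the inner product of rows i and j.
K = X * X';

% Check one entry by hand: <[1 2],[5 6]> = 1*5 + 2*6 = 17.
k13 = dot(X(1,:), X(3,:));

disp(K);                 % 3x3 symmetric matrix
disp(k13 == K(1,3));     % prints 1
```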
Training these algorithms in the normal way thus uses these kernels. The svm object uses mex files to implement fast training in C by calling svmlight or libsvm. This can be changed via the member optimizer; see help svm for more details. Note: 'custom' kernels do not work at present for LIBSVM; however, they do work for svmlight.
To see what is actually happening with a kernel, one can use it without a host algorithm such as svm. For example, make some data and train a kernel on it with:

    d=data(rand(5)); k=kernel('rbf',2); r=train(k,d); r.X

The last command prints out the new kernel matrix, returned in the object r. In the spider a kernel is viewed as another learning algorithm (just not a very complicated one in terms of generalization): after training, one can test on new data (to know the inner products between the test data and the original data). Many of the spider objects don't implement things in this way yet; they use an old function called calc. Try not to use this. Seeing kernels in the train, test methodology allows one to create complex kernels, e.g.

    k=chain({ kernel('custom',D) kernel('rbf_from_dist',0.1) })

takes a custom distance matrix and transforms it to an rbf kernel. One can then use the param object to try different values of sigma of the rbf. Here's a simple example of using param and kernel objects together:

    s=svm(kernel('poly',1));
    a=param(s,'poly',[1:3]);
    [r a]=train(cv(a),toy)
    get_mean(r)

Note that poly is used to access the parameter of the kernel, which is actually called kerparam in the object. This works because kerparam is also aliased as poly, which is useful to differentiate it from other kernel parameters, e.g. in a chain. See help algorithm on how to make aliases.
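The train/test view of a kernel, and the idea of turning distances into an rbf kernel, can be sketched in plain Matlab as follows. This is not Spider code: the data, the helper sq and the Gaussian form exp(-d^2/(2*sigma^2)) are assumptions made for illustration, and the exact parametrization used by the Spider's 'rbf' and 'rbf_from_dist' kernels should be checked with help kernel.

```matlab
% Made-up data: three training points and one test point (rows).
Xtr = [0 0; 1 0; 0 1];
Xte = [1 1];
sigma = 1;   % assumed rbf width

% Hypothetical helper: squared Euclidean distances between rows of A and B,
% using ||a||^2 + ||b||^2 - 2*a.b
sq = @(A, B) bsxfun(@plus, sum(A.^2, 2), sum(B.^2, 2)') - 2 * (A * B');

% "Training": kernel matrix among the training points.
D2_train = sq(Xtr, Xtr);                     % 3x3 squared distances
K_train  = exp(-D2_train / (2 * sigma^2));

% "Testing": kernel values between test points and training points.
D2_test = sq(Xte, Xtr);                      % 1x3 squared distances
K_test  = exp(-D2_test / (2 * sigma^2));

disp(K_train);
disp(K_test);
```

Trying several values of sigma, as the param object does, amounts to recomputing the last step for each value while the distance matrices stay fixed.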
Saving memory: data_global and custom_fast kernels
Because Matlab can use a lot of memory (it passes by value, so variables can be copied when you call a function), we wrote some routines that use global variables to avoid this problem. data_global works in exactly the same way as data except that it takes its X and Y components from the global variables X and Y. Example: the following are equivalent (apart from memory usage):

    rand('seed',1); d=gen(toy); get_mean(train(cv(svm),d))

is equivalent to

    rand('seed',1); d=gen(toy);
    global X; global Y; X=d.X; Y=d.Y; clear d; % use global data instead
    d=data_global; get_mean(train(cv(svm),d))

If you give the constructor of data_global extra parameters, it creates local copies of X and Y and doesn't save memory any more. (It is very important when handling data objects, as with all objects, to use the access functions only; see implementation issues below.) Also, the drawback of data_global is that it can only handle one dataset at a time (although you can create multiple data_global objects which hold subsets of the data, because they index the global matrices).
Likewise, the 'custom_fast' kernel can replace the 'custom' kernel, but with a global variable K. Example: the following are equivalent (apart from memory usage):

    d=gen(toy); a=svm; a.child=kernel('custom',d.X*d.X');
    loss(train(cv(a),d))

is equivalent to

    d=gen(toy); a=svm; global K; K=d.X*d.X';
    a.child=kernel('custom_fast'); loss(train(cv(a),d))
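The reason a single global matrix can serve several subsets of the data is that each subset only needs its row and column indices. The following plain Matlab sketch illustrates that idea; it is not how the Spider is necessarily implemented internally, and the data and the index vectors train_idx and test_idx are hypothetical.

```matlab
% Shared Gram matrix, stored once in a global variable.
global K
X = [1 0; 0 1; 1 1; 2 0];     % made-up data, one example per row
K = X * X';                   % 4x4 linear Gram matrix

% Hypothetical split, e.g. one fold of a cross validation.
train_idx = [1 2 3];
test_idx  = 4;

% Sub-blocks are extracted by indexing when they are needed.
K_train = K(train_idx, train_idx);   % inner products among training points
K_test  = K(test_idx,  train_idx);   % test points against training points

disp(K_train);
disp(K_test);
```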
Structured objects: kernels for strings, trees and graphs
At the moment The Spider doesn't have many special facilities for structured kernels such as those on strings, trees and graphs.
Practically, you have to use the gram matrix in a 'custom' kernel at the moment to train on these, and produce that matrix with your own code. Potentially, though, there is nothing stopping you from defining a data object with the same public functions as the usual data object which deals with another type of data (data only deals with vectors), e.g. a string_data object, and then a kernel or kernels which calculate dot products for this new data type.
(If someone wants to write this please send it to us!)
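Producing the gram matrix with your own code might look like the following minimal sketch for strings. It uses a p-spectrum kernel (inner products of counts of substrings of length p), which is an illustration chosen here and not something the Spider provides; the strings and the names strs, p, vocab and F are all hypothetical.

```matlab
% Made-up strings and substring length.
strs = {'abab', 'baba', 'cccc'};
p = 2;
n = numel(strs);

% Collect every substring of length p that occurs in any string.
subs = {};
for i = 1:n
    s = strs{i};
    for j = 1:(length(s) - p + 1)
        subs{end+1} = s(j:j+p-1);
    end
end
vocab = unique(subs);          % distinct substrings, sorted

% Feature matrix: F(i,c) counts how often vocab{c} occurs in strs{i}.
F = zeros(n, numel(vocab));
for i = 1:n
    s = strs{i};
    for j = 1:(length(s) - p + 1)
        c = find(strcmp(vocab, s(j:j+p-1)));
        F(i, c) = F(i, c) + 1;
    end
end

% Gram matrix of the spectrum kernel.
K = F * F';
disp(K);
```

The resulting K could then be handed to kernel('custom',K) in the same way as d.X*d.X' in the custom kernel example.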
Implementation: What is an object in Matlab?
An object in Matlab is implemented with a directory whose name begins with "@". So we have e.g. spider/pat/@knn for k-nearest neighbours, which includes all the methods of that object (which are M-files). Spider objects have a constructor which sets default hyperparameters and initializes the model. They also have a "training" and a "testing" method.
Technically all the objects in the spider are children of another object called algorithm. This stores some basic flags and also hosts the methods "train" and "test", which call the "training" and "testing" methods of the child. The parent method handles training on multiple datasets and other complications.
Implementation: How do I make my own object?
Take a look at the @template object in the spider/pat directory, which is a simple svm object. Copy this directory and change the training.m and testing.m files and the constructor (which should have the same name as the directory) to make your new algorithm. Then, to find out about adding the object to the publicly available spider package, click here.
Implementation: PROGRAMMING GUIDELINES
1) Try not to use "..." in your M-files to extend to the next line because switching between Linux and Windows adds extra line feeds and makes it not work...
2) If you use a kernel, try to call train and test members of kernel, see e.g the svm object. This will help to make complex kernel objects, e.g chain networks of kernels.
3) Try to use get, get_x and set_x of data objects, and try not to create new data objects. E.g. in your code do not do

    d2=data; d2.X=d.X(1:2,:); d2.Y=d.Y(1:2,:);
    d3=data(d.name,d.X,d.Y); d3.X=xx;

but rather, do:

    d2=get(d,1:2);
    d3=set_x(d,xx);

otherwise you may lose e.g. indexing information and other transparent members in an object, or be incompatible if members are added later.