EE4389/8591: Support Vector Machine Classifier

Professor Cherkassky, University of Minnesota

Description

The Support Vector Machine (SVM) is a universal constructive learning procedure based on statistical learning theory (Vapnik, 1995). For details about Support Vector Classifiers (SVC), please refer to [1].

The STPRTool provides a number of interfaces for SVC. In the context of this course, however, we focus on three interfaces for implementing an SVC:

  • evalsvm

  • svmclass

  • psvm

Usage interface I

This interface is used to train and evaluate the SVM classifier.

[model, errors] = evalsvm(trn_data, val_data, options)

Input arguments

trn_data

This data structure contains the input training data; a small construction sketch follows the field list.

  • trn_data.X = a $d \times n$ matrix of the input variables, where d is the dimension of the input data and n is the number of samples

  • trn_data.y = a $1 \times n$ array of the class labels

  • trn_data.num_data = total number of samples

  • trn_data.dim = input data dimension
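For concreteness, the following sketch builds a trn_data structure by hand. The two-dimensional inputs and labels here are made-up illustrative values, not STPRTool data:

X = randn(2, 10);                 % d = 2 inputs, n = 10 samples (illustrative)
y = [1 1 1 1 1 2 2 2 2 2];        % class labels in {1, 2}

trn_data.X = X;                   % d x n input matrix
trn_data.y = y;                   % 1 x n label vector
trn_data.num_data = size(X, 2);   % total number of samples n
trn_data.dim = size(X, 1);        % input data dimension d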

Tunable parameters

val_data

This data structure contains the input validation data. Its fields are the same as those of trn_data. It needs to be specified only if model selection is to be done on a separate validation set; a sketch of this usage follows.
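A minimal sketch of this usage, assuming the loaded structure carries the X and y fields described above (the 70/30 split is an arbitrary illustrative choice):

data = load('riply_trn');                 % training data shipped with STPRTool
idx = randperm(length(data.y));           % shuffle the sample indices
n_trn = round(0.7 * length(data.y));      % illustrative 70/30 split

trn.X = data.X(:, idx(1:n_trn));          % training portion
trn.y = data.y(idx(1:n_trn));
val.X = data.X(:, idx(n_trn+1:end));      % held-out validation portion
val.y = data.y(idx(n_trn+1:end));

% with an options structure as described below:
% [model, errors] = evalsvm(trn, val, options);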

options

This specifies the set of options over which the SVM classifier is to be evaluated.

  • options.ker = the type of kernel for the SVM classifier

    • options.ker='linear': linear, $H(x,x') = x^T x'$

    • options.ker='poly': polynomial of degree q, $H(x,x') = (x^T x' + 1)^q$

    • options.ker='rbf': radial basis function, $H(x,x') = \exp\left(-\|x-x'\|^2 / \sigma^2\right)$

    • options.ker='sigmoid': sigmoidal, $H(x,x') = \tanh(\nu x^T x' + a)$

  • options.dimarg = the number of arguments required by the chosen kernel type; dimarg=1 for ker='rbf' and dimarg=2 for ker='sigmoid'

  • options.arg = the set of kernel arguments over which the SVC is to be evaluated
    This is generally a $\mathrm{dimarg} \times 1$ vector. We can, however, specify a range of arguments, in which case its dimension becomes $\mathrm{dimarg} \times k$, where k is the number of candidate values to test (see the sketch after this list).

  • options.C = the set of regularization constants (also called the constraints) over which the SVC is to be evaluated
    A range of C values can be specified, and the model is evaluated for each.

  • options.solver = the type of solver to be used by the SVC (default 'smo')

  • options.num_folds = the number of cross-validation folds used to evaluate the model (default 5)

  • options.verb = progress information is displayed if set to 1 (default 0)
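To make the shape of options.arg concrete, the sketch below contrasts an 'rbf' grid (dimarg = 1) with a 'sigmoid' grid (dimarg = 2); the parameter values themselves are arbitrary illustrative choices:

% 'rbf' kernel: dimarg = 1, so options.arg is 1 x k (one sigma per column)
opt_rbf.ker = 'rbf';
opt_rbf.arg = [0.1 0.5 1 5];       % four candidate sigma values
opt_rbf.C   = [1 10 100];          % three candidate C values

% 'sigmoid' kernel: dimarg = 2, so options.arg is 2 x k
% (row 1 holds nu, row 2 holds a; one column per candidate pair)
opt_sig.ker = 'sigmoid';
opt_sig.arg = [0.5 1.0; 0.1 0.2];  % two candidate (nu, a) pairs
opt_sig.C   = [1 10 100];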

Output

model

This is the best model selected by the evalsvm interface, based on the validation-set error or the cross-validation error (whichever was specified). The sketch after the field list shows how to inspect it.

  • model.Alpha = the optimal Lagrange multipliers obtained by solving the dual problem

  • model.b = the bias term in the decision function

  • model.nsv = number of support vectors

  • model.trnerr = the training error of the best model

  • model.margin = the soft margin
    This is used by the psvm interface when plotting the soft margin.

  • model.sv = a structure containing all the support vectors

  • model.options = the options used by the solver

  • model.fun = the type of classifier to be used while displaying the decision boundary (used by the psvm interface)

  • model.cputime = time taken to build the model
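Once evalsvm returns, these fields can be inspected directly. A small sketch, assuming the winning kernel type and argument are recorded in model.options as suggested by the field lists in this document:

fprintf('support vectors : %d\n', model.nsv);
fprintf('training error  : %.4f\n', model.trnerr);
fprintf('soft margin     : %.4f\n', model.margin);
fprintf('kernel, arg     : %s, %g\n', model.options.ker, model.options.arg(1));
fprintf('CPU time (s)    : %.2f\n', model.cputime);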

errors

This is the classification error of the best model on the validation set. If no validation set was provided, it represents the cross-validation error instead.

Issue: One issue with the evalsvm interface (and, in general, with any SVM solver interface in STPRTool) is that a kernel argument must be specified even for the 'linear' SVM. Other solver interfaces, such as 'smo', take this value to be one by default, but evalsvm requires the user to supply it. This can be confusing, since a linear SVM takes no kernel argument. Although the parameter is not used internally, it is suggested to set it to 1, as in the snippet below.
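For example, the workaround for a linear SVM looks like this (the dummy kernel argument is ignored internally; the C range is illustrative):

options.ker = 'linear';
options.arg = 1;              % required by evalsvm, unused by the solver
options.C   = [0.1 1 10];     % illustrative range of C values
[model, errors] = evalsvm(trn, options);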

Usage interface II

This is used to classify new test data with the SVM classifier obtained above. For binary classification the predicted label is

$$ y_i = \begin{cases} 1, & f(x_i) \geq 0, \\ 2, & f(x_i) < 0, \end{cases} $$

where f is the discriminant function determined by Alpha ($\mathrm{nsv} \times 1$), b ($1 \times 1$), and the support vectors sv.X.

[ypred, dfce] = svmclass(X, model)

Input arguments

X

Input vectors to be classified. They must have the same dimension d as the trn_data.X used to train the SVM classifier model.

model

This is the SVM classifier model.

  • model.Alpha = multipliers associated with the support vectors ($\mathrm{nsv} \times \mathrm{nfun}$)
    Here nfun is the number of discriminant functions; nfun = 1 for binary classification.

  • model.b = biases ($\mathrm{nfun} \times 1$)

  • model.sv.X = the X values of the support vectors ($d \times \mathrm{nsv}$)

  • model.options.ker = the type of kernel

  • model.options.arg = the kernel argument of the best model

Output

ypred

The predicted labels of the input test data ($1 \times n$).

dfce

Values of the discriminant functions ($\mathrm{nfun} \times n$).
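For a binary model with an 'rbf' kernel, the values in dfce can be reproduced by expanding the kernel sum directly. The helper below is a minimal illustrative sketch, not a STPRTool routine:

function [ypred, f] = svm_discriminant_rbf(X, model)
% Recompute f(x) = sum_i Alpha(i) * K(sv_i, x) + b for the RBF kernel
% K(x, x') = exp(-|x - x'|^2 / sigma^2), then apply the labeling rule.
sigma = model.options.arg(1);      % RBF width of the trained model
nsv   = size(model.sv.X, 2);       % number of support vectors
n     = size(X, 2);                % number of test points
f     = zeros(1, n);
for j = 1:n
    d2   = sum((model.sv.X - repmat(X(:, j), 1, nsv)).^2, 1);  % squared distances to SVs
    f(j) = exp(-d2 / sigma^2) * model.Alpha + model.b;         % kernel expansion
end
ypred = 2 - (f >= 0);              % f >= 0 -> class 1, f < 0 -> class 2
end

Calling [ypred2, f] = svm_discriminant_rbf(tst.X, model) should then agree with the output of svmclass for such a model.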

Usage interface III

This interface is used to plot the SVM decision boundary along with the soft margin.

h = psvm(model)

Input arguments

model

This is the best model obtained by using the evalsvm interface.

Output

h

The handle to the graphics object.

Example

In this example we set the range of C to [1, 10, 20, 30], use the 'rbf' kernel, and set the range of sigma to [0.1, 0.5, 1, 5]. Model selection is done by 15-fold cross-validation.

Load the training data
trn = load('riply_trn');
Define the model parameters (parameter tuning)
options.ker = 'rbf';			% 'rbf' kernel
options.arg = [0.1, 0.5, 1, 5];		% the range of sigma values
options.C = [1, 10, 20, 30];		% the range of C values
options.solver = 'smo';			% the type of solver
options.num_folds = 15;			% the number of folds for cross-validation
options.verb = 1;			% set to 1 if you need to print the CV errors
Perform model selection
[model, errors] = evalsvm(trn, options);	% use the interface for selecting the best model

Now we have all the CV errors and the best model. Let us test the model on Ripley's test data provided with STPRTool.

tst = load('riply_tst');
[ypred, dfce] = svmclass(tst.X, model);		% predict the class label for the test data
err = cerror(ypred, tst.y)			% classification error on the test set
Plot the decision boundary and the soft margin
figure; hold on;
ppatterns(trn);
psvm(model);
xlabel('x1'), ylabel('x2');
title('SVM decision boundary with soft margin');