Libsvm is a simple, easy-to-use, and efficient software for SVM classification and regression. It solves C-SVM classification, nu-SVM classification, one-class-SVM, epsilon-SVM regression, and nu-SVM regression. It also provides an automatic model selection tool for C-SVM classification. This document explains the use of libsvm.
Libsvm is available at
http://www.csie.ntu.edu.tw/~cjlin/libsvm
Please read the COPYRIGHT file before using libsvm.
If you are new to SVM and your data is not large, go to the `tools' directory and use easy.py after installation. It does everything automatically, from data scaling to parameter selection.
Usage: easy.py training_file [testing_file]
More information about parameter selection can be found in `tools/README.'
On Unix systems, type `make' to build the `svm-train', `svm-predict', and `svm-scale' programs. Run them without arguments to show their usage.
On other systems, consult `Makefile' to build them (e.g., see 'Building Windows binaries' in this file) or use the pre-built binaries (Windows binaries are in the directory `windows').
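The format of training and testing data files is:
    <label> <index1>:<value1> <index2>:<value2> ...
Each line contains an instance and is ended by a '\n' character. For <label> in the training set, we have the following cases.
For classification, <label> is an integer indicating the class label (multi-class is supported).
For regression, <label> is the target value, which can be any real number.
For one-class SVM, <label> is not used and can be any number.
In the test set, <label> is used only to calculate accuracy or errors. If it is unknown, any number is fine.
The pair <index>:<value> gives a feature (attribute) value: <index> is an integer starting from 1 and <value> is a real number. The only exception is the precomputed kernel, where <index> starts from 0; see the section of precomputed kernels. Indices must be in ASCENDING order.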
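The line format above can be illustrated with a small parser. This is only a sketch: `parse_libsvm_line` and its `min_index` argument are hypothetical names made up for illustration and are not part of libsvm (the package's own format checker is tools/checkdata.py), and the sample line is invented.

```python
def parse_libsvm_line(line, min_index=1):
    """Parse one 'label index:value ...' line into (label, [(index, value), ...]).

    min_index is 1 for ordinary data and 0 for precomputed kernels.
    """
    tokens = line.split()
    if not tokens:
        raise ValueError("empty line")
    label = float(tokens[0])
    features = []
    previous_index = min_index - 1
    for token in tokens[1:]:
        index_text, value_text = token.split(":", 1)
        index = int(index_text)
        # Indices must be strictly ascending and not below min_index.
        if index <= previous_index:
            raise ValueError("indices must be ascending and start at %d" % min_index)
        features.append((index, float(value_text)))
        previous_index = index
    return label, features


# Hypothetical sample line, not taken from any data file in the package.
label, features = parse_libsvm_line("+1 1:0.5 3:-1 12:1")
print(label, features)
```

Features that are absent from a line (here indices 2 and 4 to 11) are simply treated as zero, which is why the format suits sparse data.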
A sample classification data set included in this package is `heart_scale'. To check if your data is in a correct form, use `tools/checkdata.py' (details in `tools/README').
Type `svm-train heart_scale', and the program will read the training data and output the model file `heart_scale.model'. If you have a test set called heart_scale.t, then type `svm-predict heart_scale.t heart_scale.model output' to see the prediction accuracy. The `output' file contains the predicted class labels.
For classification, if training data are in only one class (i.e., all labels are the same), then `svm-train' issues a warning message: `Warning: training data in only one class. See README for details,' which means the training data is very unbalanced. The label in the training data is directly returned when testing.
There are some other useful programs in this package.
svm-scale:
This is a tool for scaling an input data file.
svm-toy:
This is a simple graphical interface which shows how SVM separates data in a plane. You can click in the window to draw data points. Use the "change" button to choose class 1, 2 or 3 (i.e., up to three classes are supported), the "load" button to load data from a file, the "save" button to save data to a file, the "run" button to obtain an SVM model, and the "clear" button to clear the window. You can enter options at the bottom of the window; the syntax of options is the same as for `svm-train'.
Note that "load" and "save" use the dense data format in both the classification and regression cases. For classification, each data point has one label (the color) that must be 1, 2, or 3 and two attributes (x-axis and y-axis values) in [0,1). For regression, each data point has one target value (y-axis) and one attribute (x-axis values) in [0, 1).
Type `make' in respective directories to build them.
You need the Qt library to build the Qt version (available from http://www.trolltech.com).
You need the GTK+ library to build the GTK version (available from http://www.gtk.org).
The pre-built Windows binaries are in the `windows' directory. We use Visual C++ on a 64-bit machine.
Usage: svm-train [options] training_set_file [model_file]
options:
-s svm_type : set type of SVM (default 0)
    0 -- C-SVC          (multi-class classification)
    1 -- nu-SVC         (multi-class classification)
    2 -- one-class SVM
    3 -- epsilon-SVR    (regression)
    4 -- nu-SVR         (regression)
-t kernel_type : set type of kernel function (default 2)
    0 -- linear: u'*v
    1 -- polynomial: (gamma*u'*v + coef0)^degree
    2 -- radial basis function: exp(-gamma*|u-v|^2)
    3 -- sigmoid: tanh(gamma*u'*v + coef0)
    4 -- precomputed kernel (kernel values in training_set_file)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
-v n: n-fold cross validation mode
-q : quiet mode (no outputs)
The -v option randomly splits the data into n parts and calculates the cross validation accuracy (classification) or mean squared error (regression) on them.
See libsvm FAQ for the meaning of outputs.
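The kernel formulas listed for the -t option can be written out directly. The following is a minimal sketch on dense Python lists; the function names are hypothetical and this is not libsvm's implementation, which works on sparse vectors.

```python
import math


def dot(u, v):
    """Inner product u'*v of two dense vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def linear(u, v):
    return dot(u, v)


def polynomial(u, v, gamma, coef0, degree):
    return (gamma * dot(u, v) + coef0) ** degree


def rbf(u, v, gamma):
    # |u-v|^2 is the squared Euclidean distance.
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid(u, v, gamma, coef0):
    return math.tanh(gamma * dot(u, v) + coef0)


u, v = [1.0, 2.0], [3.0, 4.0]
default_gamma = 1.0 / len(u)  # the documented default: 1/num_features
print(linear(u, v), polynomial(u, v, default_gamma, 0.0, 3), rbf(u, v, default_gamma))
```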
Usage: svm-predict [options] test_file model_file output_file
options:
-b probability_estimates: whether to predict probability estimates, 0 or 1 (default 0); for one-class SVM only 0 is supported
model_file is the model file generated by svm-train. test_file is the test data you want to predict. svm-predict will produce output in the output_file.
Usage: svm-scale [options] data_filename
options:
-l lower : x scaling lower limit (default -1)
-u upper : x scaling upper limit (default +1)
-y y_lower y_upper : y scaling limits (default: no y scaling)
-s save_filename : save scaling parameters to save_filename
-r restore_filename : restore scaling parameters from restore_filename
See 'Examples' in this file for examples.
svm-scale -l -1 -u 1 -s range train > train.scale
svm-scale -r range test > test.scale
Scale each feature of the training data to be in [-1,1]. Scaling factors are stored in the file range and then used for scaling the test data.
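The idea of reusing the training ranges on the test data can be sketched as follows. This assumes plain per-feature linear (min/max) scaling; `fit_ranges` and `scale_rows` are hypothetical helper names, not part of svm-scale, and mapping a constant feature to 0 is a choice made only for this sketch.

```python
def fit_ranges(rows):
    """Record the (min, max) of every feature column in the training rows."""
    return [(min(column), max(column)) for column in zip(*rows)]


def scale_rows(rows, ranges, lower=-1.0, upper=1.0):
    """Map each feature linearly so the stored range becomes [lower, upper]."""
    scaled = []
    for row in rows:
        out = []
        for value, (low, high) in zip(row, ranges):
            if high == low:
                out.append(0.0)  # constant feature: sketch-only convention
            else:
                out.append(lower + (upper - lower) * (value - low) / (high - low))
        scaled.append(out)
    return scaled


train = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
test = [[2.5, 40.0]]
ranges = fit_ranges(train)          # plays the role of the `range' file
print(scale_rows(train, ranges))    # every training value lies in [-1, 1]
print(scale_rows(test, ranges))     # test values may fall outside [-1, 1]
```

The point of storing the ranges is that the test data must be transformed with the training factors, not with factors computed from the test data itself.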
svm-train -s 0 -c 5 -t 2 -g 0.5 -e 0.1 data_file
Train a classifier with RBF kernel exp(-0.5|u-v|^2), C=5, and stopping tolerance 0.1.
svm-train -s 3 -p 0.1 -t 0 data_file
Solve SVM regression with linear kernel u'v and epsilon=0.1 in the loss function.
svm-train -c 10 -w1 1 -w-2 5 -w4 2 data_file
Train a classifier with penalty 10 = 1 * 10 for class 1, penalty 50 = 5 * 10 for class -2, and penalty 20 = 2 * 10 for class 4.
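The arithmetic behind the -wi option is simply weight*C per class. A tiny sketch, with the hypothetical helper `effective_penalties`:

```python
def effective_penalties(cost, weights):
    """Return the per-class penalty weight*C for each class label."""
    return {label: weight * cost for label, weight in weights.items()}


# Mirrors: svm-train -c 10 -w1 1 -w-2 5 -w4 2 data_file
print(effective_penalties(10, {1: 1, -2: 5, 4: 2}))
```

Classes without a -wi option keep the default weight 1, so their penalty is C itself.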
svm-train -s 0 -c 100 -g 0.1 -v 5 data_file
Do five-fold cross validation for the classifier using the parameters C = 100 and gamma = 0.1.
svm-train -s 0 -b 1 data_file
svm-predict -b 1 test_file data_file.model output_file
Obtain a model with probability information and predict test data with probability estimates.
Users may precompute kernel values and input them as training and testing files. Then libsvm does not need the original training/testing sets.
Assume there are L training instances x1, ..., xL. Let K(x, y) be the kernel value of two instances x and y. The input formats are:
New training instance for xi:
    <label> 0:i 1:K(xi,x1) ... L:K(xi,xL)
New testing instance for any x:
    <label> 0:? 1:K(x,x1) ... L:K(x,xL)
That is, in the training file the first column must be the "ID" of xi. In testing, ? can be any value.
All kernel values including ZEROs must be explicitly provided. Any permutation or random subsets of the training/testing files are also valid (see examples below).
Note: the format is slightly different from the precomputed kernel package released in libsvmtools earlier.
Examples:
Assume the original training data has three four-feature instances and testing data has one instance:
    15  1:1 2:1 3:1 4:1
    45      2:3     4:3
    25          3:1

    15  1:1     3:1
If the linear kernel is used, we have the following new training/testing sets:
    15  0:1 1:4 2:6  3:1
    45  0:2 1:6 2:18 3:0
    25  0:3 1:1 2:0  3:1
15 0:? 1:2 2:0 3:1
? can be any value.
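The linear-kernel example can be reproduced with a short script. This is a sketch only: `linear_kernel` and `precomputed_lines` are hypothetical helpers, instances are held as {index: value} dictionaries, and the literal `?` merely mirrors the notation used here (put any number in its place when writing a real file).

```python
def linear_kernel(x, y):
    """u'*v for two sparse instances stored as {index: value} dicts."""
    return sum(value * y.get(index, 0) for index, value in x.items())


def precomputed_lines(labels, instances, train_instances, ids=None):
    """Build 'label 0:id 1:K(x,x1) ... L:K(x,xL)' lines.

    ids gives the serial number of each training instance; when it is None
    (testing data) the placeholder '?' is written instead.
    """
    lines = []
    for row, (label, x) in enumerate(zip(labels, instances)):
        serial = "?" if ids is None else str(ids[row])
        fields = [str(label), "0:" + serial]
        for column, xi in enumerate(train_instances, start=1):
            fields.append("%d:%g" % (column, linear_kernel(x, xi)))
        lines.append(" ".join(fields))
    return lines


train_x = [{1: 1, 2: 1, 3: 1, 4: 1}, {2: 3, 4: 3}, {3: 1}]
train_y = [15, 45, 25]
test_x = [{1: 1, 3: 1}]
test_y = [15]

train_lines = precomputed_lines(train_y, train_x, train_x, ids=[1, 2, 3])
test_lines = precomputed_lines(test_y, test_x, train_x)
print("\n".join(train_lines))
print("\n".join(test_lines))
```

Every kernel value is written out, including zeros, as the format requires.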
Any subset of the above training file is also valid. For example,
    25  0:3 1:1 2:0  3:1
    45  0:2 1:6 2:18 3:0
implies that the kernel matrix is
    [K(2,2) K(2,3)] = [18 0]
    [K(3,2) K(3,3)] = [0  1]
These functions and structures are declared in the header file `svm.h'. You need to #include "svm.h" in your C/C++ source files and link your program with `svm.cpp'. You can see `svm-train.c' and `svm-predict.c' for examples showing how to use them. We define LIBSVM_VERSION and declare `extern int libsvm_version;' in svm.h, so you can check the version number.
Before you classify test data, you need to construct an SVM model (`svm_model') using training data. A model can also be saved in a file for later use. Once an SVM model is available, you can use it to classify new data.
Function: struct svm_model *svm_train(const struct svm_problem *prob, const struct svm_parameter *param);
This function constructs and returns an SVM model according to the given training data and parameters.
struct svm_problem describes the problem:
    struct svm_problem
    {
        int l;
        double *y;
        struct svm_node **x;
    };
where `l' is the number of training data, and `y' is an array containing their target values (integers in classification, real numbers in regression). `x' is an array of pointers, each of which points to a sparse representation (array of svm_node) of one training vector.
For example, if we have the following training data:
    LABEL    ATTR1    ATTR2    ATTR3    ATTR4    ATTR5
    -----    -----    -----    -----    -----    -----
      1        0        0.1      0.2      0        0
      2        0        0.1      0.3     -1.2      0
      1        0.4      0        0        0        0
      2        0        0.1      0        1.4      0.5
      3       -0.1     -0.2      0.1      1.1      0.1
then the components of svm_problem are:
l = 5
y -> 1 2 1 2 3
    x -> [ ] -> (2,0.1) (3,0.2) (-1,?)
         [ ] -> (2,0.1) (3,0.3) (4,-1.2) (-1,?)
         [ ] -> (1,0.4) (-1,?)
         [ ] -> (2,0.1) (4,1.4) (5,0.5) (-1,?)
         [ ] -> (1,-0.1) (2,-0.2) (3,0.1) (4,1.1) (5,0.1) (-1,?)
where (index,value) is stored in the structure `svm_node':
    struct svm_node
    {
        int index;
        double value;
    };
index = -1 indicates the end of one vector. Note that indices must be in ASCENDING order.
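The sparse layout can be mimicked in a few lines. This sketch models each svm_node as an (index, value) tuple and uses None for the unused value of the terminating node; `to_sparse_nodes` is a hypothetical helper, not a libsvm function.

```python
def to_sparse_nodes(dense_row):
    """Convert a dense feature row into a list of (index, value) nodes.

    Zero-valued features are dropped, indices start at 1 and are ascending,
    and a node with index -1 marks the end of the vector.
    """
    nodes = [(index, value)
             for index, value in enumerate(dense_row, start=1)
             if value != 0]
    nodes.append((-1, None))  # the value of the end marker is never read
    return nodes


dense_rows = [
    [0, 0.1, 0.2, 0, 0],
    [0, 0.1, 0.3, -1.2, 0],
    [0.4, 0, 0, 0, 0],
    [0, 0.1, 0, 1.4, 0.5],
    [-0.1, -0.2, 0.1, 1.1, 0.1],
]
for row in dense_rows:
    print(to_sparse_nodes(row))
```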
struct svm_parameter describes the parameters of an SVM model:
    struct svm_parameter
    {
        int svm_type;
        int kernel_type;
        int degree;     /* for poly */
        double gamma;   /* for poly/rbf/sigmoid */
        double coef0;   /* for poly/sigmoid */

        /* these are for training only */
        double cache_size; /* in MB */
        double eps;     /* stopping criteria */
        double C;       /* for C_SVC, EPSILON_SVR, and NU_SVR */
        int nr_weight;          /* for C_SVC */
        int *weight_label;      /* for C_SVC */
        double* weight;         /* for C_SVC */
        double nu;      /* for NU_SVC, ONE_CLASS, and NU_SVR */
        double p;       /* for EPSILON_SVR */
        int shrinking;  /* use the shrinking heuristics */
        int probability; /* do probability estimates */
    };
svm_type can be one of C_SVC, NU_SVC, ONE_CLASS, EPSILON_SVR, NU_SVR.
    C_SVC:        C-SVM classification
    NU_SVC:       nu-SVM classification
    ONE_CLASS:    one-class-SVM
    EPSILON_SVR:  epsilon-SVM regression
    NU_SVR:       nu-SVM regression
kernel_type can be one of LINEAR, POLY, RBF, SIGMOID, PRECOMPUTED.
    LINEAR:       u'*v
    POLY:         (gamma*u'*v + coef0)^degree
    RBF:          exp(-gamma*|u-v|^2)
    SIGMOID:      tanh(gamma*u'*v + coef0)
    PRECOMPUTED:  kernel values in training_set_file
cache_size is the size of the kernel cache, specified in megabytes.
C is the cost of constraints violation.
eps is the stopping criterion (we usually use 0.00001 in nu-SVC and 0.001 in others).
nu is the parameter in nu-SVM, nu-SVR, and one-class-SVM.
p is the epsilon in the epsilon-insensitive loss function of epsilon-SVM regression.
shrinking = 1 means shrinking is conducted; = 0 otherwise.
probability = 1 means a model with probability information is obtained; = 0 otherwise.
nr_weight, weight_label, and weight are used to change the penalty for some classes (if the weight for a class is not changed, it is set to 1). This is useful for training a classifier using unbalanced input data or with asymmetric misclassification cost.
nr_weight is the number of elements in the arrays weight_label and weight. Each weight[i] corresponds to weight_label[i], meaning that the penalty of class weight_label[i] is scaled by a factor of weight[i].
If you do not want to change penalty for any of the classes, just set nr_weight to 0.
NOTE: Because svm_model contains pointers to svm_problem, you cannot free the memory used by svm_problem if you are still using the svm_model produced by svm_train().
NOTE: To avoid wrong parameters, svm_check_parameter() should be called before svm_train().
struct svm_model stores the model obtained from the training procedure. It is not recommended to directly access entries in this structure. Programmers should use the interface functions to get the values.
    struct svm_model
    {
        struct svm_parameter param; /* parameter */
        int nr_class;           /* number of classes, = 2 in regression/one class svm */
        int l;                  /* total #SV */
        struct svm_node **SV;   /* SVs (SV[l]) */
        double **sv_coef;       /* coefficients for SVs in decision functions (sv_coef[k-1][l]) */
        double *rho;            /* constants in decision functions (rho[k*(k-1)/2]) */
        double *probA;          /* pairwise probability information */
        double *probB;
        int *sv_indices;        /* sv_indices[0,...,nSV-1] are values in [1,...,num_training_data] to indicate SVs in the training set */

        /* for classification only */

        int *label;             /* label of each class (label[k]) */
        int *nSV;               /* number of SVs for each class (nSV[k]) */
                                /* nSV[0] + nSV[1] + ... + nSV[k-1] = l */
        /* XXX */
        int free_sv;            /* 1 if svm_model is created by svm_load_model */
                                /* 0 if svm_model is created by svm_train */
    };
param describes the parameters used to obtain the model.
nr_class is the number of classes. It is 2 for regression and one-class SVM.
l is the number of support vectors. SV and sv_coef are support vectors and the corresponding coefficients, respectively. Assume there are k classes. For data in class j, the corresponding sv_coef includes (k-1) y*alpha vectors, where alpha's are solutions of the following two class problems: 1 vs j, 2 vs j, ..., j-1 vs j, j vs j+1, j vs j+2, ..., j vs k and y=1 for the first j-1 vectors, while y=-1 for the remaining k-j vectors. For example, if there are 4 classes, sv_coef and SV are like:
    +-+-+-+--------------------+
    |1|1|1|                    |
    |v|v|v|  SVs from class 1  |
    |2|3|4|                    |
    +-+-+-+--------------------+
    |1|2|2|                    |
    |v|v|v|  SVs from class 2  |
    |2|3|4|                    |
    +-+-+-+--------------------+
    |1|2|3|                    |
    |v|v|v|  SVs from class 3  |
    |3|3|4|                    |
    +-+-+-+--------------------+
    |1|2|3|                    |
    |v|v|v|  SVs from class 4  |
    |4|4|4|                    |
    +-+-+-+--------------------+
See svm_train() for an example of assigning values to sv_coef.
rho is the bias term (-b). probA and probB are parameters used in probability outputs. If there are k classes, there are k*(k-1)/2 binary problems as well as rho, probA, and probB values. They are aligned in the order of binary problems: 1 vs 2, 1 vs 3, ..., 1 vs k, 2 vs 3, ..., 2 vs k, ..., k-1 vs k.
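The ordering of the binary problems can be enumerated explicitly. A short sketch using 1-based class positions; `binary_problem_order` is a hypothetical helper name.

```python
def binary_problem_order(k):
    """List the (i, j) class pairs in the order 1 vs 2, 1 vs 3, ..., k-1 vs k."""
    return [(i, j) for i in range(1, k + 1) for j in range(i + 1, k + 1)]


pairs = binary_problem_order(4)
print(pairs)       # position p in this list matches rho[p], probA[p], probB[p]
print(len(pairs))  # k*(k-1)/2 binary problems
```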
sv_indices[0,...,nSV-1] are values in [1,...,num_training_data] to indicate support vectors in the training set.
label contains labels in the training data.
nSV is the number of support vectors in each class.
free_sv is a flag used to determine whether the space of SV should be released in svm_free_model_content(struct svm_model*) and svm_free_and_destroy_model(struct svm_model**). If the model is generated by svm_train(), then SV points to data in svm_problem and should not be removed. For example, free_sv is 0 if svm_model is created by svm_train, but is 1 if created by svm_load_model.
Function: double svm_predict(const struct svm_model *model, const struct svm_node *x);
This function does classification or regression on a test vector x given a model.
For a classification model, the predicted class for x is returned. For a regression model, the function value of x calculated using the model is returned. For a one-class model, +1 or -1 is returned.
Function: void svm_cross_validation(const struct svm_problem *prob, const struct svm_parameter *param, int nr_fold, double *target);
This function conducts cross validation. Data are separated into nr_fold folds. Under the given parameters, each fold is validated in turn using the model obtained from training on the remaining folds. Predicted labels (of all prob's instances) in the validation process are stored in the array called target.
The format of prob is the same as that for svm_train().
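The fold-by-fold procedure can be sketched as below. This is an illustration of the general scheme, not libsvm's code: the fold assignment here is a plain seeded shuffle (libsvm's own splitting may differ), the `cross_validation` function and its `seed` argument are hypothetical, and a majority-label rule stands in for the SVM so the example stays self-contained.

```python
import random
from collections import Counter


def cross_validation(labels, nr_fold, seed=0):
    """Return target, where target[i] is the prediction made for instance i
    by a model trained on the folds that do not contain i."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    target = [None] * len(labels)
    for fold in range(nr_fold):
        held_out = indices[fold::nr_fold]  # every nr_fold-th shuffled index
        held_out_set = set(held_out)
        training = [i for i in indices if i not in held_out_set]
        # Stand-in "model": predict the most common label in the training part.
        majority = Counter(labels[i] for i in training).most_common(1)[0][0]
        for i in held_out:
            target[i] = majority
    return target


labels = [1] * 8 + [2] * 2
target = cross_validation(labels, 5)
accuracy = sum(t == y for t, y in zip(target, labels)) / len(labels)
print(target, accuracy)
```

As with the real function, every instance receives exactly one prediction, so accuracy or mean squared error can be computed by comparing target with the true labels afterwards.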
Function: int svm_get_svm_type(const struct svm_model *model);
This function gives svm_type of the model. Possible values of svm_type are defined in svm.h.
Function: int svm_get_nr_class(const struct svm_model *model);
For a classification model, this function gives the number of classes. For a regression or a one-class model, 2 is returned.
Function: void svm_get_labels(const struct svm_model *model, int *label);
For a classification model, this function outputs the name of labels into an array called label. For regression and one-class models, label is unchanged.
Function: void svm_get_sv_indices(const struct svm_model *model, int *sv_indices);
This function outputs indices of support vectors into an array called sv_indices. The size of sv_indices is the number of support vectors and can be obtained by calling svm_get_nr_sv. Each sv_indices[i] is in the range of [1, ..., num_training_data].
Function: int svm_get_nr_sv(const struct svm_model *model);
This function gives the total number of support vectors.
Function: double svm_get_svr_probability(const struct svm_model *model);
For a regression model with probability information, this function outputs a value sigma > 0. For test data, we consider the probability model: target value = predicted value + z, z: Laplace distribution e^(-|z|/sigma)/(2sigma)
If the model is not for svr or does not contain required information, 0 is returned.
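As an illustration of how sigma might be used, the Laplace density above can be evaluated and integrated in closed form. The helper names below are hypothetical, and the sigma value is made up rather than taken from a trained model.

```python
import math


def laplace_density(z, sigma):
    """Density e^(-|z|/sigma) / (2*sigma) of the noise term z."""
    return math.exp(-abs(z) / sigma) / (2.0 * sigma)


def probability_within(t, sigma):
    """P(|z| <= t), i.e. the integral of the density over [-t, t]."""
    return 1.0 - math.exp(-t / sigma)


sigma = 0.5  # made-up value for illustration
print(laplace_density(0.0, sigma))
# Half-width t of the interval [prediction - t, prediction + t]
# that contains the target with probability 0.95 under this model.
t = -sigma * math.log(1.0 - 0.95)
print(t, probability_within(t, sigma))
```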
Function: double svm_predict_values(const struct svm_model *model, const struct svm_node *x, double *dec_values);
This function gives decision values on a test vector x given a model, and returns the predicted label (classification) or the function value (regression).
For a classification model with nr_class classes, this function gives nr_class*(nr_class-1)/2 decision values in the array dec_values, where nr_class can be obtained from the function svm_get_nr_class. The order is label[0] vs. label[1], ..., label[0] vs. label[nr_class-1], label[1] vs. label[2], ..., label[nr_class-2] vs. label[nr_class-1], where label can be obtained from the function svm_get_labels. The returned value is the predicted class for x. Note that when nr_class = 1, this function does not give any decision value.
For a regression model, dec_values[0] and the returned value are both the function value of x calculated using the model. For a one-class model, dec_values[0] is the decision value of x, while the returned value is +1/-1.
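One way to relate the pairwise decision values to a predicted class is one-against-one voting. The sketch below assumes that a positive decision value for the pair (label[i], label[j]) is a vote for label[i], and that ties go to the class appearing first in label; both points are assumptions of this example and `vote` is a hypothetical helper, so consult svm.cpp for the exact rule.

```python
def vote(labels, dec_values):
    """Combine pairwise decision values into a predicted label by voting.

    dec_values follows the order label[0] vs label[1], label[0] vs label[2],
    ..., label[k-2] vs label[k-1].
    """
    k = len(labels)
    votes = [0] * k
    position = 0
    for i in range(k):
        for j in range(i + 1, k):
            if dec_values[position] > 0:
                votes[i] += 1
            else:
                votes[j] += 1
            position += 1
    best = max(range(k), key=lambda c: votes[c])  # first maximum wins ties
    return labels[best], votes


print(vote([1, 2, 3], [1.0, 1.0, -1.0]))
print(vote([1, 2, 3], [-1.0, -1.0, -1.0]))
```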
Function: double svm_predict_probability(const struct svm_model *model, const struct svm_node *x, double *prob_estimates);
This function does classification or regression on a test vector x given a model with probability information.
For a classification model with probability information, this function gives nr_class probability estimates in the array prob_estimates. nr_class can be obtained from the function svm_get_nr_class. The class with the highest probability is returned. For regression/one-class SVM, the array prob_estimates is unchanged and the returned value is the same as that of svm_predict.
Function: const char *svm_check_parameter(const struct svm_problem *prob, const struct svm_parameter *param);
This function checks whether the parameters are within the feasible range of the problem. This function should be called before calling svm_train() and svm_cross_validation(). It returns NULL if the parameters are feasible, otherwise an error message is returned.
Function: int svm_check_probability_model(const struct svm_model *model);
This function checks whether the model contains required information to do probability estimates. If so, it returns +1. Otherwise, 0 is returned. This function should be called before calling svm_get_svr_probability and svm_predict_probability.
Function: int svm_save_model(const char *model_file_name, const struct svm_model *model);
This function saves a model to a file; returns 0 on success, or -1 if an error occurs.
Function: struct svm_model *svm_load_model(const char *model_file_name);
This function returns a pointer to the model read from the file, or a null pointer if the model could not be loaded.
Function: void svm_free_model_content(struct svm_model *model_ptr);
This function frees the memory used by the entries in a model structure.
Function: void svm_free_and_destroy_model(struct svm_model **model_ptr_ptr);
This function frees the memory used by a model and destroys the model structure. It is equivalent to svm_destroy_model, which is deprecated after version 3.0.
Function: void svm_destroy_param(struct svm_parameter *param);
This function frees the memory used by a parameter set.
Function: void svm_set_print_string_function(void (*print_func)(const char *));
Users can specify their output format by a function. Use svm_set_print_string_function(NULL); for default printing to stdout.
The pre-compiled java class archive `libsvm.jar' and its source files are in the java directory. To run the programs, use
    java -classpath libsvm.jar svm_train <arguments>
    java -classpath libsvm.jar svm_predict <arguments>
    java -classpath libsvm.jar svm_toy
    java -classpath libsvm.jar svm_scale <arguments>
Note that you need Java 1.5 (5.0) or above to run it.
You may need to add Java runtime library (like classes.zip) to the classpath. You may need to increase maximum Java heap size.
Library usages are similar to the C version. These functions are available:
    public class svm {
        public static final int LIBSVM_VERSION=324;
        public static svm_model svm_train(svm_problem prob, svm_parameter param);
        public static void svm_cross_validation(svm_problem prob, svm_parameter param, int nr_fold, double[] target);
        public static int svm_get_svm_type(svm_model model);
        public static int svm_get_nr_class(svm_model model);
        public static void svm_get_labels(svm_model model, int[] label);
        public static void svm_get_sv_indices(svm_model model, int[] indices);
        public static int svm_get_nr_sv(svm_model model);
        public static double svm_get_svr_probability(svm_model model);
        public static double svm_predict_values(svm_model model, svm_node[] x, double[] dec_values);
        public static double svm_predict(svm_model model, svm_node[] x);
        public static double svm_predict_probability(svm_model model, svm_node[] x, double[] prob_estimates);
        public static void svm_save_model(String model_file_name, svm_model model) throws IOException
        public static svm_model svm_load_model(String model_file_name) throws IOException
        public static String svm_check_parameter(svm_problem prob, svm_parameter param);
        public static int svm_check_probability_model(svm_model model);
        public static void svm_set_print_string_function(svm_print_interface print_func);
    }
The library is in the "libsvm" package. Note that in the Java version, svm_node[] is not ended with a node whose index = -1.
Users can specify their output format by
    your_print_func = new svm_print_interface()
    {
        public void print(String s)
        {
            // your own format
        }
    };
    svm.svm_set_print_string_function(your_print_func);
Windows binaries are available in the directory `windows'. To re-build them via Visual C++, use the following steps:
"C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvars64.bat"
You may have to modify the above command according to which version of VC++ you use and where it is installed.
nmake -f Makefile.win clean all
nmake -f Makefile.win lib
Another way is to build them from Visual C++ environment. See details in libsvm FAQ.
See the README file in the tools directory.
Please check the file README in the directory `matlab'.
See the README file in the `python' directory.
If you find LIBSVM helpful, please cite it as
Chih-Chung Chang and Chih-Jen Lin, LIBSVM : a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
LIBSVM implementation document is available at http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf
For any questions and comments, please email [email protected]
Acknowledgments: This work was supported in part by the National Science Council of Taiwan via the grant NSC 89-2213-E-002-013. The authors thank their group members and users for many helpful discussions and comments. They are listed in http://www.csie.ntu.edu.tw/~cjlin/libsvm/acknowledgements