Support Vector Machines

Reference implementations

  • SVMlight - lean reference implementation by Thorsten Joachims. I had some problems getting it running in my code (linker errors …)
  • SVMlightlib - an easy-to-use wrapper around the original SVMlight by Miha Grčar. Here I had no problems getting it running. Version 1.0 also comes with a .NET wrapper, and the following example shows how to use the SVMlightlib functions:
// requires: using System; using System.Collections.Generic;
//           using System.Diagnostics; using System.IO;
public static void Main(string[] args)
{
    // *** Test SVM^light inductive mode ***
    Console.WriteLine("Testing SVM^light inductive mode ...");
    Console.WriteLine("Training ...");
    // note: verbatim strings (@"...") take single backslashes
    StreamReader reader = new StreamReader(@"W:\01_data\97_svm_training_test_data\example_textclassification\train.dat");
    List<int> train_set = ReadFeatureVectors(reader);
    int model_id = SvmLightLib.TrainModel("", train_set.Count, train_set.ToArray());
    Console.WriteLine("Classifying ...");
    reader = new StreamReader(@"W:\01_data\97_svm_training_test_data\example_textclassification\test.dat");
    List<int> test_set = ReadFeatureVectors(reader);
    int correct = 0;
    foreach (int vec_id in test_set)
    {
        int true_lbl = (int)SvmLightLib.GetFeatureVectorLabel(vec_id);
        SvmLightLib.Classify(model_id, 1, new int[] { vec_id });
        Debug.Assert(SvmLightLib.GetFeatureVectorClassifScoreCount(vec_id) == 1);
        double result = SvmLightLib.GetFeatureVectorClassifScore(vec_id, 0);
        int predicted_lbl = result > 0 ? 1 : -1;
        if (true_lbl == predicted_lbl) { correct++; }
    }
    Console.WriteLine("Accuracy: {0:0.00}%", (double)correct / (double)test_set.Count * 100.0);
    // cleanup
    foreach (int vec_id in train_set) { SvmLightLib.DeleteFeatureVector(vec_id); }
    foreach (int vec_id in test_set) { SvmLightLib.DeleteFeatureVector(vec_id); }
}

How can we provide class probabilities instead of class labels?

Normally an SVM gives you a class label y=+1 or y=-1 for a two-class problem and a vector x to be classified. But sometimes you would rather have class probabilities: P(y=1|x).
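The sign-only behaviour can be illustrated with a tiny hypothetical linear decision function (weights and input made up for illustration, not from this page):

```python
# An SVM's raw output f(x) is a signed score; the predicted label is just
# its sign, which throws away all confidence information.
w, b = [2.0, -1.0], 0.5  # made-up linear model

def decision(x):
    """Decision function f(x) = w.x + b of a (hypothetical) linear SVM."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

x = [1.0, 0.0]
score = decision(x)             # 2.5 -- the magnitude is discarded ...
label = 1 if score > 0 else -1  # ... when we only keep the sign: +1
print(score, label)
```

The calibration methods below try to recover a probability from that discarded magnitude.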

There is a lot of work on how SVMs can provide class probabilities. A famous method is Platt's method. It is described in this talk:

Probabilistic Outputs for SVMs and Comparisons to Regularized Likelihood Methods by Nikos Karampatziakis & John Platt.

This talk also mentions alternative approaches: 1. Wahba's approach, 2. Vapnik's approach, 3. Hastie & Tibshirani's approach.

Platt's method is also implemented in libsvm: use svm-train -b 1 to train an SVM model that can provide class probabilities, and svm-predict -b 1 to predict class probabilities for test vectors.

The main idea seems to be to consider the scores f(x) of the SVM decision function (not only the sign, but also the absolute value!), to count how often each score value appears for each class, and to estimate the class probability from these relative frequencies. The idea is also described in this talk: Obtaining Calibrated Probability Estimates from SVMs by Joseph Drish:

• We now reach the part where the outputs of the classifier are converted into well-calibrated posterior probabilities.
• Given a test example x, knowing its distance from the separating hyperplane, and the class to which it belongs, the problem of interest is to know the probability with which x belongs to class j.
• Binning is a simple histogram technique.
• Training examples are first ranked according to their distance scores.
• They are then divided into subsets of equal size – these are called bins.
• Each bin therefore has an upper bound and a lower bound distance score value.
• The number of subsets b is chosen experimentally so that the variance in the resulting probability estimates is reduced.
• Given a test example x, it is placed in the bin according to the distance score predicted for it by the SVM.
• The corresponding estimated probability that x belongs to its predicted class j is the fraction of true positives in the bin, i.e. the fraction of training examples in the bin that actually belong to the class that has been predicted for x.
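The binning steps above can be sketched in a few lines of Python. This is a minimal sketch assuming we already have SVM decision scores and true labels (+1/-1) for the training examples; function and variable names are my own, not from the talk:

```python
def fit_bins(scores, labels, num_bins=3):
    """Rank training examples by decision score, split them into num_bins
    equal-size bins, and record each bin's upper score bound together with
    its fraction of positive examples."""
    ranked = sorted(zip(scores, labels))  # rank by distance score
    size = len(ranked) // num_bins
    bins = []
    for i in range(num_bins):
        # the last bin absorbs any remainder
        chunk = ranked[i * size:(i + 1) * size] if i < num_bins - 1 else ranked[i * size:]
        upper = chunk[-1][0]
        positives = sum(1 for _, y in chunk if y == 1)
        bins.append((upper, positives / len(chunk)))
    return bins

def estimate_probability(bins, score):
    """P(y=1|x): the positive fraction of the bin whose score range
    contains the SVM score predicted for the test example."""
    for upper, p in bins:
        if score <= upper:
            return p
    return bins[-1][1]  # score above all training scores: use the last bin

# Toy usage: 8 training scores split into 2 bins
bins = fit_bins([-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0],
                [-1, -1, -1, 1, -1, 1, 1, 1], num_bins=2)
print(estimate_probability(bins, 1.2))  # 0.75: 3 of 4 examples in the upper bin are positive
```

Note the choice of num_bins: too few bins give coarse probabilities, too many leave so few examples per bin that the estimated fractions get noisy — which is the variance trade-off the talk mentions.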

In Platt's paper

Probabilistic Outputs for Support Vector Machines and Comparisons in Advances In Large Margin Classifiers, 1999, 61–74, MIT Press

Platt says that the SVM is trained first, and then the parameters A and B of an additional sigmoid function P(y=1|f) = 1 / (1 + exp(A·f + B)) are fitted to map the SVM outputs f(x) into probabilities.
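The sigmoid fit can be sketched as follows. Platt's paper fits A and B with a model-trust (Levenberg-Marquardt style) optimizer and slightly regularized targets; plain gradient descent on the negative log-likelihood is enough to show the idea, and the function names here are my own:

```python
import math

def fit_platt(scores, labels, lr=0.01, iters=5000):
    """Fit A, B in P(y=1|f) = 1 / (1 + exp(A*f + B)) to SVM decision
    scores f by gradient descent on the negative log-likelihood."""
    targets = [(y + 1) / 2 for y in labels]  # map labels {-1,+1} -> {0,1}
    A, B = 0.0, 0.0
    for _ in range(iters):
        grad_A = grad_B = 0.0
        for f, t in zip(scores, targets):
            p = 1.0 / (1.0 + math.exp(A * f + B))
            # gradients of the negative log-likelihood w.r.t. A and B
            grad_A += (p - t) * (-f)
            grad_B += (p - t) * (-1.0)
        A -= lr * grad_A
        B -= lr * grad_B
    return A, B

def platt_probability(A, B, f):
    """Map an SVM score f to P(y=1|f) with the fitted sigmoid."""
    return 1.0 / (1.0 + math.exp(A * f + B))

# Toy usage: positive scores belong to class +1, so A comes out negative
A, B = fit_platt([-2.0, -1.0, 1.0, 2.0], [-1, -1, 1, 1])
print(platt_probability(A, B, 2.0))   # close to 1
print(platt_probability(A, B, -2.0))  # close to 0
```

Unlike binning, this gives a smooth, monotone mapping from score to probability, which is why it is the variant implemented in libsvm's -b 1 mode.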

Using libsvm, the classifier performance is bad: what to do?

public/svm.txt · Last modified: 2012/06/27 19:35 (external edit)