Anomaly detection with one-class SVC

The one-class SVC is an extension of the binary SVC. The main difference is that a single class contains most of the baseline (or normal) observations, and the other class is replaced by a reference point known as the SVC origin. The outlier (or abnormal) observations lie beyond (or outside) the support vectors of the single class:

Illustration of the one-class SVC
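
The way a trained one-class SVC separates baseline observations from outliers can be sketched in a few lines. The following is a minimal illustration, not the library's implementation: it assumes an RBF kernel and the usual decision rule, in which an observation is labeled +1 when the weighted sum of kernel values against the support vectors exceeds the offset rho, and -1 otherwise. The names OneClassDecision, rbf, and decide are hypothetical and chosen for this example only.

```scala
object OneClassDecision {
  // RBF kernel: K(x, y) = exp(-gamma * ||x - y||^2)
  def rbf(gamma: Double)(x: Array[Double], y: Array[Double]): Double = {
    val sqDist = x.zip(y).map { case (a, b) => (a - b) * (a - b) }.sum
    math.exp(-gamma * sqDist)
  }

  // Returns +1 for a baseline observation and -1 for an outlier.
  // svs, alphas, and rho stand for the support vectors, their weights,
  // and the offset produced by training (hypothetical inputs here).
  def decide(
      svs: Seq[Array[Double]],
      alphas: Seq[Double],
      rho: Double,
      gamma: Double)(x: Array[Double]): Int = {
    val score = svs.zip(alphas).map { case (sv, a) => a * rbf(gamma)(sv, x) }.sum - rho
    if (score > 0.0) 1 else -1
  }
}
```

With a single support vector at the origin, a weight of 1.0, and rho set to 0.5, a point at the origin is classified as baseline, while a distant point such as (3.0, 3.0) falls outside the boundary and is classified as an outlier.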

The outlier observations are labeled -1, while the remaining training observations are labeled +1. To create a relevant test, we add five more companies that have drastically cut their dividends (ticker symbols WLT, RGS, MDC, NOK, and GM). The dataset includes the stock prices and financial metrics recorded prior to the cut in dividends.

The implementation of this test case is very similar to the binary SVC driver code, except for the following:

  • The classifier uses the Nu-SVM formulation, OneSVCFormulation
  • The labeled data is generated by assigning -1 to companies that have eliminated their dividend and +1 for all other companies
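
The labeling rule in the second bullet can be sketched as follows. This is an illustration under stated assumptions, not the book's code: the test case derives labels from the last column of the input file through a filter function that is not shown in this section, whereas the sketch below labels companies by ticker symbol. The object DividendLabels and its members are hypothetical names; the five tickers are the ones listed in the text.

```scala
object DividendLabels {
  val Outlier = -1.0
  val Baseline = 1.0

  // Tickers named in the text as the companies added as outliers
  val outlierTickers: Set[String] = Set("WLT", "RGS", "MDC", "NOK", "GM")

  // -1 for a company in the outlier set, +1 for any other company
  def label(ticker: String): Double =
    if (outlierTickers.contains(ticker)) Outlier else Baseline
}
```

For example, DividendLabels.label("NOK") returns -1.0, while any ticker outside the set (say, a placeholder such as "XYZ") returns +1.0.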

The test is executed against the dataset resources/data/chap8/dividends2.csv. First, we need to define the formulation for the one-class SVM:

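// One-class formulation: nu is the only hyperparameter it sets
class OneSVCFormulation(nu: Double) extends SVMFormulation {
  // Configures the LIBSVM parameters for one-class classification
  override def update(param: svm_parameter): Unit = {
    param.svm_type = svm_parameter.ONE_CLASS
    param.nu = nu
  }
}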
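
The nu parameter deserves a short note: in the one-class formulation it acts as an upper bound on the fraction of training observations treated as outliers and a lower bound on the fraction of support vectors, and LIBSVM rejects values outside the interval (0, 1]. The sketch below shows one way the formulation could validate nu at construction. It is self-contained for illustration, so it uses a stand-in Param class instead of LIBSVM's svm_parameter; Param, CheckedOneSVC, and NuFormulationSketch are hypothetical names.

```scala
object NuFormulationSketch {
  // Stand-in for LIBSVM's svm_parameter, reduced to the two fields used here
  final class Param {
    var svmType: Int = 0
    var nu: Double = 0.5
  }

  // LIBSVM defines svm_parameter.ONE_CLASS as 2
  val OneClass = 2

  // Variant of the one-class formulation that fails fast on an invalid nu
  final class CheckedOneSVC(nu: Double) {
    require(nu > 0.0 && nu <= 1.0, s"nu must be in (0, 1], found $nu")

    def update(param: Param): Unit = {
      param.svmType = OneClass
      param.nu = nu
    }
  }
}
```

Failing at construction surfaces a configuration error before any training starts, which is easier to diagnose than a failure reported later by the underlying library.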

The test code is similar to the execution code for the binary SVC. The only difference is the definition of the output labels: -1 for companies eliminating dividends and +1 for all other companies:

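val NU = 0.2; val GAMMA = 0.5; val NFOLDS = 2
val path = "resources/data/chap8/dividends2.csv"

// Load the dataset; the last extracted column holds the dividend outcome
val xs = DataSource(path, true, false, 1) |> extractor
// EPS is not defined in this snippet; it is assumed to be in scope,
// as in the binary SVC test
val config = SVMConfig(new OneSVCFormulation(NU),
                       RbfKernel(GAMMA),
                       SVMExecution(EPS, NFOLDS))
// All columns but the last are features; the last is mapped to -1/+1 labels
val features = XTSeries.transpose(xs.dropRight(1))
val svc = SVM[Double](config, features, xs.last.map(filter(_)))
svc.accuracy match {
  // The s interpolator is required for $acc to be substituted
  case Some(acc) => Display.show(s"Accuracy: $acc", logger)
  case None => { … }
}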

The test is executed with the following features: relPriceChange, debtToEquity, dividendCoverage, cashPerShareToPrice, and epsTrend.

The model is generated with an accuracy of 0.821. This level of accuracy should not be a surprise; the outliers (companies that completely eliminated their dividends) are added to the original dividend .csv file. These outliers differ significantly from the baseline observations (companies that have reduced, maintained, or increased their dividend) in the original input file.
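
To make the accuracy figure concrete, the following sketch computes accuracy as the fraction of predicted labels that match the expected -1/+1 labels. This is an assumption about what the metric means here; the library computes its value through cross-validation, and its exact procedure is not shown in this section. AccuracySketch and its accuracy method are hypothetical names; the Option return type mirrors the Some/None match used in the test code.

```scala
object AccuracySketch {
  // Fraction of predictions equal to the expected label.
  // Returns None when there is nothing to compare or the sizes differ.
  def accuracy(predicted: Seq[Double], expected: Seq[Double]): Option[Double] =
    if (predicted.isEmpty || predicted.size != expected.size) None
    else {
      val hits = predicted.zip(expected).count { case (p, e) => p == e }
      Some(hits.toDouble / predicted.size)
    }
}
```

For instance, four matching labels out of five observations give an accuracy of 0.8, and an empty input gives None.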

When labeled observations are available, the one-class support vector machine is an excellent alternative to clustering techniques.

Note

Definition of anomaly

The results generated by a one-class support vector classifier depend heavily on the subjective definition of an outlier. The test case assumes that the companies that eliminate their dividends have unique characteristics that set them apart, even from companies that have cut, maintained, or increased their dividend. There is no guarantee that this assumption is always valid.