# What is a support vector machine (SVM)?

Last updated on: a month ago

SVM comes up often in machine learning, but I still sometimes get confused about how it relates to the rest of ML.

# Definition

In machine learning, support vector machines (SVMs) are **supervised** **learning models** with associated learning algorithms that analyse data for **classification and regression analysis**.

A neural network is another learning model in machine learning. Different learning models have different cost functions, characteristics, and applications.

A support vector machine is a large-margin classifier.

A Support Vector Machine (SVM) performs classification by finding the hyperplane that maximizes the margin between the two classes. The vectors (cases) that define the hyperplane are the support vectors.
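To make "margin" and "support vectors" concrete, here is a minimal sketch that measures them for a hyperplane that is already given. It does not search for the maximum-margin hyperplane; the function names and the toy points are my own illustrative assumptions.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_w

def margin_and_support_vectors(w, b, points, tol=1e-9):
    """Return the smallest absolute distance and the points that attain it."""
    distances = [abs(signed_distance(w, b, p)) for p in points]
    margin = min(distances)
    support = [p for p, d in zip(points, distances) if d - margin <= tol]
    return margin, support

# Hypothetical toy data: two classes on either side of the line x1 + x2 = 0.
points = [(1.0, 1.0), (3.0, 2.0), (-1.0, -1.0), (-2.0, -4.0)]
margin, support = margin_and_support_vectors((1.0, 1.0), 0.0, points)
print(margin, support)
```

The points closest to the hyperplane are the ones that pin it down; moving any of the farther points slightly would not change the margin.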

# SVM hypothesis

$$\min_\theta C \sum_{i=1}^{m} \left[ y^{(i)} \text{cost}_1 (\theta^T x^{(i)}) + (1 - y^{(i)}) \text{cost}_0( \theta^T x^{(i)}) \right] + \frac{1}{2} \sum_{j=1}^{n} \theta_j^2$$
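The objective above can be evaluated directly. This is a sketch under assumptions: the post does not define $cost_1$ and $cost_0$, so I use the standard hinge forms $\max(0, 1 - z)$ and $\max(0, 1 + z)$, and I store the intercept $\theta_0$ as `theta[0]` so that it is left out of the regularization sum, which starts at $j = 1$.

```python
def cost_1(z):
    # Assumed hinge form for y = 1: no penalty once z >= 1.
    return max(0.0, 1.0 - z)

def cost_0(z):
    # Assumed hinge form for y = 0: no penalty once z <= -1.
    return max(0.0, 1.0 + z)

def svm_objective(theta, X, y, C):
    """theta[0] is the intercept; rows of X do not include the constant feature."""
    data_term = 0.0
    for x_i, y_i in zip(X, y):
        z = theta[0] + sum(t * x for t, x in zip(theta[1:], x_i))
        data_term += y_i * cost_1(z) + (1 - y_i) * cost_0(z)
    reg_term = 0.5 * sum(t * t for t in theta[1:])  # intercept is not penalized
    return C * data_term + reg_term

# Hypothetical one-feature data set with one example per class.
X = [[2.0], [-2.0]]
y = [1, 0]
print(svm_objective([0.0, 1.0], X, y, C=1.0))   # both examples outside the margin
print(svm_objective([0.0, 0.25], X, y, C=1.0))  # both examples inside the margin
```

With `theta = [0.0, 1.0]` only the regularization term contributes; shrinking the weight to 0.25 lowers that term but makes both hinge costs non-zero.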

## Need to specify

- Choice of parameter C
- Choice of kernel (similarity function)

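For C, remember that it plays the role of $1/\lambda$: a larger C means weaker regularization, so the parameters $\theta$ (also written $\omega$) can grow larger and the model is more likely to overfit. A smaller C means stronger regularization and a higher risk of underfitting.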
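The link between C and the size of $\theta$ can be checked on a toy problem. The sketch below is my own illustration, not the course's method: it minimizes the objective with plain batch subgradient descent, assumes hinge-shaped costs, and uses made-up data, learning rate, and epoch count. It only shows that the weights grow with C; it does not measure overfitting itself.

```python
import math

def train_linear_svm(X, y, C, lr=0.01, epochs=2000):
    """Batch subgradient descent on C * sum(hinge costs) + 0.5 * sum(theta_j^2)."""
    n = len(X[0])
    theta = [0.0] * (n + 1)  # theta[0] is the unregularized intercept
    for _ in range(epochs):
        grad = [0.0] + theta[1:]  # gradient of the regularization term
        for x_i, y_i in zip(X, y):
            z = theta[0] + sum(t * x for t, x in zip(theta[1:], x_i))
            if y_i == 1 and z < 1:      # cost_1 = max(0, 1 - z) is active
                sign = -1.0
            elif y_i == 0 and z > -1:   # cost_0 = max(0, 1 + z) is active
                sign = 1.0
            else:
                continue                # example contributes no gradient
            grad[0] += sign * C
            for j, x in enumerate(x_i, start=1):
                grad[j] += sign * C * x
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

def weight_norm(theta):
    """Euclidean norm of the weights, excluding the intercept."""
    return math.sqrt(sum(t * t for t in theta[1:]))

# Hypothetical, linearly separable toy data.
X_toy = [[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [-1.0, -2.0], [-2.0, -1.0], [-3.0, -3.0]]
y_toy = [1, 1, 1, 0, 0, 0]

norm_small_c = weight_norm(train_linear_svm(X_toy, y_toy, C=0.01))
norm_large_c = weight_norm(train_linear_svm(X_toy, y_toy, C=10.0))
print(norm_small_c, norm_large_c)
```

With the small C the regularization term dominates and the weights stay small; with the large C the weights grow until the examples clear the margin.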

# Kernels

Kernels adapt the SVM so that it can learn complex non-linear classifiers.

$$f_i = \text{similarity}(x, l^{(i)}) = \exp \left( - \frac{\| x - l^{(i)} \|^2}{2 \sigma^2} \right)$$

Here the superscript $(i)$ indexes the landmark $l^{(i)}$; it is not the layer index used in neural networks.
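The similarity formula above translates almost line for line into code. This is a small sketch; the function names and the sample landmarks are illustrative assumptions.

```python
import math

def gaussian_kernel(x, landmark, sigma):
    """Similarity between x and one landmark: 1 when equal, near 0 when far apart."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_features(x, landmarks, sigma):
    """Map x to the feature vector [f_1, ..., f_k], one entry per landmark."""
    return [gaussian_kernel(x, l, sigma) for l in landmarks]

# Hypothetical landmarks: one at x itself, one nearby, one far away.
landmarks = [[1.0, 2.0], [2.0, 3.0], [10.0, 10.0]]
print(kernel_features([1.0, 2.0], landmarks, sigma=1.0))
```

A larger `sigma` makes the similarity fall off more slowly with distance, so each landmark influences a wider region.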

## Kernel types

**Linear kernel**

A linear kernel means using the SVM without a kernel: predict $y = 1$ when

$$\theta_0 + \theta_1 x_1 + \dots + \theta_n x_n \ge 0$$

**Polynomial kernel**

$$k(x,l) = (x^T l + \text{constant})^{\text{degree}}, \quad \text{e.g. } (x^T l)^2,\ (x^T l)^3,\ (x^T l+1)^2$$
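As a quick sketch of the general form, the polynomial kernel has two parameters, the constant and the degree. The function name and default values below are my own choices for illustration.

```python
def polynomial_kernel(x, l, constant=0.0, degree=2):
    """Compute (x^T l + constant) ** degree for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, l))
    return (dot + constant) ** degree

x = [1.0, 2.0]
l = [3.0, 4.0]
print(polynomial_kernel(x, l))                            # (x^T l)^2
print(polynomial_kernel(x, l, degree=3))                  # (x^T l)^3
print(polynomial_kernel(x, l, constant=1.0, degree=2))    # (x^T l + 1)^2
```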

**More esoteric**

String kernel, chi-square kernel, histogram intersection kernel.

# Logistic regression vs SVM

Here n is the number of features and m is the number of training examples.

- If n is large relative to m: use logistic regression or SVM without a kernel. With so many features, a linear model is usually flexible enough, and there are too few examples to fit a more complex non-linear function reliably.
- If n is small and m is intermediate: use SVM with a Gaussian kernel.
- If n is small and m is large: create/add more features, then use logistic regression or SVM without a kernel.
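The rule of thumb above can be written as a small helper. This is only a sketch: the post gives no numbers for "small", "intermediate", or "large", so the `intermediate_m` cutoff is a hypothetical default of mine, and the function name is made up for illustration.

```python
def suggest_model(n_features, m_examples, intermediate_m=10_000):
    """Return the suggestion from the rule of thumb for given n and m.

    intermediate_m is an assumed cutoff between "intermediate" and "large" m.
    """
    if n_features >= m_examples:
        return "logistic regression or SVM without a kernel"
    if m_examples <= intermediate_m:
        return "SVM with Gaussian kernel"
    return "add features, then logistic regression or SVM without a kernel"

print(suggest_model(n_features=10_000, m_examples=1_000))
print(suggest_model(n_features=100, m_examples=5_000))
print(suggest_model(n_features=100, m_examples=500_000))
```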

A neural network is likely to work well in most of these settings, but may be slower to train.

# Reference

[1] Andrew Ng, Machine Learning

Unless otherwise stated, all articles on this blog are licensed under CC BY-SA 4.0. Please credit the source when reposting!