Implementation of SVM in Python

Introduction

In this post, we walk through an implementation of SVM in Python. SVM, or Support Vector Machine, is a powerful machine learning algorithm that can be used for both classification and regression tasks. The key idea behind SVM is to find the hyperplane that separates the data points of one class from those of the other class with the largest possible margin. With two features this hyperplane is a line, with three features it is a plane, and in general it has one dimension less than the feature space. SVM can also be used for multi-class classification by combining several binary classifiers.

What is SVM?

SVM works by finding the hyperplane that best separates the data points into classes. This hyperplane is also known as the decision boundary. To find it, SVM minimizes a cost function built on the hinge loss, which penalizes points that are misclassified or that fall inside the margin; a regularization parameter (C in scikit-learn) controls how heavily those violations are penalized. Many datasets are not linearly separable in their original feature space. By using the kernel trick, SVM implicitly maps the data into a higher-dimensional space where a linear boundary can exist, which corresponds to a non-linear decision boundary in the original space.
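
To make the hinge loss concrete, here is a minimal NumPy sketch. It assumes a linear decision function f(x) = w·x + b and labels encoded as -1 and +1. The weights, bias and sample points are made up for illustration, and the function name hinge_loss is hypothetical, not a scikit-learn API. The full soft-margin objective also adds a regularization term on w, which is left out here.

```python
import numpy as np

def hinge_loss(w, b, X, y):
    """Mean hinge loss max(0, 1 - y * f(x)) for a linear model f(x) = w.x + b.

    Labels in y are assumed to be encoded as -1 or +1.
    """
    scores = X @ w + b       # signed score for every sample
    margins = y * scores     # >= 1 means correct and outside the margin
    return np.maximum(0.0, 1.0 - margins).mean()

# Illustrative values only (not taken from the dataset used in this post).
w = np.array([1.0, -1.0])
b = 0.0
X = np.array([[2.0, 0.0],    # score  2.0, label +1 -> loss 0.0
              [0.5, 0.0],    # score  0.5, label +1 -> loss 0.5 (inside the margin)
              [0.0, 1.0]])   # score -1.0, label +1 -> loss 2.0 (misclassified)
y = np.array([1, 1, 1])

loss = hinge_loss(w, b, X, y)
print(loss)
```

Points that are correctly classified and lie outside the margin contribute nothing to the loss, which is why only the points near the boundary end up shaping it.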

 

There are a few things to keep in mind when using SVM for classification. First, SVM is sensitive to the scale of the data, so it is important to scale your features before training. Second, SVM can require a lot of memory and be slow to train on large datasets. Finally, SVM is inherently a binary classifier; multi-class problems are handled by combining several binary classifiers, for example with a one-vs-one or one-vs-rest scheme. Despite these limitations, SVM can achieve high accuracy on many classification tasks.
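
The scaling and multi-class points can be sketched together. Although a single SVM separates two classes, scikit-learn's SVC also accepts more than two labels by combining binary classifiers internally (one-vs-one). The example below uses the three-class iris data bundled with scikit-learn, which is a stand-in chosen for illustration and not the dataset used later in this post.

```python
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)   # 150 samples, 3 classes

# Chaining the scaler and the classifier means the scaler is always
# fitted on the same data the SVM is trained on.
model = make_pipeline(StandardScaler(), SVC(kernel='linear'))
model.fit(X, y)

predicted_classes = sorted(set(int(c) for c in model.predict(X)))
print(predicted_classes)
```

Wrapping the scaler in a pipeline is one design choice among several; the step-by-step version used in the rest of this post does the same thing by hand.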

 

 

 

Implementation of SVM 

First, we import the required libraries.

 

# Importing the libraries

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

 

Then we load the dataset and split it into the feature matrix X (columns 2 and 3, plotted later as Age and Estimated Salary) and the label vector Y (column 4). You can download the dataset at Dataset.

 

# Importing the datasets and preprocessing

datasets = pd.read_csv('Social_Network_Ads.csv')
X = datasets.iloc[:, [2,3]].values
Y = datasets.iloc[:, 4].values
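
Because iloc selects columns by position, it is worth checking that positions 2, 3 and 4 hold what you expect. The sketch below uses a tiny in-memory table with made-up rows. The column names are assumptions about the CSV layout and are not confirmed by this post.

```python
import pandas as pd

# Made-up rows; the column names are assumed for illustration only.
demo = pd.DataFrame({
    'User ID': [1, 2, 3],
    'Gender': ['Male', 'Female', 'Female'],
    'Age': [19, 35, 26],
    'EstimatedSalary': [19000, 20000, 43000],
    'Purchased': [0, 0, 1],
})

X_demo = demo.iloc[:, [2, 3]].values   # positions 2 and 3 -> two feature columns
Y_demo = demo.iloc[:, 4].values        # position 4 -> label column

print(list(demo.columns[[2, 3]]), demo.columns[4])
print(X_demo.shape, Y_demo.shape)
```

On the real file, printing datasets.columns and datasets.head() gives the same kind of check.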

 

We then split the dataset into training and testing sets.

 

# Splitting the dataset into the Training set and Test set

from sklearn.model_selection import train_test_split
X_Train, X_Test, Y_Train, Y_Test = train_test_split(X, Y, test_size = 0.25, random_state = 0)
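
To see what the split produces, here is a small sketch on synthetic data. The sample count of 400 is arbitrary and the variable names are hypothetical. With test_size = 0.25, a quarter of the rows go to the test set. The second call shows the optional stratify argument, which keeps the class proportions similar in both parts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 400 samples, 2 features, binary labels.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(400, 2))
Y_demo = rng.integers(0, 2, size=400)

X_tr, X_te, Y_tr, Y_te = train_test_split(
    X_demo, Y_demo, test_size=0.25, random_state=0)
print(X_tr.shape, X_te.shape)

# Optional: stratify on the labels to preserve the class balance.
X_tr_s, X_te_s, Y_tr_s, Y_te_s = train_test_split(
    X_demo, Y_demo, test_size=0.25, random_state=0, stratify=Y_demo)
print(Y_tr_s.mean(), Y_te_s.mean())
```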

 

# Feature Scaling

from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
X_Train = sc_X.fit_transform(X_Train)
X_Test = sc_X.transform(X_Test)
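
Note that the scaler is fitted on the training set only and then reused on the test set, so no information from the test data leaks into training. A minimal sketch with made-up numbers on very different scales shows the effect:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up values on very different scales (for example years vs. currency units).
train = np.array([[20.0, 20000.0], [30.0, 50000.0], [40.0, 80000.0]])
test = np.array([[30.0, 50000.0], [50.0, 110000.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)   # learns mean and std from train only
test_scaled = scaler.transform(test)         # reuses the training mean and std

print(train_scaled.mean(axis=0))  # approximately 0 per column
print(train_scaled.std(axis=0))   # approximately 1 per column
print(test_scaled)                # test rows expressed in training-set units
```

After scaling, both columns contribute on a comparable scale, so the larger raw numbers no longer dominate the distance computations inside the SVM.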

 

We import scikit-learn's built-in SVM classifier (SVC) and fit the model to the training set.

 

# Fitting the classifier to the Training set

from sklearn.svm import SVC
classifier = SVC(kernel = 'linear', random_state = 0)
classifier.fit(X_Train, Y_Train)
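
Once a linear SVC is fitted, you can inspect what it learned. The sketch below uses a small synthetic, linearly separable dataset as a stand-in (not the post's data) and reads the support vectors and the hyperplane parameters from the fitted model. The coef_ attribute is only available with a linear kernel.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic, linearly separable stand-in data (not the post's dataset).
X_demo = np.array([[-2.0, -1.0], [-1.0, -2.0], [-1.5, -1.5],
                   [1.0, 2.0], [2.0, 1.0], [1.5, 1.5]])
Y_demo = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel='linear')
clf.fit(X_demo, Y_demo)

print(clf.support_vectors_)        # training points that define the margin
print(clf.n_support_)              # number of support vectors per class
print(clf.coef_, clf.intercept_)   # w and b of the hyperplane (linear kernel only)
```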

 

# Predicting the test set results

Y_Pred = classifier.predict(X_Test)
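
For a binary SVC, predict is based on the sign of decision_function: a positive score maps to the second entry of classes_, a negative one to the first. The following sketch illustrates this on synthetic stand-in data; the points in X_new are made up.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic, linearly separable stand-in data (not the post's dataset).
X_demo = np.array([[-2.0, -1.0], [-1.0, -2.0], [-1.5, -1.5],
                   [1.0, 2.0], [2.0, 1.0], [1.5, 1.5]])
Y_demo = np.array([0, 0, 0, 1, 1, 1])
clf = SVC(kernel='linear').fit(X_demo, Y_demo)

X_new = np.array([[-1.0, -1.0], [0.5, 0.2], [2.0, 2.0]])
scores = clf.decision_function(X_new)   # signed score, proportional to distance from the hyperplane
labels = clf.predict(X_new)

print(scores)
print(labels)
```

The magnitude of the score gives a rough sense of how far a point lies from the decision boundary, which predict alone does not show.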

 

We use the confusion matrix as the evaluation metric.

 

# Making the Confusion Matrix 

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(Y_Test, Y_Pred)
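
The confusion matrix on its own is just a 2x2 array, so it helps to see how to read it. In scikit-learn, rows are the true classes and columns are the predicted classes. The labels below are made up for illustration; accuracy is derived from the matrix entries.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration only.
y_true = np.array([0, 0, 0, 1, 1, 1])
y_pred = np.array([0, 0, 1, 1, 1, 0])

cm_demo = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm_demo.ravel()       # order for binary labels 0 and 1

accuracy = (tn + tp) / cm_demo.sum()   # correct predictions / all predictions
print(cm_demo)
print(accuracy)
```

Applied to the post's variables, the same unpacking works on cm, and sklearn.metrics.accuracy_score(Y_Test, Y_Pred) returns the accuracy directly.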

 

 

# Visualising the Training set results

from matplotlib.colors import ListedColormap
X_Set, Y_Set = X_Train, Y_Train
X1, X2 = np.meshgrid(np.arange(start = X_Set[:, 0].min() - 1, stop = X_Set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_Set[:, 1].min() - 1, stop = X_Set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(Y_Set)):
    plt.scatter(X_Set[Y_Set == j, 0], X_Set[Y_Set == j, 1],
                color = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Support Vector Machine (Training set)')
plt.xlabel('Age (scaled)')
plt.ylabel('Estimated Salary (scaled)')
plt.legend()
plt.show()

 

 

# Visualising the Test set results

from matplotlib.colors import ListedColormap
X_Set, Y_Set = X_Test, Y_Test
X1, X2 = np.meshgrid(np.arange(start = X_Set[:, 0].min() - 1, stop = X_Set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_Set[:, 1].min() - 1, stop = X_Set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(Y_Set)):
    plt.scatter(X_Set[Y_Set == j, 0], X_Set[Y_Set == j, 1],
                color = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Support Vector Machine (Test set)')
plt.xlabel('Age (scaled)')
plt.ylabel('Estimated Salary (scaled)')
plt.legend()
plt.show()
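
The two plotting snippets above differ only in the data and the title, so they could be folded into one helper. The sketch below is one possible refactoring, not the original author's code. The function name plot_decision_regions and the coarser default step are assumptions, the helper assumes exactly two classes, and the demo data is synthetic.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs anywhere; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import ListedColormap
from sklearn.svm import SVC

def plot_decision_regions(model, X, y, title, step=0.05):
    """Draw the predicted regions of a fitted two-feature, two-class classifier."""
    colors = ('red', 'green')
    x1, x2 = np.meshgrid(
        np.arange(X[:, 0].min() - 1, X[:, 0].max() + 1, step),
        np.arange(X[:, 1].min() - 1, X[:, 1].max() + 1, step))
    grid = np.column_stack([x1.ravel(), x2.ravel()])
    regions = model.predict(grid).reshape(x1.shape)   # one prediction per grid point

    fig, ax = plt.subplots()
    ax.contourf(x1, x2, regions, alpha=0.75, cmap=ListedColormap(colors))
    for i, label in enumerate(np.unique(y)):
        ax.scatter(X[y == label, 0], X[y == label, 1], color=colors[i], label=label)
    ax.set_title(title)
    ax.legend()
    return ax

# Synthetic stand-in data (not the post's dataset).
X_demo = np.array([[-2.0, -1.0], [-1.0, -2.0], [-1.5, -1.5],
                   [1.0, 2.0], [2.0, 1.0], [1.5, 1.5]])
Y_demo = np.array([0, 0, 0, 1, 1, 1])
clf = SVC(kernel='linear').fit(X_demo, Y_demo)

ax = plot_decision_regions(clf, X_demo, Y_demo, 'Demo')
```

With the post's variables, this would be called once with (classifier, X_Train, Y_Train, ...) and once with (classifier, X_Test, Y_Test, ...), followed by plt.show().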

 

 

 

 

Also read: Implementation of K Nearest Neighbors in Python.

 
