Creating the model file using scikit-learn

In your project folder, create a Python file with the following code, which builds the model:

# Import the required packages
import numpy as np
import pandas as pd

# Read and parse the data: each line holds a label and a message separated by a tab
sms_data = []
with open('SMSSpamCollection.txt', 'r') as raw_data:
    for line in raw_data:
        split_line = line.rstrip("\n").split("\t")
        sms_data.append(split_line)

# Split the data into labels and messages: y holds the labels, X holds the message text

sms_data = np.array(sms_data)
X = sms_data[:, 1]
y = sms_data[:, 0]

# Build a LinearSVC model
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Build the tf-idf vector representation of the data
vectorizer = TfidfVectorizer()

# Convert the message text to tf-idf vectors
vectorized_text = vectorizer.fit_transform(X)

# Fit the model
text_clf = LinearSVC()
text_clf = text_clf.fit(vectorized_text, y)
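
The listing above fits the classifier on every row, so it gives no estimate of how well the model handles messages it has not seen. The following is a minimal sketch of holding out a test split, assuming scikit-learn's train_test_split; the handful of messages is made up for illustration and stands in for the X and y read from the file:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Made-up stand-in data; in the real program X and y come from sms_data
X = [
    "Win a free prize now, click the link",
    "Free entry to win cash, text WIN now",
    "Claim your free mobile credit today",
    "Urgent: you have won a free holiday",
    "Are we still meeting for lunch today?",
    "I'll call you when I get home",
    "Can you pick up some milk on the way?",
    "See you at the office tomorrow morning",
]
y = ["spam"] * 4 + ["ham"] * 4

# Hold out 25% of the rows, keeping the spam/ham ratio the same in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit the vectorizer on the training messages only, then train the classifier
vectorizer = TfidfVectorizer()
text_clf = LinearSVC()
text_clf.fit(vectorizer.fit_transform(X_train), y_train)

# Score on the held-out messages, reusing the fitted vocabulary
accuracy = text_clf.score(vectorizer.transform(X_test), y_test)
print("Held-out accuracy:", accuracy)
```

The vectorizer is fitted on the training messages only, so the held-out messages do not influence the vocabulary or the idf weights.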

To test the fitted model, append the following code:

print(text_clf.predict(vectorizer.transform(["""XXXMobileMovieClub: To use your credit, click the WAP link in the next txt message or click here>> http://wap. xxxmobilemovieclub.com?n=QJKGIGHJJGCBL"""])))

Running the program prints the predicted label, which tells you whether the given message is spam or non-spam.
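
The program fits the model in memory but does not write it to disk. One possible way to produce an actual model file is sketched below, assuming Python's standard pickle module; the file name sms_spam_model.pkl, the temporary-directory location, and the sample messages are hypothetical and not taken from the original program:

```python
import os
import pickle
import tempfile

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Made-up stand-in data; in the real program X and y come from sms_data
X = [
    "Win a free prize now, click the link",
    "Free entry to win cash, text WIN now",
    "Are we still meeting for lunch today?",
    "I'll call you when I get home",
]
y = ["spam", "spam", "ham", "ham"]

vectorizer = TfidfVectorizer()
text_clf = LinearSVC().fit(vectorizer.fit_transform(X), y)

# Hypothetical file name, written to the temporary directory for this sketch
model_path = os.path.join(tempfile.gettempdir(), "sms_spam_model.pkl")

# Save the vectorizer together with the classifier: both are needed to predict
with open(model_path, "wb") as model_file:
    pickle.dump({"vectorizer": vectorizer, "classifier": text_clf}, model_file)

# Load the model file back and classify a new message
with open(model_path, "rb") as model_file:
    saved = pickle.load(model_file)

message = "Free prize, click the link to claim now"
restored_prediction = saved["classifier"].predict(
    saved["vectorizer"].transform([message])
)
original_prediction = text_clf.predict(vectorizer.transform([message]))
print(restored_prediction[0])
```

The fitted vectorizer is stored alongside the classifier because new messages must be transformed with the same vocabulary the model was trained on. Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.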