Predicting using a dataset

Without further ado, let's take a look at the following code:

import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier(n_neighbors=5)
data = pd.read_csv('dataset.csv')

x = np.array(data[['Time', 'Temp']])
y = np.array(data[['State']]).ravel()

knn.fit(x,y)

time = raw_input("Enter time")
temp = raw_input("Enter temp")

data = []

data.append(float(time))
data.append(float(temp))

a = knn.predict([data])

print(a[0])

So, let's see what we are doing here:

import numpy as np

We are importing numpy into our program; it helps us work with arrays and matrices.

import pandas as pd

Here, we are importing a library named pandas; it helps us read comma-separated values (CSV) files. We will use a CSV file to store our data and load it for the learning process.

from sklearn.neighbors import KNeighborsClassifier

Here, we are importing KNeighborsClassifier from the sklearn library. sklearn is a huge library, so we import only the part that this program uses.

knn = KNeighborsClassifier(n_neighbors=5)

Here, we create a KNeighborsClassifier object with the argument n_neighbors=5 and assign it to the variable knn. The argument tells the classifier to consult the five nearest neighbors of a sample when deciding its class. From here on, the classifier and its methods are accessed through knn.
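To make the role of n_neighbors concrete, here is a minimal pure-Python sketch of the nearest-neighbor vote. It is not how sklearn implements the classifier; the function name knn_predict and the sample readings are invented for illustration.

```python
from collections import Counter
import math

def knn_predict(samples, labels, query, k=5):
    """Return the majority label among the k samples closest to query."""
    # Euclidean distance from the query to every stored sample.
    distances = [
        (math.sqrt(sum((a - b) ** 2 for a, b in zip(sample, query))), label)
        for sample, label in zip(samples, labels)
    ]
    # Keep the k closest samples and let them vote.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical (Time, Temp) readings and fan states, invented for illustration.
samples = [(9, 22), (10, 24), (11, 26), (13, 31), (14, 33), (15, 34), (16, 32)]
labels = ["off", "off", "off", "on", "on", "on", "on"]

print(knn_predict(samples, labels, (14.5, 33), k=5))
```

With k=5, four of the five closest readings to (14.5, 33) are labeled on, so the vote returns on. A smaller k reacts more strongly to individual readings, while a larger k smooths them out.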

data = pd.read_csv('dataset.csv')

Here, we call the pd.read_csv() function from the pandas library and store its result in a variable called data. The function reads a CSV file; the argument tells it which file to read, in our case a file named dataset.csv. The function runs once, and data then holds the contents of the file as a table.
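The contents of dataset.csv are not shown here, so the following sketch assumes a file with a header row naming the Time, Temp, and State columns that the program uses later. The three rows are invented, and io.StringIO stands in for the file on disk.

```python
import io
import pandas as pd

# Hypothetical file contents; the real dataset.csv is not shown in the text.
csv_text = u"""Time,Temp,State
9,22,off
13,31,on
15,34,on
"""

# read_csv accepts a path or a file-like object; StringIO stands in for the file.
frame = pd.read_csv(io.StringIO(csv_text))

print(list(frame.columns))   # column names taken from the header row
print(frame.shape)           # (number of rows, number of columns)
```

The column names in the header must match the names used when selecting columns, or the selection will fail with a KeyError.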

x = np.array(data[['Time', 'Temp']])

In the preceding line, we assign a value to the variable x, and that value is np.array(data[['Time', 'Temp']]). The np.array function creates an array through the numpy library. This array stores the data from the columns named Time and Temp.

y = np.array(data[['State']]).ravel()

Just as before, we store the State column in an array made through the numpy library. The .ravel() function at the end flattens the array from a single column into one dimension. This is done because fit() expects the labels in y as a one-dimensional array, with one entry per row of x.
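The effect of .ravel() is easiest to see on the shape of the array. The sketch below uses a small hand-written column in place of data[['State']]; the values are invented for illustration.

```python
import numpy as np

# Hypothetical State column, shaped like the result of data[['State']]:
# three rows, one column.
column = np.array([["off"], ["on"], ["on"]])
print(column.shape)

# ravel() flattens it into a one-dimensional array of three labels.
flat = column.ravel()
print(flat.shape)
print(flat.tolist())
```

The values and their order are unchanged; only the shape goes from three rows of one column to a flat sequence of three labels.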

knn.fit(x,y)

In this short line, we call the fit() method of the knn object. It fits the model using x as the input data and y as the expected output for each row.
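As a small sketch of what fitting leaves behind, the following trains the classifier on a handful of invented readings (assumed values, not taken from dataset.csv) and inspects the result.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training data: (Time, Temp) rows and the matching fan State.
x = np.array([[9, 22], [10, 24], [11, 26], [13, 31], [14, 33], [15, 34]])
y = np.array(["off", "off", "off", "on", "on", "on"])

knn = KNeighborsClassifier(n_neighbors=5)
fitted = knn.fit(x, y)       # fit() stores the samples and returns the estimator

print(fitted is knn)         # the same object comes back
print(list(knn.classes_))    # distinct labels seen during training
```

Because fit() returns the estimator itself, creating and fitting can also be chained in one expression, as in KNeighborsClassifier(n_neighbors=5).fit(x, y).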

time = raw_input("Enter time")
temp = raw_input("Enter temp")

In these two lines, we request data from the user. The first line prints Enter time and then waits for the user to enter the time; the entered value is stored in the variable named time. The program then moves on to the next line, prints Enter temp, and waits again. Once the user has entered that value, it is stored in the variable called temp. Both values are stored as text. Note that raw_input() is the Python 2 function; in Python 3, it is named input().

data = []

Here, we are making an empty list by the name of data; it will hold the values for which we want a prediction. This reuses the name data, so the table read from the CSV file is no longer reachable through it, which is fine because x and y already hold what the model needs. The prediction function expects its input as a list or array, so we collect the user's values in a list.

data.append(float(time))
data.append(float(temp))

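Here, we are adding data to the list that we just created with the name data, converting each value from text to a number with float(). First, time is added, followed by temp, which matches the order of the Time and Temp columns used for training.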
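The conversion with float() fails if the user types something that is not a number. One possible way to handle that, sketched here with a hypothetical helper named build_sample, is to do both conversions in one place and report a clearer error.

```python
def build_sample(time_text, temp_text):
    """Convert two text inputs into the [time, temp] list used for prediction."""
    try:
        return [float(time_text), float(temp_text)]
    except ValueError:
        # float() raises ValueError for text such as "noon".
        raise ValueError("time and temp must both be numbers")

print(build_sample("14.5", "33"))
```

The returned list has the same layout as the data list built with append(), so it could be passed to the prediction step in the same way.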

a = knn.predict([data])

Once that is done, the predict method of the knn object is used to predict the output for the list named data. The list is wrapped in another pair of brackets because predict expects a list of samples, and here we have just one. The result of the prediction is stored in a variable named a.

print(a[0])

Finally, once the prediction is done, we read the value of a. Remember that predict takes a list of samples as input; it likewise returns its output as an array, with one prediction per sample. We passed a single sample, so we print the first element of that array.
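The following sketch shows the shape of what predict returns, first for one sample and then for two at once. The training readings are invented for illustration and are not taken from dataset.csv.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training data: (Time, Temp) rows and the matching fan State.
x = np.array([[9, 22], [10, 24], [11, 26], [13, 31], [14, 33], [15, 34]])
y = np.array(["off", "off", "off", "on", "on", "on"])

knn = KNeighborsClassifier(n_neighbors=5).fit(x, y)

# One sample in, an array with one prediction out.
single = knn.predict([[14.5, 33.0]])
print(len(single), single[0])

# Two samples in, two predictions out, in the same order.
batch = knn.predict([[14.5, 33.0], [9.5, 23.0]])
print(list(batch))
```

Passing several samples in one call is a convenient way to check a few time and temperature combinations at once.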

This output is the predicted state of the fan for the time and temperature given by the user, based on the dataset. So, go ahead and give a temperature and a time and let the system predict the outcome for you. See whether it works well. If it doesn't, then try adding more rows of data to the CSV file, or check whether the values in the dataset actually make sense. I am sure that you will end up with a wonderful predictive system.
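One way to check whether the system works, beyond trying values by hand, is to hold back part of the data and measure how often the predictions match it. The sketch below assumes synthetic readings generated with a made-up rule (the fan is on whenever Temp is 28 or higher); with a real dataset, x and y would come from the CSV file instead.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical readings generated from an invented rule, for illustration only.
rng = np.random.RandomState(0)
time = rng.uniform(0, 24, size=200)
temp = rng.uniform(15, 40, size=200)
x = np.column_stack([time, temp])
y = np.where(temp >= 28, "on", "off")

# Keep a quarter of the rows aside; the model never sees them during fitting.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0
)

knn = KNeighborsClassifier(n_neighbors=5).fit(x_train, y_train)

# score() returns the fraction of held-back rows that were predicted correctly.
accuracy = knn.score(x_test, y_test)
print(accuracy)
```

A low score on the held-back rows suggests that the dataset is too small or that its values are inconsistent, which are the two things worth checking first.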
