Indexing for visualization

Once the data has been prepared, the data scientist uses the pandas set_index method to make the Order Date column the index of the rows (the sales transactions), since this is the field the initial analysis will be based on. In other words, we ultimately want to be able to predict furniture sales per month. The following screenshot shows the Python statements executed in our Watson project to set the index and print the results of the command:
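As a minimal sketch of this indexing step, the statements might look like the following. The sample rows are made up purely for illustration, and the assumption is that the prepared DataFrame is named furniture and holds an Order Date column of datetime values alongside a Sales column:

```python
import pandas as pd

# Hypothetical stand-in for the prepared furniture sales data;
# the dates and amounts below are illustrative only.
furniture = pd.DataFrame({
    "Order Date": pd.to_datetime(["2017-01-03", "2017-01-20", "2017-02-11"]),
    "Sales": [120.0, 80.0, 200.0],
})

# Make Order Date the row index so later time-based operations can use it.
furniture = furniture.set_index("Order Date")

# Print the index to confirm that it is now a DatetimeIndex.
print(furniture.index)
```

With a datetime index in place, pandas can slice and resample the rows by date, which is what the monthly analysis relies on.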

The data scientist then points out that, given the data, it is more reasonable to look at average daily sales for each month. The following commands resample the sales data, using the start of each month (MS) as the timestamp, and then, as a sanity check, display some of the data (the order month followed by the computed average sales):

y = furniture['Sales'].resample('MS').mean()
y["2017":]

The following screenshot shows the preceding commands executed in our Watson project:
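To make the effect of the resampling concrete, here is a small self-contained sketch. The daily figures are invented for illustration, and it assumes furniture is already indexed by Order Date as described earlier:

```python
import pandas as pd

# Hypothetical daily sales totals, indexed by order date (illustrative values).
dates = pd.to_datetime(
    ["2016-12-15", "2017-01-03", "2017-01-20", "2017-02-11", "2017-02-25"]
)
furniture = pd.DataFrame({"Sales": [300.0, 100.0, 200.0, 50.0, 150.0]}, index=dates)
furniture.index.name = "Order Date"

# Group the rows by calendar month, label each group with the month start (MS),
# and take the mean of the Sales values in each group.
y = furniture["Sales"].resample("MS").mean()

# Partial-string slicing keeps only the months from 2017 onward.
print(y["2017":])
```

In this sketch, the two January values average to 150.0 and the two February values to 100.0, each labeled with the first day of its month, while the December 2016 row is dropped by the slice.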

 
