Cleaning everything up

Now that we have everything we want, it's time for the final cleanup. Remember that we still have the 'cmp_name' and 'user' columns. Those are useless now, so they have to go. Also, I want to reorder the columns in the DataFrame so that their order better reflects the data it now contains. To do both at once, we just need to index df with the list of columns we want, in the order we want them. We get back a brand new DataFrame that we can reassign to the name df:

#20
final_columns = [
    'Type', 'Start', 'End', 'Duration', 'Day of Week', 'Budget',
    'Currency', 'Clicks', 'Impressions', 'Spent', 'CTR', 'CPC',
    'CPI', 'Target Age', 'Target Gender', 'Username', 'Email',
    'Name', 'Gender', 'Age'
]
df = df[final_columns]

I have grouped the campaign information at the beginning, then the measurements, and finally the user data at the end. Now our DataFrame is clean and ready for us to inspect.
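To see the selection mechanics in isolation, here is a minimal sketch on a toy DataFrame. The data values and the names toy, wanted, and trimmed are made up for illustration; only the column labels echo the ones used above.

```python
import pandas as pd

# Toy stand-in for the real data: two columns we want to keep and two
# leftover columns ('cmp_name' and 'user') that we want to drop.
# All values are placeholders, not real campaign data.
toy = pd.DataFrame({
    'user': ['u1', 'u2'],
    'Clicks': [10, 20],
    'cmp_name': ['c1', 'c2'],
    'Type': ['A', 'B'],
})

# Indexing with a list of labels returns a new DataFrame that holds only
# those columns, in the order given by the list.
wanted = ['Type', 'Clicks']
trimmed = toy[wanted]

print(list(trimmed.columns))  # only the selected columns, in list order
print(list(toy.columns))      # the original object is left untouched
```

Because the selection returns a new object, the leftover columns only disappear from df once the result is reassigned to it, which is what the cell above does.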

Before we start going crazy with graphs, what about taking a snapshot of the DataFrame, so that we can easily reconstruct it from a file rather than having to redo all the steps we took to get here? Some analysts may want to have it in spreadsheet form, to do a different kind of analysis than the one we want to do, so let's see how to save the DataFrame to a file. It's easier done than said.
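As a rough sketch of what such a snapshot could look like, the following writes a small DataFrame to a CSV file and reads it back. This is one possible approach, assuming CSV is an acceptable format. The file name snapshot.csv, the temporary directory, and the toy data are all hypothetical.

```python
import os
import tempfile

import pandas as pd

# Placeholder data standing in for the cleaned DataFrame.
toy = pd.DataFrame({
    'Type': ['A', 'B'],
    'Clicks': [10, 20],
    'CTR': [0.5, 0.25],
})

# Write to a throwaway directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'snapshot.csv')  # hypothetical file name

    # index=False keeps the row index out of the file, so reading it back
    # does not produce an extra unnamed column.
    toy.to_csv(path, index=False)

    # Rebuild a DataFrame from the saved file.
    restored = pd.read_csv(path)

print(list(restored.columns))
print(len(restored))
```

CSV stores plain text only, so columns such as dates come back as strings unless they are parsed again on load. For a real spreadsheet file, pandas also offers DataFrame.to_excel, which relies on an additional engine package such as openpyxl being installed.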
