Examining the model performance

We'll start by making predictions on our test data, and then examine whether those predictions were correct:

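# Predict on the held-out test set
y_hat = clf.predict(X_test)
y_true = y_test

# Line up actual and predicted labels side by side
pdf = pd.DataFrame({'y_true': y_true, 'y_hat': y_hat})

# Flag each row: 1 if the prediction matched the actual label, else 0
pdf['correct'] = (pdf['y_true'] == pdf['y_hat']).astype(int)

pdf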

The preceding code generates the following output:
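
Because `correct` is just an element-wise comparison of `y_true` and `y_hat`, a frame like this can also be cross-tabulated to see which kinds of calls the model gets right and wrong. The following is a minimal sketch on made-up labels (the `toy` frame is hypothetical and is not the chapter's test set); it assumes the same 1 = buy, 0 = skip encoding used above:

```python
import pandas as pd

# Made-up labels for illustration only; not the chapter's test set.
toy = pd.DataFrame({
    'y_true': [1, 0, 1, 1, 0, 0, 1, 0],
    'y_hat':  [1, 0, 0, 1, 1, 0, 0, 0],
})

# Rows are the actual outcomes, columns are the model's calls.
confusion = pd.crosstab(toy['y_true'], toy['y_hat'])

# Share of all calls that matched the actual label.
accuracy = (toy['y_true'] == toy['y_hat']).mean()

# Of the IPOs the model said to buy, the share that really were buys.
buy_calls = toy[toy['y_hat'] == 1]
precision = (buy_calls['y_true'] == 1).mean()

print(confusion)
print(accuracy, precision)
```

For a strategy that only acts on buy calls, the second figure (how often a buy call was right) is often more informative than overall accuracy, since skipped IPOs cost nothing.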

Let's now look at what percentage of the 200 IPOs in our test dataset we should have invested in. Remember, that means they rose more than 2.5% from the open to the close:

pdf['y_true'].value_counts(normalize=True) 

The preceding code generates the following output:

So, just north of half the IPOs rose over 2.5% from their opening tick to the closing tick. Let's see how accurate our model's calls were:

pdf['correct'].value_counts(normalize=True) 

The preceding code generates the following output:

Well, it looks like our model was about as accurate as a coin flip. That doesn't seem too promising. But with investing, what matters is not accuracy but expectancy: the average amount gained or lost per trade. If we had a number of small losses but a couple of huge wins, the model could still be very profitable overall. Let's examine whether that's the case here. We'll join our results data with the first-day change data to explore this:

results = pd.merge(df[['1st Day Open to Close $ Chg']], pdf, left_index=True, right_index=True) 
 
results 

The preceding code generates the following output:
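
To make the expectancy idea concrete, here is a small worked sketch. The dollar figures are hypothetical and chosen only to show how a low hit rate can still pay off; expectancy is computed as win rate times average win plus loss rate times average loss, which works out to the same number as the mean result per trade:

```python
import pandas as pd

# Hypothetical per-share results in dollars; not the chapter's data.
trades = pd.Series([-0.50, -0.40, -0.60, -0.50, 6.00])

win_rate = (trades > 0).mean()         # fraction of trades that made money
avg_win = trades[trades > 0].mean()    # average gain on winning trades
avg_loss = trades[trades <= 0].mean()  # average (negative) result on the rest

# Expected dollar result per trade.
expectancy = win_rate * avg_win + (1 - win_rate) * avg_loss

print(win_rate, expectancy, trades.mean())
```

Only one trade in five is a winner here, yet the expected result per trade is positive because the single win outweighs the four small losses.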

First, let's see what our results would have looked like if we had bought one share of each of the 200 IPOs in our test data:

results['1st Day Open to Close $ Chg'].sum() 

The preceding code generates the following output:

From this, we see that we would have gained over $215 in an ideal cost-free scenario. Now, let's examine some of the other statistics concerning these IPOs:

results['1st Day Open to Close $ Chg'].describe() 

The preceding code generates the following output:

Based on the preceding output, we see that the average gain was just over $1, and the largest loss was 15 times that. How does our model fare against those numbers? First, we look at the trades our model said we should take and the resulting gains:

# IPOs the model told us to buy: total first-day dollar change
results.loc[results['y_hat'] == 1, '1st Day Open to Close $ Chg'].sum()

The preceding code generates the following output:

Let's look at the other statistics as well:

# IPOs the model told us to buy: summary statistics
results.loc[results['y_hat'] == 1, '1st Day Open to Close $ Chg'].describe()

The preceding code generates the following output:

Here, we see that our model suggested investing in only 34 IPOs. The mean gain rose to $1.50, the largest loss was reduced to under $10, and we were still able to capture the best-performing IPO. It's not stellar, but we may be on to something. We'd need to explore more to know whether we really have something worth expanding on.
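
The two views (buying every IPO versus buying only the model's picks) can also be placed side by side in one table. The sketch below uses a made-up `toy_results` frame with the same column names as `results`; the `summarize` helper is a hypothetical name introduced here for illustration, not something defined earlier:

```python
import pandas as pd

col = '1st Day Open to Close $ Chg'

# Made-up rows for illustration only; not the chapter's results.
toy_results = pd.DataFrame({
    col:     [2.0, -1.0, 5.0, -3.0, 0.5, -0.5],
    'y_hat': [1, 0, 1, 0, 0, 1],
})

def summarize(changes):
    """Return trade count, total, mean, and worst single change."""
    return pd.Series({
        'trades': len(changes),
        'total': changes.sum(),
        'mean': changes.mean(),
        'worst': changes.min(),
    })

# One column per strategy, one row per statistic.
model_buys = toy_results.loc[toy_results['y_hat'] == 1, col]
comparison = pd.DataFrame({
    'buy everything': summarize(toy_results[col]),
    'model buys only': summarize(model_buys),
})

print(comparison)
```

Swapping `toy_results` for the real `results` frame would give the same comparison on the test data, still under the cost-free assumption used above.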

Now, let's move on to examine the factors that seem to influence our model's performance.
