Histogram

A histogram is used to graphically summarize and display the distribution of data points within a dataset.
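To make the idea concrete, the following minimal sketch shows the counting that sits behind a histogram: the value range is split into equal-width bins, and the number of points falling into each bin is tallied. The sample values are made up for illustration and are not taken from the lithofacies data.

```python
import numpy as np

# Made-up sample values (illustrative only, not the book's dataset)
values = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5])

# Split the range 1..5 into 4 equal-width bins and count the points in each.
# Every bin includes its left edge; only the last bin also includes its right edge.
counts, edges = np.histogram(values, bins=4, range=(1, 5))

print(edges)   # bin boundaries
print(counts)  # number of values per bin
```

The bar heights in a plotted histogram correspond to these per-bin counts.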

Using the following Python code, we can now plot a histogram for each numeric input variable within our data (GCR, ILD, ILM, NPHI, PE, and PEF):

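import matplotlib.pyplot as plt

# Drop the lito_ID column, then plot a 30-bin histogram for each remaining numeric column
df_data_1.drop('lito_ID', axis=1).hist(bins=30, figsize=(10, 10))
plt.suptitle("Histogram for each numeric input variable")
plt.savefig('lithofacies_hist')  # no extension given, so the default format (PNG) is used
plt.show()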

This gives you the following output:

Once a histogram has been generated from the data, the first question that is usually asked is whether its shape is normal. A defining characteristic of a normal distribution is that it is symmetrical. This means that if the distribution is cut in half at its center, each side will be the mirror image of the other, forming a bell-shaped curve, as shown in the following screenshot:

Of our generated histograms, the one for NPHI perhaps comes the closest to showing a normal distribution.
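A visual impression of symmetry can be backed up with a number. The sketch below is one possible approach, not a method from the text: it ranks numeric columns by the absolute value of their sample skewness, which is close to zero for symmetrical data. The helper name rank_by_symmetry and the synthetic columns are assumptions made for illustration. Low skewness is only a rough indicator, since a distribution can be symmetrical without being normal.

```python
import numpy as np
import pandas as pd


def rank_by_symmetry(df: pd.DataFrame) -> pd.Series:
    """Return the absolute sample skewness of each numeric column,
    sorted so that the most symmetrical column comes first.
    (Hypothetical helper, named here for illustration only.)"""
    numeric = df.select_dtypes(include="number")
    return numeric.skew().abs().sort_values()


# Synthetic stand-in data; these are NOT the lithofacies measurements
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "bell_shaped": rng.normal(loc=0.0, scale=1.0, size=5000),
    "right_skewed": rng.exponential(scale=1.0, size=5000),
})

ranking = rank_by_symmetry(demo)
print(ranking)  # the bell-shaped column is listed first
```

With the data used in this section, the same helper could be called as rank_by_symmetry(df_data_1.drop('lito_ID', axis=1)) to see how each variable compares with the visual reading of the histograms.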
