Making violin plots show different distributions

What if we want to show the full shape of a distribution rather than just its interquartile range? A similar kind of plot, called a violin plot, does this. We will also add a few extra distributions to it that are not Gaussian.

We will include a lognormal distribution, which is a Gaussian in logarithmic space. It is often used in astrophysics to describe a wide variety of physical phenomena. We will also include a Pareto distribution, which is used frequently in economics because it describes many economic systems.
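To make "Gaussian in logarithmic space" concrete, here is a minimal sketch that draws lognormal samples and checks that their logarithm behaves like a Gaussian. The seeded generator, the variable names, and the larger sample size are assumptions made for this illustration only; they are not part of the plotting code in this section.

```python
import numpy as np

# Assumption: a seeded Generator is used only so the numbers are reproducible.
rng = np.random.default_rng(0)

# Lognormal samples with the same sigma as in this section (mean of the log is 0).
samples = rng.lognormal(mean=0.0, sigma=0.5, size=100_000)

# Taking the natural log should give values with mean close to 0 and std close to 0.5.
log_samples = np.log(samples)
log_mean = log_samples.mean()
log_std = log_samples.std()

print(f"mean of log(samples): {log_mean:.3f}")
print(f"std of log(samples):  {log_std:.3f}")
print(f"smallest sample:      {samples.min():.3f}")  # lognormal values are always positive
```

The sample size here is much larger than the 500 points used for the plots, so that the estimates settle close to the parameters passed in.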

First, we will create a basic violin plot:

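# Some other kinds of distributions
# (rands1 is the Gaussian sample defined earlier)
rands4 = np.random.lognormal(size=500, sigma=0.5)  # mean defaults to 0
rands5 = np.random.pareto(size=500, a=5)  # NumPy draws from the Pareto II (Lomax) form
dists = (rands1, rands4, rands5)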

# Basic Violin plot
plt.violinplot(dists);

We will get the following output:

In the preceding output, a lot of information is conveyed.

The different distributions are as follows:

  • In the first distribution, the Gaussian shows a few small bumps because we are sampling it imperfectly. With only 500 points, there is some sampling noise.
  • The lognormal (the second distribution) is strictly positive, is concentrated around 1 (its median when the mean parameter is left at 0), and has a long upper tail.
  • The Pareto distribution is highly skewed: most of the samples sit close to 0, while a few extend far out into the tail.

Hence, this is a great way of conveying the differences in distributions that we see within multiple datasets.
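The shapes described above can also be checked with a few summary numbers. The following is a rough sketch, assuming that rands1 is a standard Gaussian sample of 500 points (its definition is not shown in this section); the seed and the dictionary names are hypothetical choices for this example.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, only for reproducibility

# Assumption: the first sample stands in for rands1 as a standard Gaussian.
samples = {
    "gaussian": rng.normal(size=500),
    "lognormal": rng.lognormal(sigma=0.5, size=500),
    "pareto": rng.pareto(a=5, size=500),
}

# Collect the same quantities a violin plot lets us read off by eye.
summary = {}
for name, values in samples.items():
    summary[name] = {
        "min": values.min(),
        "median": np.median(values),
        "mean": values.mean(),
        "max": values.max(),
    }
    stats = summary[name]
    print(
        f"{name:>9}: min={stats['min']:.2f} median={stats['median']:.2f} "
        f"mean={stats['mean']:.2f} max={stats['max']:.2f}"
    )
```

For the two skewed samples, the maximum lies much further from the median than the minimum does, which is what the long upper tails of the violins show.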

By adding some extra annotations and setting showmeans=True, we get an extra line, which marks the mean or average value.

We can also show the median by setting showmedians=True. The mean is the value that we get by adding all of the values together and then dividing by the number of values, whereas the median is the true middle: half of the data points lie on each side of it. The code is as follows:

# Show Median & Means
plt.violinplot(dists, showmeans=True, showmedians=True);

Following is the output of the preceding code:
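As a side note on the two marker lines in that plot, a tiny hand-made sample shows why the mean and the median can sit far apart when the data has a long tail. The values below are made up purely for illustration.

```python
import numpy as np

# Hypothetical data: four small values and one large outlier.
values = np.array([1, 2, 3, 4, 100])

mean_value = np.mean(values)      # (1 + 2 + 3 + 4 + 100) / 5 = 22.0
median_value = np.median(values)  # middle of the sorted values = 3.0

print(f"mean:   {mean_value}")
print(f"median: {median_value}")
```

The single large value pulls the mean upward, while the median stays with the bulk of the data. The same effect separates the mean and median lines in the lognormal and Pareto violins.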
