Visualization plots with Anaconda

From acquiring, manipulating, and processing data to visualizing and communicating research results, Python and Anaconda support many stages of the scientific data workflow. Python is used in a wide variety of applications (even beyond scientific computing), so users who adopt it do not need to learn new software or a new programming language for each task. Python's open source availability makes research results easier to share and enables users to connect with a large community of scientists and engineers around the world.

The following are some of the common plotting libraries that you can use with Anaconda:

  • matplotlib: This is one of the most popular plotting libraries for Python. Coupled with NumPy and SciPy, this is one of the major driving forces in the scientific Python community. IPython has a pylab mode, which was specifically designed to perform interactive plotting using matplotlib.
  • Plotly: This is a collaborative plotting and analytics platform that runs in a browser. It supports interactive graphs in IPython notebooks, where graphs can be restyled by modifying the code and viewing the results immediately. Plotting code written for matplotlib can be easily exported to a Plotly version.
  • Veusz: This is a GPL-licensed scientific plotting package written in Python and PyQt. Veusz can also be embedded in other Python programs.
  • Mayavi: This is a three-dimensional plotting package that is fully scriptable from Python and offers a simple pylab- and MATLAB-like interface for plotting arrays.
  • NetworkX: This is a Python language software package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.
  • pygooglechart: This is a Python wrapper for the Google Chart API that lets you create charts from Python code.

The surface-3D plot

Three-dimensional surface plots are generated from data in which Z is defined as a function of (X, Y). This is mathematically denoted as Z = f(X, Y). In our example here, we will plot Z = sin(sqrt(X**2 + Y**2)), a radially symmetric surface whose height depends only on the distance from the origin, so it looks like ripples spreading out from the center. The following steps need to be followed for our plot:

  1. First, generate the X and Y grid with the following code:
    import numpy as np
    
    X = np.arange(-4, 4, 0.25) 
    Y = np.arange(-4, 4, 0.25) 
    X, Y = np.meshgrid(X, Y)
    # Generate the Z data:
    R = np.sqrt(X**2 + Y**2)
    Z = np.sin(R)

    Plotting a simple three-dimensional surface sin(sqrt(X**2+Y**2)) using the mpl_toolkits package produces the plot shown here; a color bar relates the surface colors to the Z values:

    [Figure: The surface-3D plot]
  2. Then, plot the surface, as shown in the following code:
    from mpl_toolkits.mplot3d import Axes3D  # the class name ends with a capital D
    from matplotlib import cm
    from matplotlib.ticker import LinearLocator, FormatStrFormatter
    import matplotlib.pyplot as plt
    import numpy as np
    
    fig = plt.figure(figsize=(12,9))
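    # add_subplot is used here because current matplotlib releases no longer accept projection in fig.gca()
    ax = fig.add_subplot(111, projection='3d')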
    X = np.arange(-4, 4, 0.25)
    Y = np.arange(-4, 4, 0.25)
    X, Y = np.meshgrid(X, Y)
    R = np.sqrt(X**2 + Y**2)
    Z = np.sin(R)
    surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.coolwarm, linewidth=0, antialiased=False)
    
    ax.set_zlim(-1.01, 1.01)
    ax.zaxis.set_major_locator(LinearLocator(10))
    ax.zaxis.set_major_formatter(FormatStrFormatter('%.02f'))
    
    fig.colorbar(surf, shrink=0.6, aspect=6)
    
    plt.show()
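
To make the grid step concrete, the following is a minimal pure-Python sketch of what np.meshgrid and the elementwise formula compute, on a deliberately tiny grid. The helper names meshgrid_2d and surface_heights are hypothetical and exist only for this illustration; NumPy performs the same work on arrays and is much faster.

```python
import math

def meshgrid_2d(xs, ys):
    """Mimic np.meshgrid(xs, ys) for 1-D inputs using nested lists.

    Returns (X, Y), each with len(ys) rows and len(xs) columns:
    every row of X is a copy of xs, and every column of Y is a copy of ys.
    """
    X = [list(xs) for _ in ys]
    Y = [[y] * len(xs) for y in ys]
    return X, Y

def surface_heights(X, Y):
    """Evaluate Z = sin(sqrt(X**2 + Y**2)) element by element."""
    return [[math.sin(math.sqrt(x * x + y * y)) for x, y in zip(row_x, row_y)]
            for row_x, row_y in zip(X, Y)]

xs = [-1.0, 0.0, 1.0]
ys = [-2.0, 2.0]
X, Y = meshgrid_2d(xs, ys)
Z = surface_heights(X, Y)

print(X)  # [[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]
print(Y)  # [[-2.0, -2.0, -2.0], [2.0, 2.0, 2.0]]
```

Each (X[i][j], Y[i][j]) pair is one grid point, which is why the formula for Z can be applied position by position. Points at the same distance from the origin, such as (-1, -2) and (1, 2), receive the same height.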

In order to make this three-dimensional plot work, you have to make sure that matplotlib and NumPy are installed. The default Anaconda distribution comes with both of them installed.
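
If you are unsure whether both packages are present in the active environment, a quick check along the following lines may help. This is only a sketch: missing_packages is a hypothetical helper name, and it relies solely on the standard library's importlib.util.find_spec, which looks a package up without importing it.

```python
import importlib.util

def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

required = ["numpy", "matplotlib"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Any package reported as missing can then be installed with conda, for example with conda install numpy matplotlib.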

The square map plot

Using the comparison and ranking example from the previous chapter, which displays the top 12 countries in Africa by GDP, you can apply the squarify algorithm (with matplotlib) to obtain a plot that looks similar to a tree map. The algorithm is implemented in the following code:

# Squarified Treemap Layout : source file (squarify.py)
# Implements algorithm from Bruls, Huizing, van Wijk, "Squarified Treemaps"
# squarify was created by Uri Laserson 
# primarily intended to support d3.js 

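def normalize_sizes(sizes, dx, dy):
  # Scale the sizes so that they sum to the total area dx * dy.
  # A list is returned (not a lazy map object) so that the result
  # supports len() and slicing under Python 3.
  sizes = [float(size) for size in sizes]
  total_size = sum(sizes)
  total_area = dx * dy
  return [size * total_area / total_size for size in sizes]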

def pad_rectangle(rect):
  if rect['dx'] > 2:
    rect['x'] += 1
    rect['dx'] -= 2
  if rect['dy'] > 2:
    rect['y'] += 1
    rect['dy'] -= 2

def layoutrow(sizes, x, y, dx, dy):
  covered_area = sum(sizes)
  width = covered_area / dy
  rects = []
  for size in sizes:  
    rects.append({'x': x, 'y': y, 'dx': width, 'dy': size / width})
    y += size / width
  return rects


def layoutcol(sizes, x, y, dx, dy):
  covered_area = sum(sizes)
  height = covered_area / dx
  rects = []
  for size in sizes:
    rects.append({'x': x, 'y': y, 'dx': size / height, 'dy': height})
    x += size / height
  return rects

def layout(sizes, x, y, dx, dy):
  return layoutrow(sizes, x, y, dx, dy) if dx >= dy else layoutcol(sizes, x, y, dx, dy)

def leftoverrow(sizes, x, y, dx, dy):
  covered_area = sum(sizes)
  width = covered_area / dy
  leftover_x = x + width
  leftover_y = y
  leftover_dx = dx - width
  leftover_dy = dy
  return (leftover_x, leftover_y, leftover_dx, leftover_dy)

def leftovercol(sizes, x, y, dx, dy):
  covered_area = sum(sizes)
  height = covered_area / dx
  leftover_x = x
  leftover_y = y + height
  leftover_dx = dx
  leftover_dy = dy - height
  return (leftover_x, leftover_y, leftover_dx, leftover_dy)

def leftover(sizes, x, y, dx, dy):
  return leftoverrow(sizes, x, y, dx, dy) if dx >= dy else leftovercol(sizes, x, y, dx, dy)

def worst_ratio(sizes, x, y, dx, dy):
  return max([max(rect['dx'] / rect['dy'], rect['dy'] / rect['dx']) for rect in layout(sizes, x, y, dx, dy)])

def squarify(sizes, x, y, dx, dy):
  # build a list: in Python 3, map() returns an iterator without len() or slicing
  sizes = list(map(float, sizes))
  if len(sizes) == 0:
    return []
  if len(sizes) == 1:
    return layout(sizes, x, y, dx, dy)
  # figure out where 'split' should be
  i = 1
  while i < len(sizes) and worst_ratio(sizes[:i], x, y, dx, dy) >= worst_ratio(sizes[:(i+1)], x, y, dx, dy):
    i += 1
  current = sizes[:i]
  remaining = sizes[i:]
  (leftover_x, leftover_y, leftover_dx, leftover_dy) = leftover(current, x, y, dx, dy)
  return (layout(current, x, y, dx, dy) +
          squarify(remaining, leftover_x, leftover_y, leftover_dx, leftover_dy))

def padded_squarify(sizes, x, y, dx, dy):
  rects = squarify(sizes, x, y, dx, dy)
  for rect in rects:
    pad_rectangle(rect)
  return rects
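
The split decision inside squarify is easier to follow with numbers. The following standalone sketch recomputes the worst aspect ratio for a growing strip of rectangles and shows where the loop stops adding items. The names worst_ratio_in_strip and split_index, as well as the sizes and the 6 by 4 box, are illustrative assumptions rather than part of squarify.py.

```python
def worst_ratio_in_strip(sizes, dx, dy):
    """Worst aspect ratio (>= 1) when `sizes` fill one strip of a dx-by-dy box.

    The strip is laid along the shorter side of the box, which matches
    the choice between layoutrow and layoutcol in the listing above.
    """
    short_side = min(dx, dy)
    thickness = sum(sizes) / short_side      # how deep the strip cuts into the box
    ratios = []
    for size in sizes:
        length = size / thickness            # extent of this rectangle along the short side
        ratios.append(max(thickness / length, length / thickness))
    return max(ratios)

def split_index(sizes, dx, dy):
    """Number of leading items kept in the first strip (same test as in squarify)."""
    i = 1
    while (i < len(sizes) and
           worst_ratio_in_strip(sizes[:i], dx, dy) >=
           worst_ratio_in_strip(sizes[:i + 1], dx, dy)):
        i += 1
    return i

sizes = [6.0, 6.0, 4.0, 3.0, 2.0, 2.0, 1.0]   # areas sum to 24 = 6 * 4
for count in (1, 2, 3):
    print(count, round(worst_ratio_in_strip(sizes[:count], 6.0, 4.0), 3))

print(split_index(sizes, 6.0, 4.0))
```

With these values, one rectangle in the strip gives a worst ratio of about 2.667, two give 1.5, and three give 4.0. Adding the third item makes the rectangles less square, so the first strip holds two items and the remaining sizes are laid out recursively in the leftover area.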

The functions in squarify.py can be used to display the top 12 countries by GDP in Africa, as shown in the following code:

import matplotlib.pyplot as plt
import matplotlib.cm
import random
import squarify

x = 0.
y = 0.
width = 950.
height = 733.
norm_x=1000
norm_y=1000

fig = plt.figure(figsize=(15,13))
# facecolor replaces the axisbg argument, which newer matplotlib releases no longer accept
ax = fig.add_subplot(111, facecolor='white')

initvalues = [285.4,188.4,173,140.6,91.4,75.5,62.3,39.6,29.4,28.5, 26.2, 22.2]
values = initvalues
labels = ["South Africa", "Egypt", "Nigeria", "Algeria", "Morocco",
"Angola", "Libya", "Tunisia", "Kenya", "Ethiopia", "Ghana", "Cameroon"]

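# Only 11 colors are listed for 12 rectangles; matplotlib cycles the list, so the last rectangle reuses the first color.
colors = [(214,27,31),(229,109,0),(109,178,2),(50,155,18),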
(41,127,214),(27,70,163),(72,17,121),(209,0,89), 
(148,0,26),(223,44,13), (195,215,0)] 
# Scale the RGB values to the [0, 1] range, which is the format matplotlib accepts. 
for i in range(len(colors)): 
  r, g, b = colors[i] 
  colors[i] = (r / 255., g / 255., b / 255.) 

# values must be sorted descending (and positive, obviously)
values.sort(reverse=True)

# the sum of the values must equal the total area to be laid out
# i.e., sum(values) == width * height
values = squarify.normalize_sizes(values, width, height)

# padded rectangles will probably visualize better for certain cases
rects = squarify.padded_squarify(values, x, y, width, height)

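# Alternative coloring (not used below): random colors drawn from the default colormap.
# plt.get_cmap() is used because matplotlib.cm.get_cmap() is absent from recent matplotlib releases.
cmap = plt.get_cmap()
color = [cmap(random.random()) for i in range(len(values))]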
x = [rect['x'] for rect in rects]
y = [rect['y'] for rect in rects]
dx = [rect['dx'] for rect in rects]
dy = [rect['dy'] for rect in rects]

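# align='edge' makes each x value the left edge of its rectangle, which is what the layout assumes
# (matplotlib 2.0 and later center bars on x by default)
ax.bar(x, dy, width=dx, bottom=y, color=colors, label=labels, align='edge')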

va = 'center'
idx=1

for l, r, v in zip(labels, rects, initvalues):
  x, y, dx, dy = r['x'], r['y'], r['dx'], r['dy']
  ax.text(x + dx / 2, y + dy / 2+10, str(idx)+"--> "+l, va=va,
     ha='center', color='white', fontsize=14)
  ax.text(x + dx / 2, y + dy / 2-12, "($"+str(v)+"b)", va=va,
     ha='center', color='white', fontsize=12)
  idx = idx+1
ax.set_xlim(0, norm_x)
ax.set_ylim(0, norm_y)
plt.show()

[Figure: The square map plot]
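
The normalization step in the script can be checked with a little arithmetic: after scaling, the rectangle areas should add up to width * height, and each country's share of the canvas should equal its share of the total GDP. The following sketch repeats that calculation on its own. It uses a list-returning variant of normalize_sizes, which is an assumption made so that the snippet runs standalone on Python 3.

```python
def normalize_sizes(sizes, dx, dy):
    """Rescale sizes so that they sum to the area dx * dy (list-returning variant)."""
    sizes = [float(s) for s in sizes]
    total_size = sum(sizes)
    total_area = dx * dy
    return [s * total_area / total_size for s in sizes]

# GDP values (in billions of dollars) from the example above
gdp = [285.4, 188.4, 173, 140.6, 91.4, 75.5, 62.3, 39.6, 29.4, 28.5, 26.2, 22.2]
width, height = 950.0, 733.0

areas = normalize_sizes(gdp, width, height)

# The normalized areas fill the whole canvas (up to floating-point rounding).
print(abs(sum(areas) - width * height) < 1e-6)

# The largest rectangle's share of the canvas equals its share of total GDP.
share = areas[0] / (width * height)
print("South Africa occupies {:.1%} of the canvas".format(share))
```

Because only the relative sizes matter, the same values can be laid out on any canvas size by changing width and height.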