Other visualization tools

The matplotlib module was originally designed for use on workstations and desktops, not servers. Its design did not arise from use cases for high-volume or large datasets. However, as you saw in this chapter, by using the right tools and taking the appropriate measures, matplotlib can perform admirably with hundreds of millions of data points.

Should you ever hit insurmountable barriers with matplotlib (such as real-time visualization of, and user interaction with, billions of data points), you can turn to the following open source projects, which were designed with large datasets in mind:

  • ParaView (http://www.paraview.org/): This is an open source, multiplatform data analysis and visualization application. ParaView was developed to analyze extremely large datasets by using distributed memory computing resources. It can be run on supercomputers to analyze datasets of petascale size as well as on laptops for smaller data. ParaView also offers a Python scripting interface.
  • VisIt (https://wci.llnl.gov/simulation/computer-codes/visit): This is an open source, interactive, scalable tool for visualization, animation, and analysis. VisIt has a parallel and distributed architecture that allows users to interactively visualize and analyze data ranging in scale from small (fewer than 10² cores) desktop-sized projects to large (more than 10⁵ cores) computing facility simulation campaigns. VisIt is capable of visualizing data from over 120 different scientific data formats. It offers a Python interface.
  • Bokeh (http://bokeh.pydata.org/en/latest/): As mentioned previously, Bokeh is a Python interactive visualization library that targets modern web browsers for presentation. Its goal is to not only provide elegant, concise construction of novel graphics in the style of D3.js, but also deliver this capability with high-performance interactivity over very large or streaming datasets.
  • VisPy (http://vispy.org/): This is a new 2D and 3D high-performance visualization library that can handle very large datasets. VisPy uses the OpenGL library and GPUs for increased performance. With VisPy, users can interactively explore plots that have hundreds of millions of points. A basic knowledge of OpenGL is very helpful when using VisPy.

That being said, matplotlib is a powerful, well-known tool in the scientific computing community. Organizations and teams have countless years of cumulative experience building, installing, augmenting, and using matplotlib and the libraries of related projects, such as NumPy and SciPy. If there is a new way to put old tools to use without suffering the lost productivity and infrastructure re-engineering that come with a platform change, it is often in everyone's best interest to take it.
