Data in images: what tools for data visualization?

Producing aesthetic or interactive graphs

Don’t we say that “a picture is worth a thousand words”? Chemistry and physics could equally be called visual sciences (1). A reaction scheme sketched in a laboratory notebook, tensile curves in an article, and absorption spectra are all forms of data visualization.

Well-presented data communicate a result efficiently, yet researchers and their students rarely receive training in data visualization.

The following tools will help you improve the figures in your articles and presentations.

Python

Python offers many packages for producing aesthetic or interactive graphs. Some are generic: they can produce many different types of graph for use in several contexts. Others were developed specifically for one discipline.

Focus on Matplotlib

Matplotlib is one of the best-known Python libraries for creating graphs. It gives control over every graphical parameter (lines, axes, colours, fonts, etc.), so a figure can be tuned in detail. First released in 2003, Matplotlib is sometimes passed over in favour of plotly, a more recent tool built on the JavaScript library plotly.js, which is itself based on the well-known d3.js library. Plotly has a light syntax that produces good graphs in only a few lines of code, and its figures are easy to include in a Jupyter notebook via widgets. The library can also create many 3D graphs, some of them rarely found in other libraries.
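As a minimal sketch of the fine-grained control described above, the following Python example draws an absorption band with Matplotlib and sets the line style, axis labels, font sizes and frame explicitly. The data are synthetic, and the function name plot_absorption_spectrum and the output file name are illustrative assumptions, not part of any published workflow.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display

import matplotlib.pyplot as plt
import numpy as np


def plot_absorption_spectrum(path="spectrum.png"):
    """Draw a synthetic absorption band and save the figure to `path`."""
    # Synthetic data (assumption): a Gaussian band centred at 520 nm
    wavelength = np.linspace(400, 700, 301)
    absorbance = 0.8 * np.exp(-((wavelength - 520) / 30) ** 2)

    fig, ax = plt.subplots(figsize=(5, 3.5))

    # Line colour, width and style are all set explicitly
    ax.plot(wavelength, absorbance, color="tab:red", linewidth=1.5,
            linestyle="--", label="sample (synthetic)")

    # Axis labels, font sizes and limits
    ax.set_xlabel("Wavelength (nm)", fontsize=11)
    ax.set_ylabel("Absorbance", fontsize=11)
    ax.set_xlim(400, 700)

    # Hide the top and right frame lines for a lighter look
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)

    ax.legend(frameon=False)
    fig.tight_layout()

    out = Path(path)
    fig.savefig(out, dpi=300)
    plt.close(fig)
    return out


if __name__ == "__main__":
    print(f"Figure written to {plot_absorption_spectrum()}")
```

Replacing the synthetic arrays with measured wavelengths and absorbances is the only change needed to reuse the sketch on real data. The remaining calls only affect appearance.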

Discipline-based packages include the ASE GUI, a graphical user interface for viewing and manipulating molecules and other atomic structures. It can also perform various calculations and convert between file formats: traj, xyz, pdb, cube, and many others. It is a module of ASE (Atomic Simulation Environment), a set of Python tools for atomic simulation. Besides the ase-gui module, Matplotlib is also cited in the visualization section of the ASE documentation.

R

Base R functions can be used to create graphs, and a number of packages can be added to improve and vary data representation. The best-known R package for data visualization, and one entirely dedicated to it, is undoubtedly ggplot2. It also exists in Python as ggplot. It can produce almost any type of graph, with extensive customization; see the ggplot2 documentation for details. ggplot2 is an extremely powerful tool, but learning to use it and understanding the philosophy of the grammar of graphics takes more effort than base R graphics.

Focus on R-Shiny

Another interesting tool for data visualisation in R is R-Shiny. The package can be used to create web applications directly from the RStudio environment, which makes it easy to publish one or more graphs, text, and other types of content on a web page. These applications can also be integrated into an R Markdown document. For an overview of what R-Shiny can do, see its gallery.

If you still cannot choose a data visualization tool, a website with filters and a network diagram can help you decide. It features many resources on producing graphs in R and Python with different packages (including ggplot2 and plotly), as well as a tutorial on using R Markdown to produce documents containing interactive graphs.

Useful resources

Cabanski, Christopher, et al. “Can Graphics Tell Lies? A Tutorial on How To Visualize Your Data”. Clinical and Translational Science, vol. 11, no. 4, July 2018, pp. 371-77, doi:10.1111/cts.12554.

Weiss, Charles J. “Scientific Computing for Chemists: An Undergraduate Course in Simulations, Data Processing, and Visualization”. Journal of Chemical Education, vol. 94, no. 5, May 2017, pp. 592-97, doi:10.1021/acs.jchemed.7b00078.

Parish, Chad M., and Philip D. Edmondson. “Data Visualization Heuristics for the Physical Sciences”. Materials & Design, vol. 179, October 2019, p. 107868, doi:10.1016/j.matdes.2019.107868.

Chen, Chun-houh, et al. Handbook of Data Visualization. Springer, 2008.

Course on ggplot2 and Shiny developed by Colin Bousige, CNRS researcher at the Laboratoire des Multimatériaux et des Interfaces, University of Lyon 1.

Online course on ggplot2, hosted on a website dedicated to R to which researchers from various disciplines (political science, economics, sociology, etc.) contribute.

Python package for computational chemistry.

Gallery of graphs made with R.

  1. Wu, Hsin-Kai, and Priti Shah. “Exploring Visuospatial Thinking in Chemistry Learning”. Science Education, vol. 88, no. 3, May 2004, p. 465, doi:10.1002/sce.10126.