Why Biology needs Data Visualization

A manifesto for integrating data visualization as a research method in the life sciences.
datavis
biovis
Author

Helena Jambor

Published

December 15, 2025

Why Seeing Still Matters in Big Data Biology

Biology probes the form and function of Life. Form is easy to grasp: cells under a microscope, subcellular structures in electron micrographs, or organisms on camera readily present their shapes.

Figure 1: Examples of microscopy images that allow understanding of Life: fruit fly ovary development and the changing RNA localizations during the process. Jambor et al.

Function is different: it emerges from molecular compositions, interactions, and temporal changes. Such data are not directly visible, so we use statistics to make sense of them. But summaries and p-values alone rarely reveal how complex biological systems are organized, how variable the samples are and how much uncertainty results, or which unexpected relationships and patterns the data contain. As datasets grow larger and more complex, these insights only become accessible when data are visualized.

Figure 2: Statistical chart comparing RNA localizations across development. Jambor et al.

Despite being used widely, data visualization is still treated as a final step in research, a way to communicate results once the real analysis is finished. In reality, visualization plays a much earlier and more fundamental role. Visuals expose batch effects, hidden subpopulations, nonlinear behaviors, and experimental artifacts that often remain invisible to summary statistics alone. These insights directly shape which data can be trusted, which controls are needed, and which experiments should come next.

Figure 3: Examples of plots used during the research process to visualize technical and sample variability.
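
To make the point about hidden subpopulations concrete, here is a minimal sketch in Python. The two groups of measurements are hypothetical toy values, constructed for illustration only and not taken from any dataset in this article; the helper names `summarize` and `text_histogram` are likewise made up. Both groups share the same mean and standard deviation, yet a simple plot of the raw values shows that one of them is actually two separate clusters.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical toy measurements (assumption, not data from the article):
# both groups are constructed to share the same mean and standard deviation.
one_population = [6.0] * 1 + [8.0] * 3 + [10.0] * 6 + [12.0] * 3 + [14.0] * 1
two_subpopulations = [8.0] * 7 + [12.0] * 7


def summarize(values):
    """Return (mean, population standard deviation) of the values."""
    return mean(values), pstdev(values)


def text_histogram(values):
    """Return one line per distinct value, with one '#' per observation."""
    counts = Counter(values)
    return [f"{value:5.1f} | {'#' * counts[value]}" for value in sorted(counts)]


if __name__ == "__main__":
    for name, values in [("one population", one_population),
                         ("two subpopulations", two_subpopulations)]:
        m, sd = summarize(values)
        print(f"{name}: mean={m:.1f}, sd={sd:.1f}")
        print("\n".join(text_histogram(values)))
```

The summary line is identical for both groups, while the text histogram shows a single peak in the first group and two separated clusters with nothing in between in the second. With real data one would reach for a proper plotting library, but the principle is the same: the raw distribution carries information that the summary discards.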

While the urgency to visualize data feels modern, the principle itself is not new. Seeing has always been central to biological understanding. Darwin’s and Linnaeus’s classification of species relied on careful visual comparison. In the nineteenth century, Florence Nightingale pioneered statistical charts to reform healthcare, while John Snow’s maps of cholera outbreaks transformed how disease transmission was understood. In the twentieth century, Michaelis and Menten introduced the kinetic plot as a standardized visual language for enzyme activity, and more recently, interactive genome browsers have made entire genomes navigable at nucleotide resolution.

Darwin’s phylogenetic tree

Snow’s Cholera map

Nightingale’s chart invented to document health care reforms

Marey’s animation of human locomotion
Figure 4: Early data visualizations from life and health sciences.

Today, however, data visualization is still poorly formalized in the life sciences. It lacks dedicated training programs, shared standards, and institutional recognition. This gap matters because data visualization drives hypothesis generation, enables insightful data presentation, and builds trust in the results.

Just as early scientists needed training in scientific drawing to accurately document what they observed, today’s researchers must learn to engineer and interpret data visualizations with comparable rigor. In an era where biology increasingly unfolds in data rather than images alone, learning how to see again has never been more important.