A Killer Fix for Scrunched Axes, Step-by-step
Step 0: Get Data
Step 1: Axes Grid
Step 2: Plot Content
Step 3: Zoom Indicators
Et Voilà!
Bonus: Want Insets Instead?
Further Information
Authorship
Citations
Appendix

Make beautiful multi-scale plots with matplotlib in 3 easy steps.

Towards Data Science

Large-magnitude outliers, tiny features, and sharp spikes are common frustrations in data visualization. All three can make visual details illegible by scrunching plot components into too small an area.

Sometimes a fix can be had by simply excluding unruly data. When including such data is central to the question at hand, applying a log scale to axes can realign spacing for better separation among lower-magnitude data. This approach can only go so far, however.

In this article, we’ll take a look at another option: zoom plots, which augment a visualization with panels providing magnified views of areas of interest.

The visualizations we’ll be building in this tutorial.

Zoom plots are commonly arranged as insets within the main plot, but can also be combined as a lattice with the original plot. We’ll delve into both.

This article provides a code-oriented tutorial on how to use matplotlib with specialized tools from the outset library to construct zoom plots. We’ll build a visualization of rainfall data from Texas made available by Evett et al. via the USDA. This data set comprises a full year of rain gauge readings from two nearby sites, taken at 15-minute intervals.

The short duration of rain events and the extreme intensity of the heaviest rainfall complicate matters. Throwing a month’s worth of Evett et al.’s rainfall data into a simple line plot reveals the visualization problem we’re up against.

We’ve definitely got some work to do to nice this up! In our visualization, we’ll focus on recovering three particular components of the data.

  1. the little shower around day 72,
  2. the big rainstorm around day 82, and
  3. light precipitation events over the course of the entire month.

To better show these details, we’ll create a zoom panel for each.

Our plan is laid out, so let’s get into the code 👍

Fetch the rain gauge records via the Open Science Framework.

# ----- see appendix for package imports
df = pd.read_csv("https://osf.io/6mx3e/download") # download data

Here’s a peek at the data.

+------+-------------+--------------+--------------+------------+-----------+
| Year | Decimal DOY | NW dew/frost | SW dew/frost | NW precip  | SW precip |
+------+-------------+--------------+--------------+------------+-----------+
| 2019 | 59.73958    | 0            | 0            | 0          | 0         |
| 2019 | 59.74999    | 0            | 0            | 0.06159032 | 0         |
| 2019 | 59.76041    | 0            | 0            | 0          | 0         |
| 2019 | 59.77083    | 0            | 0            | 0.05895544 | 0.0813772 |
| 2019 | 59.78124    | 0            | 0            | 0.05236824 | 0.0757349 |
| ...  | ...         | ...          | ...          | ...        | ...       |
+------+-------------+--------------+--------------+------------+-----------+
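
As a quick sanity check on the sampling interval, the arithmetic below converts 15 minutes into a fraction of a day and compares it with the spacing of the rows shown above. It assumes the Decimal DOY column counts whole days plus a fractional time of day, which the sample rows suggest but which is not confirmed here. The variable names are illustrative only.

```python
# 15-minute readings expressed as a fraction of a day
MINUTES_PER_DAY = 24 * 60
step_days = 15 / MINUTES_PER_DAY  # 1/96 of a day
readings_per_day = MINUTES_PER_DAY // 15

# spacing between the first two Decimal DOY values in the table above
observed_step = 59.74999 - 59.73958

print(f"expected step: {step_days:.5f} days")
print(f"observed step: {observed_step:.5f} days")
print(f"readings per day: {readings_per_day}")
```

The two step sizes agree to within the rounding of the printed table, which is consistent with one reading every quarter hour.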

Before moving on, some minor preparatory chores.

nwls = "NW Lysimeter\n(35.18817624°N, -102.09791°W)"
swls = "SW Lysimeter\n(35.18613985°N, -102.0979187°W)"
df[nwls], df[swls] = df["NW precip in mm"], df["SW precip in mm"]

# filter down to just data from March 2019
march_df = df[np.clip(df["Decimal DOY"], 59, 90) == df["Decimal DOY"]]

In the code above, we’ve created more detailed column names and subset the data down to a single month.
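
The filter compares each value with its clipped copy, which may look unusual at first. The sketch below, using made-up day-of-year values, suggests why it works: clipping leaves in-range values unchanged, so the equality test acts as an inclusive range mask.

```python
import numpy as np

# a few made-up decimal day-of-year values, including both boundaries
doy = np.array([58.99, 59.0, 72.5, 90.0, 90.01])

# clipping leaves in-range values untouched, so equality marks them
clip_mask = np.clip(doy, 59, 90) == doy

# the same inclusive range test written as explicit comparisons
compare_mask = (doy >= 59) & (doy <= 90)

print(clip_mask)
print(np.array_equal(clip_mask, compare_mask))
```

On a pandas column, `Series.between(59, 90)` expresses the same inclusive test and may read more directly.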

Our first plotting step is to initialize an outset.OutsetGrid instance to manage our lattice of magnification plots. This class operates analogously to seaborn’s FacetGrid, which facilitates construction of standard lattice plots by breaking data across axes based on a categorical variable.

OutsetGrid differs from FacetGrid, though, in that in addition to axes with faceted data it prepares an initial “source” axes containing all data together. Further, OutsetGrid includes tools to automatically generate “marquee” annotations that show how magnifications correspond to the original plot. The schematic below overviews OutsetGrid’s plotting model.

Getting back to our example, we’ll construct an OutsetGrid by providing a list of the main plot regions we want to magnify through the data kwarg. Subsequent kwargs provide styling and layout information.

grid = otst.OutsetGrid(  # initialize axes grid manager
    data=[
        # (x0, y0, x1, y1) regions to outset
        (71.6, 0, 72.2, 2),  # little shower around day 72
        (59, 0, 90, 0.2),  # all light precipitation events
        (81.3, 0, 82.2, 16),  # big rainstorm around day 82
    ],
    x="Time",  # axes label
    y="Precipitation (mm)",  # axes label
    aspect=2,  # make subplots wide
    col_wrap=2,  # wrap subplots into a 2x2 grid
    # styling for zoom indicator annotations, discussed later
    marqueeplot_kws={"frame_outer_pad": 0, "mark_glyph_kws": {"zorder": 11}},
    marqueeplot_source_kws={"zorder": 10, "frame_face_kws": {"zorder": 10}},
)

Here we’ve specified a wider-than-tall aspect ratio for subplots and how many columns we want to have.
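
To get a feel for how much each region magnifies the data, the arithmetic below compares every region’s width and height with the full extent. This is only a rough sketch. It assumes the source plot spans the union of the three regions (days 59 to 90, 0 to 16 mm), which may not match the actual axes limits.

```python
# (x0, y0, x1, y1) regions, copied from the OutsetGrid call above
regions = {
    "little shower": (71.6, 0, 72.2, 2),
    "light precipitation": (59, 0, 90, 0.2),
    "big rainstorm": (81.3, 0, 82.2, 16),
}

# assumption: the source plot spans the union of all regions
full_width = max(r[2] for r in regions.values()) - min(r[0] for r in regions.values())
full_height = max(r[3] for r in regions.values()) - min(r[1] for r in regions.values())

zoom = {}
for name, (x0, y0, x1, y1) in regions.items():
    # how many times smaller the region is than the full extent, per axis
    zoom[name] = (full_width / (x1 - x0), full_height / (y1 - y0))
    print(f"{name}: {zoom[name][0]:.1f}x in time, {zoom[name][1]:.1f}x in precipitation")
```

Under that assumption, the horizontal and vertical factors differ widely from region to region, so each zoom panel stretches the data differently along each axis.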

Our axes grid is set up, so we’re ready for the next step.

It’s time to put some content on our axes.

We can use area plots to co-visualize our rain gauges’ readings. (For those unfamiliar, area plots are just line plots with a fill down to the x axis.) Applying a transparency effect will elegantly show where the gauges agree, and where they don’t.

We can harness matplotlib’s stackplot to draw our overlapped area plots. Although stackplot is designed to create plots with areas “stacked” on top of each other, we can get overlapped areas by splitting out two calls to the plotter, one for each gauge.

To draw this same content across all four axes of the grid, we will use OutsetGrid’s broadcast method. This method takes a plotter function as its first argument, then calls it on each axes using any subsequent arguments.

# draw semi-transparent filled lineplot on all axes for each lysimeter
for y, color in zip([nwls, swls], ["fuchsia", "aquamarine"]):
    grid.broadcast(
        plt.stackplot,  # plotter
        march_df["Decimal DOY"],  # all args below forwarded to plotter...
        march_df[y],
        colors=[color],
        labels=[y],
        lw=2,
        edgecolor=color,
        alpha=0.4,  # set to 60% transparent (alpha 1.0 is non-transparent)
        zorder=10,
    )
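
To make the broadcast idea concrete, here is a minimal sketch of the pattern on a plain matplotlib subplot grid. It is not outset’s implementation. `broadcast_sketch` is a hypothetical helper written for illustration, and the data values are made up.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
from matplotlib import pyplot as plt


def broadcast_sketch(axes, plotter, *args, **kwargs):
    """Call a pyplot-style plotter once per axes (hypothetical helper)."""
    for ax in axes:
        plt.sca(ax)  # make this axes the target of pyplot calls
        plotter(*args, **kwargs)


fig, axs = plt.subplots(2, 2)
broadcast_sketch(
    axs.flat,
    plt.stackplot,
    [0, 1, 2, 3],  # x values (made up)
    [0.0, 1.5, 0.5, 0.0],  # y values (made up)
    colors=["fuchsia"],
    alpha=0.4,
)
print([len(ax.collections) for ax in axs.flat])
```

Each axes ends up with the same filled area. What differs between panels in a zoom lattice is the axes limits, not the data drawn.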

For better contrast against background fills, we’ll also use broadcast to add a white underlay around the stackplots.

grid.broadcast(
    plt.stackplot,  # plotter
    march_df["Decimal DOY"],  # all args below forwarded to plotter...
    np.maximum(march_df["SW precip in mm"], march_df["NW precip in mm"]),
    colors=["white"],
    lw=20,  # thick line width causes protrusion of white border
    edgecolor="white",
    zorder=9,  # note lower zorder positions underlay below stackplots
)

Here’s how our plot looks before we move on to the next stage.

Looking good: we can already see magnifications showing up on their proper axes at this stage.

Now it’s time to add zoom indicator boxes, a.k.a. outset “marquees,” to show how the scales of our auxiliary plots relate to the scale of the main plot.

# draw "marquee" zoom indicators showing correspondences between main plot
# and outset plots
grid.marqueeplot(equalize_aspect=False)  # allow axes aspect ratios to differ

Note the kwarg passed to allow outset plots to take on different aspect ratios from the main plot. This way, outset data can fully expand to take advantage of all available axes space.

We’re most of the way there, with just a few finishing touches left at this point.

Our last piece of business is to add a legend and switch out numeric x ticks for proper timestamps.

grid.source_axes.legend(  # add legend to primary axes
    loc="upper left",
    bbox_to_anchor=(0.02, 1.0),  # legend positioning
    frameon=True,  # styling: turn on legend frame
)

# ----- see appendix for code to relabel axes ticks with timestamps
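
The appendix code itself is not reproduced here, so the following is only a hedged sketch of how decimal day-of-year ticks could be turned into timestamps. It assumes that 0.0 corresponds to midnight on January 1, which matches the March filter bounds of 59 and 90 used earlier but is not confirmed by the data set. `doy_to_datetime` and `format_tick` are hypothetical names.

```python
from datetime import datetime, timedelta


def doy_to_datetime(decimal_doy, year=2019):
    """Convert a decimal day of year to a datetime.

    Assumption: 0.0 is midnight on January 1, so 59.0 is midnight on
    March 1 in a non-leap year.
    """
    return datetime(year, 1, 1) + timedelta(days=decimal_doy)


def format_tick(value, pos=None):
    """Format a tick value; takes the (value, position) pair matplotlib passes."""
    return doy_to_datetime(value).strftime("%b %d %H:%M")


print(format_tick(59.0))
print(format_tick(82.25))
```

A function with this signature can be wrapped in `matplotlib.ticker.FuncFormatter` and applied with `ax.xaxis.set_major_formatter(...)` on each axes of the grid.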

With that, the plot is complete.

That’s all there is to it, a zoom plot in 3 easy steps.

We can create insets by rearranging the magnification lattice axes into position over the main axes. Here’s how, using the outset library’s inset_outsets tool.

otst.inset_outsets(
    grid,
    insets=otst_util.layout_corner_insets(
        3,  # three insets
        "NW",  # arrange in upper-left corner
        inset_margin_size=(0.02, 0),  # allow closer to main axes bounds
        inset_grid_size=(0.67, 0.9),  # grow to take up available space
    ),
    equalize_aspect=False,
)
sns.move_legend(  # move legend centered above figure
    grid.source_axes, "lower center", bbox_to_anchor=(0.5, 1.1), ncol=2
)

In this case, we’ve also used outset.util.layout_corner_insets for fine-tuned control over inset sizing and positioning.

And just like that, we’ve got three zoom insets arranged in the upper left-hand corner.

There’s a lot more you can do with outset.


In addition to explicit zoom area specification, the outset library also provides a seaborn-like data-oriented API to infer zoom insets containing categorical subsets of a dataframe. Extensive styling and layout customization options are also available.

Here’s a peek at some highlights from the library’s gallery…

You can learn more about using outset in the library’s documentation at https://mmore500.com/outset. In particular, be sure to check out the quickstart guide.

Outset can be installed via pip as python3 -m pip install outset.

This tutorial is contributed by me, Matthew Andres Moreno.

I currently serve as a postdoctoral scholar at the University of Michigan, where my work is supported by the Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship, a Schmidt Futures program.

My appointment is split between the university’s Ecology and Evolutionary Biology Department, the Center for the Study of Complexity, and the Michigan Institute for Data Science.

Find me on Twitter as @MorenoMatthewA and on GitHub as @mmore500.

disclosure: I’m the creator of the outset library.

Evett, Steven R.; Marek, Gary W.; Copeland, Karen S.; Howell, Terry A. Sr.; Colaizzi, Paul D.; Brauer, David K.; Ruthardt, Brice B. (2023). Evapotranspiration, Irrigation, Dew/frost — Water Balance Data for The Bushland, Texas Soybean Datasets. Ag Data Commons. https://doi.org/10.15482/USDA.ADC/1528713. Accessed 2023–12–26.

J. D. Hunter, “Matplotlib: A 2D Graphics Environment”, Computing in Science & Engineering, vol. 9, no. 3, pp. 90–95, 2007. https://doi.org/10.1109/MCSE.2007.55

Marek, G. W., Evett, S. R., Colaizzi, P. D., & Brauer, D. K. (2021). Preliminary crop coefficients for late planted short-season soybean: Texas High Plains. Agrosystems, Geosciences & Environment, 4(2). https://doi.org/10.1002/agg2.20177

McKinney, W. (2010). Data structures for statistical computing in Python. Proceedings of the 9th Python in Science Conference, Volume 445. https://doi.org/10.25080/Majora-92bf1922-00a

Waskom, M. L., (2021). seaborn: statistical data visualization. Journal of Open Source Software, 6(60), 3021, https://doi.org/10.21105/joss.03021.

You’ll find the complete code as a gist here and as a notebook here.

To install dependencies for this exercise,

python3 -m pip install \
  matplotlib `# ==3.8.2` \
  numpy `# ==1.26.2` \
  outset `# ==0.1.6` \
  opytional `# ==0.1.0` \
  pandas `# ==2.1.3` \
  seaborn `# ==0.13.0`
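
If results differ from the figures shown, one thing worth checking is whether the installed package versions match the ones noted above. The sketch below uses only the standard library. `PINNED` and `installed_version` are illustrative names, and a version mismatch does not necessarily mean anything is broken.

```python
from importlib import metadata

# versions copied from the install command above
PINNED = {
    "matplotlib": "3.8.2",
    "numpy": "1.26.2",
    "outset": "0.1.6",
    "opytional": "0.1.0",
    "pandas": "2.1.3",
    "seaborn": "0.13.0",
}


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for package, pinned in PINNED.items():
    found = installed_version(package)
    status = "ok" if found == pinned else "differs"
    print(f"{package}: noted {pinned}, installed {found} ({status})")
```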

All images are works of the author.
