Benchmarks 4: Spatial Chunk Variation

Explanation

In the previous notebook, we saw how chunk size impacts performance. However, a smaller chunk size means more chunks are needed to cover the same spatial extent. In this notebook, we explore how the number of spatial chunks affects performance, especially at low zoom levels.

Dataset Generation

We compared the performance of tiling artificially generated Zarr data with a constant chunk size while increasing the spatial resolution, so that a varying number of chunks is required to cover the same spatial extent.

The code to produce the Zarr stores is in the tile-benchmarking repo: 01-generate-datasets/generate-fake-data-with-chunks.ipynb.
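To make the relationship between resolution and chunk count concrete, here is a minimal sketch. It is not the code from the repository notebook: the function name `count_spatial_chunks`, the chunk shape, and the grid resolutions are hypothetical values chosen only for illustration.

```python
import math


def count_spatial_chunks(lat_size, lon_size, chunk_lat, chunk_lon):
    """Return how many chunks are needed to cover a lat/lon grid.

    Partial chunks at the grid edge still count as one chunk each,
    hence the ceiling division along each axis.
    """
    return math.ceil(lat_size / chunk_lat) * math.ceil(lon_size / chunk_lon)


# Hypothetical chunk shape (lat, lon), held constant across all datasets.
chunk_shape = (512, 1024)

# Hypothetical global grids at increasingly fine resolution (in degrees).
for resolution_deg in (1.0, 0.5, 0.25, 0.125):
    lat_size = int(180 / resolution_deg)
    lon_size = int(360 / resolution_deg)
    n_chunks = count_spatial_chunks(lat_size, lon_size, *chunk_shape)
    print(f"{resolution_deg} deg: {lat_size} x {lon_size} grid -> {n_chunks} chunks")
```

With the chunk shape fixed, every increase in resolution enlarges the grid, so the number of chunks covering the same extent can only stay the same or grow.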

Tests

Tests were run via the tile-benchmarking/02-run-tests/04-number-of-spatial-chunks.ipynb notebook.

import warnings

import holoviews as hv
import hvplot.pandas  # noqa: F401 (registers the .hvplot accessor on DataFrames)
import pandas as pd

pd.options.plotting.backend = 'holoviews'
warnings.filterwarnings('ignore')

# Load the benchmark results CSV from the tile-benchmarking repository.
git_url_path = "https://raw.githubusercontent.com/developmentseed/tile-benchmarking/main/02-run-tests/results-csvs"
df = pd.read_csv(f"{git_url_path}/04-number-of-spatial-chunks-results.csv")

# Zoom levels 0 through 5.
zooms = range(6)

plt_opts = {"width": 400, "height": 300}

# Build one box plot of render time per zoom level, grouped by chunk count.
plts = []

for zoom_level in zooms:
    df_level = df[df["zoom"] == zoom_level]
    plts.append(
        df_level.hvplot.box(
            y="time",
            by=["number_of_spatial_chunks"],
            c="number_of_spatial_chunks",
            cmap='Plasma_r',
            ylabel="Time to render (ms)",
            xlabel="Number of spatial chunks",
            legend=False,
            title=f"Zoom level {zoom_level}",
        ).opts(**plt_opts)
    )
# Arrange the per-zoom plots in a two-column grid.
hv.Layout(plts).cols(2)
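A numeric summary can complement the box plots. The sketch below assumes a results table with the same `zoom`, `number_of_spatial_chunks`, and `time` columns used above. The helper name `median_time_table` is hypothetical, and the sample rows are made-up placeholder values, not benchmark results.

```python
import pandas as pd


def median_time_table(results: pd.DataFrame) -> pd.DataFrame:
    """Pivot results into median time per zoom (rows) and chunk count (columns)."""
    return (
        results.groupby(["zoom", "number_of_spatial_chunks"])["time"]
        .median()
        .unstack("number_of_spatial_chunks")
    )


# Placeholder rows for illustration only; these are NOT measured results.
sample = pd.DataFrame(
    {
        "zoom": [0, 0, 0, 0, 1, 1, 1, 1],
        "number_of_spatial_chunks": [4, 4, 16, 16, 4, 4, 16, 16],
        "time": [100.0, 120.0, 300.0, 340.0, 90.0, 110.0, 150.0, 170.0],
    }
)

table = median_time_table(sample)
print(table)
```

Calling `median_time_table(df)` on the loaded results should give one row per zoom level and one column per chunk count.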

Interpretation of the Results

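A greater number of spatial chunks degrades performance at low zoom levels, as seen above, most notably at zooms 0 and 1, where a single tile covers a large area and therefore spans many chunks. At higher zoom levels, each tile covers a smaller area and fewer chunks need to be loaded, so there is no noticeable difference in performance.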
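The effect of zoom level on chunk reads can be approximated with a simplified model. The sketch below assumes a global dataset stored as a square grid of chunks that lines up with the tile grid, and it ignores projection distortion and partial overlaps. The function `chunks_per_tile` and the value of 16 chunks per side are hypothetical.

```python
import math


def chunks_per_tile(chunks_per_side, zoom):
    """Estimate how many chunks one tile touches at a given zoom level.

    Simplifying assumptions: the dataset is a square grid of
    `chunks_per_side` x `chunks_per_side` chunks covering the whole world,
    and tiles align with chunk boundaries. At zoom z there are 2**z tiles
    along each axis, and a tile always touches at least one chunk.
    """
    tiles_per_side = 2 ** zoom
    per_axis = max(1, math.ceil(chunks_per_side / tiles_per_side))
    return per_axis ** 2


# Hypothetical dataset with 16 chunks along each spatial axis.
for zoom in range(6):
    print(f"zoom {zoom}: ~{chunks_per_tile(16, zoom)} chunks per tile")
```

Under these assumptions, the number of chunks per tile falls by roughly a factor of four with each zoom level until it reaches one, which is consistent with chunk count mattering most at zooms 0 and 1.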

We can address slow performance at low zoom levels with pyramid (multiscale) datasets, as demonstrated in Benchmarks: Pyramids.