Benchmarking: Projection

import holoviews as hv
import hvplot
import hvplot.pandas  # noqa
import pandas as pd
import statsmodels.formula.api as smf

pd.options.plotting.backend = "holoviews"

Read the summary of all benchmarking results.

summary = pd.read_parquet("s3://carbonplan-benchmarks/benchmark-data/v0.2/summary.parq")

Subset the data to isolate the impact of projection and chunk size.

df = summary[(summary["pixels_per_tile"] == 128) & (summary["region"] == "us-west-2")]

Set plot options.

cmap = ["#E66100", "#5D3A9B"]
plt_opts = {"width": 600, "height": 400}

Create a box plot showing how the rendering time depends on projection and chunk size.

df.hvplot.box(
    y="duration",
    by=["actual_chunk_size", "projection"],
    c="projection",
    cmap=cmap,
    ylabel="Time to render (ms)",
    xlabel="Chunk size (MB); EPSG number",
    legend=False,
).opts(**plt_opts)

Fit a multiple linear regression to the results. The results show that projection and chunk size both strongly affect rendering time. Holding chunk size fixed, equidistant cylindrical (EPSG:4326) pyramids take about 581 ms longer to render than Web Mercator pyramids, and each additional MB of chunk size adds about 60 ms.

model = smf.ols("duration ~ actual_chunk_size + C(projection)", data=df).fit()
model.summary()
OLS Regression Results

Dep. Variable:     duration            R-squared:           0.431
Model:             OLS                 Adj. R-squared:      0.430
Method:            Least Squares       F-statistic:         387.2
Date:              Tue, 29 Aug 2023    Prob (F-statistic):  7.16e-126
Time:              20:29:50            Log-Likelihood:      -8152.5
No. Observations:  1024                AIC:                 1.631e+04
Df Residuals:      1021                BIC:                 1.633e+04
Df Model:          2
Covariance Type:   nonrobust

                        coef       std err  t       P>|t|  [0.025    0.975]
Intercept               1864.6662  39.029   47.776  0.000  1788.080  1941.252
C(projection)[T.4326]   580.6684   43.438   13.368  0.000  495.430   665.906
actual_chunk_size       60.3908    2.474    24.408  0.000  55.536    65.246

Omnibus:        7.834   Durbin-Watson:     1.830
Prob(Omnibus):  0.020   Jarque-Bera (JB):  7.973
Skew:           -0.210  Prob(JB):          0.0186
Kurtosis:       2.897   Cond. No.          31.2


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
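
To make the coefficients in the summary above easier to read, the additive model can be evaluated by hand. The following is a minimal sketch that hard-codes the fitted coefficients from the table. The helper name `predict_duration_ms` and the example chunk sizes are hypothetical, and the sketch assumes that the baseline projection level (the one not shown as `T.4326`) is Web Mercator.

```python
# Coefficients copied from the OLS summary above.
INTERCEPT = 1864.6662        # ms, baseline projection at chunk size 0
COEF_EPSG_4326 = 580.6684    # ms, offset for equidistant cylindrical (EPSG:4326)
COEF_CHUNK_SIZE = 60.3908    # ms per MB of actual_chunk_size


def predict_duration_ms(chunk_size_mb: float, is_epsg_4326: bool) -> float:
    """Hypothetical helper: evaluate the additive model for one configuration."""
    return INTERCEPT + COEF_EPSG_4326 * is_epsg_4326 + COEF_CHUNK_SIZE * chunk_size_mb


# Illustrative chunk sizes (assumed values, not taken from the benchmark data).
for chunk_size_mb in (1, 5, 10):
    baseline = predict_duration_ms(chunk_size_mb, is_epsg_4326=False)
    cylindrical = predict_duration_ms(chunk_size_mb, is_epsg_4326=True)
    print(f"{chunk_size_mb:>2} MB: baseline {baseline:.0f} ms, EPSG:4326 {cylindrical:.0f} ms")
```

Because this model has no interaction terms, the predicted gap between the two projections is the same at every chunk size.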

Show the rendering time at different zoom levels.

plt_opts = {"width": 400, "height": 300}

plts = []

for zoom_level in range(4):
    df_level = df[df["zoom"] == zoom_level]
    plts.append(
        df_level.hvplot.box(
            y="duration",
            by=["actual_chunk_size", "projection"],
            c="projection",
            cmap=cmap,
            ylabel="Time to render (ms)",
            xlabel="Chunk size (MB); EPSG number",
            legend=False,
            title=f"Zoom level {zoom_level}",
        ).opts(**plt_opts)
    )
hv.Layout(plts).cols(2)

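Add interaction terms between projection and zoom level, and between projection and chunk size, to the multiple linear regression. The results show that projection has a significant impact on rendering performance at zoom levels 2 and 3, with the most pronounced effect at zoom level 3, where the gap between projections is about 1429 ms larger than at zoom level 0. The interaction at zoom level 1 is not significant (p = 0.616). Larger chunk sizes also slow rendering more for equidistant cylindrical pyramids than for Web Mercator pyramids, by about 35 ms per additional MB.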

model = smf.ols(
    "duration ~ actual_chunk_size * C(projection) + C(projection) * C(zoom)", data=df
).fit()
model.summary()
OLS Regression Results

Dep. Variable:     duration            R-squared:           0.729
Model:             OLS                 Adj. R-squared:      0.726
Method:            Least Squares       F-statistic:         302.5
Date:              Tue, 29 Aug 2023    Prob (F-statistic):  5.84e-280
Time:              20:29:51            Log-Likelihood:      -7773.7
No. Observations:  1024                AIC:                 1.557e+04
Df Residuals:      1014                BIC:                 1.562e+04
Df Model:          9
Covariance Type:   nonrobust

                                         coef       std err  t       P>|t|  [0.025    0.975]
Intercept                                1760.2638  48.690   36.152  0.000  1664.719  1855.809
C(projection)[T.4326]                    -311.9528  68.858   -4.530  0.000  -447.074  -176.832
C(zoom)[T.1.0]                           1043.0435  60.224   17.319  0.000  924.866   1161.221
C(zoom)[T.2.0]                           239.5532   60.224   3.978   0.000  121.376   357.731
C(zoom)[T.3.0]                           -186.3396  60.224   -3.094  0.002  -304.517  -68.162
C(projection)[T.4326]:C(zoom)[T.1.0]     42.7015    85.169   0.501   0.616  -124.427  209.830
C(projection)[T.4326]:C(zoom)[T.2.0]     741.8614   85.169   8.710   0.000  574.733   908.990
C(projection)[T.4326]:C(zoom)[T.3.0]     1428.6270  85.169   16.774  0.000  1261.499  1595.755
actual_chunk_size                        42.9576    2.426    17.710  0.000  38.198    47.717
actual_chunk_size:C(projection)[T.4326]  34.8665    3.430    10.164  0.000  28.135    41.598

Omnibus:        78.010  Durbin-Watson:     1.395
Prob(Omnibus):  0.000   Jarque-Bera (JB):  95.056
Skew:           -0.699  Prob(JB):          2.29e-21
Kurtosis:       3.525   Cond. No.          151.


Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
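
With interaction terms, the effect of projection is no longer a single coefficient: it depends on both zoom level and chunk size. The sketch below combines the relevant coefficients from the summary above into the predicted difference in rendering time between equidistant cylindrical and baseline pyramids. The helper name `projection_gap_ms` and the example chunk sizes are hypothetical, and the baseline is assumed to be Web Mercator.

```python
# Projection-related coefficients copied from the OLS summary above.
COEF_EPSG_4326 = -311.9528            # ms, main effect at zoom 0 and chunk size 0
COEF_EPSG_4326_PER_MB = 34.8665       # ms per MB, projection x chunk size interaction
COEF_EPSG_4326_BY_ZOOM = {            # ms, projection x zoom interaction
    0: 0.0,                           # reference zoom level
    1: 42.7015,
    2: 741.8614,
    3: 1428.6270,
}


def projection_gap_ms(zoom: int, chunk_size_mb: float) -> float:
    """Hypothetical helper: predicted EPSG:4326 minus baseline rendering time."""
    return (
        COEF_EPSG_4326
        + COEF_EPSG_4326_BY_ZOOM[zoom]
        + COEF_EPSG_4326_PER_MB * chunk_size_mb
    )


# Illustrative chunk sizes (assumed values, not taken from the benchmark data).
for zoom in range(4):
    gaps = ", ".join(
        f"{size} MB: {projection_gap_ms(zoom, size):+.0f} ms" for size in (1, 5, 10)
    )
    print(f"zoom {zoom}: {gaps}")
```

A positive value means the model predicts slower rendering for equidistant cylindrical pyramids. The zoom level 1 interaction is not statistically significant, so the gap at that level should be read with caution.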