Benchmarking: Projection

import holoviews as hv
import hvplot
import hvplot.pandas  # noqa
import pandas as pd
import statsmodels.formula.api as smf

pd.options.plotting.backend = "holoviews"

Read the summary of all benchmarking results.

summary = pd.read_parquet("s3://carbonplan-benchmarks/benchmark-data/v0.2/summary.parq")

Subset the data to isolate the impact of projection and chunk size.

df = summary[(summary["pixels_per_tile"] == 128) & (summary["region"] == "us-west-2")]

Set plot options.

cmap = ["#E66100", "#5D3A9B"]
plt_opts = {"width": 600, "height": 400}

Create a box plot showing how the rendering time depends on projection and chunk size.

df.hvplot.box(
    y="duration",
    by=["actual_chunk_size", "projection"],
    c="projection",
    cmap=cmap,
    ylabel="Time to render (ms)",
    xlabel="Chunk size (MB); EPSG number",
    legend=False,
).opts(**plt_opts)
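A numeric summary can complement the box plot by putting numbers on each box. The following is a minimal sketch on a toy DataFrame, not the benchmark data: the column names match those used above, but the chunk sizes and durations are made up for illustration, and the EPSG codes 3857 and 4326 are assumed to be the two values of `projection`.

```python
import pandas as pd

# Toy stand-in for the benchmark subset; every value here is hypothetical.
toy = pd.DataFrame(
    {
        "actual_chunk_size": [1, 1, 1, 1, 5, 5, 5, 5],
        "projection": [3857, 3857, 4326, 4326, 3857, 3857, 4326, 4326],
        "duration": [1900.0, 2100.0, 2400.0, 2600.0, 2200.0, 2400.0, 2900.0, 3100.0],
    }
)

# One row per box in the plot: same grouping keys as the `by` argument.
stats = toy.groupby(["actual_chunk_size", "projection"])["duration"].agg(
    ["count", "median", "min", "max"]
)
print(stats)
```

Applying the same `groupby` to `df` instead of `toy` should give one row per box in the plot.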

Fit a multiple linear regression to the results. The results show that both projection and chunk size strongly affect the rendering time: Web Mercator pyramids render about 580 ms faster than equidistant cylindrical (EPSG:4326) pyramids, and each additional MB of chunk size adds about 60 ms.

model = smf.ols("duration ~ actual_chunk_size + C(projection)", data=df).fit()
model.summary()

OLS Regression Results
Dep. Variable:	duration	R-squared:	0.431
Model:	OLS	Adj. R-squared:	0.430
Method:	Least Squares	F-statistic:	387.2
Date:	Tue, 29 Aug 2023	Prob (F-statistic):	7.16e-126
Time:	20:29:50	Log-Likelihood:	-8152.5
No. Observations:	1024	AIC:	1.631e+04
Df Residuals:	1021	BIC:	1.633e+04
Df Model:	2
Covariance Type:	nonrobust

	coef	std err	t	P>|t|	[0.025	0.975]
Intercept	1864.6662	39.029	47.776	0.000	1788.080	1941.252
C(projection)[T.4326]	580.6684	43.438	13.368	0.000	495.430	665.906
actual_chunk_size	60.3908	2.474	24.408	0.000	55.536	65.246

Omnibus:	7.834	Durbin-Watson:	1.830
Prob(Omnibus):	0.020	Jarque-Bera (JB):	7.973
Skew:	-0.210	Prob(JB):	0.0186
Kurtosis:	2.897	Cond. No.	31.2

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
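As a rough worked example of how to read the coefficients above, the additive model can be evaluated by hand. The sketch below hard-codes the coefficients from the summary table. The helper name `predict_duration` and the 10 MB chunk size are illustrative assumptions, and the baseline projection is assumed to be Web Mercator because the only projection term in the table is `T.4326`.

```python
# Coefficients copied from the OLS summary above (milliseconds).
INTERCEPT = 1864.6662  # baseline projection at a chunk size of 0 MB
PROJ_4326 = 580.6684   # offset for C(projection)[T.4326]
PER_MB = 60.3908       # slope for actual_chunk_size


def predict_duration(chunk_size_mb: float, epsg_4326: bool) -> float:
    """Hypothetical helper: evaluate the fitted additive model by hand."""
    offset = PROJ_4326 if epsg_4326 else 0.0
    return INTERCEPT + PER_MB * chunk_size_mb + offset


# Illustrative chunk size of 10 MB (an assumption, not a benchmark setting).
baseline = predict_duration(10, epsg_4326=False)
cylindrical = predict_duration(10, epsg_4326=True)
print(f"baseline: {baseline:.1f} ms, EPSG:4326: {cylindrical:.1f} ms")
```

Because this model has no interaction term, the modeled gap between projections is the same 580.7 ms at every chunk size.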

Show the rendering time at different zoom levels.

plt_opts = {"width": 400, "height": 300}

plts = []

for zoom_level in range(4):
    df_level = df[df["zoom"] == zoom_level]
    plts.append(
        df_level.hvplot.box(
            y="duration",
            by=["actual_chunk_size", "projection"],
            c="projection",
            cmap=cmap,
            ylabel="Time to render (ms)",
            xlabel="Chunk size (MB); EPSG number",
            legend=False,
            title=f"Zoom level {zoom_level}",
        ).opts(**plt_opts)
    )
hv.Layout(plts).cols(2)

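Add interaction terms between projection and zoom level, and between projection and chunk size, to the multiple linear regression. The results show that projection has a significant effect on rendering time at higher zoom levels, with the most pronounced effect at zoom level 3, where the gap between equidistant cylindrical and Web Mercator pyramids is about 1,430 ms larger than at zoom level 0. The interaction at zoom level 1 is not significant (p = 0.616). Larger chunks also slow equidistant cylindrical pyramids more than Web Mercator pyramids: about 78 ms per MB versus about 43 ms per MB.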

model = smf.ols(
    "duration ~ actual_chunk_size * C(projection) + C(projection) * C(zoom)", data=df
).fit()
model.summary()

OLS Regression Results
Dep. Variable:	duration	R-squared:	0.729
Model:	OLS	Adj. R-squared:	0.726
Method:	Least Squares	F-statistic:	302.5
Date:	Tue, 29 Aug 2023	Prob (F-statistic):	5.84e-280
Time:	20:29:51	Log-Likelihood:	-7773.7
No. Observations:	1024	AIC:	1.557e+04
Df Residuals:	1014	BIC:	1.562e+04
Df Model:	9
Covariance Type:	nonrobust

	coef	std err	t	P>|t|	[0.025	0.975]
Intercept	1760.2638	48.690	36.152	0.000	1664.719	1855.809
C(projection)[T.4326]	-311.9528	68.858	-4.530	0.000	-447.074	-176.832
C(zoom)[T.1.0]	1043.0435	60.224	17.319	0.000	924.866	1161.221
C(zoom)[T.2.0]	239.5532	60.224	3.978	0.000	121.376	357.731
C(zoom)[T.3.0]	-186.3396	60.224	-3.094	0.002	-304.517	-68.162
C(projection)[T.4326]:C(zoom)[T.1.0]	42.7015	85.169	0.501	0.616	-124.427	209.830
C(projection)[T.4326]:C(zoom)[T.2.0]	741.8614	85.169	8.710	0.000	574.733	908.990
C(projection)[T.4326]:C(zoom)[T.3.0]	1428.6270	85.169	16.774	0.000	1261.499	1595.755
actual_chunk_size	42.9576	2.426	17.710	0.000	38.198	47.717
actual_chunk_size:C(projection)[T.4326]	34.8665	3.430	10.164	0.000	28.135	41.598

Omnibus:	78.010	Durbin-Watson:	1.395
Prob(Omnibus):	0.000	Jarque-Bera (JB):	95.056
Skew:	-0.699	Prob(JB):	2.29e-21
Kurtosis:	3.525	Cond. No.	151.

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
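The interaction coefficients above are easier to interpret once combined. The sketch below hard-codes the coefficients from the summary table and derives the per-MB slope for each projection and the modeled gap between projections at each zoom level. The helper name `projection_gap` and the 10 MB chunk size are illustrative assumptions, and the baseline projection is again assumed to be Web Mercator.

```python
# Coefficients copied from the interaction model summary above (milliseconds).
coef = {
    "proj": -311.9528,  # C(projection)[T.4326] at zoom level 0
    "proj_zoom": {0: 0.0, 1: 42.7015, 2: 741.8614, 3: 1428.6270},
    "chunk": 42.9576,  # actual_chunk_size for the baseline projection
    "chunk_proj": 34.8665,  # actual_chunk_size:C(projection)[T.4326]
}

# Per-MB slope of rendering time for each projection.
slope_baseline = coef["chunk"]
slope_4326 = coef["chunk"] + coef["chunk_proj"]
print(f"ms per MB: baseline {slope_baseline:.1f}, EPSG:4326 {slope_4326:.1f}")


def projection_gap(zoom: int, chunk_size_mb: float) -> float:
    """Hypothetical helper: modeled EPSG:4326 minus baseline duration (ms)."""
    return coef["proj"] + coef["proj_zoom"][zoom] + coef["chunk_proj"] * chunk_size_mb


# Illustrative chunk size of 10 MB (an assumption, not a benchmark setting).
for zoom in range(4):
    print(f"zoom {zoom}: {projection_gap(zoom, chunk_size_mb=10):.1f} ms")
```

Under these coefficients the modeled gap is small at zoom levels 0 and 1 and grows sharply at zoom levels 2 and 3, in line with the per-zoom box plots.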