import holoviews as hv
import hvplot
import hvplot.pandas # noqa
import pandas as pd
import statsmodels.formula.api as smf
= "holoviews" pd.options.plotting.backend
Benchmarking: Projection
Read the summary of all benchmarking results.
summary = pd.read_parquet("s3://carbonplan-benchmarks/benchmark-data/v0.2/summary.parq")
Subset the data to isolate the impact of projection and chunk size.
df = summary[(summary["pixels_per_tile"] == 128) & (summary["region"] == "us-west-2")]
Set plot options.
= ["#E66100", "#5D3A9B"]
cmap = {"width": 600, "height": 400} plt_opts
Create a box plot showing how the rendering time depends on Zarr version and chunk size.
df.hvplot.box(
    y="duration",
    by=["actual_chunk_size", "projection"],
    c="projection",
    cmap=cmap,
    ylabel="Time to render (ms)",
    xlabel="Chunk size (MB); EPSG number",
    legend=False,
).opts(**plt_opts)
Fit a multiple linear regression to the benchmarking results. The fit shows that both projection and chunk size strongly affect the rendering time, with web mercator pyramids rendering faster than equidistant cylindrical pyramids.
= smf.ols("duration ~ actual_chunk_size + C(projection)", data=df).fit()
model model.summary()
| Dep. Variable: | duration | R-squared: | 0.431 |
|---|---|---|---|
| Model: | OLS | Adj. R-squared: | 0.430 |
| Method: | Least Squares | F-statistic: | 387.2 |
| Date: | Tue, 29 Aug 2023 | Prob (F-statistic): | 7.16e-126 |
| Time: | 20:29:50 | Log-Likelihood: | -8152.5 |
| No. Observations: | 1024 | AIC: | 1.631e+04 |
| Df Residuals: | 1021 | BIC: | 1.633e+04 |
| Df Model: | 2 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| Intercept | 1864.6662 | 39.029 | 47.776 | 0.000 | 1788.080 | 1941.252 |
| C(projection)[T.4326] | 580.6684 | 43.438 | 13.368 | 0.000 | 495.430 | 665.906 |
| actual_chunk_size | 60.3908 | 2.474 | 24.408 | 0.000 | 55.536 | 65.246 |

| Omnibus: | 7.834 | Durbin-Watson: | 1.830 |
|---|---|---|---|
| Prob(Omnibus): | 0.020 | Jarque-Bera (JB): | 7.973 |
| Skew: | -0.210 | Prob(JB): | 0.0186 |
| Kurtosis: | 2.897 | Cond. No. | 31.2 |

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
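To pull these numbers out of the fitted model programmatically, the coefficient on the projection term and its confidence interval can be read directly from the results object. A minimal sketch, using the term name shown in the coefficient table above:

# Estimated rendering-time penalty (ms) for EPSG:4326 relative to EPSG:3857,
# with its 95% confidence interval, read from the fitted results object.
term = "C(projection)[T.4326]"
print(f"penalty: {model.params[term]:.1f} ms")
print(f"95% CI: {model.conf_int().loc[term].round(1).tolist()}")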
Show the rendering time at different zoom levels.
= {"width": 400, "height": 300}
plt_opts
= []
plts
for zoom_level in range(4):
= df[df["zoom"] == zoom_level]
df_level
plts.append(
df_level.hvplot.box(="duration",
y=["actual_chunk_size", "projection"],
by="projection",
c=cmap,
cmap="Time to render (ms)",
ylabel="Chunk size (MB); EPSG number",
xlabel=False,
legend=f"Zoom level {zoom_level}",
title**plt_opts)
).opts(
)2) hv.Layout(plts).cols(
Add multiplicative interaction terms with zoom level to the multiple linear regression. The results show that projection has a significant impact on rendering performance at higher zoom levels, with the most pronounced effect at zoom level 3. Larger chunk sizes also increase the rendering time of equidistant cylindrical pyramids more than that of web mercator pyramids.
model = smf.ols(
    "duration ~ actual_chunk_size * C(projection) + C(projection) * C(zoom)", data=df
).fit()
model.summary()
| Dep. Variable: | duration | R-squared: | 0.729 |
|---|---|---|---|
| Model: | OLS | Adj. R-squared: | 0.726 |
| Method: | Least Squares | F-statistic: | 302.5 |
| Date: | Tue, 29 Aug 2023 | Prob (F-statistic): | 5.84e-280 |
| Time: | 20:29:51 | Log-Likelihood: | -7773.7 |
| No. Observations: | 1024 | AIC: | 1.557e+04 |
| Df Residuals: | 1014 | BIC: | 1.562e+04 |
| Df Model: | 9 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| Intercept | 1760.2638 | 48.690 | 36.152 | 0.000 | 1664.719 | 1855.809 |
| C(projection)[T.4326] | -311.9528 | 68.858 | -4.530 | 0.000 | -447.074 | -176.832 |
| C(zoom)[T.1.0] | 1043.0435 | 60.224 | 17.319 | 0.000 | 924.866 | 1161.221 |
| C(zoom)[T.2.0] | 239.5532 | 60.224 | 3.978 | 0.000 | 121.376 | 357.731 |
| C(zoom)[T.3.0] | -186.3396 | 60.224 | -3.094 | 0.002 | -304.517 | -68.162 |
| C(projection)[T.4326]:C(zoom)[T.1.0] | 42.7015 | 85.169 | 0.501 | 0.616 | -124.427 | 209.830 |
| C(projection)[T.4326]:C(zoom)[T.2.0] | 741.8614 | 85.169 | 8.710 | 0.000 | 574.733 | 908.990 |
| C(projection)[T.4326]:C(zoom)[T.3.0] | 1428.6270 | 85.169 | 16.774 | 0.000 | 1261.499 | 1595.755 |
| actual_chunk_size | 42.9576 | 2.426 | 17.710 | 0.000 | 38.198 | 47.717 |
| actual_chunk_size:C(projection)[T.4326] | 34.8665 | 3.430 | 10.164 | 0.000 | 28.135 | 41.598 |

| Omnibus: | 78.010 | Durbin-Watson: | 1.395 |
|---|---|---|---|
| Prob(Omnibus): | 0.000 | Jarque-Bera (JB): | 95.056 |
| Skew: | -0.699 | Prob(JB): | 2.29e-21 |
| Kurtosis: | 3.525 | Cond. No. | 151. |

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
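As a rough check on the interaction terms, the fitted coefficients can be combined to estimate the EPSG:4326 penalty at each zoom level. A minimal sketch, assuming the term names shown in the coefficient table above (zoom is stored as a float, hence the .0 suffix) and an illustrative 1 MB chunk size:

# Estimated EPSG:4326 penalty (ms) relative to EPSG:3857 at each zoom level,
# combining the projection main effect, the chunk-size interaction at an
# illustrative 1 MB chunk, and the projection-by-zoom interaction terms.
chunk_size_mb = 1.0  # hypothetical example value
params = model.params
for zoom in range(4):
    penalty = (
        params["C(projection)[T.4326]"]
        + chunk_size_mb * params["actual_chunk_size:C(projection)[T.4326]"]
    )
    if zoom > 0:  # zoom 0 is the reference level and has no interaction term
        penalty += params[f"C(projection)[T.4326]:C(zoom)[T.{zoom}.0]"]
    print(f"zoom {zoom}: ~{penalty:.0f} ms")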