import hvplot
import hvplot.pandas # noqa
import pandas as pd
import statsmodels.formula.api as smf
pd.options.plotting.backend = "holoviews"

Benchmarking: Pixels per Tile
Read the summary of all benchmarking results.
summary = pd.read_parquet("s3://carbonplan-benchmarks/benchmark-data/v0.2/summary.parq")

Subset the data to isolate the impact of the number of pixels per tile and chunk size.
df = summary[
    (summary["projection"] == 3857) & (summary["region"] == "us-west-2")
].sort_values(by=["target_chunk_size", "pixels_per_tile"])

Set plot options.
cmap = ["#994F00", "#006CD1"]
plt_opts = {"width": 600, "height": 400}

Create a box plot showing how the rendering time depends on the number of pixels per tile and chunk size.
df.hvplot.box(
    y="duration",
    by=["actual_chunk_size", "pixels_per_tile"],
    c="pixels_per_tile",
    cmap=cmap,
    ylabel="Time to render (ms)",
    xlabel="Chunk size (MB); Pixels per tile",
    legend=False,
).opts(**plt_opts)

Fit a multiple linear regression to the results. The results show that the number of pixels per tile, independent of the chunk size, does not significantly impact rendering time (p = 0.929). Datasets with larger chunks take longer to render: the fitted slope is about 43 ms per MB of chunk size.
model = smf.ols("duration ~ actual_chunk_size + C(pixels_per_tile)", data=df).fit()
model.summary()

| Dep. Variable: | duration | R-squared: | 0.275 |
| Model: | OLS | Adj. R-squared: | 0.273 |
| Method: | Least Squares | F-statistic: | 193.4 |
| Date: | Tue, 29 Aug 2023 | Prob (F-statistic): | 6.08e-72 |
| Time: | 20:29:05 | Log-Likelihood: | -7981.0 |
| No. Observations: | 1024 | AIC: | 1.597e+04 |
| Df Residuals: | 1021 | BIC: | 1.598e+04 |
| Df Model: | 2 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
| Intercept | 2031.8289 | 33.974 | 59.805 | 0.000 | 1965.162 | 2098.496 |
| C(pixels_per_tile)[T.256] | -3.3655 | 37.576 | -0.090 | 0.929 | -77.101 | 70.370 |
| actual_chunk_size | 43.2144 | 2.250 | 19.209 | 0.000 | 38.800 | 47.629 |

| Omnibus: | 35.870 | Durbin-Watson: | 2.052 |
| Prob(Omnibus): | 0.000 | Jarque-Bera (JB): | 39.124 |
| Skew: | 0.477 | Prob(JB): | 3.19e-09 |
| Kurtosis: | 2.907 | Cond. No. | 29.2 |
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
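
The coefficients, p-values, and confidence intervals in the summary above can also be read from the fitted results object, which is handy when comparing several fits. The following is a minimal sketch on synthetic data, not the benchmark results: the `toy` DataFrame, its values, the `128` reference level for `pixels_per_tile`, and the name `coef_table` are all assumptions made for illustration. Only the formula matches the one used above.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the benchmark table; every value here is made up.
rng = np.random.default_rng(0)
n = 400
toy = pd.DataFrame(
    {
        "actual_chunk_size": rng.choice([1.0, 5.0, 10.0, 25.0], size=n),
        # 128 is an assumed reference level; the source only shows the 256 level.
        "pixels_per_tile": rng.choice([128, 256], size=n),
    }
)
# By construction, duration depends on chunk size only (true slope of 40).
toy["duration"] = 2000 + 40 * toy["actual_chunk_size"] + rng.normal(0, 300, size=n)

toy_model = smf.ols(
    "duration ~ actual_chunk_size + C(pixels_per_tile)", data=toy
).fit()

# Gather the columns of the coefficient table that the interpretation relies on.
conf_int = toy_model.conf_int()  # columns 0 and 1 hold the 2.5% and 97.5% bounds
coef_table = pd.DataFrame(
    {
        "coef": toy_model.params,
        "p_value": toy_model.pvalues,
        "ci_low": conf_int[0],
        "ci_high": conf_int[1],
    }
)
print(coef_table)
```

Running the same lines against `model` instead of `toy_model` should give the values shown in the coefficient table above, in a form that can be filtered or plotted like any other DataFrame.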