Benchmarking: AWS Region

import hvplot
import hvplot.pandas  # noqa
import pandas as pd
import statsmodels.formula.api as smf

pd.options.plotting.backend = "holoviews"

Read the summary of all benchmarking results.

summary = pd.read_parquet("s3://carbonplan-benchmarks/benchmark-data/v0.2/summary.parq")

Subset the data to isolate the impact of AWS region and chunk size by holding the projection, pixels per tile, and shard size constant.

df = summary[
    (summary["projection"] == 3857)
    & (summary["pixels_per_tile"] == 128)
    & (summary["shard_size"] == 0)
]
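Before plotting, it can help to confirm that the filter really held the other factors constant and to see how many runs fall into each chunk size and region group. The sketch below applies the same filter to a tiny made-up table. The column names come from the notebook, but every value (including the `4326` projection and the `us-east-1` region) is a hypothetical placeholder, not benchmark data.

```python
import pandas as pd

# Hypothetical rows for illustration only; not real benchmark results.
summary = pd.DataFrame(
    {
        "projection": [3857, 3857, 4326, 3857],
        "pixels_per_tile": [128, 256, 128, 128],
        "shard_size": [0, 0, 0, 0],
        "region": ["us-west-2", "us-west-2", "us-east-1", "us-east-1"],
        "actual_chunk_size": [1.0, 5.0, 1.0, 5.0],
        "duration": [1900.0, 2100.0, 1950.0, 2150.0],
    }
)

# Same filter as in the notebook.
df = summary[
    (summary["projection"] == 3857)
    & (summary["pixels_per_tile"] == 128)
    & (summary["shard_size"] == 0)
]

# Each held-constant column should have exactly one distinct value left.
held_constant = ["projection", "pixels_per_tile", "shard_size"]
n_unique = df[held_constant].nunique()

# Number of runs in each (chunk size, region) group.
counts = df.groupby(["actual_chunk_size", "region"]).size()

print(n_unique)
print(counts)
```

Run against the real `summary` table, the same two checks show whether the groups are balanced across regions before the regression is interpreted.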

Set plot options.

cmap = ["#FFC20A", "#0C7BDC"]
plt_opts = {"width": 600, "height": 400}

Create a box plot showing how the rendering time depends on the AWS region and chunk size.

df.hvplot.box(
    y="duration",
    by=["actual_chunk_size", "region"],
    c="region",
    cmap=cmap,
    ylabel="Time to render (ms)",
    xlabel="Chunk size (MB); AWS region",
    legend=False,
).opts(**plt_opts)

Fit a multiple linear regression of rendering time on chunk size and AWS region. Chunk size strongly affects rendering time: each additional MB of chunk size adds roughly 52 ms (p < 0.001), so datasets with larger chunks take longer to render. The AWS region has no statistically significant effect on rendering time (p = 0.234).

model = smf.ols("duration ~ actual_chunk_size + C(region)", data=df).fit()
model.summary()

OLS Regression Results
Dep. Variable:	duration	R-squared:	0.446
Model:	OLS	Adj. R-squared:	0.444
Method:	Least Squares	F-statistic:	205.1
Date:	Tue, 29 Aug 2023	Prob (F-statistic):	4.58e-66
Time:	20:28:30	Log-Likelihood:	-3916.0
No. Observations:	512	AIC:	7838.
Df Residuals:	509	BIC:	7851.
Df Model:	2
Covariance Type:	nonrobust

	coef	std err	t	P>|t|	[0.025	0.975]
Intercept	1859.2163	40.422	45.995	0.000	1779.801	1938.631
C(region)[T.us-west-2]	-53.6344	44.989	-1.192	0.234	-142.021	34.752
actual_chunk_size	51.8170	2.563	20.221	0.000	46.782	56.852

Omnibus:	22.416	Durbin-Watson:	1.979
Prob(Omnibus):	0.000	Jarque-Bera (JB):	12.956
Skew:	0.227	Prob(JB):	0.00154
Kurtosis:	2.367	Cond. No.	31.2

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
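To make the `C(region)[T.us-west-2]` row easier to read, here is a minimal sketch of the dummy (treatment) coding that the formula applies to a categorical variable. It uses plain NumPy least squares rather than statsmodels, and the data are noise-free and invented: the baseline region name `us-east-1` and the coefficients 1800, 50, and -40 are assumptions chosen for illustration, not values from the benchmark.

```python
import numpy as np

# Hypothetical, noise-free data (assumption, not benchmark results):
# duration = 1800 + 50 * chunk_size - 40 * [region == "us-west-2"]
chunk_size = np.array([1.0, 5.0, 10.0, 25.0, 1.0, 5.0, 10.0, 25.0])
region = np.array(["us-east-1"] * 4 + ["us-west-2"] * 4)

# Treatment coding: the baseline region gets 0, the other region gets 1.
is_west = (region == "us-west-2").astype(float)
duration = 1800.0 + 50.0 * chunk_size - 40.0 * is_west

# Design matrix: intercept column, region dummy, chunk size.
X = np.column_stack([np.ones_like(chunk_size), is_west, chunk_size])
coef, *_ = np.linalg.lstsq(X, duration, rcond=None)
intercept, region_effect, chunk_slope = coef

print(intercept, region_effect, chunk_slope)
```

In this coding the intercept is the predicted duration for the baseline region at a chunk size of zero, the region coefficient is the offset for `us-west-2` relative to the baseline, and the slope is the change in duration per unit of chunk size, shared by both regions.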