Zarr Visualization Report

Methods and benchmarks for visualizing Zarr

This site documents different approaches and benchmarks for zarr visualization. If you use or create Zarr data and wish to visualize it in a web browser, this guide is for you. It describes the requirements for data pre-processing and chunking in order to support visualization through tiling server and dynamic client approahces.

Background + Motivation

Visualization is key to exploring and understanding Earth data. Web browsers offer a near-universal platform for data exploration. However, web users expect near instantaneous page rendering. The scale of geospatial data makes it challenging to serve this data instantenously as browsers cannot read, reproject and create image tiles fast enough for a good user experience.

Pre-generated Map Tiles - Drawbacks

The challenge of visualizing large geospatial datasets led to the development of pre-generated static map tiles. While pregenerated map tiles make it possible to visualize data quickly, there are drawbacks. The most significant drawback is the data provider chooses how the data will appear. Next generation approaches give that power to the user. Other drawbacks impact the data provider, such as storage costs and maintaining a pipeline to constantly update or reprocess the tile storage with new and updated data. But the user is impacted by having no power to adjust the visualization, such as modifying the color scale, color map or perform “band math” where multiple variables are combined to produce a new variable.

New Approaches

More recent years have seen the success of the dynamic tiling approach which allows for on-demand map tile creation. This approach has traditionally relied on reading data from Cloud-Optimized GeoTIFFs (COGs). When the Zarr data format gained popularity for large-scale N-dimensional data analysis, users started to ask for browser-based visualization. The conventional Zarr chunk size stored for analysis (~100MB) was acknowledged to be too large to be fetched by a browser.

Now there are two options for visualizing Zarr data: a tile server and dynamic client. rio_tiler’s XarrayReader supports tile rendering from anything that is xarray-readable. This means a tile server can render tiles from Zarr stores as well as netCDF4/HDF5 and other formats. However, a tile server still requires running a server while the second option, “dynamic client”, reads Zarr directly in the browser client and uses webGL to render map tiles.

Goals

This report will describe these two approaches. We will discuss the tradeoffs, pre-processing options and provide performance benchmarks for a variety of data configurations. We hope readers will be able to reuse lessons learned and recommendations to deliver their Zarr data to users in web browser and contribute to the wider adoption of this format for large scale environmental data understanding.