Download STAC assets

Programmatically download assets for local use
Author

Julia Signell

Published

July 31, 2023

Run this notebook

You can launch this notebook in VEDA JupyterHub by clicking the link below.

Launch in VEDA JupyterHub (requires access)

Inside the Hub

This notebook was written on the VEDA JupyterHub and is designed to run on a JupyterHub associated with an AWS IAM role that has been granted permissions to the VEDA data store via its bucket policy. The instance used provided 16 GB of RAM.

See [VEDA Analytics JupyterHub Access](https://nasa-impact.github.io/veda-docs/veda-jh-access.html) for information about how to gain access.

Outside the Hub

The data is in a protected bucket. Please request access by emailing aimee@developmentseed.org or alexandra@developmentseed.org and providing your affiliation, your interest in or expected use of the dataset, and an AWS IAM role or user Amazon Resource Name (ARN). The team will help you configure the Cognito client.

You should then run:

%run -i 'cognito_login.py'

Approach

This notebook shows how to download data for local use.

This is generally not the recommended approach. Whenever possible, it is better not to transfer large volumes of data out of the original physical storage location. Instead, users should practice data-proximate computing by processing in the same cloud and region as the data. That is why the data for VEDA are hosted in the same region as this VEDA JupyterHub instance.

However, sometimes you do need to download assets. This might be because the assets cannot be accessed directly from remote storage, or you don’t have access to an environment running in the same cloud/region.

For these special cases, this is how you go about downloading data:

  1. Use pystac_client to open and search the STAC catalog
  2. Use stac-asset to download the assets related to that search
  3. If you need the file on your local machine, zip and download the output directory

Note that the default examples environment does not include the stac-asset package, so install it with pip before importing it.

!pip install -q stac-asset
import stac_asset
from pystac_client import Client

Declare your collection of interest

You can discover available collections in the following ways:

  • Programmatically: see example in the list-collections.ipynb notebook
  • JSON API: https://staging-stac.delta-backend.com/collections
  • STAC Browser: http://veda-staging-stac-browser.s3-website-us-west-2.amazonaws.com
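
As a rough illustration of the JSON API route, the sketch below pulls collection IDs out of a `/collections` response using only the standard library. The helper names `collection_ids` and `fetch_collections` are hypothetical, and the sample payload is made up for illustration. It assumes the endpoint returns the standard STAC API shape, with a top-level `collections` list whose entries each carry an `id`.

```python
import json
from urllib.request import urlopen


def collection_ids(payload):
    """Return the sorted IDs found in a STAC API /collections response body."""
    return sorted(c["id"] for c in payload.get("collections", []))


def fetch_collections(url, timeout=30):
    """Fetch and decode a /collections response (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Made-up payload with the same shape as a real response, for illustration only.
sample = {
    "collections": [
        {"id": "caldor-fire-burn-severity", "title": "Hypothetical title"},
        {"id": "another-collection", "title": "Hypothetical title"},
    ],
    "links": [],
}
print(collection_ids(sample))
```

Against the live API this could be called as `collection_ids(fetch_collections("https://staging-stac.delta-backend.com/collections"))`. A STAC API may paginate this endpoint, in which case the sketch only sees the first page.
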

STAC_API_URL = "https://staging-stac.delta-backend.com/"
collection = "caldor-fire-burn-severity"
catalog = Client.open(STAC_API_URL)
search = catalog.search(collections=[collection])

print(f"Found {len(search.item_collection())} items")
Found 1 items

Download the assets

Once you have identified the items that you are interested in, use stac_asset to download the related assets.

await stac_asset.download_item_collection(
    search.item_collection(), 
    directory="data", 
    config=stac_asset.Config(make_directory=True, s3_requester_pays=True)
)

Note: For downloading just one item use stac_asset.download_item.
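
After a download it can be useful to confirm that every asset actually landed on disk. The following is a minimal sketch, not part of stac-asset itself: the helper name `local_asset_report` is hypothetical, and it assumes the download directory contains an `item-collection.json` whose asset `href` values are local paths (relative ones are resolved against the directory holding that file). Assets that still point at remote storage are skipped.

```python
import json
from pathlib import Path


def local_asset_report(item_collection_path):
    """Map "item_id/asset_key" to True if the local asset file exists."""
    path = Path(item_collection_path)
    features = json.loads(path.read_text())["features"]
    report = {}
    for item in features:
        for key, asset in item.get("assets", {}).items():
            href = asset["href"]
            # Skip assets that still point at remote storage (s3://, https://, ...).
            if "://" in href:
                continue
            # Relative hrefs are resolved against the JSON file's directory;
            # absolute paths are used as-is.
            local = path.parent / href
            report[f"{item['id']}/{key}"] = local.is_file()
    return report
```

For the download above, this would be called as `local_asset_report("data/item-collection.json")`. Any entry mapped to `False` indicates an asset whose file is missing.
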

Download from JupyterHub

If you want to further download from this JupyterHub to your local machine you can zip the data directory:

!zip -r data.zip data
updating: data/ (stored 0%)
updating: data/item-collection.json (deflated 74%)
updating: data/bs_to_save/ (stored 0%)
updating: data/bs_to_save/bs_to_save.tif (deflated 16%)
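
If the `zip` command is not available in your environment, a similar archive can be built from Python with the standard library. This is a minimal sketch under that assumption: the helper name `zip_directory` is hypothetical, and it writes the archive next to the directory being zipped (so `data` becomes `data.zip`).

```python
import shutil
from pathlib import Path


def zip_directory(directory):
    """Create <directory>.zip next to `directory` and return the archive path."""
    directory = Path(directory).resolve()
    if not directory.is_dir():
        raise NotADirectoryError(str(directory))
    archive = shutil.make_archive(
        base_name=str(directory),        # archive path without the .zip suffix
        format="zip",
        root_dir=str(directory.parent),  # paths in the archive are relative to this
        base_dir=directory.name,         # only this folder is included
    )
    return Path(archive)
```

Calling `zip_directory("data")` keeps the top-level `data/` folder inside the archive, matching the layout produced by `zip -r data.zip data`.
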

Then right-click on the zipped file in the Jupyter file browser and select “Download”.

Right click on zip file to see options that include “Download”