STAC Collection Creation

Starting point for data providers who want to add a new dataset to the STAC API.
Author

Julia Signell

Published

June 12, 2023

Run this notebook

You can launch this notebook in VEDA JupyterHub by clicking the link below.

Launch in VEDA JupyterHub (requires access)

Learn more

Inside the Hub

This notebook was written on a VEDA JupyterHub instance.

See VEDA Analytics JupyterHub Access (https://nasa-impact.github.io/veda-docs/veda-jh-access.html) for information about how to gain access.

Outside the Hub

You are welcome to run this anywhere you like (for example on https://daskhub.veda.smce.nasa.gov/, MAAP, locally, …). Just make sure that the data is accessible, or get in contact with the VEDA team to enable access.

Install extra packages

!pip install -U pystac nbss-upload --quiet
from datetime import datetime, timezone
import pystac

Create pystac.Collection

In this section we will create a pystac.Collection object. This is the part of the notebook that you should update for your own dataset.

Declare constants

Start by declaring some string and boolean fields.

COLLECTION_ID = "no2-monthly-diff"
TITLE = "NO₂ (Diff)"
DESCRIPTION = (
    "This layer shows changes in nitrogen dioxide (NO₂) levels. Redder colors "
    "indicate increases in NO₂. Bluer colors indicate lower levels of NO₂. "
    "Missing pixels indicate areas of no data most likely associated with "
    "cloud cover or snow."
)
DASHBOARD__IS_PERIODIC = True
DASHBOARD__TIME_DENSITY = "month"
LICENSE = "CC0-1.0"

Extents

The extents indicate the start (and potentially end) times of the data as well as the footprint of the data.

# Time must be in UTC
demo_time = datetime.now(tz=timezone.utc)

extent = pystac.Extent(
    pystac.SpatialExtent([[-180.0, -90.0, 180.0, 90.0]]),
    pystac.TemporalExtent([[demo_time, None]]),
)
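The demo above uses the current time as the start of the temporal extent. For a real dataset you would normally supply the actual start (and, if known, end) of the data. The following is a minimal sketch of preparing those values using only the standard library. The helper names `to_utc` and `bbox_is_plausible` and the dates are hypothetical, and treating naive datetimes as UTC is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone


def to_utc(dt: datetime) -> datetime:
    """Return dt as a timezone-aware UTC datetime.

    Naive datetimes are ASSUMED to already be in UTC (an assumption of this sketch).
    """
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def bbox_is_plausible(bbox: list[float]) -> bool:
    """Check a 2D [west, south, east, north] bounding box given in degrees."""
    west, south, east, north = bbox
    return (
        -180.0 <= west <= 180.0
        and -180.0 <= east <= 180.0
        and -90.0 <= south <= north <= 90.0
    )


# Hypothetical dates, for illustration only.
start = to_utc(datetime(2015, 1, 1))
end = to_utc(datetime(2022, 12, 31, 19, 0, tzinfo=timezone(timedelta(hours=-5))))

# Same nested-list shape as the interval passed to pystac.TemporalExtent above.
interval = [[start, end]]
print(interval[0][0].isoformat(), interval[0][1].isoformat())
print(bbox_is_plausible([-180.0, -90.0, 180.0, 90.0]))
```

The resulting `interval` could be passed to pystac.TemporalExtent in place of `[[demo_time, None]]`. Keep `None` as the end value if the dataset is still being extended.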

Providers

We know that the data host, processor, and producer is “VEDA”, but you can include other providers that fill other roles in the data creation pipeline.

providers = [
    pystac.Provider(
        name="VEDA",
        roles=[pystac.ProviderRole.PRODUCER, pystac.ProviderRole.PROCESSOR, pystac.ProviderRole.HOST],
        url="https://github.com/nasa-impact/veda-data-pipelines",
    )
]
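When a provider list is assembled from plain dictionaries (for example, read from a configuration file) rather than from pystac.ProviderRole members, a misspelled role is easy to miss. The sketch below checks role strings against the four roles defined by the STAC specification. The helper `invalid_roles` and the second provider entry are hypothetical and only for illustration.

```python
# The four provider roles defined by the STAC specification.
VALID_ROLES = {"licensor", "producer", "processor", "host"}


def invalid_roles(provider: dict) -> list[str]:
    """Return the role strings of a provider dict that are not valid STAC roles."""
    return [role for role in provider.get("roles", []) if role not in VALID_ROLES]


providers_as_dicts = [
    {
        "name": "VEDA",
        "roles": ["producer", "processor", "host"],
        "url": "https://github.com/nasa-impact/veda-data-pipelines",
    },
    # Hypothetical provider with a deliberately misspelled role.
    {"name": "Example Data Center", "roles": ["licenser"]},
]

for provider in providers_as_dicts:
    print(provider["name"], invalid_roles(provider))
```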

Put it together

Now take your constants, the extent, and the providers and create a pystac.Collection.

collection = pystac.Collection(
    id=COLLECTION_ID,
    title=TITLE,
    description=DESCRIPTION,
    extra_fields={
        "dashboard:is_periodic": DASHBOARD__IS_PERIODIC,
        "dashboard:time_density": DASHBOARD__TIME_DENSITY,
    },
    license=LICENSE,
    extent=extent,
    providers=providers,
)

Try it out!

Now that you have a collection, you can try it out to make sure that it looks the way you expect and that it passes validation checks.

collection.validate()
['https://schemas.stacspec.org/v1.0.0/collection-spec/json-schema/collection.json']
collection.to_dict()
{'type': 'Collection',
 'id': 'no2-monthly-diff',
 'stac_version': '1.0.0',
 'description': 'This layer shows changes in nitrogen dioxide (NO₂) levels. Redder colors indicate increases in NO₂. Bluer colors indicate lower levels of NO₂. Missing pixels indicate areas of no data most likely associated with cloud cover or snow.',
 'links': [],
 'dashboard:is_periodic': True,
 'dashboard:time_density': 'month',
 'title': 'NO₂ (Diff)',
 'extent': {'spatial': {'bbox': [[-180.0, -90.0, 180.0, 90.0]]},
  'temporal': {'interval': [['2023-06-12T17:36:30.161697Z', None]]}},
 'license': 'CC0-1.0',
 'providers': [{'name': 'VEDA',
   'roles': [<ProviderRole.PRODUCER: 'producer'>,
    <ProviderRole.PROCESSOR: 'processor'>,
    <ProviderRole.HOST: 'host'>],
   'url': 'https://github.com/nasa-impact/veda-data-pipelines'}]}
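Beyond schema validation, it can help to confirm that the dictionary carries every field you meant to set, including the `dashboard:` extras. This is a minimal sketch that works on a plain dictionary. The helper `missing_fields` is hypothetical, and treating the two `dashboard:` fields as expected is an assumption based on the fields set in this notebook.

```python
# Fields that the STAC 1.0.0 Collection specification marks as required.
REQUIRED_FIELDS = (
    "type", "stac_version", "id", "description", "license", "extent", "links",
)
# Extra fields set in this notebook; expecting them is an assumption of this sketch.
DASHBOARD_FIELDS = ("dashboard:is_periodic", "dashboard:time_density")


def missing_fields(collection_dict: dict, fields=REQUIRED_FIELDS + DASHBOARD_FIELDS) -> list[str]:
    """Return the names in `fields` that are absent from the collection dictionary."""
    return [name for name in fields if name not in collection_dict]


# Trimmed stand-in for the output of collection.to_dict() shown above.
example = {
    "type": "Collection",
    "id": "no2-monthly-diff",
    "stac_version": "1.0.0",
    "description": "This layer shows changes in nitrogen dioxide (NO₂) levels.",
    "links": [],
    "dashboard:is_periodic": True,
    "extent": {
        "spatial": {"bbox": [[-180.0, -90.0, 180.0, 90.0]]},
        "temporal": {"interval": [["2023-06-12T17:36:30.161697Z", None]]},
    },
    "license": "CC0-1.0",
}

print(missing_fields(example))
```

In the notebook itself, the same helper could be called as `missing_fields(collection.to_dict())`.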

Upload this notebook

You can upload the notebook anywhere you like, but one of the easiest options is the notebook sharing space. Just change the following cell from “Raw” to “Code”, run it, and copy the output link.

Before uploading, make sure that: 1) you have not hard-coded any secrets or access keys; 2) you have saved this notebook (hint: ctrl+s will do it).

!nbss-upload new-collection.ipynb
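The check for hard-coded secrets can be partly automated before uploading. The sketch below scans the code cells of a notebook, loaded as JSON, for a couple of suspicious patterns. The function name `find_suspicious_lines` and the patterns are illustrative assumptions, not an exhaustive secret scanner, so a manual review is still needed.

```python
import json
import re

# Illustrative patterns only; real secrets can take many other forms.
SUSPICIOUS_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # typical shape of an AWS access key ID
    re.compile(r"(?i)(secret|password|token)\w*\s*=\s*['\"][^'\"]+['\"]"),
]


def find_suspicious_lines(notebook: dict) -> list[str]:
    """Return lines from code cells that match any suspicious pattern."""
    hits = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # In .ipynb files, "source" is either a string or a list of strings.
        text = "".join(source) if isinstance(source, list) else source
        for line in text.splitlines():
            if any(pattern.search(line) for pattern in SUSPICIOUS_PATTERNS):
                hits.append(line.strip())
    return hits


# Tiny in-memory notebook for demonstration. For a real file, load it with:
#     with open("new-collection.ipynb") as f: notebook = json.load(f)
demo_notebook = {
    "cells": [
        {"cell_type": "code", "source": ["import pystac\n", "api_token = 'abc123'\n"]},
        {"cell_type": "markdown", "source": ["password = 'in prose, ignored'"]},
    ]
}
print(find_suspicious_lines(demo_notebook))
```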