Visualizing NCEO Aboveground Woody Biomass 2017 prioritization areas

Explores NCEO Aboveground Woody Biomass priority areas in Guinea using zonal statistics.
Author

Kathryn Berger

Published

March 20, 2023

Run this notebook

You can launch this notebook in VEDA JupyterHub by clicking the link below.

Launch in VEDA JupyterHub (requires access)

Learn more

Inside the Hub

This notebook was written on the VEDA JupyterHub and as such is designed to be run on a jupyterhub which is associated with an AWS IAM role which has been granted permissions to the VEDA data store via its bucket policy. The instance used provided 16GB of RAM.

See (VEDA Analytics JupyterHub Access)[https://nasa-impact.github.io/veda-docs/veda-jh-access.html] for information about how to gain access.

Outside the Hub

The data is in a protected bucket. Please request access by emailng aimee@developmentseed.org or alexandra@developmentseed.org and providing your affiliation, interest in or expected use of the dataset and an AWS IAM role or user Amazon Resource Name (ARN). The team will help you configure the cognito client.

You should then run:

%run -i 'cognito_login.py'

Approach

  1. Query STAC API and explore item contents for a given collection
  2. Read and access the data
  3. Visualize the collection with hvplot
  4. Run zonal statistics on collection using rasterstats
  5. Visualize resultant zonal statistics on a choropleth map

About the Data

The NCEO Aboveground Woody Biomass 2017 dataset is a map for Africa at 100 m spatial resolution which was developed using a combination of LiDAR, Synthetic Aperture Radar (SAR) and optical based data. Aboveground woody biomass (AGB) plays an key role in the study of the Earth’s carbon cycle and response to climate change. Estimation based on Earth Observation measurements is an effective method for regional scale studies and the results are expressed as dry matter in Mg ha-1.

Important Note: Users of this dataset should keep in mind that the map is a continental-scale dataset, generated using a combination of different remote sensing data types, with a single method for the whole study area produced in 2017. Users, therefore, should understand that accuracy may vary for different regions and vegetation types.

The Case Study - Guinea

Mapping and understanding the spatial distribution of AGB is key to understanding carbon dioxide emissions from tropical deforestation through the loss of woody carbon stocks. The resulting carbon fluxes from these land-use changes and vegetation degradation can have negative impacts on the global carbon cycle. Change analysis between AGB maps overtime can display losses in high biomass forests, due to suspected deforestation and forest degredation.

The forests of southern Guinea are reported to have some of the highest density AGB of any forest in the world and are one of the most threatened ecoregions in Africa. Importantly, this area was also the epicenter of the 2014 Ebola outbreak, which had a lasting impact on the region. There is more and more evidence that human deforestation activities in this area may have accelerated the spread of the deadly virus as a result of increasing human-bat interactions in the region.

In this example we explore the NCEO AGB dataset for 2017, running zonal statistics at the district (administrative 2) level to understand those areas in Guinea that need greatest prioritization for protection and conservation.

Setting up the Environment

To run zonal statistics we’ll need to import a python package called rasterstats into our environment. You can uncomment the following line for installation. This cell needs only needs to be run once.

!pip install rasterstats
Requirement already satisfied: rasterstats in /srv/conda/envs/notebook/lib/python3.10/site-packages (0.18.0)
Requirement already satisfied: fiona<1.9 in /srv/conda/envs/notebook/lib/python3.10/site-packages (from rasterstats) (1.8.22)
Requirement already satisfied: shapely in /srv/conda/envs/notebook/lib/python3.10/site-packages (from rasterstats) (1.8.5.post1)
Requirement already satisfied: cligj>=0.4 in /srv/conda/envs/notebook/lib/python3.10/site-packages (from rasterstats) (0.7.2)
Requirement already satisfied: simplejson in /srv/conda/envs/notebook/lib/python3.10/site-packages (from rasterstats) (3.19.1)
Requirement already satisfied: affine<3.0 in /srv/conda/envs/notebook/lib/python3.10/site-packages (from rasterstats) (2.3.1)
Requirement already satisfied: click>7.1 in /srv/conda/envs/notebook/lib/python3.10/site-packages (from rasterstats) (8.1.3)
Requirement already satisfied: numpy>=1.9 in /srv/conda/envs/notebook/lib/python3.10/site-packages (from rasterstats) (1.23.5)
Requirement already satisfied: rasterio>=1.0 in /srv/conda/envs/notebook/lib/python3.10/site-packages (from rasterstats) (1.3.4)
Requirement already satisfied: attrs>=17 in /srv/conda/envs/notebook/lib/python3.10/site-packages (from fiona<1.9->rasterstats) (22.2.0)
Requirement already satisfied: six>=1.7 in /srv/conda/envs/notebook/lib/python3.10/site-packages (from fiona<1.9->rasterstats) (1.16.0)
Requirement already satisfied: setuptools in /srv/conda/envs/notebook/lib/python3.10/site-packages (from fiona<1.9->rasterstats) (65.6.3)
Requirement already satisfied: certifi in /srv/conda/envs/notebook/lib/python3.10/site-packages (from fiona<1.9->rasterstats) (2022.12.7)
Requirement already satisfied: click-plugins>=1.0 in /srv/conda/envs/notebook/lib/python3.10/site-packages (from fiona<1.9->rasterstats) (1.1.1)
Requirement already satisfied: munch in /srv/conda/envs/notebook/lib/python3.10/site-packages (from fiona<1.9->rasterstats) (2.5.0)
Requirement already satisfied: snuggs>=1.4.1 in /srv/conda/envs/notebook/lib/python3.10/site-packages (from rasterio>=1.0->rasterstats) (1.4.7)
Requirement already satisfied: pyparsing>=2.1.6 in /srv/conda/envs/notebook/lib/python3.10/site-packages (from snuggs>=1.4.1->rasterio>=1.0->rasterstats) (3.0.9)

Querying the STAC API

from pystac_client import Client
# Provide STAC API endpoint
STAC_API_URL = "https://staging-stac.delta-backend.com/"

# Declare collection of interest - NCEO Biomass
collection = "nceo_africa_2017"

Now let’s check how many total items are available.

search = Client.open(STAC_API_URL).search(collections=[collection])
items = list(search.items())
print(f"Found {len(items)} items")
Found 1 items

This makes sense as there is only one item available: a map for 2017.

# Explore the "cog_default" asset of one item to see what it contains
items[0].assets["cog_default"].to_dict()
{'href': 's3://nasa-maap-data-store/file-staging/nasa-map/nceo-africa-2017/AGB_map_2017v0m_COG.tif',
 'type': 'image/tiff; application=geotiff; profile=cloud-optimized',
 'title': 'Default COG Layer',
 'description': 'Cloud optimized default layer to display on map',
 'raster:bands': [{'scale': 1.0,
   'nodata': 'inf',
   'offset': 0.0,
   'sampling': 'area',
   'data_type': 'uint16',
   'histogram': {'max': 429.0,
    'min': 0.0,
    'count': 11.0,
    'buckets': [405348.0,
     44948.0,
     18365.0,
     6377.0,
     3675.0,
     3388.0,
     3785.0,
     9453.0,
     13108.0,
     1186.0]},
   'statistics': {'mean': 37.58407913145342,
    'stddev': 81.36678677343947,
    'maximum': 429.0,
    'minimum': 0.0,
    'valid_percent': 50.42436439336373}}],
 'roles': ['data', 'layer']}

Explore through the item’s assets. We can see from the data’s statistics values that the min and max values for the observed values range from 0 to 429 Mg ha-1.

Exploring the properties ["proj:epsg"] we also notice something strange, as the value is float and should be integer. We’ll revise this using the code below, but it will be revised upstream in the future.

items[0].properties["proj:epsg"] = int(items[0].properties["proj:epsg"])

Reading and accessing the data

Now that we’ve explored the dataset through the STAC API, let’s read and access the dataset itself.

# This is a workaround that is planning to move up into stackstac itself

import boto3
import stackstac
import rasterio as rio
import rioxarray

gdal_env = stackstac.DEFAULT_GDAL_ENV.updated(
    always=dict(
        AWS_NO_SIGN_REQUEST=True, session=rio.session.AWSSession(boto3.Session())
    )
)
da = stackstac.stack(items[0], gdal_env=gdal_env)
da
<xarray.DataArray 'stackstac-486a109067eae0fbd2de23580e0f93f3' (time: 1,
                                                                band: 1,
                                                                y: 81025,
                                                                x: 78078)>
dask.array<fetch_raster_window, shape=(1, 1, 81025, 78078), dtype=float64, chunksize=(1, 1, 1024, 1024), chunktype=numpy.ndarray>
Coordinates: (12/16)
  * time            (time) datetime64[ns] NaT
    id              (time) <U19 'AGB_map_2017v0m_COG'
  * band            (band) <U11 'cog_default'
  * x               (x) float64 -18.27 -18.27 -18.27 ... 51.86 51.86 51.86
  * y               (y) float64 37.73 37.73 37.73 37.73 ... -35.05 -35.05 -35.05
    proj:transform  object {0.0, 1.0, 37.73103856358817, 0.000898315284119521...
    ...              ...
    proj:geometry   object {'type': 'Polygon', 'coordinates': [[[-18.27352950...
    proj:shape      object {81024.0, 78077.0}
    title           <U17 'Default COG Layer'
    description     <U47 'Cloud optimized default layer to display on map'
    raster:bands    object {'scale': 1.0, 'nodata': 'inf', 'offset': 0.0, 'sa...
    epsg            int64 4326
Attributes:
    spec:        RasterSpec(epsg=4326, bounds=(-18.274427824843425, -35.05405...
    crs:         epsg:4326
    transform:   | 0.00, 0.00,-18.27|\n| 0.00,-0.00, 37.73|\n| 0.00, 0.00, 1.00|
    resolution:  0.0008983152841195214
# Create an AOI for our study area

# Guinea
guinea_aoi = {
    "type": "Feature",
    "properties": {},
    "geometry": {
        "coordinates": [
            [
                [-15.519958756713947, 12.732440363049193],
                [-15.519958756713947, 6.771426493209475],
                [-7.078554695621165, 6.771426493209475],
                [-7.078554695621165, 12.732440363049193],
                [-15.519958756713947, 12.732440363049193],
            ]
        ],
        "type": "Polygon",
    },
}
# Subset to bounding box of Guinea
subset = da.rio.clip([guinea_aoi["geometry"]])
subset
<xarray.DataArray 'stackstac-486a109067eae0fbd2de23580e0f93f3' (time: 1,
                                                                band: 1,
                                                                y: 6636, x: 9397)>
dask.array<getitem, shape=(1, 1, 6636, 9397), dtype=float64, chunksize=(1, 1, 1024, 1024), chunktype=numpy.ndarray>
Coordinates: (12/17)
  * time            (time) datetime64[ns] NaT
    id              (time) <U19 'AGB_map_2017v0m_COG'
  * band            (band) <U11 'cog_default'
  * x               (x) float64 -15.52 -15.52 -15.52 ... -7.081 -7.08 -7.079
  * y               (y) float64 12.73 12.73 12.73 12.73 ... 6.773 6.772 6.772
    proj:transform  object {0.0, 1.0, 37.73103856358817, 0.000898315284119521...
    ...              ...
    proj:shape      object {81024.0, 78077.0}
    title           <U17 'Default COG Layer'
    description     <U47 'Cloud optimized default layer to display on map'
    raster:bands    object {'scale': 1.0, 'nodata': 'inf', 'offset': 0.0, 'sa...
    epsg            int64 4326
    spatial_ref     int64 0
Attributes:
    spec:        RasterSpec(epsg=4326, bounds=(-18.274427824843425, -35.05405...
    resolution:  0.0008983152841195214
# select the band of interest, as there is only one in this dataset we'll select the default
data_band = subset.sel(band="cog_default")
data_band
<xarray.DataArray 'stackstac-486a109067eae0fbd2de23580e0f93f3' (time: 1,
                                                                y: 6636, x: 9397)>
dask.array<getitem, shape=(1, 6636, 9397), dtype=float64, chunksize=(1, 1024, 1024), chunktype=numpy.ndarray>
Coordinates: (12/17)
  * time            (time) datetime64[ns] NaT
    id              (time) <U19 'AGB_map_2017v0m_COG'
    band            <U11 'cog_default'
  * x               (x) float64 -15.52 -15.52 -15.52 ... -7.081 -7.08 -7.079
  * y               (y) float64 12.73 12.73 12.73 12.73 ... 6.773 6.772 6.772
    proj:transform  object {0.0, 1.0, 37.73103856358817, 0.000898315284119521...
    ...              ...
    proj:shape      object {81024.0, 78077.0}
    title           <U17 'Default COG Layer'
    description     <U47 'Cloud optimized default layer to display on map'
    raster:bands    object {'scale': 1.0, 'nodata': 'inf', 'offset': 0.0, 'sa...
    epsg            int64 4326
    spatial_ref     int64 0
Attributes:
    spec:        RasterSpec(epsg=4326, bounds=(-18.274427824843425, -35.05405...
    resolution:  0.0008983152841195214

Visualizing the NCEO Biomass 2017 layer for our study area in Guinea

Now that we’ve got the NCEO data layer subsetted for Guinea, let’s visualize it using hvplot.

import hvplot.xarray

biomass = data_band.squeeze()
biomass

biomass.hvplot(
    x="x",
    y="y",
    coastline=True,
    rasterize=True,
    cmap="viridis",
    widget_location="bottom",
    frame_width=600,
)