4.4 Read Sentinel-1 RTC data from Microsoft Planetary Computer#

This notebook demonstrates how to access Sentinel-1 RTC imagery from Microsoft Planetary Computer using stackstac. STAC stands for Spatio-Temporal Asset Catalog, it is a common framework to describe geospatial information and a way for data providers, developers, and users to work and exchange information efficiently.

A. Connect to Microsoft Planetary Computer

    1. Explore STAC metadata

B. Read data and create Xarray data cube

    1. Create a Dask distributed cluster

    1. Use stackstac to pull queried data from Planetary Computer

    1. Inspect dataset

C. Visualize data

    1. Ascending and descending pass acquisitions

    1. Variability over time

    1. Seasonal variability

Concepts

  • Querying large cloud-hosted dataset

  • Accessing cloud-hosted data stored as COGs (cloud-optimized GeoTIFFs)

  • Extracting and organizing metadata

Techniques

  • Introduction to working with STAC data

  • Using pystac_client to query cloud-hosted datasets, observe metadata

  • Using stackstac to read cloud-hosted data as xarray objects

  • Using xarray to manipulate and organize Sentinel-1 SAR data

  • Performing grouping and reductions on xarrray objects

  • Visualizing xarray objects using FacetGrid

Hide code cell source
%xmode minimal
import geopandas as gpd
import matplotlib.pyplot as plt
import numpy as np
import planetary_computer
from pystac_client import Client
import rich.table
import stackstac

import s1_tools
Hide code cell output
Exception reporting mode: Minimal

A. Connect to Microsoft Planetary Computer#

We use the pystac_client package to interact with and query the Microsoft Planetary Computer Sentinel-1 RTC dataset. In the cell below, we will create an object called catalog by calling the .open() method of the Client class. This establishes a connection with the data hosted at the url provided.

1) Explore STAC metadata#

catalog = Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")
catalog

Next, define some parameters to help us query the data catalog for a specific collection, time range, and geographic area of interest.

time_range = "2021-05-04/2022-05-21"
bbox = [88.214935, 27.92767, 88.302, 28.034]

Now, search the catalog for entries that match the specified criteria for collection (Sentinel-1 RTC), bbox (the AOI defined above) and datetime (the specified time range):

search = catalog.search(collections=["sentinel-1-rtc"], bbox=bbox, datetime=time_range)
items = search.item_collection()
len(items)
95

We’ve created a few more instances of pystac_client classes. Check out the object types below to better familiarize yourself with the STAC metadata objects.

print(type(catalog))
print(type(search))
print(type(items))
<class 'pystac_client.client.Client'>
<class 'pystac_client.item_search.ItemSearch'>
<class 'pystac.item_collection.ItemCollection'>

items is an instance of the class ItemCollection; we can explore it via the embedded html interface.

items

To make it easier to work with, we can convert the items object to a dictionary, and from there, to a geopandas.GeoDataFrame. The metadata from within each item of the ItemCollection object is present in the GeoDataFrame but its easier to scan and organize this way.

df = gpd.GeoDataFrame.from_features(items.to_dict(), crs="epsg:4326")
df.head(5)
geometry datetime platform s1:shape proj:bbox proj:shape end_datetime constellation s1:resolution proj:transform ... sar:resolution_range s1:product_timeliness sar:resolution_azimuth sar:pixel_spacing_range sar:observation_direction sar:pixel_spacing_azimuth sar:looks_equivalent_number s1:instrument_configuration_ID sat:platform_international_designator proj:code
0 POLYGON ((86.20642 27.84432, 86.21062 27.78391... 2022-05-21T12:14:09.786375Z SENTINEL-1A [28153, 21599] [421860.0, 2916220.0, 703390.0, 3132210.0] [21599, 28153] 2022-05-21 12:14:22.285304+00:00 Sentinel-1 high [10.0, 0.0, 421860.0, 0.0, -10.0, 3132210.0, 0... ... 20 Fast-24h 22 10 right 10 4.4 7 2014-016A EPSG:32645
1 POLYGON ((89.98022 27.01218, 90.99374 27.17283... 2022-05-16T12:06:02.314833Z SENTINEL-1A [28422, 21971] [618240.0, 2960440.0, 902460.0, 3180150.0] [21971, 28422] 2022-05-16 12:06:14.814536+00:00 Sentinel-1 high [10.0, 0.0, 618240.0, 0.0, -10.0, 3180150.0, 0... ... 20 Fast-24h 22 10 right 10 4.4 7 2014-016A EPSG:32645
2 POLYGON ((86.20632 27.84423, 86.21041 27.78364... 2022-05-09T12:14:08.980398Z SENTINEL-1A [28152, 21599] [421850.0, 2916220.0, 703370.0, 3132210.0] [21599, 28152] 2022-05-09 12:14:21.479341+00:00 Sentinel-1 high [10.0, 0.0, 421850.0, 0.0, -10.0, 3132210.0, 0... ... 20 Fast-24h 22 10 right 10 4.4 7 2014-016A EPSG:32645
3 POLYGON ((89.98426 27.01255, 90.99525 27.17279... 2022-05-04T12:06:01.122366Z SENTINEL-1A [28428, 21971] [619630.0, 2960660.0, 895970.0, 3178430.0] [28428, 21971] 2022-05-04 12:06:13.622093+00:00 Sentinel-1 high [10.0, 0.0, 618410.0, 0.0, -10.0, 3180150.0, 0... ... 20 Fast-24h 22 10 right 10 4.4 7 2014-016A EPSG:32645
4 POLYGON ((89.80391 27.02294, 89.82898 27.27428... 2022-04-30T00:03:27.201903Z SENTINEL-1A [27531, 21084] [524470.0, 2980840.0, 799780.0, 3191680.0] [21084, 27531] 2022-04-30 00:03:39.701264+00:00 Sentinel-1 high [10.0, 0.0, 524470.0, 0.0, -10.0, 3191680.0, 0... ... 20 Fast-24h 22 10 right 10 4.4 7 2014-016A EPSG:32645

5 rows × 36 columns

Construct a table with metadata for a single scene (ie. a single element of the list items). This is similar (but not identical) to the information stored in the file name and README files of the Sentinel-1 data we read from local storage.

table = rich.table.Table("key", "value")
for k, v in sorted(items[0].properties.items()):
    table.add_row(k, str(v))
table
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ key                                    value                                                       ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ constellation                         │ Sentinel-1                                                  │
│ datetime                              │ 2022-05-21T12:14:09.786375Z                                 │
│ end_datetime                          │ 2022-05-21 12:14:22.285304+00:00                            │
│ platform                              │ SENTINEL-1A                                                 │
│ proj:bbox                             │ [421860.0, 2916220.0, 703390.0, 3132210.0]                  │
│ proj:code                             │ EPSG:32645                                                  │
│ proj:shape                            │ [21599, 28153]                                              │
│ proj:transform                        │ [10.0, 0.0, 421860.0, 0.0, -10.0, 3132210.0, 0.0, 0.0, 1.0] │
│ s1:datatake_id                        │ 338944                                                      │
│ s1:instrument_configuration_ID        │ 7                                                           │
│ s1:orbit_source                       │ RESORB                                                      │
│ s1:processing_level                   │ 1                                                           │
│ s1:product_timeliness                 │ Fast-24h                                                    │
│ s1:resolution                         │ high                                                        │
│ s1:shape                              │ [28153, 21599]                                              │
│ s1:slice_number                       │ 6                                                           │
│ s1:total_slices                       │ 20                                                          │
│ sar:center_frequency                  │ 5.405                                                       │
│ sar:frequency_band                    │ C                                                           │
│ sar:instrument_mode                   │ IW                                                          │
│ sar:looks_azimuth                     │ 1                                                           │
│ sar:looks_equivalent_number           │ 4.4                                                         │
│ sar:looks_range                       │ 5                                                           │
│ sar:observation_direction             │ right                                                       │
│ sar:pixel_spacing_azimuth             │ 10                                                          │
│ sar:pixel_spacing_range               │ 10                                                          │
│ sar:polarizations                     │ ['VV', 'VH']                                                │
│ sar:product_type                      │ GRD                                                         │
│ sar:resolution_azimuth                │ 22                                                          │
│ sar:resolution_range                  │ 20                                                          │
│ sat:absolute_orbit                    │ 43309                                                       │
│ sat:orbit_state                       │ ascending                                                   │
│ sat:platform_international_designator │ 2014-016A                                                   │
│ sat:relative_orbit                    │ 12                                                          │
│ start_datetime                        │ 2022-05-21 12:13:57.287447+00:00                            │
└───────────────────────────────────────┴─────────────────────────────────────────────────────────────┘

We can also explore the object metadata outside of the table. Try typing .assets, .links, .STAC_extensions and .properties onto the term below. You can query the object programmatically for the same metadata stored in the table using dictionary syntax on the properties accessor.

items[0]
# items[0].assets
# items[0].links
# items[0].properties

Now that we’e explored the items that fit our query of the dataset and seen the metadata, let’s read the data using Xarray.

B. Read data with Xarray#

1) Create a Dask distributed cluster#

We use dask.distributed to parallelize operations in this notebook. Similarly to PySTAC, Dask distributed also uses a client. The dask distributed client allows us to interact with the scheduler that manages jobs across a cluster of compute resources whether they are cores on a local machine or virtual machines distributed across a cloud-compute platform.

# give dask client a name other than 'Client'
from dask.distributed import Client as da_Client

client = da_Client(processes=False)
print(client.dashboard_link)
http://192.168.86.116:8787/status

The client.dashboard_link points to a dask dashboard for the client we’ve just initialized.

2) Use stackstac to pull queried data from Planetary Computer#

Now that we have queried the data that is available from Microsoft Planetary Computer and inspected the metadata using pystac, we use stackstac to read the data into memory as an Xarray object. Calling stackstac.stack() takes a STAC collection and produces a lazy xarray.DataArray backed by Dask arrays. For more on Dask arrays see relevant concepts.

type(items)
pystac.item_collection.ItemCollection

More detail on stackstack.stac()#

In the code cell below, you can see that we pass the object items, a pystac.ItemCollection to stackstac.stack(). The wrapper planetary_computer.sign() uses Planetary Computer subscription key credentials to access the data.

stackstac passes the metadata from the STAC collection into the Xarray object as coordinates allowing you to further organize and manipulate the object to fit your purposes. stackstac can also read the data in according to parameters passed during the stack() call. In the code cell below we pass parameters for bounding box and coordinate reference system. To specify the resolution as something other than the resolution at which its stored, pass a resolution = argument.

Note

During development of this notebook, an intermittent ‘Broken Pipe’ error occurred on this step. We believe this is upstream of any of the code used in this tutorial, so if it happens, try restarting the kernel and running the notebook again.

da = stackstac.stack(
    planetary_computer.sign(items),
    bounds_latlon=bbox,
    epsg=32645,
)

Load the data into memory (this may take a few minutes):

da = da.compute()

3) Inspect dataset#

Let’s take a look at what stackstac.stack() returns.

We have a xr.DataArray, meaning there is only one data variable. It has ‘x’,‘y’,‘time’ dimensions as well as a ‘band’ dimension. This is how the VV and VH polarization images are stored. Note that this dataset doesn’t come with a layover-shadow map like the ASF processed Sentinel-1 RTC dataset.

In addition, there are 39 coordinate variables that provide information about the backscatter imagery and how it was collected and processed. In addition, the object has attrs that describe the spatial resolution and coordinate reference information.

C. Visualize data#

Let’s do a little bit of looking around. We’ll define a function to convert the backscatter pixel values from power to dB scale but we won’t use it yet. This transformation applies a logarithmic scale to the data which makes visualization easier but we do not want to run any summary statistics on the dB data as it will be distorted.

A note on visualizing SAR data

The measurements that we’re provided in the RTC dataset are in intensity, or power, scale. Often, to visualize SAR backscatter, the data is converted from power to normalized radar cross section (the backscatter coefficient). This is in decibel (dB) units, meaning a log transform has been applied. This transformation makes it easier to visualize variability but it is important not to calculate summary statistics on log-transformed data as it will be distorted. You can read more about these concepts here.

s1_tools.power_to_db(da.mean(dim="time")).plot(col="band", cmap=plt.cm.Greys_r);
../../_images/c9f5aea57d282288ee5d74e80ad8cb866ae4afa86f0a9db69957c16d1cbf452a.png

1) Ascending and descending pass acquisitions#

What if we wanted to look only at imagery taken during ascending or descending passes of the satellite? Because orbital direction is a non-dimensional coordinate, we need to use .where() instead of .sel() to subset the dataset according to orbital direction.

For more discussion on orbital direction and the differences between ascending and descending passes, see the previous notebook

da_asc = da.where(da["sat:orbit_state"] == "ascending", drop=True)
da_desc = da.where(da["sat:orbit_state"] == "descending", drop=True)

Plot the mean over time of ascending and descending pass scenes:

asc_min = np.array(
    [
        s1_tools.power_to_db(da_asc.sel(band="vv").mean(dim="time")).min(),
        s1_tools.power_to_db(da_asc.sel(band="vh").mean(dim="time")).min(),
    ]
).min()

asc_max = np.array(
    [
        s1_tools.power_to_db(da_asc.sel(band="vv").mean(dim="time")).max(),
        s1_tools.power_to_db(da_asc.sel(band="vh").mean(dim="time")).max(),
    ]
).max()

fig, axs = plt.subplots(ncols=2, figsize=(15, 7))
s1_tools.power_to_db(da_asc.sel(band="vv").mean(dim="time")).plot(
    cmap=plt.cm.Greys_r,
    ax=axs[0],
    cbar_kwargs=({"label": "dB"}),
    vmin=asc_min,
    vmax=asc_max,
)
s1_tools.power_to_db(da_asc.sel(band="vh").mean(dim="time")).plot(
    cmap=plt.cm.Greys_r, ax=axs[1], cbar_kwargs=({"label": "dB"})
)
fig.suptitle("Mean over time of ascending scenes")
axs[0].set_title("VV")
axs[1].set_title("VH")
for i in range(len(axs)):
    axs[i].tick_params(axis="x", rotation=45)
../../_images/abd507c5efd20c5c4d42b2bfe6b4dc9c1fe9e6fe3884c24e3cb22e0fc6a42be7.png
desc_min = np.array(
    [
        s1_tools.power_to_db(da_asc.sel(band="vv").mean(dim="time")).min(),
        s1_tools.power_to_db(da_asc.sel(band="vh").mean(dim="time")).min(),
    ]
).min()

desc_max = np.array(
    [
        s1_tools.power_to_db(da_asc.sel(band="vv").mean(dim="time")).max(),
        s1_tools.power_to_db(da_asc.sel(band="vh").mean(dim="time")).max(),
    ]
).max()

fig, axs = plt.subplots(ncols=2, figsize=(15, 7))
s1_tools.power_to_db(da_desc.sel(band="vv").mean(dim="time")).plot(
    cmap=plt.cm.Greys_r,
    ax=axs[0],
    cbar_kwargs=({"label": "dB"}),
    vmin=desc_min,
    vmax=desc_max,
)
s1_tools.power_to_db(da_desc.sel(band="vh").mean(dim="time")).plot(
    cmap=plt.cm.Greys_r,
    ax=axs[1],
    cbar_kwargs=({"label": "dB"}),
    vmin=desc_min,
    vmax=desc_max,
)
fig.suptitle("Mean over time of descending scenes")
axs[0].set_title("VV")
axs[1].set_title("VH")
for i in range(len(axs)):
    axs[i].tick_params(axis="x", rotation=45)
../../_images/3ac8b0f8e50e1fe330e1c954be8ecddd8c02c66e38fa9035131b1f167b935bff.png

It looks like there is some interesting variability between the two images. What if we wanted to see how these differences persist over time?

2) Variability over time#

fig, ax = plt.subplots(figsize=(12, 5))

ax.set_title("Mean backscatter for VH band (red) and VV band (blue) over time")
s1_tools.power_to_db(da.sel(band="vv").mean(dim=["x", "y"])).plot(
    ax=ax, linestyle="None", marker="o", color="blue", label="VV"
)
s1_tools.power_to_db(da.sel(band="vh").mean(dim=["x", "y"])).plot(
    ax=ax, linestyle="None", marker="o", color="red", label="VH"
)
fig.legend(loc="center right")
fig.suptitle("Mean backscatter of VH and VV bands over time, ASC scenes", fontsize=12, y=0.98)
ax.set_title(None)
ax.set_ylabel("dB");
../../_images/c2ad2e22c59f293ac5cf5c1652cb6a5b71e8ad7b47b9fd8fcb0ea358826c0929.png

Interesting! It looks like there is more variability in the VH band than the VV band. Over the year, there is about 4 dB variability in the VV band but over twice as much in the VH band. Chapter 2 of the SAR handbook contains information about how polarization impacts radar returns.

3) Seasonal variability#

Next, let’s take a look at how backscatter values vary seasonally. To do this we will use xarray’s groupby() and .facetgrid methods.

seasons_gb = da.groupby(da.time.dt.season).mean()
# add the attrs back to the season groupby object
seasons_gb.attrs = da.attrs
# re-order the seasons
seasons_gb = seasons_gb.reindex({"season": ["DJF", "MAM", "JJA", "SON"]})
seasons_gb
<xarray.DataArray 'stackstac-6a421b456743118552bcdd4a548b77cc' (season: 4,
                                                                band: 2,
                                                                y: 1542, x: 888)> Size: 88MB
array([[[[0.02535364, 0.03147986, 0.02786191, ..., 0.02108686,
          0.02130041, 0.02315306],
         [0.0218459 , 0.02203541, 0.01941934, ..., 0.02368083,
          0.02490576, 0.02375024],
         [0.02043244, 0.01921614, 0.01841479, ..., 0.02435656,
          0.02297878, 0.02517332],
         ...,
         [0.07336626, 0.05900169, 0.05184992, ..., 0.06449757,
          0.05422511, 0.04861424],
         [0.06321969, 0.06112303, 0.04925801, ..., 0.05013352,
          0.04173466, 0.03783966],
         [0.05632453, 0.04643786, 0.04184489, ..., 0.0426184 ,
          0.03918812, 0.03669885]],

        [[0.08964307, 0.11235596, 0.07842911, ..., 0.0773935 ,
          0.06486455, 0.06993567],
         [0.07961441, 0.08544813, 0.07430832, ..., 0.08535827,
          0.06914322, 0.07503611],
         [0.06167143, 0.07350944, 0.09076815, ..., 0.11603812,
          0.09522323, 0.08346861],
...
         [0.0467003 , 0.03025291, 0.02203605, ..., 0.02916168,
          0.03355919, 0.03477128],
         [0.0422203 , 0.03006489, 0.02622437, ..., 0.02509039,
          0.03020153, 0.02791584],
         [0.03369109, 0.02739136, 0.02804806, ..., 0.02638425,
          0.03162629, 0.02560615]],

        [[0.07472315, 0.08895492, 0.09268859, ..., 0.07098423,
          0.05950723, 0.0595387 ],
         [0.0703036 , 0.07719046, 0.08143807, ..., 0.07945435,
          0.0719438 , 0.07451397],
         [0.06143826, 0.06699089, 0.07739989, ..., 0.08420351,
          0.08209176, 0.08979529],
         ...,
         [0.1922634 , 0.120512  , 0.08602943, ..., 0.28120565,
          0.2469148 , 0.1406847 ],
         [0.15826532, 0.12039333, 0.09145802, ..., 0.24468837,
          0.1999128 , 0.1314866 ],
         [0.11499288, 0.09421477, 0.08536671, ..., 0.20733283,
          0.18289436, 0.13072233]]]])
Coordinates: (12/28)
  * band                                   (band) <U2 16B 'vh' 'vv'
  * x                                      (x) float64 7kB 6.194e+05 ... 6.28...
  * y                                      (y) float64 12kB 3.102e+06 ... 3.0...
  * season                                 (season) <U3 48B 'DJF' ... 'SON'
    sar:center_frequency                   float64 8B 5.405
    sar:observation_direction              <U5 20B 'right'
    ...                                     ...
    sar:resolution_range                   int64 8B 20
    sar:pixel_spacing_azimuth              int64 8B 10
    title                                  (band) <U41 328B 'VH: vertical tra...
    description                            (band) <U173 1kB 'Terrain-correcte...
    raster:bands                           object 8B {'nodata': -32768, 'data...
    epsg                                   int64 8B 32645
Attributes:
    spec:           RasterSpec(epsg=32645, bounds=(619419.5314582244, 3089781...
    crs:            epsg:32645
    transform:      | 9.79, 0.00, 619419.53|\n| 0.00,-7.70, 3101655.25|\n| 0....
    resolution_xy:  (9.788706070864338, 7.699930607059487)

Use the seasons groupby object and specify the season dimension in the Facetgrid call to automatically plot mean backscatter for each season:

fg_vv = s1_tools.power_to_db(seasons_gb.sel(band="vv")).plot(col="season", cmap=plt.cm.Greys_r);
../../_images/95beb60cff7ad9dfae5ff55b365c8a7d09c5804ded1b8de2262ad880e2dd1668.png
fg_vh = s1_tools.power_to_db(seasons_gb.sel(band="vh")).plot(col="season", cmap=plt.cm.Greys_r);
../../_images/f9887bd0b7ccb476f90e1c341a1eb8a106a49d10f55932b8c1ea4643689e1ad5.png

Conclusion#

This notebook demonstrated how to access cloud-hosted data from Microsoft Planetary Computer, some basic dataset organization and preliminary exploration and visualization. The following notebook will compare the Planetary Computer dataset to the ASF dataset.

We’ll use IPython storemagic to store the object we created in this notebook, da, to use in the comparison.

%store da
Stored 'da' (DataArray)