4.5 Comparing Sentinel-1 RTC datasets

So far in this tutorial, we’ve demonstrated how to read Sentinel-1 RTC imagery from two sources and assemble analysis-ready data cubes with appropriate metadata. Now, we’ll perform a comparison of the two datasets.

Dataset comparison

While the two datasets are very similar, there are a few key differences:

- They use different source images.
  - ASF Sentinel-1 RTC imagery is processed from Single Look Complex (SLC) images while Planetary Computer Sentinel-1 RTC imagery is processed from Ground Range Detected (GRD) images. SLC images contain both amplitude and phase information for each pixel. They are in radar coordinates and have not yet been multi-looked. In contrast, GRD images have been detected, multi-looked and projected to ground range.
- They use different digital elevation models (DEMs) for terrain correction.
  - ASF uses the GLO-30 Copernicus DEM while Planetary Computer uses a Planet DEM.
- The datasets have different pixel spacings. For Planetary Computer, the pixel spacing is 10 m in both range and azimuth directions. ASF has the option to produce images with 30 m, 20 m, or 10 m pixel spacing. The data used in this tutorial has 30 m pixel spacing. Note that pixel spacing involves tradeoffs in processing time and file size; see more discussion here.
- Each platform uses a different algorithm for RTC processing.
- The ASF dataset comes with an associated layover shadow map for each scene while the Planetary Computer dataset does not.

All of the above information, and much more detail about the processing methods for both datasets, is available in each dataset’s documentation pages.

Outline

A. Read and prepare data

1. Check coordinate reference system information

B. Ensure direct comparison between datasets

1. Subset time series to common time steps
2. Handle differences in spatial resolution
3. Mask missing data from one dataset

C. Combine objects

1. expand_dims() to add ‘source’ dimension
2. combine_by_coords()

D. Visualize comparisons

1. Mean over time
2. Mean over space

Learning goals

Concepts

Comparing and evaluating multiple datasets
Organizing data so that its structure matches your use-case

Techniques

Conditional selection based on non-dimensional coordinates using xr.Dataset.where()
Subsetting datasets based on dimensional coordinates using xr.DataArray.isin()
Adding dimensional and non-dimensional coordinates to xr.Dataset objects
Xarray plotting methods
Interpolating xarray objects onto a different grid using xr.DataArray.interp_like()
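
As a rough illustration of the first two techniques listed above, here is a minimal sketch on small synthetic cubes. The array values, the variable names, and the `orbit` coordinate are all hypothetical and stand in for the real Sentinel-1 metadata.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Toy cubes standing in for the two datasets; names and values are hypothetical.
time_a = pd.date_range("2021-01-01", periods=5, freq="12D")
time_b = time_a[[0, 2, 3]]  # the second source has fewer acquisitions

cube_a = xr.DataArray(
    np.ones((5, 2, 2)),
    dims=("time", "y", "x"),
    coords={
        "time": time_a,
        # Non-dimensional coordinate: one label per time step.
        "orbit": ("time", ["asc", "desc", "asc", "desc", "asc"]),
    },
)
cube_b = xr.DataArray(
    np.ones((3, 2, 2)), dims=("time", "y", "x"), coords={"time": time_b}
)

# Subset on a dimensional coordinate: keep only time steps present in both cubes.
in_both = cube_a.time.isin(cube_b.time)
common_a = cube_a.isel(time=in_both.values)

# Conditional selection on a non-dimensional coordinate. drop=True removes the
# time steps where the condition is False instead of filling them with NaN.
asc_only = cube_a.where(cube_a.orbit == "asc", drop=True)

print(common_a.sizes["time"], asc_only.sizes["time"])
```

Without `drop=True`, `where()` keeps the original shape and fills the non-matching time steps with NaN, which is often what you want when masking rather than subsetting.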

A. Read and prepare data

import matplotlib.pyplot as plt
import numpy as np
import rioxarray  # registers the .rio accessor used below
import xarray as xr

At the end of notebook 3, we wrote the analysis-ready ASF Sentinel-1 data cube, clipped to a smaller spatial area of interest, to disk. We’ll read it into memory now to use in this comparison.

We used the IPython %store magic to persist the Planetary Computer data cube created in notebook 4. We can read it into this notebook by adding -r to the same %store command. Read more about storemagic here.

%store -r da
pc_cube = da

pc_cube = pc_cube.compute()

timeseries_type = "full"

asf_cube = xr.open_dataset(
    f"../data/raster_data/{timeseries_type}_timeseries/intermediate_cubes/s1_asf_clipped_cube.zarr",
    engine="zarr",
    chunks="auto",
    decode_coords="all",
)

Rename the temporal dimension of the ASF dataset to match that of the PC dataset:

asf_cube = asf_cube.rename({"acq_date": "time"})

asf_cube = asf_cube.compute()

1) Check coordinate reference system information

First, make sure that both objects are projected to the same CRS.

assert pc_cube.rio.crs == asf_cube.rio.crs, "CRS of both data cubes are expected to be identical."

Let’s also check how missing data is handled in both objects. We want missing data to be assigned NaN values.

asf_cube["vv"].rio.nodata

nan

pc_cube.sel(band="vv").rio.nodata

The pc_cube array contains nan values, but it doesn’t have an encoding specifying what value is used to represent nodata. We can assign a nodata value to the dataset below. See Rioxarray’s Nodata Management documentation for more detail on this.

pc_cube.rio.write_nodata(np.nan, inplace=True)
pc_cube.rio.nodata

nan

# Both sources should use NaN as the nodata value for each polarization
assert np.isnan(asf_cube.vh.rio.nodata) and np.isnan(
    pc_cube.sel(band="vh").rio.nodata
), "Expected vh nodata value to be np.nan"
assert np.isnan(asf_cube.vv.rio.nodata) and np.isnan(
    pc_cube.sel(band="vv").rio.nodata
), "Expected vv nodata value to be np.nan"
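
Before plotting, steps B and C of the outline align the two cubes and combine them into a single object with a ‘source’ dimension (the plotting code in section D calls this object comparison_obj). The following is a minimal sketch of those steps on synthetic 2D arrays, not the tutorial's own code: the toy grids, the random values, and the variable name `backscatter` are assumptions made for illustration. The time-subsetting step is left out to keep the example short, and `interp_like()` needs SciPy installed.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)

# Toy stand-ins (hypothetical values): a 30 m grid and a 10 m grid, same extent.
coarse_xy = np.arange(0.0, 61.0, 30.0)  # 0, 30, 60
fine_xy = np.arange(0.0, 61.0, 10.0)    # 0, 10, ..., 60

asf_data = rng.random((3, 3))
asf_data[0, 0] = np.nan  # pretend one ASF pixel is missing
asf_toy = xr.DataArray(asf_data, dims=("y", "x"), coords={"y": coarse_xy, "x": coarse_xy})
pc_toy = xr.DataArray(rng.random((7, 7)), dims=("y", "x"), coords={"y": fine_xy, "x": fine_xy})

# B.2: put the finer cube onto the coarser grid (linear interpolation by default).
pc_on_asf = pc_toy.interp_like(asf_toy)

# B.3: keep only pixels that are valid in both sources.
valid = asf_toy.notnull() & pc_on_asf.notnull()
asf_masked = asf_toy.where(valid)
pc_masked = pc_on_asf.where(valid)

# C.1: give each cube a length-1 'source' dimension with its own label.
asf_src = asf_masked.expand_dims(source=["asf"])
pc_src = pc_masked.expand_dims(source=["pc"])

# C.2: combine along 'source'. Single-variable Datasets keep the call unambiguous.
combined = xr.combine_by_coords(
    [asf_src.to_dataset(name="backscatter"), pc_src.to_dataset(name="backscatter")]
)["backscatter"]

print(dict(combined.sizes))
```

Because the x and y coordinates are identical after interpolation, `combine_by_coords()` only has to concatenate along the one coordinate that differs, `source`.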

D. Visualize comparisons

We’re ready to visualize backscatter from both datasets. Because we’ve made a data cube whose dimensionality reflects the comparison, we can use Xarray’s plotting features and visualize the comparisons from a single object.

1) Mean over time

Look at VV backscatter first:

# Plot mean VV backscatter over time, converted to dB, with one panel per source
vv_fg = s1_tools.power_to_db(comparison_obj.sel(band="vv").mean(dim="time")).plot(
    col="source", cmap=plt.cm.Greys_r, cbar_kwargs={"label": "dB"}
)
# Format figure and axes
vv_fg.fig.suptitle("Comparing VV backscatter from ASF and PC datasets")
vv_fg.fig.supxlabel("X coordinate of projection (m)")
vv_fg.fig.supylabel("Y coordinate of projection (m)")
vv_fg.fig.set_figheight(7)
vv_fg.fig.set_figwidth(12)

# Clear per-panel axis labels in favor of the shared figure-level labels
for ax in vv_fg.axs[0]:
    ax.set_xlabel(None)
    ax.set_ylabel(None)
vv_fg.axs[0][0].set_title("ASF")
vv_fg.axs[0][1].set_title("PC");

[Figure: mean VV backscatter over time (dB), ASF (left) and PC (right)]
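
The plotting code above takes the temporal mean in power scale and only then converts to decibels with s1_tools.power_to_db(), whose definition is not shown in this section. The sketch below assumes it is the standard conversion, 10·log10(power); the function name `power_to_db_sketch` and the sample values are hypothetical. It shows why the order of the two operations matters.

```python
import numpy as np
import xarray as xr


def power_to_db_sketch(power: xr.DataArray) -> xr.DataArray:
    """Convert backscatter from power (linear) scale to decibels.

    Assumed to mirror s1_tools.power_to_db; that helper is not shown here.
    """
    return 10 * np.log10(power)


# Hypothetical power-scale samples for one pixel at three time steps.
power = xr.DataArray([0.01, 0.1, 1.0], dims="time")

db_of_mean = power_to_db_sketch(power.mean(dim="time"))  # average, then convert
mean_of_db = power_to_db_sketch(power).mean(dim="time")  # convert, then average

print(float(db_of_mean), float(mean_of_db))
```

The two results differ because the logarithm is nonlinear: averaging in dB is equivalent to a geometric mean in power scale, which weights bright outliers less than the arithmetic mean does.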

Then VH:

# Plot mean VH backscatter over time, converted to dB, with one panel per source
vh_fg = s1_tools.power_to_db(comparison_obj.sel(band="vh").mean(dim="time")).plot(
    col="source", cmap=plt.cm.Greys_r, cbar_kwargs={"label": "dB"}
)

# Format figure and axes
vh_fg.fig.suptitle("Comparing VH backscatter from ASF and PC datasets")
vh_fg.fig.supxlabel("X coordinate of projection (m)")
vh_fg.fig.supylabel("Y coordinate of projection (m)")
vh_fg.fig.set_figheight(7)
vh_fg.fig.set_figwidth(12)

# Clear per-panel axis labels in favor of the shared figure-level labels
for ax in vh_fg.axs[0]:
    ax.set_xlabel(None)
    ax.set_ylabel(None)
vh_fg.axs[0][0].set_title("ASF")
vh_fg.axs[0][1].set_title("PC");

[Figure: mean VH backscatter over time (dB), ASF (left) and PC (right)]

2) Mean over space

Instead of computing mean backscatter values along the time dimension, reduce along the spatial dimensions (x and y) to see backscatter variability over time:

fig, ax = plt.subplots(nrows=2, figsize=(14, 8), layout="constrained")
s1_tools.power_to_db(comparison_obj.sel(source="asf", band="vv").mean(dim=["x", "y"])).plot.scatter(
    x="time", ax=ax[0], label="asf", c="b", alpha=0.75
)
s1_tools.power_to_db(comparison_obj.sel(source="pc", band="vv").mean(dim=["x", "y"])).plot.scatter(
    x="time", ax=ax[0], label="pc", c="r", alpha=0.75
)

s1_tools.power_to_db(comparison_obj.sel(source="asf", band="vh").mean(dim=["x", "y"])).plot.scatter(
    x="time", ax=ax[1], label="asf", c="b", alpha=0.75
)
s1_tools.power_to_db(comparison_obj.sel(source="pc", band="vh").mean(dim=["x", "y"])).plot.scatter(
    x="time", ax=ax[1], label="pc", c="r", alpha=0.75
)
ax[0].legend(loc="lower right", bbox_to_anchor=([1, -0.25, 0, 0]))

for i in range(len(ax)):
    ax[i].set_xlabel(None)
    ax[i].set_ylabel("dB")

ax[0].set_title("VV")
ax[1].set_title("VH")

fig.supxlabel("Time")
# fig.supylabel('dB')
fig.suptitle(
    "Comparing mean VV and VH backscatter over time from PC (red) and ASF (blue) datasets",
    fontsize=14,
    y=1.05,
);

[Figure: spatially averaged VV (top) and VH (bottom) backscatter over time, PC in red and ASF in blue]
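
Because both sources live in one object with a ‘source’ dimension, the offset between the two time series can also be computed directly. This is a minimal sketch on a synthetic cube, assuming the 10·log10 power-to-dB conversion; the `toy` array and its values are hypothetical and do not reproduce the tutorial's results.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical combined cube with a 'source' dimension, in power scale.
rng = np.random.default_rng(1)
toy = xr.DataArray(
    rng.uniform(0.01, 0.5, size=(2, 4, 3, 3)),
    dims=("source", "time", "y", "x"),
    coords={
        "source": ["asf", "pc"],
        "time": pd.date_range("2021-01-01", periods=4, freq="12D"),
    },
)

# Spatial mean per source and time step, then convert to dB.
mean_db = 10 * np.log10(toy.mean(dim=["x", "y"]))

# ASF minus PC at each time step; positive values mean ASF is brighter.
diff_db = mean_db.sel(source="asf") - mean_db.sel(source="pc")

print(diff_db.values)
```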

We can also use hvplot to make an interactive visualization of this comparison:
