3. Exploratory data analysis of a single glacier

Introduction

Overview

In the previous notebook, we walked through initial steps to read and organize a large raster dataset, and to understand it in the context of spatial areas of interest represented by vector data.

The previous notebook mainly focused on high-level processing operations:

  1. reading a multi-dimensional dataset,

  2. re-organizing along a given dimension,

  3. reducing a large raster object to smaller areas of interest represented by vector data, and

  4. strategies for completing these steps with larger-than-memory datasets.

In this notebook, we will continue performing initial data inspection and exploratory analysis, but this time focused on velocity data clipped to an individual glacier. We demonstrate Xarray functionality for common computations and visualizations.

Outline

A. Data exploration

    1. Load raster data into memory and visualize with vector data

    2. Examine data coverage

    3. Break down by sensor

    4. Combine sensor-specific subsets

B. Examine velocity variability

    1. Histograms and summary statistics

    2. Spatial velocity variability

    3. Temporal velocity variability

C. Dimensional computations

    1. Temporal resampling

    2. Grouped analysis by season

Learning goals

Concepts

  • Examining metadata, interpreting physical observables in the context of available metadata

Techniques

  • Using matplotlib to visualize raster and vector data with satellite imagery basemaps


Expand the next cell to see specific packages used in this notebook and relevant system and version information.

%xmode minimal
import contextily as cx
import geopandas as gpd
import matplotlib.pyplot as plt
import numpy as np
import rioxarray as rio
import scipy.stats
import warnings
import xarray as xr

warnings.simplefilter(action='ignore', category=FutureWarning)
Exception reporting mode: Minimal

A. Data exploration#

1) Load raster data into memory and visualize with vector data#

single_glacier_raster = xr.open_zarr('../data/single_glacier_itslive.zarr')
single_glacier_vector = gpd.read_file('../data/single_glacier_vec.json')
single_glacier_raster.nbytes/ 1e9
3.326255548

The above code cells show that this dataset is about 3.3 GB. It contains observations derived from the Sentinel-1, Sentinel-2, and Landsat 4, 5, 7, 8 & 9 satellite sensors.

Next, we want to perform computations that require us to load this object into memory. To do this, we use the .compute() method, which turns a ‘lazy’ Dask-backed object into an in-memory object. If you try to run compute on too large an object, your computer may run out of RAM and the kernel used in this Python session will die (if this happens, click ‘restart kernel’ from the kernel drop-down menu above).

single_glacier_raster = single_glacier_raster.compute()

Now, if you expand the data object to look at the variables, you will see that they no longer hold dask.array objects.

single_glacier_raster
<xarray.Dataset> Size: 3GB
Dimensions:                     (mid_date: 47892, y: 37, x: 40)
Coordinates:
  * mid_date                    (mid_date) datetime64[ns] 383kB 1986-09-11T03...
    spatial_ref                 int64 8B 0
  * x                           (x) float64 320B 7.843e+05 ... 7.889e+05
  * y                           (y) float64 296B 3.316e+06 ... 3.311e+06
Data variables: (12/60)
    M11                         (mid_date, y, x) float32 284MB nan nan ... nan
    M11_dr_to_vr_factor         (mid_date) float32 192kB nan nan nan ... nan nan
    M12                         (mid_date, y, x) float32 284MB nan nan ... nan
    M12_dr_to_vr_factor         (mid_date) float32 192kB nan nan nan ... nan nan
    acquisition_date_img1       (mid_date) datetime64[ns] 383kB 1986-07-25T03...
    acquisition_date_img2       (mid_date) datetime64[ns] 383kB 1986-10-29T03...
    ...                          ...
    vy_error_modeled            (mid_date) float32 192kB 97.0 64.6 ... 930.7
    vy_error_slow               (mid_date) float32 192kB 30.3 18.9 ... 61.7 50.1
    vy_error_stationary         (mid_date) float32 192kB 30.2 18.9 ... 61.6 50.1
    vy_stable_shift             (mid_date) float32 192kB 4.1 -6.9 ... 91.2 91.2
    vy_stable_shift_slow        (mid_date) float32 192kB 4.2 -6.9 ... 91.2 91.2
    vy_stable_shift_stationary  (mid_date) float32 192kB 4.1 -6.9 ... 91.2 91.2
Attributes: (12/19)
    Conventions:                CF-1.8
    GDAL_AREA_OR_POINT:         Area
    author:                     ITS_LIVE, a NASA MEaSUREs project (its-live.j...
    autoRIFT_parameter_file:    http://its-live-data.s3.amazonaws.com/autorif...
    datacube_software_version:  1.0
    date_created:               25-Sep-2023 22:00:23
    ...                         ...
    s3:                         s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    skipped_granules:           s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    time_standard_img1:         UTC
    time_standard_img2:         UTC
    title:                      ITS_LIVE datacube of image pair velocities
    url:                        https://its-live-data.s3.amazonaws.com/datacu...
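
If you want to verify this programmatically rather than by inspecting the repr, a minimal check (a hypothetical snippet, not part of the original notebook) is to look at the type of the array underlying one of the variables:

#Check the underlying array type of the 'v' variable
type(single_glacier_raster['v'].data)
#Returns numpy.ndarray now; before calling .compute() it would be a dask array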

As a quick sanity check, we’ll convince ourselves that the clipping operation in the previous notebook worked correctly. We also show that we can plot both Xarray raster data and GeoPandas vector data overlaid on the same plot with a satellite image basemap as a background.

A few notes about the following figure:

  • single_glacier_raster is a 3-dimensional object. We want to plot it in 2-d space with the RGI glacier outline. So, we perform a reduction (in this case, compute the mean), in order to reduce the dataset from 3-d to 2-d (Another option would be to select a single time step).

  • We could make this plot with a white background, but it is also nice to be able to add a basemap to the image. Here, we’ll use contextily to do so. This will require converting the coordinate reference system (CRS) of both objects to the Web Mercator projection (EPSG:3857). We first need to use rio.write_crs() to assign a CRS to the raster object (note: the data is already projected in the correct CRS; the object is just not ‘aware’ of its CRS, and that awareness is necessary for reprojection operations. For more, see Rioxarray’s CRS Management documentation).

#Write CRS of raster data
single_glacier_raster = single_glacier_raster.rio.write_crs(single_glacier_raster.attrs['projection'])
#Check that CRS of vector and raster data are the same
assert single_glacier_raster.rio.crs == single_glacier_vector.crs
#Reproject both objects to web mercator
single_glacier_vector_web = single_glacier_vector.to_crs('EPSG:3857')
single_glacier_raster_web = single_glacier_raster.rio.reproject('EPSG:3857')

fig, ax = plt.subplots(figsize=(8,5))

#Plot objects
single_glacier_raster_web.v.mean(dim='mid_date').plot(ax=ax, cmap='viridis', alpha=0.75, add_colorbar=True)
single_glacier_vector_web.plot(ax=ax, facecolor='None', edgecolor='red', alpha=0.75)
#Add basemap
cx.add_basemap(ax, crs=single_glacier_vector_web.crs, source=cx.providers.Esri.WorldImagery);
../_images/f4b832637fc88802a8ed38197e320518683d4a95c04dd7a3f8f8f23057a9dc38.png

We sorted along the time dimension in the previous notebook, so it should be in chronological order.

single_glacier_raster.mid_date
<xarray.DataArray 'mid_date' (mid_date: 47892)> Size: 383kB
array(['1986-09-11T03:31:15.003252992', '1986-10-05T03:31:06.144750016',
       '1986-10-21T03:31:34.493249984', ..., '2024-10-29T04:18:09.241024000',
       '2024-10-29T04:18:09.241024000', '2024-10-29T04:18:09.241024000'],
      dtype='datetime64[ns]')
Coordinates:
    mapping      int64 8B 0
  * mid_date     (mid_date) datetime64[ns] 383kB 1986-09-11T03:31:15.00325299...
    spatial_ref  int64 8B 0
Attributes:
    description:    midpoint of image 1 and image 2 acquisition date and time...
    standard_name:  image_pair_center_date_with_time_separation

2) Examine data coverage#

A wide variety of factors can impact both satellite imagery and the ability of ITS_LIVE’s feature tracking algorithm to extract velocity estimates from satellite image pairs. For these reasons, there are at times both gaps in coverage and ranges in the estimated error associated with different observations. The following section will demonstrate how to calculate and visualize coverage of the dataset over time. Part 2 will include a discussion of uncertainty and error estimates.

When first investigating a dataset, it is helpful to be able to quickly visualize coverage along a given dimension. To create the data needed for such a visualization, we first need a mask that tells us all possible ‘valid’ pixels; in other words, we need to differentiate between pixels in our 2-d rectangular array that represent ice versus non-ice. Then, for every time step, we can calculate the proportion of possible ice pixels that contain an estimated velocity value.

#calculate number of valid pixels
valid_pixels = single_glacier_raster.v.count(dim=['x','y'])
#calculate max. number of valid pixels
valid_pixels_max = single_glacier_raster.v.notnull().any('mid_date').sum(['x','y'])
#add cov proportion to dataset as variable
single_glacier_raster['cov'] = valid_pixels/ valid_pixels_max

Now we can visualize coverage over time:

fig, ax = plt.subplots(figsize=(20,5))

#Plot object
single_glacier_raster['cov'].plot(ax=ax, linestyle='None', marker='x',alpha=0.75)

#Specify axes labels and title
fig.suptitle('Velocity data coverage over time', fontsize=16)
ax.set_ylabel('Coverage (proportion)', x=-0.05, fontsize=12)
ax.set_xlabel('Date', fontsize=12);
../_images/80e596a55113ad722eaee26dc353f126a4b8aaeecf2285e89efd69a8b728c0bf.png

3) Break down by sensor#

In this dataset, we have a dense time series of velocity observations for a given glacier (~48,000 observations from 1986-2024; note: this notebook was last updated in 2025). However, we know that the ability of satellite image pairs to capture ice displacement (and, by extension, velocity) can be impacted by conditions such as cloud cover, which obscures Earth’s surface from optical sensors. ITS_LIVE is a multi-sensor ice velocity dataset, meaning that it is composed of ice velocity observations derived from a number of satellites, which include both optical and Synthetic Aperture Radar (SAR) imagery. Currently, Sentinel-1 is the only SAR sensor included in ITS_LIVE; all others are optical.

While optical imagery requires solar illumination and can be impacted by cloud cover, Sentinel-1 carries an active (radar) sensor that images at a longer wavelength (C-band, ~5 cm). This means that Sentinel-1 imagery does not require solar illumination and can penetrate cloud cover. Because of these sensors’ differing sensitivities to Earth’s surface conditions, there can sometimes be discrepancies in velocity data observed by different sensors.

Note

There are many great resources available for understanding the principles of SAR and working with SAR imagery. The appendix at the bottom of this notebook lists a few of them. In addition, Chapter 2 of this tutorial focuses on working with a dataset of Sentinel-1 imagery.

Let’s first look at what sensors are represented in the time series:

sensors = list(set(single_glacier_raster.satellite_img1.values))
sensors
['5', '7', '2A', '8', '9', '2B', '1A', '4']

To extract observations from a single satellite sensor, we will use Xarray indexing and selection methods such as .where() and .sel(). The following cells demonstrate different selection approaches and briefly discuss the pros and cons of each when working with large and/or sparse datasets.

Landsat 8#

First, looking at velocity observations from Landsat 8 data only:

%%time
l8_data = single_glacier_raster.where(single_glacier_raster['satellite_img1'] == '8', drop=True)
l8_data
CPU times: user 383 ms, sys: 3.54 s, total: 3.92 s
Wall time: 4.43 s
<xarray.Dataset> Size: 208MB
Dimensions:                     (mid_date: 2688, y: 37, x: 40)
Coordinates:
    mapping                     int64 8B 0
  * mid_date                    (mid_date) datetime64[ns] 22kB 2013-05-20T04:...
    spatial_ref                 int64 8B 0
  * x                           (x) float64 320B 7.843e+05 ... 7.889e+05
  * y                           (y) float64 296B 3.316e+06 ... 3.311e+06
Data variables: (12/60)
    M11                         (mid_date, y, x) float32 16MB nan nan ... nan
    M11_dr_to_vr_factor         (mid_date) float32 11kB nan nan nan ... nan nan
    M12                         (mid_date, y, x) float32 16MB nan nan ... nan
    M12_dr_to_vr_factor         (mid_date) float32 11kB nan nan nan ... nan nan
    acquisition_date_img1       (mid_date) datetime64[ns] 22kB 2013-04-30T04:...
    acquisition_date_img2       (mid_date) datetime64[ns] 22kB 2013-06-09T04:...
    ...                          ...
    vy_error_slow               (mid_date) float32 11kB 40.2 10.2 ... 26.5 159.6
    vy_error_stationary         (mid_date) float32 11kB 40.2 10.2 ... 26.5 159.6
    vy_stable_shift             (mid_date) float32 11kB -40.2 3.6 ... 17.1 152.1
    vy_stable_shift_slow        (mid_date) float32 11kB -40.2 3.6 ... 17.1 152.3
    vy_stable_shift_stationary  (mid_date) float32 11kB -40.2 3.6 ... 17.1 152.1
    cov                         (mid_date) float64 22kB 0.0 0.5969 ... 0.3494
Attributes: (12/19)
    Conventions:                CF-1.8
    GDAL_AREA_OR_POINT:         Area
    author:                     ITS_LIVE, a NASA MEaSUREs project (its-live.j...
    autoRIFT_parameter_file:    http://its-live-data.s3.amazonaws.com/autorif...
    datacube_software_version:  1.0
    date_created:               25-Sep-2023 22:00:23
    ...                         ...
    s3:                         s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    skipped_granules:           s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    time_standard_img1:         UTC
    time_standard_img2:         UTC
    title:                      ITS_LIVE datacube of image pair velocities
    url:                        https://its-live-data.s3.amazonaws.com/datacu...

Another approach: using .sel():

%%time
l8_condition = single_glacier_raster.satellite_img1 == '8'
l8_data_alt = single_glacier_raster.sel(mid_date=l8_condition)
l8_data_alt
CPU times: user 29.9 ms, sys: 0 ns, total: 29.9 ms
Wall time: 34.1 ms
<xarray.Dataset> Size: 187MB
Dimensions:                     (mid_date: 2688, y: 37, x: 40)
Coordinates:
    mapping                     int64 8B 0
  * mid_date                    (mid_date) datetime64[ns] 22kB 2013-05-20T04:...
    spatial_ref                 int64 8B 0
  * x                           (x) float64 320B 7.843e+05 ... 7.889e+05
  * y                           (y) float64 296B 3.316e+06 ... 3.311e+06
Data variables: (12/60)
    M11                         (mid_date, y, x) float32 16MB nan nan ... nan
    M11_dr_to_vr_factor         (mid_date) float32 11kB nan nan nan ... nan nan
    M12                         (mid_date, y, x) float32 16MB nan nan ... nan
    M12_dr_to_vr_factor         (mid_date) float32 11kB nan nan nan ... nan nan
    acquisition_date_img1       (mid_date) datetime64[ns] 22kB 2013-04-30T04:...
    acquisition_date_img2       (mid_date) datetime64[ns] 22kB 2013-06-09T04:...
    ...                          ...
    vy_error_slow               (mid_date) float32 11kB 40.2 10.2 ... 26.5 159.6
    vy_error_stationary         (mid_date) float32 11kB 40.2 10.2 ... 26.5 159.6
    vy_stable_shift             (mid_date) float32 11kB -40.2 3.6 ... 17.1 152.1
    vy_stable_shift_slow        (mid_date) float32 11kB -40.2 3.6 ... 17.1 152.3
    vy_stable_shift_stationary  (mid_date) float32 11kB -40.2 3.6 ... 17.1 152.1
    cov                         (mid_date) float64 22kB 0.0 0.5969 ... 0.3494
Attributes: (12/19)
    Conventions:                CF-1.8
    GDAL_AREA_OR_POINT:         Area
    author:                     ITS_LIVE, a NASA MEaSUREs project (its-live.j...
    autoRIFT_parameter_file:    http://its-live-data.s3.amazonaws.com/autorif...
    datacube_software_version:  1.0
    date_created:               25-Sep-2023 22:00:23
    ...                         ...
    s3:                         s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    skipped_granules:           s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    time_standard_img1:         UTC
    time_standard_img2:         UTC
    title:                      ITS_LIVE datacube of image pair velocities
    url:                        https://its-live-data.s3.amazonaws.com/datacu...

The approach using .sel() takes much less time than .where(). This is because .sel() queries the dataset using the mid_date index. Xarray dimensions have associated Index objects, which are built on Pandas indexes; these are very powerful for quickly and efficiently querying large datasets. In contrast, .where() evaluates the condition and applies it across every data variable in the dataset, which is much less efficient.
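
If you want to see the index backing a dimension, you can inspect it directly; a minimal sketch (not part of the original workflow):

#Inspect the pandas index associated with the mid_date dimension
single_glacier_raster.indexes['mid_date']
#This returns a pandas DatetimeIndex, which is what makes .sel() lookups fast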

What about a sensor with multiple identifiers?#

For Landsat 8 observations, we only needed to identify elements of the dataset where the satellite_img1 variable matched a single identifier, ‘8’. Sentinel-1 is a two-satellite constellation, meaning that identical sensors fly on two satellites (identified here as ‘1A’ and ‘1B’). For this, we need to select all observations where satellite_img1 matches any value in a list of possible identifiers.

%%time
s1_condition = single_glacier_raster.satellite_img1.isin(['1A','1B'])
s1_data = single_glacier_raster.sel(mid_date=s1_condition)
s1_data
CPU times: user 6.84 ms, sys: 8.81 ms, total: 15.7 ms
Wall time: 15.4 ms
<xarray.Dataset> Size: 25MB
Dimensions:                     (mid_date: 362, y: 37, x: 40)
Coordinates:
    mapping                     int64 8B 0
  * mid_date                    (mid_date) datetime64[ns] 3kB 2014-10-16T11:4...
    spatial_ref                 int64 8B 0
  * x                           (x) float64 320B 7.843e+05 ... 7.889e+05
  * y                           (y) float64 296B 3.316e+06 ... 3.311e+06
Data variables: (12/60)
    M11                         (mid_date, y, x) float32 2MB nan nan ... nan nan
    M11_dr_to_vr_factor         (mid_date) float32 1kB nan nan nan ... nan nan
    M12                         (mid_date, y, x) float32 2MB nan nan ... nan nan
    M12_dr_to_vr_factor         (mid_date) float32 1kB nan nan nan ... nan nan
    acquisition_date_img1       (mid_date) datetime64[ns] 3kB 2014-10-10T11:4...
    acquisition_date_img2       (mid_date) datetime64[ns] 3kB 2014-10-22T11:4...
    ...                          ...
    vy_error_slow               (mid_date) float32 1kB 64.9 56.2 ... 45.5 49.3
    vy_error_stationary         (mid_date) float32 1kB 64.9 56.2 ... 45.5 49.3
    vy_stable_shift             (mid_date) float32 1kB -6.5 -13.0 ... 0.5 0.5
    vy_stable_shift_slow        (mid_date) float32 1kB -6.5 -13.0 ... 0.5 0.5
    vy_stable_shift_stationary  (mid_date) float32 1kB -6.5 -13.0 ... 0.5 0.5
    cov                         (mid_date) float64 3kB 0.9463 0.9604 ... 0.9958
Attributes: (12/19)
    Conventions:                CF-1.8
    GDAL_AREA_OR_POINT:         Area
    author:                     ITS_LIVE, a NASA MEaSUREs project (its-live.j...
    autoRIFT_parameter_file:    http://its-live-data.s3.amazonaws.com/autorif...
    datacube_software_version:  1.0
    date_created:               25-Sep-2023 22:00:23
    ...                         ...
    s3:                         s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    skipped_granules:           s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    time_standard_img1:         UTC
    time_standard_img2:         UTC
    title:                      ITS_LIVE datacube of image pair velocities
    url:                        https://its-live-data.s3.amazonaws.com/datacu...

Rather than go through the above steps for each sensor, let’s write a function that will subset the dataset by sensor, returning a dict of Xarray datasets holding velocity time series for each sensor.

#Make dict of each sensor and identifying string(s)
sensor_conditions = {'Landsat 4': '4',
                     'Landsat 5': '5',
                     'Landsat 7': '7',
                     'Landsat 8': '8',
                     'Landsat 9': '9',
                     'Sentinel 1': ['1A','1B'],
                     'Sentinel 2': ['2A','2B']}

def separate_ds_by_sensor(ds, sensor_conditions):
    #Make empty lists to hold sensor IDs and subsetted datasets
    keys_ls, vals_ls = [],[]

    #Iterate through each sensor in dict
    for sensor in sensor_conditions.keys():

        #If there are two identifying conditions, use isin
        if isinstance(sensor_conditions[sensor], list):

            condition = ds.satellite_img1.isin(sensor_conditions[sensor])

        #If there's only one condition, use == 
        else: 
            
            condition = ds.satellite_img1 == sensor_conditions[sensor]

        #Use .sel to subset data based on sensor
        sensor_data = ds.sel(mid_date = condition)

        keys_ls.append(f'{sensor}')
        vals_ls.append(sensor_data)

    #Return dict of sensor IDs and subsetted datasets
    ds_dict = dict(zip(keys_ls, vals_ls))
    return ds_dict
sensor_ds_dict = separate_ds_by_sensor(single_glacier_raster, sensor_conditions)
print(sensor_ds_dict.keys())
dict_keys(['Landsat 4', 'Landsat 5', 'Landsat 7', 'Landsat 8', 'Landsat 9', 'Sentinel 1', 'Sentinel 2'])
sensor_ds_dict['Landsat 8']
<xarray.Dataset> Size: 187MB
Dimensions:                     (mid_date: 2688, y: 37, x: 40)
Coordinates:
    mapping                     int64 8B 0
  * mid_date                    (mid_date) datetime64[ns] 22kB 2013-05-20T04:...
    spatial_ref                 int64 8B 0
  * x                           (x) float64 320B 7.843e+05 ... 7.889e+05
  * y                           (y) float64 296B 3.316e+06 ... 3.311e+06
Data variables: (12/60)
    M11                         (mid_date, y, x) float32 16MB nan nan ... nan
    M11_dr_to_vr_factor         (mid_date) float32 11kB nan nan nan ... nan nan
    M12                         (mid_date, y, x) float32 16MB nan nan ... nan
    M12_dr_to_vr_factor         (mid_date) float32 11kB nan nan nan ... nan nan
    acquisition_date_img1       (mid_date) datetime64[ns] 22kB 2013-04-30T04:...
    acquisition_date_img2       (mid_date) datetime64[ns] 22kB 2013-06-09T04:...
    ...                          ...
    vy_error_slow               (mid_date) float32 11kB 40.2 10.2 ... 26.5 159.6
    vy_error_stationary         (mid_date) float32 11kB 40.2 10.2 ... 26.5 159.6
    vy_stable_shift             (mid_date) float32 11kB -40.2 3.6 ... 17.1 152.1
    vy_stable_shift_slow        (mid_date) float32 11kB -40.2 3.6 ... 17.1 152.3
    vy_stable_shift_stationary  (mid_date) float32 11kB -40.2 3.6 ... 17.1 152.1
    cov                         (mid_date) float64 22kB 0.0 0.5969 ... 0.3494
Attributes: (12/19)
    Conventions:                CF-1.8
    GDAL_AREA_OR_POINT:         Area
    author:                     ITS_LIVE, a NASA MEaSUREs project (its-live.j...
    autoRIFT_parameter_file:    http://its-live-data.s3.amazonaws.com/autorif...
    datacube_software_version:  1.0
    date_created:               25-Sep-2023 22:00:23
    ...                         ...
    s3:                         s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    skipped_granules:           s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    time_standard_img1:         UTC
    time_standard_img2:         UTC
    title:                      ITS_LIVE datacube of image pair velocities
    url:                        https://its-live-data.s3.amazonaws.com/datacu...

4) Combine sensor-specific subsets#

In the previous section we looked at different ways to subset the ITS_LIVE time series. Now, we will look at a new structure within the Xarray data model (xr.DataTree) that facilitates working with collections of data.

Our use-case in this section is that we would like to efficiently compute the mean along the time dimension of the ITS_LIVE dataset for each satellite sensor, and then visualize these results side-by-side.

Before the implementation of Xarray DataTree, we would need to either perform the sequence of operations on each sensor-specific dataset individually, or create a dict of the sensor-specific datasets, write a function to perform the operations, and then either update that dictionary or create another one to hold the results. Both of these options are clunky and inefficient (see the sketch below).
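
For comparison, a dict-based version of the first step might look like the following sketch (reusing the sensor_ds_dict created above). It works, but every subsequent operation needs its own loop or comprehension, and the results never live in a single labeled object:

#Dict-based alternative (sketch): reduce each sensor-specific dataset separately
sensor_mean_dict = {sensor: ds.mean(dim='mid_date') for sensor, ds in sensor_ds_dict.items()}
#Plotting or concatenating these results still requires another pass over the dictionary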

We can create an xr.DataTree object by using the from_dict() method:

sensor_ds_tree = xr.DataTree.from_dict(sensor_ds_dict)
sensor_ds_tree
<xarray.DatasetView> Size: 0B
Dimensions:  ()
Data variables:
    *empty*

The DataTree has a parent node and child groups that contain individual xr.Datasets:

sensor_ds_tree['Landsat 8'].ds
<xarray.DatasetView> Size: 187MB
Dimensions:                     (mid_date: 2688, y: 37, x: 40)
Coordinates:
    mapping                     int64 8B 0
  * mid_date                    (mid_date) datetime64[ns] 22kB 2013-05-20T04:...
    spatial_ref                 int64 8B 0
  * x                           (x) float64 320B 7.843e+05 ... 7.889e+05
  * y                           (y) float64 296B 3.316e+06 ... 3.311e+06
Data variables: (12/60)
    M11                         (mid_date, y, x) float32 16MB nan nan ... nan
    M11_dr_to_vr_factor         (mid_date) float32 11kB nan nan nan ... nan nan
    M12                         (mid_date, y, x) float32 16MB nan nan ... nan
    M12_dr_to_vr_factor         (mid_date) float32 11kB nan nan nan ... nan nan
    acquisition_date_img1       (mid_date) datetime64[ns] 22kB 2013-04-30T04:...
    acquisition_date_img2       (mid_date) datetime64[ns] 22kB 2013-06-09T04:...
    ...                          ...
    vy_error_slow               (mid_date) float32 11kB 40.2 10.2 ... 26.5 159.6
    vy_error_stationary         (mid_date) float32 11kB 40.2 10.2 ... 26.5 159.6
    vy_stable_shift             (mid_date) float32 11kB -40.2 3.6 ... 17.1 152.1
    vy_stable_shift_slow        (mid_date) float32 11kB -40.2 3.6 ... 17.1 152.3
    vy_stable_shift_stationary  (mid_date) float32 11kB -40.2 3.6 ... 17.1 152.1
    cov                         (mid_date) float64 22kB 0.0 0.5969 ... 0.3494
Attributes: (12/19)
    Conventions:                CF-1.8
    GDAL_AREA_OR_POINT:         Area
    author:                     ITS_LIVE, a NASA MEaSUREs project (its-live.j...
    autoRIFT_parameter_file:    http://its-live-data.s3.amazonaws.com/autorif...
    datacube_software_version:  1.0
    date_created:               25-Sep-2023 22:00:23
    ...                         ...
    s3:                         s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    skipped_granules:           s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    time_standard_img1:         UTC
    time_standard_img2:         UTC
    title:                      ITS_LIVE datacube of image pair velocities
    url:                        https://its-live-data.s3.amazonaws.com/datacu...

If we want to perform a set of operations on every node of the datatree, we can define a function to pass to xr.DataTree.map_over_datasets():

def calc_temporal_mean(ds):
    """I'm a function that calculates the temporal mean of a dataset with (x,y,mid_date) dimensions. 
    I return a new dataset with (x,y,sensor) dimensions """
    
    #Skip parent node
    if len(ds.data_vars) == 0:
        return None
    
    else:
        #Calc mean
        tmean = ds.mean(dim='mid_date')
        #Add a sensor dimension -- this will be used for combining them back together later
        sensor = np.unique(ds.satellite_img1.data)
        if len(sensor) > 1: 
            sensor = [sensor[0]]
        #Expand dims to add sensor
        tmean = tmean.expand_dims({'sensor':sensor})
        return tmean
temp_mean_tree = sensor_ds_tree.map_over_datasets(calc_temporal_mean)

Now, we can take just the descendent nodes of this datatree:

child_tree = temp_mean_tree.descendants

and concatenate them into a single xr.Dataset along the 'sensor' dimension we created above:

sensor_mean_ds = xr.concat([child_tree[i].ds for i in range(len(child_tree))], dim='sensor')
sensor_mean_ds
<xarray.Dataset> Size: 541kB
Dimensions:                     (sensor: 7, y: 37, x: 40)
Coordinates:
  * sensor                      (sensor) object 56B '4' '5' '7' ... '1A' '2A'
    mapping                     int64 8B 0
    spatial_ref                 int64 8B 0
  * x                           (x) float64 320B 7.843e+05 ... 7.889e+05
  * y                           (y) float64 296B 3.316e+06 ... 3.311e+06
Data variables: (12/49)
    M11                         (sensor, y, x) float32 41kB nan nan ... nan nan
    M11_dr_to_vr_factor         (sensor) float32 28B nan nan nan nan nan nan nan
    M12                         (sensor, y, x) float32 41kB nan nan ... nan nan
    M12_dr_to_vr_factor         (sensor) float32 28B nan nan nan nan nan nan nan
    chip_size_height            (sensor, y, x) float32 41kB nan nan ... nan nan
    chip_size_width             (sensor, y, x) float32 41kB nan nan ... nan nan
    ...                          ...
    vy_error_slow               (sensor) float32 28B 44.73 25.63 ... 63.17 9.095
    vy_error_stationary         (sensor) float32 28B 44.77 25.64 ... 63.18 9.096
    vy_stable_shift             (sensor) float32 28B -6.333 -0.09127 ... 1.125
    vy_stable_shift_slow        (sensor) float32 28B -6.317 -0.08736 ... 1.122
    vy_stable_shift_stationary  (sensor) float32 28B -6.333 -0.09127 ... 1.125
    cov                         (sensor) float64 56B 0.4351 0.254 ... 0.06737

Finally, we can use Xarray’s FacetGrid plotting to visualize them side-by-side:

def calc_v_magnitude(ds):
    """I'm a function that calculates the magnitude of a velocity displacement vector given velocity component vectors.
    I return the same dataset object with a new variable called 'vmag'""" 
    
    ds['vmag'] = np.sqrt(ds['vx']**2 + ds['vy']**2)
    return ds

#First, calculate magnitude of velocity from the temporal mean of the component vectors
sensor_mean_ds = calc_v_magnitude(sensor_mean_ds)

a = sensor_mean_ds['vmag'].plot(col='sensor', col_wrap=4, cbar_kwargs={'label':'Meters / year'})
a.fig.suptitle('Temporal mean of velocity magnitude', fontsize=16, y=1.05)
a.fig.supylabel('Y-coordinate of projection (meters)', fontsize=12, x=-0.02)
a.fig.supxlabel('X-coordinate of projection (meters)', fontsize=12, y=-0.05)

#Remove individual axes labels
for i in range(len(a.axs[0])):

    a.axs[0][i].set_ylabel(None)
    a.axs[0][i].set_xlabel(None)

for i in range(len(a.axs[1])):
    a.axs[1][i].set_ylabel(None)
    a.axs[1][i].set_xlabel(None)
    a.axs[1][i].tick_params(axis='x', labelrotation=45)
../_images/248401e1cbc34d2870de77148f107e70593fd66a0128071d96ac18539af75e8e.png

It’s important to keep in mind that in addition to having different spectral properties and imaging resolutions, the sensors included in the ITS_LIVE dataset have been active during different, and sometimes overlapping periods of time. There is additional discussion of inter-sensor bias in the ITS_LIVE Known Issues documentation.

B. Examine velocity variability#

1) Histograms and summary statistics#

First, we plot histograms of the v, vx, and vy variables to examine their distributions. To construct these plots, we use a combination of Xarray plotting functionality and matplotlib object-oriented plotting. In addition, we use Xarray’s .reduce() and scipy.stats.skew() to calculate the skew of each variable (inset in each sub-plot).

To make things easier, write a function that calculates summary statistics for each variable and returns them in a dictionary:

def calc_summary_stats(ds: xr.Dataset, variable:str):

    """ I'm a function that calculates summary statistics for a given data variable and returns them as a dict to be used in a plot"""

    skew = ds[f'{variable}'].reduce(func=scipy.stats.skew, nan_policy='omit', dim=['x','y','mid_date']).data
    mean = ds[f'{variable}'].mean(dim=['x','y','mid_date'], skipna=True).data
    median = ds[f'{variable}'].median(dim=['x','y','mid_date'], skipna=True).data

    stats_dict = {'skew':skew, 'mean':mean, 'median':median}
    return stats_dict
stats_vy = calc_summary_stats(single_glacier_raster, 'vy')
stats_vx = calc_summary_stats(single_glacier_raster, 'vx')
stats_v = calc_summary_stats(single_glacier_raster, 'v')
fig,axs=plt.subplots(ncols=3, figsize=(20,5))
#VY
hist_y = single_glacier_raster.vy.plot.hist(ax=axs[0], bins=100)
cumulative_y = np.cumsum(hist_y[0])
axs[0].plot(hist_y[1][1:], cumulative_y, color='orange', linestyle='-', alpha=0.5)
# VY stats text
axs[0].text(x=-2000, y=2e6, s=f"Skew: {stats_vy['skew']:.3f}", fontsize=12, color='black')
axs[0].text(x=-2000, y=1.5e6, s=f"Mean: {stats_vy['mean']:.3f}", fontsize=12, color='black')
axs[0].text(x=-2000, y=1e6, s=f"Median: {stats_vy['median']:.3f}", fontsize=12, color='black')

#VX
hist_x = single_glacier_raster.vx.plot.hist(ax=axs[1], bins=100)
cumulative_x = np.cumsum(hist_x[0])
axs[1].plot(hist_x[1][1:], cumulative_x, color='orange', linestyle='-', alpha=0.5)
#VX stats text
axs[1].text(x=-2000, y=2e6, s=f"Skew: {stats_vx['skew']:.3f}", fontsize=12, color='black')
axs[1].text(x=-2000, y=1.5e6, s=f"Mean: {stats_vx['mean']:.3f}", fontsize=12, color='black')
axs[1].text(x=-2000, y=1e6, s=f"Median: {stats_vx['median']:.3f}", fontsize=12, color='black')

#V
hist_v = single_glacier_raster.v.plot.hist(ax=axs[2], bins=100)
cumulative_v = np.cumsum(hist_v[0])
axs[2].plot(hist_v[1][1:], cumulative_v, color='orange', linestyle='-', alpha=0.5)
#V stats text
axs[2].text(x=2000, y=2e6, s=f"Skew: {stats_v['skew']:.3f}", fontsize=12, color='black')
axs[2].text(x=2000, y=1.5e6, s=f"Mean: {stats_v['mean']:.3f}", fontsize=12, color='black')
axs[2].text(x=2000, y=1e6, s=f"Median: {stats_v['median']:.3f}", fontsize=12, color='black')

#Formating and labeling
axs[0].set_title('VY')
axs[1].set_title('VX')
axs[2].set_title('V')

for i in range(len(axs)):
    axs[i].set_xlabel(None)
    axs[i].set_ylabel(None)

fig.supylabel('# Observations', x=0.08, fontsize=12)
fig.supxlabel('Meters / year', fontsize=12)
fig.suptitle('Histogram (blue) and cumulative distribution function (orange) of velocity components and magnitude', fontsize=16, y=1.05);
../_images/8f3af118af081507129e031551fe2d909d858be1e36b73fbf325ebc45ba45368.png

The histograms and summary statistics show that the vx and vy distributions are relatively Gaussian, while v is positively skewed and Rician. This is due to the non-linear relationship between the component and displacement vectors. In datasets such as this one, where the signal-to-noise ratio can be low, calculating velocity magnitude on smoothed or averaged component vectors can help to suppress noise (for a bit more detail, refer to this comment). For this reason, we will usually calculate velocity magnitude after the dataset has been reduced over space or time dimensions.
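
To see why the order of operations matters, here is a minimal synthetic example (plain NumPy, illustration only, not ITS_LIVE data): for zero-mean noisy components, the magnitude of the mean components stays near zero, while the mean of the per-observation magnitudes is biased high.

#Synthetic zero-mean component 'velocities' (illustration only)
rng = np.random.default_rng(0)
vx_noise = rng.normal(0, 50, 10_000)
vy_noise = rng.normal(0, 50, 10_000)

#Magnitude of the mean components: close to 0
mag_of_means = np.sqrt(vx_noise.mean()**2 + vy_noise.mean()**2)
#Mean of the per-observation magnitudes: ~63 (sigma * sqrt(pi/2)), biased high
mean_of_mags = np.sqrt(vx_noise**2 + vy_noise**2).mean()
print(f'magnitude of means: {mag_of_means:.2f}, mean of magnitudes: {mean_of_mags:.2f}')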

2) Spatial velocity variability#

Now that we have a better understanding of how the velocity components relate to velocity magnitude, let’s examine these variables, as well as the estimated error provided in the dataset, by reducing along the temporal dimension so that we can visualize the data along the x and y dimensions.

#Calculate min, max for color bar 
vmin_y = single_glacier_raster.vy.mean(dim=['mid_date']).min().data
vmax_y = single_glacier_raster.vy.mean(dim=['mid_date']).max().data

vmin_x = single_glacier_raster.vx.mean(dim=['mid_date']).min().data
vmax_x = single_glacier_raster.vx.mean(dim=['mid_date']).max().data

vmin = min([vmin_x, vmin_y])
vmax = max([vmax_x, vmax_y])
fig, axs = plt.subplots(ncols =2, figsize=(17,7))

x = single_glacier_raster.vx.mean(dim='mid_date').plot(ax=axs[0], vmin=vmin, vmax=vmax, cmap='RdBu_r')
y = single_glacier_raster.vy.mean(dim='mid_date').plot(ax=axs[1], vmin=vmin, vmax=vmax, cmap='RdBu_r')
axs[0].set_title('x-component velocity', fontsize=12)
axs[1].set_title('y-component velocity', fontsize=12)
fig.suptitle('Temporal mean of velocity components', fontsize=16, y=1.02)

x.colorbar.set_label('m/yr', rotation=270)
y.colorbar.set_label('m/yr', rotation=270)

for i in range(len(axs)):
    axs[i].set_ylabel(None)
    axs[i].set_xlabel(None)
fig.supylabel('Y-coordinate of projection (meters)', x=0.08, fontsize=12)
fig.supxlabel('X-coordinate of projection (meters)', fontsize=12);
../_images/5c92972442e2ec5616d3dd2a1210110d4d4223fa7b4d41f0ff1c2c83369cd6b3.png

In addition to visualizing components (above), plotting velocity vectors is helpful for understanding magnitude and direction of flow:

First, take the temporal mean of the dataset (this also averages the estimated error, which we will use shortly) and calculate the mean velocity magnitude using the function defined in Part 1:

ds_v = calc_v_magnitude(single_glacier_raster.mean(dim='mid_date',skipna=True))
fig, axs= plt.subplots(ncols=2, figsize=(20,7))

single_glacier_vector.plot(ax=axs[0], facecolor='none', edgecolor='red')
single_glacier_raster.mean(dim='mid_date').plot.quiver('x','y','vx','vy', ax=axs[1], angles='xy', robust=True)

single_glacier_vector.plot(ax=axs[1], facecolor='none', edgecolor='red')
a = ds_v['vmag'].plot(ax=axs[0], alpha=0.6, vmax=45, vmin=5)
a.colorbar.set_label('meter/year')

fig.supylabel('Y-coordinate of projection (meters)', x=0.08, fontsize=12)
fig.supxlabel('X-coordinate of projection (meters)', fontsize=12)

fig.suptitle('Velocity vectors (R) and magnitude of velocity (L), averaged over time', fontsize=16, y=0.98)
for i in range(len(axs)):
    axs[i].set_xlabel(None)
    axs[i].set_ylabel(None)
    axs[i].set_title(None);
../_images/721cd5a53feb2bc90cfb8c15de3d13cafef12de46778b38e4a7303d3e6945565.png

Visualize magnitude of velocity overlaid with velocity vectors next to velocity error:

fig, ax = plt.subplots(figsize=(22,6), ncols=2)

vmag = ds_v.vmag.plot(ax=ax[0], vmin=0, vmax=52,alpha=0.5)
single_glacier_raster.mean(dim='mid_date').plot.quiver('x','y','vx','vy', ax=ax[0], angles='xy', robust=True)

err = ds_v.v_error.plot(ax=ax[1], vmin=0, vmax=52)


vmag.colorbar.set_label('m/yr')
err.colorbar.set_label('m/yr')

for i in range(len(ax)):
    ax[i].set_ylabel(None)
    ax[i].set_xlabel(None)
    ax[i].set_title(None)

fig.supxlabel('X-coordinate of projection (meters)', fontsize=12)
fig.supylabel('Y-coordinate of projection (meters)', x=0.08, fontsize=12)
fig.suptitle('Mean velocity magnitude over time (L), mean error over time (R)', fontsize=16, y=1.02);
../_images/760ee7144880fa854d2887472e6dce5a963455e01a4bb7fd29861d00dc5a8e19.png

v_error is large relative to the magnitude of velocity, suggesting that this data is pretty noisy.
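
One rough way to quantify this (a sketch reusing ds_v from above, not part of the original workflow) is the ratio of the temporally averaged error to the temporally averaged speed at each pixel:

#Median over the glacier of the per-pixel error-to-speed ratio (dimensionless)
error_to_speed = (ds_v['v_error'] / ds_v['vmag']).median()
float(error_to_speed)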

3) Temporal velocity variability#

Reduce over the spatial dimensions (this time we will switch it up and choose a different reduction function) and visualize variability over time:

fig, ax = plt.subplots(figsize=(20,5))

vmag_med = calc_v_magnitude(single_glacier_raster.median(dim=['x','y']))

vmag_med.plot.scatter(x='mid_date', y='vmag',ax=ax, marker='o', edgecolors='None',alpha=0.5)
fig.suptitle('Spatial median magnitude of velocity over time')
ax.set_title(None)
ax.set_ylabel('m/y')
ax.set_xlabel('Time');
../_images/025f989aa453eb7a409b855afc013ff848feba1d602b5670f73dfeb12ea91de6.png

This helps get a sense of velocity variability over time, but also shows how many outliers there are, even after taking the median over the x and y dimensions. In the final section of this notebook, we explore different approaches for changing the resolution of the temporal dimension.
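
Before moving on, a rough check on how heavy-tailed this series is (a sketch; the 1,000 m/yr cutoff is an arbitrary choice):

#Count time steps whose spatial-median speed exceeds an arbitrary threshold
n_outliers = int((vmag_med['vmag'] > 1000).sum())
print(f"{n_outliers} of {vmag_med.sizes['mid_date']} time steps exceed 1,000 m/yr")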

C. Dimensional computations#

In Part 2, we saw that the time series is dense in places and quite noisy. This section demonstrates different approaches for looking at a temporal signal in the dataset.

1) Temporal resampling#

Use Xarray’s resample() method to coarsen the dataset. With resample(), you can upsample or downsample the data, choosing both the resample frequency and the type of reduction.
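
The same pattern works for other frequencies and reductions. For example, an annual mean (a sketch, not used below) would look like:

#Annual mean as an alternative resampling (sketch)
glacier_annual_mean = single_glacier_raster.resample(mid_date='1YE').mean(dim='mid_date')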

#Coarsen dataset to a 2-month frequency
resample_obj = single_glacier_raster.resample(mid_date='2ME')
#Calculate the 2-month median
glacier_resample_2mo = resample_obj.median(dim='mid_date')

The mid_date dimension is now much less dense:

glacier_resample_2mo
<xarray.Dataset> Size: 18MB
Dimensions:                     (mid_date: 230, y: 37, x: 40)
Coordinates:
    mapping                     int64 8B 0
    spatial_ref                 int64 8B 0
  * x                           (x) float64 320B 7.843e+05 ... 7.889e+05
  * y                           (y) float64 296B 3.316e+06 ... 3.311e+06
  * mid_date                    (mid_date) datetime64[ns] 2kB 1986-09-30 ... ...
Data variables: (12/49)
    M11                         (mid_date, y, x) float32 1MB nan nan ... nan nan
    M11_dr_to_vr_factor         (mid_date) float32 920B nan nan nan ... nan nan
    M12                         (mid_date, y, x) float32 1MB nan nan ... nan nan
    M12_dr_to_vr_factor         (mid_date) float32 920B nan nan nan ... nan nan
    chip_size_height            (mid_date, y, x) float32 1MB nan nan ... nan nan
    chip_size_width             (mid_date, y, x) float32 1MB nan nan ... nan nan
    ...                          ...
    vy_error_slow               (mid_date) float32 920B 30.3 31.4 ... 16.65 26.8
    vy_error_stationary         (mid_date) float32 920B 30.2 31.45 ... 26.8
    vy_stable_shift             (mid_date) float32 920B 4.1 -1.0 ... -2.75 71.0
    vy_stable_shift_slow        (mid_date) float32 920B 4.2 -1.0 ... -2.75 71.0
    vy_stable_shift_stationary  (mid_date) float32 920B 4.1 -1.0 ... -2.75 71.0
    cov                         (mid_date) float64 2kB 0.0 0.2461 ... 0.0 0.0
Attributes: (12/19)
    Conventions:                CF-1.8
    GDAL_AREA_OR_POINT:         Area
    author:                     ITS_LIVE, a NASA MEaSUREs project (its-live.j...
    autoRIFT_parameter_file:    http://its-live-data.s3.amazonaws.com/autorif...
    datacube_software_version:  1.0
    date_created:               25-Sep-2023 22:00:23
    ...                         ...
    s3:                         s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    skipped_granules:           s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    time_standard_img1:         UTC
    time_standard_img2:         UTC
    title:                      ITS_LIVE datacube of image pair velocities
    url:                        https://its-live-data.s3.amazonaws.com/datacu...

Compare the 2-month resampled median time series to the full time series we plotted at the end of Part 2:

fig, ax = plt.subplots(figsize=(20,5))

#Calculate magnitude of velocity after the temporal reduction
vmag_2mo = calc_v_magnitude(glacier_resample_2mo)

#Calculate spatial median
vmag_2mo['vmag'].median(dim=['x','y']).plot(ax=ax)
#Plot full time series (spatial median) magnitude of velocity
vmag_med.plot.scatter(x='mid_date', y='vmag',ax=ax, marker='o', edgecolors='None',alpha=0.5, color='orange');

#Labels and formatting
fig.suptitle('2-month median magnitude of velocity (blue), full time series (orange)', fontsize=16)
ax.set_title(None)
ax.set_ylabel('m/y')
ax.set_xlabel('Time')
ax.set_ylim(0, 750);
../_images/90a40f88a80460665e8da3e49dde553697b84ef9ad383fa4ad4ac2445d5c12c3.png

We can make a few observations:

  • Computing two-month median velocities makes it easier to see a somewhat periodic velocity signal

  • As expected, the median is highly sensitive to the density of observations; the 2-month median curve displays greater amplitude during periods with sparser observations, and when the density of observations increases significantly in 2014, its amplitude of variability decreases dramatically (a quick check of observation density follows below).
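
We can look at that observation density directly (a minimal sketch reusing the resample_obj created above):

fig, ax = plt.subplots(figsize=(20,3))
#Number of image-pair time steps falling in each 2-month bin
resample_obj.count(dim='mid_date')['cov'].plot(ax=ax)
ax.set_ylabel('# observations per 2-month bin');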

2) Grouped analysis by season#

Xarray’s groupby() functionality allows us to segment the dataset into different groups along given dimensions. Here, we use that to analyze seasonal velocity variability patterns:

seasons_gb = single_glacier_raster.groupby(single_glacier_raster.mid_date.dt.season).median()
#add attrs to gb object
seasons_gb.attrs = single_glacier_raster.attrs 
#Reorder seasons
seasons_gb=seasons_gb.reindex({'season':['DJF','MAM','JJA','SON']})
seasons_gb
<xarray.Dataset> Size: 309kB
Dimensions:                     (x: 40, y: 37, season: 4)
Coordinates:
  * x                           (x) float64 320B 7.843e+05 ... 7.889e+05
  * y                           (y) float64 296B 3.316e+06 ... 3.311e+06
  * season                      (season) <U3 48B 'DJF' 'MAM' 'JJA' 'SON'
    mapping                     int64 8B 0
    spatial_ref                 int64 8B 0
Data variables: (12/49)
    M11                         (season, y, x) float32 24kB nan nan ... nan nan
    M11_dr_to_vr_factor         (season) float32 16B nan nan nan nan
    M12                         (season, y, x) float32 24kB nan nan ... nan nan
    M12_dr_to_vr_factor         (season) float32 16B nan nan nan nan
    chip_size_height            (season, y, x) float32 24kB nan nan ... nan nan
    chip_size_width             (season, y, x) float32 24kB nan nan ... nan nan
    ...                          ...
    vy_error_slow               (season) float32 16B 8.2 4.8 4.2 9.1
    vy_error_stationary         (season) float32 16B 8.2 4.8 4.2 9.1
    vy_stable_shift             (season) float32 16B 0.1 0.0 0.0 0.5
    vy_stable_shift_slow        (season) float32 16B 0.1 0.0 0.0 0.5
    vy_stable_shift_stationary  (season) float32 16B 0.1 0.0 0.0 0.5
    cov                         (season) float64 32B 0.0 0.0 0.0 0.0
Attributes: (12/19)
    Conventions:                CF-1.8
    GDAL_AREA_OR_POINT:         Area
    author:                     ITS_LIVE, a NASA MEaSUREs project (its-live.j...
    autoRIFT_parameter_file:    http://its-live-data.s3.amazonaws.com/autorif...
    datacube_software_version:  1.0
    date_created:               25-Sep-2023 22:00:23
    ...                         ...
    s3:                         s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    skipped_granules:           s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    time_standard_img1:         UTC
    time_standard_img2:         UTC
    title:                      ITS_LIVE datacube of image pair velocities
    url:                        https://its-live-data.s3.amazonaws.com/datacu...

This is cool; we’ve gone from our 3-d object with a very dense mid_date dimension to a 3-d object where the temporal aspect of the data is represented by 4 seasons.

In the above cell, we defined how we wanted to group our data (single_glacier_raster.mid_date.dt.season) and the reduction we wanted to apply to each group (median()). After the apply step, Xarray automatically combines the groups into a single object with a season dimension.

If you’d like to see another example of this with more detailed explanations, go here.
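
It can also be useful to sanity-check how many time steps fell into each seasonal group (a minimal sketch, not part of the original workflow):

#Number of mid_date time steps in each season
single_glacier_raster.mid_date.dt.season.to_series().value_counts()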

#Calculate magnitude on seasonal groupby object
seasons_vmag = calc_v_magnitude(seasons_gb)

Use another FacetGrid plot to visualize the seasons side-by-side:

fg = seasons_vmag.vmag.plot(
    col='season',cbar_kwargs={'label':'Meters / year'}
)
fg.fig.suptitle('Seasonal median velocity magnitude', fontsize=16, y=1.05)
fg.fig.supxlabel('X-coordinate of projection (meters)', fontsize=12, y=-0.05)
fg.fig.supylabel('Y-coordinate of projection (meters)', fontsize=12, x=-0.02)
for i in range(len(fg.axs[0])):
    fg.axs[0][i].set_ylabel(None)
    fg.axs[0][i].set_xlabel(None)
    fg.axs[0][i].tick_params(axis='x', labelrotation=45);
../_images/65f0faf08da6ecd9d7f335ac69a20498f7efbcad2714638f819d7c194ede431a.png

From the above FacetGrid plot, it appears that some regions of the glacier are very active (show high velocities) throughout the entire year. In other areas, it appears that glacier flow may be much more seasonal.
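
One way to put a number on that impression (a sketch, not part of the original workflow) is to map the per-pixel range of the seasonal medians; larger values indicate more seasonal flow:

#Per-pixel range of seasonal median speed (max season minus min season)
seasonal_range = seasons_vmag['vmag'].max(dim='season') - seasons_vmag['vmag'].min(dim='season')
seasonal_range.plot(cbar_kwargs={'label': 'Meters / year'});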

Conclusion

This was a deep dive into exploratory data analysis at the scale of an individual glacier. The last notebook in this chapter will demonstrate a regional-scale analysis.