Software and Data#

On this page you’ll find information about the computing environment and datasets that we’ll be using in this tutorial.

A. Software Packages#

Below is a list of the Python libraries we’ll be using in this chapter. This is the full list of libraries across all notebooks.

import contextily as cx  # basemap tiles for adding context to maps
import geopandas as gpd  # vector geospatial data (GeoDataFrames)
import hvplot.pandas  # interactive plotting for pandas/geopandas objects
import matplotlib.pyplot as plt  # static plotting
import numpy as np  # numerical arrays
import rioxarray as rio  # geospatial raster operations on xarray objects
import s3fs  # filesystem-style access to AWS S3 object storage
import scipy.stats  # statistical functions
from shapely.geometry import Point, Polygon  # geometry objects
from typing import Union  # type annotations
import warnings  # control warning messages
import xarray as xr  # labeled, multi-dimensional arrays

This tutorial also uses several functions that are stored in the script itslivetools.py, which lives in the GitHub repo for this tutorial. If you clone the repo, it should be available to import into the tutorial notebooks. Otherwise, if you would like to use itslivetools.py, download the script and move it to your working directory.
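If you go the download route, one way to make the script importable from a notebook is sketched below; it simply assumes itslivetools.py sits in your current working directory.

```python
import sys
from pathlib import Path

# Ensure the directory containing itslivetools.py is on Python's import path
# (here we assume it is the current working directory)
sys.path.append(str(Path.cwd()))

import itslivetools  # helper functions used throughout the tutorial notebooks
```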

B. Computational Environment#

Running the tutorial on the cloud#

This link will launch a preconfigured JupyterLab environment on mybinder.org:

https://mybinder.org/v2/gh/e-marshall/itslive/HEAD?labpath=accessing_s3_data.ipynb

Running the tutorial material locally#

To run the notebooks contained in this tutorial on your local machine, create the itslivetools_env conda environment (conda env create -f environment-unpinned.yml) from the environment file in the tutorial repository. This should work on any platform (Linux, macOS, Windows) and will install the latest versions of all dependencies.

Alternatively, the code repository for this tutorial (e-marshall/itslive) also contains “lock” files for Linux (conda-linux-64.lock.yml) and macOS (conda-osx-64.lock.yml) that pin exact versions of all required Python packages for a reproducible computing environment.

C. Data used in this chapter#

ITS_LIVE#

The velocity data that we’ll be using comes from the ITS_LIVE dataset, which provides global coverage of land ice velocity data at various temporal frequencies and in various formats. Follow the link to explore the data that’s available for a region you may be interested in. ITS_LIVE has multiple options for data access; this example focuses on Zarr datacubes stored in S3 buckets on AWS.
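As a preview of that access pattern, here is a minimal sketch of opening a single datacube with s3fs and xarray. The bucket path below is only an illustrative example of the naming scheme; the notebooks that follow show how to look up the correct URL for a given location.

```python
import s3fs
import xarray as xr

# Illustrative datacube path (not necessarily a real object; the next notebook
# shows how to find the correct URL for your area of interest)
s3_url = "s3://its-live-data/datacubes/v2/N30E090/ITS_LIVE_vel_EPSG32646_G0120_X750000_Y3350000.zarr"

# Anonymous (public) access to the bucket; data is read lazily, so nothing is
# downloaded until you access or compute on a variable
fs = s3fs.S3FileSystem(anon=True)
dc = xr.open_dataset(fs.get_mapper(s3_url), engine="zarr")

print(dc["v"])  # 'v' holds velocity magnitude in ITS_LIVE datacubes
```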

ITS_LIVE velocity data is accessed in a raster format, and each datacube covers a large swath of terrain that includes both glaciated and non-glaciated land. We want to select just the pixels that cover glaciated surfaces; to do this, we use glacier outlines from the Randolph Glacier Inventory (RGI). The RGI region used in this tutorial is made available as a GeoParquet file in the tutorial repository.
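A rough sketch of that selection step is below. It assumes the datacube dc opened in the sketch above, that rioxarray can determine the datacube’s CRS (if not, set it explicitly with .rio.write_crs()), and a placeholder file name for the GeoParquet outlines.

```python
import geopandas as gpd
import rioxarray  # noqa: F401 -- registers the .rio accessor on xarray objects

# Glacier outlines from the tutorial repository (file name is a placeholder)
rgi = gpd.read_parquet("rgi_outlines.parquet")

# Reproject the outlines to the datacube's CRS, then keep only pixels that
# fall within glacier polygons
rgi_prj = rgi.to_crs(dc.rio.crs)
dc_glaciers = dc.rio.clip(rgi_prj.geometry, rgi_prj.crs, drop=True)
```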

Head to the next page to see how we start accessing and working with this data.

RGI#

The Randolph Glacier Inventory (RGI) is a global, publicly available dataset containing glacier outlines, centerlines, and attribute information that has been compiled over decades from many studies. It is a very valuable resource for glaciology research. Read more about the RGI project and the most recent version, V7, here.

We will read in just one region of the RGI (region 15, SouthAsiaEast). RGI data is distributed in latitude/longitude coordinates, so we will project it to match the coordinate reference system (CRS) of the ITS_LIVE dataset and then select an individual glacier to begin our analysis. ITS_LIVE data is in the Universal Transverse Mercator (UTM) coordinate system, and each datacube is projected to the UTM zone specific to its location. You can read more about these concepts here.
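A hedged sketch of those steps is below; the GeoParquet file name, the UTM EPSG code, and the glacier-name attribute are illustrative assumptions rather than values taken from this tutorial.

```python
import geopandas as gpd

# Read RGI region 15 (SouthAsiaEast) from the GeoParquet file in this repository
# (placeholder file name)
rgi15 = gpd.read_parquet("rgi7_region15_south_asia_east.parquet")

# RGI outlines are distributed in lat/lon (EPSG:4326); reproject them to the UTM
# zone of the ITS_LIVE datacube we'll be working with (EPSG code is an example)
rgi15_utm = rgi15.to_crs("EPSG:32646")

# Select a single glacier to begin the analysis (attribute and value are illustrative)
single_glacier = rgi15_utm[rgi15_utm["glac_name"] == "Khumbu Glacier"]
```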

RGI data is publicly available here; however, for ease of use, we have saved a single region of the dataset in this repository as a GeoParquet file.