Region-scale glacier analysis#

The previous notebook demonstrated using xarray to analyze surface velocity data for an individual glacier. This notebook shows how to examine spatial variability in surface velocity across a group of glaciers. To do this we will use xarray as well as geopandas, geocube, and pandas. We will start by using .make_geocube() to rasterize a vector object onto the grid of an ITS_LIVE velocity raster object. We will then use the rasterized vector to group the ITS_LIVE object by individual glacier and calculate summary statistics of surface velocity for each one. The goal of this workflow is a pandas dataframe in which each row is an individual glacier and each column a surface velocity summary statistic.

Learning goals#

Concepts#

  • Querying and accessing raster data from cloud object storage

  • Accessing and manipulating vector data

  • Handling coordinate reference information

  • Calculating and visualizing summary statistics

Techniques#

  • Access cloud-hosted Zarr data cubes using Xarray

  • Reading GeoParquet vector data using GeoPandas

  • Rasterize vector objects using Geocube

  • Spatial joins of vector datasets using GeoPandas

  • Using dask to work with out-of-memory data

  • Calculating summary statistics of Xarray and Pandas objects

  • Data visualization using Pandas

  • Interactive data visualization with GeoPandas

Other useful resources#

These are resources that contain additional examples and discussion of the content in this notebook and more.

Software + Setup#

%xmode minimal
Exception reporting mode: Minimal
import os
import json
import urllib.request
import numpy as np
import xarray as xr
import rioxarray as rxr
import geopandas as gpd
import pandas as pd

import matplotlib.pyplot as plt

from shapely.geometry import Polygon
from shapely.geometry import Point
from geocube.api.core import make_geocube

%config InlineBackend.figure_format='retina'
import itslivetools

Accessing and reading raster data (ITS_LIVE velocity data)#

itslive_catalog = gpd.read_file('https://its-live-data.s3.amazonaws.com/datacubes/catalog_v02.json')
url = itslivetools.find_granule_by_point([95.180191, 30.645973])
url
'http://its-live-data.s3.amazonaws.com/datacubes/v2/N30E090/ITS_LIVE_vel_EPSG32646_G0120_X750000_Y3350000.zarr'
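itslivetools is a small helper module used throughout these notebooks. For context, here is a minimal sketch of what a function like find_granule_by_point() might do, assuming the catalog geometries are in EPSG:4326 and each catalog entry stores the cube location in a zarr_url column (both are assumptions; names here are illustrative, not the actual implementation):

import geopandas as gpd
from shapely.geometry import Point

def find_granule_by_point_sketch(catalog: gpd.GeoDataFrame, point_lonlat) -> str:
    """Return the cube URL for the catalog footprint containing (lon, lat)."""
    pt = Point(point_lonlat)                        # catalog assumed to be in EPSG:4326
    match = catalog[catalog.geometry.contains(pt)]  # footprints containing the point
    return match['zarr_url'].iloc[0]                # 'zarr_url' is a hypothetical column name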
dc = itslivetools.read_in_s3(url)
dc
<xarray.Dataset>
Dimensions:                     (mid_date: 25243, y: 833, x: 833)
Coordinates:
  * mid_date                    (mid_date) datetime64[ns] 2022-06-07T04:21:44...
  * x                           (x) float64 7.001e+05 7.003e+05 ... 8e+05
  * y                           (y) float64 3.4e+06 3.4e+06 ... 3.3e+06 3.3e+06
Data variables: (12/60)
    M11                         (mid_date, y, x) float32 dask.array<chunksize=(25243, 30, 30), meta=np.ndarray>
    M11_dr_to_vr_factor         (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
    M12                         (mid_date, y, x) float32 dask.array<chunksize=(25243, 30, 30), meta=np.ndarray>
    M12_dr_to_vr_factor         (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
    acquisition_date_img1       (mid_date) datetime64[ns] dask.array<chunksize=(25243,), meta=np.ndarray>
    acquisition_date_img2       (mid_date) datetime64[ns] dask.array<chunksize=(25243,), meta=np.ndarray>
    ...                          ...
    vy_error_modeled            (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
    vy_error_slow               (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
    vy_error_stationary         (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
    vy_stable_shift             (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
    vy_stable_shift_slow        (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
    vy_stable_shift_stationary  (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
Attributes: (12/19)
    Conventions:                CF-1.8
    GDAL_AREA_OR_POINT:         Area
    author:                     ITS_LIVE, a NASA MEaSUREs project (its-live.j...
    autoRIFT_parameter_file:    http://its-live-data.s3.amazonaws.com/autorif...
    datacube_software_version:  1.0
    date_created:               25-Sep-2023 22:00:23
    ...                         ...
    s3:                         s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    skipped_granules:           s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    time_standard_img1:         UTC
    time_standard_img2:         UTC
    title:                      ITS_LIVE datacube of image pair velocities
    url:                        https://its-live-data.s3.amazonaws.com/datacu...
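Similarly, a hedged sketch of what read_in_s3() might do; the key detail is opening the Zarr store lazily with dask-backed chunks so that nothing is loaded until we ask for it (the chunking choice here is an assumption):

import xarray as xr

def read_in_s3_sketch(url: str) -> xr.Dataset:
    # engine='zarr' opens the cloud-hosted store; chunks='auto' keeps it lazy (dask arrays)
    return xr.open_dataset(url, engine='zarr', chunks='auto')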

The mid_date dimension of the dc object isn’t in chronological order, so let’s sort by this dimension:

dc = dc.sortby('mid_date')

Create a crs string based on the projection attribute of the data cube (dc) object:

crs = f'EPSG:{dc.projection}'
crs
'EPSG:32646'

Reading vector data (glacier outlines)#

se_asia = gpd.read_parquet('rgi7_region15_south_asia_east.parquet')
se_asia.head(3)
rgi_id o1region o2region glims_id anlys_id subm_id src_date cenlon cenlat utm_zone ... zmin_m zmax_m zmed_m zmean_m slope_deg aspect_deg aspect_sec dem_source lmax_m geometry
0 RGI2000-v7.0-G-15-00001 15 15-01 G078088E31398N 866850 752 2002-07-10T00:00:00 78.087891 31.398046 44 ... 4662.2950 4699.2095 4669.4720 4671.4253 13.427070 122.267290 4 COPDEM30 173 POLYGON Z ((78.08905 31.39784 0.00000, 78.0889...
1 RGI2000-v7.0-G-15-00002 15 15-01 G078125E31399N 867227 752 2002-07-10T00:00:00 78.123699 31.397796 44 ... 4453.3584 4705.9920 4570.9473 4571.2770 22.822983 269.669144 7 COPDEM30 1113 POLYGON Z ((78.12556 31.40257 0.00000, 78.1255...
2 RGI2000-v7.0-G-15-00003 15 15-01 G078128E31390N 867273 752 2000-08-05T00:00:00 78.128510 31.390287 44 ... 4791.7593 4858.6807 4832.1836 4827.6700 15.626262 212.719681 6 COPDEM30 327 POLYGON Z ((78.12960 31.39093 0.00000, 78.1296...

3 rows × 29 columns

How many glaciers are in this dataframe?

se_asia['rgi_id'].nunique()
18587

What coordinate reference system is this dataframe in?

se_asia.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

The vector dataset is in WGS 84, meaning that its coordinates are in degrees latitude and longitude rather than meters N and E. We will project this dataset to match the projection of the raster dataset.

Handling projections#

Let’s project this dataframe to match the CRS of the ITS_LIVE dataset:

#project RGI data to match ITS_LIVE
se_asia_prj = se_asia.to_crs(crs) #we know the EPSG code from the projection attribute of the dc object
se_asia_prj.head(3)
rgi_id o1region o2region glims_id anlys_id subm_id src_date cenlon cenlat utm_zone ... zmin_m zmax_m zmed_m zmean_m slope_deg aspect_deg aspect_sec dem_source lmax_m geometry
0 RGI2000-v7.0-G-15-00001 15 15-01 G078088E31398N 866850 752 2002-07-10T00:00:00 78.087891 31.398046 44 ... 4662.2950 4699.2095 4669.4720 4671.4253 13.427070 122.267290 4 COPDEM30 173 POLYGON Z ((-924868.476 3571663.111 0.000, -92...
1 RGI2000-v7.0-G-15-00002 15 15-01 G078125E31399N 867227 752 2002-07-10T00:00:00 78.123699 31.397796 44 ... 4453.3584 4705.9920 4570.9473 4571.2770 22.822983 269.669144 7 COPDEM30 1113 POLYGON Z ((-921270.161 3571706.471 0.000, -92...
2 RGI2000-v7.0-G-15-00003 15 15-01 G078128E31390N 867273 752 2000-08-05T00:00:00 78.128510 31.390287 44 ... 4791.7593 4858.6807 4832.1836 4827.6700 15.626262 212.719681 6 COPDEM30 327 POLYGON Z ((-921061.745 3570342.665 0.000, -92...

3 rows × 29 columns

Preview the numeric portion of each rgi_id by slicing off the first 18 characters of the string:

se_asia_prj['rgi_id'].str.slice(18)
0        00001
1        00002
2        00003
3        00004
4        00005
         ...  
18582    18583
18583    18584
18584    18585
18585    18586
18586    18587
Name: rgi_id, Length: 18587, dtype: object

Give each glacier (row) a unique integer key derived from that glacier’s RGI ID; we will use this later. Make sure the RGI_int column is composed of integers, not strings.

se_asia_prj['RGI_int'] = se_asia_prj['rgi_id'].str.slice(18,).astype(int)
se_asia_prj.RGI_int.dtype
dtype('int64')

To start, we will look only at glaciers with an area larger than 5 km². Subset the dataframe to select those glaciers:

se_asia_prj = se_asia_prj.loc[se_asia_prj['area_km2'] > 5.]
se_asia_prj.head()
rgi_id o1region o2region glims_id anlys_id subm_id src_date cenlon cenlat utm_zone ... zmax_m zmed_m zmean_m slope_deg aspect_deg aspect_sec dem_source lmax_m geometry RGI_int
59 RGI2000-v7.0-G-15-00060 15 15-01 G078577E31191N 871719 752 2000-08-05T00:00:00 78.553292 31.174611 44 ... 6043.9136 5224.0610 5141.7990 17.677896 249.340500 7 COPDEM90 13928 POLYGON Z ((-880436.185 3544730.055 0.000, -88... 60
66 RGI2000-v7.0-G-15-00067 15 15-01 G078523E31112N 871438 752 2000-08-05T00:00:00 78.521258 31.116022 44 ... 5788.4610 4325.8790 4543.5195 18.882126 351.197868 1 COPDEM30 7044 POLYGON Z ((-888722.833 3534458.887 0.000, -88... 67
157 RGI2000-v7.0-G-15-00158 15 15-01 G078843E31150N 874200 752 2002-07-10T00:00:00 78.843177 31.150501 44 ... 5514.8880 5215.9690 5198.8105 12.894007 86.556990 3 COPDEM30 4981 POLYGON Z ((-856766.704 3533377.312 0.000, -85... 158
209 RGI2000-v7.0-G-15-00210 15 15-01 G078779E31218N 873926 752 2002-07-10T00:00:00 78.814979 31.218553 44 ... 6041.6123 5158.5570 5126.2070 12.179309 41.205524 2 COPDEM90 18374 POLYGON Z ((-869202.097 3540392.546 0.000, -86... 210
229 RGI2000-v7.0-G-15-00230 15 15-01 G078544E31029N 871696 752 2000-08-05T00:00:00 78.550250 31.038592 44 ... 6254.6704 5249.8203 5189.1060 20.423279 355.937175 1 COPDEM90 10797 POLYGON Z ((-884033.521 3525419.927 0.000, -88... 230

5 rows × 30 columns

Next, we want to subset the RGI dataset by the spatial extent of the ITS_LIVE data. First, get the bounding box of the ITS_LIVE data as a vector object:

dc_bbox = itslivetools.get_bounds_polygon(dc)
dc_bbox.crs
<Projected CRS: EPSG:32646>
Name: WGS 84 / UTM zone 46N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Between 90°E and 96°E, northern hemisphere between equator and 84°N, onshore and offshore. Bangladesh. Bhutan. China. Indonesia. Mongolia. Myanmar (Burma). Russian Federation.
- bounds: (90.0, 0.0, 96.0, 84.0)
Coordinate Operation:
- name: UTM zone 46N
- method: Transverse Mercator
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
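As with the other helpers, a minimal sketch of what get_bounds_polygon() might do: build a rectangle from the cube's x/y coordinate extents and wrap it in a GeoDataFrame carrying the cube's CRS (an illustration under those assumptions, not the actual implementation):

import geopandas as gpd
import xarray as xr
from shapely.geometry import box

def get_bounds_polygon_sketch(ds: xr.Dataset) -> gpd.GeoDataFrame:
    # rectangle spanning the full x/y extent of the data cube
    footprint = box(float(ds.x.min()), float(ds.y.min()),
                    float(ds.x.max()), float(ds.y.max()))
    return gpd.GeoDataFrame(geometry=[footprint], crs=f'EPSG:{ds.projection}')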

Rasterize vector objects#

Subset the RGI dataset:
To do this we will use a spatial join. Here we use an inner join, but there are various ways to customize the spatial join operation; see the GeoPandas documentation on spatial joins for more options.

rgi_sub = gpd.sjoin(se_asia_prj, dc_bbox, how='inner')
rgi_sub.head()
rgi_id o1region o2region glims_id anlys_id subm_id src_date cenlon cenlat utm_zone ... zmed_m zmean_m slope_deg aspect_deg aspect_sec dem_source lmax_m geometry RGI_int index_right
11630 RGI2000-v7.0-G-15-11631 15 15-03 G095124E30309N 927529 752 2003-07-24T00:00:00 95.106696 30.299816 46 ... 4881.6580 4866.093 21.731916 45.774250 2 COPDEM90 19653 POLYGON Z ((698995.535 3356459.046 0.000, 6990... 11631 0
11631 RGI2000-v7.0-G-15-11632 15 15-03 G095138E30348N 927618 752 2002-07-10T00:00:00 95.137897 30.347701 46 ... 5107.2480 5112.531 19.250132 212.500549 6 COPDEM30 3968 POLYGON Z ((705772.753 3360152.416 0.000, 7057... 11632 0
11642 RGI2000-v7.0-G-15-11643 15 15-03 G095210E30329N 927917 752 1997-08-24T00:00:00 95.209673 30.329234 46 ... 5023.0493 4969.326 23.261293 62.829437 2 COPDEM30 5108 POLYGON Z ((713890.435 3356601.497 0.000, 7138... 11643 0
11672 RGI2000-v7.0-G-15-11673 15 15-03 G095324E30363N 928296 752 2007-08-20T00:00:00 95.323927 30.363373 46 ... 5020.7920 5009.677 20.035536 43.020733 2 COPDEM30 7190 POLYGON Z ((720536.806 3361488.071 0.000, 7205... 11673 0
11735 RGI2000-v7.0-G-15-11736 15 15-03 G095203E30636N 927889 752 2002-07-10T00:00:00 95.203044 30.636060 46 ... 5383.6210 5355.023 14.739693 28.799190 2 COPDEM30 4540 POLYGON Z ((712706.746 3390103.661 0.000, 7126... 11736 0

5 rows × 31 columns
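By default, gpd.sjoin() matches geometries with the intersects predicate. If, for example, you wanted only glaciers that fall entirely inside the cube footprint, you could pass a different predicate (a sketch of one option):

# keep only glaciers whose outlines lie completely within the cube footprint
rgi_within = gpd.sjoin(se_asia_prj, dc_bbox, how='inner', predicate='within')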

Write the CRS of the dc object. Even though it is stored in the projection attribute, we want the CRS stored as a rioxarray attribute so that it can be used in rioxarray methods:

dc = dc.rio.write_crs(crs)
rgi_sub = rgi_sub.drop('index_right', axis=1)
len(rgi_sub) # number of glaciers in the subset
28
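A quick sanity check before rasterizing: plot the subsetted outlines on top of the cube footprint (both are now in EPSG:32646, so they overlay directly):

fig, ax = plt.subplots()
dc_bbox.plot(ax=ax, facecolor='none', edgecolor='k')  # cube footprint
rgi_sub.plot(ax=ax, color='tab:blue')                 # glaciers that intersect it
ax.set_title('RGI outlines within the ITS_LIVE cube footprint');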

Now, use the make_geocube() function. It takes a vector object (rgi_sub) and rasterizes it, returning an xarray object on the same grid as the object you pass to the like= argument (in our case, dc). This example draws heavily on the zonal statistics example in the geocube documentation, which contains additional helpful details.

dc
<xarray.Dataset>
Dimensions:                     (mid_date: 25243, y: 833, x: 833)
Coordinates:
    mapping                     int64 0
  * mid_date                    (mid_date) datetime64[ns] 1986-09-11T03:31:15...
  * x                           (x) float64 7.001e+05 7.003e+05 ... 8e+05
  * y                           (y) float64 3.4e+06 3.4e+06 ... 3.3e+06 3.3e+06
Data variables: (12/59)
    M11                         (mid_date, y, x) float32 dask.array<chunksize=(25243, 30, 30), meta=np.ndarray>
    M11_dr_to_vr_factor         (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
    M12                         (mid_date, y, x) float32 dask.array<chunksize=(25243, 30, 30), meta=np.ndarray>
    M12_dr_to_vr_factor         (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
    acquisition_date_img1       (mid_date) datetime64[ns] dask.array<chunksize=(25243,), meta=np.ndarray>
    acquisition_date_img2       (mid_date) datetime64[ns] dask.array<chunksize=(25243,), meta=np.ndarray>
    ...                          ...
    vy_error_modeled            (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
    vy_error_slow               (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
    vy_error_stationary         (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
    vy_stable_shift             (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
    vy_stable_shift_slow        (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
    vy_stable_shift_stationary  (mid_date) float32 dask.array<chunksize=(25243,), meta=np.ndarray>
Attributes: (12/19)
    Conventions:                CF-1.8
    GDAL_AREA_OR_POINT:         Area
    author:                     ITS_LIVE, a NASA MEaSUREs project (its-live.j...
    autoRIFT_parameter_file:    http://its-live-data.s3.amazonaws.com/autorif...
    datacube_software_version:  1.0
    date_created:               25-Sep-2023 22:00:23
    ...                         ...
    s3:                         s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    skipped_granules:           s3://its-live-data/datacubes/v2/N30E090/ITS_L...
    time_standard_img1:         UTC
    time_standard_img2:         UTC
    title:                      ITS_LIVE datacube of image pair velocities
    url:                        https://its-live-data.s3.amazonaws.com/datacu...
out_grid = make_geocube(
    vector_data = rgi_sub,
    measurements = ["RGI_int"],
    like = dc
)
out_grid
<xarray.Dataset>
Dimensions:      (y: 833, x: 833)
Coordinates:
  * y            (y) float64 3.4e+06 3.4e+06 3.4e+06 ... 3.3e+06 3.3e+06 3.3e+06
  * x            (x) float64 7.001e+05 7.003e+05 7.004e+05 ... 7.999e+05 8e+05
    spatial_ref  int64 0
Data variables:
    RGI_int      (y, x) float64 nan nan nan nan nan nan ... nan nan nan nan nan

Now each glacier in the geodataframe rgi_sub has been coded with a unique integer value that corresponds to that glacier’s Randolph Glacier Inventory ID.

out_grid.RGI_int.plot();
[Figure: map of the rasterized RGI_int variable; each glacier appears as a distinct integer value]

Before moving forward, we will take a temporal subset of the full dataset to make it a bit easier to work with. We will also compute the mean along the time dimension and calculate the magnitude of velocity from the velocity component variables. Afterwards, we will merge the rasterized vector and the dataset containing the velocity data into a single xarray dataset:

dc_sub = dc.isel(mid_date=slice(400,500))
dc_sub_2d = dc_sub.mean(dim='mid_date')
dc_sub_2d['v_mag'] = np.sqrt(dc_sub_2d.vx**2+dc_sub_2d.vy**2)

Combining data#

Note

The following cell is very computationally intensive. It is executed here for the sake of demonstration, but if you are running this code yourself, be aware that it may fail or take a very long time to run. Consider taking a spatial subset of the dataset.
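For example, a spatial subset of both objects could look like this (the 200 × 200 pixel window is arbitrary, purely illustrative):

dc_small = dc_sub_2d.isel(x=slice(0, 200), y=slice(0, 200))
out_grid_small = out_grid.isel(x=slice(0, 200), y=slice(0, 200))
out_grid_small['v'] = (dc_small.v_mag.dims, dc_small.v_mag.values,
                       dc_small.v_mag.attrs, dc_small.v_mag.encoding)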

# accessing .values here triggers computation of the dask-backed array
out_grid['v'] = (dc_sub_2d.v_mag.dims, dc_sub_2d.v_mag.values, dc_sub_2d.v_mag.attrs, dc_sub_2d.v_mag.encoding)

Assign RGI_int as a coordinate variable of the xarray.Dataset rather than a data variable.

out_grid = out_grid.assign_coords({'RGI_int':out_grid.RGI_int})

Grouping by RGI ID#

Now, we will use .groupby() to group the dataset by the RGI ID:

grouped_ID = out_grid.groupby('RGI_int')
<__array_function__ internals>:200: RuntimeWarning: invalid value encountered in cast

Calculating summary statistics#

Compute summary statistics for a single variable on the grouped object:

grid_mean_sp = grouped_ID.mean(dim=...).rename({'v': 'speed_mean'})
grid_median_sp = grouped_ID.median(dim=...).rename({'v': 'speed_median'})
grid_max_sp = grouped_ID.max(dim=...).rename({'v': 'speed_max'})
grid_min_sp = grouped_ID.min(dim=...).rename({'v': 'speed_min'})
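As an aside, the four calls above follow the same pattern, so they could equally be built in a loop (an equivalent sketch, not required):

# same result, built programmatically
stats = [getattr(grouped_ID, stat)(dim=...).rename({'v': f'speed_{stat}'})
         for stat in ('mean', 'median', 'max', 'min')]
grid_mean_sp, grid_median_sp, grid_max_sp, grid_min_sp = stats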
  

Check that the RGI_int coordinates of the four grouped results are identical to one another; this must be the case for xr.merge() in the next step:

print(grid_mean_sp.RGI_int.equals(grid_median_sp.RGI_int))
print(grid_mean_sp.RGI_int.equals(grid_max_sp.RGI_int))
print(grid_mean_sp.RGI_int.equals(grid_min_sp.RGI_int))
True
True
True

Transitioning from ‘lazy’ operations to in-memory#

Merge the four objects and convert the lazy result to an in-memory equivalent using dask’s .compute(). Once it is in memory, we can convert it to a pandas dataframe.

zonal_stats = xr.merge([grid_mean_sp, grid_median_sp, grid_max_sp, grid_min_sp]).compute()
zonal_stats_df = zonal_stats.to_dataframe()
zonal_stats_df = zonal_stats_df.reset_index()
zonal_stats_df = zonal_stats_df.drop(['spatial_ref'], axis=1)
zonal_stats_df
RGI_int speed_mean speed_median speed_max speed_min
0 11631.0 43.229496 22.201803 357.174896 0.197642
1 11632.0 12.765533 11.598662 43.169296 0.258909
2 11643.0 18.185356 15.985531 87.176575 0.078567
3 11673.0 23.413357 17.378513 135.973846 0.225877
4 11736.0 18.280186 14.794738 117.458504 0.760345
5 11740.0 17.366793 15.020761 65.095032 1.400108
6 11746.0 17.681360 15.402331 65.298370 1.013794
7 11754.0 25.471052 19.809713 186.470001 0.552978
8 11759.0 19.896990 17.988567 91.692764 0.567878
9 11766.0 23.918291 20.536091 114.969078 1.363080
10 11895.0 12.812520 11.758823 53.096165 0.147059
11 11985.0 16.343838 14.805200 56.622963 0.124226
12 11994.0 22.481195 20.629883 61.390446 0.864832
13 11995.0 23.926825 17.536407 167.414978 0.400000
14 12015.0 18.282509 16.707451 66.150848 0.555556
15 12194.0 34.716423 27.442644 201.819031 0.903508
16 15962.0 27.554832 24.667400 163.135681 0.370370
17 15966.0 37.678925 23.696589 244.371765 0.581265
18 16033.0 24.644690 21.771954 94.401909 1.104536
19 16035.0 40.094566 31.823326 253.023712 1.788854
20 16036.0 32.543732 27.080856 122.827522 1.431782
21 16076.0 16.480865 14.312456 76.183197 0.859005
22 16108.0 21.932232 18.748005 95.795822 1.232693
23 16116.0 22.480335 18.999691 103.565865 0.500000
24 16255.0 27.704807 22.949593 110.771164 0.760559
25 16257.0 30.635042 24.937700 121.708672 0.840708
26 16408.0 29.208967 27.189728 85.190781 0.773067
27 16461.0 25.810692 24.004101 88.890732 1.607980
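Even before joining back to the glacier outlines, this dataframe can answer simple questions, for example which glaciers have the highest mean speeds:

# top five glaciers by mean speed
zonal_stats_df.sort_values('speed_mean', ascending=False).head()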

Now, make a new object (rgi_itslive) by merging the zonal statistics dataframe onto the geopandas.GeoDataFrame containing the RGI glacier outlines.
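One caveat: after to_dataframe(), the RGI_int key is float64 (note the 11631.0 values in the table above), while rgi_sub.RGI_int is int64. To keep the merge keys consistent, it doesn't hurt to cast the float key back to int first:

# align key dtypes before merging (harmless if they already match)
zonal_stats_df['RGI_int'] = zonal_stats_df['RGI_int'].astype(int)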

rgi_itslive = pd.merge(rgi_sub, zonal_stats_df, how='inner', on='RGI_int')
rgi_itslive['rgi_id'].nunique()
28

Data visualization#

Now we can start to visualize the prepared dataset:

Pandas plotting#

fig, ax = plt.subplots()
sc = rgi_itslive.plot.scatter(x='speed_median',y = 'speed_mean', c = 'area_km2', ax=ax)
[Figure: scatter plot of speed_median vs. speed_mean, colored by area_km2]

Now we have a plot, but there is still more information we’d like to include; for example, the labels could show units. Changing the x-axis and y-axis labels of the pandas dataframe plot is pretty simple:

fig, ax = plt.subplots()
sc = rgi_itslive.plot.scatter(x='speed_median', y='speed_mean', c='area_km2', ax=ax)

fig.suptitle('Comparing glacier area, median velocity and mean velocity of 28 glaciers');
ax.set_ylabel('Mean magnitude of velocity (m/y)')
ax.set_xlabel('Median magnitude of velocity (m/y)');
[Figure: the same scatter plot with a title and axis labels that include units (m/y)]

But this could still be improved. The colorbar that describes the area variable in this plot would be more descriptive if it included units. Here is one way of editing the colorbar label:

#see what axes are in your fig object
fig.get_axes()
[<Axes: xlabel='Median magnitude of velocity (m/y)', ylabel='Mean magnitude of velocity (m/y)'>,
 <Axes: label='<colorbar>', ylabel='area_km2'>]
#extract the one we are interested in (colorbar)
cax = fig.get_axes()[1]
cax
<Axes: label='<colorbar>', ylabel='area_km2'>
#modify colorbar label
cax.set_ylabel('Area (square kilometer)')
Text(1102.9030555555557, 0.5, 'Area (square kilometer)')
#all together
fig, ax = plt.subplots()
sc = rgi_itslive.plot.scatter(x='speed_median',y = 'speed_mean', c = 'area_km2', ax=ax)#.legend({'label':'test'})

fig.suptitle('Comparing glacier area, median velocity and mean velocity of 28 glaciers');
ax.set_ylabel('Mean magnitude of velocity (m/y)')
ax.set_xlabel('Median magnitude of velocity (m/y)');

cax = fig.get_axes()[1]
cax.set_ylabel('Area (square kilometer)');
[Figure: the scatter plot with title, axis labels, and the relabeled colorbar]

Geopandas plotting#

In these plots, we are using the geometry information stored in the geopandas.GeoDataFrame.

Static plots#

Specify the column you’d like to plot. Set legend=True to add the colorbar, and pass legend_kwds as a dictionary to change the label and set other characteristics of the legend object.

fig, ax = plt.subplots()
rgi_itslive.plot(ax=ax, column='speed_mean', legend=True,     
                 legend_kwds={"label": "mean velocity (m/y)", "orientation": "vertical"},);
[Figure: map of glacier outlines colored by speed_mean, with a vertical colorbar labeled ‘mean velocity (m/y)’]

Interactive plots#

The .explore() method returns an interactive map of a geopandas.GeoDataFrame - cool!

rgi_itslive.explore()
[Interactive map of the rgi_itslive GeoDataFrame]

We can further customize the interactive plot:

rgi_itslive.explore('speed_median', cmap='inferno',
                    style_kwds={'fillOpacity': 0.7,  # fillOpacity sets the opacity of the polygons
                                'stroke': False})    # stroke=False removes the border around each polygon
[Interactive map of the glaciers colored by speed_median]
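.explore() returns a folium.Map object, so the interactive map can also be saved to a standalone HTML file (the filename here is arbitrary):

# save the interactive map; open the file in any web browser
m = rgi_itslive.explore('speed_median', cmap='inferno')
m.save('glacier_speeds.html')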

Awesome! We have encoded vector data onto a raster object, which lets us examine raster data for multiple spatial units simultaneously, and we have explored a number of ways to visualize this data.