Xarray Getting Started

In this tutorial we look at how to use xarray, a Python library for working with labelled multi-dimensional arrays, and in particular with NetCDF files.

Importing the library

To start with we need to import the library.

If xarray is not installed, install it into the current environment either:
  • via a terminal using 'pip install xarray'
  • in a new cell using '!pip install xarray' (the ! executes terminal commands)
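Before installing, it can be handy to check whether the package is already available. The snippet below is a minimal sketch using only the standard library; the helper name is_installed is hypothetical and not part of xarray.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package with this name can be found."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("xarray"):
    print("xarray is available")
else:
    print("xarray is missing; install it with 'pip install xarray'")
```

The check only looks the package up; it does not import it, so it is cheap to run at the top of a notebook.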

The following line imports the xarray library with the alias xr, which means we can access xarray in the notebook using xr.

import xarray as xr

Load some data

To create an xarray.Dataset, the default object returned when opening a file, we simply pass the filename to the open_dataset function.

If the file cannot be read, you can change the engine that is used with the `engine=` keyword argument. In particular, `engine="pydap"` allows accessing files that are hosted online, for example on an OPeNDAP server.
filename = '~/Repos/netcdf_editor_app/data/standard.nc'
ds = xr.open_dataset(filename)
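If the file above is not available, a Dataset with the same layout can be built in memory to follow along. This is only a sketch: the grid is assumed to be a regular half-degree grid (matching the coordinates printed in the next section), the values of Z are placeholder zeros rather than real data, and the name sample is chosen so that ds is not overwritten.

```python
import numpy as np
import xarray as xr

# Cell-centred half-degree grid: 360 latitudes and 720 longitudes.
latitude = np.linspace(-89.75, 89.75, 360)
longitude = np.linspace(-179.75, 179.75, 720)

# Placeholder values; a real file would hold measured or modelled data here.
z = np.zeros((latitude.size, longitude.size))

sample = xr.Dataset(
    data_vars={"Z": (("latitude", "longitude"), z)},
    coords={"latitude": latitude, "longitude": longitude},
)
print(sample)
```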

Investigating the data

The most basic form of investigation is just displaying the dataset object. xarray integrates with Jupyter to show a summary of the Dataset.

ds
<xarray.Dataset>
Dimensions:    (latitude: 360, longitude: 720)
Coordinates:
  * latitude   (latitude) float64 -89.75 -89.25 -88.75 ... 88.75 89.25 89.75
  * longitude  (longitude) float64 -179.8 -179.2 -178.8 ... 178.8 179.2 179.8
Data variables:
    Z          (latitude, longitude) float64 ...

Here we can see that the NetCDF file has two coordinates, latitude and longitude, and one data variable, Z. We can also see that there are no attributes. Attributes can be useful for storing information about how the file was generated or modified, for example.
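The same information can be read programmatically. The sketch below uses a tiny in-memory Dataset (the name demo and its values are made up for illustration) and then adds an attribute, since attributes behave like a plain dictionary.

```python
import numpy as np
import xarray as xr

# Small stand-in dataset with the same variable and coordinate names.
demo = xr.Dataset(
    {"Z": (("latitude", "longitude"), np.zeros((2, 3)))},
    coords={"latitude": [-0.25, 0.25], "longitude": [-0.5, 0.0, 0.5]},
)

print(dict(demo.sizes))      # dimension names and their lengths
print(list(demo.coords))     # coordinate names
print(list(demo.data_vars))  # data variable names
print(demo.attrs)            # empty when no attributes are set

# Record a provenance note as a global attribute.
demo.attrs["history"] = "created in memory for this tutorial"
print(demo.attrs)
```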

Plotting

It is easy to interactively view the NetCDF file.

First we need to add the "bindings" to xarray. To do this we simply import hvplot.xarray, which adds new methods to xarray objects, notably .hvplot().

import hvplot.xarray

Now that everything is set up, we can view our dataset by calling ds.hvplot().

By default the first data variable is plotted.

Notice the toolbar at the side of the plot; its tools can be used to interact with the plot.

ds.hvplot()

Variables are accessible through ds.VARIABLE_NAME or ds['VARIABLE_NAME']. To view a different variable we simply call ds.VARIABLE_NAME.hvplot().
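The two access styles return the same DataArray, which the sketch below checks on a small made-up Dataset (demo and clash are illustrative names). It also shows one case where only key access works: a variable whose name matches an existing Dataset method.

```python
import numpy as np
import xarray as xr

demo = xr.Dataset(
    {"Z": (("latitude", "longitude"), np.arange(6.0).reshape(2, 3))},
    coords={"latitude": [-0.25, 0.25], "longitude": [-0.5, 0.0, 0.5]},
)

by_attribute = demo.Z
by_key = demo["Z"]
same = by_attribute.identical(by_key)  # compares values, names and attributes
print(same)

# A variable called "mean" clashes with the Dataset.mean method, so
# attribute access returns the method and only key access reaches the data.
clash = xr.Dataset({"mean": ("x", [1.0, 2.0, 3.0])})
print(type(clash["mean"]).__name__)
print(callable(clash.mean))
```

Key access is therefore the safer default in scripts, while attribute access is convenient for interactive exploration.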

With holoviews, the "+" sign puts graphs next to each other and the "*" sign overlays graphs.
ds['Z'].hvplot() + ds.Z.hvplot()