It is still a young project, but it has already proven extremely useful to me on a number of occasions, and I think it's the perfect illustration of how one can build clever and useful extensions on top of the very powerful xarray data structures. I hope to be able to contribute to this project, and this post is intended as a shoutout to, and an acknowledgment of, Fabien Maussion's work.
If you work in climate / ocean science and are already using xarray (as you should!), Salem is definitely worth a try.
This notebook contains a brief overview of 3 convenient packages implementing wavelet analysis in Python:
We will try to reproduce the examples found in:
from Christopher Torrence and Gil P. Compo
which use the NINO3.4 seasonal time series. (The NINO3.4 index is calculated as the regional average of Sea Surface Temperature (SST) anomalies in the central-eastern Pacific [5°N to 5°S, 170°W to 120°W] and is one of the most widely used indices for tracking the El Niño-Southern Oscillation phenomenon.)
See also the Interactive Wavelet page
We will also look at an example of rectifying the bias that exists in favor of large scales.
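The regional average behind the NINO3.4 index described above can be sketched in a few lines of NumPy. This is only a minimal illustration on synthetic data: the function name `nino34_index`, the 1-degree grid, the 0-360°E longitude convention and the cos(latitude) area weighting are all assumptions made for the example, not details taken from the notebook.

```python
import numpy as np


def nino34_index(sst_anom, lat, lon):
    """Area-weighted mean of SST anomalies over the NINO3.4 box.

    sst_anom : array of shape (time, lat, lon)
    lat      : latitudes in degrees north
    lon      : longitudes in degrees east (0-360 convention, an assumption)
    """
    # NINO3.4 box: 5°S-5°N, 170°W-120°W, i.e. 190°E-240°E
    lat_mask = (lat >= -5) & (lat <= 5)
    lon_mask = (lon >= 190) & (lon <= 240)
    box = sst_anom[:, lat_mask, :][:, :, lon_mask]

    # Grid cells shrink towards the poles, so weight each latitude by cos(lat)
    weights = np.cos(np.deg2rad(lat[lat_mask]))
    zonal_mean = box.mean(axis=2)                      # shape (time, n_lat_in_box)
    return (zonal_mean * weights).sum(axis=1) / weights.sum()


# Synthetic anomalies on a hypothetical 1-degree grid (random values, not real SST)
rng = np.random.default_rng(0)
lat = np.arange(-29.5, 30.0, 1.0)
lon = np.arange(120.5, 290.0, 1.0)
sst_anom = rng.normal(size=(24, lat.size, lon.size))

nino34 = nino34_index(sst_anom, lat, lon)              # one value per time step
```

With real data, the anomalies would first be computed by removing a monthly climatology from the SST field; that step is left out here.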
In this notebook I give a very simple (and largely uncommented) example of how to use scikit-learn to perform an Empirical Orthogonal Function decomposition (EOF analysis, also often referred to as Principal Component Analysis, or PCA) of a climate field, in this case the monthly Sea Surface Temperature (SST) anomalies in the Pacific.
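To give a rough idea of what such a decomposition looks like, here is a minimal sketch using scikit-learn's `PCA` on a synthetic field. The grid size, the random data and the variable names are assumptions made for illustration; this is not the notebook's actual code.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic "anomaly" field with dimensions (time, lat, lon); random values only
rng = np.random.default_rng(42)
n_time, n_lat, n_lon = 120, 10, 20
field = rng.normal(size=(n_time, n_lat, n_lon))

# PCA expects a 2D (samples, features) matrix: time steps are the samples,
# grid points are the features
X = field.reshape(n_time, n_lat * n_lon)

pca = PCA(n_components=3)
pcs = pca.fit_transform(X)                         # principal components, shape (time, 3)
eofs = pca.components_.reshape(3, n_lat, n_lon)    # spatial patterns (EOFs)
var_frac = pca.explained_variance_ratio_           # fraction of variance per mode
```

A real SST field would need extra care before this step, for instance dropping land points (missing values) and possibly weighting grid points by latitude.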
xray is a Python package for defining and manipulating N-dimensional labelled arrays. In a nutshell, whenever you've got data that is defined over more than 2 dimensions, and a label (e.g. a latitude, a longitude, a timestamp, a depth, etc.) can be associated with each point along those dimensions, then you definitely need to have a look at xray.
If this data model reminds you of the data structures introduced by the widely used pandas library, this is not a coincidence. The xray documentation itself states: "xray is an open source project and Python package that aims to bring the labeled data power of pandas to the physical sciences, by providing N-dimensional variants of the core pandas data structures."
In this post I will give a few reasons why I think that xray is destined to become a core Python package for people working with multi-dimensional arrays, especially (but not only) in the geosciences, before illustrating the power of xray with a few examples of how it considerably simplifies common operations on climate datasets.
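As a small taste of the labelled-array idea, here is a sketch written against the package's current name, xarray (xray was later renamed). The coordinates and values are synthetic and chosen only for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical coordinates: 12 monthly time steps on a coarse lat/lon grid
time = pd.date_range("2000-01-01", periods=12, freq="MS")
lat = np.arange(-10.0, 11.0, 5.0)      # -10, -5, 0, 5, 10
lon = np.arange(160.0, 201.0, 10.0)    # 160, 170, ..., 200

# Random values standing in for a climate field
data = np.random.default_rng(1).normal(size=(time.size, lat.size, lon.size))

sst = xr.DataArray(
    data,
    coords={"time": time, "lat": lat, "lon": lon},
    dims=("time", "lat", "lon"),
    name="sst",
)

# Select by label rather than by integer position
point = sst.sel(lat=0.0, lon=180.0)                        # time series at one grid point
first_half = sst.sel(time=slice("2000-01", "2000-06"))     # January to June 2000

# Reduce along a named dimension instead of a numbered axis
time_mean = sst.mean(dim="time")
```

Compared with plain NumPy indexing, nothing here depends on remembering which axis is which or where a given latitude sits in the array.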
A - totally incomplete - list of resources I have come across on Python and Python for data analysis and visualization, loosely organized by category:
In this blog, I will try and share what I have learned along the way, and give full examples of how I use Python in my research or operational workflows.
I will occasionally reflect on open science, and on how open source in general, and Python in particular, can be used to help make climate science more open.