Use with Python

Note that MetopDatasets currently focuses on Julia and that Python support is considered secondary. For Python alternatives, see Eugene (documentation) or Satpy (supports many EUMETSAT formats but has only limited support for Metop).

This guide gives a basic example of using MetopDatasets in Python via juliacall. For more information, see the juliacall documentation. In the future we might make a Python wrapper package for MetopDatasets to ease installation and usage.

Installation

The installation steps only need to be run once.

Prerequisites

Julia, Python and pip all need to be installed on the machine. This can be checked with the following bash commands.

python --version
julia --version
pip --version
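
The same check can also be run from Python, which is convenient when the setup is scripted. The following is a minimal sketch using only the standard library; the helper name tool_version is hypothetical and not part of juliacall or MetopDatasets, and it assumes the three tools are expected on PATH under the names python, julia and pip.

```python
import shutil
import subprocess


def tool_version(command):
    """Return the first line of `<command> --version`, or None if the command is not on PATH."""
    executable = shutil.which(command)
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True, check=False
    )
    # Some tools print their version to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""


# Collect the version string (or None) for each prerequisite.
versions = {name: tool_version(name) for name in ("python", "julia", "pip")}
for name, version in versions.items():
    print(f"{name}: {version if version is not None else 'not found on PATH'}")
```

A value of "not found on PATH" means the tool is either not installed or installed under a different name (for example python3).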

This guide was tested with the following versions:

  • Python 3.12.8
  • julia version 1.11.1
  • pip 24.2

Installing Python packages

We use pip to install juliacall and numpy. juliacall is needed to interface with Julia; numpy is only needed to demonstrate compatibility with numpy arrays.

pip install juliacall
pip install numpy

Installing MetopDatasets.jl

Install MetopDatasets.jl via juliacall by running the following Python code.

import juliacall
# Create a separate Julia module for this session
jl = juliacall.newmodule("MetopDatasetsPy") 
jl.seval("import Pkg")
jl.Pkg.add("MetopDatasets")

Example

You are now ready to use MetopDatasets.jl in Python. Below are snippets of Python code showing a simple example.

Loading MetopDatasets in the Python session.

import juliacall
import numpy as np
jl = juliacall.newmodule("MetopDatasetsPy")
jl.seval("using MetopDatasets")

Reading a dataset

A dataset is opened with MetopDataset. Only the metadata is read immediately; the variables are read on demand.

test_file = "/tcenas/home/lupemba/Documents/data/IASI_xxx_1C_M01_20240819103856Z_20240819104152Z_N_C_20240819112911Z"
ds = jl.MetopDataset(test_file, maskingvalue=float('nan'))
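
The maskingvalue=float('nan') argument matters for later numpy calls. Assuming it replaces missing entries with NaN in the arrays that are loaded (this is our reading of the keyword, not something verified here), plain reductions such as np.mean will return NaN as soon as one value is missing. The sketch below uses a small synthetic array as a stand-in for a loaded variable; it is not real Metop data.

```python
import numpy as np

# Synthetic stand-in for a variable loaded with maskingvalue=NaN (not real Metop data).
values = np.array([[1.0, 2.0, np.nan], [4.0, np.nan, 6.0]])

missing = np.isnan(values)
print(missing.sum())       # number of missing entries -> 2
print(np.mean(values))     # nan, because a plain mean propagates NaN
print(np.nanmean(values))  # mean over the valid entries only -> 3.25
```

If missing values are expected in a variable, the NaN-aware numpy functions (np.nanmean, np.nanmin, np.nanmax) are usually the safer choice.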

The dataset has a method equivalent to __repr__, so its structure can be displayed easily. The Julia keys function can be used to list only the variable names.

jl.keys(ds)

The individual variables can also be inspected.

ds["gs1cspect"]

The individual variables can be loaded and used like numpy arrays. The record time is a small variable, so we can load all of it into memory.

record_start_time = jl.Array(ds["record_start_time"])
print("record_start_time")
print(np.shape(record_start_time))
print(np.min(record_start_time))
print(np.max(record_start_time))

It is also possible to load just a slice of a variable. The IASI spectra of an entire orbit take up around 2 GB, but a subset is easily loaded into memory.

spectra_index = 2300
single_channel_slice = ds["gs1cspect"][spectra_index,:,:,0:10]
print("single_channel_slice")
print(np.shape(single_channel_slice))
print(np.mean(single_channel_slice))
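
The same slicing idea can be extended to process a large variable piece by piece instead of loading it at once. The following is a minimal sketch, not part of MetopDatasets: the helper chunked_nanmean is hypothetical, and it is demonstrated on a synthetic numpy array with an arbitrary shape rather than on real IASI data. It only relies on the variable exposing shape and accepting one slice per axis, as in the example above; whether this performs well on the juliacall array wrapper is an assumption that has not been tested here.

```python
import numpy as np


def chunked_nanmean(variable, chunk_size):
    """Mean over all non-NaN values, reading chunk_size entries of the first axis at a time."""
    n_axes = len(variable.shape)
    total = 0.0
    count = 0
    for start in range(0, variable.shape[0], chunk_size):
        # Spell out one slice per axis, mirroring the explicit indexing used above.
        index = (slice(start, start + chunk_size),) + (slice(None),) * (n_axes - 1)
        block = np.asarray(variable[index], dtype=float)
        valid = ~np.isnan(block)
        total += block[valid].sum()
        count += int(valid.sum())
    return total / count if count else float("nan")


# Synthetic stand-in with a 4-D layout; the shape is arbitrary, not the real IASI shape.
rng = np.random.default_rng(0)
fake_spectra = rng.random((50, 4, 6, 8))
fake_spectra[0, 0, 0, 0] = np.nan  # pretend one value is missing

print(chunked_nanmean(fake_spectra, chunk_size=16))
```

Only one chunk is held in memory at a time, so the peak memory use is set by chunk_size rather than by the size of the full variable.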