Skip to content

quantifyearth/yirgacheffe

Repository files navigation

Yirgacheffe: a declarative geospatial library for Python to make data-science with maps easier

CI Documentation PyPI version

Overview

Yirgacheffe is a declarative geospatial library, allowing you to operate on both raster and polygon geospatial datasets without having to do all the tedious book keeping around layer alignment or dealing with hardware concerns around memory or parallelism. you can load into memory safely.

Example common use-cases:

  • Do the datasets overlap? Yirgacheffe will let you define either the intersection or the union of a set of different datasets, scaling up or down the area as required.
  • Rasterisation of vector layers: if you have a vector dataset then you can add that to your computation and yirgaceffe will rasterize it on demand, so you never need to store more data in memory than necessary.
  • Do the raster layers get big and take up large amounts of memory? Yirgacheffe will let you do simple numerical operations with layers directly and then worry about the memory management behind the scenes for you.
  • Parallelisation of operations over many CPU cores.
  • Built in support for optionally using GPUs via MLX support.

Installation

Yirgacheffe is available via pypi, so can be installed with pip for example:

$ pip install yirgacheffe

Documentation

The documentation can be found on yirgacheffe.org

Simple examples:

Here is how to do cloud removal from Sentinel-2 data, using the Scene Classification Layer data:

import yirgaceffe as yg

with (
  yg.read_raster("T37NCG_20250909T073609_B06_20m.jp2") as vre2,
  yg.read_raster("T37NCG_20250909T073609_SCL_20m.jp2") as scl,
):
  is_cloud = (scl == 8) | (scl == 9) | (scl == 10)  # various cloud types
  is_shadow = (scl == 3)
  is_bad = is_cloud | is_shadow

  masked_vre2 = yg.where(is_bad, float("nan"), vre2)
  masked_vre2.to_geotiff("vre2_cleaned.tif")

or a species' Area of Habitat calculation:

import yirgaceffe as yg

with (
    yg.read_raster("habitats.tif") as habitat_map,
    yg.read_raster('elevation.tif') as elevation_map,
    yg.read_shape('species123.geojson') as range_map,
):
    refined_habitat = habitat_map.isin([...species habitat codes...])
    refined_elevation = (elevation_map >= species_min) & (elevation_map <= species_max)
    aoh = refined_habitat * refined_elevation * range_polygon * area_per_pixel_map
    print(f'Area of habitat: {aoh.sum()}')

Citation

If you use Yirgacheffe in your research, please cite our paper:

Michael Winston Dales, Alison Eyres, Patrick Ferris, Francesca A. Ridley, Simon Tarr, and Anil Madhavapeddy. 2025. Yirgacheffe: A Declarative Approach to Geospatial Data. In Proceedings of the 2nd ACM SIGPLAN International Workshop on Programming for the Planet (PROPL '25). Association for Computing Machinery, New York, NY, USA, 47–54. https://doi.org/10.1145/3759536.3763806

BibTeX
@inproceedings{10.1145/3759536.3763806,
  author = {Dales, Michael Winston and Eyres, Alison and Ferris, Patrick and Ridley, Francesca A. and Tarr, Simon and Madhavapeddy, Anil},
  title = {Yirgacheffe: A Declarative Approach to Geospatial Data},
  year = {2025},
  isbn = {9798400721618},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3759536.3763806},
  doi = {10.1145/3759536.3763806},
  abstract = {We present Yirgacheffe, a declarative geospatial library that allows spatial algorithms to be implemented concisely, supports parallel execution, and avoids common errors by automatically handling data (large geospatial rasters) and resources (cores, memory, GPUs). Our primary user domain comprises ecologists, where a typical problem involves cleaning messy occurrence data, overlaying it over tiled rasters, combining layers, and deriving actionable insights from the results. We describe the successes of this approach towards driving key pipelines related to global biodiversity and describe the capability gaps that remain, hoping to motivate more research into geospatial domain-specific languages.},
  booktitle = {Proceedings of the 2nd ACM SIGPLAN International Workshop on Programming for the Planet},
  pages = {47–54},
  numpages = {8},
  keywords = {Biodiversity, Declarative, Geospatial, Python},
  location = {Singapore, Singapore},
  series = {PROPL '25}
}

Thanks

Thanks to discussion and feedback from my colleagues, particularly Alison Eyres, Patrick Ferris, Amelia Holcomb, and Anil Madhavapeddy.

Inspired by the work of Daniele Baisero in his AoH library.

About

A declarative geospatial library for Python to make data-science with maps easier

Topics

Resources

License

Code of conduct

Stars

Watchers

Forks

Packages

No packages published

Contributors 3

  •  
  •  
  •