esgpull is a tool that simplifies usage of the ESGF Search API for data discovery, and manages procedures related to downloading and storing files from ESGF.
from esgpull import Esgpull, Query

# Build a query that selects the CMIP6 project
query = Query()
query.selection.project = "CMIP6"
query.options.distrib = True  # default=False

esg = Esgpull()
# Count matching datasets and files, then fetch the first 5 datasets
nb_datasets = esg.context.hits(query, file=False)[0]
nb_files = esg.context.hits(query, file=True)[0]
datasets = esg.context.datasets(query, max_hits=5)

print(f"Number of CMIP6 datasets: {nb_datasets}")
print(f"Number of CMIP6 files: {nb_files}")
for dataset in datasets:
    print(dataset)
- Command-line interface
- HTTP download (async multi-file)
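
To give a rough idea of what "async multi-file" download means, here is a minimal sketch of fetching several files concurrently with a cap on simultaneous transfers. It is not esgpull's implementation: `fake_fetch`, `download_all`, `max_concurrent` and the `example.org` URLs are all hypothetical, and the fetch step is simulated so that the example runs without network access.

```python
import asyncio


async def fake_fetch(url: str) -> bytes:
    """Hypothetical stand-in for an HTTP GET; returns placeholder bytes."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"contents of {url}".encode()


async def download_all(urls, fetch=fake_fetch, max_concurrent=3):
    """Fetch every URL concurrently, at most `max_concurrent` at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(url):
        # The semaphore limits how many fetches are in flight at once
        async with semaphore:
            return url, await fetch(url)

    results = await asyncio.gather(*(bounded(u) for u in urls))
    return dict(results)


if __name__ == "__main__":
    # Placeholder URLs, assumed for illustration only
    urls = [f"https://example.org/file_{i}.nc" for i in range(5)]
    files = asyncio.run(download_all(urls))
    for url, data in files.items():
        print(url, len(data), "bytes")
```

In a real setting `fake_fetch` would be replaced by a coroutine built on an async HTTP client; the semaphore keeps the number of open connections bounded regardless of how many files are queued.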
esgpull is distributed via PyPI:
pip install esgpull
esgpull --help
For isolated installation, uv or pipx are recommended:
# with uv
uv tool install esgpull
esgpull --help
# alternatively, uvx enables running without explicit installation (comes with uv)
uvx esgpull --help
# with pipx
pipx install esgpull
esgpull --help
Usage: esgpull [OPTIONS] COMMAND [ARGS]...
esgpull is a management utility for files and datasets from ESGF.
Options:
-V, --version Show the version and exit.
-h, --help Show this message and exit.
Commands:
add Add queries to the database
config View/modify config
convert Convert synda selection files to esgpull queries
download Asynchronously download files linked to queries
login OpenID authentication and certificates renewal
remove Remove queries from the database
retry Re-queue failed and cancelled downloads
search Search datasets and files on ESGF
self Manage esgpull installations / import synda database
show View query tree
status View file queue status
track Track queries
untrack Untrack queries
update Fetch files, link files <-> queries, send files to download...
To contribute, use the usual GitHub workflow: open an issue or submit a pull request.