Jupyter Book

Ashley Smith edited this page Apr 28, 2021 · 2 revisions

Why?

Jupyter Book is used to build and display the notebooks and documentation in HTML. Some reasons to use Jupyter Book:

  • Simpler than Sphinx directly:
    • Markdown is simpler than RST, so it is easier to write. MyST provides a few focussed extensions to Markdown suited to this kind of documentation
    • First-class support for notebooks avoids the need to configure Sphinx extensions for them
    • Limiting to a simple _toc.yml and _config.yml should make it more maintainable (and more scientist-friendly)
  • Good support for the workflow of storing notebooks unexecuted (without outputs), thus providing a cleaner git history and avoiding unnecessary swelling of the repository with large binary blob changes
  • Caching of notebook execution between builds helps during development, reducing long waiting times for notebook execution, while not modifying the .ipynb file itself
  • This workflow has several roles:
    • Publishing of guides as a nice website, integrated with links to VRE to execute them
    • Consolidated execution of notebooks allows a robust development process for them
    • Notebooks can be automatically re-executed to act as test cases for packages used

How?

pip install jupyter-book
jupyter-book build .

These commands build the HTML in _build/html. The build uses _config.yml by default, where:

  • only_build_toc_files: true means that only .ipynb/.md files listed within _toc.yml will be used
  • execute_notebooks: cache stores the executed state using jupyter-cache so only modified notebooks will be re-executed
    • Use jupyter-book clean --all . to remove the cache along with the rest of _build
  • run_in_temp: true executes each notebook in an isolated directory to ensure they work independently from the repository
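
The three settings above could sit in _config.yml roughly as follows. This is a minimal sketch that assumes the standard Jupyter Book layout, in which the execution options are nested under an execute: key; the repository's real file will contain more (title, author, and so on). It is written to a scratch file, _config-example.yml (a hypothetical name), so that it does not overwrite anything.

```shell
# Write a sketch of the relevant settings to a scratch file.
# _config-example.yml is a hypothetical name, chosen to avoid clobbering the real _config.yml.
cat > _config-example.yml <<'EOF'
# Only build the .ipynb/.md files listed in _toc.yml
only_build_toc_files: true

execute:
  # Re-execute only modified notebooks, storing results with jupyter-cache
  execute_notebooks: cache
  # Run each notebook in an isolated temporary directory
  run_in_temp: true
EOF
```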

TODO: check behaviour when a notebook fails (ref. --keep-going)


jupyter-book build --config _config-testing.yml . builds with an alternative config file, which is needed when using the more experimental nbmake. It assumes that nbmake has already been run:

pip install nbmake pytest-xdist
pytest --numprocesses 2 --nbmake --overwrite \
  notebooks/*.ipynb \
  --deselect=notebooks/04c1_Geomag-Ground-Data-FTP.ipynb

This executes two notebooks in parallel for extra speed (two is the limit for the VirES processor, and regular GitHub workers only have two cores available anyway). The outputs are written back into the .ipynb files rather than into the cache (TODO: check/report behaviour of nbmake - I think this was necessary because the parallel mode would not use the cache). For this reason, _config-testing.yml sets execute_notebooks: off, so that the build just uses the outputs generated by nbmake. (TODO: check behaviour when a notebook fails).
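
The matching part of _config-testing.yml might look like the sketch below. Only the execute_notebooks setting comes from the description above; the execute: nesting and the quoting are assumptions based on the standard Jupyter Book config layout. Quoting "off" keeps a YAML 1.1 parser such as PyYAML from reading it as a boolean. The scratch file name is hypothetical.

```shell
# Write a sketch of the testing config to a scratch file (hypothetical name).
cat > _config-testing-example.yml <<'EOF'
execute:
  # Do not execute anything at build time; reuse the outputs
  # that nbmake wrote into the .ipynb files
  execute_notebooks: "off"
EOF
```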

Publishing online with Netlify

Once you have built the site locally, you can put it online with Netlify:

  1. Create a Netlify account
  2. Install the Netlify CLI: npm install -g netlify-cli (maybe install npm with nvm)
  3. Try to push your site directly with netlify deploy --dir=_build/html - the tool should walk you through creating a new site
  4. Use the admin panel at https://app.netlify.com:
    • Within Site Settings: rename your new site or change the domain used; find the site API ID
    • Within User Settings / Applications: generate personal access tokens
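
The site API ID and a personal access token from step 4 are what the Netlify CLI needs in order to deploy without the interactive prompts. A minimal sketch, assuming the CLI picks them up from the NETLIFY_SITE_ID and NETLIFY_AUTH_TOKEN environment variables; deploy_site is a hypothetical helper name, not something from this repository:

```shell
# Hypothetical helper: deploy the built HTML non-interactively.
# Assumes NETLIFY_AUTH_TOKEN (personal access token) and NETLIFY_SITE_ID
# (site API ID) have been exported beforehand, e.g. from a secrets store.
deploy_site() {
  if [ -z "${NETLIFY_AUTH_TOKEN:-}" ] || [ -z "${NETLIFY_SITE_ID:-}" ]; then
    echo "Set NETLIFY_AUTH_TOKEN and NETLIFY_SITE_ID first" >&2
    return 1
  fi
  # Same command as step 3; this creates a draft deploy unless --prod is added
  netlify deploy --dir=_build/html
}
```

Calling deploy_site after jupyter-book build . would then push _build/html. Keep the token itself out of the repository.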

Automating with GitHub Actions

TODO