DataHaskell Starter

A ready-to-use VS Code Dev Container for doing data science in Haskell with Jupyter notebooks.

This repo is batteries included environment for the DataHaskell ecosystem:

VS Code + Dev Containers
IHaskell Jupyter kernel
DataHaskell libraries (e.g. dataframe, hasktorch, etc.)
Example notebooks to get you started

Who is this for?

You prefer an easy to use, pinned environment over installing everything locally.
You like notebooks (Jupyter / VS Code) for exploration.

If you’re new to DataHaskell, this is the recommended way to get a working setup.

Setting up VS Code

You can set up a devcontainer in your current folder by running the command below:

Linux/MacOS: curl -sSL https://raw.githubusercontent.com/DataHaskell/datahaskell-starter/refs/heads/main/setup.sh | sh

When you next open VS Code you should see a modal asking if you want to re-open the project in a devcontainer.

Running GHCi scripts

hscript allows you to run GHCi scripts without a main and with inline imports. It enables writing code like:

:set -package text
:set -XOverloadedStrings
:set -XTemplateHaskell

import qualified DataFrame as D
import qualified DataFrame.Functions as F
import Data.Text (Text)
import DataFrame ((|>))
import DataFrame.Functions ((.==), (.>))

-- Read Iris dataset
iris <- D.readParquet "../dataframe/data/iris.parquet"

-- Filter large setosas
iris |>
  D.filterWhere (F.col @Text "variety" .== "Setosa") |>
  D.filterWhere (F.col @Double "sepal.length" .> 5.4)

-- Declare column variables
_ = (); F.declareColumns iris

-- Create a new feature
D.derive "ratio" (sepal_width / sepal_length) iris

Linux/MacOS: Download https://raw.githubusercontent.com/DataHaskell/datahaskell-starter/refs/heads/main/hscript.sh and add it to your PATH.

You can then run files like the one above by typing:

hscript runme.hs

Example notebooks

Check out the example project in the `notebooks` directory.

Requirements

You’ll need:

VS Code
Docker (Docker Desktop on macOS/Windows, or Docker Engine on Linux)
VS Code extensions:
- Jupyter
- Dev Containers
- Haskell (recommended)

You do not need to install GHC, Cabal, or IHaskell on your host machine. Everything lives inside the container.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
.devcontainer		.devcontainer
notebooks		notebooks
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
hscript		hscript
setup.sh		setup.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

DataHaskell Starter

Who is this for?

Setting up VS Code

Running GHCi scripts

Example notebooks

Check out the example project in the `notebooks` directory.

Requirements

About

Uh oh!

Releases

Packages

Languages

License

DataHaskell/datahaskell-starter

Folders and files

Latest commit

History

Repository files navigation

DataHaskell Starter

Who is this for?

Setting up VS Code

Running GHCi scripts

Example notebooks

Check out the example project in the notebooks directory.

Requirements

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Check out the example project in the `notebooks` directory.

Packages