feat: snakemakeify hacker's delight #1549

bollu · 2025-08-24T12:38:38Z

This shows how we would use snakemake to gradually migrate our plotting pipeline for hacker's delight.

Note that snakemake now subsumes the (hard) job of task running in collect.py.
It leaves the CSV wrangling in collect.py and plot.py untouched, as these have been vetted by @luisacicolini.

This provides parallelism at the task level and, I would like to believe, is much more declarative about how the hacker's delight files are built.

ineol · 2025-08-24T14:48:20Z

How would the run_with_limits function be used?

ineol · 2025-08-24T14:48:52Z

It would be cool if we could hook it up with snakemake's notion of limits

bollu · 2025-08-24T17:01:12Z

@ineol I added an example. Sadly, I learnt today that snakemake's notion of limits is not imposed by the runner, but is only used by the scheduler :) Regardless, we can query those parameters and use them in our call.
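A minimal sketch of what such a helper could look like, assuming a POSIX system. The real `run_with_limits` is not shown in this thread, so the signature and return value below are assumptions made for illustration only.

```python
import resource
import subprocess
import sys


def run_with_limits(cmd, timeout_s, mem_mb):
    """Hypothetical sketch: run `cmd` under a wall-clock timeout and a memory cap.

    Returns (returncode, stdout); returncode is None if the timeout fired.
    """
    wanted = int(mem_mb) * 1024 * 1024

    def set_limits():
        # Runs in the child just before exec. Only the soft limit is changed,
        # and it is never set above the hard limit already in force.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    try:
        proc = subprocess.run(
            cmd,
            preexec_fn=set_limits,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising.
        return None, ""
    return proc.returncode, proc.stdout
```

Inside a rule's `run:` block, the values declared under `resources:` (for example `resources.mem_mb`) could be forwarded to a helper like this, since snakemake itself only uses them for scheduling.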

Snakefile

bollu · 2025-08-25T09:31:12Z

@ineol, if I could get a thumbs up from you for the config, that would be great :)

ineol · 2025-08-25T11:32:05Z

What does the python thing do in the first rule? Does it update the venv when requirements.txt changes, or does it just create it if it does not exist?

Relatedly, can we use pip to install snakemake as well?

bollu · 2025-08-26T08:10:13Z

@ineol Indeed, we can use pip, so I've added it into the requirements.txt.

alexkeizer · 2025-08-26T08:39:06Z

NIT: I'd expect some clean target to get rid of produced plots (although I'm not sure if we'd want it to clean the built core oleans or not)

alexkeizer · 2025-08-26T08:40:19Z

name 'run_with_limits' is not defined

😢

alexkeizer · 2025-08-26T08:41:29Z

In general, I'm not a big fan of the fact that when something breaks, it vomits up heaps of non-informative "thing broke because something returned a non-zero status code" logs. Just show me the actual error!

Just look at the signal/noise ratio in the following log!

NameError in file "/home/alex/Workspace/PhD/lean-mlir/bv-evaluation/Snakefile", line 57:
name 'run_with_limits' is not defined
  File "/home/alex/Workspace/PhD/lean-mlir/bv-evaluation/Snakefile", line 57, in __rule_hdel_compare_make_output
Exiting because a job execution failed. Look below for error messages
WorkflowError:
At least one job did not complete successfully.
Select jobs to execute...
Traceback (most recent call last):

  File "/home/alex/Workspace/PhD/lean-mlir/.venv/lib/python3.13/site-packages/snakemake/executors/local.py", line 232, in spawn_job
    subprocess.check_call(cmd, shell=True)
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^

  File "/nix/store/sd81bvmch7njdpwx3lkjslixcbj5mivz-python3-3.13.4/lib/python3.13/subprocess.py", line 419, in check_call
    raise CalledProcessError(retcode, cmd)

subprocess.CalledProcessError: Command 'cd /home/alex/Workspace/PhD/lean-mlir/bv-evaluation && /home/alex/Workspace/PhD/lean-mlir/.venv/bin/python3 -m snakemake --snakefile '/home/alex/Workspace/PhD/lean-mlir/bv-evaluation/Snakefile' --target-jobs 'hdel_compare_make_output:file=ch2_2AdditionAndLogicalOps,width=64,r=1' --allowed-rules hdel_compare_make_output --cores 1 --attempt 1 --force-use-threads  --force --target-files-omit-workdir-adjustment --max-inventory-time 0 --nocolor --notemp --no-hooks --nolock --ignore-incomplete --rerun-triggers software-env mtime input params code --conda-frontend 'conda' --shared-fs-usage sources persistence storage-local-copies software-deployment input-output source-cache --wrapper-prefix 'https://github.com/snakemake/snakemake-wrappers/raw/' --latency-wait 5 --scheduler 'greedy' --local-storage-prefix base64//LnNuYWtlbWFrZS9zdG9yYWdl --scheduler-solver-path '/home/alex/Workspace/PhD/lean-mlir/.venv/bin' --default-resources base64//dG1wZGlyPXN5c3RlbV90bXBkaXI= --quiet progress rules host --mode 'subprocess' --local-groupid 'local'' returned non-zero exit status 1.


During handling of the above exception, another exception occurred:


Traceback (most recent call last):

  File "/nix/store/sd81bvmch7njdpwx3lkjslixcbj5mivz-python3-3.13.4/lib/python3.13/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)

  File "/home/alex/Workspace/PhD/lean-mlir/.venv/lib/python3.13/site-packages/snakemake/executors/local.py", line 247, in cached_or_run
    run_func(*args)
    ~~~~~~~~^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir/.venv/lib/python3.13/site-packages/snakemake/executors/local.py", line 234, in spawn_job
    raise SpawnedJobError()

snakemake.exceptions.SpawnedJobError

[Tue Aug 26 09:42:02 2025]
Error in rule hdel_compare_make_output:
    message: None
    jobid: 17
    input: /home/alex/Workspace/PhD/lean-mlir/bv-evaluation/results/HackersDelight/ch2_2AdditionAndLogicalOps_64_2.lean
    output: /home/alex/Workspace/PhD/lean-mlir/bv-evaluation/results/HackersDelight/ch2_2AdditionAndLogicalOps_64_r1.txt
Shutting down, this might take some time.
Exiting because a job execution failed. Look below for error messages
[Tue Aug 26 09:42:02 2025]
Error in rule hdel_compare_make_output:
    message: None
    jobid: 17
    input: /home/alex/Workspace/PhD/lean-mlir/bv-evaluation/results/HackersDelight/ch2_2AdditionAndLogicalOps_64_2.lean
    output: /home/alex/Workspace/PhD/lean-mlir/bv-evaluation/results/HackersDelight/ch2_2AdditionAndLogicalOps_64_r1.txt
Complete log(s): /home/alex/Workspace/PhD/lean-mlir/bv-evaluation/.snakemake/log/2025-08-26T094200.842934.snakemake.log
WorkflowError:
At least one job did not complete successfully.

bollu · 2025-08-27T08:46:22Z

@alexkeizer @ineol @luisacicolini , I would appreciate it if you could try this again. Please run

cd lean-mlir/bv-evaluation;./toplevel-hackersdelight.sh

github-actions · 2025-08-27T09:21:07Z

bv_decide solved 0 theorems.
bitwuzla solved 0 theorems.
bv_decide found 0 counterexamples.
bitwuzla found 0 counterexamples.
bv_decide only failed on 0 problems.
bitwuzla only failed on 0 problems.
both bitwuzla and bv_decide failed on 0 problems.
In total, bitwuzla saw 0 problems.
In total, bv_decide saw 0 problems.
ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS
The InstCombine benchmark contains 4520 theorems in total.
Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv
all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan
mean of percentage stddev/av: nan%
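The `nan` entries are likely what one gets when the statistics are computed over zero solved theorems. A small sketch of that effect, assuming the timings are plain lists of floats; the helper name `summarize` is hypothetical and not taken from collect.py or plot.py.

```python
import math
import statistics


def summarize(times):
    """Return (mean, population stddev) of `times`, or (nan, nan) for an empty list."""
    if not times:
        # With no samples there is nothing to average, so report nan explicitly.
        return math.nan, math.nan
    return statistics.mean(times), statistics.pstdev(times)
```

An explicit guard like this makes it easier to tell an empty run apart from a genuine numerical problem.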

ineol · 2025-08-27T09:26:02Z

I tried to install bitwuzla but it could not download gmp because ftp.gnu.org seems dead. Bitwuzla is an implicit dependency of the snakemake file.

alexkeizer · 2025-08-27T09:47:07Z

Also @bollu, could you double-check what metadata you generate and add it to .gitignore? I got a bunch of stuff under .snakemake, and also __pycache__ (but I'm not sure how much of that was us trying to get Nix to behave).

alexkeizer

While running inside the nix dev-shell, it dies with the following error:

Traceback (most recent call last):

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/cli.py", line 2165, in args_to_api
    dag_api.execute_workflow(

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/api.py", line 603, in execute_workflow
    workflow.execute(

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/workflow.py", line 1405, in execute
    raise e

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/workflow.py", line 1401, in execute
    success = self.scheduler.schedule()
              ^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/scheduler.py", line 356, in schedule
    raise e

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/scheduler.py", line 200, in schedule
    self._finish_jobs()

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/scheduler.py", line 432, in _finish_jobs
    async_run(postprocess())

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/common/__init__.py", line 99, in async_run
    return asyncio.run(coroutine)
           ^^^^^^^^^^^^^^^^^^^^^^

  File "/home/alex/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^

  File "/home/alex/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/alex/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/scheduler.py", line 375, in postprocess
    await job.postprocess(

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/jobs.py", line 1260, in postprocess
    await self.dag.workflow.persistence.finished(self)

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/persistence.py", line 317, in finished
    params = self._params(job)
             ^^^^^^^^^^^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/persistence.py", line 631, in _params
    return sorted(
           ^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/persistence.py", line 634, in <genexpr>
    (self._serialize_param(value) for value in job.non_derived_params),
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/persistence.py", line 623, in _serialize_param_pandas
    import pandas as pd

  File "/nix/store/27drr45ziv3kbjl4pdzdh1cic7ncvpp2-python3-3.13.4-env/lib/python3.13/site-packages/pandas/__init__.py", line 19, in <module>
    raise ImportError(

ImportError: Unable to import required dependencies:
numpy: Error importing numpy: you should not try to import numpy from
        its source directory; please exit the numpy source tree, and relaunch
        your python interpreter from there.

alexkeizer · 2025-08-27T17:54:15Z

bv-evaluation/Snakefile

+onstart:
+    shell("elan --version")
+    shell("cd {gitroot} && lake exe cache get && lake build")
+    shell("uv --version")
+    shell("bitwuzla --version")


Should this be redirecting outputs to log files?

You might also want to print some prelude information before dumping elan --version straight to the user.
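One way to address both points, sketched in plain Python rather than with snakemake's `shell(...)`: print a short prelude, then run each version check with its output captured in a log file. The helper names `check_tool` and `preflight` and the log path are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def check_tool(cmd, log_path):
    """Run `cmd`, append its combined output to `log_path`, return True on exit code 0."""
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a") as log:
        log.write(f"$ {' '.join(cmd)}\n")
        log.flush()  # keep the header ahead of the child's output
        try:
            proc = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
        except FileNotFoundError:
            log.write("command not found\n")
            return False
    return proc.returncode == 0


def preflight(tools, log_path):
    """Print a prelude, then report each tool as ok/MISSING without dumping its output."""
    print(f"Checking required tools (details in {log_path}):")
    results = {}
    for name, cmd in tools.items():
        results[name] = check_tool(cmd, log_path)
        print(f"  {name}: {'ok' if results[name] else 'MISSING'}")
    return results
```

For the checks above, the call might be `preflight({"elan": ["elan", "--version"], "uv": ["uv", "--version"], "bitwuzla": ["bitwuzla", "--version"]}, "logs/onstart.log")`.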

alexkeizer · 2025-08-27T18:52:38Z

This was a nix skill issue: the posted error is really saying that numpy failed to load its C extension, which is presumably caused by nix being nix. The fix was to ensure that pythonPackages312.numpy is installed in flake.nix (note that the other dependencies are fine to be uv-installed; it's only this one that is problematic!).
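The traceback also shows a Python 3.12 virtualenv picking up pandas from a `python3.13` site-packages directory in the nix store, which may be part of why the C extension fails to load. A small diagnostic sketch (the helper name `foreign_site_dirs` is hypothetical) that flags `sys.path` entries built for a different minor version than the running interpreter:

```python
import re
import sys

# Matches path components such as "python3.12" or "python3.13".
_PY_DIR = re.compile(r"python(\d+)\.(\d+)")


def foreign_site_dirs(paths=None, version=None):
    """Return the entries of `paths` whose `pythonX.Y` component differs from `version`.

    Defaults to the running interpreter's sys.path and (major, minor) version.
    """
    paths = sys.path if paths is None else paths
    version = tuple(sys.version_info[:2]) if version is None else tuple(version)
    foreign = []
    for entry in paths:
        found = {(int(a), int(b)) for a, b in _PY_DIR.findall(entry)}
        if found and version not in found:
            foreign.append(entry)
    return foreign
```

Running `foreign_site_dirs()` inside the dev-shell would list any directories that belong to another Python version; an empty list means no such mismatch was detected by this heuristic.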

tobiasgrosser · 2025-08-27T19:48:50Z

It seems the mathlib cache is downloaded twice for me:

Using cache from origin: leanprover-community/mathlib4-nightly-testing
Attempting to download 7045 file(s) from leanprover-community/mathlib4 cache
Downloaded: 0 file(s) [attempted 7045/7045 = 100%] (0% success)
Attempting to download 7045 file(s) from leanprover-community/mathlib4-nightly-testing cache
Downloaded: 2162 file(s) [attempted 2162/7045 = 30%]

bollu · 2025-08-27T23:34:54Z

I merged the latest origin/main. Does this help?

alexkeizer · 2025-08-28T11:53:33Z

After tweaking flake.nix to install numpy for the specific python version that snakemake wants, this now works on my side! LGTM

bollu · 2025-08-28T13:24:27Z

@tobiasgrosser I have an LGTM from Alex, so I propose to merge this.

As per #1613, I'm experimenting with an alternative CI setup where we run everything inside a Docker container. Until we commit to this being the workflow we like, I'll duplicate existing CI jobs and have a *separate* job run the same action inside the container (on a GH-provided runner, but using the image we built on namespace's runner). This PR does so for the CI job that builds the non-standard build targets and the Instcombine evaluation (but not yet for hackersdelight; for that I'm waiting on #1549).

To make this work, I:
- Add more stuff to the .dockerignore file, so files which are not needed are not copied to the Docker image (and thus won't trigger cache misses).
- Expand the existing bitwuzla image to also install uv, use uv to install the python dependencies specified in requirements.txt, and copy the lean-mlir files from the base-image to get a full-blown instcombine image; the produced image is thus renamed to `lean-mlir-instcombine`. Note that this second image is still built on GH runner infrastructure! This is fine as we don't actually `lake build` anything here, so there is no need for incremental caching (but of course, using namespace.so would still likely be faster).
- I tried dropping permissions in the Dockerfile to use a non-root user:
  - GitHub will change the home-directory to `/github/home`, so I originally thought it was using a github user because of security concerns with using a root user. Thus, I tried to set a `USER` in the dockerfile.
  - However, it just made everything more complicated, and random actions started failing with permission errors.
  - Then I tried setting the home-directory explicitly to `/root` using `env.var`, but then actions started complaining about `/root/.docker/config.json` not existing...
  - Just leaving the status quo is non-ideal, as .elan was re-downloading the toolchain (since the Docker image has it at `/root/.elan`, while in the action it was looking at `/github/home/.elan`).
  - Finally, I worked around it by adding a first run step that symlinks the latter to the former to every job that runs in a container. It's not the most elegant, as it adds a boilerplate run step to every job, but it actually works, so 🤷
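The symlink workaround described last could look roughly like this when written in Python. The actual run step in the workflow is not shown here, so the helper name `link_toolchain_dir` and the idempotency check are assumptions.

```python
import os
import tempfile
from pathlib import Path


def link_toolchain_dir(expected, actual):
    """Make `expected` a symlink to `actual` unless something already exists there."""
    expected, actual = Path(expected), Path(actual)
    if expected.is_symlink() or expected.exists():
        return False  # leave an existing directory or link alone
    expected.parent.mkdir(parents=True, exist_ok=True)
    os.symlink(actual, expected, target_is_directory=True)
    return True


if __name__ == "__main__":
    # Demonstrated on temporary directories that stand in for
    # /root/.elan (in the image) and /github/home/.elan (in the action).
    with tempfile.TemporaryDirectory() as tmp:
        image_dir = Path(tmp, "root", ".elan")
        image_dir.mkdir(parents=True)
        action_dir = Path(tmp, "github", "home", ".elan")
        print(link_toolchain_dir(action_dir, image_dir))  # creates the link
        print(link_toolchain_dir(action_dir, image_dir))  # already present, no-op
```

Making the step a no-op when the path already exists keeps it safe to repeat in every containerised job.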

bollu force-pushed the snakehackersdel branch from 15ff01a to be0c4c3 on August 26, 2025 08:03

chore: peel the functionality in bv_evaluation to limit memory/time.

04357cd

We refactor this into a separate 'liblimits.py', which will be shared across our evaluation scripts.

bollu force-pushed the snakehackersdel branch 2 times, most recently from f39c52b to 7b10b20 on August 26, 2025 08:13

feat: snakify hackers delight

762e9a3

bollu force-pushed the snakehackersdel branch from 7b10b20 to 762e9a3 on August 26, 2025 08:13

chore: refactor to move snakefile inside bv-evaluation

6273347

bollu force-pushed the snakehackersdel branch from 729aee7 to 6273347 on August 26, 2025 09:42

bollu added the needs-consensus label (issues which affect the project as a whole, and which are currently blocked on reaching consensus) on Aug 27, 2025

bollu added 2 commits August 27, 2025 09:39

Merge remote-tracking branch 'origin/main' into snakehackersdel

f9703cc

chore: add toplevel runner for hacker's delight

6dac61f

chore: add pyproject.toml for uv

d4ee6c2

bollu force-pushed the snakehackersdel branch from ef0d880 to d4ee6c2 on August 27, 2025 08:52

bollu added 4 commits August 27, 2025 14:03

chore: add hacker's delight refactors discussed with Leo and Alex

0946922

chore: cleanup runner

3919cd4

chore: add snakefile

75082fa

chore: update toplevel-hackersdelight

4844fb8

alexkeizer reviewed Aug 27, 2025

alexkeizer added 4 commits August 27, 2025 19:35

use absolute log paths

ba3043c

ignore produced log files

9ab0081

install uv and python312-numpy in flake.nix

71e01bb

add raw-data/HackersDelight/hackersdelight_err_data.csv to hdel_collect_and_plot outputs, so that the raw-data folder gets created by snakemake and we can thus drop the .placeholder

6f59125

Merge remote-tracking branch 'origin/main' into snakehackersdel

f14a9b1

alexkeizer mentioned this pull request Aug 28, 2025

feat: duplicate evaluation CI to run in docker as well #1625

Merged

bollu mentioned this pull request Sep 25, 2025

feat: #evaluation in ... command for easier evaluation #1706

Draft

1 task
