Skip to content

Conversation

@bollu
Copy link
Collaborator

@bollu bollu commented Aug 24, 2025

This shows how we would use snakemake to gradually migrate our plotting pipeline for hacker's delight.

See that snakemake now subsumes the (hard) job of task running in collect.py.
It leaves the CSV wranging in collect.py and plot.py untouched, as these have been vetted by @luisacicolini .

This provides parallelism at the task level, and I would like to believe, is much more declarative on how the hacker's delight files are built.

@ineol
Copy link
Collaborator

ineol commented Aug 24, 2025

How would the run_with_limits function be used?

@ineol
Copy link
Collaborator

ineol commented Aug 24, 2025

It would be cool if we could hook it up with snakemake's notion of limits

@bollu
Copy link
Collaborator Author

bollu commented Aug 24, 2025

@ineol I added an example. Sadly, I learnt today that snakemake's notion of limits are not imposed by the runner, but are only used by the scheduler :) Regardless, we can query those parameters and use then in our call.

@bollu
Copy link
Collaborator Author

bollu commented Aug 25, 2025

@ineol , if I could get a thumbs up from you for the config, that would be great :)

@ineol
Copy link
Collaborator

ineol commented Aug 25, 2025

What does the python thing do in the first rule? Does it update the venv when the requirements.txt change or just creates it if it does not exist?

Relatedly, can we use pip to install snakemake as well?

We refactor this into a separate 'liblimits.py, which will be shared
across our evaluation scripts.
@bollu
Copy link
Collaborator Author

bollu commented Aug 26, 2025

@ineol Indeed, we can use pip, so I've added it into the requirements.txt.

@bollu bollu force-pushed the snakehackersdel branch 2 times, most recently from f39c52b to 7b10b20 Compare August 26, 2025 08:13
@alexkeizer
Copy link
Collaborator

NIT: I'd expect some clean target to get rid of produced plots (although I'm not sure if we'd want it to clean the built core oleans or not)

@alexkeizer
Copy link
Collaborator

name 'run_with_limits' is not defined

😢

@alexkeizer
Copy link
Collaborator

alexkeizer commented Aug 26, 2025

In general, I'm not a big fan of the fact that when something breaks, it vomits up heaps of non-informative "thing broke because something returned a non-zero status code" logs. Just show me the actual error!

Just look at the signal/noise ratio in the following log!

NameError in file "/home/alex/Workspace/PhD/lean-mlir/bv-evaluation/Snakefile", line 57:
name 'run_with_limits' is not defined
  File "/home/alex/Workspace/PhD/lean-mlir/bv-evaluation/Snakefile", line 57, in __rule_hdel_compare_make_output
Exiting because a job execution failed. Look below for error messages
WorkflowError:
At least one job did not complete successfully.
Select jobs to execute...
Traceback (most recent call last):

  File "/home/alex/Workspace/PhD/lean-mlir/.venv/lib/python3.13/site-packages/snakemake/executors/local.py", line 232, in spawn_job
    subprocess.check_call(cmd, shell=True)
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^

  File "/nix/store/sd81bvmch7njdpwx3lkjslixcbj5mivz-python3-3.13.4/lib/python3.13/subprocess.py", line 419, in check_call
    raise CalledProcessError(retcode, cmd)

subprocess.CalledProcessError: Command 'cd /home/alex/Workspace/PhD/lean-mlir/bv-evaluation && /home/alex/Workspace/PhD/lean-mlir/.venv/bin/python3 -m snakemake --snakefile '/home/alex/Workspace/PhD/lean-mlir/bv-evaluation/Snakefile' --target-jobs 'hdel_compare_make_output:file=ch2_2AdditionAndLogicalOps,width=64,r=1' --allowed-rules hdel_compare_make_output --cores 1 --attempt 1 --force-use-threads  --force --target-files-omit-workdir-adjustment --max-inventory-time 0 --nocolor --notemp --no-hooks --nolock --ignore-incomplete --rerun-triggers software-env mtime input params code --conda-frontend 'conda' --shared-fs-usage sources persistence storage-local-copies software-deployment input-output source-cache --wrapper-prefix 'https://github.com/snakemake/snakemake-wrappers/raw/' --latency-wait 5 --scheduler 'greedy' --local-storage-prefix base64//LnNuYWtlbWFrZS9zdG9yYWdl --scheduler-solver-path '/home/alex/Workspace/PhD/lean-mlir/.venv/bin' --default-resources base64//dG1wZGlyPXN5c3RlbV90bXBkaXI= --quiet progress rules host --mode 'subprocess' --local-groupid 'local'' returned non-zero exit status 1.


During handling of the above exception, another exception occurred:


Traceback (most recent call last):

  File "/nix/store/sd81bvmch7njdpwx3lkjslixcbj5mivz-python3-3.13.4/lib/python3.13/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)

  File "/home/alex/Workspace/PhD/lean-mlir/.venv/lib/python3.13/site-packages/snakemake/executors/local.py", line 247, in cached_or_run
    run_func(*args)
    ~~~~~~~~^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir/.venv/lib/python3.13/site-packages/snakemake/executors/local.py", line 234, in spawn_job
    raise SpawnedJobError()

snakemake.exceptions.SpawnedJobError

[Tue Aug 26 09:42:02 2025]
Error in rule hdel_compare_make_output:
    message: None
    jobid: 17
    input: /home/alex/Workspace/PhD/lean-mlir/bv-evaluation/results/HackersDelight/ch2_2AdditionAndLogicalOps_64_2.lean
    output: /home/alex/Workspace/PhD/lean-mlir/bv-evaluation/results/HackersDelight/ch2_2AdditionAndLogicalOps_64_r1.txt
Shutting down, this might take some time.
Exiting because a job execution failed. Look below for error messages
[Tue Aug 26 09:42:02 2025]
Error in rule hdel_compare_make_output:
    message: None
    jobid: 17
    input: /home/alex/Workspace/PhD/lean-mlir/bv-evaluation/results/HackersDelight/ch2_2AdditionAndLogicalOps_64_2.lean
    output: /home/alex/Workspace/PhD/lean-mlir/bv-evaluation/results/HackersDelight/ch2_2AdditionAndLogicalOps_64_r1.txt
Complete log(s): /home/alex/Workspace/PhD/lean-mlir/bv-evaluation/.snakemake/log/2025-08-26T094200.842934.snakemake.log
WorkflowError:
At least one job did not complete successfully.

@bollu bollu added the needs-consensus issues which affect the project as a whole, and which are currently blocked on reaching consensus label Aug 27, 2025
@bollu
Copy link
Collaborator Author

bollu commented Aug 27, 2025

@alexkeizer @ineol @luisacicolini , I would appreciate it if you could try this again. Please run

cd lean-mlir/bv-evaluation;./toplevel-hackersdelight.sh

@github-actions
Copy link

bv_decide solved 0 theorems.
bitwuzla solved 0 theorems.
bv_decide found 0 counterexamples.
bitwuzla found 0 counterexamples.
bv_decide only failed on 0 problems.
bitwuzla only failed on 0 problems.
both bitwuzla and bv_decide failed on 0 problems.
In total, bitwuzla saw 0 problems.
In total, bv_decide saw 0 problems.
ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS
The InstCombine benchmark contains 4520 theorems in total.
Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv
all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan
mean of percentage stddev/av: nan%

@ineol
Copy link
Collaborator

ineol commented Aug 27, 2025

I tried to install bitwuzla but it could not download gmp because ftp.gnu.org sems dead. Bitwuzla is an implicit dependency of the snakemake file.

@github-actions
Copy link

bv_decide solved 0 theorems.
bitwuzla solved 0 theorems.
bv_decide found 0 counterexamples.
bitwuzla found 0 counterexamples.
bv_decide only failed on 0 problems.
bitwuzla only failed on 0 problems.
both bitwuzla and bv_decide failed on 0 problems.
In total, bitwuzla saw 0 problems.
In total, bv_decide saw 0 problems.
ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS
The InstCombine benchmark contains 4520 theorems in total.
Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv
all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan
mean of percentage stddev/av: nan%

2 similar comments
@github-actions
Copy link

bv_decide solved 0 theorems.
bitwuzla solved 0 theorems.
bv_decide found 0 counterexamples.
bitwuzla found 0 counterexamples.
bv_decide only failed on 0 problems.
bitwuzla only failed on 0 problems.
both bitwuzla and bv_decide failed on 0 problems.
In total, bitwuzla saw 0 problems.
In total, bv_decide saw 0 problems.
ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS
The InstCombine benchmark contains 4520 theorems in total.
Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv
all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan
mean of percentage stddev/av: nan%

@github-actions
Copy link

bv_decide solved 0 theorems.
bitwuzla solved 0 theorems.
bv_decide found 0 counterexamples.
bitwuzla found 0 counterexamples.
bv_decide only failed on 0 problems.
bitwuzla only failed on 0 problems.
both bitwuzla and bv_decide failed on 0 problems.
In total, bitwuzla saw 0 problems.
In total, bv_decide saw 0 problems.
ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS
The InstCombine benchmark contains 4520 theorems in total.
Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv
all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan
mean of percentage stddev/av: nan%

@alexkeizer
Copy link
Collaborator

Also @bollu could you double check what metadata you generate, and add it to .gitignore. I got a bunch of stuff under .snakemake, and also __pycache__ (but I'm not sure how much of that was us trying to get Nix to behave)

@github-actions
Copy link

bv_decide solved 0 theorems.
bitwuzla solved 0 theorems.
bv_decide found 0 counterexamples.
bitwuzla found 0 counterexamples.
bv_decide only failed on 0 problems.
bitwuzla only failed on 0 problems.
both bitwuzla and bv_decide failed on 0 problems.
In total, bitwuzla saw 0 problems.
In total, bv_decide saw 0 problems.
ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS
The InstCombine benchmark contains 4520 theorems in total.
Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv
all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan
mean of percentage stddev/av: nan%

2 similar comments
@github-actions
Copy link

bv_decide solved 0 theorems.
bitwuzla solved 0 theorems.
bv_decide found 0 counterexamples.
bitwuzla found 0 counterexamples.
bv_decide only failed on 0 problems.
bitwuzla only failed on 0 problems.
both bitwuzla and bv_decide failed on 0 problems.
In total, bitwuzla saw 0 problems.
In total, bv_decide saw 0 problems.
ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS
The InstCombine benchmark contains 4520 theorems in total.
Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv
all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan
mean of percentage stddev/av: nan%

@github-actions
Copy link

bv_decide solved 0 theorems.
bitwuzla solved 0 theorems.
bv_decide found 0 counterexamples.
bitwuzla found 0 counterexamples.
bv_decide only failed on 0 problems.
bitwuzla only failed on 0 problems.
both bitwuzla and bv_decide failed on 0 problems.
In total, bitwuzla saw 0 problems.
In total, bv_decide saw 0 problems.
ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS
The InstCombine benchmark contains 4520 theorems in total.
Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv
all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan
mean of percentage stddev/av: nan%

Copy link
Collaborator

@alexkeizer alexkeizer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While running inside the nix dev-shell, it dies with the following error:

Traceback (most recent call last):

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/cli.py", line 2165, in args_to_api
    dag_api.execute_workflow(

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/api.py", line 603, in execute_workflow
    workflow.execute(

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/workflow.py", line 1405, in execute
    raise e

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/workflow.py", line 1401, in execute
    success = self.scheduler.schedule()
              ^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/scheduler.py", line 356, in schedule
    raise e

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/scheduler.py", line 200, in schedule
    self._finish_jobs()

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/scheduler.py", line 432, in _finish_jobs
    async_run(postprocess())

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/common/__init__.py", line 99, in async_run
    return asyncio.run(coroutine)
           ^^^^^^^^^^^^^^^^^^^^^^

  File "/home/alex/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^

  File "/home/alex/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/alex/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/scheduler.py", line 375, in postprocess
    await job.postprocess(

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/jobs.py", line 1260, in postprocess
    await self.dag.workflow.persistence.finished(self)

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/persistence.py", line 317, in finished
    params = self._params(job)
             ^^^^^^^^^^^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/persistence.py", line 631, in _params
    return sorted(
           ^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/persistence.py", line 634, in <genexpr>
    (self._serialize_param(value) for value in job.non_derived_params),
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/persistence.py", line 623, in _serialize_param_pandas
    import pandas as pd

  File "/nix/store/27drr45ziv3kbjl4pdzdh1cic7ncvpp2-python3-3.13.4-env/lib/python3.13/site-packages/pandas/__init__.py", line 19, in <module>
    raise ImportError(

ImportError: Unable to import required dependencies:
numpy: Error importing numpy: you should not try to import numpy from
        its source directory; please exit the numpy source tree, and relaunch
        your python interpreter from there.

Comment on lines +27 to +31
onstart:
shell("elan --version")
shell("cd {gitroot} && lake exe cache get && lake build")
shell("uv --version")
shell("bitwuzla --version")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be redirecting outputs to log files?

You might also want to print some prelude information before dumping elan --version straight to the user.

@alexkeizer
Copy link
Collaborator

While running inside the nix dev-shell, it dies with the following error:

Traceback (most recent call last):

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/cli.py", line 2165, in args_to_api
    dag_api.execute_workflow(

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/api.py", line 603, in execute_workflow
    workflow.execute(

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/workflow.py", line 1405, in execute
    raise e

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/workflow.py", line 1401, in execute
    success = self.scheduler.schedule()
              ^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/scheduler.py", line 356, in schedule
    raise e

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/scheduler.py", line 200, in schedule
    self._finish_jobs()

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/scheduler.py", line 432, in _finish_jobs
    async_run(postprocess())

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/common/__init__.py", line 99, in async_run
    return asyncio.run(coroutine)
           ^^^^^^^^^^^^^^^^^^^^^^

  File "/home/alex/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^

  File "/home/alex/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/alex/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/scheduler.py", line 375, in postprocess
    await job.postprocess(

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/jobs.py", line 1260, in postprocess
    await self.dag.workflow.persistence.finished(self)

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/persistence.py", line 317, in finished
    params = self._params(job)
             ^^^^^^^^^^^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/persistence.py", line 631, in _params
    return sorted(
           ^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/persistence.py", line 634, in <genexpr>
    (self._serialize_param(value) for value in job.non_derived_params),
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/alex/Workspace/PhD/lean-mlir-alt/bv-evaluation/.venv/lib/python3.12/site-packages/snakemake/persistence.py", line 623, in _serialize_param_pandas
    import pandas as pd

  File "/nix/store/27drr45ziv3kbjl4pdzdh1cic7ncvpp2-python3-3.13.4-env/lib/python3.13/site-packages/pandas/__init__.py", line 19, in <module>
    raise ImportError(

ImportError: Unable to import required dependencies:
numpy: Error importing numpy: you should not try to import numpy from
        its source directory; please exit the numpy source tree, and relaunch
        your python interpreter from there.

This was a nix-skill-issue: the posted error is really saying that numpy failed to load the C extension, which is presumably caused by nix being nix. The fix was to ensure that pythonPackages312.numpy is installed in flake.nix (note that other dependencies are fine to be uv-installed, it's only this one that is problematic!

@github-actions
Copy link

bv_decide solved 0 theorems.
bitwuzla solved 0 theorems.
bv_decide found 0 counterexamples.
bitwuzla found 0 counterexamples.
bv_decide only failed on 0 problems.
bitwuzla only failed on 0 problems.
both bitwuzla and bv_decide failed on 0 problems.
In total, bitwuzla saw 0 problems.
In total, bv_decide saw 0 problems.
ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS
The InstCombine benchmark contains 4520 theorems in total.
Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv
all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan
mean of percentage stddev/av: nan%

1 similar comment
@github-actions
Copy link

bv_decide solved 0 theorems.
bitwuzla solved 0 theorems.
bv_decide found 0 counterexamples.
bitwuzla found 0 counterexamples.
bv_decide only failed on 0 problems.
bitwuzla only failed on 0 problems.
both bitwuzla and bv_decide failed on 0 problems.
In total, bitwuzla saw 0 problems.
In total, bv_decide saw 0 problems.
ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS
The InstCombine benchmark contains 4520 theorems in total.
Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv
all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan
mean of percentage stddev/av: nan%

@tobiasgrosser
Copy link
Collaborator

It seems the mathlib cache is download twice for me:

Using cache from origin: leanprover-community/mathlib4-nightly-testing
Attempting to download 7045 file(s) from leanprover-community/mathlib4 cache
Downloaded: 0 file(s) [attempted 7045/7045 = 100%] (0% success)
Attempting to download 7045 file(s) from leanprover-community/mathlib4-nightly-testing cache
Downloaded: 2162 file(s) [attempted 2162/7045 = 30%]

@github-actions
Copy link

bv_decide solved 0 theorems.
bitwuzla solved 0 theorems.
bv_decide found 0 counterexamples.
bitwuzla found 0 counterexamples.
bv_decide only failed on 0 problems.
bitwuzla only failed on 0 problems.
both bitwuzla and bv_decide failed on 0 problems.
In total, bitwuzla saw 0 problems.
In total, bv_decide saw 0 problems.
ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS
The InstCombine benchmark contains 4520 theorems in total.
Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv
all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan
mean of percentage stddev/av: nan%

@bollu
Copy link
Collaborator Author

bollu commented Aug 27, 2025

It seems the mathlib cache is download twice for me:

Using cache from origin: leanprover-community/mathlib4-nightly-testing
Attempting to download 7045 file(s) from leanprover-community/mathlib4 cache
Downloaded: 0 file(s) [attempted 7045/7045 = 100%] (0% success)
Attempting to download 7045 file(s) from leanprover-community/mathlib4-nightly-testing cache
Downloaded: 2162 file(s) [attempted 2162/7045 = 30%]

I merged the latest origin/main. Does this help?

@alexkeizer
Copy link
Collaborator

After tweaking flake.nix to install numpy for the specific python version that snakemake wants, this now works on my side! LGTM

@bollu
Copy link
Collaborator Author

bollu commented Aug 28, 2025

@tobiasgrosser I have an LGTM from Alex, so I propose to merge this.

github-merge-queue bot pushed a commit that referenced this pull request Aug 29, 2025
As per #1613, I'm experimenting with an alternative CI setup where we
run everything inside a Docker container. Until we commit to this being
the workflow we like, I'll duplicate existing CI jobs and have a
*seperate* job run the same action inside the container (on a
GH-provided runner, but using the image we built on namespace's runner).

This PR does so for the CI job that builds the non-standard build
targets and the Instcombine evaluation (but not yet for hackersdelight,
for that I'm waiting on #1549).

To make this work, I:
- Add more stuff to the .dockerignore file, so files which are not
needed are not copied to the Docker image (and thus won't trigger cache
misses).
- Expand the existing bitwuzla image to also install uv, use uv to
install the python dependencies specified in requirements.txt, and copy
the lean-mlir files from the base-image to get a full-blown instcombine
image; the produced image is thus renamed to `lean-mlir-instcombine`.
Note that this second image is still build on GH runner infrastructure!
This is fine as we don't actually `lake build` anything here, so there
is no need for incremental caching (but of course, using namespace.so
would still likely be faster).
- I tried dropping permissions in the Dockerfile to use a non-root user
- Github will change the home-directory to `/github/home`, so I
originally thought it was using a github user because of security
concerns with using a root user. Thus, I tried to set a `USER` in the
dockerfile.
- However, it just made everything more complicated, and random actions
started failing with permission errors.
- Then I tried setting the home-directory explictly to `/root` using
`env.var`, but then actions started complaining about
`/root/.docker/config.json` not existing...
- Just leaving the status quo is non-ideal, as .elan was re-downloading
the toolchain (since the Docker image has it at /root/.elan, while in
the action it was looking at `/github/home/.elan`
- Finally, I worked around it by adding a first run step that symlinks
the latter to the former to every job that runs in a container. It's not
the most elegant, as it adds a boilerplate run step to every job, but it
actually works, so 🤷
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-consensus issues which affect the project as a whole, and which are currently blocked on reaching consensus

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants