Skip to content
Merged
Show file tree
Hide file tree
Changes from 50 commits
Commits
Show all changes
52 commits
Select commit Hold shift + click to select a range
35bac06
convert static config data into functions (with dummy values, compile…
jberthold Sep 30, 2025
2f19a1f
add a notes file describing the approach (in docs, should remove or e…
jberthold Sep 30, 2025
d5e7613
Merge remote-tracking branch 'origin/master' into EXPERIMENT-compilin…
jberthold Sep 30, 2025
01b2a04
CHERRY-PICK-ME Adjust Makefile to avoid nix-formatting files that are…
jberthold Sep 30, 2025
902361f
WIP generate module (plugged into gen-spec for manual testing
jberthold Oct 1, 2025
0a7c31d
Change type for lookupAlloc and decodeAlloc to Evaluation
jberthold Oct 1, 2025
bfde0a5
suppress duplicate adtDef equations (polymorphic ADTs), adjust #decod…
jberthold Oct 1, 2025
992b34a
remove @ chars from function names to make llvm's lld happy :explodin…
jberthold Oct 1, 2025
918be67
Draft code for proving with static data in a module
jberthold Oct 1, 2025
048865e
formatting etc
jberthold Oct 1, 2025
e258ac7
Reduce item table before generating the K module (prevents re-use, en…
jberthold Oct 1, 2025
0c960e5
Use info logging to report status and module file
jberthold Oct 1, 2025
4bed164
CHERRY-PICK-ME escape \n, @, and " when printing bytes as ascii text
jberthold Oct 2, 2025
d14e002
Merge remote-tracking branch 'origin/master' into EXPERIMENT-compilin…
jberthold Oct 13, 2025
df4e06f
adjust alloc equation generation. Status: Not working as expected
jberthold Oct 13, 2025
d01cf2c
WIP implementing run-x
jberthold Oct 13, 2025
4457782
WIP run-x: try-catch around module compilation
jberthold Oct 13, 2025
1d72581
update the notes with some insight about blockers and workarounds
jberthold Oct 13, 2025
311b631
WIP adding a `run-x` command for testing
jberthold Oct 13, 2025
a7c1f42
fix lookupAlloc in generated module, use custom constructor in prove-x
jberthold Oct 14, 2025
5c402e9
WIP generating kore function equations for the module
jberthold Oct 14, 2025
d09c5f3
produce and render function equations using pyk, define shorter symbols
jberthold Oct 14, 2025
e576a1e
complete definition.kore generation and llvm kompile (draft)
jberthold Oct 14, 2025
2245172
fix app and sorts in rule rendering, simply print rules unless output…
jberthold Oct 14, 2025
bde92b8
move kore generation into a KMIR constructor, adjustments
jberthold Oct 15, 2025
0f67a25
Add definedness and totality to avoid fall-back
jberthold Oct 15, 2025
ba427bc
WIP hacking in kore-based running and proving. run-x not working as e…
jberthold Oct 15, 2025
e2ac856
Reduce logging for pyk.ktool.kprint
jberthold Oct 15, 2025
de64d26
Merge remote-tracking branch 'origin/master' into EXPERIMENT-compilin…
jberthold Oct 15, 2025
0a40519
remove K module based approach
jberthold Oct 15, 2025
cf54a71
prune read-only config parts from make_call_config
jberthold Oct 15, 2025
94b98d5
refactor KMIR.prove-rs to a static method, move code out of __main__.…
jberthold Oct 15, 2025
0c92042
Merge remote-tracking branch 'origin/master' into EXPERIMENT-compilin…
jberthold Oct 15, 2025
e59b649
Remove default dir out-kore, implement non-symbolic compilation path
jberthold Oct 16, 2025
00ccba1
Remove run-x, moving code to `_kmir_run`
jberthold Oct 16, 2025
8320c06
fix llvm execution
jberthold Oct 16, 2025
11a4869
Remove all read-only data from the expectation files for exec-smir tests
jberthold Oct 16, 2025
5db61f0
Merge remote-tracking branch 'origin/master' into EXPERIMENT-compilin…
jberthold Oct 16, 2025
43d8dd3
remove outdated test_prove test (was based on K spec files)
jberthold Oct 16, 2025
f0c68e6
Fix prove_rs routine (scope of tmp_dir was too short)
jberthold Oct 16, 2025
074ef3c
Change `test_prove_termination` to use `prove_rs` instead of `gen-spe…
jberthold Oct 16, 2025
03547fa
adjust expectations for 3 prove-rs tests which call panic (unknown fu…
jberthold Oct 17, 2025
56ddf51
notes file update
jberthold Oct 17, 2025
30aa4f3
notes update
jberthold Oct 17, 2025
312eb7f
restore (unused) gen-spec, remove prove-x helper, remove stale code
jberthold Oct 17, 2025
4109071
Adjust expectations for panic calls in single-lib and cli_statistics …
jberthold Oct 17, 2025
148a88b
WIP attempt to create type lookups fixture for test_decode_value
jberthold Oct 17, 2025
0b036d7
adjust dir name for LLVM compilation
jberthold Oct 17, 2025
9e925ba
Synchronise llvm-kompile call with file lock before test_decode_value
jberthold Oct 17, 2025
6ae2a0a
fml
jberthold Oct 17, 2025
6414c85
Adjust expectation for enum-two-refs-fail (K decoding checks isValue …
jberthold Oct 17, 2025
f20af1e
remove notes file
jberthold Oct 17, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
146 changes: 146 additions & 0 deletions docs/notes-compile-to-k.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,146 @@
Compiling a Rust program via SMIR JSON to a K module
============================

To avoid fattening the configuration beyond the size manageable for the symbolic backend,
static elements of the configuration (function table, types table, allocs table, adt table)
are translated to K functions in a dedicated module.
This module is then imported into the KMIR semantics to extend the function equations provided in the semantics.

Note: no need to _extend_, instead we can just provide that module for import.
Requires standard name and location, complicates debugging as we don't know which program we were dealing with.

# Adapting the semantics

* Declare functions to look up static data.
- Functions declared total, return dummy values in `[owise]` rules for now (HACK).
- Functions are currently declared in `value.md` module, could have their own module.
- TODO inconsistent naming
* All uses of static data cells in the semantics replaced by function calls

# Generating a module with the declared functions for a particular program

* Python code `SMIR -> K Module (as Text)`
* imports `VALUE-SYNTAX` (`value.md`) to bring function declaration into scope
* imports `KMIR` to have semantics in scope

* Four groups of equations
1. `lookupFunction : Ty -> MonoItemKind`
- depends on `MONO` (`mono.md`), already present via `RT-VALUE-SYNTAX`
- contains equations `lookupFunction(ty(N)) => Item`
- item with symbol_name from `functions[N]`
- add negative `N` for all items not present in `functions`
- most Python code exists in `kmir.py::_make_function_map`
but needs to be copied and modified to make function equations instead of a K `Map`
2. `lookupAlloc : AllocId -> Value`
- depends on `RT-DECODING` (`decoding.md`)
- contains equations `lookupAlloc(ID) => decodeAlloc(ID, TY, ALLOC)[ID]`
- TODO remove the `Map` return
- using data from `allocs` (each `AllocInfo(ID, TY, ALLOC)`)
- we should probably decode in python later?
3. `lookupTy : Ty -> TypeInfo`
- depends on `TYPES` (`ty.md`), already present via `RT-VALUE-SYNTAX`
- contains equations `lookupTy(T) => typeInfo`
- using data from `types` (each `[T, TypeInfo]`)
- most python code exists in `kmir.py::_make_type_and_adt_maps`
4. `lookupAdtTy : AdtDef -> Ty`
- depends on `TYPES` (`ty.md`), already present
- most python code exists in `kmir.py::_make_type_and_adt_maps`

# Compiling the module together with the semantics

The generated module becomes the main module to run `kompile` calls on
- three compilations: llvm, llvm-library, haskell

The compiled artefacts should be what the `KMIR` class uses (`definition_dir`).

That means, the module generation and compilation happens when creating the `KMIR` object.
Module and compilation results should be cached (can be omitted in a prototype, recompiling every time using temp files).

# Changes to running and proving

To run a program/function or prove a property, we still generate a call configuration
but the configuration does not have any static fields any more.
The configuration is executed with the compiled module.

# Obstacles

1. LLVM backend is confused about symbol versions when an `@` char is in a string or binary data (`\x40h`).
This was caused by the LLVM backend using the literal kore string as a symbol name. Fixed in llvm backend, open PR.
2. Compilation fails because of "long" lists (>100 elements) causing problems in the rule parser.
Work-arounds considered:
- generate kore instead , and insert it into the `definition.kore`
- need to do this manually as there are no helper functions to make `kore` rules directly.
- generate `kast` syntax and splice into the K code.
- need to extend `pyk` with a `kast` pretty-printer. The `kast` format specification is somewhat tricky and incomplete, as I was told.
Work-around 1 appears to work, code developed on new branch extending the other.


# Technical work to prepare the change

## Integration tests
- [ ] `test_decode_value`: need to replace type map by definition contents (non-python tests only)
- pre-compile an LLVM interpreter with the types fixed to a standard table (covering all test cases)
- alternative is compiling once per test, bad.
- [ ] `exec_smir`: needs an implementation of `kmir run` for concrete execution, LLVM and Haskell
- KMIR.run_smir should just be a static method, preparing the definition beforehand
- BTW the _binary_ format is the problem for `kmir run` with `--backend haskell`
- [ ] `prove_rs`:
- [X] turned `KMIR.prove_rs` into a static method, preparing the definition beforehand
- [ ] proof failures:
```
FAILED test_prove_termination[Ref-weirdRefs] - assert False
FAILED test_prove_termination[Array-indexing] - assert False
FAILED test_prove_termination[Array-index-writes] - assert False
FAILED test_prove_termination[Array-inlined] - assert False
FAILED test_prove_termination[Alloc-array-const-compare] - assert False
FAILED test_prove_termination[niche-enum] - assert False
FAILED test_prove_termination[main-a-b-c] - assert False
FAILED test_prove_termination[assign-cast] - assert False
FAILED test_prove_termination[struct-field-update] - assert False
FAILED test_prove_termination[arithmetic] - assert False
FAILED test_prove_termination[arithmetic-unchecked] - assert False
FAILED test_prove_termination[unary] - assert False
FAILED test_prove_termination[Ref-simple] - assert False
FAILED test_prove_termination[Ref-refAsArg] - assert False
FAILED test_prove_termination[Ref-doubleRef] - assert False
```
- `test_prove_termination` for all failing tests.
- uses `gen-spec` and `prove-raw`
- should instead use `prove-rs --smir`

- [ ] compilation failures (non-deterministic):
```
FAILED test_prove_rs[double-ref-deref] - subprocess.CalledProcessError: Command '['llvm-kompile', 'out-kore/llvm-library/definition.kore', 'out-kore/llvm-library/dt', 'c', '-O2', '--', '-o', PosixPath('out-kore/llvm-library/interpreter')]'...
FAILED test_prove_rs[array_nest_compare] - subprocess.CalledProcessError: Command '['llvm-kompile', 'out-kore/llvm-library/definition.kore', 'out-kore/llvm-library/dt', 'c', '-O2', '--', '-o', PosixPath('out-kore/llvm-library/interpreter')]'...
FAILED test_prove_rs[branch-negative] - subprocess.CalledProcessError: Command '['llvm-kompile', 'out-kore/llvm-library/definition.kore', 'out-kore/llvm-library/dt', 'c', '-O2', '--', '-o', PosixPath('out-kore/llvm-library/interpreter')]'...
FAILED test_prove_rs[bitwise-not-shift] - subprocess.CalledProcessError: Command '['llvm-kompile', 'out-kore/llvm-library/definition.kore', 'out-kore/llvm-library/dt', 'c', '-O2', '--', '-o', PosixPath('out-kore/llvm-library/interpreter')]'...
FAILED test_prove_rs[pointer-cast] - subprocess.CalledProcessError: Command '['llvm-kompile', 'out-kore/llvm-library/definition.kore', 'out-kore/llvm-library/dt', 'c', '-O2', '--', '-o', PosixPath('out-kore/llvm-library/interpreter')]'...
```
- caused by concurrent access to the same files. Need to compile into temp directory or proof-dir
- [ ] Unexpected output: Probably always an additional step to an unknown function
```
FAILED kmir/src/tests/integration/test_cli.py::test_cli_show_statistics_and_leaves - AssertionError: The actual output does not match the expected output:
FAILED test_prove_rs[symbolic-structs-fail] - AssertionError: The actual output does not match the expected output:
FAILED test_crate_examples[single-lib] - AssertionError: The actual output does not match the expected output:
FAILED test_prove_rs[symbolic-args-fail] - AssertionError: The actual output does not match the expected output:
FAILED test_prove_rs[pointer-cast-length-test-fail] - AssertionError: The actual output does not match the expected output:
```
- could extend the function lookup and provide the symbol name for functions not present in the items table.
- or just leave things as they are, not much difference to before (getting stuck one step later now)
- [ ] `test_prove`: single test with a checked-in spec file. Delete this test!

1. use temp directory or proof directory for prove-rs DONE
2. do not copy the bin file for HS backend DONE
3. implement LLVM backend compilation for concrete execution DONE
4. delete `test_prove` test DONE
5. re-test (without `test_decode_value`) DONE

1. factor out the compilation method and revert prove_rs change (remove `KMIR.from_compiled_kore`)
Unfortunately this does not work. It would mean to move logic from `prove_rs` out of `KMIR`
to the call sites because we need the SMIRInfo to produce the modified KMIR object.
Instead, `KMIR.run_smir` becomes a static method with the same techniques.
2. use `run_process2`
3. implement KMIR.run_rs (with the above)
4. implement fixed type lookup table for `test_decode_value` (per session recompiled fixture)
5. remove helper constructs and commands
6. re-test
10 changes: 6 additions & 4 deletions kmir/src/kmir/__main__.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

import logging
import sys
import tempfile
from argparse import ArgumentParser
from pathlib import Path
from typing import TYPE_CHECKING
Expand Down Expand Up @@ -57,13 +58,14 @@ def _kmir_run(opts: RunOpts) -> None:
# target = opts.bin if opts.bin else cargo.default_target
smir_info = cargo.smir_for_project(clean=False)

result = kmir.run_smir(smir_info, start_symbol=opts.start_symbol, depth=opts.depth)
print(kmir.kore_to_pretty(result))
with tempfile.TemporaryDirectory() as work_dir:
kmir = KMIR.from_kompiled_kore(smir_info, symbolic=opts.haskell_backend, target_dir=work_dir)
result = kmir.run_smir(smir_info, start_symbol=opts.start_symbol, depth=opts.depth)
print(kmir.kore_to_pretty(result))


def _kmir_prove_rs(opts: ProveRSOpts) -> None:
kmir = KMIR(HASKELL_DEF_DIR, LLVM_LIB_DIR, bug_report=opts.bug_report)
proof = kmir.prove_rs(opts)
proof = KMIR.prove_rs(opts)
print(str(proof.summary))
if not proof.passed:
sys.exit(1)
Expand Down
1 change: 1 addition & 0 deletions kmir/src/kmir/build.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,3 +10,4 @@
LLVM_DEF_DIR: Final = kdist.which('mir-semantics.llvm')
LLVM_LIB_DIR: Final = kdist.which('mir-semantics.llvm-library')
HASKELL_DEF_DIR: Final = kdist.which('mir-semantics.haskell')
KMIR_SOURCE_DIR: Final = kdist.which('mir-semantics.source')
Loading
Loading