Collaborator

@ineol ineol commented Oct 8, 2025

It seems to be working on a small example.

It needs to special-case functions such as zeroExtend to recognize that some of their parameters are "width" parameters.

@ineol ineol requested a review from bollu October 8, 2025 20:33
github-actions bot commented Oct 8, 2025

bv_decide solved 0 theorems.
bitwuzla solved 0 theorems.
bv_decide found 0 counterexamples.
bitwuzla found 0 counterexamples.
bv_decide only failed on 0 problems.
bitwuzla only failed on 0 problems.
both bitwuzla and bv_decide failed on 0 problems.
In total, bitwuzla saw 0 problems.
In total, bv_decide saw 0 problems.
ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS
ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS
The InstCombine benchmark contains 4520 theorems in total.
Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv
all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan
all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan
mean of percentage stddev/av: nan%

@ineol ineol force-pushed the push-tnpkrqksnkvx branch from d003a14 to 9feffb1 Compare October 8, 2025 22:09

@ineol ineol force-pushed the push-tnpkrqksnkvx branch 2 times, most recently from e6f5a3d to 0cbe25c Compare October 9, 2025 09:40
@ineol ineol marked this pull request as ready for review October 9, 2025 09:40
Collaborator Author

ineol commented Oct 9, 2025

@bollu it seems to be working now

Collaborator

@bollu bollu left a comment

While I agree that this is a viable approach, I had a different algorithm in mind:

At each function application f x1 ... xn, infer the type of the function as f : T1 -> T2 -> ... -> Tn -> O. Inspect each pair (xi, Ti), and if the type Ti is BitVec K, perform the generalization as you did. This avoids the need for a table, but I'm not sure about the tradeoffs between having a table and not having one.
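
A minimal sketch of the per-argument check described above, assuming the application contains no loose bound variables so that inferType can be called on each argument. The names bitVecWidth? and argWidths are hypothetical and do not come from the PR.

```lean
import Lean
open Lean Meta

/-- Hypothetical helper: if `e` has type `BitVec w`, return the width expression `w`. -/
def bitVecWidth? (e : Expr) : MetaM (Option Expr) := do
  let ty ← whnfR (← instantiateMVars (← inferType e))
  if ty.isAppOfArity ``BitVec 1 then
    return some ty.appArg!
  else
    return none

/-- Collect the widths of all `BitVec`-typed arguments of the application `e`. -/
def argWidths (e : Expr) : MetaM (Array Expr) := do
  let mut ws : Array Expr := #[]
  for arg in e.getAppArgs do
    match ← bitVecWidth? arg with
    | some w => ws := ws.push w   -- candidate width to generalize
    | none   => pure ()           -- not a bitvector argument, skip it
  return ws
```

This only sees widths that appear in the type of some argument. A width passed as a plain Nat is not detected this way.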

Anyway, LGTM! I wrote down some comments, but none of them are blockers to merging.

Thanks muchly ^_^

Comment on lines 1108 to 1110
for (e', x) in s.mapping do
  if ← isDefEq e e' then
    return x
Collaborator

Suggested change
for (e', x) in s.mapping do
  if ← isDefEq e e' then
    return x
-- TODO: Instead of using a HashMap, consider using a DiscrTree.
for (e', x) in s.mapping do
  if ← isDefEq e e' then
    return x

Collaborator Author

Hmm, is there a good way to recover the original expression from the keys of the tree?
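
One possible answer, sketched under assumptions: rather than recovering the expression from the keys, store the original expression in the value next to its replacement. The names GenCache, cacheAdd and cacheFind? are hypothetical, and the sketch assumes a recent Lean 4 toolchain in which DiscrTree.insert and DiscrTree.getMatch take no configuration argument.

```lean
import Lean
open Lean Meta

/-- Hypothetical cache: each value stores the original expression next to its replacement. -/
abbrev GenCache := DiscrTree (Expr × Expr)

/-- Index the pair `(e, x)` under the discrimination-tree keys of `e`. -/
def cacheAdd (c : GenCache) (e x : Expr) : MetaM GenCache :=
  DiscrTree.insert c e (e, x)

/-- Look up `e`, confirming each candidate with `isDefEq` as the current loop does. -/
def cacheFind? (c : GenCache) (e : Expr) : MetaM (Option Expr) := do
  -- `getMatch` only returns candidates whose keys are compatible with `e`.
  let candidates ← DiscrTree.getMatch c e
  for (e', x) in candidates do
    if ← isDefEq e e' then
      return some x
  return none
```

The isDefEq check stays because the tree match is only an approximation. The tree narrows the candidate set and the definitional-equality test makes the final decision.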

      let arg ← if bv? i then State.add? arg else visit arg
      pure <| .app res arg
  | .forallE n e₁ e₂ info =>
    pure <| .forallE n (← visit e₁) (← visit e₂) info
Collaborator

Why do you work with raw BVars instead of using a combinator such as forallTelescope? I guess in this case it's OK, but I do wonder why you prefer this approach :)

Collaborator Author

Hmm, we want to recurse into e₁, so we'd need to first change the types in the telescope and then use forallTelescope, right?
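
A sketch of one way to get both, under assumptions: instead of forallTelescope, which would introduce all binders with their original types, handle one binder at a time with withLocalDecl so that the binder type can be rewritten before the variable is introduced. The name visitForall and its visit parameter are hypothetical stand-ins for the traversal in the PR.

```lean
import Lean
open Lean Meta

/-- Rewrite binder types with `visit`, then rebuild the `∀` with the new types. -/
partial def visitForall (visit : Expr → MetaM Expr) (e : Expr) : MetaM Expr := do
  match e with
  | .forallE n d b bi =>
    let d' ← visit d                    -- recurse into the binder type first
    withLocalDecl n bi d' fun x => do   -- introduce `x : d'` as a free variable
      let b' ← visitForall visit (b.instantiate1 x)
      mkForallFVars #[x] b'             -- abstract `x` again
  | _ => visit e
```

Like the raw-BVar version, this does not check that the body is still well typed once a binder type has changed.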

Comment on lines +1131 to +1145
def genTable : Std.HashMap Name (Array Bool) := Id.run do
  let mut table := .emptyWithCapacity 16
  table := table.insert ``BitVec #[true]
  table := table.insert ``BitVec.zeroExtend #[true, true, false]
  table := table.insert ``BitVec.signExtend #[true, true, false]
  table := table.insert ``BitVec.instAdd #[true]
  table := table.insert ``BitVec.instSub #[true]
  table := table.insert ``BitVec.instMul #[true]
  table := table.insert ``BitVec.instDiv #[true]
  table
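
For illustration, a table of this shape could be consulted as follows. The name isWidthArg is hypothetical, and the sketch assumes argument positions count implicit arguments too, as the #[true, true, false] entries suggest.

```lean
import Std.Data.HashMap
open Lean

/-- Hypothetical lookup: is the `i`-th argument of `f` flagged as a width in `table`? -/
def isWidthArg (table : Std.HashMap Name (Array Bool)) (f : Name) (i : Nat) : Bool :=
  match table[f]? with
  | some flags => flags[i]?.getD false   -- positions past the end count as non-width
  | none       => false                  -- functions not in the table are never generalized
```

Under that reading, the entry for BitVec.signExtend flags the implicit source width and the explicit target width, but not the bitvector argument itself.
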
Collaborator

Don't we always want to generalize a BV variable? My intuition is that, instead of having a table, we check whether a value has type BitVec w and, if it does, generalize it, with a possible exception for BitVec 1?

Collaborator Author

The problem is that we want to generalize the 10 in x.signExtend 10, but only because we know that the first parameter of signExtend is a width and not an arbitrary Nat.

A more rigorous approach would be to analyze the type of signExtend, see that the variable appears as a parameter of BitVec, and recover the information in the table that way, but it seems complicated.
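
A rough sketch of what such a type analysis could look like, offered as an assumption rather than as the PR's method: open the signature of a constant and flag every parameter x for which BitVec x occurs somewhere in a parameter type or in the result type. The name widthParams is hypothetical.

```lean
import Lean
open Lean Meta

/-- Hypothetical: flag each parameter of `f` that occurs as `BitVec x` in the signature of `f`. -/
def widthParams (f : Name) : MetaM (Array Bool) := do
  let info ← getConstInfo f
  forallTelescope info.type fun xs body => do
    -- Every type mentioned in the signature: parameter types plus the result type.
    let tys := (← xs.mapM (inferType ·)).push body
    xs.mapM fun x => do
      let needle := mkApp (mkConst ``BitVec) x
      return tys.any fun ty => (ty.find? (· == needle)).isSome
```

This is purely syntactic: it does not unfold definitions, and it would not reproduce a table entry for BitVec itself, whose type mentions no BitVec application.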

Collaborator Author

I just read your comment above, so we agree on the method :)

I think we can try this simple approach and see if it's sufficient for the evaluation.

@ineol ineol force-pushed the push-tnpkrqksnkvx branch from 0cbe25c to 867a36c Compare October 9, 2025 12:08

@ineol ineol added this pull request to the merge queue Oct 9, 2025
Merged via the queue into main with commit e0bb242 Oct 9, 2025
20 of 21 checks passed