Skip to content

Conversation

rocallahan
Copy link
Contributor

What are the reasons/motivation for this change?

#5312 explains how Const is currently very inefficient. This PR fixes a lot of that.

Explain how this is achieved.

The main piece of work here is to remove all direct calls to Const::bits(). We add some new APIs to Const that callers can use instead. Then we make the Const(long long) constructor use the packed bits representation (i.e. std::string), and update various Const methods to work with that representation instead of downconverting to vector<State>. On one of our large circuits, this significantly reduces the time spend in OptMergePass::hash_cell_parameters().

I realize this is a huge PR. I should probably break it up into smaller PRs for actual review and merging. I'm putting it here so people can see what it looks like.

If applicable, please suggest to reviewers how they can test the change.

This shouldn't change any behavior. FWIW, after I created all the big patches to remove all calls to Const::bits() and got them to compile, Yosys tests only revealed one bug. So the number of latent bugs not caught by tests should hopefully be correspondingly low.

@widlarizer
Copy link
Collaborator

From taking a look at the changes in rtlil.cc and rtlil.h, this does look very promising

@rocallahan
Copy link
Contributor Author

rocallahan commented Sep 1, 2025

I've added a bunch of tests to rtlilTest.cc.

Writing these tests revealed a regression due to my changes --- I was computing the hash value of a Const differently for packed bits vs explicit vector, even when they represented the same list of bits. So I've fixed that.

@widlarizer
Copy link
Collaborator

The upgrade from c.bits().resize(i, val) to c.resize(i, val) is not clear to me, since the resize method does unconditionally bitvectorize. Is this something you intend to change?

@widlarizer
Copy link
Collaborator

Providing only a const_iterator was a deliberate decision, I wonder if there are scenarios where somebody could accidentally get a bitvectorizing iterator when they only need a const_iterator. I'm guessing the purpose of the iterator::proxy is to allow setting bits to two-valued states in string-backed consts. The proxy sounds like a neat encapsulation, but I can't imagine a case where using it to patch string-backed consts would actually save memory or runtime... that sort of thing shouldn't be happening in the codebase

@rocallahan
Copy link
Contributor Author

The upgrade from c.bits().resize(i, val) to c.resize(i, val) is not clear to me, since the resize method does unconditionally bitvectorize. Is this something you intend to change?

Both of those unconditionally bitvectorize, so I think nothing is changing here.

@rocallahan
Copy link
Contributor Author

I wonder if there are scenarios where somebody could accidentally get a bitvectorizing iterator when they only need a const_iterator.

The non-const iterator does not bitvectorize --- unless/until you write through the iterator, in which case it does. But if you're modifying bits then you would need to bitvectorize anyway, since currently we don't allow modifying the packed bits.

I'm guessing the purpose of the iterator::proxy is to allow setting bits to two-valued states in string-backed consts.

No. It's to catch writes to through the iterator, so we turn those into set() which bitvectorizes if necessary.

The proxy sounds like a neat encapsulation, but I can't imagine a case where using it to patch string-backed consts would actually save memory or runtime... that sort of thing shouldn't be happening in the codebase

Right! This code does not add support for patching string-backed consts.

@rocallahan rocallahan marked this pull request as ready for review September 10, 2025 23:46
@rocallahan
Copy link
Contributor Author

By the way, I didn't mention this anywhere, but maybe it's not obvious: we really benefit here from small-string optimization in the C++ standard library. I.e. for small strings (<= 15 bytes typically, which certainly includes the very common 32-bit constants), the string is not heap-allocated but instead stored directly in the std::string. Which means a Const created by the default Const(long long) 32-bit constructor doesn't require any heap allocation.

@rocallahan
Copy link
Contributor Author

There's a message in that last test log that makes no sense to me:

2025-09-11T00:18:36.0028437Z ../kernel/rtlil.cc:471:13: runtime error: left shift of negative value -1
2025-09-11T00:18:36.0039548Z SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../kernel/rtlil.cc:471:13 

The relevant code is

	if (is_signed && significant_bits > 0 && significant_bits < 32 && bv.back() == State::S1 )
		ret |= -1 << significant_bits;

so it's pretty obvious that significant_bits can't be negative here :-(.

I can't reproduce any test failures running UBSAN locally.

I'll rebase to main and re-push and see if the error persists.

@whitequark
Copy link
Member

so it's pretty obvious that significant_bits can't be negative here :-(.

It's complaining about the left hand side. Shifting 1 into the sign bit is UB in C.

@whitequark
Copy link
Member

You probably want ret |= -1UL << significant_bits; or something like that.

@rocallahan
Copy link
Contributor Author

It's complaining about the left hand side. Shifting 1 into the sign bit is UB in C.

Huh. Which means left-shift of any negative value is UB in C! I never knew that.

Odd that it doesn't get reported locally. Possibly that got changed in my version of clang because C++20 makes left-shift of a negative number defined behavior.

Anyway, fixed it (I hope) using UINT32_MAX << significant_bits.

Copy link
Member

@jix jix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, mostly based on what @widlarizer had already reviewed and me reviewing a few specific things after coordinating internally on what was still left.

Now that we only call `bitvectorize()` in non-const methods, we can move the casting-away-const to only happen
in `bitvectorize()`, which is deprecated so only some plugins (maybe) are using it.

This means `const` `Const` methods don't change the underlying data, which means
they'll be safe to use from multiple threads if/when we want to do that.
@widlarizer widlarizer merged commit c057789 into YosysHQ:main Sep 16, 2025
27 checks passed
@rocallahan rocallahan deleted the cleanup-Const branch September 16, 2025 10:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants