Skip to content

Conversation

@corneliusroemer
Copy link
Member

Looking into sort performance (see #1647), I noticed that the various string containment operations where performance limiting.

Instead of doing multiple string containment call, this PR uses a compile time lookup table to turn a sequence into a bitpacked representation.

A quick benchmark on 240MB of SC2 sequences show this reduces core sort algorithm runtime (excluding minimizer download) by 42%.

…cked representation

Looking into sort performance (see #1647), I noticed that the various string containment operations where performance limiting.

Instead of doing multiple string containment call, this PR uses a compile time lookup table to turn a sequence into a bitpacked representation.

A quick benchmark on 240MB of SC2 sequences show this reduces core sort algorithm runtime (excluding minimizer download) by 42%.
@github-actions
Copy link

github-actions bot commented Jun 9, 2025

@corneliusroemer corneliusroemer marked this pull request as draft June 9, 2025 14:04
return cutoff + 1; // break out of loop, return hash above cutoff
}

// A=11=3, C=10=2, G=00=0, T=01=1
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was actually wrong - T and C should have been interchanged.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's why the initial preview finds no matches (I used the comment rather than the code to implement the lookup table)

@corneliusroemer corneliusroemer marked this pull request as ready for review June 9, 2025 14:39
@ivan-aksamentov
Copy link
Member

ivan-aksamentov commented Jun 10, 2025

I was requested to review this. Perf improvement sounds nice! (haven't checked numbers though)

I am having difficulties understanding the diff and commit messages. Looks like the work is ongoing? Agent went sentient? :) The PR goes well beyond the "bitpacked representation". Some replaced pieces seem unrelated and don't look equivalent. Will need some more explanation to understand.

It's unclear how to verify correctness. We don't have tests here, so we will have to "big brain" it, and I will need all the help available to do that.

Ideally, I'd like bit-twiddling tricks to be isolated, rather than just spread all over the place. And if it's a well-known algo (most of them are), to be named explicitly. If some kind of a bit container is used, perhaps it could be made into a reusable class? Or perhaps an existing library can be used? (sparing us from edge cases, like all kinds of over-/under-flows, and from maintenance)

Oftentimes bits don't need to be moved directly. Compiler can usually transform well-written, human-readable code into bit hackery itself. Usually even better than humans can. This looks like an over-optimization and "not invented here" code (typical for LLM code). Though if performance gains are there and correctness is preserved, I would not mind. But I might be a bit grumpy :)

Will wait for some complete work and some comments before re-reviewing. Feel free to ping me or re-request a review when ready.

@ivan-aksamentov ivan-aksamentov removed their request for review June 10, 2025 06:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants