Use external hip and hipcub headers when HOOMD_GPU_PLATFORM=CUDA #2178
Builds fail with 7.1 on CUDA 12.
Also document Pixi installation instructions.
@mphoward, what do you think of using an external HIP library even when Ideally we could avoid this by using In any case, I won't merge this until
Hmm, it isn't ideal to require CUDA users to download and compile HIP, but I agree this fix is substantially less work than getting Do you have a rough sense of when
I do not. When working on this branch, I was surprised to find that I am concerned that HIP/CUDA interoperation is no longer actively supported by AMD. They seem to measure success by "can it run PyTorch?" and have even made the effort to create a custom build system that builds ROCm, HIP, and PyTorch: https://github.com/ROCm/TheRock
I agree, this is all quite concerning. It might also explain why they have matched the HIP kernel launch / device code interface more closely with CUDA in recent major releases (it is a replacement, not a compatibility layer). Lack of proper CUDA support would be a good reason to put in the effort to update hipper and use it as our own compatibility layer. I can prioritize working on that, but the soonest I can have a look would be in ~2 weeks (after the semester ends). It will require breaking changes to hipper first, and I am fine with jumping to whatever minimum versions of CUDA and ROCm HOOMD needs, to keep the work as limited as possible.
We'll give AMD some time and see if they add CUDA 13 support. If you do plan an eventual hipper refactor, the minimum CUDA I need is 12.8. NCSA Delta has just updated to that version. When I submitted this PR, the system was still on CUDA 12.4.
Description
find_package(hip) to find external headers.
Motivation and context
CUDA 13 contains many breaking changes and the vendored headers do not support it.
By using external hip libraries, HOOMD-blue will gain support for new versions of CUDA as soon as upstream adds support (at this time the latest release of hipcub does not support CUDA 13).
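As a rough illustration only, locating externally installed packages from CMake might look like the sketch below. This is not the code in this pull request: the `find_package(hipcub)` call and the status message are assumptions added for illustration, and only `find_package(hip)` and `HOOMD_GPU_PLATFORM=CUDA` come from the PR text.

```cmake
# Hypothetical sketch; HOOMD-blue's actual CMake logic may differ.
if(HOOMD_GPU_PLATFORM STREQUAL "CUDA")
    # Look for an installed hip package instead of relying on vendored headers.
    find_package(hip REQUIRED)

    # Assumption: hipcub is located the same way through its own package config.
    find_package(hipcub REQUIRED)

    # hip_VERSION is set by find_package when the package provides a version file.
    message(STATUS "Using external hip ${hip_VERSION}")
endif()
```

If the packages are installed in a non-default location, pointing `CMAKE_PREFIX_PATH` at that install prefix is the usual way to help `find_package` locate them.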
conda-forge does not support HIP_PLATFORM=nvidia (conda-forge/hip-feedstock#9) in hip-devel and lacks a hipcub package entirely. Therefore, users that build HOOMD from source for NVIDIA GPUs will need to install hip and hipcub headers.
How has this been tested?
HOOMD-blue compiles and passes tests with CUDA 12.9, rocm-systems:hip-version_7.2.53220, and rocm-libraries:rocm-7.1.0 locally. CI checks have been updated accordingly. Patches to hip and hipcub fix build errors with CUDA 12.5–12.8.
Checklist:
sphinx-doc/credits.rst) in the pull request source branch.
CHANGELOG.rst following the established format.