
Conversation


@winggan winggan commented Aug 24, 2025

  • support TORCH_CUDA_ARCH_LIST
    support TORCH_CUDA_ARCH_LIST so we can compile the wheel in a non-GPU environment, which makes it more convenient to work with CI/CD and prebuilt binary distribution (see the setup.py sketch after this list)

Now we can build SageAttention in a CPU-only environment using, for example:

CUDA_HOME=/path/to/cuda-x.y TORCH_CUDA_ARCH_LIST='8.0;9.0+PTX' python3 setup.py bdist_wheel
  • avoid linking against libcuda.so
    libcuda.so (the driver API) is not available in a non-GPU environment, so we should avoid linking against it directly at compile time.
    NVIDIA offers a standard way to access the driver API via cudaGetDriverEntryPointByVersion (or previously cudaGetDriverEntryPoint), which dynamically loads a driver API function so it can be called through a function pointer resolved at runtime (see the C++ sketch after this list).
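To illustrate the first point, the arch list can be turned into nvcc -gencode flags without ever querying a GPU. This is a minimal sketch under assumptions, not the actual SageAttention setup.py; the helper name gencode_flags and the GPU fallback are illustrative only.

# Minimal sketch, not the actual SageAttention setup.py: derive nvcc
# -gencode flags from TORCH_CUDA_ARCH_LIST so no GPU query is needed.
import os
import torch

def gencode_flags():
    arch_list = os.environ.get("TORCH_CUDA_ARCH_LIST", "")
    if arch_list:
        # e.g. "8.0;9.0+PTX" -> ["8.0", "9.0+PTX"]
        archs = [a for a in arch_list.replace(" ", ";").split(";") if a]
    else:
        # No override set: fall back to the GPUs visible at build time.
        archs = sorted({"%d.%d" % torch.cuda.get_device_capability(i)
                        for i in range(torch.cuda.device_count())})
    flags = []
    for arch in archs:
        emit_ptx = arch.endswith("+PTX")
        sm = arch.replace("+PTX", "").replace(".", "")  # "9.0+PTX" -> "90"
        flags.append(f"-gencode=arch=compute_{sm},code=sm_{sm}")
        if emit_ptx:
            # Also embed PTX so newer GPUs can JIT-compile the kernel.
            flags.append(f"-gencode=arch=compute_{sm},code=compute_{sm}")
    return flags

With TORCH_CUDA_ARCH_LIST='8.0;9.0+PTX' set as in the command above, this would yield flags for sm_80 and sm_90 plus embedded PTX, regardless of what (if any) GPU is installed.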
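And for the second point, here is a minimal sketch (again, not the actual SageAttention code) of resolving a driver API symbol through the runtime instead of linking against libcuda.so. cudaGetDriverEntryPointByVersion requires CUDA 12.5 or newer (older toolkits can use cudaGetDriverEntryPoint), and cuDeviceGetCount is just an example symbol.

// Minimal sketch: look up a driver API function by name at runtime,
// so the build never links against libcuda.so.
#include <cstdio>
#include <cuda.h>           // CUresult and driver API types (no link-time symbol use)
#include <cuda_runtime.h>   // cudaGetDriverEntryPointByVersion (libcudart)

using cuDeviceGetCount_fn = CUresult (*)(int *);

int main() {
    void *fn = nullptr;
    cudaDriverEntryPointQueryResult status;
    // Ask the runtime for the driver's cuDeviceGetCount, matching the 12.0 ABI.
    cudaError_t err = cudaGetDriverEntryPointByVersion(
        "cuDeviceGetCount", &fn, 12000, cudaEnableDefault, &status);
    if (err != cudaSuccess || status != cudaDriverEntryPointSuccess || fn == nullptr) {
        std::fprintf(stderr, "driver entry point not available\n");
        return 1;
    }
    int count = 0;
    reinterpret_cast<cuDeviceGetCount_fn>(fn)(&count);
    std::printf("CUDA devices: %d\n", count);
    return 0;
}

Linked with only -lcudart (no -lcuda), this should build on a machine without a GPU; the driver symbol is resolved lazily at run time on a host where libcuda.so actually exists.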

@winggan changed the title from "Update setup.py to support TORCH_CUDA_ARCH_LIST" to "support TORCH_CUDA_ARCH_LIST and avoid link against libcuda.so at compile time" on Aug 24, 2025