zhangnju commented:

When running the official bitsandbytes int8 benchmark (https://github.com/bitsandbytes-foundation/bitsandbytes/blob/main/benchmarking/int8/int8_benchmark.py), or the sample code below for int8 quantization:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load the tokenizer and an 8-bit quantized model via bitsandbytes
base_model_name = "/models/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
quantization_config = BitsAndBytesConfig(load_in_8bit=True)
bnb_model_8bit = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    device_map="auto",
    quantization_config=quantization_config,
)

# Generate from a short prompt (on ROCm, the int8 matmuls go through hipBLASLt)
prompt = "What is a large language model?"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
generated_ids = bnb_model_8bit.generate(**inputs)
outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print(outputs)
```

we hit the following hipBLASLt matmul error:

[screenshot: hipBLASLt matmul failure]

This patch fixes the issue by 1) removing rocBLAS from the Context class, since only hipBLASLt is used in the ROCm build of bitsandbytes, and 2) passing a workspace pointer to the hipBLASLt matmul call. The patch also adds more log output to help with debugging.
After applying this patch, both the sample code above and the int8 benchmark run successfully; a minimal sketch of the workspace change is shown below.
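
For concreteness, here is a minimal C++ sketch of what the two changes amount to. This is not the actual patch: `Context`, `gemm_int8`, the descriptor arguments, and the 32 MiB workspace size are hypothetical stand-ins, while the `hipblasLt*` calls themselves are the real hipBLASLt API.

```cpp
// Sketch only: illustrates the two changes described above, not the exact patch.
#include <hip/hip_runtime.h>
#include <hipblaslt/hipblaslt.h>
#include <cstdio>

// Hypothetical Context: after the change it holds only a hipBLASLt handle
// (the unused rocBLAS handle is removed), plus a device workspace buffer.
struct Context {
    hipblasLtHandle_t lt_handle = nullptr;
    void*  workspace      = nullptr;
    size_t workspace_size = 32 * 1024 * 1024;  // hypothetical 32 MiB workspace

    Context() {
        hipblasLtCreate(&lt_handle);
        hipMalloc(&workspace, workspace_size);  // scratch space hipBLASLt may use
    }
    ~Context() {
        hipFree(workspace);
        hipblasLtDestroy(lt_handle);
    }
};

// Hypothetical int8 GEMM wrapper. The key change is in the last three
// arguments of hipblasLtMatmul: a real workspace pointer and its size
// instead of nullptr / 0.
hipblasStatus_t gemm_int8(Context& ctx,
                          hipblasLtMatmulDesc_t op_desc,
                          const void* alpha,
                          const void* A, hipblasLtMatrixLayout_t A_desc,
                          const void* B, hipblasLtMatrixLayout_t B_desc,
                          const void* beta,
                          void* C, hipblasLtMatrixLayout_t C_desc,
                          const hipblasLtMatmulAlgo_t* algo,
                          hipStream_t stream) {
    hipblasStatus_t status = hipblasLtMatmul(
        ctx.lt_handle, op_desc,
        alpha, A, A_desc, B, B_desc,
        beta,  C, C_desc,    // C used as both C and D for an in-place result
               C, C_desc,
        algo,
        ctx.workspace,       // <-- previously nullptr: pass a real workspace
        ctx.workspace_size,  // <-- previously 0
        stream);
    if (status != HIPBLAS_STATUS_SUCCESS)
        fprintf(stderr, "hipblasLtMatmul failed: %d\n", (int)status);  // extra debug logging
    return status;
}
```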
