
Adding aic-hw-version Compile Options Support #528

Open: abukhoy wants to merge 3 commits into main

abukhoy (Contributor) commented Aug 4, 2025

This pull request adds support for passing compile options as keyword arguments (kwargs), including the aic-hw-version parameter, which accepts the values "ai100" or "ai200". If no value is provided, it defaults to "ai100", which selects the AI100 hardware.

These options let users tailor the compile API to their specific requirements.

Example Usage:

from QEfficient import QEFFAutoModelForCausalLM
from transformers import AutoTokenizer

model_name = "gpt2"
model = QEFFAutoModelForCausalLM.from_pretrained(model_name, num_hidden_layers=2)

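# 'aic-hw-version' contains hyphens, so it is not a valid Python identifier
# and has to be passed by unpacking a dict rather than as a plain keyword.
model.compile(prefill_seq_len=128, ctx_len=256, num_cores=16, num_devices=1, **{'aic-hw-version': 'ai100'})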

tokenizer = AutoTokenizer.from_pretrained(model_name)
model.generate(prompts=["Hi there!!"], tokenizer=tokenizer)

Note: Previously, the default value for aic-hw-version was "2.0", which implicitly referred to AI100. This value is now deprecated and replaced with the explicit "ai100" identifier.
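
For reviewers, here is a minimal sketch of how the default and the accepted values described above could be resolved. It is not the code in this PR: `resolve_hw_version`, the constant names, and the choice to keep accepting the legacy "2.0" value with a deprecation warning are all assumptions made for illustration.

```python
import warnings

# Values taken from the PR description.
SUPPORTED_HW_VERSIONS = ("ai100", "ai200")
DEFAULT_HW_VERSION = "ai100"

# Assumption: the old "2.0" value is still accepted and mapped to "ai100".
# The PR only states that "2.0" is deprecated, not how it is handled.
LEGACY_ALIASES = {"2.0": "ai100"}


def resolve_hw_version(compiler_options: dict) -> str:
    """Hypothetical helper: pick the hardware version from compile kwargs."""
    value = compiler_options.get("aic-hw-version", DEFAULT_HW_VERSION)

    if value in LEGACY_ALIASES:
        replacement = LEGACY_ALIASES[value]
        warnings.warn(
            f'aic-hw-version="{value}" is deprecated; use "{replacement}" instead',
            DeprecationWarning,
            stacklevel=2,
        )
        value = replacement

    if value not in SUPPORTED_HW_VERSIONS:
        raise ValueError(
            f"Unsupported aic-hw-version {value!r}; expected one of {SUPPORTED_HW_VERSIONS}"
        )
    return value


print(resolve_hw_version({}))                            # ai100
print(resolve_hw_version({"aic-hw-version": "ai200"}))   # ai200
```

Rejecting unknown values early, before the compiler is invoked, would give users a clearer error than a failed compile, but whether the PR does this is not stated here.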

abukhoy (Contributor, Author) commented Aug 6, 2025

I made a small change to the _compile function of the base class by adding a helper method. If that's not okay, I will revert it.
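
To make the discussion concrete, a helper of this kind might turn compile kwargs into command-line flags roughly as follows. This is only a sketch under assumptions: the name `build_compiler_flags`, the `-key=value` flag format, the underscore-to-hyphen conversion, and the `example_flag` option are hypothetical and are not taken from the actual diff.

```python
from typing import Any, List


def build_compiler_flags(**compiler_options: Any) -> List[str]:
    """Hypothetical helper: render compile kwargs as command-line flags."""
    flags: List[str] = []
    for key, value in compiler_options.items():
        # Assumption: flags are hyphenated, so 'some_option' becomes '-some-option'.
        option = "-" + key.replace("_", "-")
        if isinstance(value, bool):
            # Boolean options act as switches: emitted when True, omitted when False.
            if value:
                flags.append(option)
            continue
        flags.append(f"{option}={value}")
    return flags


# 'example_flag' is a made-up option used only to show the boolean case.
print(build_compiler_flags(**{"aic-hw-version": "ai200"}, example_flag=True))
# ['-aic-hw-version=ai200', '-example-flag']
```

Keeping this logic in one helper would leave _compile itself free of per-option special cases, which is presumably the motivation for the change.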

quic-hemagnih (Contributor) commented

Is anything pending on this? I think we are good to merge this change.

quic-rishinr (Contributor) commented

> Is anything pending on this? I think we are good to merge this change.

Yes, the compiler changes need to be merged before we proceed with adding this change to Qeff.

4 participants