| User Guide | Download SDK | Blog |
Latest News 🔥
- [2025/08] Try the QAic Bench script for LLM benchmarking on Cloud AI accelerators
- [2025/08] The Open WebUI tutorial shows how to use Open WebUI's chat interface with Cloud AI accelerators.
- [2025/08] Added Kubernetes tutorial
- [2025/08] Added Efficient Transformers tutorial
- [2025/08] Added DETR ResNet-50 model example
- [2025/08] Added YOLOv8 model example
- [2024/11] Check out the Qualcomm Cloud AI Playground Tutorial to learn how to access the latest Generative AI models running on Qualcomm Cloud AI 100 Ultra Accelerators hosted in the cloud.
- [2024/11] Added Stable Diffusion XL Turbo model example
- [2024/11] Added Stable Diffusion 3 model example
- [2024/09] Added Whisper model example
- [2024/09] Added SDXL-DeepCache model example
- [2024/04] Qualcomm released the Efficient Transformers library for seamless deployment of pre-trained LLMs.
- [2024/03] Added AI 100 Ultra recipe for Llama family of LLMs - e.g., Llama-2-7B
- [2024/03] Added support for Speculative Decoding with LLMs - CodeGen with Speculative Decoding
- [2024/02] Added support for Stable Diffusion XL
- [2024/02] Added support for MPT family of LLMs - e.g., MPT-7B
- [2024/02] Added support for GPTBigCode family of LLMs - e.g., StarCoder
- [2024/01] Added profiling tutorial for LLMs
- [2024/01] Added support for DeciDiffusion-v2.0
- [2024/01] Added support for DeciCoder-6B
- [2024/01] Added support for Llama family of LLMs - e.g., Llama-2-7B
Qualcomm Cloud AI 100 offers a unique blend of high computational performance, low latency, and low power consumption, making it well suited to a broad range of AI applications, including computer vision, natural language processing, and Generative AI such as Large Language Models (LLMs). It is designed for both public and private cloud environments and supports Enterprise AI applications.
This repository provides developers with three key resources:
- Models - Recipes for running CV, NLP, and multimodal models performantly on Cloud AI platforms. For LLM, embedding, and speech models, see efficient-transformers.
- Tutorials - Tutorials covering model onboarding, performance tuning, and profiling of CV/NLP inference on Cloud AI platforms.
- Samples - Sample code illustrating Python and C++ API usage for inference on Cloud AI platforms (a minimal Python sketch follows the model list below).
Supported models include:
- Stable Diffusion (`stabilityai/stable-diffusion-xl-base-1.0`, `stabilityai/stable-diffusion-2-1`, `runwayml/stable-diffusion-v1-5`, etc.)
- DeciDiffusion (`Deci/DeciDiffusion-v2-0`, `Deci/DeciDiffusion-v1-0`, etc.)
- 80+ models including all varieties of `bert` models, `sentence-transformer` embedding models, etc.
- ViT (`vit_b_16`, `vit_b_32`, `vit-base-patch16-224`)
- YOLO (`yolov5s`, `yolov5m`, `yolov5l`, `yolov5x`, `yolov7-e6e`, `yolov8`)
- ResNet (`resnet18`, `resnet34`, `resnet50`, `resnet101`, `resnet152`)
- ResNeXt (`resnext101_32x8d`, `resnext101_64x4d`, `resnext50_32x4d`)
- Wide ResNet (`wide_resnet101_2`, `wide_resnet50_2`)
- DenseNet (`densenet121`, `densenet161`, `densenet169`, `densenet201`)
- MNASNet (`mnasnet0_5`, `mnasnet0_75`, `mnasnet1_0`, `mnasnet1_3`)
- MobileNet (`mobilenet_v2`, `mobilenet_v3_large`, `mobilenet_v3_small`)
- EfficientNet (`efficientnet_v2_l`, `efficientnet_v2_m`, `efficientnet_v2_s`, `efficientnet_b0`, `efficientnet_b7`, etc.)
- ShuffleNet (`shufflenet_v2_x0_5`, `shufflenet_v2_x1_0`, `shufflenet_v2_x1_5`, `shufflenet_v2_x2_0`)
- SqueezeNet (`squeezenet1_0`, `squeezenet1_1`)
Reach out on the 📢 cloud-ai Discord channel or use 💬 GitHub Issues to request model support, ask questions, or provide feedback.
While this repository may provide documentation on how to run models on Qualcomm Cloud AI platforms, this repository does NOT contain any of these models. All models referenced in this documentation are independently provided by third parties at unaffiliated websites. Please be sure to review any third-party license terms at these websites; no license to any model is provided in this repository. This repository of documentation provides no warranty or assurances for any model so please also be sure to review all model cards, model descriptions, model limitations / intended uses, training data, biases, risks, and any other warnings given by the third party model providers.
The documentation made available in this repository is licensed under the BSD 3-Clause-Clear License. See the LICENSE file for more details.