Releases: intel/ipex-llm

2.3.0 nightly build

16 Apr 06:44
db5edba
Pre-release

IPEX-LLM release 2.2.0

07 Apr 07:23
183d01f

Highlights

Note: IPEX-LLM v2.2.0 has been updated to include functional and security updates. Users should update to the latest version.

Please go to https://github.com/ipex-llm/ipex-llm/releases/tag/v2.2.0 for the downloads.

Multi-Arc Serving release 0.1.0

09 Apr 06:51
183d01f

Overview

This release updates the Multi-ARC vLLM serving solution, optimized for Intel Xeon + ARC platforms with ipex-llm vLLM. The new version delivers low-latency, high-throughput LLM serving with improved model compatibility and resource efficiency. Major component upgrades: vLLM 0.6.6, PyTorch 2.6, oneAPI 2025.0, and oneCCL patch 0.0.6.6.

New Features

  • Optimized vLLM serving for Intel Xeon + ARC multi-GPU platforms, enabling lower latency and higher throughput.
  • Added support for a variety of LLMs.
  • Enhanced support for loading models with minimal memory requirements.
  • Refined Docker image for improved ease of use and deployment.
  • Improved WebUI model connectivity and stability.
  • Added the VLLM_LOG_OUTPUT=1 option to enable detailed input/output logging for vLLM.
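
As a rough illustration of the logging option above, the sketch below prepares an environment mapping with VLLM_LOG_OUTPUT=1 for a serving process. Only the variable name and value come from this release note; the helper name with_vllm_io_logging is hypothetical, and the sketch assumes the variable is read from the process environment when the server starts.

```python
import os


def with_vllm_io_logging(base_env=None, enabled=True):
    """Return a copy of an environment mapping with VLLM_LOG_OUTPUT set or cleared.

    Hypothetical helper for illustration; it is not part of ipex-llm or vLLM.
    """
    env = dict(os.environ if base_env is None else base_env)
    if enabled:
        env["VLLM_LOG_OUTPUT"] = "1"  # name and value taken from the release note
    else:
        env.pop("VLLM_LOG_OUTPUT", None)  # drop the variable to disable the logging
    return env


# Pass the result as env= to subprocess.run() together with whatever
# command you already use to start the serving process.
env = with_vllm_io_logging({"PATH": "/usr/bin"})
print(env["VLLM_LOG_OUTPUT"])
```

Building a copy rather than mutating os.environ keeps the setting scoped to the launched process, so detailed input/output logging does not leak into unrelated subprocesses.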

Bug Fixes

  • Resolved multimodal issues including get_image failures and inference errors with models such as MiniCPM-V-2_6, Qwen2-VL, and GLM-4v-9B.
  • Fixed Qwen2-VL multi-request crash by removing Qwen2VisionAttention’s attention_mask and addressing mrope_positions instability.
  • Updated profile_run usage to avoid out-of-memory (OOM) crashes.
  • Resolved GQA kernel issues causing errors with multiple concurrent outputs.
  • Fixed a None-related crash with --enable-prefix-caching in specific cases.
  • Addressed a low-bit overflow that caused "!!!!!!" output in DeepSeek-R1-Distill-Qwen-14B.
  • Resolved GPTQ and AWQ-related errors to improve compatibility across more models.

Docker Images

2.2.0 nightly build

13 Feb 03:28
1083fe5

IPEX-LLM release 2.1.0

22 Aug 09:06
c5b51d4

Highlights

Note: IPEX-LLM v2.1.0 has been updated to include functional and security updates. Users should update to the latest version.

BigDL release 2.4.0

13 Nov 02:02
ac12599

Highlights

Note: BigDL v2.4.0 has been updated to include functional and security updates. Users should update to the latest version.

BigDL release 2.3.0

24 Apr 02:17
ce43fac

Highlights

Note: BigDL v2.3.0 has been updated to include functional and security updates. Users should update to the latest version.

Nano

  • Enhanced trace and quantization process (for PyTorch and TensorFlow model optimizations)
  • New inference optimization methods (including Intel ARC series GPU support, CPU fp16, JIT int8, etc.)
  • New inference/training features (including TorchCCL support, async inference pipeline, compressed model saving, automatic channels_last_3d, multi-instance training for customized TF train loop, etc.)
  • Performance improvements and reduced overhead for inference-optimized models
  • More user-friendly documentation and API design

Orca

  • Step-by-step distributed TensorFlow and PyTorch tutorials for different data inputs.
  • Improvements and examples for distributed MMCV pipelines.
  • Further enhancements to Orca Estimator (more flexible PyTorch train loops via Hook, improved multi-output prediction, memory optimization for OpenVINO, etc.)

Chronos

  • 70% latency reduction for Forecasters
  • New bigdl.chronos.aiops module for AIOps use cases, built on top of Chronos algorithms.
  • Enhanced the TF-based TCNForecaster for better accuracy

Friesian

  • Automatic deployment of RecSys serving pipeline on Kubernetes with Helm Chart

PPML

  • TDX (both VM and CoCo) support for Big Data, DL Training & Serving (including TDX-VM orchestration & k8s deployment, TDXCC installation & deployment, attestation and key management support, etc.)
  • New Trusted Machine Learning toolkit (with secure and distributed SparkML & LightGBM support)
  • Trusted Big Data toolkit upgrade (>2x EPC usage reduction, Apache Flink support, Azure MAA support, multi-KMS support, etc.)
  • Trusted Deep Learning toolkit upgrade (with improved performance using BigDL Nano, tcmalloc, etc.)
  • Trusted DL Serving toolkit upgrade (with TorchServe, TF-Serving, and improved throughput and latency)

BigDL release 2.0.0

09 Mar 07:47
5950355

Highlights

Note: BigDL v2.0.0 has been updated to include functional and security updates. Users should update to the latest version.

BigDL release 0.13.0

09 Jul 12:20
3413659
v0.13.0

Update deploy-spark2.sh

BigDL release 0.12.2

21 Apr 01:53
b942185
v0.12.2

flip version to 0.12.2 (#3119)