Conversation

fmiao2372

FastDeploy now supports the ERNIE 4.5 model on Intel HPU.

Dependencies:
Gaudi software: 1.22.0
PaddlePaddle: 3.1.1
PaddleCustomDevice: latest develop branch

Support for more models and further performance optimizations will follow.
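For reference, a minimal offline-inference sketch, assuming the LLM API documented for FastDeploy 2.x and a PaddleCustomDevice build with intel_hpu support; the model name and parameters here are illustrative, not from this PR:

from fastdeploy import LLM, SamplingParams

# Assumes PaddlePaddle 3.1.1 plus an intel_hpu PaddleCustomDevice build,
# per the dependency list above.
llm = LLM(model="baidu/ERNIE-4.5-0.3B-Paddle", tensor_parallel_size=1)
sampling = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["Introduce the ERNIE model family."], sampling)
print(outputs[0].outputs.text)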


paddle-bot bot commented Sep 17, 2025

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor (External developers) label on Sep 17, 2025
@fmiao2372 fmiao2372 force-pushed the integration_upstreaming branch from 7e59562 to d7509a6 on September 17, 2025 12:49
try:
# assert len(paddle.static.cuda_places()) > 0
return True
except Exception as e:
Collaborator

This check doesn't seem to work.

Author

fixed
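For context, one way such an availability check can avoid CUDA-specific APIs on a custom device (a hedged sketch using Paddle's custom-device registry; the actual fix in this PR may differ):

import paddle

def is_intel_hpu_available() -> bool:
    # cuda_places() is meaningless on HPU; query the custom-device
    # registry that PaddleCustomDevice populates instead.
    try:
        return "intel_hpu" in (paddle.device.get_all_custom_device_type() or [])
    except Exception:
        return False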

# PACKAGE = "fastdeploy.model_executor.ops.intel_hpu"
PACKAGE = "paddlenlp_ops"

import_custom_ops(PACKAGE, "paddlenlp_ops", globals())
Collaborator

Should this be fastdeploy.model_executor.ops.intel_hpu instead of paddlenlp_ops?

Is this because of the naming convention of the ops implementation in custom device?

Author

Yes, the real custom ops come from PaddleCustomDevice; we just re-export them under this name in FastDeploy.
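For readers unfamiliar with the helper, a hedged, illustrative reimplementation of what import_custom_ops amounts to (the signature is inferred from the call site above; FastDeploy's actual helper may differ):

import importlib

def import_custom_ops(package: str, module_name: str, target_globals: dict):
    # Re-export every public symbol from the op library built by
    # PaddleCustomDevice (here named paddlenlp_ops) into the FastDeploy
    # namespace, so call sites stay device-agnostic.
    try:
        module = importlib.import_module(module_name)
        for name in dir(module):
            if not name.startswith("_"):
                target_globals[name] = getattr(module, name)
    except ImportError:
        pass  # the op library is optional when the device build is absent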

@@ -0,0 +1,21 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
#
Collaborator

2024->2025

Author

fixed

raise NotImplementedError


class AttentionBackend_HPU(AttentionBackend):
Collaborator

Would it be better to move this class to fastdeploy/model_executor/layers/attention/hpu_attn_backend.py?

Author

moved

"--enable-tensor-or-expert-parallel",
action='store_true',
default=EngineArgs.enable_tensor_or_expert_parallel,
help="Enable tensor parallelism for non-MoE and expert parallelism for MoE.")
Collaborator

Could we enable TP + EP by setting --enable-expert-parallel and --tensor-parallel-size without adding a new argument?
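A hedged sketch of the combination being suggested (argument names follow the snippet above; the derivation is an assumption, not existing FastDeploy logic):

# TP for dense layers plus EP for MoE layers is already implied by the
# existing knobs, so no new flag is needed:
tensor_or_expert_parallel = (
    args.tensor_parallel_size > 1 and args.enable_expert_parallel
)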

Author

parallel_config.engine_worker_queue_port = parallel_config.engine_worker_queue_port[
parallel_config.local_data_parallel_id
]
Collaborator

All CI fails at this line: TypeError: 'int' object is not subscriptable. We need to solve this first and then see if there are any other problems.

Author

fixed
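One plausible shape for the fix (a hedged sketch; the actual patch may differ): index only when the field is a list, since single-instance configs carry a plain int.

port = parallel_config.engine_worker_queue_port
if isinstance(port, (list, tuple)):
    # Multi-instance setups carry one port per local data-parallel rank.
    port = port[parallel_config.local_data_parallel_id]
parallel_config.engine_worker_queue_port = port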

@@ -0,0 +1,314 @@
"""
Collaborator

There is a backends folder under the layers directory that holds the device-specific layer implementations; please move the attention and moe implementations into that folder.

Comment on lines +121 to +122
elif current_platform.is_intel_hpu():
self.forward = self.forward_intel_hpu
Collaborator

The name forward_cuda may no longer be a great fit at this point, but you should be able to reuse forward_cuda here; the logic is the same.

Comment on lines +212 to +213
elif current_platform.is_intel_hpu():
self.forward = self.forward_intel_hpu
Collaborator

How is this different from the other hardware platforms? Why does it need its own logic? Can't the device-specific parts be abstracted into a few ops so that forward_cuda can be reused?
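A hedged sketch of the abstraction being suggested (the class and op names are illustrative, not FastDeploy's actual layer code; the op module paths follow the commented-out PACKAGE above): keep one shared forward and resolve the device difference at the op layer.

from fastdeploy.platforms import current_platform

class ExampleLayer:
    def __init__(self):
        # One shared forward for every backend; no per-device fork.
        self.forward = self.forward_common

    def _ops(self):
        # The device difference lives here, not in forward().
        if current_platform.is_intel_hpu():
            from fastdeploy.model_executor.ops import intel_hpu as ops
        else:
            from fastdeploy.model_executor.ops import gpu as ops
        return ops

    def forward_common(self, x):
        return self._ops().fused_forward(x)  # hypothetical op name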

from fastdeploy.platforms import current_platform


def reload_ep_checkpoint(model_path: str, fd_config: FDConfig, state_dict: dict, return_numpy: bool = False):
Collaborator

Why was the model-loading code changed here? Is it because the model being used is not the official one?

self.expert_parallel_size = 1 # EP degree
self.data_parallel_size = 1 # DP degree
self.enable_expert_parallel = False
self.enable_tensor_or_expert_parallel = False
Collaborator

Can't this be inferred from a combination of existing fields such as enable_expert_parallel, expert_parallel_size, and tensor_parallel_size? Does the user-facing interface really need a new field?
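A hedged sketch of that inference (field names come from the snippet above; the property itself is an assumption): derive the value instead of storing a fourth flag.

@property
def enable_tensor_or_expert_parallel(self) -> bool:
    # Implied by the existing fields: TP stays active for dense layers
    # and EP for MoE layers, so no separate user-facing flag is stored.
    return self.enable_expert_parallel and self.tensor_parallel_size > 1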

cache_cfg = CacheConfig(all_dict)
load_cfg = LoadConfig(all_dict)
parallel_cfg = ParallelConfig(all_dict)
cache_cfg.enc_dec_block_num = self.static_decode_blocks
Collaborator

zoooo0820 commented Sep 19, 2025

It would be better to set this value as in https://github.com/PaddlePaddle/FastDeploy/blob/release/2.2/fastdeploy/config.py#L899 to avoid impacting other hardware.
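A hedged sketch of that suggestion (attribute names come from the snippet above; the platform guard mirrors checks used elsewhere in this PR): gate the override so other devices keep the default from config.py.

from fastdeploy.platforms import current_platform

cache_cfg = CacheConfig(all_dict)
if current_platform.is_intel_hpu():
    # Other hardware keeps the enc_dec_block_num default defined in
    # fastdeploy/config.py; only HPU overrides it here.
    cache_cfg.enc_dec_block_num = self.static_decode_blocks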
