[NPUW] Enabled model with multiple outputs in LLMInferRequest #31520
Conversation
Force-pushed 28d3e37 to b5a0806
Force-pushed b5a0806 to 11ab2cb
// second and third and combine them using XOR
// and bit shifting:
return ((hash<std::size_t>()(port.get_index()) ^ (hash<const ov::Node*>()(port.get_node()) << 1)) >> 1) ^
Wouldn't it be enough just to have hash from port.get_node()?
I am not sure, as ov::Node can have multiple outputs (with different indices).
LOG_DEBUG("Input name " << input_name << " doesn't contain kv cache. Skipping.");
continue;
}
NPUW_ASSERT(m_kvcache_in_ports.find(input_name) == m_kvcache_in_ports.end());
This assert looks very suspicious
// model's I/O are appended to original model's I/O at the end,
// thus it is safe to loop over KVCache I/O blocks just using some
// start offsets.
std::size_t start_idx_in_outputs = 0u;
Where do we set this one to something different than 0?
In the constructor of LLMCompiledModel: https://github.com/openvinotoolkit/openvino/pull/31520/files#diff-7c0906d2d24827a3f37868e79470c1a41ea4bf66eae148a462c0442f8e9cc191R1087
@@ -531,9 +530,9 @@ void ov::npuw::LLMInferRequest::infer_prefill(ov::SoPtr<ov::ITensor> input_ids,
     if (m_lm_head_request) {
         LOG_DEBUG("Calling inference for LM head model.");
         m_lm_head_request->infer();
-        m_logits = m_lm_head_request->get_tensor(m_lm_head_logits_port);
+        update_out_tensors_from(m_lm_head_request);
Wrong. Only logits will be in LM head; other outputs should be gathered from prefill.