[DeepSeek R1] Support chunked prefill for P/D #1938

jerrychenhf · 2025-09-17T02:46:07Z

Purpose

Supporting general chunked prefill in the current code base is a complex problem because prefill and decode requests are mixed, and making it performant is also a challenge. The problem becomes simpler for the P/D case if the target is to support chunked prefill only on the prefill instance. There, all chunked requests are prefill requests, so we can handle them with the same concept as the prefix cache, with minimal changes.
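The prefix-cache analogy can be sketched roughly as follows. This is only an illustration and not the code in this PR: `plan_chunks` and `chunk_size` are hypothetical names. Each chunk is treated as a prefill whose already-computed tokens play the role of a prefix-cache hit.

```python
from typing import List, Tuple


def plan_chunks(prompt_len: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split a prompt into prefill chunks (hypothetical helper).

    Returns (context_len, query_len) pairs. context_len is the number of
    tokens whose KV is already in the cache and is handled like a
    prefix-cache hit; query_len is the number of new tokens in this chunk.
    """
    if prompt_len <= 0 or chunk_size <= 0:
        raise ValueError("prompt_len and chunk_size must be positive")
    chunks = []
    context_len = 0
    while context_len < prompt_len:
        # The last chunk may be shorter than chunk_size.
        query_len = min(chunk_size, prompt_len - context_len)
        chunks.append((context_len, query_len))
        context_len += query_len
    return chunks
```

For a 10-token prompt and a chunk size of 4, the second chunk sees the first 4 tokens as cached context and computes only its own 4 new tokens.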

Most of the changes are on the KV cache transfer side. With chunked prefill, both the time at which the KV cache is sent and the way data is fetched from the cache change:

We send the KV cache only at the last chunk.
We send the KV cache for the whole sequence instead of only the KV for the current query.
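The two rules above can be sketched as a small decision function. This is a minimal sketch under assumptions, not the PR's implementation: `kv_span_to_send` and its parameters are hypothetical names, and the real transfer operates on the engine's KV cache blocks rather than on token positions.

```python
from typing import Optional, Tuple


def kv_span_to_send(context_len: int, query_len: int,
                    prompt_len: int) -> Optional[Tuple[int, int]]:
    """Decide which token span of KV to send after a chunk (hypothetical).

    context_len: tokens already computed by earlier chunks.
    query_len:   tokens computed by the current chunk.
    prompt_len:  total prompt length of the sequence.
    Returns None for intermediate chunks, or the (start, end) token span
    to send once the last chunk has been computed.
    """
    end = context_len + query_len
    if end > prompt_len:
        raise ValueError("chunk extends past the end of the prompt")
    if end < prompt_len:
        # Intermediate chunk: keep the KV in the local cache, send nothing.
        return None
    # Last chunk: send the whole sequence, not just (context_len, end).
    return (0, prompt_len)
```

Note that the span returned for the last chunk starts at 0 rather than at `context_len`, which is the difference from the non-chunked path where the current query covers the whole prompt.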

Initial implementation of chunked prefill for P/D

0b4a687

jerrychenhf requested review from afierka-intel, kzawora-intel, mgawarkiewicz, michalkuligowski and vivekgoe as code owners September 17, 2025 02:46

jerrychenhf added 10 commits September 17, 2025 11:15

Always allocate the number of new tokens dividable by block size

8c436a5

Make KV cache shape consistent

e3e6cd5

Fix the block indices and a few others

30606b6

Go the new chunked prefill path only when enabled

9bd2466

Make the consistent KV cache shape for default

0c9db57

Not use fetch_from_cache for chunked prefill

7ce6b9c

Fix the sample error when any sequences not sample in a batch

80a3288

Prefill chunk size option

762bb26

skip prefill sample support for chunked prefill

9216728

Formating

5428794
