cheng221
Contributor

@cheng221 cheng221 commented Sep 8, 2025

Before submitting

  • Lint code. If there are lint issues, please format the code first.
# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install

# Process previous code files separately
pre-commit run --files XXXX.py
  • Add test cases to the tests folder. If there are codecov issues, please add test cases first.

PR types

PR changes

Description

Fix attn_mask_startend_row_indices shape mismatch

paddle-bot bot commented Sep 8, 2025

Thanks for your contribution!

@cheng221 cheng221 force-pushed the develop branch 2 times, most recently from 873bd7d to b38e4b5 Compare September 8, 2025 08:54
assert labels is not None

- return self.criterion(logits, labels, loss_mask, router_loss=router_loss, mtp_logits=mtp_logits)
+ loss, _ = self.criterion(logits, labels, loss_mask, router_loss=router_loss, mtp_logits=mtp_logits)
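
For readers following the change above, here is a minimal sketch of what the two lines do differently. It assumes the criterion returns a two-element tuple, which the `loss, _ =` unpacking suggests but the diff does not confirm. `criterion`, `forward_old`, and `forward_new` are hypothetical stand-ins, not the PR's code, and returning `loss` after the unpacking is also an assumption.

```python
def criterion(logits, labels):
    """Hypothetical stand-in for a criterion that returns (loss, extra)."""
    # Mean absolute error, used only to produce a concrete number.
    loss = sum(abs(p - y) for p, y in zip(logits, labels)) / len(labels)
    extra = {"count": len(labels)}  # placeholder for a second return value
    return loss, extra


def forward_old(logits, labels):
    # Old form: passes the whole tuple through to the caller.
    return criterion(logits, labels)


def forward_new(logits, labels):
    # New form: unpacks the tuple and discards the second element.
    loss, _ = criterion(logits, labels)
    return loss  # assumption: the diff does not show what follows


old_result = forward_old([1.0, 2.0], [1.0, 3.0])
new_result = forward_new([1.0, 2.0], [1.0, 3.0])
print(type(old_result).__name__, type(new_result).__name__)
```

Under these assumptions, a caller that expects a scalar loss would receive a tuple from the old form and a single value from the new one.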
Collaborator

Why was this changed?

tar -xvf alpaca_demo.gz
```
### Model download
```bash
Collaborator

There is no need to add download documentation; going forward, the model can be downloaded directly via `from_pretrained`.

# Fine-tuning Qwen2-0.5B-Instruct requires about 12 GB of GPU memory
python -u run_finetune.py ./config/qwen/sft_argument_qwen2_0p5b.json

# Fine-tune ERNIE-4.5-0.3B-PT
Collaborator

There is no need to add ERNIE 4.5 here yet. For now, just make sure it can be trained in ERNIEKit.

Collaborator

The documentation does not need to be changed.

2 participants