【Bug】Fix attn_mask_startend_row_indices shape mismatch #2564
base: develop
Thanks for your contribution!
873bd7d to b38e4b5
assert labels is not None

return self.criterion(logits, labels, loss_mask, router_loss=router_loss, mtp_logits=mtp_logits)
loss, _ = self.criterion(logits, labels, loss_mask, router_loss=router_loss, mtp_logits=mtp_logits)
Why is this change needed?
tar -xvf alpaca_demo.gz
```
### Model download
```bash
There is no need to add download documentation; later the model can be downloaded directly with from_pretrained.
# Fine-tuning Qwen2-0.5B-Instruct requires about 12 GB of GPU memory
python -u run_finetune.py ./config/qwen/sft_argument_qwen2_0p5b.json

# Fine-tune ERNIE-4.5-0.3B-PT
There is no need to add ERNIE 4.5 here yet. For now, just make sure it can be trained in ERNIEKit.
The documentation does not need to be changed.
Before submitting
tests folder. If there are codecov issues, please add test cases first.
PR types
PR changes
Description
Fix attn_mask_startend_row_indices shape mismatch