
Commit 050594f

rm redundant line break (#475)
Signed-off-by: Zhang, Weiwei1 <[email protected]>
1 parent 8033dd0 commit 050594f

File tree

1 file changed: +11 −12 lines changed

README.md

Lines changed: 11 additions & 12 deletions
````diff
@@ -70,7 +70,6 @@ pip install auto-round-lib
 ```
 
 </details>
-<br>
 
 ## Model Quantization
 
@@ -124,11 +123,11 @@ auto-round-fast \
 ``` -->
 
 </details>
-<br>
 
-In conclusion, we recommend using **auto-round for INT4 and auto-round-best for INT2**. However, you may adjust the configuration to suit your specific requirements and available resources.
+In conclusion, we recommend using **auto-round for INT4 and auto-round-best for INT2**. However, you may adjust the configuration to suit your specific requirements and available resources.
+
+W4G128 Average Accuracy of 13 tasks and Time Cost Results(Testing was conducted on the Nvidia A100 80G using the version of PyTorch 2.6.0 with enable_torch_compile):
 
-Average Accuracy of 13 tasks(W4G128) and Time Cost(enable_torch_compile) Results
 
 | Model | Qwen2.5-0.5B-Instruct | Falcon3-3B | Qwen2.5-7B-Instruct | Falcon3-10B | Qwen2.5-72B-Instruct |
 |---------|-----------------------|----------------------|---------------------|----------------------|-----------------------|
@@ -138,7 +137,8 @@ Average Accuracy of 13 tasks(W4G128) and Time Cost(enable_torch_compile) Results
 | Light | 0.4052(2m) | 0.5108(3m) | **0.6453**(5m) | 0.6063(6m) | 0.7243(37m) |
 
 
-<br>
+
+
 
 ### API Usage (Gaudi2/CPU/GPU)
 
@@ -217,7 +217,6 @@ autoround.save_quantized(output_dir, format='auto_round', inplace=True)
 - `device`: The device to be used for tuning. The default is set to 'auto', allowing for automatic detection.
 
 </details>
-<br>
 
 
 ### API Usage for VLMs
@@ -256,7 +255,7 @@ autoround.save_quantized(output_dir, format='auto_round', inplace=True)
 ```
 </details>
 
-<br>
+
 
 ### Export Formats
 **AutoRound Format**: This format is well-suited for CPU, HPU devices, 2 bits, as well as mixed-precision
@@ -273,7 +272,7 @@ adopted within the community, **only 4-bits quantization is supported**.
 **GGUF** Format: This format is well-suited for CPU devices and is widely adopted by the community, **only q4_0 and
 q4_1 (W4G32) is supported in our repo**.
 
-<br>
+
 
 ### Quantization Costs
 
@@ -330,7 +329,6 @@ inputs = tokenizer(text, return_tensors="pt").to(model.device)
 print(tokenizer.decode(model.generate(**inputs, max_new_tokens=50)[0]))
 ```
 
-<br>
 
 #### Evaluation
 <details>
@@ -344,7 +342,7 @@ auto-round --model saved_quantized_model \
 ```
 
 </details>
-<br>
+
 
 ### AutoGPTQ/AutoAWQ format
 
@@ -420,7 +418,7 @@ release most of the models ourselves.
 </details>
 
 
-<br>
+
 
 ## Integration
 
@@ -432,7 +430,7 @@ AutoRound has been integrated into multiple repositories.
 
 [pytorch/ao](https://github.com/pytorch/ao)
 
-<br>
+
 
 ## Reference
 
@@ -451,3 +449,4 @@ If you find AutoRound useful for your research, please cite our paper:
 
 
 
+
````