Commit 4c9ca03

Update markdown formatting and links in blog post (#251)
1 parent 24a98f6 commit 4c9ca03

1 file changed, +3 -3 lines changed
blog/2025-11-19-miles.md

Lines changed: 3 additions & 3 deletions
@@ -5,13 +5,13 @@ date: "November 19, 2025"
 previewImg: /images/blog/miles/miles.jpg
 ---
 
-> A journey of a thousand miles begins with a single step.
+> *A journey of a thousand miles begins with a single step.*
 
 We're excited to introduce Miles, an enterprise-facing reinforcement learning framework designed for large-scale MoE training and production workloads. This introductory chapter will be the beginning of a series of tech blogs.
 
 Miles is forked from slime, the lightweight RL framework that has quietly powered many of today’s post-training pipelines and large MoE training runs. Building on slime’s foundation, Miles aims to deliver a smooth and controllable RL experience for teams that need reliability and scale in real-world deployments.
 
-The GitHub link for Miles can be found here: https://github.com/radixark/miles
+The GitHub link for Miles can be found [here](https://github.com/radixark/miles).
 
 ## 🧠 Starting Point: slime - A Lightweight and Customizable RL Framework
 
@@ -64,7 +64,7 @@ In RL, freezing the draft model prevents it from following the target model poli
 
 ### Miscellaneous Updates
 
-Enhance the FSDP training backend; allow deploying the rollout subsystem independently outside the framework; debug utilities such as more metrics, post-hoc analyzers, and enhancing profilers; gradually refactor the code to further enhance it; A formal mathematics (Lean) example is provided with SFT/RL scripts.
+Enhance the FSDP training backend; allow deploying the rollout subsystem independently outside the framework; debug utilities such as more metrics, post-hoc analyzers, and enhancing profilers; gradually refactor the code to further enhance it; A formal mathematics (Lean) example is provided with [SFT/RL scripts](https://github.com/radixark/miles/tree/main/examples/formal_math/single_round).
 
 ## 🚧 Towards the Future: Our Roadmap
 