SWE-agent

Use language models to 🐛 fix issues in real GitHub repositories, ⛳️ solve coding challenges, and 🔥 crack offensive cybersecurity challenges

📣 New: Meet mini, the 100-line AI agent that still gets 65% on SWE-bench Verified!


SWE-agent    mini-SWE-agent    SWE-ReX    SWE-smith    SWE-bench    sb-cli

Software engineering agents, benchmarks, and models.
Built and maintained by researchers from Princeton University and Stanford University.

Slack HuggingFace YouTube

More information about the projects

Main projects:

  • SWE-agent, a system that automatically solves GitHub issues using an LM agent.
  • mini-SWE-agent, a 100-line AI agent that still gets 65% on SWE-bench Verified.
  • SWE-bench, a benchmark for evaluating AI systems on real-world GitHub issues.
  • SWE-smith, a toolkit for generating SWE training data at scale.

Also check out the supporting infrastructure for working with SWE-* projects:

  • SWE-ReX, infrastructure supporting sandboxed code execution for AI agents.
  • sb-cli, a command line interface for running evaluations on the cloud.

Pinned

  1. SWE-agent

    SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]

    Python · 17k stars · 1.8k forks

  2. mini-swe-agent

    The 100-line AI agent that solves GitHub issues or helps you in your command line. Radically simple, with no huge configs and no giant monorepo, yet it scores 68% on SWE-bench Verified!

    Python · 1k stars · 91 forks

  3. SWE-ReX

    Sandboxed code execution for AI agents, locally or on the cloud. Massively parallel, easy to extend. Powering SWE-agent and more.

    Python · 283 stars · 72 forks
