SWE-agent: A Claude Code Alternative for Research-Grade Autonomous GitHub Issue Resolution

SWE-agent is an open-source autonomous coding agent developed by researchers from Princeton University and Stanford University, designed to enable language models to fix real bugs in GitHub repositories through a structured agent-computer interface. Consistently ranking among the top performers on SWE-bench — the industry-standard benchmark for evaluating AI coding agents — SWE-agent is a powerful Claude Code alternative for developers and researchers who want full control over the agent architecture and model selection. The project is hosted at github.com/SWE-agent/SWE-agent.

SWE-agent takes a fundamentally research-first approach: rather than a polished SaaS product, it provides a highly configurable, hackable framework governed by a single YAML configuration file. Users can point SWE-agent at any GitHub issue, and the agent will autonomously explore the codebase, reproduce the bug, implement a fix, and run tests to verify the solution. Support for GPT-4o, Claude Sonnet, and other frontier models via API keys gives developers complete control over which underlying model powers the agent.
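
To make the configuration model concrete, here is a minimal sketch of what such a YAML file might contain. The field names below are illustrative only and may differ between SWE-agent versions; consult the example configs in the repository for the authoritative schema.

```yaml
# Hypothetical config sketch -- field names are illustrative, not authoritative.
agent:
  model:
    name: claude-3-7-sonnet      # any API-accessible model
    per_instance_cost_limit: 3.0 # cap spend per issue, in USD
  templates:
    system_template: |
      You are a software engineer. Reproduce the reported bug,
      implement a fix, and run the tests to verify it.
```

The repository itself and the issue to solve are typically supplied at invocation time, so one config file can be reused across many issues.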

The Princeton/Stanford research team continues to evolve the project actively. SWE-agent 1.0 combined with Claude 3.7 achieved state-of-the-art performance on SWE-bench Verified in February 2025, and the team has since released mini-SWE-agent — a 100-line Python implementation that matches SWE-agent's performance with significantly reduced complexity. Multimodal support for processing GitHub issue images was added in mid-2025, extending its capability to visual bug reports.

Comparison: SWE-agent vs Claude Code

| Feature | SWE-agent | Claude Code |
| --- | --- | --- |
| Type | CLI agent / research framework (open-source) | CLI agent (Anthropic) |
| IDE Support | Terminal / command-line only | Terminal only |
| Pricing | Free (open-source); pay for model API usage only | Usage-based via Anthropic API |
| Models | Any API-accessible model: GPT-4o, Claude, Gemini, etc. | Claude 3.5 Sonnet / Claude 3 Opus |
| Privacy/Hosting | Self-hosted (code never sent to SWE-agent servers) | Cloud (Anthropic API) |
| Open Source | Yes (MIT license) | No |
| Offline/Local Models | Yes (compatible with Ollama and local model servers) | No |

Key Strengths

  • State-of-the-art SWE-bench performance: SWE-agent 1.0 with Claude 3.7 achieved state-of-the-art results on both the SWE-bench Verified and full benchmarks in early 2025, making it one of the most capable open-source agents for real-world bug fixing. This benchmark performance translates into high-quality, often merge-ready pull requests.
  • Model-agnostic architecture: Unlike Claude Code which requires Anthropic's models, SWE-agent supports any model accessible via an API — including OpenAI GPT-4o, Anthropic Claude, Google Gemini, and local models through Ollama. This flexibility prevents vendor lock-in and allows teams to optimize for cost or capability.
  • Fully open-source and self-hostable: The MIT-licensed codebase can be run entirely on your own infrastructure. Code stays within your environment — it is never sent to SWE-agent's servers (there are none). This makes SWE-agent suitable for air-gapped environments and organizations with strict data residency requirements.
  • YAML-based configuration for full control: The entire agent behavior — tools, prompts, model selection, retry logic, context management — is controlled through a single YAML file. Researchers and advanced users can customize every aspect of the agent's behavior without modifying source code.
  • Active research backing from Princeton and Stanford: SWE-agent is not a startup product — it is an active research project from top universities, with regular publications, benchmark updates, and a community of researchers contributing improvements. The research pedigree provides confidence in the agent's design rigor.

Known Limitations

  • Steeper setup curve than polished products: SWE-agent requires Python installation, API key configuration, and familiarity with YAML configuration files. It is not a one-click tool — first-time setup can take 15-30 minutes, and getting optimal performance requires tuning the configuration.
  • Research-oriented, not production-polished: The tool is explicitly designed for research and experimentation. The UX prioritizes configurability over polish — error messages can be technical, and documentation assumes comfort with Python and CLI tools.
  • No native GitHub Actions integration: SWE-agent does not have a native one-click GitHub Actions workflow for automatic issue resolution. Setting up automated PR generation from issues requires manual pipeline configuration, unlike dedicated platforms like Codegen.
  • No team collaboration features: SWE-agent is a single-user CLI tool with no built-in team management, access control, or multi-user workflows. Enterprise teams needing shared agent access and centralized billing should look at platforms like Codegen instead.

Best For

SWE-agent is best for individual developers, researchers, and engineering teams who want maximum control over their AI coding agent — including model selection, agent behavior, and data privacy. It is the top choice for organizations with data residency requirements who cannot send code to external SaaS platforms. Academic researchers working on AI coding agent improvements will find the hackable, well-documented architecture ideal for experimentation and publication.

Pricing

SWE-agent itself is completely free and open-source under the MIT license. The only cost is the API usage for whichever language model you configure. Using GPT-4o via OpenAI API or Claude via Anthropic API incurs standard token-based pricing from those providers. For local model usage via Ollama or compatible local servers, the operational cost is zero beyond hardware. There are no SWE-agent subscription fees, seats, or usage limits imposed by the project itself.
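
Since the only cost is token usage, a run's price is a simple linear function of tokens consumed. The sketch below illustrates the arithmetic; the per-million-token rates are placeholders, not any provider's current pricing.

```python
def api_cost_usd(input_tokens: int, output_tokens: int,
                 in_rate: float, out_rate: float) -> float:
    """Estimate API cost given per-million-token rates.

    Rates are whatever your provider currently charges; the values
    used below are placeholders for illustration only.
    """
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A single agent run consuming 400k input and 30k output tokens at
# hypothetical rates of $3 / $15 per million tokens:
cost = api_cost_usd(400_000, 30_000, in_rate=3.0, out_rate=15.0)
print(f"${cost:.2f}")  # -> $1.65
```

Agent runs are input-heavy (the model repeatedly re-reads files and command output), so the input rate usually dominates the bill.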

Tech Details

SWE-agent is implemented in Python and requires Python 3.9 or later. It uses a structured Agent-Computer Interface (ACI) that provides the language model with specialized tools for file navigation, code editing, terminal command execution, and test running. Configuration is managed through YAML files — the primary config file controls model selection, tool availability, prompt templates, and agent behavior parameters. The tool supports GitHub API integration for issue fetching and PR creation. Multimodal support (added mid-2025) enables processing of images from GitHub issues when using vision-capable models. The related mini-SWE-agent project achieves 65% on SWE-bench Verified in approximately 100 lines of Python.
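
The ACI idea can be sketched as a small tool registry: the model emits a structured tool call, and the harness dispatches it to a registered handler and returns the output as the next observation. This is a hedged toy version to illustrate the pattern, not SWE-agent's actual implementation; the tool names and call format are made up.

```python
import subprocess

TOOLS = {}

def tool(name):
    """Register a handler under the name the model is allowed to call."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("run")
def run_command(cmd: str) -> str:
    """Execute a shell command and return its combined output."""
    proc = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return proc.stdout + proc.stderr

@tool("open_file")
def open_file(path: str, start: int = 0, count: int = 50) -> str:
    """Return a numbered window of a file, like an agent 'editor' view."""
    with open(path) as f:
        lines = f.readlines()[start:start + count]
    return "".join(f"{start + i + 1}: {line}" for i, line in enumerate(lines))

def dispatch(call: dict) -> str:
    """`call` is e.g. {'tool': 'run', 'args': {'cmd': 'pytest -x'}},
    as parsed from the model's structured reply."""
    return TOOLS[call["tool"]](**call["args"])

print(dispatch({"tool": "run", "args": {"cmd": "echo hello"}}))
```

The research insight behind the ACI is that constraining the model to a small, well-designed tool set like this outperforms handing it a raw terminal.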

When to Choose SWE-agent Over Claude Code

SWE-agent is the superior choice when you need full control over the AI model powering your agent, when data privacy requires that code never leave your infrastructure, or when you want to run agents using local models without API costs. Its YAML-based configurability makes it ideal for teams with specific workflow requirements that commercial tools do not accommodate. Researchers and technically advanced developers who want to understand and modify every aspect of how their coding agent behaves will appreciate SWE-agent's transparency and hackability in ways that closed-source tools cannot provide.

When Claude Code May Be a Better Fit

Claude Code provides a more polished, immediately productive experience with less setup overhead — it is designed for daily use by individual developers rather than research experimentation. Teams that prioritize convenience, Anthropic's specific model capabilities, and a well-supported commercial product over open-source flexibility may find Claude Code's lower friction worth the trade-offs. Additionally, Claude Code's deep integration with Anthropic's Constitutional AI safety work provides specific behavioral guarantees that SWE-agent's model-agnostic approach cannot enforce uniformly across all model choices.

Conclusion

SWE-agent represents the open-source, research-grade end of the autonomous coding agent spectrum. Built by Princeton and Stanford researchers with state-of-the-art SWE-bench performance, MIT licensing, and support for any API-accessible or local model, it offers a level of transparency, flexibility, and control that commercial alternatives cannot match. As a Claude Code alternative for developers who prioritize open-source principles, data privacy, and full customizability, SWE-agent stands as one of the most technically rigorous options available in 2026.

FAQ

What is the difference between SWE-agent and mini-SWE-agent?

SWE-agent is the full research framework with rich configuration options, extensive tooling, and multimodal support. Mini-SWE-agent is a simplified 100-line Python implementation released in mid-2025 that achieves comparable performance on SWE-bench Verified (65%) with dramatically reduced complexity. The SWE-agent team recommends mini-SWE-agent for most new users who do not need the advanced customization of the full framework.

Can SWE-agent run with local models?

Yes. SWE-agent supports any model accessible via an OpenAI-compatible API, which includes local model servers like Ollama, LM Studio, and vLLM. This means you can run SWE-agent entirely on your own hardware using open-weight models like Llama, Mistral, or Qwen with zero API costs and complete data privacy.
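Because both routes speak the same OpenAI-compatible protocol, switching from a hosted provider to a local server is mostly a matter of connection settings. The sketch below shows the idea; the Ollama default port (11434) is real, but the settings helper itself is hypothetical and the model/key values are examples.

```python
def client_settings(backend: str) -> dict:
    """Connection settings for an OpenAI-compatible client, e.g.
    openai.OpenAI(**client_settings("ollama")). Values are examples."""
    if backend == "ollama":
        # Ollama exposes an OpenAI-compatible endpoint locally; the key
        # field is required by most clients but unused by Ollama itself.
        return {"base_url": "http://localhost:11434/v1", "api_key": "ollama"}
    if backend == "openai":
        return {"base_url": "https://api.openai.com/v1",
                "api_key": "<YOUR_OPENAI_API_KEY>"}
    raise ValueError(f"unknown backend: {backend!r}")

print(client_settings("ollama")["base_url"])  # -> http://localhost:11434/v1
```

The same agent configuration can thus target GPT-4o one day and a local Qwen model the next, with no code changes.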

How does SWE-agent perform on real codebases?

SWE-agent consistently ranks among the top open-source agents on SWE-bench — the benchmark that evaluates agents on real GitHub issues from major Python repositories. With Claude 3.7, SWE-agent 1.0 achieved state-of-the-art performance on both SWE-bench Verified and full in early 2025. Performance varies by codebase complexity, issue type, and the model used, but the benchmark results provide strong evidence of real-world effectiveness.

Is SWE-agent suitable for non-Python projects?

SWE-agent was originally developed and benchmarked primarily on Python repositories (SWE-bench uses Python projects). However, the framework is language-agnostic — it can work with JavaScript, TypeScript, Go, Rust, Java, and other languages since it operates through terminal commands rather than language-specific tooling. Community configurations for various languages are available in the repository.

Who maintains SWE-agent?

SWE-agent is maintained by researchers from Princeton University and Stanford University, with active contributions from the broader AI research community. The project has a public GitHub repository, regular academic publications, and an active Discord community. The research team releases updates tied to their ongoing work on AI software engineering agents and benchmark development.
