Aikido

One year of Opengrep: What we built and what’s next

Written by
Dimitris Mostrous

It’s been a year since a group of security vendors (Aikido Security, Arnica, Amplify, Endor Labs, Jit, Kodem, Legit, Mobb, Orca Security, and Phoenix Security) forked Semgrep to create Opengrep. The underlying goal was simple: keep static analysis capabilities available in open source and create the most advanced engine.

The project focused on four areas of work that were difficult or impossible inside Semgrep CE at the time: migrating to OCaml 5 with shared-memory parallelism, introducing intrafile cross-function taint analysis, expanding language support (including Visual Basic, Apex and Elixir), and enabling native Windows support. Semgrep has since followed with its own OCaml 5 migration and Windows support, reflecting similar technical priorities. 

One year later, those changes are starting to show measurable results, and Opengrep remains fully compatible with existing rules and configurations:

  • 25–74% faster scans when running full rule sets across large repositories
  • Up to 2× faster taint analysis, enabling deeper data-flow security checks
  • 1.74 million binary downloads from GitHub releases
  • 2,000+ GitHub stars
  • 10 security companies using Opengrep in production

But the first year of an open source fork is not just about shipping features. It’s about building trust in the project, proving that the goals behind the fork hold up over time, establishing governance, and improving security practices as the project matures. This is outlined in our manifesto and in the README in our repo.

This post reflects on what worked, what needed improvement, and what comes next for Opengrep.

Maintainer Q&A

Today, Opengrep is maintained by Dimitris Mostrous, Maciej Piróg, and Corneliu Hoffman.

In this Q&A, we ask them the important questions about Opengrep.

Q. What did the fork enable that wasn’t possible inside Semgrep at the time?

A. The fork allowed us to make several architectural decisions that would have been difficult inside the existing upstream project.

First, we migrated the engine to OCaml 5 with shared-memory parallelism. At the time, Semgrep was still using a fork-based concurrency model, which made certain improvements such as Windows support extremely difficult. Moving to OCaml 5 established a better foundation for performance improvements and cross-platform support. As a result, we were able to introduce native Windows support.

Then, we made cross-function taint analysis available in open source for any third-party vendor to use. In Opengrep it is enabled with the --taint-intrafile flag and includes support for higher-order functions across multiple languages.
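
To make "cross-function" concrete, here is a toy sketch in Python of taint being followed through calls between functions in a single file. It is not Opengrep's algorithm or data model: the mini-program representation and every name in it (`FUNCTIONS`, `analyze`, `handler`, `passthrough`, `run_query`) are hypothetical, chosen only to show why following calls can surface a finding that a function-by-function analysis misses.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical mini-IR: every function takes one parameter named "p".
FUNCTIONS: Dict[str, List[Tuple[str, ...]]] = {
    "handler": [
        ("source", "raw"),                         # raw = <untrusted input>
        ("call", "value", "passthrough", "raw"),   # value = passthrough(raw)
        ("call", "_", "run_query", "value"),       # run_query(value)
    ],
    "passthrough": [("return", "p")],              # returns its argument unchanged
    "run_query": [("sink", "p")],                  # dangerous use of its argument
}


def analyze(name: str, param_tainted: bool, cross_function: bool,
            stack: Tuple[str, ...] = ()) -> Tuple[bool, Set[str]]:
    """Return (return value is tainted, names of functions whose sink was reached)."""
    active = stack + (name,)
    tainted: Set[str] = {"p"} if param_tainted else set()
    findings: Set[str] = set()
    returns_tainted = False
    for op in FUNCTIONS[name]:
        kind = op[0]
        if kind == "source":
            tainted.add(op[1])
        elif kind == "sink":
            if op[1] in tainted:
                findings.add(name)
        elif kind == "return":
            returns_tainted = returns_tainted or op[1] in tainted
        elif kind == "call" and cross_function and op[2] not in active:
            # Follow the call, passing along whether the argument is tainted.
            dst, callee, arg = op[1], op[2], op[3]
            callee_returns_tainted, inner = analyze(callee, arg in tainted, True, active)
            findings |= inner
            if callee_returns_tainted:
                tainted.add(dst)
    return returns_tainted, findings


# Per-function view: no single function contains both a source and a sink.
per_function: Set[str] = set()
for fn in FUNCTIONS:
    per_function |= analyze(fn, False, cross_function=False)[1]

# Cross-function view: taint from handler() is followed into run_query().
cross = analyze("handler", False, cross_function=True)[1]
```

In this toy program the per-function pass reports nothing, while the cross-function pass reaches the sink in `run_query`, because the untrusted value only becomes dangerous after passing through two calls.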

We also expanded language support, adding Visual Basic, and enabling Apex and Elixir in the open-source engine. Clojure taint support was also added.

Finally, the fork allowed us to remove telemetry and proprietary service dependencies.

Q. Where can I see a measurable difference between Opengrep and Semgrep CE? 

A. Opengrep is faster when scanning with both search rules and taint rules. It has better taint detection, with more findings in multi-hop scenarios (for instance, 25 vs 5 on ComfyUI with taint rules). It is also easier to run in a wider range of environments: Opengrep ships as a self-contained binary with no Python dependency, supports more languages out of the box, and introduces features such as per-rule timeouts, dynamic timeouts based on file size, and configurable ignore annotations.

Capability                    | Opengrep                                     | Semgrep CE
------------------------------|----------------------------------------------|-------------------
Cross-function taint analysis | Available                                    | Pro-only
Windows support               | Native                                       | Added later
Python dependency             | None (self-contained binary)                 | Required
Languages                     | Includes Visual Basic, Apex, Elixir, Clojure | More limited
Telemetry                     | None                                         | Enabled by default

Q. What engineering work did you undertake in the first year?

A. Behind the performance improvements and new capabilities was a significant amount of engineering work:

  • 43 releases shipped
  • 1,116 commits across 318 pull requests
  • 1,546 files modified
  • 21 contributors involved in the project

Most development has been led by the three core maintainers, but the project has also started to attract external contributions. In the first year, 17 external contributors submitted 29 pull requests. These contributions include taint tracking and language improvements (Kotlin scope functions), distribution infrastructure (install script), and output improvements (fingerprinting). The barrier to entry is high: the codebase is ~200k lines of OCaml, and contributing a language feature requires understanding how languages are parsed and translated into the generic AST, and how the intermediate representations and the taint engine work. Growing the external contributor base is a key goal for us for 2026.

Q. What did you get wrong?

A. We did not secure the package name on PyPI, and we distributed wheels in our releases. Combined with a misconfiguration, this created a pathway for a malicious user to hijack the package.

Governance and long-term sustainability

The maintainer team (Dimitris Mostrous, Maciej Piróg, and Corneliu Hoffman) is responsible for the technical direction of the project, including reviewing contributions, maintaining releases, and guiding the roadmap.

Opengrep began as a collaborative effort across several companies in the security ecosystem that shared the goal of keeping advanced static analysis capabilities available in open source. Today the project is stewarded by the maintainer team and developed openly on GitHub, with roadmap discussions and technical decisions visible to the community.

Looking ahead, the goal is to continue expanding the community around Opengrep while maintaining transparent governance. 

Opengrep is released under the LGPL-2.1 licence, ensuring the engine and its derivatives remain open.

Why Opengrep matters in the age of AI security analysis

AI scanners are probabilistic. The same input can produce different outputs between runs, depending on model state, sampling, and context. That's fine for hunting novel vulnerabilities, but it creates real problems in CI/CD pipelines: results that shift between runs undermine reproducibility, make compliance harder, and erode trust in the tooling. Opengrep produces the same findings from the same code and rules, every time. That consistency is what makes scan output work as a pipeline gate, hold up in an audit, and give developers a reason to act on findings.
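
One way to picture why determinism matters for a pipeline gate: if the same code and rules always produce the same findings, a run can be reduced to a stable digest and compared against a stored baseline. The Python sketch below assumes findings are plain dictionaries with hypothetical `rule_id`, `path` and `line` keys. It does not reflect Opengrep's actual output schema or its own fingerprinting feature, and the function names are made up for illustration.

```python
import hashlib
import json
from typing import Dict, List

Finding = Dict[str, object]


def _canonical(finding: Finding) -> str:
    """Serialize one finding so that key order cannot affect the result."""
    return json.dumps(finding, sort_keys=True, separators=(",", ":"))


def findings_digest(findings: List[Finding]) -> str:
    """Hash a set of findings independently of the order they were reported in."""
    lines = sorted(_canonical(f) for f in findings)
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def new_findings(baseline: List[Finding], current: List[Finding]) -> List[Finding]:
    """Return findings in the current run that the baseline did not contain."""
    known = {_canonical(f) for f in baseline}
    return [f for f in current if _canonical(f) not in known]


# Made-up sample data: two runs over the same code, reported in different order.
run_a = [
    {"rule_id": "sql-injection", "path": "app/db.py", "line": 42},
    {"rule_id": "weak-hash", "path": "app/auth.py", "line": 7},
]
run_b = list(reversed(run_a))

same_digest = findings_digest(run_a) == findings_digest(run_b)
regressions = new_findings(run_a, run_b)
```

With a deterministic scanner, `same_digest` holds and `regressions` is empty whenever nothing changed, so any difference points to a change in code or rules. A scanner whose output varies between runs cannot offer that guarantee.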

There's a practical gap too. AI scanners typically require API calls, GPU compute, or both. Testing by James Berthoty at Latio showed a probabilistic model spending 17 minutes and 155,000 tokens to find an issue Opengrep caught in 30 seconds. Opengrep runs locally, needs no external services, and operates on explicit rules that teams can inspect and tune. For known vulnerability classes running on every commit, the economics are clear.

The strongest security pipelines layer both. Deterministic scanning runs first, catching known patterns fast and consistently, then AI reasoning handles triage, exploitability assessment, and prioritisation. For instance, Opengrep can power the SAST layer, and AI can sit on top to suppress false positives and evaluate severity in context. The deterministic layer gets more valuable in an AI-powered pipeline because AI needs consistent signal to reason over.

As AI agents and coding assistants become part of development workflows, they need tools they can call programmatically with deterministic output, structured results and predictable behaviour on every invocation. Opengrep fits that pattern well and can integrate with AI workflows programmatically or by using our official agent skill. Moreover, coding agents can be used to define rules which can then be used to scan efficiently and at scale using Opengrep.
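
As a rough illustration of what "structured results" can look like from an agent's side, the Python sketch below reduces a JSON report to a compact, stably ordered list that a model could reason over. It assumes a Semgrep-style report shape: a top-level `results` array whose entries carry `check_id`, `path`, `start.line`, `extra.message` and `extra.severity`. That shape is an assumption here, so check the output of your own Opengrep version before relying on these field names. The sample report and the `summarize` helper are made up.

```python
import json
from typing import Dict, List

# Made-up report in an assumed Semgrep-style shape.
SAMPLE_REPORT = """
{
  "results": [
    {"check_id": "rules.sql-injection", "path": "app/db.py",
     "start": {"line": 42},
     "extra": {"message": "Tainted query", "severity": "ERROR"}},
    {"check_id": "rules.weak-hash", "path": "app/auth.py",
     "start": {"line": 7},
     "extra": {"message": "MD5 in use", "severity": "WARNING"}}
  ]
}
"""


def summarize(report_json: str) -> List[Dict[str, object]]:
    """Flatten a report into small records, tolerating missing fields."""
    report = json.loads(report_json)
    rows: List[Dict[str, object]] = []
    for result in report.get("results", []):
        extra = result.get("extra", {})
        line = result.get("start", {}).get("line")
        rows.append({
            "rule": result.get("check_id"),
            "location": f"{result.get('path')}:{line}",
            "severity": extra.get("severity"),
            "message": extra.get("message"),
        })
    # Sort so that repeated invocations hand the agent identical input.
    return sorted(rows, key=lambda row: (str(row["location"]), str(row["rule"])))


summary = summarize(SAMPLE_REPORT)
```

Sorting the records keeps the agent's input identical across invocations, which preserves the determinism of the underlying scan all the way into the prompt.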

What’s next for Opengrep?

With the core architecture now in place, the next phase of Opengrep focuses on expanding the capabilities of the engine and improving usability. Many of our priorities come directly from feedback from production users and contributors in the community.

One of the biggest priorities is interfile taint analysis, allowing the engine to track how untrusted data flows across multiple files rather than only within a single file. This will significantly improve the detection of complex vulnerabilities in real-world codebases.

Another milestone is removing the remaining Python wrapper, moving toward a fully standalone OCaml binary, and simplifying installation and CI usage. Our roadmap also includes new language support, improvements to existing language grammars, and broader distribution through package managers such as Homebrew, Winget and apt.

Our roadmap is ambitious, but the focus remains on building a fast, capable static analysis engine that stays openly available to developers and security teams.
