AI Pentesting in Action: A TL;DV Recap of Our Live Demo

Trusha Sharma

If you missed the live walkthrough of Aikido AI Pentesting, here is the short version of what happened. We set up a real application, configured the assessment, and watched the agents test live flows, explore the app, and surface confirmed findings with full traces.

TL;DV

Pentesting is one of the slowest parts of modern security. Teams deploy every day, while offensive testing still happens once a year and arrives as a static PDF that is already outdated. The demo opened with this gap, then immediately moved into showing how Aikido Attack works inside the product.

Setting up a pentest in Aikido feels like briefing a red team. You define the scope in plain language, choose which domains agents may attack and which must remain reachable, and describe the authentication flow exactly as you would to a human tester. You can include MFA, SSO flows, redirects, or multi-step sequences. The agents follow it.

You can also connect repositories and upload context like API specs, earlier reports, and documentation. More context improves the assessment, a point made in both the demo and our documentation.

Once the run began, the dashboard populated with agent terminals and browser sessions. You could watch them explore routes, execute attack attempts, adapt when something succeeded, and validate findings directly in the live environment. Every action was visible, down to request logs and screenshots.

The findings page showed confirmed vulnerabilities with full traces and reproduction steps.

One example in the live session was an improper access control issue where private notes could be fetched through an API call.

Another was a command injection that AutoFix could repair automatically. With one click, the platform generated a pull request and allowed a retest to confirm the fix.

The biggest advantage is the Aikido platform itself. Because the product already understands your repositories, your security context, and how your application behaves, the agents test with background knowledge that traditional approaches lack. That context improves the depth of the assessment and allows AutoFix to produce meaningful, targeted fixes.

The session ended with the audit-ready PDF report and a Q&A covering scope control, validation, business logic testing, and how continuous pentesting will fit into normal development workflows.

AI Pentesting FAQs

What is AI pentesting inside Aikido?

Aikido uses coordinated agents that explore the application, follow real user flows, test attack paths, and validate exploitability. They use a browser, a terminal environment, and an HTTP client. When you connect code and upload context, the agents reason through logic and intended behavior instead of relying on static payloads.

The result is a pentest that adapts, explores, and validates.

Read more → https://help.aikido.dev/pentests/aikido-pentest

How is this different from traditional DAST tools?

DAST tools rely on fixed patterns. They struggle with authentication steps, roles, and multi-step workflows. They also tend to produce noise.

Aikido Attack behaves more like human offensive testing. Agents read context, plan actions, execute attacks, observe outcomes, and adjust. Every finding must be validated in the target environment before it appears in the report.

What kinds of issues can the agents find?

Everything expected from a penetration test:

  • SQL injection
  • Command injection / RCE
  • XSS
  • SSRF
  • Broken access control
  • IDOR / BOLA
  • Authentication flaws
  • Unsafe or sensitive API paths

And critically, business logic issues that depend on understanding how the application is supposed to behave.

In the demo, the agents identified a private data exposure through an API. In customer environments, they have found permission mismatches, workflow bypasses, and cross-tenant data access issues.

More details → https://help.aikido.dev/pentests/what-issues-can-aikido-pentest-find

Can it really detect IDOR and business logic flaws?

Yes. When the platform understands roles, data flows, and expected behavior, the agents can test whether users can access or modify resources they should not. In several comparisons with human penetration testers, the autonomous run surfaced more logic flaws.

More details → https://help.aikido.dev/pentests/understanding-and-detecting-idor-vulnerabilities
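
To make the idea concrete, here is a minimal sketch of the kind of check an IDOR test performs: each user requests resources owned by someone else, and any successful read is flagged. It runs against an in-memory stand-in rather than a real API. All names here (Note, NOTES, get_note_vulnerable, get_note_fixed, find_idor) are hypothetical and are not part of Aikido's product or the demo application.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Note:
    note_id: int
    owner: str
    body: str


# Hypothetical data store standing in for the application's private notes.
NOTES = {
    1: Note(1, "alice", "alice's private note"),
    2: Note(2, "bob", "bob's private note"),
}

Handler = Callable[[str, int], "tuple[int, str]"]


def get_note_vulnerable(requesting_user: str, note_id: int) -> tuple[int, str]:
    """Returns any note by ID and never checks who is asking (the IDOR)."""
    note = NOTES.get(note_id)
    if note is None:
        return 404, ""
    return 200, note.body


def get_note_fixed(requesting_user: str, note_id: int) -> tuple[int, str]:
    """Returns the note only when the requester owns it."""
    note = NOTES.get(note_id)
    if note is None or note.owner != requesting_user:
        return 404, ""
    return 200, note.body


def find_idor(handler: Handler, users: list[str]) -> list[tuple[str, int]]:
    """Try every user against every note they do not own; record successful reads."""
    findings = []
    for user in users:
        for note in NOTES.values():
            if note.owner == user:
                continue  # reading your own note is expected behavior
            status, body = handler(user, note.note_id)
            if status == 200 and body == note.body:
                findings.append((user, note.note_id))
    return findings
```

Running find_idor against get_note_vulnerable reports both cross-user reads, while get_note_fixed produces no findings. A real assessment works against live HTTP endpoints and authenticated sessions, which this sketch deliberately leaves out.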

How do you prevent hallucinations or false positives?

The agents may generate hypotheses, but the platform does not trust them until they are validated.

For each proposed issue, Aikido runs a reproducibility test directly against the target.

Only validated findings appear in the report.
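
The principle can be sketched in a few lines: a hypothesis only becomes a finding if the exploit attempt succeeds every time it is replayed. This is an illustration under stated assumptions, not Aikido's implementation. The function names and the choice of three runs are hypothetical, and each attempt is modeled as a plain callable instead of a request against a live target.

```python
from __future__ import annotations

from typing import Callable

# An attempt replays one exploit and returns True if it succeeded.
Attempt = Callable[[], bool]


def is_reproducible(attempt: Attempt, runs: int = 3) -> bool:
    """Return True only if the attempt succeeds on every one of `runs` replays."""
    return all(attempt() for _ in range(runs))


def confirmed_findings(hypotheses: dict[str, Attempt], runs: int = 3) -> list[str]:
    """Keep only the hypotheses whose exploit reproduces; drop everything else."""
    return [
        name
        for name, attempt in hypotheses.items()
        if is_reproducible(attempt, runs)
    ]


def make_flaky_attempt() -> Attempt:
    """Build an attempt that succeeds once and then fails, like a lucky guess."""
    calls = {"count": 0}

    def attempt() -> bool:
        calls["count"] += 1
        return calls["count"] == 1

    return attempt
```

With one attempt that always succeeds, one that always fails, and one built by make_flaky_attempt, only the first survives. That filtering is what keeps unverified guesses out of the report.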

How do you keep the pentest safe and in scope?

You define:

  • Attackable domains
  • Reachable but non-attackable domains
  • Authentication instructions
  • Maximum number of agents
  • Allowed testing hours

All network traffic flows through a proxy that blocks anything outside scope.

Pre-flight checks confirm that authentication and connectivity work before the run begins.

If pre-flight fails, credits are refunded. A panic button stops the test within seconds.

More details on scope → https://help.aikido.dev/pentests/scope-of-assessment
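
As a rough illustration of how a scope-enforcing proxy might classify outbound requests, the sketch below sorts each URL into one of three outcomes: attackable, reachable only, or blocked. The Scope and Decision names, the example hostnames, and the exact-match rule are assumptions made for this example. They do not describe how Aikido's proxy is actually built.

```python
from dataclasses import dataclass
from enum import Enum
from urllib.parse import urlsplit


class Decision(Enum):
    ATTACK = "attack"          # in scope: agents may send attack traffic
    REACH_ONLY = "reach_only"  # needed for the app to work, but must not be attacked
    BLOCK = "block"            # everything else is dropped


@dataclass(frozen=True)
class Scope:
    attackable: frozenset
    reachable: frozenset

    def decide(self, url: str) -> Decision:
        """Classify a URL by its exact hostname; unknown or malformed URLs are blocked."""
        host = (urlsplit(url).hostname or "").lower()
        if host in self.attackable:
            return Decision.ATTACK
        if host in self.reachable:
            return Decision.REACH_ONLY
        return Decision.BLOCK


# Hypothetical scope: one attackable app domain and one SSO domain to leave alone.
EXAMPLE_SCOPE = Scope(
    attackable=frozenset({"app.example.com"}),
    reachable=frozenset({"auth.example.com"}),
)
```

Exact hostname matching is a deliberately conservative choice here. A lookalike such as app.example.com.evil.net falls through to BLOCK, because anything not explicitly listed is denied by default.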

Is the final report accepted for SOC 2 and ISO 27001?

Yes. The generated PDF includes methodology, scope, issue details, reproduction steps, and remediation guidance.

Customers already use these reports for SOC 2, ISO 27001, and vendor assessments.

You can also download a sample PDF report here: https://www.aikido.dev/attack/aipentest#report

How does AI pentesting compare to a human pentest?

This was covered in the demo. For web applications, the autonomous run delivers coverage comparable to a manual pentest, and in multiple cases it uncovered logic flaws the human team missed.

Our whitepaper findings match this: AI identified deep logic issues such as IDORs, authentication bypasses, and e-signature forgeries that humans overlooked, while humans tended to focus more on configuration and compliance.

AI finishes in hours instead of weeks.

Most teams use AI pentesting as the foundation and add human review when needed.

Do I need to give access to my code?

You don’t have to, but connecting repositories makes the assessment significantly stronger. With code access, the agents can understand logic paths, data rules, roles, and workflow assumptions. That context improves coverage and reduces guesswork.

Black-box mode still works, but it is naturally slower and less complete because the agents have to infer structure from the outside.

How does pricing work?

Three common entry points, plus an Enterprise tier with custom pricing:

  • Feature Pentest: CI/CD and new feature deployments
  • Standard Pentest: Comprehensive audit
  • Advanced Pentest: Deeper analysis of mature applications
  • Enterprise (Custom Pricing): For organizations with advanced offensive testing needs

A more detailed breakdown is available here: https://www.aikido.dev/attack/aipentest

What role does AutoFix play?

AutoFix takes a confirmed vulnerability and turns it into a concrete code change. In the demo, a command-injection finding produced a pull request with the exact fix.

The value is the loop:

Attack finds → AutoFix proposes a PR → you merge → Attack retests the fix.

Because Aikido already understands your repositories and structure, the fixes are targeted and the verification is immediate.
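
For readers unfamiliar with this bug class, here is a generic sketch of what a command-injection fix often looks like in Python. It is not the pull request from the demo, and the function names are hypothetical. The vulnerable version interpolates input into a shell string. The fixed version skips the shell and passes the input as a single argument.

```python
import subprocess
import sys

# Child program used for the demonstration: it prints its first argument.
CHILD = "import sys; print(sys.argv[1])"


def echo_vulnerable(value: str) -> str:
    """UNSAFE: the input lands inside a shell string, so `;` can start a new command."""
    command = f'"{sys.executable}" -c "{CHILD}" {value}'
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


def echo_fixed(value: str) -> str:
    """Safer: no shell is involved, and the input is one argv element, never parsed."""
    argv = [sys.executable, "-c", CHILD, value]
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout
```

With echo_fixed, a payload such as "example.com; echo INJECTED" reaches the child program as a literal string, so nothing after the semicolon is executed. Input validation, for example an allowlist of expected characters, is a sensible additional layer that this sketch omits.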

How does retesting work?

You can retest any issue as many times as needed for three months after the assessment. Each retest launches new agents that attempt the exploit again and confirm the fix holds.

Where is this heading next?

Two directions discussed in the demo:

  • Smoother onboarding with improved pre-flight checks and automatic credit estimation.
  • Continuous pentesting: running Attack on staging by default, triggering it on deployments or pull requests, and shifting from a yearly PDF to ongoing verification.

Pentesting becomes part of how you ship.
