Rule
Guard against slow regular expressions.
Regular expressions with nested quantifiers or ambiguous patterns can cause catastrophic backtracking and performance issues.
Supported languages: 45+
Introduction
Regular expressions can freeze your application for seconds or minutes when given the wrong input. Catastrophic backtracking occurs when a backtracking regex engine explores an exponentially growing number of paths while trying to match a pattern. A regex like (a+)+b takes microseconds to match valid input but can take hours to reject a sufficiently long string of a's that lacks a trailing b. Attackers exploit this through Regular Expression Denial of Service (ReDoS) attacks, sending crafted input that makes your regex engine consume 100% CPU until the request times out or the process crashes.
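The growth described above can be observed directly. The following is a minimal sketch, assuming a runtime with a global performance object (recent Node.js or a browser); timeReject is a hypothetical helper name, and the input sizes are deliberately small so the script finishes quickly. Absolute timings depend on the engine and machine, so none are claimed here.

```javascript
// Pattern from the text: a quantified group that is itself quantified.
const nested = /(a+)+b/;

// Hypothetical helper: time one failed match against n 'a's with no trailing 'b'.
function timeReject(regex, n) {
  const input = 'a'.repeat(n);
  const start = performance.now();
  const matched = regex.test(input);
  return { n, matched, ms: performance.now() - start };
}

// Sizes are kept small on purpose: in a backtracking engine each extra
// character roughly doubles the work for this pattern.
const results = [12, 16, 20].map((n) => timeReject(nested, n));
for (const r of results) {
  console.log(`n=${r.n} matched=${r.matched} ms=${r.ms.toFixed(2)}`);
}
```

Increasing the sizes a few characters at a time shows the time per match climbing much faster than the input length does.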
Why it matters
Security implications (ReDoS attacks): An attacker can paralyze your application with a single request containing crafted input. Email validation and URL parsing patterns are common targets. Unlike traditional DoS attacks requiring bandwidth, ReDoS needs only tiny payloads.
Performance degradation: Normal user input can trigger catastrophic backtracking, causing response times to spike from milliseconds to seconds. This creates unpredictable latency that's difficult to debug because it only manifests with specific input patterns.
Production incidents: A vulnerable regex blocks the event loop in Node.js or ties up worker threads in thread-pool-based servers. As requests pile up, memory usage grows and the system becomes unresponsive. In a microservice architecture, one vulnerable regex can cascade failures to dependent services.
Difficulty in detection: Patterns that work fine in testing with short inputs become exponentially slow with longer inputs. The vulnerability often goes unnoticed until production, requiring emergency deployment during an active incident.
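One way to surface the problem before production is to probe a pattern with inputs that grow step by step and stop as soon as a single match exceeds a time budget. The sketch below is an illustration under assumptions, not a complete fuzzer: findSlowLength, its option names, and the 50 ms budget are all hypothetical choices, and a global performance object is assumed.

```javascript
// Hypothetical helper: grow a pathological input (a run of valid characters
// plus an invalid ending) until one match attempt exceeds budgetMs.
// Returns the first length that blew the budget, or null if none did.
function findSlowLength(regex, options = {}) {
  const { prefixChar = 'a', suffix = '!', maxLength = 30, budgetMs = 50 } = options;
  for (let n = 1; n <= maxLength; n++) {
    const input = prefixChar.repeat(n) + suffix;
    regex.lastIndex = 0; // keep /g and /y patterns from carrying state between runs
    const start = performance.now();
    regex.test(input);
    if (performance.now() - start > budgetMs) return n; // stop before it gets worse
  }
  return null;
}

// Nested pattern from the introduction versus a flat equivalent.
const slowAt = findSlowLength(/(a+)+b/);
const flatAt = findSlowLength(/a+b/);
console.log('nested pattern exceeded budget at length:', slowAt);
console.log('flat pattern exceeded budget at length:', flatAt);
```

Stopping at the first slow length keeps the probe itself from hanging, which matters because the pattern under test may be exponential.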
Code examples
❌ Non-compliant:
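function validateEmail(email) {
  // Nested quantifiers: each (...+)+ group can split a run of characters in many ways
  const regex = /^([a-zA-Z0-9_\-\.]+)+@([a-zA-Z0-9_\-\.]+)+\.([a-zA-Z]{2,5})$/;
  return regex.test(email);
}

function extractURLs(text) {
  // Several quantified groups in a row, including (\w+)+ and (...*)+
  const regex = /(https?:\/\/)?([\w\-])+\.(\w+)+([\w\-\.,@?^=%&:/~\+#]*)+/g;
  return text.match(regex);
}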
Why it's unsafe: The nested quantifiers ([a-zA-Z0-9_\-\.]+)+ create exponential backtracking. For an input like aaaaaaaaaaaaaaaaaaaaaaaaa!, the engine tries every way of dividing the run of a's between the inner and outer + before it can conclude that no @ follows. The URL regex contains several quantified groups, such as (\w+)+ and ([\w\-\.,@?^=%&:/~\+#]*)+, which compound the problem and make it exploitable with long strings of valid characters that lack the expected structure.
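To see why the number of combinations grows so quickly, it helps to count how many ways a run of n identical characters can be divided into consecutive non-empty groups, since that is the choice a pattern of the form (x+)+ leaves open. The sketch below uses a hypothetical helper, countSplits; the figure is an illustration of the size of the search space, not an exact count of engine steps.

```javascript
// Count the ways a run of n characters can be split into consecutive
// non-empty groups. Each split is one way (x+)+ could consume the run.
function countSplits(n) {
  // ways[k] = number of ways to split the first k characters
  const ways = new Array(n + 1).fill(0);
  ways[0] = 1; // one way to split an empty run
  for (let k = 1; k <= n; k++) {
    for (let j = 0; j < k; j++) {
      ways[k] += ways[j]; // the last group covers characters j..k-1
    }
  }
  return ways[n];
}

console.log(countSplits(5));  // 16
console.log(countSplits(25)); // 16777216
```

The count is 2^(n-1), so every additional character doubles the number of candidate splits a failing match may have to rule out.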
✅ Compliant:
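function validateEmail(email) {
  // Each character class is quantified once; no group is repeated
  const regex = /^[a-zA-Z0-9_\-\.]+@[a-zA-Z0-9_\-\.]+\.[a-zA-Z]{2,5}$/;
  return regex.test(email);
}

function extractURLs(text) {
  // Mandatory scheme, then flat character classes with a single quantifier each
  const regex = /https?:\/\/[\w\-]+\.[\w\-]+(?:[\w\-\.,@?^=%&:/~\+#]*)?/g;
  return text.match(regex);
}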
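Why it's safe: Removing nested quantifiers eliminates catastrophic backtracking. A single quantifier over a character class, such as [a-zA-Z0-9_\-\.]+, has only one way to consume a given run of characters, so the engine has no exponential set of alternatives to retry. The URL pattern uses a non-capturing group with an optional suffix, (?:...)?, instead of nested repetition, and it makes the http:// or https:// scheme mandatory, so a match attempt fails immediately at positions that do not start a URL. Matching time therefore grows predictably with input length.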
Conclusion
Regular expression performance is a security concern, not just an optimization. Review all regex patterns for nested quantifiers, overlapping character classes in repetition groups, and ambiguous alternatives. Test regex patterns with pathological inputs (long strings of valid characters followed by invalid endings) to identify catastrophic backtracking before deployment. When possible, replace complex regex with string parsing functions that have predictable performance characteristics.
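As an illustration of the last point, the email check can be written with plain string operations. This is a sketch under assumptions: validateEmailByParsing, isAllowedChar, and isLetter are hypothetical names, and the rules simply mirror the compliant regex above (one @, a final dot, and a 2 to 5 letter ending) rather than any email standard.

```javascript
// Mirrors the character class [a-zA-Z0-9_\-\.] from the compliant regex.
function isAllowedChar(ch) {
  return (
    (ch >= 'a' && ch <= 'z') ||
    (ch >= 'A' && ch <= 'Z') ||
    (ch >= '0' && ch <= '9') ||
    ch === '_' || ch === '-' || ch === '.'
  );
}

function isLetter(ch) {
  return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
}

// Hypothetical parser: each segment is scanned once, with no backtracking.
function validateEmailByParsing(email) {
  if (typeof email !== 'string') return false;
  const at = email.indexOf('@');
  // Exactly one '@' and a non-empty local part before it
  if (at <= 0 || at !== email.lastIndexOf('@')) return false;
  const local = email.slice(0, at);
  const rest = email.slice(at + 1);
  const dot = rest.lastIndexOf('.');
  if (dot <= 0) return false; // need a non-empty domain before the final dot
  const domain = rest.slice(0, dot);
  const tld = rest.slice(dot + 1);
  if (tld.length < 2 || tld.length > 5) return false;
  return (
    [...local].every(isAllowedChar) &&
    [...domain].every(isAllowedChar) &&
    [...tld].every(isLetter)
  );
}

console.log(validateEmailByParsing('user@example.com'));       // true
console.log(validateEmailByParsing('a'.repeat(50000) + '!'));  // false
```

The second call uses the kind of pathological input described above; the parser rejects it after a single scan for the @ character.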
