Aikido

10 Code Quality Rules Learned from Grafana’s Engineering Team

Introduction

Grafana is one of the most popular open-source observability platforms, with over 70k GitHub stars and thousands of contributors improving it every day. With more than 3k open issues and hundreds of pull requests constantly in motion, keeping the codebase clean and consistent is a real challenge.

By studying its codebase, we can uncover some of the unwritten rules and best practices that help the team keep quality high while moving fast. Many of these rules focus on security, maintainability, and reliability. Some of them deal with issues that traditional static analysis tools (SAST) cannot catch, like improper async usage, resource leaks, or inconsistent patterns across the code. These are the kinds of problems that human reviewers or AI-powered tools can spot during a code review.

The challenges

Large projects like this face several challenges: massive code volume, many modules (API, UI, plugins), and countless external integrations (Prometheus, Loki, etc.). Hundreds of contributors may follow different coding styles or assumptions. New features and quick fixes can introduce hidden bugs, security flaws, or confusing code paths. Volunteer reviewers may not know every part of the codebase, leading to missed design patterns or best practices. In short, scale and diversity of contributions make consistency and reliability tough to enforce.

Why these rules matter

A clear set of review rules directly benefits Grafana’s health. First, maintainability improves: consistent patterns (folder layout, naming, error handling) make the code easier to read, test, and extend. Reviewers spend less time guessing intent when everyone follows common conventions. Second, security is enhanced: rules like “always validate user input” or “avoid open redirects” help prevent vulnerabilities of the kind that have already been found in Grafana (for example CVE-2025-6023 and CVE-2025-4123). Finally, onboarding new contributors is faster: when examples and reviews consistently use the same practices, newcomers learn the “Grafana way” quickly and confidently.

Bridging context to these rules

These rules come from real issues in Grafana’s code and community. Security advisories and bug reports have uncovered patterns (e.g. path traversal leading to XSS) that we turn into preventive rules. Each rule below highlights a concrete pitfall, explains why it matters (performance, clarity, security, etc.), and shows a clear ❌ non-compliant vs ✅ compliant snippet in Grafana’s languages (Go or TypeScript/JS).

Now let’s explore the 10 rules that help keep Grafana’s codebase robust, secure, and understandable.

10 Practical Code Quality Rules Inspired by Grafana

1. Use environment variables for configuration (avoid hard-coded values).

Avoid hard-coding ports, credentials, URLs, or other environment-specific values. Read them from environment variables or config files to keep code flexible and secrets out of source.

Non-compliant:

// server.js
const appPort = 3000;
app.listen(appPort, () => console.log("Listening on port " + appPort));

Compliant:

// server.ts
const PORT = Number(process.env.PORT) || 3000;
app.listen(PORT, () => console.log(`Listening on port ${PORT}`));

Why this matters: Using environment variables keeps sensitive data out of the source code, makes deployments flexible across different environments, and avoids accidental leaks of secrets. It also ensures that configuration changes don’t require code modifications, improving maintainability and reducing errors.
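
As a slightly fuller sketch of the same idea, configuration can be read and validated once at startup, so that a bad value fails fast instead of surfacing later in a request handler. The `loadConfig` helper, the `ServerConfig` shape, and the `PORT` and `API_URL` variable names are hypothetical and chosen for illustration; they are not taken from Grafana’s code.

```typescript
// Hypothetical config shape; the field names are illustrative only.
interface ServerConfig {
  port: number;
  apiUrl: string;
}

type Env = Record<string, string | undefined>;

// Reads settings from an environment-like object and validates them once.
function loadConfig(env: Env): ServerConfig {
  const rawPort = env.PORT ?? "3000"; // a non-secret default is fine for a port
  const port = Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${rawPort}`);
  }

  const apiUrl = env.API_URL;
  if (!apiUrl) {
    // No fallback for required values: fail fast rather than guess.
    throw new Error("API_URL is required");
  }

  return { port, apiUrl };
}
```

In a Node.js service this would typically be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test.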

2. Sanitize user input before using it.

All input from users or external sources should be validated or sanitized before use to prevent injection attacks and unexpected behavior.

Non-compliant:

// frontend/src/components/UserForm.tsx
const handleSubmit = (username: string) => {
  setUsers([...users, { name: username }]);
};

Compliant:

// frontend/src/utils/sanitize.ts
export function sanitizeInput(input: string): string {
  return input.replace(/<[^>]*>/g, ''); // strips HTML tags; illustrative only, not a complete XSS defense
}

// frontend/src/components/UserForm.tsx
import { sanitizeInput } from '../utils/sanitize';

const handleSubmit = (username: string) => {
  const cleanName = sanitizeInput(username);
  setUsers([...users, { name: cleanName }]);
};

Why this matters: Proper input sanitization prevents XSS, injection attacks, and unexpected behavior caused by malformed input. It protects both users and the system, and ensures that downstream processes, logging, and storage handle data safely.
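
Stripping tags with a regular expression is easy to bypass, so a sturdier pattern is usually to validate input against an allow-list and to encode values when they are written into markup. The sketch below assumes hypothetical `isValidUsername` and `escapeHtml` helpers; the allowed character set and the length limits are assumptions, not Grafana’s rules.

```typescript
// Allow-list validation: accept only what a username is expected to contain.
// The character set and the 3-32 length limit are assumptions for this sketch.
const USERNAME_PATTERN = /^[A-Za-z0-9_.-]{3,32}$/;

function isValidUsername(input: string): boolean {
  return USERNAME_PATTERN.test(input);
}

// Output encoding: escape HTML metacharacters when a value is written into markup.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(text: string): string {
  return text.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}
```

Frameworks such as React already escape text rendered through JSX, so explicit encoding matters most where strings are concatenated into HTML by hand.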

3. Prevent open redirects and path traversal.

Ensure that any URLs or file paths used in your code are properly validated and sanitized. Do not allow user input to directly determine redirects or filesystem paths.

Non-compliant:

// Express route in Grafana plugin
app.get("/goto", (req, res) => {
  const dest = req.query.next;    // attacker can supply any URL
  res.redirect(dest);
});

Compliant:

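// Express route with safe redirect
app.get("/goto", (req, res) => {
  const dest = req.query.next;
  // Only allow local paths: a leading '/', but not '//', and no backslashes or
  // control characters, which browsers can normalize into '//host'
  if (typeof dest === "string" && dest.startsWith("/") &&
      !dest.startsWith("//") && !/[\\\x00-\x1f]/.test(dest)) {
    res.redirect(dest);
  } else {
    res.status(400).send("Invalid redirect URL");
  }
});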

Why this matters: Preventing open redirects and path traversal protects users from phishing, data leaks, and unauthorized file access. It reduces attack surface, enforces security boundaries, and avoids accidental exposure of sensitive server resources.
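
The path traversal half of this rule can be sketched in a similar way: resolve the user-supplied path against a fixed base directory and reject anything that ends up outside it. The `resolveInside` helper and the directory names are hypothetical; the sketch relies only on Node’s built-in `path` module.

```typescript
import * as path from "node:path";

// Resolves a user-supplied relative path inside baseDir and rejects anything
// that would escape it (for example "../../etc/passwd" or an absolute path).
function resolveInside(baseDir: string, userPath: string): string {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, userPath);
  const rel = path.relative(base, target);

  const escapes =
    rel === "" || // the base directory itself is not a servable file
    rel === ".." ||
    rel.startsWith(".." + path.sep) ||
    path.isAbsolute(rel); // e.g. a different drive on Windows

  if (escapes) {
    throw new Error("Path escapes the base directory");
  }
  return target;
}
```

This check is purely lexical. If symbolic links inside the base directory are a concern, the resolved path would also need to be checked after `fs.realpath`.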

4. Enable a strict content security policy (CSP).

Enforce a Content Security Policy in the application headers that only allows scripts, styles, images, and other resources from trusted sources. Avoid unsafe-inline, unsafe-eval, and wildcard sources wherever the application allows it, and prefer nonces for scripts.

Non-compliant: (No CSP or too permissive)

# grafana.ini (non-compliant)
content_security_policy = false

Compliant: (CSP enabled with a nonce-based template)

# grafana.ini
content_security_policy = true
content_security_policy_template = """
  script-src 'self' 'unsafe-eval' 'unsafe-inline' 'strict-dynamic' $NONCE;
  object-src 'none';
  font-src 'self';
  style-src 'self' 'unsafe-inline' blob:;
  img-src * data:;
  base-uri 'self';
  connect-src 'self' grafana.com ws://$ROOT_PATH wss://$ROOT_PATH;
  manifest-src 'self';
  media-src 'none';
  form-action 'self';
"""

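Why this matters: A strict CSP blocks many classes of client-side attacks, including XSS. It enforces predictable behavior for resources, reduces the chance of malicious code execution, and provides a clear security boundary in the browser context. Note that the template above still lists 'unsafe-inline' and 'unsafe-eval' in script-src and a wildcard in img-src. In browsers that support nonces and 'strict-dynamic', 'unsafe-inline' is ignored and only serves as a fallback for older browsers, but 'unsafe-eval' and the image wildcard remain in effect, so tighten them if your deployment allows it.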
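
For an application that needs neither inline scripts nor eval, a tighter policy is possible. The following is a minimal sketch of assembling such a header value; `buildCsp` is a hypothetical helper and the directive values are assumptions for illustration, not Grafana’s defaults.

```typescript
// Builds a Content-Security-Policy header value for one response.
// `nonce` must be a fresh, unpredictable value generated for each response.
function buildCsp(nonce: string): string {
  const directives: Record<string, string[]> = {
    "default-src": ["'self'"],
    "script-src": [`'nonce-${nonce}'`, "'strict-dynamic'"],
    "style-src": ["'self'"],
    "img-src": ["'self'", "data:"],
    "object-src": ["'none'"],
    "base-uri": ["'self'"],
    "form-action": ["'self'"],
  };

  // Join each directive as "name value value" and separate directives with "; ".
  return Object.entries(directives)
    .map(([name, values]) => `${name} ${values.join(" ")}`)
    .join("; ");
}
```

The same nonce has to be placed on every legitimate script tag of the page, and it should come from a cryptographically secure random source rather than a counter or timestamp.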

5. Handle errors and nil checks (avoid panics).

Always check for errors and nil values in function calls, API responses, and data structures. Replace panics with proper error handling and return meaningful error messages or codes.

Non-compliant:

rows, _ := db.Query("SELECT name FROM users WHERE id=?", id)  // ignored error
user := &User{}
rows.Next()            // if Query failed, rows is nil => panic
rows.Scan(&user.Name)  // Next and Scan results ignored; rows is never closed

Compliant:

rows, err := db.Query("SELECT name FROM users WHERE id=?", id)
if err != nil {
    return nil, err
}
defer rows.Close()
if !rows.Next() {
    // Next returns false both for "no rows" and for an iteration error
    if err := rows.Err(); err != nil {
        return nil, err
    }
    return nil, errors.New("user not found")
}
var name string
if err := rows.Scan(&name); err != nil {
    return nil, err
}
user := &User{Name: name}

Why this matters: Proper error handling prevents crashes and ensures the system remains reliable even when unexpected inputs or conditions occur. It improves maintainability, reduces downtime, and makes debugging easier by providing meaningful error information.

6. Defer resource cleanup (prevent leaks).

Ensure all opened resources such as files, network connections, or database handles are closed with defer, placed right after the error check that follows the call that opened them. Do not rely on manual cleanup later in the code.

Non-compliant:

resp, err := http.Get(url)
// ... use resp.Body ...
// forgot: resp.Body.Close()

Compliant:

resp, err := http.Get(url)
if err != nil {
    return err // return before deferring: resp may be nil here
}
defer resp.Body.Close()
// ... use resp.Body ...

Why this matters: Proper cleanup prevents memory leaks, file descriptor exhaustion, and connection pool saturation. This maintains system stability, avoids performance degradation over time, and reduces operational issues in production.
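
On the TypeScript side there is no defer, and the same guarantee usually comes from try/finally. A minimal sketch, assuming a hypothetical `Closable` interface and `withResource` helper:

```typescript
// Hypothetical resource with an explicit close step
// (for example a file handle or a database connection).
interface Closable {
  close(): Promise<void>;
}

// Opens a resource, runs `use` with it, and always closes it,
// whether `use` returns normally or throws.
async function withResource<R extends Closable, T>(
  open: () => Promise<R>,
  use: (resource: R) => Promise<T>,
): Promise<T> {
  const resource = await open();
  try {
    return await use(resource);
  } finally {
    await resource.close();
  }
}
```

Keeping the open and close steps in one helper means callers cannot forget the cleanup, which is the same property defer gives in Go.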

7. Use parameterized queries (avoid SQL injection).

Always use parameterized queries or prepared statements when interacting with the database instead of string concatenation for SQL commands.

Non-compliant:

// Dangerous: userID might contain a SQL quote or injection
query := "DELETE FROM sessions WHERE user_id = '" + userID + "';"
db.Exec(query)

Compliant:

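// Safe: userID is bound as a parameter (placeholder syntax varies by driver, e.g. $1 for PostgreSQL)
if _, err := db.Exec("DELETE FROM sessions WHERE user_id = ?", userID); err != nil {
    return err
}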

Why this matters: Parameterized queries prevent SQL injection attacks, one of the most common security vulnerabilities. They protect sensitive data, reduce the risk of database corruption, and make queries more maintainable and easier to audit. This ensures both the security and reliability of your application.

8. Use async/await properly in TypeScript (handle promises).

Always await promises and handle errors using try/catch instead of ignoring rejections or mixing callback-style handling.

Non-compliant:

async function fetchData() {
  // Missing await: fetch returns a Promise, not the actual data
  const res = fetch('/api/values');
  console.log(res.data); // undefined
}

Compliant:

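async function fetchData() {
  try {
    const res = await fetch('/api/values');
    if (!res.ok) {
      // fetch only rejects on network failure, not on HTTP error statuses
      throw new Error(`Request failed with status ${res.status}`);
    }
    const data = await res.json();
    console.log(data);
  } catch (err) {
    console.error("Fetch failed:", err);
  }
}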

Why this matters: Proper async handling ensures that errors in asynchronous code don’t go unnoticed, prevents unhandled promise rejections, and maintains predictable program flow. It makes code more readable, easier to debug, and prevents subtle bugs that can lead to data corruption, inconsistent state, or unexpected runtime crashes.
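
A subtler variant of the same mistake is returning a promise from inside a try block without awaiting it: the rejection then bypasses the catch entirely. The sketch below uses a hypothetical `saveDashboard` function as a stand-in for a network call.

```typescript
// Simulated async operation; stands in for a network call.
async function saveDashboard(id: string): Promise<string> {
  if (id === "") {
    throw new Error("missing dashboard id");
  }
  return `saved ${id}`;
}

// Bug pattern: the promise is returned without `await`, so a rejection
// settles after the try block has exited and the catch never runs.
async function saveUnsafe(id: string): Promise<string> {
  try {
    return saveDashboard(id);
  } catch {
    return "fallback";
  }
}

// Fix: `return await` keeps the rejection inside the try/catch.
async function saveSafe(id: string): Promise<string> {
  try {
    return await saveDashboard(id);
  } catch {
    return "fallback";
  }
}
```

With an empty id, `saveSafe` resolves to "fallback", while `saveUnsafe` rejects and leaves the error to whoever called it. Lint rules such as `@typescript-eslint/no-floating-promises` and `@typescript-eslint/return-await` can flag these patterns automatically.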

9. Favor strict types in TypeScript (avoid any).

Use precise TypeScript types instead of any to define variables, function parameters, and return types.

Non-compliant:

// No types specified
function updateUser(data) {
  // ...
}
let config: any = loadConfig();

Compliant:

interface User { id: number; name: string; }
function updateUser(data: User): Promise<User> {
  // ...
}
interface AppConfig { endpoint: string; timeoutMs: number; }
const config: AppConfig = loadConfig();

Why this matters: Strict typing catches type-related mistakes at compile time, reducing runtime errors and improving code reliability. It makes the code self-documenting, easier to refactor, and ensures that all parts of the system interact in a predictable, type-safe way, which is crucial in large, complex codebases like Grafana’s.
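
Where a value really is of unknown shape, such as parsed JSON, `unknown` combined with a type guard is a safer alternative to `any`. The sketch below reuses the `AppConfig` shape from the compliant snippet; the `isAppConfig` and `parseConfig` helpers are hypothetical.

```typescript
interface AppConfig {
  endpoint: string;
  timeoutMs: number;
}

// Type guard: narrows an untrusted value to AppConfig after checking its shape at runtime.
function isAppConfig(value: unknown): value is AppConfig {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.endpoint === "string" &&
    typeof candidate.timeoutMs === "number"
  );
}

// Parses JSON into `unknown` first, so the compiler forces a check before use.
function parseConfig(json: string): AppConfig {
  const parsed: unknown = JSON.parse(json);
  if (!isAppConfig(parsed)) {
    throw new Error("Invalid config shape");
  }
  return parsed;
}
```

Unlike `any`, a value typed as `unknown` cannot be used until it has been narrowed, so the runtime check cannot be skipped by accident.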

10. Apply consistent code style and naming.

Enforce uniform formatting, naming conventions, and file structures across the codebase.

Non-compliant: (mixed styles)

const ApiData = await getdata();   // PascalCase for variable? function name not camelCase.
function Fetch_User() { ... }      // Unusual naming.

Compliant:

const apiData = await fetchData();
function fetchUser() { ... }

Why this matters: Consistent style and naming improve readability and make it easier for multiple contributors to understand and maintain the code. It reduces cognitive overhead when navigating the project, prevents subtle bugs caused by misunderstandings, and ensures that automated tools (linters, formatters, code reviewers) can reliably enforce quality standards in a large team environment.

Conclusion

Each rule above addresses a recurring challenge in Grafana’s codebase. Applying them consistently during code reviews helps the team maintain clean and predictable code, improve security by preventing common vulnerabilities, and make onboarding smoother by providing clear patterns for new contributors. As the project scales, these practices keep the codebase reliable, maintainable, and easier to navigate for everyone involved. Following these rules can help any engineering team build and sustain high-quality software at scale.

FAQs

Why analyze Grafana’s repository for code review rules?

Grafana is a large, mature open-source project with thousands of contributors. Studying its codebase reveals real-world engineering patterns that help teams maintain clean, scalable, and secure software at scale.

What makes these rules different from regular linting or formatting checks?

Traditional linters catch syntax and formatting issues. These Grafana-based rules go deeper, focusing on architecture, readability, consistency, and security decisions that require contextual understanding, which is something AI-based reviews can handle.

How can AI-based tools help detect these code quality issues?

AI tools can analyze intent, naming, architecture, and context, not just syntax. They can identify maintainability problems, unclear abstractions, and potential security issues that traditional static analysis often misses.

Are these rules specific to Grafana or can any team use them?

While they’re inspired by Grafana’s codebase, the principles apply to any large-scale software project. Teams can adapt them to their own repositories to maintain consistency, prevent regressions, and improve onboarding.

How do these rules relate to Aikido’s code quality checks?

Each rule can be implemented as a custom AI rule in Aikido’s code quality platform, allowing automated detection of architecture, readability, and security issues in every pull request.
