How I Built a DevSecOps Pipeline with Semgrep + GitHub Actions

The Problem With Most SAST Integrations

Most teams bolt security scanning onto CI/CD as an afterthought — a job that runs, produces hundreds of findings, gets ignored, and eventually gets disabled. I've seen it happen at multiple companies. The failure mode isn't the tool; it's the integration strategy.

When I joined the AppSec team in my last role, we had a Semgrep scan running in CI. It was generating ~300 findings per scan. Developers had zero context on severity, zero ownership of findings, and zero trust in the tool. The scan was basically decorative.

This post documents how I rebuilt that integration from scratch into something developers actually use.

Why Semgrep Over Other SAST Tools

I've used SonarQube, Checkmarx, and Semgrep extensively. For embedding into developer workflows, Semgrep wins on three dimensions:

Rules are code — you write rules in YAML with pattern syntax, version them in your repo, and review them like any other code change
Fast — sub-60-second scans on most codebases; no waiting 20 minutes for a full analysis
Low false-positive rate when tuned correctly — the registry rules are opinionated and accurate

The trade-off: Semgrep requires upfront rule curation. You can't just point it at a codebase and trust everything it reports.

Step 1: Start With a Focused Ruleset

Don't use p/default out of the box on a new integration. It will flood developers with noise. Instead, start with a narrow, high-confidence ruleset:

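# .semgrep.yml
rules:
  - id: hardcoded-secret
    patterns:
      # Match string-literal assignments...
      - pattern: $VAR = "..."
      # ...but only when the variable name looks like a credential
      - metavariable-regex:
          metavariable: $VAR
          regex: '(?i).*(password|passwd|secret|token|api_?key).*'
    message: Potential hardcoded credential in $VAR
    languages: [python, javascript, java]
    severity: ERROR
    metadata:
      category: security
      confidence: HIGH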

I start every new integration with three rule categories only:

Hardcoded secrets and credentials
SQL injection sinks
Dangerous function calls (eval, exec, os.system)

These three categories have near-zero false positive rates when tuned to your stack. Ship those first. Add more categories after developers trust the tool.
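
To make the three categories concrete, here is a small Python sketch of the kind of code each one targets, next to a safer alternative. The function and table names (load_api_token, find_user_unsafe, find_user_safe, parse_literal, users) are invented for illustration and are not taken from the pipeline described here; real rules would be tuned to whatever your stack actually uses.

```python
import ast
import os
import sqlite3


# Category 1: hardcoded secrets.
# A rule would flag a literal such as:  API_TOKEN = "hunter2"
# Reading the value from the environment keeps it out of the repo.
def load_api_token() -> str:
    return os.environ.get("API_TOKEN", "")


# Category 2: SQL injection sinks.
def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # String-built query: attacker-controlled `name` becomes part of the SQL.
    query = f"SELECT id FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized query: the driver treats `name` strictly as data.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()


# Category 3: dangerous function calls.
# eval(text) would execute arbitrary code; ast.literal_eval only accepts
# Python literals (numbers, strings, lists, dicts, and so on).
def parse_literal(text: str):
    return ast.literal_eval(text)
```

Passing a payload like x' OR '1'='1 to find_user_unsafe returns every row in the table, while find_user_safe returns nothing. That gap is what a SQL injection rule is meant to catch before merge.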

Step 2: The GitHub Actions Workflow

Here's the core workflow I use. It runs on every PR and posts results as inline code annotations. There is no separate dashboard to check; findings appear directly in the diff view.

# .github/workflows/semgrep.yml
name: Semgrep SAST

on:
  pull_request:
    branches: [main, develop]
  push:
    branches: [main]

jobs:
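  semgrep:
    name: Semgrep Scan
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # required by the SARIF upload step
    container:
      image: semgrep/semgrep
    steps:
      - uses: actions/checkout@v4

      # SEMGREP_APP_TOKEN is deliberately not set: a logged-in `semgrep ci`
      # takes its rules from the Semgrep platform and rejects --config.
      - name: Run Semgrep
        run: |
          semgrep ci \
            --config .semgrep.yml \
            --config p/secrets \
            --sarif \
            --output semgrep.sarif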

      - name: Upload SARIF to GitHub
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: semgrep.sarif

The SARIF upload is the key piece. GitHub renders SARIF findings as inline annotations on the PR diff — developers see the finding exactly where the vulnerable line is, with the rule explanation.
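
Because SARIF is plain JSON, the same file can also be inspected locally before it is uploaded. The sketch below is one possible way to count findings per rule and severity level. It relies only on fields defined by the SARIF 2.1.0 schema (runs, results, ruleId, level, locations); the function names are made up for this example.

```python
import json
from collections import Counter


def summarize_sarif(sarif: dict) -> Counter:
    """Count findings per (ruleId, level) across all runs in a SARIF log."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            rule_id = result.get("ruleId", "<unknown>")
            # `level` is optional on a result, so fall back to a placeholder.
            level = result.get("level", "unspecified")
            counts[(rule_id, level)] += 1
    return counts


def first_location(result: dict) -> str:
    """Return "path:line" for a result's first location, if it has one."""
    locations = result.get("locations") or []
    if not locations:
        return "<no location>"
    physical = locations[0].get("physicalLocation", {})
    uri = physical.get("artifactLocation", {}).get("uri", "<unknown>")
    line = physical.get("region", {}).get("startLine", "?")
    return f"{uri}:{line}"


def load_sarif(path: str) -> dict:
    """Read a SARIF file from disk, e.g. the semgrep.sarif written in CI."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A call such as summarize_sarif(load_sarif("semgrep.sarif")).most_common(5) would list the noisiest rules first, which is a quick way to spot a rule that needs tuning.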

Step 3: Baseline Suppression

Before turning on blocking enforcement, run the scan against your entire codebase and generate a baseline:

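semgrep scan --config .semgrep.yml --json --output baseline.json

Commit baseline.json to the repo as a snapshot of the existing backlog, so it can be triaged on its own schedule. Semgrep does not read this file during CI; the diff-aware behavior comes from the --baseline-commit flag. Configure the CI job to fail only on new findings introduced in the current PR, not on the existing backlog. This is critical. If you block PRs on pre-existing findings, developers will fight the tool, not fix the code.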

- uses: actions/checkout@v4
  with:
    fetch-depth: 0  # full history, so origin/main can be resolved

- name: Run Semgrep (diff-aware)
  run: |
    semgrep ci \
      --config .semgrep.yml \
      --baseline-commit origin/main

The --baseline-commit flag tells Semgrep to only report findings that are new relative to the target branch.
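
The idea behind "only new findings" can be approximated offline with two JSON reports, for example the committed baseline.json and a fresh scan of a branch. This is a rough sketch of the concept, not how Semgrep implements --baseline-commit, which does its own comparison against the baseline commit. It assumes the standard Semgrep JSON layout (a top-level results list whose entries carry check_id, path, and start.line); the function names are hypothetical.

```python
import json


def finding_key(result: dict) -> tuple:
    """Identify a finding by rule ID, file path, and start line.

    Assumption: this key is deliberately naive. A finding that merely
    moves to a different line will look new.
    """
    return (result["check_id"], result["path"], result["start"]["line"])


def new_findings(baseline: dict, current: dict) -> list:
    """Return results present in `current` but absent from `baseline`."""
    known = {finding_key(r) for r in baseline.get("results", [])}
    return [r for r in current.get("results", []) if finding_key(r) not in known]


def load_report(path: str) -> dict:
    """Read a Semgrep JSON report from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Running new_findings(load_report("baseline.json"), load_report("current.json")) gives the list a reviewer would care about. The line-number sensitivity noted in finding_key is a good reason to prefer the built-in flag for actual enforcement.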

Step 4: Triage False Positives Systematically

Even a well-tuned ruleset produces some false positives. Give developers a documented, lightweight way to suppress them:

# Inline suppression: the trailing comment names the rule ID to ignore on this line
user_input = sanitize(request.get("q"))  # nosemgrep: sql-injection-sink

Track suppression comments in your audit process. If a rule generates more than 2 suppressions per week, revisit the rule — it's probably too broad.
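
One lightweight way to track suppressions is to count them per rule directly from the source tree. The sketch below is an assumption about how such an audit could look, not the tooling used in this pipeline. It only recognizes suppressions that name a rule ID (including comma-separated lists), and the function names and default file suffixes are illustrative.

```python
import re
from collections import Counter
from pathlib import Path

# Matches "nosemgrep: rule-a, rule-b" and the shorter "nosem:" spelling.
# Bare "nosemgrep" comments without a rule ID are intentionally not counted.
SUPPRESSION = re.compile(r"\bnosem(?:grep)?:\s*([\w.\-]+(?:\s*,\s*[\w.\-]+)*)")


def count_suppressions(lines) -> Counter:
    """Count suppression comments per rule ID in an iterable of source lines."""
    counts: Counter = Counter()
    for line in lines:
        match = SUPPRESSION.search(line)
        if match:
            for rule_id in re.split(r"\s*,\s*", match.group(1)):
                counts[rule_id] += 1
    return counts


def count_in_tree(root: str, suffixes=(".py", ".js", ".java")) -> Counter:
    """Walk a directory and total the suppressions found in matching files."""
    totals: Counter = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            totals.update(count_suppressions(text.splitlines()))
    return totals
```

This reports totals at a point in time. To get a per-week rate, run it on a schedule and compare consecutive snapshots.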

Step 5: Metrics That Matter

After 3 months with this setup, the metrics that convinced leadership to expand the program:

Metric	Before	After
Findings per scan	~300	~8
Developer suppression rate	90%	12%
Mean time to fix HIGH findings	3 weeks	4 days
Scan duration	4 min	47 sec

The noise reduction from 300 to 8 is the number that matters most. Developers started treating Semgrep findings like compiler warnings — something to actually fix.

What I'd Do Differently

One mistake I made early: I added too many rules too fast after the initial trust was built. We went from 3 rule categories to 15 in one sprint, and noise jumped back up. The right cadence is one new rule category per quarter, with a retroactive false-positive review before adding the next.

The pipeline itself is a product. Treat it like one.