How I Built a DevSecOps Pipeline with Semgrep + GitHub Actions
A practical walkthrough of embedding SAST into CI/CD using Semgrep and GitHub Actions: how to get from a noisy findings list to actionable signal that developers actually trust.
The Problem With Most SAST Integrations
Most teams bolt security scanning onto CI/CD as an afterthought — a job that runs, produces hundreds of findings, gets ignored, and eventually gets disabled. I've seen it happen at multiple companies. The failure mode isn't the tool; it's the integration strategy.
When I joined the AppSec team at my last role, we had a Semgrep scan running in CI. It was generating ~300 findings per scan. Developers had zero context on severity, zero ownership of findings, and zero trust in the tool. The scan was basically decorative.
This post documents how I rebuilt that integration from scratch into something developers actually use.
Why Semgrep Over Other SAST Tools
I've used SonarQube, Checkmarx, and Semgrep extensively. For embedding into developer workflows, Semgrep wins on three dimensions:
- Rules are code — you write rules in YAML with pattern syntax, version them in your repo, and review them like any other code change
- Fast — sub-60-second scans on most codebases; no waiting 20 minutes for a full analysis
- Low false-positive rate when tuned correctly — the registry rules are opinionated and accurate
The trade-off: Semgrep requires upfront rule curation. You can't just point it at a codebase and trust everything it reports.
Step 1: Start With a Focused Ruleset
Don't use p/default out of the box on a new integration. It will flood developers with noise. Instead, start with a narrow, high-confidence ruleset:
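    # .semgrep.yml
    rules:
      - id: hardcoded-secret
        patterns:
          - pattern: $VAR = "..."
          # Only flag variables whose name suggests a credential;
          # without this, every string assignment would match.
          - metavariable-regex:
              metavariable: $VAR
              regex: '(?i).*(password|passwd|secret|token|api_?key).*'
        message: Potential hardcoded credential in $VAR
        languages: [python, javascript, java]
        severity: ERROR
        metadata:
          category: security
          confidence: HIGH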
I start every new integration with three rule categories only:
- Hardcoded secrets and credentials
- SQL injection sinks
- Dangerous function calls (eval, exec, os.system)
These three categories have near-zero false-positive rates when tuned to your stack. Ship those first. Add more categories after developers trust the tool.
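To make the third category concrete, here is a minimal sketch of the kind of check a dangerous-function rule performs. It uses Python's standard `ast` module rather than Semgrep, so it illustrates the idea and not Semgrep's matching engine. The names `DANGEROUS`, `call_name`, and `find_dangerous_calls` are illustrative, not part of any tool.

```python
import ast

# Illustrative deny-list mirroring the category above (an assumption, not a Semgrep ruleset).
DANGEROUS = {"eval", "exec", "os.system"}


def call_name(node: ast.Call) -> str:
    """Return a dotted name such as 'os.system' for a call, or '' if it is not a simple name."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return f"{func.value.id}.{func.attr}"
    return ""


def find_dangerous_calls(source: str) -> list[tuple[int, str]]:
    """Return (line, name) pairs for every call to a deny-listed function."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = call_name(node)
            if name in DANGEROUS:
                hits.append((node.lineno, name))
    return sorted(hits)


if __name__ == "__main__":
    sample = "import os\nos.system('ls')\nprint(eval('1 + 1'))\n"
    for line, name in find_dangerous_calls(sample):
        print(f"line {line}: call to {name}")
```

A check this literal misses aliased imports such as `from os import system`, which is one reason to lean on a purpose-built tool instead of a homegrown script.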
Step 2: The GitHub Actions Workflow
Here's the core workflow I use. It runs on every PR and posts results as inline code annotations — no separate dashboard to check, findings appear directly in the diff view.
    # .github/workflows/semgrep.yml
    name: Semgrep SAST
    on:
      pull_request:
        branches: [main, develop]
      push:
        branches: [main]
    permissions:
      contents: read
      security-events: write  # needed by the SARIF upload step
    jobs:
      semgrep:
        name: Semgrep Scan
        runs-on: ubuntu-latest
        container:
          image: semgrep/semgrep
        steps:
          - uses: actions/checkout@v4
          - name: Run Semgrep
            # No SEMGREP_APP_TOKEN here: a logged-in `semgrep ci` takes its
            # rules from Semgrep's cloud platform and rejects --config.
            run: |
              semgrep ci \
                --config .semgrep.yml \
                --config p/secrets \
                --sarif \
                --output semgrep.sarif
          - name: Upload SARIF to GitHub
            uses: github/codeql-action/upload-sarif@v3
            if: always()  # upload even when Semgrep exits non-zero
            with:
              sarif_file: semgrep.sarif
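The SARIF upload is the key piece. GitHub code scanning renders SARIF findings as inline annotations on the PR diff, so developers see the finding exactly where the vulnerable line is, with the rule explanation. This depends on code scanning being available for the repository: it is free for public repositories, while private repositories generally need a paid GitHub security license.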
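When annotations don't show up, it helps to look at what the SARIF file actually contains before blaming GitHub. Here is a minimal sketch, assuming the standard SARIF 2.1.0 layout (`runs`, `results`, `locations`). The function names and the `SAMPLE` data are made up for illustration.

```python
import json


def sarif_findings(sarif: dict) -> list[dict]:
    """Flatten a SARIF 2.1.0 log into one dict per result."""
    findings = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Use the first reported location, if there is one.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            findings.append({
                "rule": result.get("ruleId", ""),
                "level": result.get("level", ""),
                "path": physical.get("artifactLocation", {}).get("uri", ""),
                "line": physical.get("region", {}).get("startLine"),
                "message": result.get("message", {}).get("text", ""),
            })
    return findings


def load_sarif(path: str) -> dict:
    """Read a SARIF file from disk, e.g. load_sarif('semgrep.sarif')."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Hypothetical sample, shaped like a one-finding SARIF log.
SAMPLE = {
    "version": "2.1.0",
    "runs": [{
        "results": [{
            "ruleId": "hardcoded-secret",
            "level": "error",
            "message": {"text": "Potential hardcoded credential"},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": "app/settings.py"},
                    "region": {"startLine": 12},
                }
            }],
        }],
    }],
}

if __name__ == "__main__":
    for f in sarif_findings(SAMPLE):
        print(f'{f["path"]}:{f["line"]} [{f["level"]}] {f["rule"]}: {f["message"]}')
```

If the paths printed here don't match paths in the repository, the annotations have nowhere to attach, which is a common reason for findings to be missing from the diff view.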
Step 3: Baseline Suppression
Before turning on blocking enforcement, run the scan against your entire codebase and record a baseline:
    semgrep scan --config .semgrep.yml --json --output baseline.json
Commit baseline.json to the repo as a snapshot of the existing backlog. Semgrep does not read this file back; it is there so you can track and burn down the backlog separately. Then configure the CI job to only fail on new findings introduced in the current PR, not the existing backlog. This is critical. If you block PRs on pre-existing findings, developers will fight the tool, not fix the code.
    - name: Run Semgrep (diff-aware)
      run: |
        semgrep ci \
          --config .semgrep.yml \
          --baseline-commit origin/main
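The --baseline-commit flag tells Semgrep to only report findings that are new relative to the given commit, here the tip of the target branch. That commit has to exist in the local clone: actions/checkout@v4 makes a shallow, single-commit checkout by default, so set fetch-depth: 0 on the checkout step (or fetch origin/main explicitly) before this step runs.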
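A quick way to size the backlog captured in baseline.json is to count findings per rule. Here is a minimal sketch, assuming the report has a top-level `results` list whose entries carry a `check_id` field, which is how Semgrep's JSON output is laid out as far as I know. `backlog_by_rule`, `load_report`, and the sample data are illustrative names.

```python
import json
from collections import Counter


def backlog_by_rule(report: dict) -> Counter:
    """Count findings per rule id in a Semgrep-style JSON report."""
    return Counter(r["check_id"] for r in report.get("results", []))


def load_report(path: str) -> dict:
    """Read a JSON report from disk, e.g. load_report('baseline.json')."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Hypothetical report with three findings across two rules.
SAMPLE_REPORT = {
    "results": [
        {"check_id": "hardcoded-secret", "path": "app/settings.py"},
        {"check_id": "hardcoded-secret", "path": "app/db.py"},
        {"check_id": "sql-injection-sink", "path": "app/db.py"},
    ]
}

if __name__ == "__main__":
    for rule_id, count in backlog_by_rule(SAMPLE_REPORT).most_common():
        print(f"{count:4d}  {rule_id}")
```

Rules that dominate this list are the first candidates for tuning or removal before enforcement goes live.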
Step 4: Triage False Positives Systematically
Even a well-tuned ruleset produces some false positives. Give developers a documented, lightweight way to suppress them:
    # nosemgrep: rule-id   ← inline suppression
    user_input = sanitize(request.get("q"))  # nosemgrep: sql-injection-sink
Track suppression comments in your audit process. If a rule generates more than 2 suppressions per week, revisit the rule — it's probably too broad.
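Tracking suppressions can be as simple as counting `nosemgrep` comments per rule id. Here is a minimal sketch, assuming suppressions are written as `nosemgrep: rule-id` with optional comma-separated ids. The regex, the function names, and the `.py` default are illustrative choices, and bare `nosemgrep` comments without a rule id are deliberately not counted.

```python
import re
from collections import Counter
from pathlib import Path

# Matches "nosemgrep: rule-a" or "nosemgrep: rule-a, rule-b" and captures the id list.
NOSEMGREP = re.compile(r"nosemgrep:\s*([\w.\-]+(?:\s*,\s*[\w.\-]+)*)")


def count_suppressions(text: str) -> Counter:
    """Count suppression comments per rule id in one blob of source text."""
    counts = Counter()
    for match in NOSEMGREP.finditer(text):
        for rule_id in re.split(r"\s*,\s*", match.group(1)):
            counts[rule_id] += 1
    return counts


def count_in_tree(root: str, suffix: str = ".py") -> Counter:
    """Sum suppression counts over every file under root with the given suffix."""
    total = Counter()
    for path in Path(root).rglob(f"*{suffix}"):
        total += count_suppressions(path.read_text(encoding="utf-8", errors="ignore"))
    return total


if __name__ == "__main__":
    sample = (
        'q = sanitize(request.get("q"))  # nosemgrep: sql-injection-sink\n'
        'key = "test"  # nosemgrep: hardcoded-secret, sql-injection-sink\n'
    )
    for rule_id, count in count_suppressions(sample).most_common():
        print(f"{rule_id}: {count}")
```

Running this on a schedule and comparing the totals week over week gives the per-rule suppression rate mentioned above.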
Step 5: Metrics That Matter
After 3 months with this setup, the metrics that convinced leadership to expand the program:
| Metric | Before | After |
|---|---|---|
| Findings per scan | ~300 | ~8 |
| Developer suppression rate | 90% | 12% |
| Mean time to fix HIGH findings | 3 weeks | 4 days |
| Scan duration | 4 min | 47 sec |
The noise reduction from 300 to 8 is the number that matters most. Developers started treating Semgrep findings like compiler warnings — something to actually fix.
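For a leadership deck, relative reductions are often easier to read than raw before/after pairs. Here is a small sketch of the arithmetic using the table values. The unit conversions (3 weeks as 21 days, 4 min as 240 s) are my assumptions, and since the findings counts are approximate the percentages are too.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from before to after, as a percentage rounded to 0.1."""
    return round((before - after) / before * 100, 1)


# (before, after) pairs taken from the table; units normalised by assumption.
METRICS = {
    "Findings per scan": (300, 8),
    "Developer suppression rate (%)": (90, 12),
    "Mean time to fix HIGH findings (days)": (21, 4),  # 3 weeks taken as 21 days
    "Scan duration (s)": (240, 47),                    # 4 min taken as 240 s
}

if __name__ == "__main__":
    for name, (before, after) in METRICS.items():
        print(f"{name}: {percent_reduction(before, after)}% lower")
```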
What I'd Do Differently
One mistake I made early: I added too many rules too fast after the initial trust was built. We went from 3 rule categories to 15 in one sprint, and noise jumped back up. The right cadence is one new rule category per quarter, with a retrospective false-positive review of the existing rules before adding the next.
The pipeline itself is a product. Treat it like one.