Security, Privacy & Compliance¶
How this project protects your data, your infrastructure, and your users.
This document covers security architecture, privacy principles, compliance posture, contributor guidelines, and deployment guidance — everything in one place, for everyone who needs it.
Who This Is For¶
| You are a... | Start with |
|---|---|
| User evaluating the tool | Principles, What We Protect Against |
| Administrator deploying it | Deployment Security, Configuration Reference |
| Developer contributing code | Contributor Security Rules, AI Agent Coding Rules |
| Compliance officer | Compliance Posture, Standards Alignment |
Principles¶
Six rules that govern every design decision:
-
Secure by default, permissive by configuration.
The tool ships safe out of the box. Advanced users unlock capabilities through explicit configuration — never the other way around. -
Never limit research capabilities for security theater.
Security controls must protect without reducing the quality, speed, or breadth of results. If a control blocks legitimate research, it becomes configurable — not mandatory. -
Compliance through architecture, not bolt-on checklists.
Data minimization, purpose limitation, and tenant isolation are structural properties of the codebase, not afterthoughts added for an audit. -
STDIO is zero-trust-by-default.
The most common deployment (local CLI via Claude Code, Cursor, etc.) has no network listener, no auth to misconfigure, and no attack surface. Multi-tenant security features only activate in HTTP mode. -
Simple things should be simple; complex things should be possible.
A developer runninggo build && ./web-researcher-mcpshould not need to configure OAuth, encryption keys, or compliance profiles. An enterprise deploying for 10,000 users should have everything they need. -
Elegant code is secure code.
Short, readable, well-tested functions are easier to audit and harder to exploit than clever abstractions. Prefer Go's standard library over third- party dependencies. One clear implementation over pluggable frameworks.
What We Protect Against¶
This server operates in a unique threat environment:
| Threat | Severity | Defense |
|---|---|---|
| SSRF (Server-Side Request Forgery) | Critical | Custom DialContext validating all resolved IPs against private/reserved CIDR blocks + cloud-metadata hostnames + DNS rebinding prevention |
| Prompt injection via scraped content | High | Content sanitization, size limits, and a "trust": "untrusted-external-content" boundary marker in the JSON envelope (the host enforces the prompt boundary — the model lives there) |
| Cost abuse via API key theft | High | Rate limiting (global + per-tenant + daily quota), circuit breakers |
| Cross-tenant data leakage | High | Namespace isolation, per-tenant sessions, configurable cache isolation |
| Denial of Service | High | HTTP timeouts, request size limits, bounded concurrency (semaphore) |
| Cloud metadata credential theft | High | Hostname and IP blocklist for IMDS endpoints (AWS, GCP, Azure, etc.) |
| Supply chain compromise | Medium | cosign-signed binaries, SBOM, CodeQL, govulncheck, dependency pinning |
| Unauthorized access (HTTP mode) | Medium | OAuth 2.1 JWT validation, JWKS auto-refresh, token revocation |
Architecture at a Glance¶
User/LLM → [STDIO or HTTP+TLS] → MCP Server
│
┌─────────────────────┴──────────────────────┐
│ │
Authentication Tool Handler
(JWT/JWKS if HTTP) (search, scrape, etc.)
│ │
Rate Limiting Input Validation
(token bucket) (URL scheme, body size)
│ │
Audit Logging SSRF Protection
(async, structured) (IP validation, DNS pin)
│ │
Tenant Isolation Content Sanitization
(per-session, cache) (bluemonday, size limits)
│ │
Metrics Circuit Breaker
(Prometheus) (per-provider)
│ │
└──────────┬───────────────┘
│
Encrypted Cache
(AES-256-GCM, TTL)
All components are wired through the tools.Dependencies struct — zero global
state, fully testable, every dependency swappable via interfaces.
Defense Layers (Technical Detail)¶
SSRF Protection¶
The highest-severity risk for a scraping server. Our defense:
- Check hostname against blocklist (cloud IMDS endpoints)
- Resolve DNS
- Validate ALL resolved IPs against private/reserved ranges
- Connect directly to the resolved IP (prevents DNS rebinding)
- Re-validate on each redirect hop (max 5 redirects)
Blocked ranges: RFC 1918, link-local (169.254.0.0/16), loopback, multicast,
carrier-grade NAT, documentation ranges, IPv6 equivalents. See
internal/scraper/ssrf.go for the canonical list.
Escape hatch: ALLOW_PRIVATE_IPS=true for development/intranet scraping.
Domain allowlist: ALLOWED_DOMAINS=a.com,b.com for enterprise lock-down.
Content Security¶
Scraped content passes through a configurable sanitization pipeline:
- Default mode: HTML sanitization (bluemonday allowlist — strips scripts, iframes, event handlers), hidden content removal (display:none, zero-width chars), size enforcement (50KB per source, 300KB total)
- Raw mode (
mode: "raw",scrape_pageonly): Returns the fetched bytes verbatim — scripts, styles, and markup intact — for inspecting source such as JSON, HTML, or JavaScript (code analysis, security research, web development). Onlycontent.Processis skipped; the request still passes throughvalidateScrapeURL, the SSRF-safe client, theALLOWED_DOMAINSallowlist, and anio.LimitReaderbounded bymax_length. The returned content is UNTRUSTED (it may carry indirect prompt-injection payloads), so callers must never execute or render it.search_and_scrapehas no raw mode and is always sanitized. - Size limits: Configurable per-request via
maxLengthparameter. The LLM decides how much content it needs based on context window and task. Defaults protect against context flooding; explicit overrides serve legitimate research needs (analyzing large codebases, full-page audits). - Trust boundary marker: every scrape response carries
"trust": "untrusted-external-content"in the JSON envelope (not inside thecontentstring, where a page could forge it) — an explicit signal to treatcontentas data, never as instructions. Alongside it,contentTypereports the form: in sanitized modes the extracted type (e.g.text/markdown); in raw mode the server's realContent-Typeheader (which may be empty) with"raw": true. So downstream consumers always know both what they are handling and that it is untrusted.
The pipeline reduces prompt-injection exposure and context flooding by default, while allowing full access to page source when the research task requires it. One honest limit: the server cannot enforce the prompt boundary itself — the model and agent loop live in the host application, so neutralizing plain-text injection payloads is the host's job; the server's role is to sanitize, cap, and mark provenance so the host can. SSRF protection applies regardless of mode — the security boundary is what URLs you can reach, not what content you can read from them.
Authentication (HTTP mode)¶
OAuth 2.1 Resource Server pattern:
- RS256 JWT signature verification via JWKS endpoint
- Auto-refreshing key cache (configurable interval, default 1 hour)
- Audience and issuer validation
- Token revocation via JTI — in-memory by default, optionally backed by the
encrypted
persist.Storeso revocations survive restarts (fail-closed: a JTI is revoked if present in either layer) - Scope-based per-tool authorization (opt-in via
ENFORCE_SCOPES; see below) - Constant-time comparison for admin key authentication
- Rejects HS256 from external issuers (algorithm confusion prevention)
Scope-based authorization (RBAC). When ENFORCE_SCOPES=true, a token that
carries a scope/scp claim must hold one of tool:*, tool:<name>, or the
coarse research scope to invoke a tool (plus any REQUIRED_SCOPES). It is
permissive by design: ENFORCE_SCOPES=false (default) ignores scopes, and a
token with no scope claim is always allowed (backward-compatible). It fails
closed only on a present-but-insufficient scope. See Middleware.EnforceScopes
in internal/auth/middleware.go and the crosswalk in docs/SECURITY.md.
Session identifiers (accepted risk). Sequential-search session IDs are
static UUIDv4 values that do not rotate over the life of a session. This is an
accepted, documented risk: a session ID is a research-continuity handle, not an
authentication credential. Authorization is always derived from the validated
JWT (tenant/user/scope), and sessions are keyed by the compound
{tenantID}:{userID}:{sessionID} so a guessed or leaked session ID cannot cross a tenant
or user boundary or grant access without a valid token. Rotation was deliberately not
added to avoid breaking the recovery-after-context-loss workflow the IDs exist
to support.
STDIO mode: no authentication (the calling process is the trust boundary).
Rate Limiting (HTTP mode)¶
Three tiers preventing abuse while allowing legitimate high-throughput research:
| Tier | Default | Purpose |
|---|---|---|
| Global | 1000 req/s | Infrastructure protection |
| Per-tenant | 120 req/min | Fair use between tenants |
| Daily quota | 5000 req/day | Cost control |
All configurable. Returns 429 with Retry-After header.
Encryption¶
- At rest: AES-256-GCM with random nonces for cache and sessions (when
CACHE_ENCRYPTION_KEYis set). File permissions 0600. - In transit: TLS 1.2+ for all outbound connections (Go stdlib default). HSTS header for inbound HTTP.
- FIPS option: Build with
GOEXPERIMENT=boringcryptofor FIPS 140-2 validated cryptographic module.
Audit Logging¶
Every tool invocation produces a structured JSON event:
{
"timestamp": "2026-01-15T10:30:00Z",
"event_type": "tool_call",
"tenant_id": "acme-corp",
"user_id": "user@example.com",
"tool_name": "web_search",
"request_id": "550e8400-e29b-41d4-a716-446655440000",
"duration_ms": 342,
"success": true
}
Design: async channel-based (never blocks tool calls), swap-to-disk overflow (never drops events under normal load), configurable output (stderr or file).
What is NOT logged by default: raw query text (only the query length, an
integer, is recorded — unless AUDIT_INCLUDE_REQUEST_BODY=true, when the raw
query is recorded after MaskSecrets redaction), scraped content, and full
request parameters (PII risk). Audit metadata and upstream error messages pass
through audit.MaskSecrets so any credential echoed back by a provider is
redacted before it reaches a sink. Audit files rotate at AUDIT_MAX_BYTES and
are pruned after AUDIT_RETENTION_DAYS (default 180, clamped to [180, 3650]).
Privacy by Design¶
Data Minimization¶
We deliberately minimize what we store:
| Data type | Storage | Retention | Contains PII? |
|---|---|---|---|
| Cached search results | Keyed by query hash (not user ID) | TTL-based (hours) | Rarely |
| Cached scraped pages | Keyed by URL hash | TTL-based (hours) | Possibly |
| Research sessions | Per-tenant:session compound key | Session TTL (default 4h) | Query context |
| Audit logs | Structured JSON | Configurable (default 180 days) | User/tenant IDs only |
| Rate limit counters | Per-tenant in memory | Resets daily | Tenant ID only |
Recent-errors ring (diagnostics://errors/recent) |
Bounded in-memory ring, tenant-aware | None (memory-only, oldest overwritten) | No — causes redacted via audit.MaskSecrets; no query text, no full URLs |
Operator dashboard / diagnostics://health |
Derived on read from existing metrics + breaker state | None (computed per request) | No — aggregate-only, no per-user/per-query data |
The operator-observability surfaces added in the routing-observability work (_meta.routing on results, the diagnostics:// resources, and the admin-gated GET /dashboard) are deliberately privacy-minimal: routing detail exposes only the provider name (never upstream URLs, credentials, or breaker internals), the recent-errors ring is memory-only/bounded/redacted/tenant-scoped, and the dashboard is aggregate-only and admin-gated. None of them is a personal-data store, so none adds a GDPR processing obligation. See docs/DEPLOYMENT.md → Operator Observability and docs/TOOLS.md → Routing Provenance.
Purpose Limitation¶
Each feature declares its data processing purpose explicitly. Core principles:
- Primary purpose: always the user-facing function (search, scrape, analyze)
- Legitimate secondary purposes (when enabled): usage analytics for the user's own benefit, research session memory, trend insights — each opt-in and clearly disclosed
- Never allowed: selling data to third parties, training external models, undisclosed advertising, sharing across tenants without consent
When features use data for purposes beyond the immediate request (e.g., a search analytics dashboard showing trends over time), they are built with: informed activation, transparent data flows, configurable retention, and user-controlled deletion.
Right to Erasure¶
For HTTP multi-tenant deployments: the DELETE /admin/data endpoint erases
all personal data for a (tenant_id, user_id) subject across every registered
store (sessions, analytics, memory, workspace) and simultaneously withdraws
consent so processing cannot silently resume. GET /admin/data exports the
same scope for portability. For STDIO: data is local to your machine — delete
the cache directory. Any feature that stores data beyond the request lifecycle
provides a deletion mechanism.
User Insights Without Surveillance¶
The project offers features that help you understand your own research patterns — which tools you used, how often, and when (opt-in, consent-gated). These are designed as user-owned insights, not profiling:
- Data belongs to the user/tenant and is never shared across boundaries
- The user can view, export, and delete their own data at any time
- No behavioral predictions, scoring, or automated decisions about users
- No data leaves the user's control without explicit action
- STDIO mode stores everything locally — the user controls their own disk
- HTTP mode scopes all analytics to the authenticated tenant
The distinction: showing a user "here's what you searched for last month" is a productivity feature. Building a hidden model of user behavior to influence results without disclosure is profiling. We do the former, never the latter.
Deployment Security¶
STDIO Mode (default, zero-config)¶
No network listener. No attack surface. The binary communicates via stdin/stdout with the calling process. Security is provided by your OS (file permissions, process isolation).
Suitable for: individual developers, local AI assistants, testing.
HTTP Mode (multi-tenant)¶
Activated by setting PORT. Enables OAuth, rate limiting, CORS, and all
multi-tenant security features.
Minimum secure configuration requires: a port, OAuth issuer and audience for JWKS
validation, an allowed origins list, an encryption key for the cache and sessions,
an admin API key, and audit logging enabled with an output path. See
docs/DEPLOYMENT.md for exact variable names and defaults.
Production hardening additionally enables per-tenant cache isolation, sets
rate limits and daily quota, configures a domain allowlist if needed, and adjusts
scrape concurrency. See docs/DEPLOYMENT.md for all variables.
Healthcare deployments (HIPAA): if the server processes or caches content
containing Protected Health Information (PHI), enable encryption at rest, strict
per-tenant cache isolation, and audit logging with a retention path sufficient for
the 6-year HIPAA requirement (45 CFR 164.312(b) requires audit controls; AES-256-GCM
encryption renders a breach non-reportable under Safe Harbor). See
docs/DEPLOYMENT.md for exact variable names and defaults.
A Business Associate Agreement (BAA) is required between the entity deploying
this server and any covered entity whose data it processes. The server's
encryption at rest (AES-256-GCM), encryption in transit (TLS 1.2+), access
controls (OAuth + tenant isolation), and audit logging satisfy HIPAA Technical
Safeguards (45 CFR 164.312). The FIPS build option (GOEXPERIMENT=boringcrypto)
satisfies NIST SP 800-111 requirements referenced by HIPAA.
Container Security¶
The Docker image ships with full rendering capabilities:
- Non-root execution (UID 65534)
- Chromium for headless page rendering (JS-heavy sites, SPAs)
- Alpine base with ca-certificates and required runtime libraries
- Multi-stage build (no build tools in runtime image)
- All images signed with cosign (verify with
cosign verify)
The image includes Chromium because accurate research requires rendering
JavaScript-heavy pages. Chromium is launched with --no-sandbox; container-level
isolation (non-root UID 65534, network policies) provides the sandboxing boundary.
Chromium's attack surface is mitigated by: non-root execution, bounded concurrency
(single browser instance shared across goroutines; page concurrency capped by
MaxConcurrency), SSRF protection on all URLs before they reach the browser, and
container-level network restrictions.
Deployment recommendations:
- Mount cache directory as tmpfs or encrypted volume
- Apply network policies limiting egress to search API endpoints + Chromium download endpoints (for auto-update if enabled)
- Consider seccomp profile (see
deploy/directory when available) - For environments where Chromium is not desired: set
CHROME_PATH=disabledto skip browser-tier scraping entirely
Configuration Reference¶
The security- and privacy-relevant settings are grouped into these knobs — defaults and exact variable names live in .env.example and docs/DEPLOYMENT.md, the two authoritative homes (this guide does not duplicate them, so they cannot drift):
- Encryption at rest — a hex key turns on AES-256-GCM disk encryption for the cache, the TTL key/value store, and sessions; a previous-key slot enables zero-downtime rotation (decrypt-fallback + lazy re-encrypt).
- SSRF / scrape egress — toggles for allowing private/RFC1918 IPs and a domain allowlist that restricts scrape targets.
- CORS — an origin allowlist plus a fail-closed strict toggle (default deny on empty; see
docs/MIGRATION.md). - Auth & scopes (HTTP mode) — OAuth issuer/audience for JWKS validation, JWKS refresh interval, and optional per-tool scope enforcement.
- Admin surface — the admin key that gates every
/admin/*endpoint and the operator dashboard (constant-time compared; a deprecated alias is still accepted with a startup warning). - Tenant isolation — per-tenant cache isolation mode.
- Rate limiting (HTTP mode) — global/second, per-tenant/minute, per-tenant/day, and a pre-auth per-IP ceiling.
- Audit & observability — audit logging on/off, output path, buffer size, log level, and the Prometheus
/metricstoggle.
See docs/DEPLOYMENT.md for the table of every variable, its default, and its effect.
Compliance Posture¶
Prefer slides? The same story — how architecture, not paperwork, keeps this project aligned with 23 standards — is captured in a short visual deck: Compliance as Architecture (PDF · source). Each technical claim names the file that backs it — open any one and check.
What We Target¶
This project is designed to satisfy security and privacy requirements across multiple international frameworks simultaneously. We achieve this through architectural choices that inherently satisfy shared requirements across standards — not through per-framework checkbox exercises.
Standards Alignment¶
| Standard | Relevance | How we align |
|---|---|---|
| ISO 27001 | Foundation | Interface-driven architecture, access control, encryption, audit, supply chain |
| SOC 2 Type II | Enterprise trust | Audit logging, rate limiting, tenant isolation, change management |
| NIST CSF 2.0 | US enterprise | Govern/Identify/Protect/Detect/Respond/Recover mapped to controls |
| GDPR / UK GDPR | Privacy | Data minimization, purpose limitation, TTL caches; data-subject access/portability/erasure endpoints (/admin/data); consent record-verify-honor for regulated features |
| OWASP MCP Cheat Sheet | MCP-specific | SSRF protection, content sanitization, tool annotations, supply chain |
| OWASP Top 10 LLM (2025) | AI security | Prompt injection defense, bounded agency, supply chain verification |
| OWASP Agentic Top 10 (2026 draft) | AI agent security — tool-provider slice only | What this server owns: read-only-default tool annotations (writes non-destructive; deletion is a separate endpoint), JWT scope authorization, consent-gated + tenant-isolated + erasable memory/workspace, SSRF + HTML sanitization + untrusted-content marker, rate limits + circuit breakers + size caps, per-call audit with request-ID correlation. Host-owned (not us): goal/intent manipulation, cascading hallucination, multi-agent trust, the agent permission model — those live in the client that drives the tools. |
| NIST AI RMF | AI governance | Risk-aware design, transparency, continuous monitoring |
| EU AI Act | EU regulation | Transparency, accuracy, robustness, tiered compliance model |
| EU Cyber Resilience Act | Software supply chain | SBOM, signed releases, vulnerability handling, PSIRT, 5yr updates |
| NIS2 | EU critical infrastructure | Incident handling, supply chain, crypto, vulnerability management |
| FedRAMP | US government | FIPS crypto option, access control, audit, vulnerability scanning |
| UK Cyber Essentials | UK market access | Boundary protection, secure config, access control, patching |
| UK NCSC CAF v4.0 | UK critical infra | 14-principle cyber assessment (covers AI risks) |
| BSIMM | Security maturity | Code review, SAST, SCA, architecture analysis, vulnerability mgmt |
| HIPAA | US healthcare | Encryption (AES-256), audit controls, access controls, BAA support, breach notification |
| HITRUST CSF | Healthcare + cross-industry | Maps to 40+ frameworks; combined SOC 2 + HITRUST assessment |
| FIRST PSIRT | Vulnerability handling | Structured triage, remediation, and disclosure for CVEs |
| MITRE ATT&CK | Threat modeling | Security controls mapped to adversary techniques |
| Global CBPR | Cross-border privacy | Data transfer certification for APAC/Americas markets |
| IETF RFC 9700/9449 | OAuth security | Best current practice + DPoP proof-of-possession |
| CSA MCP Security Framework | MCP hardening | Provenance, runtime isolation, secrets, observability |
| NSA MCP Security Guidance | Government/military | Message signing, per-call scoping, trust chains |
What Compliance Means for This Project¶
We build compliance into the architecture, not as an afterthought. Our approach uses a tiered compliance model that scales with capability:
Tier 1 — Core retrieval (always active):
Search, scrape, extract, and return web content. Security controls (SSRF,
encryption, audit, rate limiting) protect infrastructure. Privacy controls
(data minimization, TTL caches, no cross-tenant leakage) protect users. This
tier satisfies the majority of regulatory requirements automatically.
Tier 2 — User-facing analytics and insights (when activated):
Search history, topic trends, usage dashboards, session memory. These features
give users visibility into their own research patterns. Built with: opt-in
activation, user-owned data, tenant-scoped isolation, configurable retention,
full deletion capability. Satisfies GDPR legitimate interest (user's own
benefit) with transparency and control.
Tier 3 — Machine-formatted output (when activated):
The server does not run any LLM or generate prose — synthesis is the client
model's job (server-side summarization was deliberately not built; see #94).
The only machine-shaped output is deterministic generative-UI components
(GENERATIVE_UI_ENABLED): source cards and a quality-comparison table built by
a deterministic transform of already-extracted data, plus consolidated
bibliographies. Built with: a non-bypassable machine-readable marker
("autoFormatted": true, label "mcp-auto-formatted" — explicitly NOT
"AI-generated", because no model is involved), source attribution back to the
raw data, and raw content always present alongside. This transparency posture
aligns with EU AI Act Art. 50 labeling expectations even though no AI content
is produced.
Tier 4 — Personalization and recommendations (when activated):
Cross-session intelligence, personalized ranking, smart suggestions. Built
with: explicit consent, explanation capability, opt-out mechanism, bias
auditing. Satisfies GDPR Art. 22, Quebec Law 25 s.12.1, PIPL Art. 24.
Each tier adds compliance infrastructure proportional to its regulatory exposure. Lower tiers never require higher-tier infrastructure. Features always ship with their compliance requirements met — not after.
Our Compliance Principles (Evergreen)¶
These apply regardless of which features are active:
- Data belongs to the user/tenant. Always viewable, exportable, deletable.
- No hidden data flows. Every processing purpose is disclosed and justified.
- Opt-in for anything beyond the immediate request. Storing data, analyzing patterns, generating content — each requires explicit activation.
- Proportional controls. Simple features get simple compliance. Complex features get comprehensive governance. Nothing is over-engineered.
- Compliance infrastructure ships WITH the feature. Never bolt-on, never "we'll add consent management later."
- Read the code, not the marketing. Our compliance claims are verifiable in source code. Interfaces, tests, and architecture enforce what docs promise.
Operator & Hosted-Service Responsibilities¶
The tiered model and Compliance Principles above describe the technical controls this project ships. They are necessary but not sufficient: every framework on the Standards Alignment table also has an organizational, process half that a source repository cannot contain. "Aligned with" means this project provides the technical controls a standard requires — never that an organization has been audited and certified against it. A binary can't clear a hospital's HIPAA bar; it gives that review its technical-controls evidence.
Responsibility splits three ways.
Shared-responsibility model¶
| The project ships (in code) | The operator owns (process & controller role) | A hosted SaaS additionally owns (org program) |
|---|---|---|
SSRF-safe outbound client (internal/scraper/ssrf.go) |
Being the data controller for data they persist | Staff trained on the controls (e.g. HIPAA workforce training) |
AES-256-GCM at rest (opt-in via CACHE_ENCRYPTION_KEY) + TLS in transit |
Generating & rotating encryption / admin keys | Controls audited operating over a period (SOC 2 Type II: 3–12 mo + an auditor) |
Secrets-masked audit logging (internal/audit) |
Monitoring & alerting on the audit stream; setting the retention schedule (the code gives TTL knobs, not policy) | 24/7 incident-response capability and on-call |
| Tenant isolation, OAuth 2.1 / JWKS / revocation, rate limits | Access reviews; an incident-response runbook; executing breach notification | Signed customer DPAs / BAAs at scale; a named DPO |
Consent + erasure primitives (internal/consent, internal/datasubject, /admin/data) |
Choosing lawful basis; running DPIAs; maintaining a Record of Processing Activities (GDPR Art. 30) | The actual SOC 2 Type II / HITRUST / ISO 27001 audit engagement |
SBOM + cosign signatures + CodeQL + govulncheck + a PSIRT process (root SECURITY.md) |
Signing BAAs (HIPAA) / DPAs (GDPR) with covered entities and their own users | Management review, Statement of Applicability, continual improvement (ISO 27001 Clauses 4–10) |
To actually reach a given standard, the deploying entity must…¶
- SOC 2 Type II — engage an auditor to observe the shipped controls operating over a 3–12 month period, plus a management assertion. The code supplies the control evidence (OAuth, audit logs, change history); it is not the audited program.
- HIPAA — sign a Business Associate Agreement with each covered entity, train its workforce, run a risk analysis, and operate a breach-notification process. The project ships the Technical Safeguards (encryption, audit controls, access controls); the Administrative Safeguards are the operator's.
- GDPR / UK GDPR — name the data controller, choose and document a lawful
basis, run DPIAs where required, maintain a Record of Processing Activities
(Art. 30), and execute 72-hour breach reporting (Art. 33). The project ships the
data-subject-rights primitives (access/portability/erasure via
/admin/data, consent record-verify-honor) — the operator owns the controller obligations. - ISO 27001 — operate an Information Security Management System: documented scope, risk-treatment methodology, Statement of Applicability, internal-audit program, and management review (Clauses 4–10). The project satisfies the relevant Annex A technical controls only.
This split is not a disclaimer — it is the honest shape of "compliance as architecture." The repository can make the technical half provably true; the operating organization owns the process half. Both are required.
Supply Chain Security¶
What We Ship¶
Every release includes:
- Cross-platform binaries (Linux/macOS/Windows, amd64/arm64)
- Multi-arch Docker images (GHCR + Docker Hub)
- Software Bill of Materials (SPDX format)
- cosign signatures on all binaries and container images
- SHA-256 checksums
Verification¶
# 1. Verify the cosign signature on checksums.txt (keyless / Fulcio cert).
# The release signs checksums.txt — not each binary — then the checksum
# file vouches for every artifact.
cosign verify-blob \
--signature checksums.txt.sig \
--certificate checksums.txt.pem \
--certificate-identity-regexp 'https://github.com/zoharbabin/web-researcher-mcp' \
--certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
checksums.txt
# 2. Verify your downloaded artifacts against the (now-trusted) checksums.
sha256sum -c checksums.txt
# 3. Verify a container image signature directly.
cosign verify ghcr.io/zoharbabin/web-researcher-mcp:latest \
--certificate-identity-regexp 'https://github.com/zoharbabin/web-researcher-mcp' \
--certificate-oidc-issuer 'https://token.actions.githubusercontent.com'
Continuous Security Scanning¶
| Tool | What it checks | When |
|---|---|---|
govulncheck |
Known Go vulnerabilities | Every CI run (make vuln) |
gosec |
Go security scanner (injection, weak crypto, SSRF sinks, unsafe file ops) | Every CI run (make sec) |
golangci-lint |
Static analysis + lint rules | Every CI run (make lint) |
| CodeQL | Semantic code analysis (security-extended) | Every PR + weekly |
| Dependabot | Dependency version vulnerabilities | Continuous |
go mod verify |
Dependency integrity (checksum match) | Every build |
| cosign | Release artifact signatures | Every release |
| Syft | SBOM generation | Every release |
govulncheck, gosec, and golangci-lint are pinned as tool directives in
go.mod, so CI and local runs (make verify) use identical versions. The Go
toolchain version is pinned in go.mod.
Dependency Policy¶
- All dependencies: actively maintained, no unpatched CVEs
- Licenses: permissive only (MIT, Apache 2.0, BSD)
- Preference for Go standard library over third-party
go.sumpins exact dependency hashes- Minimum dependency footprint (fewer deps = smaller attack surface)
Contributor Security Rules¶
Every contributor (human or AI agent) must follow these rules. They are non-negotiable and enforced in code review.
The Rules¶
-
Never introduce OWASP Top 10 vulnerabilities.
No command injection, XSS, SQL injection, SSRF, path traversal, or insecure deserialization. If unsure whether code is safe, ask. -
Validate all external input at system boundaries.
Tool handler inputs, HTTP request parameters, environment variables, scraped content — validate at the boundary, trust within. -
Never log secrets.
API keys, tokens, encryption keys, and credentials must never appear in logs, error messages, or audit events. Even in debug mode. -
Errors are values, never panics.
ReturntoolError()orupstreamErrorResponse(). Neverpanic()in production paths. Never expose internal error details to clients. -
Encrypt sensitive data at rest.
Any new persistent storage of potentially-sensitive data must use the existing encryption infrastructure (cache.DiskCachewith GCM). -
Respect tenant boundaries.
Any new feature touching shared state must consider multi-tenant isolation. Ask: "Can tenant A see tenant B's data?" The answer must be no. -
Use the SSRF-safe client for all outbound HTTP.
Never usehttp.DefaultClientor&http.Client{}directly for fetching user-specified URLs. Always usescraper.NewSSRFSafeClient(). -
Add annotations to new tools.
Read tools must declarereadOnlyAnnotations(idempotent, openWorld). Write tools (those that mutate server-side state or trigger an external side-effect, e.g.memory_save,workspace_contribute,archive_source) must usewriteAnnotations(idempotent)instead. TheTestAllToolsHaveAnnotationstest enforces this in CI. -
Don't accumulate data beyond the request lifecycle.
New features should not store data indefinitely. Use TTLs. If long-term storage is genuinely needed, it requires explicit opt-in + retention policy. -
Keep the dependency footprint minimal.
Prefer standard library. Each new dependency is a supply chain risk. Justify in the PR description why stdlib isn't sufficient.
Security Review Triggers¶
These changes REQUIRE security-focused code review:
- Changes to
internal/auth/orinternal/scraper/ssrf.go - New outbound HTTP calls
- Changes to cache key generation or tenant isolation
- New environment variables accepting secrets
- Changes to the
Dockerfileor CI/CD pipeline - Any use of
unsafe,reflect, oros/exec
Testing Requirements¶
- All new code: unit tests with
t.Parallel() - Security-sensitive code: negative test cases (what happens with malicious input?)
- Race conditions:
go test -race ./...must pass - SSRF: test with private IPs, metadata endpoints, redirect chains
- Auth: test with expired/invalid/missing tokens
AI Agent Coding Rules¶
When AI coding agents (Claude Code, Copilot, Cursor, etc.) work on this codebase, they must follow the contributor rules above PLUS these additional constraints:
Security-First Coding¶
-
Never disable security checks to make tests pass or code compile. Fix the underlying issue instead.
-
Never use
--no-verifyon git commits. Pre-commit hooks exist for a reason. If a hook fails, investigate and fix. -
Never generate or guess URLs for fetching unless explicitly instructed. SSRF can be introduced through hardcoded URLs that happen to resolve to internal services.
-
Never add backdoors, debug endpoints, or admin shortcuts that bypass authentication or authorization. Even "temporary" ones.
-
Never commit secrets, API keys, or credentials. Even example values that look like real keys. Use obviously-fake placeholders:
your-key-here.
Secure Patterns to Follow¶
// DO: Use the SSRF-safe client
client := scraper.NewSSRFSafeClient(cfg.AllowPrivateIPs)
// DON'T: Create an unrestricted client
client := &http.Client{}
// DO: Return typed errors
return toolError("query is required")
// DON'T: Panic
panic("unexpected state")
// DO: Validate URL schemes
if parsed.Scheme != "http" && parsed.Scheme != "https" {
return toolError("URL must use http:// or https://")
}
// DON'T: Pass user input to os/exec
exec.Command("curl", userURL) // NEVER
// DO: Use constant-time comparison for secrets
subtle.ConstantTimeCompare([]byte(provided), []byte(expected))
// DON'T: Direct string comparison for auth
if provided == expected { // timing attack!
What AI Agents Must Check Before Submitting¶
- [ ] No new dependencies added without justification
- [ ] No
panic()calls in non-test code - [ ] No hardcoded IPs, URLs, or credentials
- [ ] No
http.DefaultClientusage for external URLs - [ ] No raw SQL or shell command construction from user input
- [ ]
go test -race ./...passes - [ ]
golangci-lint runpasses - [ ] New tools have annotations + tests
Vulnerability Management¶
Reporting¶
Report security vulnerabilities privately via GitHub Security Advisories.
Do not open public issues for security vulnerabilities.
| SLA | Timeline |
|---|---|
| Acknowledgment | 48 hours |
| Fix plan | 7 days |
| Patch release | 30 days (critical: 72 hours) |
How We Handle Vulnerabilities¶
Our vulnerability handling follows the FIRST PSIRT Services Framework:
- Receive — reports via GitHub Security Advisories or direct contact
- Triage — assess severity using CVSS v4.0, assign CWE identifier
- Remediate — develop and test fix, request CVE if applicable
- Disclose — coordinated disclosure with reporter, publish advisory
- Learn — post-mortem, update threat model, improve defenses
All published advisories include: CVSS v4.0 score, affected versions, CWE identifier, and mitigation guidance.
Threat Model References¶
Our security controls map to MITRE ATT&CK techniques:
| Control | Mitigates |
|---|---|
| SSRF protection | T1190 (Exploit Public-Facing App), T1557 (Adversary-in-the-Middle) |
| Input validation | T1059 (Command/Scripting Interpreter) |
| Content sanitization | T1059.007 (JavaScript), prompt injection vectors |
| Rate limiting | T1499 (Endpoint DoS) |
| Auth/JWKS | T1078 (Valid Accounts), T1550 (Use Alternate Auth Material) |
| Audit logging | T1070 (Indicator Removal) — structured, secret-redacted, request-ID-correlated events (incl. auth failures) aid detection & attribution. Note: logs are append-only JSON, not cryptographically tamper-evident; hash-chaining is a roadmap item (below). |
Roadmap Considerations¶
Security features planned or under consideration:¶
- DPoP token binding (RFC 9449) — proof-of-possession prevents token theft
- Hash-chained audit logs — tamper-evident logging for government deployments
- Breach notification pipeline — webhook alerting on security anomalies
- in-toto build attestations — full supply chain provenance (SLSA Level 3)
- Seccomp profiles — container syscall restriction for hardened deployments
- UK Cyber Essentials certification — UK public sector market access
- Global CBPR certification — cross-border data transfer for APAC markets
Architecture decisions that won't change:¶
- Zero global state (dependency injection via struct)
- Interface-driven design (swap implementations without touching callers)
- Read-only-default tool semantics (reads are the default; write tools exist but are non-destructive — deletion is always a separate admin endpoint, never a tool flag)
- STDIO as the zero-config default
- Go standard library preference over third-party
- Compliance proportional to activated features (tiered model)
Further Reading¶
| Document | Content |
|---|---|
docs/SECURITY.md |
Detailed technical security architecture (threat model, defense layers, crypto specs) |
docs/DEPLOYMENT.md |
Production deployment guide (Docker, K8s, env vars, scaling) |
docs/ERROR_HANDLING.md |
Error taxonomy and LLM-facing message design |
CONTRIBUTING.md |
Full contributor guide (setup, style, PR process) |
SECURITY.md (root) |
Vulnerability reporting policy |
.env.example |
All configuration options with descriptions |