Security, Privacy & Compliance¶

How this project protects your data, your infrastructure, and your users.

This document covers security architecture, privacy principles, compliance posture, contributor guidelines, and deployment guidance — everything in one place, for everyone who needs it.

Who This Is For¶

You are a...	Start with
User evaluating the tool	Principles, What We Protect Against
Administrator deploying it	Deployment Security, Configuration Reference
Developer contributing code	Contributor Security Rules, AI Agent Coding Rules
Compliance officer	Compliance Posture, Standards Alignment

Principles¶

Six rules that govern every design decision:

Secure by default, permissive by configuration.
The tool ships safe out of the box. Advanced users unlock capabilities through explicit configuration — never the other way around.
Never limit research capabilities for security theater.
Security controls must protect without reducing the quality, speed, or breadth of results. If a control blocks legitimate research, it becomes configurable — not mandatory.
Compliance through architecture, not bolt-on checklists.
Data minimization, purpose limitation, and tenant isolation are structural properties of the codebase, not afterthoughts added for an audit.
STDIO is zero-trust-by-default.
The most common deployment (local CLI via Claude Code, Cursor, etc.) has no network listener, no auth to misconfigure, and no network attack surface. Multi-tenant security features only activate in HTTP mode.
Simple things should be simple; complex things should be possible.
A developer running go build && ./web-researcher-mcp should not need to configure OAuth, encryption keys, or compliance profiles. An enterprise deploying for 10,000 users should have everything they need.
Elegant code is secure code.
Short, readable, well-tested functions are easier to audit and harder to exploit than clever abstractions. Prefer Go's standard library over third-party dependencies. Prefer one clear implementation over pluggable frameworks.

What We Protect Against¶

A server that fetches arbitrary URLs and hands their content to an LLM faces a distinct set of threats:

Threat	Severity	Defense
SSRF (Server-Side Request Forgery)	Critical	Custom `DialContext` validating all resolved IPs against private/reserved CIDR blocks + cloud-metadata hostnames + DNS rebinding prevention
Prompt injection via scraped content	High	Content sanitization, size limits, and a `"trust": "untrusted-external-content"` boundary marker in the JSON envelope (the host enforces the prompt boundary — the model lives there)
Cost abuse via API key theft	High	Rate limiting (global + per-tenant + daily quota), circuit breakers
Cross-tenant data leakage	High	Namespace isolation, per-tenant sessions, configurable cache isolation
Denial of Service	High	HTTP timeouts, request size limits, bounded concurrency (semaphore)
Cloud metadata credential theft	High	Hostname and IP blocklist for IMDS endpoints (AWS, GCP, Azure, etc.)
Supply chain compromise	Medium	cosign-signed binaries, SBOM, CodeQL, govulncheck, dependency pinning
Unauthorized access (HTTP mode)	Medium	OAuth 2.1 JWT validation, JWKS auto-refresh, token revocation
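
The "bounded concurrency (semaphore)" defense in the table can be sketched with a buffered channel. This is a minimal illustration, not the project's implementation; the type and method names are hypothetical.

```go
package main

import "context"

// semaphore caps in-flight work; the buffered channel's capacity is the limit.
// Hypothetical type, for illustration only.
type semaphore chan struct{}

func newSemaphore(limit int) semaphore { return make(semaphore, limit) }

// acquire waits for a free slot, or returns the context's error on
// cancellation or deadline so a stalled queue cannot hold callers forever.
func (s semaphore) acquire(ctx context.Context) error {
	select {
	case s <- struct{}{}:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// tryAcquire takes a slot only if one is free right now.
func (s semaphore) tryAcquire() bool {
	select {
	case s <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot taken by acquire or tryAcquire.
func (s semaphore) release() { <-s }
```

A handler would typically call acquire with the request context and defer release, so a burst of scrape requests queues up instead of spawning unbounded work.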

Architecture at a Glance¶

User/LLM → [STDIO or HTTP+TLS] → MCP Server
                                      │
                ┌─────────────────────┴────┐
                │                          │
          Authentication              Tool Handler
          (JWT/JWKS if HTTP)          (search, scrape, etc.)
                │                          │
          Rate Limiting               Input Validation
          (token bucket)              (URL scheme, body size)
                │                          │
          Audit Logging               SSRF Protection
          (async, structured)         (IP validation, DNS pin)
                │                          │
          Tenant Isolation            Content Sanitization
          (per-session, cache)        (bluemonday, size limits)
                │                          │
          Metrics                     Circuit Breaker
          (Prometheus)                (per-provider)
                │                          │
                └──────────┬───────────────┘
                           │
                    Encrypted Cache
                    (AES-256-GCM, TTL)

All components are wired through the tools.Dependencies struct — zero global state, fully testable, every dependency swappable via interfaces.
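
As a rough sketch of what wiring through a dependencies struct buys, the example below injects a search provider behind an interface and swaps in a fake for testing. The field, interface, and function names here are assumptions for illustration; the real tools.Dependencies struct may look different.

```go
package main

import "errors"

// searcher is a hypothetical interface standing in for a search provider.
type searcher interface {
	Search(query string) ([]string, error)
}

// dependencies bundles everything a handler needs, so nothing is read from
// package-level state. Hypothetical shape, not the real tools.Dependencies.
type dependencies struct {
	Search searcher
}

// handleSearch uses only what it is handed, which keeps it testable.
func handleSearch(d dependencies, query string) ([]string, error) {
	if query == "" {
		return nil, errors.New("query must not be empty")
	}
	return d.Search.Search(query)
}

// fakeSearcher is a test double: it records queries and returns canned results.
type fakeSearcher struct {
	seen    []string
	results []string
}

func (f *fakeSearcher) Search(query string) ([]string, error) {
	f.seen = append(f.seen, query)
	return f.results, nil
}
```

Because the handler never reaches for a global, a test can pass dependencies{Search: &fakeSearcher{...}} and assert on exactly what was called.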

Defense Layers (Technical Detail)¶

SSRF Protection¶

The highest-severity risk for a scraping server. Our defense:

Check hostname against blocklist (cloud IMDS endpoints)
Resolve DNS
Validate ALL resolved IPs against private/reserved ranges
Connect directly to the resolved IP (prevents DNS rebinding)
Re-validate on each redirect hop (max 5 redirects)

Blocked ranges: RFC 1918, link-local (169.254.0.0/16), loopback, multicast, carrier-grade NAT, documentation ranges, IPv6 equivalents. See internal/scraper/ssrf.go for the canonical list.

Escape hatch: ALLOW_PRIVATE_IPS=true for development/intranet scraping. Domain allowlist: ALLOWED_DOMAINS=a.com,b.com for enterprise lock-down.
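
The "validate ALL resolved IPs" step can be sketched with the standard net/netip package. The prefix list below is an illustrative subset chosen for this example; the canonical list lives in internal/scraper/ssrf.go and may differ, and the function names are hypothetical.

```go
package main

import "net/netip"

// blockedPrefixes is an illustrative subset of the ranges named above,
// not the project's canonical list.
var blockedPrefixes = []netip.Prefix{
	netip.MustParsePrefix("10.0.0.0/8"),     // RFC 1918
	netip.MustParsePrefix("172.16.0.0/12"),  // RFC 1918
	netip.MustParsePrefix("192.168.0.0/16"), // RFC 1918
	netip.MustParsePrefix("169.254.0.0/16"), // link-local, includes cloud IMDS
	netip.MustParsePrefix("127.0.0.0/8"),    // loopback
	netip.MustParsePrefix("100.64.0.0/10"),  // carrier-grade NAT
	netip.MustParsePrefix("224.0.0.0/4"),    // multicast
	netip.MustParsePrefix("192.0.2.0/24"),   // documentation range
	netip.MustParsePrefix("::1/128"),        // IPv6 loopback
	netip.MustParsePrefix("fc00::/7"),       // IPv6 unique local
	netip.MustParsePrefix("fe80::/10"),      // IPv6 link-local
}

// isBlocked reports whether one resolved address falls in a blocked range.
func isBlocked(addr netip.Addr) bool {
	addr = addr.Unmap() // treat ::ffff:10.0.0.1 the same as 10.0.0.1
	for _, p := range blockedPrefixes {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

// allAllowed rejects the whole hostname if any record is unparsable or
// blocked, and also when resolution returned nothing.
func allAllowed(resolved []string) bool {
	if len(resolved) == 0 {
		return false
	}
	for _, s := range resolved {
		addr, err := netip.ParseAddr(s)
		if err != nil || isBlocked(addr) {
			return false
		}
	}
	return true
}
```

Checking every record matters because an attacker-controlled DNS name can return one public and one private address; validating only the first would let the second through. The Unmap call covers IPv4-mapped IPv6 forms, which netip.Prefix.Contains would otherwise not match against IPv4 prefixes.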

Content Security¶

Scraped content passes through a configurable sanitization pipeline:

Default mode: HTML sanitization (bluemonday allowlist — strips scripts, iframes, event handlers), hidden content removal (display:none, zero-width chars), size enforcement (50KB per source, 300KB total)
Raw mode (mode: "raw", scrape_page only): Returns the fetched bytes verbatim — scripts, styles, and markup intact — for inspecting source such as JSON, HTML, or JavaScript (code analysis, security research, web development). Only content.Process is skipped; the request still passes through validateScrapeURL, the SSRF-safe client, the ALLOWED_DOMAINS allowlist, and an io.LimitReader bounded by max_length. The returned content is UNTRUSTED (it may carry indirect prompt-injection payloads), so callers must never execute or render it. search_and_scrape has no raw mode and is always sanitized.
Size limits: Configurable per-request via maxLength parameter. The LLM decides how much content it needs based on context window and task. Defaults protect against context flooding; explicit overrides serve legitimate research needs (analyzing large codebases, full-page audits).
Trust boundary marker: every scrape response carries "trust": "untrusted-external-content" in the JSON envelope (not inside the content string, where a page could forge it) — an explicit signal to treat content as data, never as instructions. Alongside it, contentType reports the form: in sanitized modes the extracted type (e.g. text/markdown); in raw mode the server's real Content-Type header (which may be empty) with "raw": true. So downstream consumers always know both what they are handling and that it is untrusted.

The pipeline reduces prompt-injection exposure and context flooding by default, while allowing full access to page source when the research task requires it. One honest limit: the server cannot enforce the prompt boundary itself — the model and agent loop live in the host application, so neutralizing plain-text injection payloads is the host's job; the server's role is to sanitize, cap, and mark provenance so the host can. SSRF protection applies regardless of mode — the security boundary is what URLs you can reach, not what content you can read from them.
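
To illustrate why the marker sits in the envelope and not in the content string, here is a minimal sketch. The struct is a hypothetical shape: only the "trust" value, contentType, and the raw flag come from the text above, and the "content" field name is an assumption.

```go
package main

import "encoding/json"

const untrusted = "untrusted-external-content"

// scrapeEnvelope is a hypothetical response shape for illustration.
type scrapeEnvelope struct {
	Trust       string `json:"trust"`
	ContentType string `json:"contentType"`
	Raw         bool   `json:"raw,omitempty"`
	Content     string `json:"content"`
}

// wrap places page text in the content field and sets the marker at the
// envelope level, where page text cannot overwrite it.
func wrap(content, contentType string, raw bool) ([]byte, error) {
	return json.Marshal(scrapeEnvelope{
		Trust:       untrusted,
		ContentType: contentType,
		Raw:         raw,
		Content:     content,
	})
}

// trustOf decodes an envelope and returns the envelope-level marker.
func trustOf(data []byte) (string, error) {
	var e scrapeEnvelope
	if err := json.Unmarshal(data, &e); err != nil {
		return "", err
	}
	return e.Trust, nil
}
```

Even if a page body contains text such as {"trust":"trusted"}, the encoder escapes it as part of the content string, so a consumer reading the envelope field still sees the server's marker.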

Authentication (HTTP mode)¶

OAuth 2.1 Resource Server pattern:

RS256 JWT signature verification via JWKS endpoint
Auto-refreshing key cache (configurable interval, default 1 hour)
Audience and issuer validation
Token revocation via JTI — in-memory by default, optionally backed by the encrypted persist.Store so revocations survive restarts (fail-closed: a JTI is revoked if present in either layer)
Scope-based per-tool authorization (opt-in via ENFORCE_SCOPES; see below)
Constant-time comparison for admin key authentication
Rejects HS256 from external issuers (algorithm confusion prevention)
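
The constant-time comparison for the admin key can be sketched with crypto/subtle. This is one common way to do it, assuming the key arrives as a string; the function name is hypothetical and the project's actual code may differ.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
)

// adminKeyMatches compares a presented key to the configured one without
// leaking, through timing, how many leading bytes matched. Hashing both
// sides first gives equal-length inputs, so length is not leaked either.
func adminKeyMatches(presented, configured string) bool {
	if configured == "" {
		return false // fail closed when no admin key is configured
	}
	p := sha256.Sum256([]byte(presented))
	c := sha256.Sum256([]byte(configured))
	return subtle.ConstantTimeCompare(p[:], c[:]) == 1
}
```

A plain == on strings can return as soon as the first differing byte is found, which is the timing signal this pattern avoids.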

Scope-based authorization (RBAC). When ENFORCE_SCOPES=true, a token that carries a scope/scp claim must hold one of tool:*, tool:<name>, or the coarse research scope to invoke a tool (plus any REQUIRED_SCOPES). It is permissive by design: ENFORCE_SCOPES=false (default) ignores scopes, and a token with no scope claim is always allowed (backward-compatible). It fails closed only on a present-but-insufficient scope. See Middleware.EnforceScopes in internal/auth/middleware.go and the crosswalk in docs/SECURITY.md.
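
The decision rule described above can be written as a small pure function. This sketch assumes the scope claim is a space-delimited string and leaves out REQUIRED_SCOPES; the function name is hypothetical, and Middleware.EnforceScopes in internal/auth/middleware.go remains the reference.

```go
package main

import "strings"

// scopeAllows mirrors the rule in the text: enforce is the ENFORCE_SCOPES
// setting, hasClaim says whether the token carried a scope/scp claim at all,
// and claim is its space-delimited value (an assumption of this sketch).
func scopeAllows(enforce, hasClaim bool, claim, tool string) bool {
	if !enforce || !hasClaim {
		return true // permissive by design: scopes ignored or absent
	}
	for _, s := range strings.Fields(claim) {
		if s == "tool:*" || s == "tool:"+tool || s == "research" {
			return true
		}
	}
	return false // present-but-insufficient scope fails closed
}
```

Note that a claim which is present but empty falls into the last branch and is denied, which matches the "present-but-insufficient" wording.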

Session identifiers (accepted risk). Sequential-search session IDs are static UUIDv4 values that do not rotate over the life of a session. This is an accepted, documented risk: a session ID is a research-continuity handle, not an authentication credential. Authorization is always derived from the validated JWT (tenant/user/scope), and sessions are keyed by the compound {tenantID}:{userID}:{sessionID} so a guessed or leaked session ID cannot cross a tenant or user boundary or grant access without a valid token. Rotation was deliberately not added to avoid breaking the recovery-after-context-loss workflow the IDs exist to support.
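
A small sketch shows why a leaked session ID alone is not enough. The store type is hypothetical, and the example assumes the IDs themselves never contain ":".

```go
package main

// sessionKey builds the compound key described above. This sketch assumes
// none of the IDs contain ":"; a real store would have to guarantee that.
func sessionKey(tenantID, userID, sessionID string) string {
	return tenantID + ":" + userID + ":" + sessionID
}

// sessionStore is a hypothetical in-memory store used only for illustration.
type sessionStore map[string]string

// lookup takes tenant and user from the validated JWT, never from the
// request body, so a caller cannot choose whose namespace is searched.
func (s sessionStore) lookup(tenantID, userID, sessionID string) (string, bool) {
	v, ok := s[sessionKey(tenantID, userID, sessionID)]
	return v, ok
}
```

With this keying, a request authenticated as another tenant or user that presents the same session ID simply misses, because the tenant and user parts of the key come from its own token.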

STDIO mode: no authentication (the calling process is the trust boundary).

Rate Limiting (HTTP mode)¶

Three tiers preventing abuse while allowing legitimate high-throughput research:

Tier	Default	Purpose
Global	1000 req/s	Infrastructure protection
Per-tenant	120 req/min	Fair use between tenants
Daily quota	5000 req/day	Cost control

All configurable. Returns 429 with Retry-After header.
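
For readers unfamiliar with token buckets, here is a minimal single-bucket sketch with an injected clock. It is not the project's limiter: the type and function names are hypothetical, and it is not safe for concurrent use.

```go
package main

import (
	"strconv"
	"time"
)

// bucket is a minimal token bucket. A real limiter would guard it with a
// mutex and keep one bucket per tenant.
type bucket struct {
	capacity float64   // maximum burst size
	rate     float64   // tokens added per second
	tokens   float64   // tokens currently available
	last     time.Time // time of the previous refill
}

func newBucket(capacity, ratePerSecond float64, now time.Time) *bucket {
	return &bucket{capacity: capacity, rate: ratePerSecond, tokens: capacity, last: now}
}

// allow refills for the elapsed time, then spends one token if available.
// When the bucket is empty it reports how long until the next token.
func (b *bucket) allow(now time.Time) (bool, time.Duration) {
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = now
	}
	if b.tokens >= 1 {
		b.tokens--
		return true, 0
	}
	return false, time.Duration((1 - b.tokens) / b.rate * float64(time.Second))
}

// retryAfter renders a wait as whole seconds, rounded up, which is the
// delta-seconds form a Retry-After header accepts.
func retryAfter(wait time.Duration) string {
	return strconv.Itoa(int((wait + time.Second - 1) / time.Second))
}
```

Under the assumption that the burst size equals the per-minute figure, the per-tenant default of 120 req/min would correspond to newBucket(120, 2, time.Now()). A denied request would then be answered with status 429 and the retryAfter value in the Retry-After header.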

Encryption¶

At rest: AES-256-GCM with random nonces for cache and sessions (when CACHE_ENCRYPTION_KEY is set). File permissions 0600.
In transit: TLS 1.2+ for all outbound connections (Go stdlib default). HSTS header for inbound HTTP.
FIPS option: Build with GOEXPERIMENT=boringcrypto to use a FIPS 140-2 validated cryptographic module.
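
The at-rest scheme (AES-256-GCM with a fresh random nonce per record) looks roughly like this with the standard library. It is a sketch, not the project's cache code: the function names are hypothetical, and prepending the nonce to the ciphertext is an assumption about layout.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
)

// newGCM builds an AES-256-GCM AEAD and rejects shorter AES key sizes.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext under a fresh random nonce and returns
// nonce || ciphertext || tag.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open splits off the nonce, then decrypts and authenticates the rest.
// Any tampering with nonce, ciphertext, or tag yields an error.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}
```

GCM authenticates as well as encrypts, so a cache file modified on disk fails to open instead of returning corrupted content.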

Audit Logging¶

Every tool invocation produces a structured JSON event:

{
  "timestamp": "2026-01-15T10:30:00Z",
  "event_type": "tool_call",
  "tenant_id": "acme-corp",
  "user_id": "user@example.com",
  "tool_name": "web_search",
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "duration_ms": 342,
  "success": true
}

Design: async channel-based (never blocks tool calls), swap-to-disk overflow (never drops events under normal load), configurable output (stderr or file).

What is NOT logged by default: raw query text, scraped content, and full request parameters (PII risk). For queries, only the length (an integer) is recorded; setting AUDIT_INCLUDE_REQUEST_BODY=true records the raw query after MaskSecrets redaction. Audit metadata and upstream error messages pass through audit.MaskSecrets so any credential echoed back by a provider is redacted before it reaches a sink. Audit files rotate at AUDIT_MAX_BYTES and are pruned after AUDIT_RETENTION_DAYS (default 180, clamped to [180, 3650]).
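
The async, channel-based design can be sketched as a buffered channel drained by one writer goroutine. This is a simplified illustration with hypothetical names: where the real logger spills to disk on overflow, this sketch only reports that the buffer was full.

```go
package main

import (
	"encoding/json"
	"io"
	"sync"
)

// auditEvent mirrors the fields of the JSON event shown above.
type auditEvent struct {
	Timestamp  string `json:"timestamp"`
	EventType  string `json:"event_type"`
	TenantID   string `json:"tenant_id"`
	UserID     string `json:"user_id"`
	ToolName   string `json:"tool_name"`
	RequestID  string `json:"request_id"`
	DurationMS int64  `json:"duration_ms"`
	Success    bool   `json:"success"`
}

// auditLogger hands events to a single writer goroutine.
type auditLogger struct {
	events chan auditEvent
	done   sync.WaitGroup
}

func newAuditLogger(w io.Writer, buffer int) *auditLogger {
	l := &auditLogger{events: make(chan auditEvent, buffer)}
	l.done.Add(1)
	go func() {
		defer l.done.Done()
		enc := json.NewEncoder(w) // writes one JSON object per line
		for e := range l.events {
			_ = enc.Encode(e)
		}
	}()
	return l
}

// log never blocks the tool call. It returns false when the buffer is full;
// this sketch stops there, where the real design overflows to disk.
func (l *auditLogger) log(e auditEvent) bool {
	select {
	case l.events <- e:
		return true
	default:
		return false
	}
}

// close stops accepting events and waits for the writer to drain the buffer.
func (l *auditLogger) close() {
	close(l.events)
	l.done.Wait()
}

// memorySink is a stand-in for stderr or a file that keeps what was written.
type memorySink struct{ data []byte }

func (m *memorySink) Write(p []byte) (int, error) {
	m.data = append(m.data, p...)
	return len(p), nil
}
```

The select with a default branch is what keeps a slow sink from adding latency to tool calls: the caller either enqueues immediately or learns that the buffer is full.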

Privacy by Design¶

Data Minimization¶

We deliberately minimize what we store:

Data type	Storage	Retention	Contains PII?
Cached search results	Keyed by query hash (not user ID)	TTL-based (hours)	Rarely
Cached scraped pages	Keyed by URL hash	TTL-based (hours)	Possibly
Research sessions	Per-tenant:session compound key	Session TTL (default 4h)	Query context
Audit logs	Structured JSON	Configurable (default 180 days)	User/tenant IDs only
Rate limit counters	Per-tenant in memory	Resets daily	Tenant ID only
Recent-errors ring (`diagnostics://errors/recent`)	Bounded in-memory ring, tenant-aware	None (memory-only, oldest overwritten)	No — causes redacted via `audit.MaskSecrets`; no query text, no full URLs
Operator dashboard / `diagnostics://health`	Derived on read from existing metrics + breaker state	None (computed per request)	No — aggregate-only, no per-user/per-query data

The operator-observability surfaces added in the routing-observability work (_meta.routing on results, the diagnostics:// resources, and the admin-gated GET /dashboard) are deliberately privacy-minimal: routing detail exposes only the provider name (never upstream URLs, credentials, or breaker internals), the recent-errors ring is memory-only/bounded/redacted/tenant-scoped, and the dashboard is aggregate-only and admin-gated. None of them is a personal-data store, so none adds a GDPR processing obligation. See docs/DEPLOYMENT.md → Operator Observability and docs/TOOLS.md → Routing Provenance.

Purpose Limitation¶

Each feature declares its data processing purpose explicitly. Core principles:

Primary purpose: always the user-facing function (search, scrape, analyze)
Legitimate secondary purposes (when enabled): usage analytics for the user's own benefit, research session memory, trend insights — each opt-in and clearly disclosed
Never allowed: selling data to third parties, training external models, undisclosed advertising, sharing across tenants without consent

When features use data for purposes beyond the immediate request (e.g., a search analytics dashboard showing trends over time), they are built with: informed activation, transparent data flows, configurable retention, and user-controlled deletion.

Right to Erasure¶

For HTTP multi-tenant deployments: the DELETE /admin/data endpoint erases all personal data for a (tenant_id, user_id) subject across every registered store (sessions, analytics, memory, workspace) and simultaneously withdraws consent so processing cannot silently resume. GET /admin/data exports the same scope for portability. For STDIO: data is local to your machine — delete the cache directory. Any feature that stores data beyond the request lifecycle provides a deletion mechanism.
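
One way to picture erasure "across every registered store" is a small registry that fans a request out to each store. The interface, types, and the choice to withdraw consent before erasing are assumptions of this sketch, not a description of internal/datasubject.

```go
package main

import "errors"

// subjectStore is a hypothetical interface each persistent store implements.
type subjectStore interface {
	EraseSubject(tenantID, userID string) error
}

// memStore is an illustrative in-memory store keyed by "tenant:user".
type memStore map[string]string

func (m memStore) EraseSubject(tenantID, userID string) error {
	delete(m, tenantID+":"+userID)
	return nil
}

// eraser holds the registered stores and a consent record per subject.
type eraser struct {
	stores  []subjectStore
	consent map[string]bool
}

// erase withdraws consent first so processing cannot resume mid-erasure,
// then asks every store to delete the subject. It keeps going after a
// failure and returns all errors joined, or nil if every store succeeded.
func (e *eraser) erase(tenantID, userID string) error {
	delete(e.consent, tenantID+":"+userID)
	var errs []error
	for _, s := range e.stores {
		if err := s.EraseSubject(tenantID, userID); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}
```

Continuing past a failing store means one unavailable backend does not leave the others untouched; the joined error tells the operator which deletions still need a retry.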

User Insights Without Surveillance¶

The project offers features that help you understand your own research patterns — which tools you used, how often, and when (opt-in, consent-gated). These are designed as user-owned insights, not profiling:

Data belongs to the user/tenant and is never shared across boundaries
The user can view, export, and delete their own data at any time
No behavioral predictions, scoring, or automated decisions about users
No data leaves the user's control without explicit action
STDIO mode stores everything locally — the user controls their own disk
HTTP mode scopes all analytics to the authenticated tenant

The distinction: showing a user "here's what you searched for last month" is a productivity feature. Building a hidden model of user behavior to influence results without disclosure is profiling. We do the former, never the latter.

Deployment Security¶

STDIO Mode (default, zero-config)¶

No network listener, so no network attack surface. The binary communicates via stdin/stdout with the calling process. Security is provided by your OS (file permissions, process isolation).

Suitable for: individual developers, local AI assistants, testing.

HTTP Mode (multi-tenant)¶

Activated by setting PORT. Enables OAuth, rate limiting, CORS, and all multi-tenant security features.

Minimum secure configuration requires: a port, OAuth issuer and audience for JWKS validation, an allowed origins list, an encryption key for the cache and sessions, an admin API key, and audit logging enabled with an output path. See docs/DEPLOYMENT.md for exact variable names and defaults.

Production hardening additionally enables per-tenant cache isolation, sets rate limits and daily quota, configures a domain allowlist if needed, and adjusts scrape concurrency. See docs/DEPLOYMENT.md for all variables.

Healthcare deployments (HIPAA): if the server processes or caches content containing Protected Health Information (PHI), enable encryption at rest, strict per-tenant cache isolation, and audit logging with a retention period long enough for HIPAA's 6-year documentation requirement. 45 CFR 164.312(b) requires audit controls, and PHI encrypted with AES-256-GCM can qualify for the breach-notification safe harbor, provided the keys were not also compromised. See docs/DEPLOYMENT.md for exact variable names and defaults.

A Business Associate Agreement (BAA) is required between the entity deploying this server and any covered entity whose data it processes. The server's encryption at rest (AES-256-GCM), encryption in transit (TLS 1.2+), access controls (OAuth + tenant isolation), and audit logging satisfy HIPAA Technical Safeguards (45 CFR 164.312). The FIPS build option (GOEXPERIMENT=boringcrypto) satisfies NIST SP 800-111 requirements referenced by HIPAA.

Container Security¶

The Docker image ships with full rendering capabilities:

Non-root execution (UID 65534)
Chromium for headless page rendering (JS-heavy sites, SPAs)
Alpine base with ca-certificates and required runtime libraries
Multi-stage build (no build tools in runtime image)
All images signed with cosign (verify with cosign verify)

The image includes Chromium because accurate research requires rendering JavaScript-heavy pages. Chromium is launched with --no-sandbox; container-level isolation (non-root UID 65534, network policies) provides the sandboxing boundary. Chromium's attack surface is mitigated by: non-root execution, bounded concurrency (single browser instance shared across goroutines; page concurrency capped by MaxConcurrency), SSRF protection on all URLs before they reach the browser, and container-level network restrictions.

Deployment recommendations:

Mount cache directory as tmpfs or encrypted volume
Apply network policies limiting egress to search API endpoints + Chromium download endpoints (for auto-update if enabled)
Consider seccomp profile (see deploy/ directory when available)
For environments where Chromium is not desired: set CHROME_PATH=disabled to skip browser-tier scraping entirely

Configuration Reference¶

The security- and privacy-relevant settings are grouped into these knobs — defaults and exact variable names live in .env.example and docs/DEPLOYMENT.md, the two authoritative homes (this guide does not duplicate them, so they cannot drift):

Encryption at rest — a hex key turns on AES-256-GCM disk encryption for the cache, the TTL key/value store, and sessions; a previous-key slot enables zero-downtime rotation (decrypt-fallback + lazy re-encrypt).
SSRF / scrape egress — toggles for allowing private/RFC1918 IPs and a domain allowlist that restricts scrape targets.
CORS — an origin allowlist plus a fail-closed strict toggle (default deny on empty; see docs/MIGRATION.md).
Auth & scopes (HTTP mode) — OAuth issuer/audience for JWKS validation, JWKS refresh interval, and optional per-tool scope enforcement.
Admin surface — the admin key that gates every /admin/* endpoint and the operator dashboard (constant-time compared; a deprecated alias is still accepted with a startup warning).
Tenant isolation — per-tenant cache isolation mode.
Rate limiting (HTTP mode) — global/second, per-tenant/minute, per-tenant/day, and a pre-auth per-IP ceiling.
Audit & observability — audit logging on/off, output path, buffer size, log level, and the Prometheus /metrics toggle.

See docs/DEPLOYMENT.md for the table of every variable, its default, and its effect.

Compliance Posture¶

Prefer slides? The same story — how architecture, not paperwork, keeps this project aligned with 23 standards — is captured in a short visual deck: Compliance as Architecture (PDF · source). Each technical claim names the file that backs it — open any one and check.

What We Target¶

This project is designed to satisfy security and privacy requirements across multiple international frameworks simultaneously. We achieve this through architectural choices that inherently satisfy shared requirements across standards — not through per-framework checkbox exercises.

Standards Alignment¶

Standard	Relevance	How we align
ISO 27001	Foundation	Interface-driven architecture, access control, encryption, audit, supply chain
SOC 2 Type II	Enterprise trust	Audit logging, rate limiting, tenant isolation, change management
NIST CSF 2.0	US enterprise	Govern/Identify/Protect/Detect/Respond/Recover mapped to controls
GDPR / UK GDPR	Privacy	Data minimization, purpose limitation, TTL caches; data-subject access/portability/erasure endpoints (`/admin/data`); consent record-verify-honor for regulated features
OWASP MCP Cheat Sheet	MCP-specific	SSRF protection, content sanitization, tool annotations, supply chain
OWASP Top 10 LLM (2025)	AI security	Prompt injection defense, bounded agency, supply chain verification
OWASP Agentic Top 10 (2026 draft)	AI agent security — tool-provider slice only	What this server owns: read-only-default tool annotations (writes non-destructive; deletion is a separate endpoint), JWT scope authorization, consent-gated + tenant-isolated + erasable memory/workspace, SSRF + HTML sanitization + untrusted-content marker, rate limits + circuit breakers + size caps, per-call audit with request-ID correlation. Host-owned (not us): goal/intent manipulation, cascading hallucination, multi-agent trust, the agent permission model — those live in the client that drives the tools.
NIST AI RMF	AI governance	Risk-aware design, transparency, continuous monitoring
EU AI Act	EU regulation	Transparency, accuracy, robustness, tiered compliance model
EU Cyber Resilience Act	Software supply chain	SBOM, signed releases, vulnerability handling, PSIRT, 5yr updates
NIS2	EU critical infrastructure	Incident handling, supply chain, crypto, vulnerability management
FedRAMP	US government	FIPS crypto option, access control, audit, vulnerability scanning
UK Cyber Essentials	UK market access	Boundary protection, secure config, access control, patching
UK NCSC CAF v4.0	UK critical infra	14-principle cyber assessment (covers AI risks)
BSIMM	Security maturity	Code review, SAST, SCA, architecture analysis, vulnerability mgmt
HIPAA	US healthcare	Encryption (AES-256), audit controls, access controls, BAA support, breach notification
HITRUST CSF	Healthcare + cross-industry	Maps to 40+ frameworks; combined SOC 2 + HITRUST assessment
FIRST PSIRT	Vulnerability handling	Structured triage, remediation, and disclosure for CVEs
MITRE ATT&CK	Threat modeling	Security controls mapped to adversary techniques
Global CBPR	Cross-border privacy	Data transfer certification for APAC/Americas markets
IETF RFC 9700/9449	OAuth security	Best current practice + DPoP proof-of-possession
CSA MCP Security Framework	MCP hardening	Provenance, runtime isolation, secrets, observability
NSA MCP Security Guidance	Government/military	Message signing, per-call scoping, trust chains

What Compliance Means for This Project¶

We build compliance into the architecture, not as an afterthought. Our approach uses a tiered compliance model that scales with capability:

Tier 1 — Core retrieval (always active):
Search, scrape, extract, and return web content. Security controls (SSRF, encryption, audit, rate limiting) protect infrastructure. Privacy controls (data minimization, TTL caches, no cross-tenant leakage) protect users. This tier satisfies the majority of regulatory requirements automatically.

Tier 2 — User-facing analytics and insights (when activated):
Search history, topic trends, usage dashboards, session memory. These features give users visibility into their own research patterns. Built with: opt-in activation, user-owned data, tenant-scoped isolation, configurable retention, full deletion capability. Satisfies GDPR legitimate interest (user's own benefit) with transparency and control.

Tier 3 — Machine-formatted output (when activated):
The server does not run any LLM or generate prose — synthesis is the client model's job (server-side summarization was deliberately not built; see #94). The only machine-shaped output is deterministic generative-UI components (GENERATIVE_UI_ENABLED): source cards and a quality-comparison table built by a deterministic transform of already-extracted data, plus consolidated bibliographies. Built with: a non-bypassable machine-readable marker ("autoFormatted": true, label "mcp-auto-formatted" — explicitly NOT "AI-generated", because no model is involved), source attribution back to the raw data, and raw content always present alongside. This transparency posture aligns with EU AI Act Art. 50 labeling expectations even though no AI content is produced.

Tier 4 — Personalization and recommendations (when activated):
Cross-session intelligence, personalized ranking, smart suggestions. Built with: explicit consent, explanation capability, opt-out mechanism, bias auditing. Satisfies GDPR Art. 22, Quebec Law 25 s.12.1, PIPL Art. 24.

Each tier adds compliance infrastructure proportional to its regulatory exposure. Lower tiers never require higher-tier infrastructure. Features always ship with their compliance requirements met — not after.

Our Compliance Principles (Evergreen)¶

These apply regardless of which features are active:

Data belongs to the user/tenant. Always viewable, exportable, deletable.
No hidden data flows. Every processing purpose is disclosed and justified.
Opt-in for anything beyond the immediate request. Storing data, analyzing patterns, generating content — each requires explicit activation.
Proportional controls. Simple features get simple compliance. Complex features get comprehensive governance. Nothing is over-engineered.
Compliance infrastructure ships WITH the feature. Never bolt-on, never "we'll add consent management later."
Read the code, not the marketing. Our compliance claims are verifiable in source code. Interfaces, tests, and architecture enforce what docs promise.

Operator & Hosted-Service Responsibilities¶

The tiered model and Compliance Principles above describe the technical controls this project ships. They are necessary but not sufficient: every framework on the Standards Alignment table also has an organizational, process half that a source repository cannot contain. "Aligned with" means this project provides the technical controls a standard requires — never that an organization has been audited and certified against it. A binary can't clear a hospital's HIPAA bar; it gives that review its technical-controls evidence.

Responsibility splits three ways.

Shared-responsibility model¶

The project ships (in code)	The operator owns (process & controller role)	A hosted SaaS additionally owns (org program)
SSRF-safe outbound client (`internal/scraper/ssrf.go`)	Being the data controller for data they persist	Staff trained on the controls (e.g. HIPAA workforce training)
AES-256-GCM at rest (opt-in via `CACHE_ENCRYPTION_KEY`) + TLS in transit	Generating & rotating encryption / admin keys	Controls audited operating over a period (SOC 2 Type II: 3–12 mo + an auditor)
Secrets-masked audit logging (`internal/audit`)	Monitoring & alerting on the audit stream; setting the retention schedule (the code gives TTL knobs, not policy)	24/7 incident-response capability and on-call
Tenant isolation, OAuth 2.1 / JWKS / revocation, rate limits	Access reviews; an incident-response runbook; executing breach notification	Signed customer DPAs / BAAs at scale; a named DPO
Consent + erasure primitives (`internal/consent`, `internal/datasubject`, `/admin/data`)	Choosing lawful basis; running DPIAs; maintaining a Record of Processing Activities (GDPR Art. 30)	The actual SOC 2 Type II / HITRUST / ISO 27001 audit engagement
SBOM + cosign signatures + CodeQL + `govulncheck` + a PSIRT process (root `SECURITY.md`)	Signing BAAs (HIPAA) / DPAs (GDPR) with covered entities and their own users	Management review, Statement of Applicability, continual improvement (ISO 27001 Clauses 4–10)

To actually reach a given standard, the deploying entity must…¶

SOC 2 Type II — engage an auditor to observe the shipped controls operating over a 3–12 month period, plus a management assertion. The code supplies the control evidence (OAuth, audit logs, change history); it is not the audited program.
HIPAA — sign a Business Associate Agreement with each covered entity, train its workforce, run a risk analysis, and operate a breach-notification process. The project ships the Technical Safeguards (encryption, audit controls, access controls); the Administrative Safeguards are the operator's.
GDPR / UK GDPR — name the data controller, choose and document a lawful basis, run DPIAs where required, maintain a Record of Processing Activities (Art. 30), and execute 72-hour breach reporting (Art. 33). The project ships the data-subject-rights primitives (access/portability/erasure via /admin/data, consent record-verify-honor) — the operator owns the controller obligations.
ISO 27001 — operate an Information Security Management System: documented scope, risk-treatment methodology, Statement of Applicability, internal-audit program, and management review (Clauses 4–10). The project satisfies the relevant Annex A technical controls only.

This split is not a disclaimer — it is the honest shape of "compliance as architecture." The repository can make the technical half provably true; the operating organization owns the process half. Both are required.

Supply Chain Security¶

What We Ship¶

Every release includes:

Cross-platform binaries (Linux/macOS/Windows, amd64/arm64)
Multi-arch Docker images (GHCR + Docker Hub)
Software Bill of Materials (SPDX format)
cosign signatures on all binaries and container images
SHA-256 checksums

Verification¶

# 1. Verify the cosign signature on checksums.txt (keyless / Fulcio cert).
#    The release signs checksums.txt — not each binary — then the checksum
#    file vouches for every artifact.
cosign verify-blob \
  --signature checksums.txt.sig \
  --certificate checksums.txt.pem \
  --certificate-identity-regexp 'https://github.com/zoharbabin/web-researcher-mcp' \
  --certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
  checksums.txt

# 2. Verify your downloaded artifacts against the (now-trusted) checksums.
sha256sum -c checksums.txt

# 3. Verify a container image signature directly.
cosign verify ghcr.io/zoharbabin/web-researcher-mcp:latest \
  --certificate-identity-regexp 'https://github.com/zoharbabin/web-researcher-mcp' \
  --certificate-oidc-issuer 'https://token.actions.githubusercontent.com'

Continuous Security Scanning¶

Tool	What it checks	When
`govulncheck`	Known Go vulnerabilities	Every CI run (`make vuln`)
`gosec`	Go security scanner (injection, weak crypto, SSRF sinks, unsafe file ops)	Every CI run (`make sec`)
`golangci-lint`	Static analysis + lint rules	Every CI run (`make lint`)
CodeQL	Semantic code analysis (security-extended)	Every PR + weekly
Dependabot	Dependency version vulnerabilities	Continuous
`go mod verify`	Dependency integrity (checksum match)	Every build
cosign	Release artifact signatures	Every release
Syft	SBOM generation	Every release

govulncheck, gosec, and golangci-lint are pinned as tool directives in go.mod, so CI and local runs (make verify) use identical versions. The Go toolchain version is pinned in go.mod.

Dependency Policy¶

All dependencies: actively maintained, no unpatched CVEs
Licenses: permissive only (MIT, Apache 2.0, BSD)
Preference for Go standard library over third-party
go.sum pins exact dependency hashes
Minimum dependency footprint (fewer deps = smaller attack surface)

Contributor Security Rules¶

Every contributor (human or AI agent) must follow these rules. They are non-negotiable and enforced in code review.

The Rules¶

Never introduce OWASP Top 10 vulnerabilities.
No command injection, XSS, SQL injection, SSRF, path traversal, or insecure deserialization. If unsure whether code is safe, ask.
Validate all external input at system boundaries.
Tool handler inputs, HTTP request parameters, environment variables, scraped content — validate at the boundary, trust within.
Never log secrets.
API keys, tokens, encryption keys, and credentials must never appear in logs, error messages, or audit events. Even in debug mode.
Errors are values, never panics.
Return toolError() or upstreamErrorResponse(). Never panic() in production paths. Never expose internal error details to clients.
Encrypt sensitive data at rest.
Any new persistent storage of potentially-sensitive data must use the existing encryption infrastructure (cache.DiskCache for cached responses, persist.DiskStore for TTL key/value state, or the session store for research sessions — all AES-256-GCM).
Respect tenant boundaries.
Any new feature touching shared state must consider multi-tenant isolation. Ask: "Can tenant A see tenant B's data?" The answer must be no.
Use the SSRF-safe client for all outbound HTTP.
Never use http.DefaultClient or &http.Client{} directly for fetching user-specified URLs. Always use scraper.NewSSRFSafeClient().
Add annotations to new tools.
Read tools must declare readOnlyAnnotations(idempotent, openWorld). Write tools (those that mutate server-side state or trigger an external side-effect, e.g. memory_save, workspace_contribute, archive_source) must use writeAnnotations(idempotent) instead. The TestAllToolsHaveAnnotations test enforces this in CI.
Don't accumulate data beyond the request lifecycle.
New features should not store data indefinitely. Use TTLs. If long-term storage is genuinely needed, it requires explicit opt-in + retention policy.
Keep the dependency footprint minimal.
Prefer standard library. Each new dependency is a supply chain risk. Justify in the PR description why stdlib isn't sufficient.
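
As an illustration of the "Never log secrets" rule, the sketch below wraps a credential in a type that redacts itself when logged or formatted. The Secret type and its Reveal method are hypothetical names introduced for this example and are not part of the codebase; slog.LogValuer and fmt.Stringer are standard library interfaces. It covers slog and the %s, %v, and %#v verbs only, so treat it as one layer of defense rather than a complete guarantee.

```go
package main

import (
	"fmt"
	"log/slog"
	"os"
)

// Secret is a hypothetical wrapper type (not part of this codebase) that
// keeps a credential out of logs and formatted messages.
type Secret string

// LogValue implements slog.LogValuer, so slog handlers record the
// placeholder instead of the underlying value.
func (Secret) LogValue() slog.Value { return slog.StringValue("[REDACTED]") }

// String implements fmt.Stringer, covering %s and %v formatting.
func (Secret) String() string { return "[REDACTED]" }

// GoString implements fmt.GoStringer, covering %#v formatting.
func (Secret) GoString() string { return "[REDACTED]" }

// Reveal returns the raw value for the one call site that needs it,
// such as setting an Authorization header.
func (s Secret) Reveal() string { return string(s) }

func main() {
	key := Secret("your-key-here")

	// The JSON handler resolves LogValuer values before writing.
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
	logger.Info("provider configured", "api_key", key)

	fmt.Printf("key=%v\n", key) // key=[REDACTED]
}
```

Making the raw value reachable only through an explicit Reveal call means an accidental log statement fails safe, and every deliberate use is easy to find in review.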

Security Review Triggers¶

These changes REQUIRE security-focused code review:

Changes to internal/auth/ or internal/scraper/ssrf.go
New outbound HTTP calls
Changes to cache key generation or tenant isolation
New environment variables accepting secrets
Changes to the Dockerfile or CI/CD pipeline
Any use of unsafe, reflect, or os/exec

Testing Requirements¶

All new code: unit tests with t.Parallel()
Security-sensitive code: negative test cases (what happens with malicious input?)
Race conditions: go test -race ./... must pass
SSRF: test with private IPs, metadata endpoints, redirect chains
Auth: test with expired/invalid/missing tokens
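
A negative-test table for the SSRF requirement might look like the sketch below. isBlockedAddr is a hypothetical stand-in written for this example, not the filter in internal/scraper/ssrf.go; in the repository the same table would more naturally live in a _test.go file with t.Run and t.Parallel(). It is shown as a standalone program so it can be run as-is, and it does not cover redirect chains.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isBlockedAddr is a hypothetical stand-in for the project's SSRF filter.
// It rejects addresses that must never be fetched on behalf of a user.
func isBlockedAddr(s string) bool {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return true // fail closed on anything that is not a literal IP
	}
	addr = addr.Unmap() // treat ::ffff:10.0.0.1 like 10.0.0.1
	return addr.IsLoopback() || addr.IsPrivate() ||
		addr.IsLinkLocalUnicast() || addr.IsUnspecified() || addr.IsMulticast()
}

// ssrfCases lists malicious inputs first and one legitimate public
// address last, so a filter that blocks everything also fails.
var ssrfCases = []struct {
	name    string
	addr    string
	blocked bool
}{
	{"loopback v4", "127.0.0.1", true},
	{"loopback v6", "::1", true},
	{"private 10/8", "10.0.0.1", true},
	{"private 192.168/16", "192.168.1.10", true},
	{"cloud metadata endpoint", "169.254.169.254", true},
	{"v4-mapped private", "::ffff:10.0.0.1", true},
	{"unspecified", "0.0.0.0", true},
	{"not an IP literal", "localhost", true},
	{"public", "8.8.8.8", false},
}

func main() {
	for _, tc := range ssrfCases {
		if got := isBlockedAddr(tc.addr); got != tc.blocked {
			fmt.Printf("FAIL %s: got %v, want %v\n", tc.name, got, tc.blocked)
			continue
		}
		fmt.Printf("ok   %s\n", tc.name)
	}
}
```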

AI Agent Coding Rules¶

When AI coding agents (Claude Code, Copilot, Cursor, etc.) work on this codebase, they must follow the contributor rules above PLUS these additional constraints:

Security-First Coding¶

Never disable security checks to make tests pass or code compile. Fix the underlying issue instead.
Never use --no-verify on git commits. Pre-commit hooks exist for a reason. If a hook fails, investigate and fix.
Never generate or guess URLs for fetching unless explicitly instructed. SSRF can be introduced through hardcoded URLs that happen to resolve to internal services.
Never add backdoors, debug endpoints, or admin shortcuts that bypass authentication or authorization. Even "temporary" ones.
Never commit secrets, API keys, or credentials. Even example values that look like real keys. Use obviously-fake placeholders: your-key-here.

Secure Patterns to Follow¶

// DO: Use the SSRF-safe client
client := scraper.NewSSRFSafeClient(cfg.AllowPrivateIPs)

// DON'T: Create an unrestricted client
client := &http.Client{}

// DO: Return typed errors
return toolError("query is required")

// DON'T: Panic
panic("unexpected state")

// DO: Validate URL schemes
if parsed.Scheme != "http" && parsed.Scheme != "https" {
    return toolError("URL must use http:// or https://")
}

// DON'T: Pass user input to os/exec
exec.Command("curl", userURL) // NEVER

// DO: Use constant-time comparison for secrets (returns 1 on match)
if subtle.ConstantTimeCompare([]byte(provided), []byte(expected)) == 1 {
    // authenticated
}

// DON'T: Direct string comparison for auth
if provided == expected { // NEVER: timing attack
    // authenticated
}
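
One detail of the constant-time pattern is worth spelling out: subtle.ConstantTimeCompare returns 0 immediately when the two slices differ in length, so its running time can still reveal the length of the expected secret. A common workaround, sketched here with a hypothetical tokensEqual helper that is not taken from this codebase, is to hash both sides first so the inputs always have the same length.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// tokensEqual is a hypothetical helper. Hashing both sides first gives
// ConstantTimeCompare two 32-byte inputs, so the comparison itself does
// not depend on the length or content of the expected secret.
func tokensEqual(provided, expected string) bool {
	p := sha256.Sum256([]byte(provided))
	e := sha256.Sum256([]byte(expected))
	return subtle.ConstantTimeCompare(p[:], e[:]) == 1
}

func main() {
	fmt.Println(tokensEqual("your-key-here", "your-key-here")) // true
	fmt.Println(tokensEqual("your-key", "your-key-here"))      // false
}
```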

What AI Agents Must Check Before Submitting¶

[ ] No new dependencies added without justification
[ ] No panic() calls in non-test code
[ ] No hardcoded IPs, URLs, or credentials
[ ] No http.DefaultClient usage for external URLs
[ ] No raw SQL or shell command construction from user input
[ ] go test -race ./... passes
[ ] make lint (or go tool golangci-lint run) passes
[ ] New tools have annotations + tests

Vulnerability Management¶

Reporting¶

Report security vulnerabilities privately via GitHub Security Advisories.

Do not open public issues for security vulnerabilities.

Milestone	Target
Acknowledgment	48 hours
Fix plan	7 days
Patch release	30 days (critical: 72 hours)

How We Handle Vulnerabilities¶

Our vulnerability handling follows the FIRST PSIRT Services Framework:

Receive — reports via GitHub Security Advisories or direct contact
Triage — assess severity using CVSS v4.0, assign CWE identifier
Remediate — develop and test fix, request CVE if applicable
Disclose — coordinated disclosure with reporter, publish advisory
Learn — post-mortem, update threat model, improve defenses

All published advisories include: CVSS v4.0 score, affected versions, CWE identifier, and mitigation guidance.

Threat Model References¶

Our security controls map to MITRE ATT&CK techniques:

Control	Mitigates
SSRF protection	T1190 (Exploit Public-Facing App), T1557 (Adversary-in-the-Middle)
Input validation	T1059 (Command/Scripting Interpreter)
Content sanitization	T1059.007 (JavaScript), prompt injection vectors
Rate limiting	T1499 (Endpoint DoS)
Auth/JWKS	T1078 (Valid Accounts), T1550 (Use Alternate Auth Material)
Audit logging	T1070 (Indicator Removal): structured, secret-redacted, request-ID-correlated events (including auth failures) aid detection and attribution. Logs are append-only JSON, not cryptographically tamper-evident; hash-chaining is a roadmap item (see Roadmap Considerations below).

Roadmap Considerations¶

Security features planned or under consideration:¶

DPoP token binding (RFC 9449) — proof-of-possession prevents token theft
Hash-chained audit logs — tamper-evident logging for government deployments
Breach notification pipeline — webhook alerting on security anomalies
in-toto build attestations — full supply chain provenance (SLSA Level 3)
Seccomp profiles — container syscall restriction for hardened deployments
UK Cyber Essentials certification — UK public sector market access
Global CBPR certification — cross-border data transfer for APAC markets
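
To make "hash-chained" concrete: each audit entry stores a hash computed over the previous entry's hash plus its own payload, so editing any earlier entry invalidates every hash after it. The following is a conceptual sketch only. The entry type, function names, and serialization are invented for illustration and do not describe the planned design.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// entry is a hypothetical audit record: the event payload plus the hash
// that links it to everything logged before it.
type entry struct {
	Payload string
	Hash    string // hex SHA-256 of previous hash + "\n" + payload
}

// link computes the chained hash for one payload. The previous hash is
// hex, so it never contains the newline used as a separator.
func link(prevHash, payload string) string {
	sum := sha256.Sum256([]byte(prevHash + "\n" + payload))
	return hex.EncodeToString(sum[:])
}

// appendEntry adds a payload to the chain and returns the extended log.
func appendEntry(log []entry, payload string) []entry {
	prev := ""
	if len(log) > 0 {
		prev = log[len(log)-1].Hash
	}
	return append(log, entry{Payload: payload, Hash: link(prev, payload)})
}

// verify recomputes every link and returns the index of the first entry
// that does not match, or -1 if the chain is intact.
func verify(log []entry) int {
	prev := ""
	for i, e := range log {
		if link(prev, e.Payload) != e.Hash {
			return i
		}
		prev = e.Hash
	}
	return -1
}

func main() {
	var log []entry
	for _, p := range []string{"auth_failure", "tool_call", "session_end"} {
		log = appendEntry(log, p)
	}
	fmt.Println(verify(log)) // -1

	log[1].Payload = "nothing happened" // simulate tampering
	fmt.Println(verify(log))            // 1
}
```

A chain like this detects edits and deletions in the middle of the log. Detecting truncation of the tail, or a full rewrite of the chain, additionally requires storing the latest hash somewhere the attacker cannot modify.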

Architecture decisions that won't change:¶

Zero global state (dependency injection via struct)
Interface-driven design (swap implementations without touching callers)
Read-only-default tool semantics (reads are the default; write tools exist but are non-destructive — deletion is always a separate admin endpoint, never a tool flag)
STDIO as the zero-config default
Go standard library preference over third-party
Compliance proportional to activated features (tiered model)

Document	Content
`docs/SECURITY.md`	Detailed technical security architecture (threat model, defense layers, crypto specs)
`docs/DEPLOYMENT.md`	Production deployment guide (Docker, K8s, env vars, scaling)
`docs/ERROR_HANDLING.md`	Error taxonomy and LLM-facing message design
`CONTRIBUTING.md`	Full contributor guide (setup, style, PR process)
`SECURITY.md` (root)	Vulnerability reporting policy
`.env.example`	All configuration options with descriptions