We Disabled SSL Verification. Here's Why That Was a Critical Bug

Eurtifact Platform Team

Context

In October 2025, a developer encountered an SSL certificate verification error while testing API calls to a staging environment. The error message was clear: CERTIFICATE_VERIFY_FAILED.

The immediate fix was to add verify=False to the HTTP client configuration. The test passed. The code was committed, reviewed, and deployed.

In January 2026, during a routine security audit, we discovered that this change had propagated to production. API calls to external services were not verifying SSL certificates. A man-in-the-middle (MITM) attack could have intercepted credentials, API keys, and customer data.

This is a post-mortem, not a confession.

Reality Check

Common Belief

“Disabling SSL verification is obviously wrong. Code review would catch it.”

Why That’s Incomplete

The change was not obviously wrong in context. The developer was troubleshooting a legitimate certificate issue in a non-production environment. The intent was to unblock testing, not to compromise security.

The failure occurred because:

  1. No environment-specific configuration: The verify=False flag applied to all environments, not just staging
  2. Code review focused on logic, not security posture: Reviewers approved the change because it fixed the immediate problem
  3. No automated security scanning: Static analysis tools did not flag disabled SSL verification as a critical issue
  4. Production and staging ran identical code and configuration: Nothing gated staging-only workarounds from reaching production

This was not negligence. It was insufficient defense-in-depth.

Engineering Implications

SSL verification failures have legitimate causes, such as a missing CA certificate, an expired certificate, or a misconfigured hostname. How the team responds determines whether the fix becomes a vulnerability.

1. Environment-Specific Configuration

Requirement: Security-sensitive flags must be environment-aware, not globally toggled.

What this means:

# WRONG: Global toggle
import requests
response = requests.get(url, verify=False)  # Applied everywhere

# CORRECT: Environment-aware
import os
import requests

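# Verification stays on unless VERIFY_SSL is explicitly "false";
# a typo or a value like "1" must not silently disable it
VERIFY_SSL = os.getenv('VERIFY_SSL', 'true').strip().lower() != 'false'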
response = requests.get(url, verify=VERIFY_SSL)

Better: Fail closed (default to verification) and require explicit opt-out per environment:

import os

# Production: VERIFY_SSL not set → defaults to True
# Staging: VERIFY_SSL=false (explicit override for testing)
# Any value other than "false" keeps verification on
VERIFY_SSL = os.getenv('VERIFY_SSL', 'true').strip().lower() != 'false'

# Refuse to start if a staging override reaches production
if not VERIFY_SSL and os.getenv('ENVIRONMENT') == 'production':
    raise RuntimeError("SSL verification cannot be disabled in production")

Common failure: Using a single boolean flag that applies globally. Developers toggle it in one place, and it affects all environments.
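The fail-closed rule above can be sketched as one helper that every HTTP client calls. This is a minimal illustration, not the platform's actual code: the function name resolve_verify_ssl is hypothetical, and it makes two stricter assumptions than the snippets above, namely that unrecognised values are rejected outright and that an unset ENVIRONMENT is treated as production.

```python
import os


def resolve_verify_ssl(env=None):
    """Return True unless verification is explicitly disabled outside production."""
    env = os.environ if env is None else env
    raw = env.get("VERIFY_SSL", "true").strip().lower()
    if raw not in ("true", "false"):
        # Unknown values (typos, "1", "yes") are rejected rather than guessed at.
        raise ValueError(f"VERIFY_SSL must be 'true' or 'false', got {raw!r}")
    verify = raw == "true"
    # Assumption: a missing ENVIRONMENT is treated as production (the strict case).
    environment = env.get("ENVIRONMENT", "production").strip().lower()
    if not verify and environment == "production":
        raise RuntimeError("SSL verification cannot be disabled in production")
    return verify
```

Passing the environment as a mapping keeps the rule testable without touching the real process environment. Treating an unset ENVIRONMENT as production means a forgotten variable results in stricter behaviour, not looser.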

2. Static Analysis and Linting Rules

Requirement: Automated tools must flag insecure patterns during CI/CD, not during manual audits.

What this means:

  • Bandit (Python) configured to fail on verify=False
  • Semgrep rules to detect requests.get(..., verify=False)
  • Pre-commit hooks that reject commits containing insecure patterns

Example (Bandit configuration):

# bandit.yaml (run with: bandit -c bandit.yaml -r src/)
tests:
  - B501  # request_with_no_cert_validation (verify=False in requests)
  - B507  # ssh_no_host_key_verification (paramiko host key checks disabled)

CI/CD pipeline:

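# GitLab CI
security:scan:
  script:
    # -lll reports only HIGH severity; Bandit exits non-zero on any reported finding,
    # which fails the job without needing to grep the report
    - bandit -r src/ -lll -f json -o bandit-report.json
  artifacts:
    when: always
    paths:
      - bandit-report.json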

Common failure: Running security scanners but allowing builds to proceed even when high-severity issues are detected.
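For teams that want a lightweight check alongside Bandit or Semgrep, the detection itself can be sketched with Python's standard ast module. The function name find_disabled_verification is hypothetical, and this sketch only catches a literal verify=False keyword argument in a call. It would miss session.verify = False assignments and values passed through variables, so it complements a real scanner and does not replace one.

```python
import ast


def find_disabled_verification(source, filename="<string>"):
    """Return sorted (line, column) positions of calls that pass verify=False."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            # Match only the literal constant False, not 0 or a variable.
            if (kw.arg == "verify"
                    and isinstance(kw.value, ast.Constant)
                    and kw.value.value is False):
                findings.append((node.lineno, node.col_offset))
    return sorted(findings)
```

A pre-commit hook could run this over each staged .py file and exit non-zero when the returned list is not empty.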

3. Certificate Management Infrastructure

Requirement: Certificate errors should be resolved by fixing the certificate, not by disabling verification.

What this means:

  • Internal Certificate Authority (CA) for staging and development environments
  • CA certificate distributed to developer workstations and CI/CD runners
  • Automated certificate renewal (cert-manager, Let’s Encrypt ACME)
  • Clear documentation: “If you see CERTIFICATE_VERIFY_FAILED, add the CA cert. Do not disable verification.”

Example (Docker container with custom CA):

# Add internal CA certificate to trusted store
COPY certs/internal-ca.crt /usr/local/share/ca-certificates/
RUN update-ca-certificates

# Now SSL verification works for internal services
ENV REQUESTS_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt

Common failure: Treating certificate errors as obstacles to be bypassed rather than symptoms of missing infrastructure.
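The same idea applies in application code: trust the internal CA explicitly and leave verification on. The sketch below uses only the standard library's ssl module; the function name build_ssl_context is hypothetical, and the CA file path is whatever your infrastructure distributes (for example the certs/internal-ca.crt file from the Docker example).

```python
import os
import ssl


def build_ssl_context(ca_file=None):
    """Create a verifying TLS context, optionally trusting an extra CA file."""
    # create_default_context() enables certificate and hostname checks.
    context = ssl.create_default_context()
    if ca_file is not None:
        if not os.path.isfile(ca_file):
            # Fail with a clear message instead of falling back to no verification.
            raise FileNotFoundError(f"CA bundle not found: {ca_file}")
        # Adds the internal CA on top of the system trust store.
        context.load_verify_locations(cafile=ca_file)
    return context
```

The resulting context can be passed to standard-library clients, for example as the context argument of urllib.request.urlopen. With requests, the equivalent is to set verify to the CA bundle path, not to False.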

4. Separation of Production and Non-Production Codepaths

Requirement: Production deployments must not inherit workarounds intended for testing.

What this means:

  • Feature flags or environment checks that enforce stricter behavior in production
  • Separate configuration files for production vs staging (not a single config.yaml with overrides)
  • CI/CD pipelines that reject production deployments if staging-only workarounds are detected

Example (environment enforcement):

import os
import warnings

import requests

def create_http_client():
    session = requests.Session()  # verify=True by default
    if os.getenv('ENVIRONMENT') == 'production':
        # Production: no SSL bypass allowed
        return session
    # Non-production: allow an explicit override for testing;
    # any value other than "false" keeps verification on
    verify_ssl = os.getenv('VERIFY_SSL', 'true').strip().lower() != 'false'
    session.verify = verify_ssl
    if not verify_ssl:
        warnings.warn("SSL verification disabled. Not for production use.")
    return session

Common failure: Relying on runtime configuration switches that behave identically in every environment, so a staging override works just as well in production.

Failure Modes

Pattern 1: Copy-Paste from Stack Overflow

A developer encounters an SSL error and searches Stack Overflow. The top-voted answer says “add verify=False”. They copy-paste it without understanding the implications.

Why this fails: Stack Overflow optimizes for making code work, not for making it secure. Answers rarely include caveats about production readiness.

Solution: Automated linting catches verify=False during pre-commit. The developer is then forced to understand why verification is failing.

Pattern 2: Temporary Fix Becomes Permanent

A developer adds verify=False to unblock a demo. They intend to revert it after the demo, but the ticket is closed as “resolved” and the change is never revisited.

Why this fails: Temporary workarounds without expiration mechanisms become permanent. No one revisits “resolved” tickets.

Solution: Feature flags with expiration dates. After 30 days, the flag expires and the bypass stops working, forcing a proper fix.
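One way to picture an expiring bypass is a small function that decides the verify setting from an expiry date. This is a sketch under assumptions: the names verify_setting and BypassExpiredError are hypothetical, and where the expiry date is stored (config file, flag service) is left open.

```python
from datetime import date


class BypassExpiredError(RuntimeError):
    """Raised when a temporary security bypass is used past its expiry date."""


def verify_setting(bypass_expires_on, today=None):
    """Return False (skip verification) only while the bypass is still valid."""
    today = date.today() if today is None else today
    if bypass_expires_on is None:
        return True  # No bypass configured: verify as usual.
    if today > bypass_expires_on:
        # Fail loudly rather than silently continuing without verification.
        raise BypassExpiredError(
            f"SSL bypass expired on {bypass_expires_on.isoformat()}"
        )
    return False
```

Raising after expiry, instead of quietly re-enabling verification, makes the stale workaround visible to whoever owns the code at that point.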

Pattern 3: Certificate Renewal Failure Leads to Emergency Bypass

A production certificate expires overnight. Services fail at 3 AM. An on-call engineer adds verify=False to restore service immediately, planning to fix it properly later.

Why this fails: Emergency fixes bypass code review and testing. Once service is restored, there’s no urgency to revert the bypass.

Solution: Automated monitoring alerts on certificates 30 days before they expire. Emergency bypasses trigger incident post-mortems that require remediation plans.
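The 30-day alert can be sketched with the standard library. The helper names below are hypothetical; the not_after string is assumed to be in the format that ssl.SSLSocket.getpeercert() returns under the "notAfter" key, which ssl.cert_time_to_seconds parses.

```python
import ssl
from datetime import datetime, timezone

ALERT_THRESHOLD_DAYS = 30  # From the text: alert 30 days before expiry.


def days_until_expiry(not_after, now=None):
    """Whole days left before a certificate's notAfter timestamp."""
    now = datetime.now(timezone.utc) if now is None else now
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def needs_renewal_alert(not_after, now=None):
    """True when the certificate is within the alert window or already expired."""
    return days_until_expiry(not_after, now) <= ALERT_THRESHOLD_DAYS
```

A scheduled job could fetch each service's certificate, call needs_renewal_alert on its notAfter value, and page the owning team well before the 3 AM failure described above.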

What “Good” Looks Like

A secure certificate verification posture has these properties:

  1. Fail-Closed Defaults: SSL verification enabled by default. Overrides require explicit environment variables, and any attempt to set them in production raises an error.

  2. Automated Detection: Static analysis (Bandit, Semgrep) flags verify=False as critical. CI/CD pipelines fail if detected.

  3. Certificate Infrastructure: Internal CA for non-production environments. Developer onboarding includes CA certificate installation.

  4. Environmental Separation: Production code paths enforce strict security. Non-production paths allow overrides with warnings and audit logs.

  5. Expiration Mechanisms: Temporary security bypasses include expiration dates. After expiry, code fails loudly rather than silently continuing with degraded security.

These are technical safeguards, not policy documents.

Limits & Trade-offs

This approach does not:

  • Prevent certificate expiration: Certificates still expire. Monitoring and renewal automation remain necessary.
  • Eliminate legitimate testing needs: Developers may still need to test against self-signed certificates. The solution is better certificate infrastructure, not removal of testing capability.
  • Solve configuration complexity: Environment-specific configuration adds complexity. This is necessary overhead to prevent security bypasses from propagating.

Defense-in-depth means accepting operational complexity to reduce security risk.

Key Takeaways

  • Disabling SSL verification is never the correct long-term solution. It masks the symptom (certificate error) without fixing the cause (missing CA cert, expired certificate, misconfigured hostname).
  • Environment-specific configuration prevents staging workarounds from leaking to production. Use feature flags or environment checks that fail closed in production.
  • Static analysis must run in CI/CD and fail builds on critical findings. Manual code review is not sufficient for catching security anti-patterns.
  • Certificate errors should be resolved by fixing certificate infrastructure (internal CA, automated renewal), not by disabling verification.
  • Temporary fixes need expiration mechanisms. Without them, temporary becomes permanent.

This article is based on a real incident in the Eurtifact platform’s development history. Details have been generalized to focus on systemic lessons rather than individual attribution. For questions about implementing certificate management in your infrastructure, consult security and platform engineering specialists.