# Enterprise Security Architecture
This document describes the security features built into the E2E Testing Agent for enterprise deployments.
## Overview

The E2E Testing Agent is designed with security-first principles to ensure:

- **No secrets leak to AI**: automatic detection and redaction
- **User consent**: explicit approval before any data is sent
- **Audit trails**: SOC2/ISO27001-compliant logging
- **Data classification**: multi-level sensitivity handling
```
┌─────────────────────────────────────────────────────────────────────────┐
│                           SECURITY DATA FLOW                            │
└─────────────────────────────────────────────────────────────────────────┘

  Your Codebase                                               Anthropic AI
        │                                                           │
        ▼                                                           │
 ┌──────────────┐                                                   │
 │ 1. CLASSIFY  │  Identify sensitivity level                       │
 │    FILES     │  (public/internal/confidential/restricted)        │
 └──────┬───────┘                                                   │
        ▼                                                           │
 ┌──────────────┐                                                   │
 │ 2. CHECK     │  User approval required                           │
 │    CONSENT   │  before any data leaves                           │
 └──────┬───────┘                                                   │
        ▼                                                           │
 ┌──────────────┐                                                   │
 │ 3. SANITIZE  │  Remove secrets, API keys,                        │
 │    CONTENT   │  passwords, tokens, PII                           │
 └──────┬───────┘                                                   │
        ▼                                                           │
 ┌──────────────┐                                                   │
 │ 4. AUDIT     │  Log all access for                               │
 │    LOG       │  compliance reporting                             │
 └──────┬───────┘                                                   │
        │                                                           ▼
        └──────────────────────────────────────────────────────────►│
                          SANITIZED DATA ONLY                       │
                                                                    ▼
                                                            ┌──────────────┐
                                                            │    CLAUDE    │
                                                            │   ANALYSIS   │
                                                            └──────────────┘
```
## Components

### 1. Data Classification (`classifier.py`)

Classifies every file by sensitivity level:

| Level | Description | AI Access |
|---|---|---|
| `PUBLIC` | Open source, licenses | Yes |
| `INTERNAL` | Source code, tests | Yes (with consent) |
| `CONFIDENTIAL` | Config files | Yes (after sanitization) |
| `RESTRICTED` | Secrets, credentials | NEVER |
```python
from src.security import DataClassifier, SensitivityLevel

classifier = DataClassifier()
result = classifier.classify_file("/path/to/file.py")

if result.sensitivity == SensitivityLevel.RESTRICTED:
    print("This file will NEVER be sent to AI")
elif result.pii_detected:
    print("PII detected - special handling required")
```
#### Automatic Detection

The classifier automatically detects:

- **Credentials**: `.env`, `credentials.json`, private keys
- **Secrets**: API keys, tokens, passwords
- **PII**: email addresses, phone numbers, SSNs, credit card numbers
- **Sensitive config**: database URLs, connection strings
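As an illustration of the detection approach, here is a minimal regex-based PII sketch; the patterns are simplified stand-ins, and the real rules live in `classifier.py`:

```python
import re

# Simplified, illustrative patterns - NOT the shipped detection rules
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def detect_pii(text: str) -> list[str]:
    """Return the PII categories found in a block of text."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]

print(detect_pii("Contact jane@example.com, SSN 123-45-6789"))
# ['email', 'ssn']
```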
### 2. Code Sanitization (`sanitizer.py`)

Removes secrets from code before sending it to AI:

```python
from src.security import CodeSanitizer

sanitizer = CodeSanitizer()
result = sanitizer.sanitize_file("/path/to/config.py")

# Original content:
#   API_KEY = "sk-ant-1234567890abcdef"
#   DATABASE_URL = "postgres://user:password@host/db"

# Sanitized content:
#   API_KEY = "[REDACTED]:api_key"
#   DATABASE_URL = "[REDACTED]:connection_string"
```
#### Secret Patterns Detected

| Type | Pattern Examples |
|---|---|
| API Keys | `sk-ant-*`, `AIza*`, `xox*-*` |
| Tokens | `ghp_*`, `glpat-*`, Bearer tokens |
| Passwords | `password=`, `pwd=`, `passwd=` |
| AWS | `AKIA*`, `aws_secret_access_key` |
| Private Keys | `-----BEGIN PRIVATE KEY-----` |
| Connection Strings | `postgres://`, `mongodb://` |
| JWT | `eyJ*.*.*` |
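Internally, pattern families like these map onto regexes that drive the `[REDACTED]:<type>` replacements shown above. A simplified sketch of that mapping (these regexes are approximations, not the shipped ones):

```python
import re

# Approximate regexes for a few of the pattern families above
SECRET_PATTERNS = {
    "api_key": re.compile(r"sk-ant-[A-Za-z0-9_-]+"),   # Anthropic-style keys
    "token": re.compile(r"ghp_[A-Za-z0-9]{36}"),       # GitHub personal access tokens
    "aws_key": re.compile(r"AKIA[0-9A-Z]{16}"),        # AWS access key IDs
    "jwt": re.compile(r"eyJ[\w-]+\.[\w-]+\.[\w-]+"),   # header.payload.signature
}

def redact(text: str) -> str:
    """Replace every match with the [REDACTED]:<type> marker format."""
    for name, pattern in SECRET_PATTERNS.items():
        text = pattern.sub(f"[REDACTED]:{name}", text)
    return text
```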
#### Forbidden Files

These files are never read, even if explicitly requested:

- `.env`, `.env.local`, `.env.production`
- `credentials.json`, `secrets.yaml`
- `id_rsa`, `*.pem`, `*.key`, `*.p12`
- `.npmrc`, `.pypirc`, `.netrc`
- `.aws/credentials`
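The check is a glob-style match on the file name, plus path-level rules for entries like `.aws/credentials`. A minimal sketch of the idea using `fnmatch` (illustrative only, not the actual implementation):

```python
from fnmatch import fnmatch
from pathlib import Path

# Illustrative subset of the forbidden list above
FORBIDDEN = {".env", ".env.*", "credentials.json", "secrets.yaml",
             "id_rsa", "*.pem", "*.key", "*.p12",
             ".npmrc", ".pypirc", ".netrc"}

def is_forbidden(path: str) -> bool:
    # Path-level rule: .aws/credentials matches on the directory, not the name
    if path.endswith(".aws/credentials"):
        return True
    name = Path(path).name
    return any(fnmatch(name, pattern) for pattern in FORBIDDEN)

assert is_forbidden("deploy/.env.production")
assert not is_forbidden("src/main.py")
```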
### 3. User Consent (`consent.py`)

Explicit consent is required before sending data to external services:

```python
from src.security import ConsentManager, ConsentScope

consent = ConsentManager(session_id="user-session-123")

# Check if consent exists
if not consent.has_consent(ConsentScope.SEND_TO_ANTHROPIC):
    # Prompt the user for consent
    consent.prompt_for_consent([
        ConsentScope.SOURCE_CODE,
        ConsentScope.SEND_TO_ANTHROPIC,
    ])

# Now safe to proceed
consent.require_consent(ConsentScope.SEND_TO_ANTHROPIC)
```
#### Consent Scopes

| Scope | Description |
|---|---|
| `SOURCE_CODE` | Read source code files |
| `TEST_FILES` | Read existing test files |
| `CONFIG_FILES` | Read configuration (sanitized) |
| `SCREENSHOTS` | Capture browser screenshots |
| `BROWSER_ACTIONS` | Execute browser automation |
| `API_RESPONSES` | Capture API responses |
| `SEND_TO_ANTHROPIC` | Send data to the Claude API |
| `SEND_TO_GITHUB` | Post PR comments |
| `SEND_TO_SLACK` | Send notifications |
| `STORE_LOCALLY` | Save results to disk |
| `STORE_AUDIT_LOGS` | Maintain the audit trail |
#### Consent Modes

For CLI usage, auto-consent modes are available:

```bash
# Minimal - only code analysis
CONSENT_MODE=minimal e2e-agent ...

# Standard - typical testing workflow
CONSENT_MODE=standard e2e-agent ...

# Full - all features enabled
CONSENT_MODE=full e2e-agent ...
```
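Conceptually, each mode expands into a preset set of consent scopes. A sketch of what that mapping might look like; the actual sets are defined in `consent.py` and may differ:

```python
from src.security import ConsentScope

# Illustrative mapping - the shipped mode definitions may grant
# different scope combinations
CONSENT_MODES = {
    "minimal": {
        ConsentScope.SOURCE_CODE,
        ConsentScope.SEND_TO_ANTHROPIC,
    },
    "standard": {
        ConsentScope.SOURCE_CODE,
        ConsentScope.TEST_FILES,
        ConsentScope.CONFIG_FILES,
        ConsentScope.BROWSER_ACTIONS,
        ConsentScope.SEND_TO_ANTHROPIC,
        ConsentScope.STORE_LOCALLY,
    },
    # "full" would grant every scope, including SEND_TO_GITHUB and SEND_TO_SLACK
}
```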
### 4. Audit Logging (`audit.py`)

A SOC2/ISO27001-compliant audit trail:

```python
from datetime import datetime

from src.security import get_audit_logger, AuditEventType

audit = get_audit_logger()

# Log an AI request
audit.log_ai_request(
    user_id="user-123",
    model="claude-sonnet-4-5",
    action="analyze_code",
    prompt_hash="abc123...",  # Never log actual prompts
    input_tokens=1500,
)

# Log file access
audit.log_file_read(
    user_id="user-123",
    file_path="/app/src/main.py",
    classification="internal",
    was_sanitized=True,
    secrets_redacted=3,
)

# Generate a compliance report
report = audit.generate_compliance_report(
    start_date=datetime(2024, 1, 1),
    end_date=datetime(2024, 1, 31),
)
```
#### Audit Events

| Event Type | When Logged |
|---|---|
| `AI_REQUEST` | Every API call to Claude |
| `AI_RESPONSE` | Every response received |
| `FILE_READ` | Every file accessed |
| `SECRET_DETECTED` | When secrets are redacted |
| `TEST_COMPLETED` | Test execution results |
| `BROWSER_ACTION` | Browser automation actions |
| `INTEGRATION_CONNECTED` | External service connections |
#### Audit Log Format

Logs are stored in JSONL format with automatic rotation:

```json
{
  "id": "uuid",
  "timestamp": "2024-01-15T10:30:00Z",
  "event_type": "file_read",
  "user_id": "user-123",
  "session_id": "session-456",
  "action": "read",
  "resource": "/app/src/config.py",
  "data_classification": "confidential",
  "metadata": {
    "was_sanitized": true,
    "secrets_redacted": 2
  },
  "content_hash": "abc123..."
}
```
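Because entries are plain JSONL, ad hoc queries need nothing beyond the standard library. For example, totaling redactions per file (the `audit-logs/audit.jsonl` path is an assumed default):

```python
import json
from collections import Counter
from pathlib import Path

redactions = Counter()
for line in Path("audit-logs/audit.jsonl").read_text().splitlines():
    event = json.loads(line)
    redacted = event.get("metadata", {}).get("secrets_redacted", 0)
    if event["event_type"] == "file_read" and redacted:
        redactions[event["resource"]] += redacted

for resource, count in redactions.most_common(10):
    print(f"{count:4d}  {resource}")
```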
## Secure Code Reader (`secure_reader.py`)

The `SecureCodeReader` integrates all of the security components above:

```python
from src.security import create_secure_reader

# Create a reader with the security pipeline enabled
reader = create_secure_reader(
    user_id="user-123",
    auto_consent_mode="standard",  # Or prompt interactively
)

# Reading a codebase automatically:
#   1. Checks consent
#   2. Classifies files
#   3. Skips restricted files
#   4. Sanitizes secrets
#   5. Logs all access
results = reader.read_codebase("/path/to/app")

# Get safe content for AI
context = reader.get_context_for_ai(results)
# This content is SAFE to send to Claude
```
## Enterprise Configuration

### Environment Variables

```bash
# Audit logging
AUDIT_LOG_DIR=./audit-logs
AUDIT_RETENTION_DAYS=90

# Consent
CONSENT_MODE=standard         # minimal, standard, full
REQUIRE_EXPLICIT_CONSENT=true

# Classification
STRICT_CLASSIFICATION=true    # Unknown files = confidential
SCAN_FOR_PII=true
```
### Custom Secret Patterns

Add organization-specific patterns:

```python
from src.security import CodeSanitizer, SecretType

sanitizer = CodeSanitizer(
    additional_patterns={
        SecretType.API_KEY: [
            r"MYORG_API_KEY_[a-zA-Z0-9]{32}",
        ],
    },
    additional_forbidden_files={
        "internal-secrets.yaml",
        "*.myorg-key",
    },
)
```
### Custom Forbidden Directories
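Entire directories can be excluded the same way as individual files. A sketch, assuming the sanitizer accepts an `additional_forbidden_dirs` parameter parallel to `additional_forbidden_files` above (the parameter name is an assumption, not confirmed API):

```python
from src.security import CodeSanitizer

# additional_forbidden_dirs is hypothetical, mirroring
# additional_forbidden_files from the previous example
sanitizer = CodeSanitizer(
    additional_forbidden_dirs={
        "infra/secrets/",
        "deploy/vault/",
    },
)
```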
## Compliance Reports

Generate compliance reports for auditors:

```python
from datetime import datetime, timedelta

from src.security import get_audit_logger

audit = get_audit_logger()

# Last 30 days
report = audit.generate_compliance_report(
    start_date=datetime.now() - timedelta(days=30),
    end_date=datetime.now(),
)

print(f"Total AI requests: {report['summary']['ai_requests']}")
print(f"Files accessed: {report['summary']['files_accessed']}")
print(f"Secrets detected: {report['summary']['secrets_detected']}")
print(f"Total cost: ${report['summary']['total_cost_usd']:.2f}")
```
### Sample Report

```json
{
  "period": {
    "start": "2024-01-01T00:00:00Z",
    "end": "2024-01-31T23:59:59Z"
  },
  "summary": {
    "total_events": 15420,
    "ai_requests": 342,
    "files_accessed": 1256,
    "secrets_detected": 47,
    "tests_run": 890,
    "total_cost_usd": 45.67
  },
  "by_user": {
    "user-123": {"events": 8200, "cost_usd": 25.00},
    "user-456": {"events": 7220, "cost_usd": 20.67}
  },
  "secrets_by_type": {
    "api_key": 23,
    "password": 12,
    "token": 8,
    "connection_string": 4
  },
  "errors": []
}
```
## Security Best Practices

### 1. Never Skip Sanitization

```python
# WRONG - direct file read
content = Path("config.py").read_text()
send_to_ai(content)  # May contain secrets!

# CORRECT - use the secure reader
reader = create_secure_reader(user_id="me")
result = reader.read_file("config.py")
send_to_ai(result.content)  # Safe, sanitized
```

### 2. Always Check Consent

```python
# WRONG - assume consent
send_data_to_anthropic(data)

# CORRECT - verify consent first
consent = get_consent_manager()
consent.require_consent(ConsentScope.SEND_TO_ANTHROPIC)
send_data_to_anthropic(data)
```

### 3. Log Everything

```python
# WRONG - silent operation
response = call_ai_api(prompt)

# CORRECT - audit trail around every call
audit.log_ai_request(...)
response = call_ai_api(prompt)
audit.log_ai_response(...)
```

### 4. Use Content Hashes

```python
# WRONG - log actual content
audit.log(content=sensitive_data)

# CORRECT - log a hash only
from src.security import hash_content
audit.log(content_hash=hash_content(sensitive_data))
```
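A minimal sketch of what such a hashing helper typically looks like, assuming SHA-256; the shipped `hash_content` may differ in algorithm or encoding:

```python
import hashlib

def hash_content(content: str) -> str:
    """Return a hex digest that identifies content without revealing it."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()
```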
## Deployment Considerations

### On-Premise Deployment

For maximum security, deploy the agent on-premise:

- No data leaves your network except calls to the Anthropic API
- Audit logs stay local on your infrastructure
- Custom secret patterns for your organization
- Integration with your IAM for user authentication

### Air-Gapped Environments

For air-gapped or regulated environments:

- Use a local LLM instead of the Anthropic API
- Disable all external integrations
- Store all audit logs locally
- Export reports manually for compliance

### Cloud Deployment

For cloud deployments:

- Use a secrets manager (AWS Secrets Manager, GCP Secret Manager) for API keys; see the sketch below
- Enable encryption at rest for audit logs
- Set up log shipping to your SIEM
- Configure VPC endpoints for the Anthropic API
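For example, a startup hook that pulls the Anthropic API key from AWS Secrets Manager instead of an env file (a sketch using `boto3`; the secret name is a placeholder):

```python
import boto3

def load_api_key() -> str:
    """Fetch the Anthropic API key at startup rather than storing it on disk."""
    client = boto3.client("secretsmanager")
    secret = client.get_secret_value(SecretId="e2e-agent/anthropic-api-key")
    return secret["SecretString"]
```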
## Regulatory Compliance

### GDPR

- PII detection and redaction
- Explicit consent management
- Data subject access requests served from audit logs
- Right to erasure via log retention policies

### SOC2

- Comprehensive audit trails
- Access logging
- Change management tracking
- Security event monitoring

### HIPAA

- PHI detection (when enabled)
- Access controls
- Audit requirements
- Encryption support

### PCI-DSS

- Credit card number detection
- Access logging
- Secure data handling
- Audit requirements
## Troubleshooting

### "Consent not granted" Error

Solution: Grant consent explicitly or use an auto-consent mode, as shown below.
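For example, prompting interactively with the consent API shown earlier:

```python
from src.security import ConsentManager, ConsentScope

consent = ConsentManager(session_id="user-session-123")
consent.prompt_for_consent([ConsentScope.SEND_TO_ANTHROPIC])

# Non-interactive alternative for CLI runs:
#   CONSENT_MODE=standard e2e-agent ...
```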
### Files Being Skipped

Check whether the files are classified as RESTRICTED:

```python
from src.security import DataClassifier

classifier = DataClassifier()
result = classifier.classify_file("path/to/file")
print(f"Sensitivity: {result.sensitivity}")
print(f"Reasons: {result.reasons}")
```
### Secrets Not Being Detected

Add custom patterns for your organization (see Custom Secret Patterns above):
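A quick way to verify a new pattern fires, using a placeholder key format:

```python
from src.security import CodeSanitizer, SecretType

# MYORG_API_KEY_... is a placeholder for your organization's key format
sanitizer = CodeSanitizer(
    additional_patterns={
        SecretType.API_KEY: [r"MYORG_API_KEY_[a-zA-Z0-9]{32}"],
    },
)

result = sanitizer.sanitize_file("config/settings.py")
# result.content should now show "[REDACTED]:api_key" where the pattern matched
```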
## Support

For security concerns or vulnerabilities, please contact:

- Security Team: security@yourcompany.com
- Or create a private security advisory on GitHub