Data Handling & Privacy
Noxys follows privacy-by-design principles. This page explains what data is stored, how it is protected, and which privacy guarantees apply.
Core Privacy Principle
Raw prompts are never stored.
Only SHA-256 hashes and metadata are retained. A hash cannot be reversed to recover the prompt text, so the stored record does not expose what the user typed.
What Data Is Stored
AIInteraction Records
The canonical data model for all AI usage is the AIInteraction.
Stored
{
"id": "int-12345",
"tenant_id": "org-abc123",
"user_id": "user-xyz789",
"platform_id": "chatgpt", // OpenAI, Anthropic, etc.
"model": "gpt-4-turbo", // AI model used
"prompt_hash": "sha256:a1b2c3...", // SHA-256, one-way
"response_hash": "sha256:d4e5f6...", // Response hash (optional)
"content_length": 1240, // Bytes
"classifications": [
{
"type": "email", // PII type: email, phone, iban, etc.
"confidence": 0.95, // Confidence score (0-1)
"tier": 1, // Detection tier (1=regex, 2=server, 3=SLM)
"count": 2 // Number of matches
}
],
"policy_applied": "block-pii-chatgpt",
"policy_action": "Block", // Block, Coach, Log
"risk_score": 0.87, // Risk level (0-1)
"timestamp": "2026-03-23T14:30:00Z",
"platform_url": "https://chatgpt.com", // Hostname only; path and query are dropped
"extension_version": "v0.2.1",
"browser": "Chrome 124.0"
}
Not Stored
{
"prompt": "What is my SSN? 123-45-6789", // ❌ NEVER stored
"response": "I cannot provide that...", // ❌ NEVER stored (unless Tier 3)
"username_in_prompt": "john.doe", // ❌ Only hashed
"api_key": "sk-...", // ❌ NEVER stored
"full_url": "https://chatgpt.com/...", // ❌ Hostname only
}
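As a rough illustration of the stored/not-stored split above, the following Python sketch builds a record from a prompt while keeping only a hash, a byte length, and the hostname. The function name `build_interaction` and the exact set of fields it fills are assumptions made for this example; it is not the Noxys implementation.

```python
import hashlib
from datetime import datetime, timezone
from urllib.parse import urlparse


def build_interaction(prompt: str, page_url: str, platform_id: str, model: str) -> dict:
    """Hypothetical helper: return a storable record without the raw prompt."""
    encoded = prompt.encode("utf-8")
    parts = urlparse(page_url)
    return {
        "platform_id": platform_id,
        "model": model,
        # One-way fingerprint in the "sha256:<hex>" form used in the record above.
        "prompt_hash": "sha256:" + hashlib.sha256(encoded).hexdigest(),
        "content_length": len(encoded),  # bytes, not characters
        # Scheme and hostname only; the path and query string are dropped.
        "platform_url": f"{parts.scheme}://{parts.hostname}",
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


record = build_interaction(
    "What is my SSN? 123-45-6789",
    "https://chatgpt.com/c/abc123",
    platform_id="chatgpt",
    model="gpt-4-turbo",
)
print(record["platform_url"])    # https://chatgpt.com
print(record["content_length"])  # 27
```

The raw prompt exists only as a local variable inside the function, so nothing downstream of the record can read it.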
User Records
{
"id": "user-xyz789",
"tenant_id": "org-abc123",
"email": "john.doe@company.com", // Encrypted
"password_hash": "bcrypt:...", // Never plain text
"first_name": "John",
"last_name": "Doe",
"role": "viewer", // admin or viewer
"mfa_enabled": true,
"created_at": "2026-01-15T10:00:00Z",
"last_login": "2026-03-23T09:15:00Z",
"status": "active"
}
Policy Records
{
"id": "pol-abc123",
"tenant_id": "org-abc123",
"name": "Block ChatGPT",
"description": "Prevent all ChatGPT usage",
"enabled": true,
"priority": 1,
"conditions": {
"platform_id": ["chatgpt"],
"user_group": ["all"],
"risk_level": null
},
"action": "Block", // Block, Coach, Log
"created_by": "user-xyz789",
"created_at": "2026-01-20T08:00:00Z",
"modified_at": "2026-03-20T14:30:00Z"
}
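The policy record above could be evaluated along these lines. This is a minimal sketch under several assumptions that the page does not confirm: a lower `priority` number wins, an empty or missing `platform_id` list matches every platform, `risk_level` is treated as a minimum risk score, and `user_group` is ignored. The function names are hypothetical.

```python
from typing import Optional


def matches(policy: dict, interaction: dict) -> bool:
    """Check an interaction against a policy's conditions (assumed semantics)."""
    conditions = policy["conditions"]
    platforms = conditions.get("platform_id")
    if platforms and interaction["platform_id"] not in platforms:
        return False
    min_risk = conditions.get("risk_level")
    if min_risk is not None and interaction["risk_score"] < min_risk:
        return False
    return True


def select_policy(policies: list, interaction: dict) -> Optional[dict]:
    """Return the enabled, matching policy with the lowest priority number."""
    candidates = [p for p in policies if p["enabled"] and matches(p, interaction)]
    return min(candidates, key=lambda p: p["priority"], default=None)


policies = [
    {"name": "Log everything", "enabled": True, "priority": 5,
     "conditions": {"platform_id": [], "risk_level": None}, "action": "Log"},
    {"name": "Block ChatGPT", "enabled": True, "priority": 1,
     "conditions": {"platform_id": ["chatgpt"], "risk_level": None}, "action": "Block"},
]

chosen = select_policy(policies, {"platform_id": "chatgpt", "risk_score": 0.87})
print(chosen["action"])  # Block
```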
Audit Logs
{
"id": "audit-12345",
"tenant_id": "org-abc123",
"actor_id": "user-xyz789",
"actor_email": "john.doe@company.com",
"action": "policy_created",
"resource_type": "policy",
"resource_id": "pol-abc123",
"changes": {
"before": null,
"after": { "name": "Block ChatGPT", "enabled": true }
},
"ip_address": "192.0.2.1", // Anonymized in EU regions
"user_agent": "Mozilla/5.0...",
"timestamp": "2026-03-23T14:30:00Z",
"status": "success"
}
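The page does not say how `ip_address` is anonymized in EU regions. One common approach is to truncate the address to its network prefix; the sketch below assumes a /24 prefix for IPv4 and a /48 prefix for IPv6, and `anonymize_ip` is a hypothetical helper name.

```python
import ipaddress


def anonymize_ip(address: str) -> str:
    """Zero the host part of an address (assumed scheme: /24 for IPv4, /48 for IPv6)."""
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{address}/{prefix}", strict=False)
    return str(network.network_address)


print(anonymize_ip("192.0.2.1"))            # 192.0.2.0
print(anonymize_ip("2001:db8:abcd:12::1"))  # 2001:db8:abcd::
```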
What We DON'T Store
- Raw prompt text
- Raw LLM responses
- API keys or credentials
- Personally identifiable information (beyond what's needed for service)
- Location data beyond timezone
- Browser history (only AI platform URLs)
- Cookies or tracking identifiers
Data Retention
Default Retention Policy
| Data Type | Retention | Notes |
|---|---|---|
| Interactions | 90 days | Configurable; older records archived |
| Audit Logs | 1 year | Immutable; never modified, and removed only when the retention period ends |
| User Records | Indefinite | Deleted when account is closed |
| Backups | 30 days | Automated daily backups |
Customization
Override retention in .env:
NOXYS_INTERACTION_RETENTION=365 # Keep 1 year instead of 90 days
NOXYS_AUDIT_LOG_RETENTION=2555 # Keep ~7 years for compliance
NOXYS_BACKUP_RETENTION=90 # Keep 90 days of backups
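To make the retention setting concrete, here is a small Python sketch of how a cleanup job might read `NOXYS_INTERACTION_RETENTION` and decide whether a record has aged out. The function names and the 90-day fallback wiring are assumptions for illustration; the page only documents the variable and its default.

```python
import os
from datetime import datetime, timedelta, timezone
from typing import Mapping

DEFAULT_INTERACTION_RETENTION_DAYS = 90  # default from the table above


def retention_days(env: Mapping[str, str] = os.environ) -> int:
    """Read the retention period in days, falling back to the documented default."""
    raw = env.get("NOXYS_INTERACTION_RETENTION")
    if raw is None:
        return DEFAULT_INTERACTION_RETENTION_DAYS
    days = int(raw)
    if days <= 0:
        raise ValueError("retention must be a positive number of days")
    return days


def is_expired(timestamp: str, now: datetime, days: int) -> bool:
    """True when a record's ISO-8601 UTC timestamp is older than the retention period."""
    created = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return now - created > timedelta(days=days)


now = datetime(2026, 7, 1, tzinfo=timezone.utc)
print(is_expired("2026-03-23T14:30:00Z", now, 90))   # True
print(is_expired("2026-03-23T14:30:00Z", now, 365))  # False
```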
Archival
Interactions older than the retention period are archived:
NOXYS_INTERACTION_ARCHIVE_ENABLED=true
NOXYS_INTERACTION_ARCHIVE_AFTER_DAYS=30
NOXYS_INTERACTION_ARCHIVE_LOCATION=s3://my-archive-bucket
Encryption
In Transit (Network)
Network connections use TLS 1.3; it is optional only where noted:
Extension → API: TLS 1.3
API → Database: TLS 1.3 (optional for local networks)
API → Redis: TLS 1.3 (optional)
Webhooks: TLS 1.3
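A client that talks to the API can refuse anything older than TLS 1.3. The sketch below shows one way to do that with Python's standard `ssl` module; the helper name `tls13_client_context` is hypothetical and this is not taken from the Noxys codebase.

```python
import ssl


def tls13_client_context() -> ssl.SSLContext:
    """Client context that verifies certificates and rejects TLS 1.2 and older."""
    context = ssl.create_default_context()  # certificate and hostname checks enabled
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    return context


context = tls13_client_context()
print(context.minimum_version.name)  # TLSv1_3
```

Pass the context to `urllib.request.urlopen(url, context=context)` or to any client that accepts an `SSLContext`.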
At Rest (Storage)
Option 1: Disk Encryption (Recommended for Production)
# Linux: dm-crypt
sudo cryptsetup luksFormat /dev/sdb1
sudo cryptsetup luksOpen /dev/sdb1 encrypted-data
sudo mkfs.ext4 /dev/mapper/encrypted-data
Option 2: Application-Level Encryption
# In .env
NOXYS_ENCRYPTION_AT_REST_ENABLED=true
NOXYS_ENCRYPTION_KEY_PATH=/etc/noxys/encryption/key
NOXYS_ENCRYPTION_ALGORITHM=aes-256-gcm
Option 3: Cloud-Provider Encryption
- AWS RDS: Encryption at rest is available but must be enabled when the instance is created
- Azure Database: Encryption enabled by default
- GCP Cloud SQL: Encryption enabled by default
Hashing & One-Way Functions
SHA-256 Hashing
Noxys uses SHA-256 to create prompt fingerprints:
Original Prompt:
"My email is john.doe@company.com and my SSN is 123-45-6789"
SHA-256 Hash:
a1b2c3d4e5f6...789 (64-character hex string)
Property: SHA-256 is a one-way function; the original prompt cannot be recovered from the hash.
Why Hashing Is Sufficient
- Collision resistance: Practically impossible to find two prompts with same hash
- Non-reversible: Cannot recover original prompt from hash
- Deterministic: Same prompt always produces same hash (allows deduplication)
- Efficient: Hash computation is fast (<1ms)
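The deterministic and fixed-length properties listed above are easy to check with Python's standard `hashlib`. The helper name `fingerprint` is an assumption for this sketch.

```python
import hashlib


def fingerprint(prompt: str) -> str:
    """Return the SHA-256 digest of a prompt as a 64-character hex string."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


prompt = "My email is john.doe@company.com and my SSN is 123-45-6789"
first = fingerprint(prompt)
second = fingerprint(prompt)
changed = fingerprint(prompt + " ")  # a single extra character

print(len(first))        # 64
print(first == second)   # True: same input, same hash (enables deduplication)
print(first == changed)  # False: any change yields a different hash
```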
Data Classification
Noxys classifies content into tiers:
Tier 1: Regex (Client-Side, Built-In)
Detects with regex patterns, no PII stored:
- Email addresses ([a-z]+@[a-z]+\.[a-z]+)
- Phone numbers (E.164 format)
- Credit card numbers (Luhn algorithm)
- IBAN (international bank accounts)
- French NIR (Numéro d'Immatriculation au Répertoire)
- French SIRET/SIREN (business IDs)
Processing: Client-side in browser extension (no data sent)
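The Tier 1 checks above can be approximated in a few lines. This Python sketch is illustrative only: the real detection runs inside the browser extension, the email pattern here is broader than the simplified one shown in the list, and the names `classify` and `luhn_valid` are assumptions. It returns match counts per type and never the matched text.

```python
import re

# Illustrative patterns; the extension's actual patterns are not documented here.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(candidate: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) >= 13 and total % 10 == 0


def classify(text: str) -> list:
    """Return Tier 1 classification entries holding counts, not matched values."""
    results = []
    emails = EMAIL_RE.findall(text)
    if emails:
        results.append({"type": "email", "tier": 1, "count": len(emails)})
    cards = [m for m in CARD_RE.findall(text) if luhn_valid(m)]
    if cards:
        results.append({"type": "credit_card", "tier": 1, "count": len(cards)})
    return results


sample = "Contact john.doe@company.com or jane@example.org, card 4111 1111 1111 1111"
print(classify(sample))
```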
Tier 2: Server-Side Classification (Optional)
Uses Microsoft Presidio for deeper detection:
- Person names (via NER model)
- Medical terms (ICD-10, medical entities)
- Legal references (statute names, etc.)
- IP addresses (IPv4, IPv6)
- API keys (AWS, Azure, GCP patterns)
- JWT tokens
- Financial amounts (currency patterns)
Processing: Backend server; content is hashed before sending
Tier 3: Semantic Classification (Coming v0.5)
Uses local LLM (Mistral 7B) for context-aware classification:
- Domain-specific PII (research data, source code)
- Business-sensitive information
- Intellectual property
- Context-dependent classifications
Processing: Local LLM (on your infrastructure, no data leaves)
Enable in .env:
NOXYS_CLASSIFICATION_TIER2_ENABLED=true # Server-side
NOXYS_CLASSIFICATION_TIER3_ENABLED=false # Local LLM (coming soon)
Data Privacy by Country
European Union (GDPR)
- ✅ Data stored in EU data centers
- ✅ Compliance with GDPR Articles 5, 25, 32-34
- ✅ Data processing agreements available
- ✅ Right to erasure (delete all user data)
- ✅ Data portability (export as JSON/CSV)
- ✅ No transfers to US without explicit consent
Germany (GDPR + NIS2)
- ✅ Data processed in Frankfurt (AWS eu-central-1 available)
- ✅ German data protection authority cooperation
- ✅ NIS2 compliance features
France (GDPR + French Data Protection Authority)
- ✅ Data processed in Paris (AWS eu-west-3)
- ✅ CNIL compliance
- ✅ French language support
United Kingdom (UK GDPR)
- ⏳ UK data residency coming in v0.4
- ✅ Currently: EU processing with UK adequacy decision
User Rights & Requests
Right to Access
Export all your data:
curl -X GET https://api.noxys.cloud/api/v1/users/me/data-export \
-H "Authorization: Bearer $TOKEN"
Returns: JSON file with all interactions, policies, audit logs.
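The same export call can be scripted. The sketch below uses only the Python standard library and mirrors the curl command above; the `Accept` header, the 30-second timeout, and the function names are assumptions, and the shape of the returned JSON is not specified on this page.

```python
import json
import urllib.request

API_BASE = "https://api.noxys.cloud/api/v1"


def build_export_request(token: str) -> urllib.request.Request:
    """Build the GET request for the data-export endpoint shown above."""
    return urllib.request.Request(
        f"{API_BASE}/users/me/data-export",
        method="GET",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",  # assumption: not required by the docs
        },
    )


def fetch_export(token: str):
    """Send the request and parse the JSON body (performs a network call)."""
    with urllib.request.urlopen(build_export_request(token), timeout=30) as response:
        return json.load(response)


request = build_export_request("example-token")
print(request.get_method(), request.full_url)
```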
Right to Erasure
Delete all your data:
curl -X DELETE https://api.noxys.cloud/api/v1/users/me \
-H "Authorization: Bearer $TOKEN"
Timeline: Deleted within 30 days (backups expire within 90 days).
Right to Rectification
Update your profile:
curl -X PATCH https://api.noxys.cloud/api/v1/users/me \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"first_name": "Jean",
"last_name": "Dupont"
}'
Right to Portability
Export data in machine-readable format:
Dashboard → Settings → Export Data → Select format (JSON, CSV, NDJSON)
Third-Party Data Handling
Subprocessors
Noxys uses these third-party processors:
| Service | Purpose | Region | DPA |
|---|---|---|---|
| AWS | Cloud hosting | eu-west-1 (Ireland) | Yes |
| Stripe | Payments | US (tokenized) | Yes |
| SendGrid | Email | US | Yes |
| Slack | Notifications | US (webhook) | Optional |
No Analytics or Tracking
- ✅ No Google Analytics
- ✅ No Mixpanel / Amplitude
- ✅ No data sharing with advertisers
- ✅ No data brokers
Telemetry (Opt-In)
Noxys can send anonymized usage data to help improve the product:
NOXYS_TELEMETRY_ENABLED=true # Opt in; telemetry is disabled by default
NOXYS_TELEMETRY_ENDPOINT=https://telemetry.noxys.eu
Telemetry includes:
- Feature usage (which policies, which classifications)
- Error rates (no details, just metrics)
- Performance metrics (latency percentiles)
Never includes: Prompt text, user data, PII.
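A telemetry payload limited to the three categories above might be assembled as follows. This is a sketch under assumptions: the payload keys, the choice of p50/p95/p99, and the function name `build_telemetry` are all hypothetical, and the real payload format is not documented here.

```python
import statistics


def build_telemetry(feature_counts: dict, request_count: int,
                    error_count: int, latencies_ms: list) -> dict:
    """Aggregate raw measurements into counts, a rate, and latency percentiles."""
    # quantiles(n=100) returns 99 cut points; it needs at least two samples.
    cuts = statistics.quantiles(latencies_ms, n=100)
    return {
        "feature_usage": dict(feature_counts),
        "error_rate": error_count / request_count if request_count else 0.0,
        "latency_ms": {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]},
    }


payload = build_telemetry(
    {"policy_block": 3, "classification_email": 12},
    request_count=200,
    error_count=4,
    latencies_ms=[float(ms) for ms in range(1, 101)],
)
print(sorted(payload))  # ['error_rate', 'feature_usage', 'latency_ms']
```

Only aggregates go into the payload: individual latencies, user identifiers, and prompt hashes never appear in it.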
Compliance & Auditing
Audit Trail
All data access is logged:
Admin views interactions → Logged to audit trail
API key created → Logged
Policy changed → Logged with before/after values
User invited → Logged
Data exported → Logged with timestamp
View: Dashboard → Settings → Audit Log
Data Minimization
Noxys only collects what's necessary:
- ❌ No biometric data
- ❌ No location tracking
- ❌ No behavior profiling
- ❌ No device fingerprinting
Legal Holds
For legal proceedings, Noxys can place a hold on data deletion:
Email: legal@noxys.eu with:
- Tenant ID
- User ID(s)
- Reason for hold
- Expected duration
International Data Transfers
EU → US Transfers
Noxys does not transfer data to the US by default.
For integrations that require US processing (Slack webhook, Stripe):
- Data is minimized (hashes, no PII)
- Transfer is optional (can be disabled)
- Legal basis: Your explicit consent or contract
Data Localization
Choose where your data is processed:
EU-only (default):
AWS eu-west-1 (Ireland)
Azure westeurope (Netherlands)
GCP europe-west1 (Belgium)
Self-hosted (your infrastructure):
Your VPC / On-premise only
Next Steps
Data privacy questions? Email privacy@noxys.eu or security@noxys.eu