Data Handling & Privacy

Noxys implements privacy-by-design principles. This page explains exactly what data is stored, how it is protected, and which privacy guarantees apply.

Core Privacy Principle

Raw prompts are never stored.

Only SHA-256 hashes and metadata are retained. The hashes are one-way, so the original prompt cannot be recovered from them.

What Data Is Stored

AIInteraction Records

The canonical data model for all AI usage is the AIInteraction.

Stored

{
  "id": "int-12345",
  "tenant_id": "org-abc123",
  "user_id": "user-xyz789",
  "platform_id": "chatgpt",          // OpenAI, Anthropic, etc.
  "model": "gpt-4-turbo",             // AI model used
  "prompt_hash": "sha256:a1b2c3...",  // SHA-256, one-way
  "response_hash": "sha256:d4e5f6...", // Response hash (optional)
  "content_length": 1240,             // Bytes
  "classifications": [
    {
      "type": "email",               // PII type: email, phone, iban, etc.
      "confidence": 0.95,            // Confidence score (0-1)
      "tier": 1,                     // Detection tier (1=regex, 2=server, 3=SLM)
      "count": 2                     // Number of matches
    }
  ],
  "policy_applied": "block-pii-chatgpt",
  "policy_action": "Block",          // Block, Coach, Log
  "risk_score": 0.87,                // Risk level (0-1)
  "timestamp": "2026-03-23T14:30:00Z",
  "platform_url": "https://chatgpt.com",  // Hostname only; the path is not kept
  "extension_version": "v0.2.1",
  "browser": "Chrome 124.0"
}
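
As a rough illustration of how the hash and size fields above can be derived without keeping the prompt, here is a minimal Python sketch. The function name `build_interaction_record` and the choice of UTF-8 encoding are assumptions made for this example, not part of the Noxys codebase.

```python
import hashlib
from datetime import datetime, timezone


def build_interaction_record(prompt: str, platform_id: str, model: str) -> dict:
    """Return hash-and-metadata fields for a prompt; the raw text is not kept."""
    raw = prompt.encode("utf-8")  # assumption: prompts are hashed as UTF-8 bytes
    return {
        "platform_id": platform_id,
        "model": model,
        # One-way fingerprint, using the "sha256:" prefix shown in the record above.
        "prompt_hash": "sha256:" + hashlib.sha256(raw).hexdigest(),
        # Size in bytes of the encoded prompt.
        "content_length": len(raw),
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


record = build_interaction_record("What is the capital of France?", "chatgpt", "gpt-4-turbo")
print(record["prompt_hash"], record["content_length"])
```

The returned dictionary has no `prompt` key, so nothing downstream of this function can persist the raw text.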

Not Stored

{
  "prompt": "What is my SSN? 123-45-6789",  // ❌ NEVER stored
  "response": "I cannot provide that...",    // ❌ Not stored (unless Tier 3)
  "username_in_prompt": "john.doe",          // ❌ Only hashed
  "api_key": "sk-...",                       // ❌ NEVER stored
  "full_url": "https://chatgpt.com/..."      // ❌ Hostname only
}
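
The "hostname only" rule for URLs can be sketched with Python's standard library. The helper name `hostname_only` is hypothetical; the source does not describe how the extension trims URLs.

```python
from urllib.parse import urlsplit


def hostname_only(url: str) -> str:
    """Drop scheme, port, path, query, and fragment; keep just the hostname."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no hostname in URL: {url!r}")
    return host


print(hostname_only("https://chatgpt.com/c/abc123?model=gpt-4"))  # chatgpt.com
```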

User Records

{
  "id": "user-xyz789",
  "tenant_id": "org-abc123",
  "email": "john.doe@company.com",  // Encrypted
  "password_hash": "bcrypt:...",     // Never plain text
  "first_name": "John",
  "last_name": "Doe",
  "role": "viewer",                  // admin or viewer
  "mfa_enabled": true,
  "created_at": "2026-01-15T10:00:00Z",
  "last_login": "2026-03-23T09:15:00Z",
  "status": "active"
}

Policy Records

{
  "id": "pol-abc123",
  "tenant_id": "org-abc123",
  "name": "Block ChatGPT",
  "description": "Prevent all ChatGPT usage",
  "enabled": true,
  "priority": 1,
  "conditions": {
    "platform_id": ["chatgpt"],
    "user_group": ["all"],
    "risk_level": null
  },
  "action": "Block",                 // Block, Coach, Log
  "created_by": "user-xyz789",
  "created_at": "2026-01-20T08:00:00Z",
  "modified_at": "2026-03-20T14:30:00Z"
}
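
To show how the `enabled`, `priority`, and `conditions.platform_id` fields could interact, here is a small Python sketch of policy selection. It assumes that a lower `priority` number wins and that only the platform condition is checked; the source does not state either rule. The `pol-log` policy and the `claude` platform ID are made up for the example.

```python
from typing import Optional


def first_matching_policy(policies: list, platform_id: str) -> Optional[dict]:
    """Return the enabled policy with the lowest priority number that matches."""
    candidates = [
        p for p in policies
        if p["enabled"] and platform_id in p["conditions"]["platform_id"]
    ]
    # Assumption: priority 1 outranks priority 5.
    return min(candidates, key=lambda p: p["priority"], default=None)


policies = [
    {"id": "pol-log", "enabled": True, "priority": 5,
     "conditions": {"platform_id": ["chatgpt", "claude"]}, "action": "Log"},
    {"id": "pol-abc123", "enabled": True, "priority": 1,
     "conditions": {"platform_id": ["chatgpt"]}, "action": "Block"},
]

match = first_matching_policy(policies, "chatgpt")
print(match["action"] if match else "no policy")  # Block
```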

Audit Logs

{
  "id": "audit-12345",
  "tenant_id": "org-abc123",
  "actor_id": "user-xyz789",
  "actor_email": "john.doe@company.com",
  "action": "policy_created",
  "resource_type": "policy",
  "resource_id": "pol-abc123",
  "changes": {
    "before": null,
    "after": { "name": "Block ChatGPT", "enabled": true }
  },
  "ip_address": "192.0.2.1",          // Anonymized in EU regions
  "user_agent": "Mozilla/5.0...",
  "timestamp": "2026-03-23T14:30:00Z",
  "status": "success"
}
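
The source only says that `ip_address` is anonymized in EU regions. One common approach, shown here purely as an assumption, is to zero the host bits of the address (keeping a /24 for IPv4 and a /48 for IPv6) before the log entry is written.

```python
import ipaddress


def anonymize_ip(address: str) -> str:
    """Zero the host bits: keep /24 for IPv4 and /48 for IPv6 (assumed scheme)."""
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


print(anonymize_ip("192.0.2.1"))              # 192.0.2.0
print(anonymize_ip("2001:db8:abcd:1234::1"))  # 2001:db8:abcd::
```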

What We DON'T Store

Raw prompt text
Raw LLM responses
API keys or credentials
Personally identifiable information (beyond what's needed for service)
Location data beyond timezone
Browser history (only AI platform URLs)
Cookies or tracking identifiers

Data Retention

Default Retention Policy

Data Type	Retention	Notes
Interactions	90 days	Configurable; older records archived
Audit Logs	1 year	Immutable; never modified, and removed only when the retention period expires
User Records	Indefinite	Deleted when account is closed
Backups	30 days	Automated daily backups

Customization

Override retention in .env:

NOXYS_INTERACTION_RETENTION=365  # Keep 1 year instead of 90 days
NOXYS_AUDIT_LOG_RETENTION=2555   # Keep ~7 years for compliance
NOXYS_BACKUP_RETENTION=90        # Keep 90 days of backups
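
The following Python sketch shows one way a retention value expressed in days could be turned into an expiry check. Only the variable name `NOXYS_INTERACTION_RETENTION` and the 90-day default come from this page; the helper functions are hypothetical and do not describe how Noxys itself enforces retention.

```python
import os
from datetime import datetime, timedelta, timezone


def retention_days(var: str, default: int) -> int:
    """Read a retention period in days from the environment."""
    return int(os.environ.get(var, default))


def is_expired(timestamp: str, days: int, now: datetime) -> bool:
    """True when an ISO 8601 UTC timestamp is older than the retention window."""
    created = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return now - created > timedelta(days=days)


days = retention_days("NOXYS_INTERACTION_RETENTION", 90)
now = datetime(2026, 7, 1, tzinfo=timezone.utc)
print(is_expired("2026-03-23T14:30:00Z", days, now))
```

With the 90-day default the record above is past its window on 1 July 2026; with the 365-day override it is not.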

Archival

Interactions older than the retention period are archived:

NOXYS_INTERACTION_ARCHIVE_ENABLED=true
NOXYS_INTERACTION_ARCHIVE_AFTER_DAYS=30
NOXYS_INTERACTION_ARCHIVE_LOCATION=s3://my-archive-bucket

Encryption

In Transit (Network)

All network traffic uses TLS 1.3:

Extension → API: TLS 1.3
API → Database: TLS 1.3 (optional for local networks)
API → Redis: TLS 1.3 (optional)
Webhooks: TLS 1.3
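
For clients you write yourself (for example a webhook consumer or a script calling the API), a TLS 1.3 floor can be enforced on the client side. This is a generic Python sketch using the standard `ssl` module, not Noxys code; the function name is an assumption.

```python
import ssl


def tls13_client_context() -> ssl.SSLContext:
    """Client context that verifies certificates and refuses anything below TLS 1.3."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    return context


ctx = tls13_client_context()
print(ctx.minimum_version.name)  # TLSv1_3
```

The context can be passed to `urllib.request.urlopen(..., context=ctx)` or to `http.client.HTTPSConnection`.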

At Rest (Storage)

Option 1: Disk Encryption (Recommended for Production)

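# Linux: dm-crypt with LUKS (luksFormat erases all existing data on the device)
sudo cryptsetup luksFormat /dev/sdb1
# Unlock the device as /dev/mapper/encrypted-data
sudo cryptsetup luksOpen /dev/sdb1 encrypted-data
# Create a filesystem on the unlocked mapping
sudo mkfs.ext4 /dev/mapper/encrypted-data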

Option 2: Application-Level Encryption

# In .env
NOXYS_ENCRYPTION_AT_REST_ENABLED=true
NOXYS_ENCRYPTION_KEY_PATH=/etc/noxys/encryption/key
NOXYS_ENCRYPTION_ALGORITHM=aes-256-gcm

Option 3: Cloud-Provider Encryption

AWS RDS: Encryption at rest is supported; confirm it is enabled when the instance is created
Azure Database: Encryption enabled by default
GCP Cloud SQL: Encryption enabled by default

Hashing & One-Way Functions

SHA-256 Hashing

Noxys uses SHA-256 to create prompt fingerprints:

Original Prompt:
  "My email is john.doe@company.com and my SSN is 123-45-6789"

SHA-256 Hash:
  a1b2c3d4e5f6...789 (64-character hex string)

Property: SHA-256 is a one-way function. The original prompt cannot be recovered from the hash.

Why Hashing Is Sufficient

Collision resistance: It is practically impossible to find two prompts with the same hash
Non-reversible: The original prompt cannot be recovered from the hash
Deterministic: The same prompt always produces the same hash (which allows deduplication)
Efficient: Hash computation is fast (<1ms)
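
The determinism property is what makes deduplication possible without access to prompt text. A minimal Python sketch, with a hypothetical `fingerprint` helper and made-up prompts:

```python
import hashlib
from collections import Counter


def fingerprint(prompt: str) -> str:
    """64-character hex SHA-256 digest of the UTF-8 encoded prompt."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


prompts = ["Summarize this report", "Translate to French", "Summarize this report"]
counts = Counter(fingerprint(p) for p in prompts)

print(len(counts))           # 2 distinct fingerprints
print(max(counts.values()))  # 2: one prompt was seen twice
```

Any change to the input, even a trailing space, yields a different digest, so only byte-identical prompts are grouped together.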

Data Classification

Noxys classifies content into tiers:

Tier 1: Regex (Client-Side, Built-In)

Detects with regex patterns, no PII stored:

Email addresses (simplified pattern: [a-z]+@[a-z]+\.[a-z]+)
Phone numbers (E.164 format)
Credit card numbers (validated with the Luhn algorithm)
IBAN (international bank accounts)
French NIR (Numéro d'Immatriculation au Répertoire)
French SIRET/SIREN (business IDs)

Processing: Client-side in browser extension (no data sent)
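
The shipped extension runs in the browser, so the following Python sketch is only a language-neutral illustration of the Tier 1 idea: match candidates with a regex, validate card numbers with the Luhn checksum, and report counts rather than the matched values. The patterns, the `credit_card` type name, and the function names are assumptions for this example.

```python
import re

# Illustrative patterns, not the ones used by the extension.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return bool(digits) and total % 10 == 0


def classify_tier1(text: str) -> list:
    """Return classification entries with counts only; matched values are discarded."""
    found = []
    emails = EMAIL_RE.findall(text)
    if emails:
        found.append({"type": "email", "tier": 1, "count": len(emails)})
    cards = [m for m in CARD_RE.findall(text) if luhn_valid(m)]
    if cards:
        found.append({"type": "credit_card", "tier": 1, "count": len(cards)})
    return found


sample = "Contact john.doe@company.com or jane@example.org, card 4111 1111 1111 1111."
print(classify_tier1(sample))
```

Because only the type, tier, and count leave the function, the output has the same shape as the `classifications` entries in the AIInteraction record and contains no PII.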

Tier 2: Server-Side Classification (Optional)

Uses Microsoft Presidio for deeper detection:

Person names (via NER model)
Medical terms (ICD-10, medical entities)
Legal references (statute names, etc.)
IP addresses (IPv4, IPv6)
API keys (AWS, Azure, GCP patterns)
JWT tokens
Financial amounts (currency patterns)

Processing: Backend server; content is hashed before sending

Tier 3: Semantic Classification (Coming v0.5)

Uses a local LLM (Mistral 7B) for context-aware classification:

Domain-specific PII (research data, source code)
Business-sensitive information
Intellectual property
Context-dependent classifications

Processing: Local LLM (on your infrastructure, no data leaves)

Enable in .env:

NOXYS_CLASSIFICATION_TIER2_ENABLED=true   # Server-side
NOXYS_CLASSIFICATION_TIER3_ENABLED=false  # Local LLM (coming soon)

Data Privacy by Country

European Union (GDPR)

✅ Data stored in EU data centers
✅ Compliance with GDPR Articles 5, 25, 32-34
✅ Data processing agreements available
✅ Right to erasure (delete all user data)
✅ Data portability (export as JSON/CSV)
✅ No transfers to US without explicit consent

Germany (GDPR + NIS2)

✅ Data processed in Frankfurt (AWS eu-central-1 available)
✅ German data protection authority cooperation
✅ NIS2 compliance features

France (GDPR + French Data Protection Authority)

✅ Data processed in Paris (AWS eu-west-3)
✅ CNIL compliance
✅ French language support

United Kingdom (UK GDPR)

⏳ UK data residency coming in v0.4
✅ Currently: EU processing with UK adequacy decision

User Rights & Requests

Right to Access

Export all your data:

curl -X GET https://api.noxys.cloud/api/v1/users/me/data-export \
  -H "Authorization: Bearer $TOKEN"

Returns: A JSON file with all interactions, policies, and audit logs.

Right to Erasure

Delete all your data:

curl -X DELETE https://api.noxys.cloud/api/v1/users/me \
  -H "Authorization: Bearer $TOKEN"

Timeline: Deleted within 30 days. Copies in backups expire with the backup retention period (30 days by default).

Right to Rectification

Update your profile:

curl -X PATCH https://api.noxys.cloud/api/v1/users/me \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "first_name": "Jean",
    "last_name": "Dupont"
  }'
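
If you script these requests instead of calling curl, the three calls above can be built with Python's standard library. This sketch only constructs the requests; the endpoint paths are copied from the curl examples, while `build_request` and the use of a `TOKEN` environment variable are assumptions.

```python
import json
import os
import urllib.request
from typing import Optional

API_BASE = "https://api.noxys.cloud/api/v1"
TOKEN = os.environ.get("TOKEN", "")  # same bearer token as $TOKEN in the curl examples


def build_request(method: str, path: str, token: str,
                  body: Optional[dict] = None) -> urllib.request.Request:
    """Build an authenticated request; a JSON body is attached when given."""
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(API_BASE + path, data=data, headers=headers, method=method)


export_req = build_request("GET", "/users/me/data-export", TOKEN)
erase_req = build_request("DELETE", "/users/me", TOKEN)
rectify_req = build_request("PATCH", "/users/me", TOKEN,
                            {"first_name": "Jean", "last_name": "Dupont"})

for req in (export_req, erase_req, rectify_req):
    print(req.get_method(), req.full_url)
```

Sending is left to the caller, for example with `urllib.request.urlopen(export_req)`. Keeping construction separate makes it easy to review a DELETE before it is issued.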

Right to Portability

Export data in machine-readable format:

Dashboard → Settings → Export Data → Select format (JSON, CSV, NDJSON)

Third-Party Data Handling

Subprocessors

Noxys uses these third-party processors:

Service	Purpose	Region	DPA
AWS	Cloud hosting	eu-west-1 (Ireland)	Yes
Stripe	Payments	US (tokenized)	Yes
SendGrid	Email	US	Yes
Slack	Notifications	US (webhook)	Optional

No Analytics or Tracking

✅ No Google Analytics
✅ No Mixpanel / Amplitude
✅ No data sharing with advertisers
✅ No data brokers

Telemetry (Opt-In)

Noxys can send anonymized usage data to help improve the product:

NOXYS_TELEMETRY_ENABLED=true  # Opt in; telemetry is disabled by default
NOXYS_TELEMETRY_ENDPOINT=https://telemetry.noxys.eu

Telemetry includes:

Feature usage (which policies, which classifications)
Error rates (no details, just metrics)
Performance metrics (latency percentiles)

Never includes: Prompt text, user data, PII.

Compliance & Auditing

Audit Trail

All data access is logged:

Admin views interactions → Logged to audit trail
API key created → Logged
Policy changed → Logged with before/after values
User invited → Logged
Data exported → Logged with timestamp

View: Dashboard → Settings → Audit Log

Data Minimization

Noxys only collects what's necessary:

❌ No biometric data
❌ No location tracking
❌ No behavior profiling
❌ No device fingerprinting

Legal Holds

For legal proceedings, Noxys can place a hold on data deletion. To request one, email legal@noxys.eu with:

Tenant ID
User ID(s)
Reason for hold
Expected duration

International Data Transfers

EU → US Transfers

Noxys does not transfer data to the US by default.

For integrations that require a US transfer (Slack webhook, Stripe):

Data is minimized (hashes, no PII)
Transfer is optional (can be disabled)
Legal basis: Your explicit consent or contract

Data Localization

Choose where your data is processed:

EU-only (default):
  AWS eu-west-1 (Ireland)
  Azure westeurope (Netherlands)
  GCP europe-west1 (Belgium)

Self-hosted (your infrastructure):
  Your VPC / On-premise only

Next Steps

Data privacy questions? Email privacy@noxys.eu or security@noxys.eu
