The PII Problem in AI Prompts Nobody Talks About

Ask most engineering teams whether their AI integrations handle PII correctly, and they'll say yes. Ask them how they know, and the answer usually involves some variation of: "We told engineers not to include personal data in prompts."

That's not a control. That's a guideline. And guidelines don't survive the gap between what developers intend and what their code actually does at 2 AM when an edge case hits.

How PII ends up in prompts

The most common path isn't malicious — it's architectural.

A developer builds a customer support feature. The feature retrieves relevant context to help the model answer the user's question. The context retrieval pulls from a database. The database has customer records. The records have names, email addresses, sometimes more.

The developer tests with synthetic data. Everything looks clean. The feature ships. In production, real customer records flow into the context window. The model sees names, emails, and account details as part of every support request.

Nobody decided to send PII to the AI provider. The architecture just made it the path of least resistance.

The second path: user input. A user types their Social Security Number into a form that feeds an AI assistant. Or their credit card number. Or their date of birth. The application passes the user's message directly to the model. The data is now in the provider's infrastructure, processed through their systems, logged in their traces.

The third path: developer tooling. Cursor, Copilot, and similar tools operate on codebases. Codebases contain config files, migration scripts, seed data, test fixtures. Some of that data is real. When a developer asks an AI coding assistant to "look at this test" or "help me debug this query," the context window may include real production data that ended up in test files.

None of these are hypothetical. All three happen regularly in organizations that believe they have PII controls in place.

Why it's hard to catch

PII detection in prompts has several properties that make it genuinely difficult:

Volume and speed. A high-traffic AI integration can generate thousands of requests per minute. You can't manually review prompts. Any detection has to be automated and fast enough to not add perceptible latency.

Context-dependence. On its own, "123-45-6789" looks like a Social Security Number. But in "our ticket ID is 123-45-6789" it is a ticket ID. Pure regex matching produces false positives that break legitimate requests and false negatives that let real PII through. Both precision and recall matter.
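One way to illustrate the point is a detector that looks at the words just before an SSN-shaped match instead of trusting the shape alone. This is a minimal sketch, not a production detector: the cue lists, the 30-character window, and the function name `classify_ssn_candidates` are assumptions made for illustration. The structural check reflects the published rule that no valid SSN has area 000, 666, or 900-999, group 00, or serial 0000.

```python
import re

# SSN-shaped pattern: three digits, two digits, four digits, hyphen-separated.
SSN_SHAPE = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")

# Hypothetical cue words; a real deployment would tune these on its own traffic.
POSITIVE_CUES = ("ssn", "social security")
NEGATIVE_CUES = ("ticket", "order", "invoice", "tracking")


def classify_ssn_candidates(text: str, window: int = 30) -> list[tuple[str, str]]:
    """Return (match, verdict) pairs; verdict is 'likely', 'unlikely', or 'ambiguous'."""
    results = []
    lowered = text.lower()
    for m in SSN_SHAPE.finditer(text):
        area, group, serial = m.groups()
        # Structurally impossible SSNs can be ruled out without any context.
        if area in ("000", "666") or area[0] == "9" or group == "00" or serial == "0000":
            results.append((m.group(), "unlikely"))
            continue
        # Look only at the text immediately before the match.
        before = lowered[max(0, m.start() - window):m.start()]
        if any(cue in before for cue in POSITIVE_CUES):
            verdict = "likely"
        elif any(cue in before for cue in NEGATIVE_CUES):
            verdict = "unlikely"
        else:
            verdict = "ambiguous"
        results.append((m.group(), verdict))
    return results
```

How to treat "ambiguous" matches is a policy decision. Treating them as PII trades some false positives for fewer leaks.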

Format variation. Phone numbers appear in dozens of formats: (555) 867-5309, 555-867-5309, +15558675309, 5558675309. A detector that catches one format and misses another is worse than useless — it creates a false sense of security.
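A common way to handle format variation is to match loosely and then normalize, rather than writing one pattern per format. The sketch below assumes US numbers only. The names `PHONE_CANDIDATE`, `normalize_us_phone`, and `find_phones` are illustrative, not taken from any particular library.

```python
import re
from typing import Optional

# Loose candidate pattern: optional "+", optional "(", then digits mixed with
# common separators. Validation happens after normalization, not here.
PHONE_CANDIDATE = re.compile(r"\+?\(?\d[\d\s().-]{8,}\d")


def normalize_us_phone(candidate: str) -> Optional[str]:
    """Reduce a candidate to +1XXXXXXXXXX, or return None if it is not 10 digits."""
    digits = re.sub(r"\D", "", candidate)
    # Drop a leading US country code if present.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None
    return "+1" + digits


def find_phones(text: str) -> list[str]:
    """Return the normalized form of every phone-shaped span in the text."""
    found = []
    for m in PHONE_CANDIDATE.finditer(text):
        normalized = normalize_us_phone(m.group())
        if normalized is not None:
            found.append(normalized)
    return found
```

All four formats listed above reduce to the same normalized value, so one check covers them. A pattern this loose will also match other ten-digit numbers, so it works best combined with a context check.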

PII in unexpected places. An email address in the subject line of a customer email that became part of a knowledge base document that got retrieved as context. A partial credit card number in a support ticket note. These aren't in structured fields where you'd think to look.

The three response modes

Once you detect PII, you have three choices:

Block. Reject the request entirely. This is maximally protective and maximally disruptive. Appropriate for high-sensitivity data types (SSN, credit card number, medical record number) in contexts where the use case doesn't require them. If a customer support bot receives a message containing a credit card number, blocking is the right call, because the model shouldn't need it to answer the question.
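As a rough illustration of block mode, the sketch below rejects any message containing a card-shaped number that passes the Luhn checksum. The Luhn check is the standard card-number checksum. The candidate pattern, the `BlockedRequest` exception, and the `enforce_block` function are assumptions made for this example.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


class BlockedRequest(Exception):
    """Raised instead of forwarding the request. Carries the detector name only."""


def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def enforce_block(message: str) -> str:
    """Return the message unchanged, or raise if it contains a plausible card number."""
    for m in CARD_CANDIDATE.finditer(message):
        digits = re.sub(r"\D", "", m.group())
        if luhn_valid(digits):
            # The exception names the detector, never the matched value.
            raise BlockedRequest("credit_card")
    return message
```

The checksum step is what keeps arbitrary 16-digit identifiers from blocking legitimate requests.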

Obfuscate. Replace the PII with a deterministic token before the request leaves your network. "Patient John Smith's SSN is 123-45-6789" becomes "Patient <PII:NAME:a3f9>'s SSN is <PII:SSN:7c2b>". The model works with the tokenized version. When the response comes back, the tokens are replaced with the originals in post-processing. The model's answer is coherent; the provider never saw the real data.

This is the most useful mode for most integrations. It lets legitimate use cases function while keeping the actual values inside your perimeter.
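The round trip can be sketched in a few lines. This example assumes tokens are derived with a keyed hash (HMAC-SHA256), which is one way to get determinism without making tokens guessable. It handles SSNs only, since names need an entity recognizer rather than a regex. The function names and the 8-character token suffix are illustrative choices, not a description of any specific product.

```python
import hashlib
import hmac
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def make_token(kind: str, value: str, key: bytes) -> str:
    """Same value and key always give the same token; the key stays inside your network."""
    digest = hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:8]
    return f"<PII:{kind}:{digest}>"


def obfuscate(text: str, key: bytes) -> tuple[str, dict[str, str]]:
    """Replace each SSN with a token and return the token-to-original mapping."""
    mapping: dict[str, str] = {}

    def replace(m: re.Match) -> str:
        token = make_token("SSN", m.group(), key)
        mapping[token] = m.group()
        return token

    return SSN.sub(replace, text), mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Swap tokens in the model's response back to the original values."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text
```

Because the token is deterministic, a value that appears twice in one prompt maps to the same token, which keeps the model's references consistent. The mapping returned by `obfuscate` is sensitive and needs the handling described in the next section.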

Log. Record the detection event without modifying the request. Use this for audit purposes — you want to know how often PII appears in this integration, which detectors are firing, and whether the patterns suggest a data handling problem upstream. Don't use it as a primary control; use it to measure before deciding between block and obfuscate.
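In practice the mode is usually chosen per detector, and a single request can trigger several detectors at once. The sketch below picks the strictest applicable action. The policy table, the default, and the `decide` function are hypothetical and would differ by integration.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    BLOCK = "block"
    OBFUSCATE = "obfuscate"
    LOG = "log"


# Hypothetical per-detector policy for a customer support integration.
POLICY = {
    "ssn": Action.BLOCK,
    "credit_card": Action.BLOCK,
    "email": Action.OBFUSCATE,
    "phone": Action.LOG,
}

# Assumption: detectors without an explicit policy are only measured.
DEFAULT_ACTION = Action.LOG

# Higher number means stricter.
SEVERITY = {Action.LOG: 0, Action.OBFUSCATE: 1, Action.BLOCK: 2}


def decide(detectors_fired: list[str]) -> Optional[Action]:
    """Return the strictest action among the detectors that fired, or None if none did."""
    if not detectors_fired:
        return None
    actions = [POLICY.get(name, DEFAULT_ACTION) for name in detectors_fired]
    return max(actions, key=SEVERITY.__getitem__)
```

Starting a detector in log mode and promoting it once the false-positive rate is understood matches the measure-first approach described above.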

The quarantine problem

Obfuscation requires a temporary store: the mapping from token to original value, held long enough to decode the response. This store is itself sensitive data. If it leaks, you've moved the problem rather than solved it.

The right answer is burn-after-read. The mapping is sealed with envelope encryption (per-tenant key, sealed by a master KMS key), stored only as long as the round-trip takes, and marked expired on use. After the response is decoded, the plaintext mapping is gone. An attacker who compromises the store after the request completes finds expired ciphertext.
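The read-once and expiry behavior can be sketched with an in-memory store. This example deliberately leaves out the envelope encryption: the plain dictionary stands in for the sealed record, and a real implementation would encrypt the mapping before storing it. The class name `BurnAfterReadStore` and its methods are assumptions for illustration.

```python
import secrets
import time
from typing import Optional


class BurnAfterReadStore:
    """Holds a token mapping for one round trip; each entry can be read at most once."""

    def __init__(self, ttl_seconds: float = 30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries: dict[str, tuple[float, dict[str, str]]] = {}

    def put(self, mapping: dict[str, str]) -> str:
        """Store a mapping and return an unguessable request ID for retrieving it."""
        request_id = secrets.token_hex(16)
        self._entries[request_id] = (self._clock() + self._ttl, dict(mapping))
        return request_id

    def take(self, request_id: str) -> Optional[dict[str, str]]:
        """Return the mapping and destroy it. Expired or unknown IDs return None."""
        # pop() removes the entry whether or not it is still valid,
        # so a second read always finds nothing.
        entry = self._entries.pop(request_id, None)
        if entry is None:
            return None
        expires_at, mapping = entry
        if self._clock() > expires_at:
            return None
        return mapping
```

The TTL here is a backstop for requests that never complete, not the primary cleanup path. The read itself is what removes the entry.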

This is not a common implementation. Most teams that build PII obfuscation build it once, under time pressure, and leave the mappings in Redis with a generous TTL. That's better than nothing. It's not a design you'd want to explain to an auditor.

What the audit trail should capture

When a PII detection event fires, the audit record should include:

Timestamp
Org and project
Which detector fired (SSN, credit card, email, etc.)
The action taken (block, obfuscate, log)
Confirmation that the action was applied, not the detected value itself

Critically: the raw PII value should never appear in the audit log. If your audit log contains the actual SSN that triggered the detector, you've created a second place where PII lives, probably with weaker access controls than your primary data store.

The audit log is evidence that the control worked, not a copy of what was detected.
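One way to enforce this structurally is to give the audit record no field that could hold the detected value. The sketch below uses the fields listed above. The class name `PiiAuditEvent`, the helper functions, and the JSON-lines output format are assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PiiAuditEvent:
    """Describes a detection. There is intentionally no field for the matched value."""
    timestamp: str
    org: str
    project: str
    detector: str  # e.g. "ssn", "credit_card", "email"
    action: str    # "block", "obfuscate", or "log"


def build_event(org: str, project: str, detector: str, action: str) -> PiiAuditEvent:
    """Create an event stamped with the current UTC time."""
    return PiiAuditEvent(
        timestamp=datetime.now(timezone.utc).isoformat(),
        org=org,
        project=project,
        detector=detector,
        action=action,
    )


def to_log_line(event: PiiAuditEvent) -> str:
    """Serialize to one JSON line, suitable for an append-only log."""
    return json.dumps(asdict(event), sort_keys=True)
```

Because the constructor takes only the detector name and the action, a caller cannot pass the raw match into the log by accident.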

The CI gate question

How do you verify that your PII controls are actually running in production? The uncomfortable answer is that most organizations don't — they assume the code they wrote is executing the way they think it is.

A more defensible posture: a CI gate that runs your test suite's prompts through the detection engine and verifies that known PII patterns are caught. A test that sends "My SSN is 123-45-6789" and asserts that the request is either blocked or obfuscated before leaving the test harness. This doesn't prove production is correct — but it proves the detection logic works on a known input, and it fails the build if someone accidentally disables or misconfigures the engine.
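Such a gate might look like the pytest-style test below. The `guard_prompt` function is a stand-in for whatever detection engine sits in front of the provider. In a real suite the test would import the production entry point rather than define its own. All names here are hypothetical.

```python
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


class Blocked(Exception):
    """Raised when a request is rejected in block mode."""


def guard_prompt(prompt: str, mode: str = "obfuscate") -> str:
    """Stand-in for the real detection engine (assumption for this example)."""
    if not SSN.search(prompt):
        return prompt
    if mode == "block":
        raise Blocked("ssn")
    if mode == "obfuscate":
        return SSN.sub("<PII:SSN>", prompt)
    return prompt  # any other mode passes the prompt through unchanged


def test_known_ssn_never_leaves_harness():
    outbound = []  # everything the fake provider receives

    def fake_provider(prompt: str) -> str:
        outbound.append(prompt)
        return "ok"

    for mode in ("block", "obfuscate"):
        try:
            fake_provider(guard_prompt("My SSN is 123-45-6789", mode))
        except Blocked:
            pass  # blocked requests never reach the provider

    # The obfuscated request should arrive; the raw SSN should not.
    assert len(outbound) == 1
    assert all("123-45-6789" not in p for p in outbound)
```

If someone switches the integration to a pass-through mode or removes the guard call, the raw value reaches `outbound` and the build fails.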

The same logic applies to traces. OpenTelemetry spans for AI requests should never contain raw prompt content if that content might include PII. A test that scans emitted spans for PII-shaped strings — and fails CI if any are found — closes a gap that's easy to miss.
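A span scan can be sketched the same way. To stay self-contained, this example represents spans as plain dictionaries with a name and an attributes map. A real test would collect spans from whatever in-memory exporter the tracing SDK provides and convert them to this shape. The patterns and the `find_pii_in_spans` function are illustrative.

```python
import re

# A small set of PII-shaped patterns; a real scan would reuse the engine's detectors.
PII_SHAPES = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def find_pii_in_spans(spans: list[dict]) -> list[tuple[str, str, str]]:
    """Return (span_name, attribute_key, detector) for each PII-shaped attribute value."""
    findings = []
    for span in spans:
        for key, value in span.get("attributes", {}).items():
            for detector, pattern in PII_SHAPES.items():
                if pattern.search(str(value)):
                    # Report where the problem is, not the value that matched.
                    findings.append((span["name"], key, detector))
    return findings


def assert_spans_clean(spans: list[dict]) -> None:
    """Fail the test run if any span attribute contains a PII-shaped string."""
    findings = find_pii_in_spans(spans)
    assert not findings, f"PII-shaped data in spans: {findings}"
```

The findings name the span and attribute key only, so the CI output does not itself become a place where the leaked value is copied.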

The regulatory context

GDPR, CCPA, HIPAA, and the emerging EU AI Act all have something to say about personal data in AI systems. The details differ, but the pattern is consistent: you need to know what personal data your systems process, where it goes, and what controls are in place.

"We told engineers not to include PII" doesn't satisfy any of these. "Our gateway detects and obfuscates PII before it leaves our network, with an append-only audit log of every detection event" is a different conversation with a regulator.

It's also a different conversation with a customer who asks "does your AI product see my personal data?" The answer "no — it's obfuscated before the request leaves our infrastructure" is a better answer than "we try not to include it."

Visionality's PII engine implements all of this: twelve detectors, three response modes, burn-after-read quarantine, and CI-verified span cleanliness.