A comprehensive look at our client-side sanitization architecture, privacy guarantees, and detection algorithms.
Clean My Prompt is built on a zero-trust, client-only architecture. Every line of code executes in your browser's JavaScript sandbox, with no server communication after the initial page load.
Core Principle:
"Don't trust servers with sensitive data. Process it locally where the user has full control."
When you paste text, it goes through a multi-stage sanitization pipeline:
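// User pastes text into textarea
// (assumes a single <textarea> element on the page)
const textarea = document.querySelector('textarea');
textarea.addEventListener('input', (event) => {
  const rawText = event.target.value; // Stored in JavaScript heap (RAM)
  sanitizeText(rawText);
});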
The text is held in the browser's JavaScript heap (RAM). The application never writes it to disk, so the data exists only in volatile memory.
// Compromise.js analyzes text structure
const doc = nlp(rawText);
const people = doc.people().out('array'); // ['John Smith', 'Dr. Chen']
const places = doc.places().out('array'); // ['Seattle', 'New York']
const orgs = doc.organizations().out('array'); // ['Microsoft', 'Apple Inc.']
Natural Language Processing runs first to detect contextual entities. Compromise.js uses part-of-speech tagging and entity recognition without any network calls.
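As a rough illustration of what can happen once the entity arrays are available, the sketch below swaps each detected entity for a numbered placeholder. The function name `maskEntities`, the labels `PERSON` and `PLACE`, and the shared counter are assumptions made for this example, not the tool's actual implementation.

```javascript
// Hypothetical helper: replaces detected entities with numbered placeholders.
// `entities` mirrors the arrays produced by the NLP step,
// e.g. { PERSON: ['John Smith'], PLACE: ['Seattle'] }.
function maskEntities(text, entities) {
  const mapping = {}; // placeholder -> original value
  let counter = 0;    // one counter shared by all labels (an assumption)
  let result = text;

  for (const [label, values] of Object.entries(entities)) {
    // Longest values first, so a longer name is masked before a shorter
    // name that happens to be contained in it.
    const unique = [...new Set(values)].sort((a, b) => b.length - a.length);
    for (const value of unique) {
      if (!value || !result.includes(value)) continue;
      counter += 1;
      const token = `[${label}_${counter}]`;
      mapping[token] = value;
      // split/join replaces every literal occurrence without regex escaping
      result = result.split(value).join(token);
    }
  }
  return { text: result, mapping };
}

const masked = maskEntities('John Smith met Dr. Chen in Seattle.', {
  PERSON: ['John Smith', 'Dr. Chen'],
  PLACE: ['Seattle'],
});
console.log(masked.text); // [PERSON_1] met [PERSON_2] in [PLACE_3].
```

Keeping the mapping in a plain object means it lives only in memory, in line with the no-persistence design described here.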
// Sequential pattern application
const patterns = [
  { name: 'email', regex: /\b[\w.+-]+@[\w.-]+\.[a-z]{2,}\b/gi },
  { name: 'phone', regex: /\b(?:\+?\d{1,3}[-.\s]?)?...\b/gi },
  { name: 'apiKey', regex: /\b(?:sk|pk|api_key)[-_][a-zA-Z0-9_-]{12,}\b/g },
  // ... 15+ more patterns
];
// sanitizedText starts as the output of the NLP stage;
// placeholder is the replacement token for the current pattern, e.g. [EMAIL_1]
patterns.forEach(pattern => {
  sanitizedText = sanitizedText.replace(pattern.regex, placeholder);
});
After NLP, regex patterns scan for structured data: emails, API keys, IP addresses, credit cards, IBANs, phone numbers (US/EU), passwords, and URLs.
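To make the sequential pass concrete, here is a minimal, self-contained sketch that applies a few patterns in order and numbers the placeholders. The email and API key expressions are the ones quoted above. The IPv4 expression, the `label` field, the function name `applyPatterns`, and the shared counter are assumptions for illustration only.

```javascript
// Illustrative subset of patterns. The IPV4 regex and the labels are assumptions.
const demoPatterns = [
  { label: 'EMAIL', regex: /\b[\w.+-]+@[\w.-]+\.[a-z]{2,}\b/gi },
  { label: 'API_KEY', regex: /\b(?:sk|pk|api_key)[-_][a-zA-Z0-9_-]{12,}\b/g },
  { label: 'IPV4', regex: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g },
];

function applyPatterns(text, patternList) {
  let counter = 0;
  const seen = new Map(); // original value -> token, so repeats share one token
  let result = text;

  for (const { label, regex } of patternList) {
    // String.prototype.replace resets lastIndex for global regexes,
    // so the same pattern objects can be reused across calls.
    result = result.replace(regex, (match) => {
      if (!seen.has(match)) {
        counter += 1;
        seen.set(match, `[${label}_${counter}]`);
      }
      return seen.get(match);
    });
  }
  return result;
}

const input = 'Mail ops@example.com from 10.0.0.5 with sk_live_abcdef123456';
console.log(applyPatterns(input, demoPatterns));
// Mail [EMAIL_1] from [IPV4_3] with [API_KEY_2]
```

Because patterns run one after another, the numbers follow pattern order rather than position in the text. A single combined pass would be needed to number tokens strictly left to right.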
Sanitized text is displayed in real time. Two modes are available, shown here by example:
[EMAIL_1], [PHONE_2], [API_KEY_3]
user@company.com, (555) 123-4567, api_key_prod_xyz123
// When you close the tab or navigate away:
window.addEventListener('beforeunload', () => {
  // JavaScript GC automatically frees all heap memory
  // No data persists. No traces remain.
});
Our NLP engine uses linguistic analysis to detect people's names, places, and organizations.
Why NLP First?
Names like "John Smith" can contain common words. Running NLP before regex prevents false positives from word-boundary patterns.
15+ regex patterns detect structured sensitive data:
| Category | Examples Detected | Regions |
| --- | --- | --- |
| Emails | user@example.com, john.doe+tag@company.io | Universal |
| Phone Numbers | (555) 123-4567, +49 176 1234567, 0176265124 | US, EU, DE |
| API Keys | sk_live_abc..., AKIAIOSFODNN7..., ghp_xyz... | Universal |
| IP Addresses | 192.168.1.1, 2001:0db8:85a3::8a2e:0370:7334 | IPv4, IPv6 |
| Credit Cards | 4532-1234-5678-9010, 5425 2334 3010 9903 | Universal |
| IBANs | DE89370400440532013000, FR14 2004... | EU |
| Credentials | password: abc123, username: admin | Universal |
| URLs | https://api.example.com, www.site.com | Universal |
After initial page load, no network requests are made. You can verify this:
Offline Mode Test:
Disconnect from the internet after loading the page. The tool keeps working exactly as before, which shows that all processing happens locally.
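For a scripted check in addition to the offline test, the sketch below filters Resource Timing entries for anything that started after the page finished loading. The helper name `requestsAfterLoad` is hypothetical, and the approach has limits: the resource timing buffer is finite and not every kind of connection appears in it, so the browser's Network panel remains the more complete view.

```javascript
// Hypothetical helper: lists resources whose fetch started after page load.
// `resourceEntries` are objects shaped like PerformanceResourceTiming
// (only `name` and `startTime` are used); `loadEventEnd` is in milliseconds.
function requestsAfterLoad(resourceEntries, loadEventEnd) {
  return resourceEntries
    .filter((entry) => entry.startTime > loadEventEnd)
    .map((entry) => entry.name);
}

// Sample data standing in for real browser entries.
const sampleEntries = [
  { name: 'https://example.com/app.js', startTime: 120 },
  { name: 'https://example.com/late.json', startTime: 5400 },
];
console.log(requestsAfterLoad(sampleEntries, 900));
// [ 'https://example.com/late.json' ]

// In the browser console, after typing some text into the tool:
//   const [nav] = performance.getEntriesByType('navigation');
//   requestsAfterLoad(performance.getEntriesByType('resource'), nav.loadEventEnd);
// An empty array is consistent with no requests after the initial load.
```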
Your sensitive data never touches:
JavaScript's automatic garbage collection ensures:
Sanitization happens as you type. Typical performance:
You can add custom regex patterns for domain-specific sensitive data:
Name: SSN
Regex: \b\d{3}-\d{2}-\d{4}\b
Placeholder: SSN
Name: UK_NIN
Regex: \b[A-Z]{2}\d{6}[A-D]\b
Placeholder: UK_NATIONAL_INSURANCE
Custom patterns are stored in sessionStorage and lost when you close the tab (privacy by design).
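A minimal sketch of how session-scoped pattern storage might look is shown below. The storage key `customPatterns`, the function names, and the bracketed placeholder format are assumptions for illustration. The in-memory stand-in exists only so the example also runs outside a browser, where `window.sessionStorage` would be passed instead.

```javascript
const STORAGE_KEY = 'customPatterns'; // hypothetical key name

// Stand-in with the same getItem/setItem shape as sessionStorage.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); },
  };
}

function saveCustomPattern(storage, pattern) {
  new RegExp(pattern.regex); // throws SyntaxError if the expression is invalid
  const existing = JSON.parse(storage.getItem(STORAGE_KEY) || '[]');
  existing.push(pattern);
  storage.setItem(STORAGE_KEY, JSON.stringify(existing));
}

function applyCustomPatterns(storage, text) {
  const stored = JSON.parse(storage.getItem(STORAGE_KEY) || '[]');
  return stored.reduce(
    (acc, { regex, placeholder }) =>
      acc.replace(new RegExp(regex, 'g'), () => `[${placeholder}]`),
    text
  );
}

// In the browser: const storage = window.sessionStorage;
const storage = createMemoryStorage();
saveCustomPattern(storage, {
  name: 'SSN',
  regex: '\\b\\d{3}-\\d{2}-\\d{4}\\b',
  placeholder: 'SSN',
});
console.log(applyCustomPatterns(storage, 'SSN on file: 123-45-6789'));
// SSN on file: [SSN]
```

Storing the expression as a string and compiling it on use keeps the stored value JSON-serializable, since `RegExp` objects do not survive `JSON.stringify`.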
We protect against:
We cannot protect against:
⚠️ Security Best Practice:
Use a clean, updated browser on a trusted device. Disable untrusted extensions when working with sensitive data.
Clean My Prompt is fully open source under the MIT License. You can:
GitHub: github.com/Eulex0x/cleanmyprompt