Microsoft Clarity records sessions. Sessions contain user input. User input frequently includes personally identifiable information: email addresses, phone numbers, full names, credit card details, sometimes worse. By default, Clarity masks some of this (password fields, meaning inputs marked type="password") but not most of it. Your unfiltered session recordings probably contain PII already, which makes them a GDPR violation waiting to happen.
The naive fix is to mask everything — turn on Clarity’s aggressive masking and call it compliance. The problem: aggressive masking destroys the analytical value of recordings. If every form field shows as black boxes, you can’t see what users typed, where they hesitated, what they corrected. You’re blind to the behavioral signals that make session recordings valuable in the first place.
This guide covers the selective PII masking framework we deploy for Dallas clients in regulated industries (healthcare, finance, legal): the exact Clarity configuration that masks PII while preserving CRO-relevant behavioral context, the 5 categories of data that need different treatment, the validation pattern that prevents accidental PII leaks, and the GDPR/CCPA/HIPAA implications most teams overlook.
Microsoft Clarity captures real user behavior, including PII typed into form fields. Default masking is insufficient for regulated industries (healthcare, finance, legal) and risky for any business handling EU traffic under GDPR. The selective masking framework: (1) auto-mask sensitive fields via data-clarity-mask="True" on email/phone/SSN/credit card inputs, (2) preserve UX context by NOT masking layout, button text, navigation, or scroll behavior, (3) track interactions semantically (e.g., "user typed 12 characters in email field" instead of capturing the actual email), (4) validate continuously by sampling 20 sessions monthly to check for PII leaks. The framework below covers configuration, validation, and the GDPR/CCPA implications most teams underestimate.
What PII Actually Leaks in Clarity (Without Custom Masking)
Open any Clarity session recording from a form-heavy page. By default, you’ll see:
- Email addresses typed into form fields: visible character-by-character
- Phone numbers typed into fields: visible
- Full names typed into "Name" fields: visible
- Addresses typed into shipping forms: visible
- Date of birth on demographic forms: visible
- SSN or government ID on identity verification forms: potentially visible (depends on field type)
- Credit card numbers: partially masked by default IF the field has autocomplete="cc-number"; fully visible otherwise
This is not an attack on Clarity — the same issue exists with Hotjar, FullStory, LogRocket, and every session recording tool. The default behavior is to capture what users type so analysts can see real behavior. Without explicit configuration, all that text gets captured.
Privacy implications:
- GDPR: Data minimization is a core GDPR requirement for EU users. Storing their email/name/phone in session recordings without explicit consent and a documented purpose is a violation.
- CCPA: California residents have the right to deletion. PII in session recordings must be deletable when requested.
- HIPAA: Healthcare information (including names, emails, dates of birth) in session recordings is PHI. Unmasked PHI in a third-party tool is a HIPAA violation.
- Financial regulations: Account numbers, SSNs, and certain types of customer data captured by financial services are subject to specific protection requirements (GLBA, PCI-DSS).
Microsoft Clarity’s default behavior masks type="password" fields and applies basic redaction to credit-card-flagged fields. That’s it. Everything else — email, phone, name, address, free-text fields where users might paste sensitive info — is captured by default. If you haven’t explicitly configured masking, you have unmasked PII in your recordings right now. Audit it this week, not "later."
The 5 Data Categories That Need Different Treatment
Category 1: Hard identifiers (mask fully)
SSN, government ID numbers, credit card numbers, bank account numbers, medical record numbers. These should be COMPLETELY masked — not even character count should be visible. Treatment: data-clarity-mask="True" on the input, OR ideally the value never reaches the DOM (server-side rendering, secure iframes from payment processors).
Category 2: Contact info (mask content, keep field shape)
Email addresses, phone numbers. Mask the actual value, but preserve the fact that the user typed in the field, the character count, and the timing. Treatment: data-clarity-mask="True" applied to the input. Clarity will show that something was typed (visible cursor, character timing) but not what.
Category 3: Names & addresses (mask content, keep field shape)
First name, last name, full address, city, postal code. Same treatment as contact info — mask the content, preserve the interaction patterns. Treatment: data-clarity-mask="True".
Category 4: Selections & clicks (capture normally)
Dropdown selections, radio button choices, checkbox toggles, button clicks. These almost never contain PII (they’re predefined options). Capturing them is high-value for CRO analysis. Treatment: no masking.
Category 5: UI/navigation (capture normally)
Page transitions, scroll behavior, layout interactions, hover patterns. Zero PII risk; high CRO value. Treatment: no masking.
The principle: mask CONTENT in PII-bearing fields, preserve everything else. You still see the user typing in the email field (which tells you they engaged with it), how long they took, whether they corrected typos, whether they rage-clicked elsewhere after. You just don’t see the actual email address.
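The five categories can be expressed as a small lookup that decides the treatment for each field. This is a minimal sketch, not part of Clarity: the `classifyField` helper and the name patterns are assumptions made for illustration, and a real form inventory would need its own patterns. Category 5 (UI/navigation) does not appear because it is not a form field.

```javascript
// Hypothetical name patterns for each PII-bearing category; adjust to your own forms.
const CATEGORY_PATTERNS = [
  { category: 'hard-identifier', pattern: /ssn|social|passport|card|account|iban|mrn/i },
  { category: 'contact', pattern: /email|phone|tel|mobile/i },
  { category: 'name-address', pattern: /name|address|street|city|postal|zip/i },
];

// Field types whose values are predefined options or clicks (Category 4).
const SELECTION_TYPES = new Set(['select', 'radio', 'checkbox', 'button', 'submit']);

// Returns the category and whether the field should carry data-clarity-mask="True".
function classifyField({ type, name }) {
  if (SELECTION_TYPES.has(type)) {
    return { category: 'selection', mask: false };
  }
  for (const { category, pattern } of CATEGORY_PATTERNS) {
    if (pattern.test(name)) {
      return { category, mask: true };
    }
  }
  // Unknown free-text fields are masked so they stay private by default.
  return { category: 'unclassified', mask: true };
}
```

Unrecognized text fields fall through to masked, so a field nobody classified errs on the side of privacy.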
Implementation: The Selective Masking Setup
Method 1: HTML attribute on specific fields
The simplest approach: add data-clarity-mask="True" directly to PII-bearing input elements.
<!-- Mask the value but preserve interaction patterns -->
<input type="email" name="email" data-clarity-mask="True">
<input type="tel" name="phone" data-clarity-mask="True">
<input type="text" name="full_name" data-clarity-mask="True">
<input type="text" name="street_address" data-clarity-mask="True">
<!-- Don't mask selections or buttons -->
<select name="industry">...</select> <!-- No mask needed -->
<input type="radio" name="company_size" value="50_to_200"> <!-- No mask -->
<!-- Mask the WHOLE container if it might contain PII -->
<div data-clarity-mask="True">
<p>Account number: {{ user.account_number }}</p>
</div>
Method 2: CSS class-based masking
For sites where you control templates centrally, use a CSS class approach:
<!-- Define a single class for all PII inputs -->
<style>
.pii-field { /* visual styles */ }
</style>
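<script>
// Auto-apply data-clarity-mask to all .pii-field elements.
// Wait for the DOM so inputs that appear after this script are included.
document.addEventListener('DOMContentLoaded', () => {
  document.querySelectorAll('.pii-field').forEach(el => {
    el.setAttribute('data-clarity-mask', 'True');
  });
});
</script>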
<!-- Then in templates -->
<input type="email" class="pii-field" name="email">
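A single querySelectorAll pass only covers elements that exist when it runs. For forms rendered later (modals, multi-step wizards), one option is to re-apply the attribute whenever nodes are added. The sketch below assumes the same .pii-field class; `maskPiiFields` is a hypothetical helper name. Whether Clarity honors an attribute that arrives after it has already observed the element is something to verify in your own recordings.

```javascript
// Adds data-clarity-mask to every .pii-field under `root` that lacks it.
// `root` is any object exposing querySelectorAll (a Document or an Element).
function maskPiiFields(root) {
  let masked = 0;
  root.querySelectorAll('.pii-field').forEach((el) => {
    if (el.getAttribute('data-clarity-mask') !== 'True') {
      el.setAttribute('data-clarity-mask', 'True');
      masked += 1;
    }
  });
  return masked;
}

// In a browser, mask the initial DOM and re-run whenever nodes are added.
if (typeof document !== 'undefined' && typeof MutationObserver !== 'undefined') {
  const start = () => {
    maskPiiFields(document);
    // Only child-list changes are observed, so setAttribute does not re-trigger this.
    new MutationObserver(() => maskPiiFields(document))
      .observe(document.documentElement, { childList: true, subtree: true });
  };
  if (document.readyState === 'loading') {
    document.addEventListener('DOMContentLoaded', start);
  } else {
    start();
  }
}
```

Rendering the attribute directly in the template remains the safer choice where you control the markup; this pass is a fallback for markup you do not control.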
Method 3: Server-side defensive defaults
The most robust approach for regulated industries: render any field type that COULD contain PII with masking enabled by default, then explicitly opt out for non-PII fields. This reverses the failure mode: new fields are private by default, not exposed by default.
{# Base form template: masks by default #}
<input type="{{ field.type }}" name="{{ field.name }}"
  {% if field.type not in ['submit', 'checkbox', 'radio'] %}
  data-clarity-mask="True"
  {% endif %}>
{# Specific opt-out for non-PII fields #}
{% block special_dropdown %}
<select name="industry">
  {# No mask: industries are predefined options #}
</select>
{% endblock %}
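The same private-by-default idea can be sketched for a JavaScript rendering layer. The `renderInput` helper and its `containsPii` flag are hypothetical names chosen for this example; the point is that a caller has to opt out on purpose, and forgetting the flag leaves the field masked.

```javascript
// Escapes a value for safe use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Renders an <input>. Masking is on unless the caller passes containsPii: false.
function renderInput({ type, name, containsPii = true }) {
  const attrs = [`type="${escapeAttr(type)}"`, `name="${escapeAttr(name)}"`];
  if (containsPii) {
    attrs.push('data-clarity-mask="True"');
  }
  return `<input ${attrs.join(' ')}>`;
}
```

Under these assumptions, `renderInput({ type: 'text', name: 'notes' })` produces a masked input, while `renderInput({ type: 'checkbox', name: 'terms', containsPii: false })` produces an unmasked one.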
Tracking Behavior Without Capturing PII
The trick to preserving CRO value while masking PII: track INTERACTIONS, not VALUES. Examples:
- Email field interaction: Track "user typed 12 characters in email field" via custom event, not the actual email. Tells you they engaged; doesn’t leak the address.
- Form completion timing: Track "user spent 47 seconds on form" via duration tracking. Reveals friction without exposing data.
- Field validation failures: Track "email field failed validation 3 times" via error event count. Reveals UX issues without storing the wrong-formatted emails.
- Field correction patterns: Track "user deleted and retyped name field" as a behavior signal. Indicates hesitation or typo-correction without storing the text.
- Tab/focus patterns: Track which fields users focus, in what order, with what dwell time. Reveals form flow issues without capturing entered data.
In our experience, these behavioral signals provide 80–90% of the CRO insight value of raw text capture without storing any typed values. The remaining 10–20% (actual text users type) is rarely worth the privacy/legal exposure.
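The interaction signals listed above can be collected without ever holding the typed text. The following is a minimal sketch under that assumption: `createFieldTracker` is a hypothetical helper, and it is fed only value lengths and timestamps.

```javascript
// Tracks how a user interacts with one field while storing only lengths and timings.
function createFieldTracker(fieldName) {
  let focusedAt = null;
  let dwellMs = 0;
  let lastLength = 0;
  let deletions = 0;

  return {
    onFocus(nowMs) {
      focusedAt = nowMs;
    },
    // Pass value.length, never the value itself.
    onInput(length) {
      if (length < lastLength) deletions += 1;
      lastLength = length;
    },
    onBlur(nowMs) {
      if (focusedAt !== null) {
        dwellMs += nowMs - focusedAt;
        focusedAt = null;
      }
    },
    // Returns a PII-free summary of the interaction so far.
    summary() {
      return {
        field: fieldName,
        charCount: lastLength,
        corrected: deletions > 0,
        dwellSeconds: Math.round(dwellMs / 1000),
      };
    },
  };
}
```

In a browser this would typically be wired to the field's focus, input, and blur events, for example by calling `tracker.onInput(event.target.value.length)` inside the input listener, so the value itself is read only for its length.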
Clarity custom tags (covered in our ecommerce cart tags guide) are perfect for capturing PII-free behavioral signals. Tag email_field_attempts: 3 when a user tries three times to enter email. Tag form_completion_seconds: 47 for timing. Filter sessions by these tags to find struggle patterns without ever capturing the actual data. This is the privacy-friendly equivalent of having full text.
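One way to keep tags PII-free is to restrict what a tag value is allowed to be. The sketch below assumes Clarity's custom tag call takes the form `clarity("set", key, value)`, and passes that function in as `clarityFn` so nothing depends on a global. The `sendBehaviorTags` helper and the allowed token list are hypothetical.

```javascript
// Tag values are limited to finite numbers or one of these tokens, so free text
// (and therefore typed PII) cannot be sent by accident.
const ALLOWED_TOKENS = new Set(['yes', 'no']);

function isSafeTagValue(value) {
  if (typeof value === 'number') return Number.isFinite(value);
  return typeof value === 'string' && ALLOWED_TOKENS.has(value);
}

// Sends each safe tag with clarityFn('set', key, value) and returns the skipped keys.
function sendBehaviorTags(clarityFn, tags) {
  const skipped = [];
  for (const [key, value] of Object.entries(tags)) {
    if (isSafeTagValue(value)) {
      clarityFn('set', key, String(value));
    } else {
      skipped.push(key);
    }
  }
  return skipped;
}
```

In a page this might be called as `sendBehaviorTags(window.clarity, { email_field_attempts: 3, form_completion_seconds: 47 })`. A numeric allowlist is deliberately blunt: it would still pass a phone number supplied as a number, so tag keys should be reviewed as well.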
Validation: Auditing for PII Leaks Monthly
Configuration is not enough. New form fields, marketing campaigns, and code deploys can introduce PII leaks without anyone noticing. Set up a monthly validation routine:
- Sample 20 random sessions from the past 30 days.
- For each session, look for visible text in form fields. All of it should appear as masked dots/asterisks. If you can read an email or name, that’s a leak.
- Check key forms specifically: demo request, contact, newsletter signup, checkout, account creation. New forms are highest-risk because they may not have been added to the masking config.
- Look at custom tags: ensure no tag values contain PII (e.g., last_email: "joe@example.com" is a violation; email_was_provided: "yes" is fine).
- Log findings in a tracker. Each violation gets a ticket. Each ticket has a fix deployed within 7 days. Document fixed-vs-found rate as a privacy KPI.
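Parts of this routine can be scripted. The sketch below draws a random sample and scans tag values for strings shaped like emails or phone numbers. The session shape (`{ id, tags }`) is an assumption about a list your team assembles itself, not a Clarity export format, and the regular expressions are rough heuristics. Visible text inside recordings still needs a human reviewer.

```javascript
// Draws up to `count` distinct sessions at random, without replacement.
function sampleSessions(sessions, count, random = Math.random) {
  const pool = sessions.slice();
  const picked = [];
  while (picked.length < count && pool.length > 0) {
    const index = Math.floor(random() * pool.length);
    picked.push(pool.splice(index, 1)[0]);
  }
  return picked;
}

// Rough heuristics: something@something.tld, or ten or more digits with separators.
const EMAIL_RE = /[^\s@]+@[^\s@]+\.[^\s@]+/;
const PHONE_RE = /(?:\+?\d[\s().-]*){10,}/;

// Returns one finding per tag whose value looks like an email or phone number.
function findTagLeaks(sessions) {
  const findings = [];
  for (const session of sessions) {
    for (const [key, value] of Object.entries(session.tags || {})) {
      const text = String(value);
      if (EMAIL_RE.test(text) || PHONE_RE.test(text)) {
        findings.push({ sessionId: session.id, tag: key });
      }
    }
  }
  return findings;
}
```

A monthly run would then look like `findTagLeaks(sampleSessions(allSessions, 20))`, with each finding becoming a ticket in the tracker.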
HIPAA-Specific Considerations
For Dallas healthcare clients, HIPAA imposes stricter requirements than GDPR. Key additions:
- BAA required: Microsoft offers a BAA for Clarity, but only for the paid Clarity Enterprise tier. The free tier is not HIPAA-compliant out of the box, even with full masking. Verify your tier matches your compliance needs.
- Mask broader categories: Beyond standard PII, HIPAA-relevant PHI includes any combination of identifiable + health information. Pages discussing specific conditions, treatments, or medication should mask interactive elements that could reveal patient identity.
- No recording of patient portals: Even with masking, patient portal pages typically should not be recorded at all. Use page-level exclusion in Clarity to skip these URLs.
- Document your config: HIPAA audits will ask. Maintain a "Clarity masking configuration" document showing what’s masked, what isn’t, and why each decision was made.
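Alongside exclusion configured in Clarity itself, a site can add its own guard by never injecting the recording snippet on portal URLs. This is a sketch under stated assumptions: the path prefixes and the `mayRecord` helper are hypothetical, and how the snippet actually gets loaded is left to your own tag setup.

```javascript
// Hypothetical URL prefixes that must never be recorded; replace with your own.
const EXCLUDED_PREFIXES = ['/portal', '/patient', '/billing'];

// Returns true when the page at `pathname` may load the recording script.
function mayRecord(pathname, excluded = EXCLUDED_PREFIXES) {
  const path = pathname.toLowerCase();
  return !excluded.some(
    (prefix) => path === prefix || path.startsWith(`${prefix}/`)
  );
}
```

A page template could check `mayRecord(window.location.pathname)` before adding the Clarity tag. Matching on whole path segments keeps an unrelated URL such as /portalnews recordable.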
The full setup pattern for healthcare lives in our Clarity setup guide for Dallas healthcare practices.
Real Case: Dallas Healthcare Practice Achieves HIPAA-Compliant Analytics
In March 2026 we helped a Dallas-based multi-location medical practice deploy Clarity in a HIPAA-compliant way. Their previous setup: Clarity installed with default config, full text capture, recordings stored for 90 days. Their compliance officer flagged this as a critical violation.
The migration plan:
- Day 1: Disable all existing recordings, mark for deletion at 30-day retention boundary
- Week 1: Upgrade to Clarity Enterprise tier, sign BAA with Microsoft
- Week 2: Implement selective masking using Method 2 (CSS class-based) across all forms
- Week 3: Exclude patient portal URLs from recording entirely
- Week 4: Set up custom tags for behavioral signals (form abandonment, validation errors, completion timing)
- Week 5: Validate with 30-session audit; document compliance config
5 Common PII Masking Mistakes
1. Masking everything aggressively. Hides legitimate CRO insights. Selective masking is better: PII content hidden, interaction patterns visible.
2. Forgetting new forms. New marketing campaigns add new forms. Each new form needs to be added to the masking config. Build this into deploy checklists.
3. Using the type="password" hack for non-passwords. Some teams set sensitive fields as type="password" to trigger Clarity’s default masking. This breaks autofill, accessibility, and mobile keyboard hints. Use the proper data-clarity-mask attribute instead.
4. Capturing PII in custom tags. Custom tags are searchable in the Clarity dashboard. Never set tags with actual email/phone/name values; use behavioral counts instead.
5. Never validating. Configuration drift happens. Monthly validation audits catch leaks before they become incidents. Make this a recurring calendar event.
For Dallas businesses with regulated-industry exposure (healthcare, finance, legal) or significant EU traffic, the selective masking framework is non-negotiable. For everyone else, it’s still strongly recommended: CCPA expansions and state-level privacy laws (Texas Data Privacy and Security Act, etc.) are making strict PII protection a default expectation, not an edge case. Implement this once, validate monthly, and your behavioral analytics stays both legal and useful.
Frequently Asked Questions
What’s the difference between “mask” and “exclude” in Clarity?
Mask means the field is recorded but its content is hidden (shows as dots/asterisks). The interaction is captured; the value is not. Exclude means the entire page or element is not recorded at all. Use mask for individual PII fields on otherwise-recordable pages. Use exclude for entire pages that shouldn’t be recorded (patient portals, account settings, payment confirmation pages). Both have their place; choose based on whether you need to see any interaction at all on that page.
Does masking affect heatmap data?
No. Heatmaps capture click coordinates and scroll positions, not content. Masking PII in form values doesn’t change where users click or scroll. Heatmap accuracy is preserved. The only behavioral analytics feature affected by content masking is session recording playback — you can’t SEE what users typed, but you can see they typed (cursor, timing, focus events). Heatmaps, scroll maps, and click maps all work normally with masking enabled.
How do I handle multilingual sites where field names vary?
Use the field’s name attribute (which doesn’t change with language) as your masking selector, not the visible label. <input name="email" data-clarity-mask="True"> works regardless of whether the label says "Email", "El correo", or "El. paštas". For server-side defensive masking, key on the field type and name pattern, not the displayed label text. This makes the masking config language-independent.
What about chat widgets that might contain PII in conversations?
Chat widgets (Intercom, Drift, Zendesk Chat) are particularly tricky. Users may type personal information into chat messages. The widget element should usually be excluded entirely from session recording: <div id="chat-container" data-clarity-region="hidden">. Better: configure Clarity to exclude the chat domain entirely if the chat is served from a separate origin. The CRO value of seeing chat content is low compared to the privacy risk.
Will users be notified that their sessions are being recorded?
You should notify them, regardless of masking. Most jurisdictions require informed consent for behavioral data collection. The standard approach: cookie consent banner that includes session recording in its scope, with an option to decline. For GDPR compliance, the banner must allow declining without dark patterns. For Dallas businesses targeting EU traffic, this is mandatory. For US-only traffic, it’s recommended practice and may soon be required by state laws (Texas, Florida, others moving in this direction).
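A consent banner only helps if recording really waits for the visitor's choice. The sketch below shows one way to gate it: `createConsentGate` is a hypothetical helper, and `loadRecorder` stands in for whatever function injects the Clarity snippet on your site.

```javascript
// Starts recording only after the visitor has opted in.
// `loadRecorder` is a hypothetical callback that injects the recording snippet.
function createConsentGate(loadRecorder) {
  let loaded = false;
  return {
    // Call with the visitor's banner choice; declining leaves the recorder unloaded.
    decide(accepted) {
      if (accepted && !loaded) {
        loadRecorder();
        loaded = true;
      }
      return loaded;
    },
  };
}
```

Because the recorder loads at most once and only on acceptance, a declined or ignored banner results in no session recording at all.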
Want us to audit your Clarity privacy config?
We’ll review your current Clarity setup, identify PII leaks in recordings, and deploy selective masking that preserves CRO context while protecting user data. Free for healthcare, finance, and legal businesses with active behavioral analytics.