Personal data redaction in Google Analytics 4


Google Analytics recently released a new feature that helps prevent accidental collection of Personally Identifiable Information (PII) from your website, such as emails, phone numbers, and full names.

In this article, I will show you how to enable data redaction in Google Analytics 4, how it works, and when you can and can’t rely on it.


Find & configure Data redaction

You can find data redaction under the Admin section → Data Streams → Stream details → Redact data.

Data redaction setting location in google analytics 4

You will see two settings here: Email redaction and URL Query parameter redaction.

Email redaction

For GA4 properties created after October 2023, Email redaction will be enabled by default. In other cases, you will need to manually enable it here.

Email redaction uses text patterns to find strings that look like email addresses and replaces them with “(redacted)”. According to Google, it checks all event parameters and the URL query parameters that are collected as part of any event.
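To make the idea of pattern-based redaction more concrete, here is a minimal Python sketch. Google does not publish the exact pattern it uses, so the regular expression and the `redact_emails` helper below are my own assumptions, meant only to illustrate the general technique.

```python
import re

# Deliberately simple "looks like an email" pattern.
# Assumption: this is NOT Google's actual pattern, which is not public.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(value: str) -> str:
    """Replace every email-like substring with the "(redacted)" placeholder."""
    return EMAIL_PATTERN.sub("(redacted)", value)


print(redact_emails("contact: hello@ezsegment.com"))
print(redact_emails("https://ezsegment.com?testemail=hello@ezsegment.com"))
```

A pattern like this only matches text that contains a literal “@”, which is worth keeping in mind when reading the test results later in this article.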

URL Query redaction

With URL Query parameter redaction you can remove any additional personal information that is or can be collected as part of the URL. For example, if some of your URLs pass a visitor’s full name, you can list that parameter here and Google Analytics will redact it.

E.g. https://somewebsite.io?fullname=JohnConnor
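Conceptually, query parameter redaction keeps the URL structure and only replaces the values of the parameters you listed. The sketch below shows that idea in Python. The `redact_query_params` function is a hypothetical helper, and exact-match comparison of parameter names is a simplification on my side, not a description of how GA4 matches them.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_query_params(url: str, params_to_redact: list[str]) -> str:
    """Replace the values of the listed query parameters with "(redacted)"."""
    parts = urlsplit(url)
    if not parts.query:
        return url  # nothing to redact

    blocked = set(params_to_redact)
    pairs = []
    # Work on the raw query string so other parameters keep their encoding.
    for pair in parts.query.split("&"):
        key = pair.partition("=")[0]
        pairs.append(f"{key}=(redacted)" if key in blocked else pair)

    return urlunsplit(parts._replace(query="&".join(pairs)))


print(redact_query_params("https://somewebsite.io?fullname=JohnConnor", ["fullname"]))
```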

Note that URL redaction will apply only to the default event parameters that are intended for accepting URL values:

  • page_location
  • page_referrer
  • page_path
  • link_url
  • video_url
  • form_destination

With these settings enabled, data redaction happens on the client side, after Analytics modifies or creates events, so the redacted values are never sent to Google's servers.

Below the settings, you can also test URLs with all their parameters to see which values will be redacted and how the URL will look in your reports:

Preview URLs with redacted data in Google Analytics 4

From the screenshot above, you can see that in my example URL it redacted not only the query parameters I listed but also any value that contained an email address. So if I save these settings and anyone visits a URL with the same structure, this is the output that Google Analytics will see.

Example used:

https://ezexperiments.com/?email=(redacted)&name=John&fullname=(redacted)&phone=(redacted)&otherParam=(redacted)

Redacted output:

https://ezexperiments.com/?email=(redacted)&name=John&fullname=(redacted)&phone=(redacted)&otherParam=(redacted)

Testing data redaction in GTM

Unfortunately, you can’t test event parameters the same way as URLs, at least for now. So I put this to the test and created an event full of personal data to see whether anything gets through.

In Tag Manager, I added a simple GA4 event that sends not only a URL with personal data but also an event parameter with a regular email. I also added parameters with URL-encoded emails and custom parameters that contain URLs.

URL that I’ve used here:

https://ezsegment.com?fullname=JohnConnor&phone=12345&testemail=hello@ezsegment.com&testemail2=hello@ezsegment.com

And fully URL-encoded version:

https%3A%2F%2Fezsegment.com%3Ffullname%3DJohnConnor%26phone%3D12345%26testemail%3Dhello%40ezsegment.com%26testemail2%3Dhello%40ezsegment.com 

When I run this event in Preview mode, I can see that the values are indeed redacted before they are sent to GA (see the “dl” parameter), which is really good.

However, you will see that this didn’t work for the URL-encoded emails in “custom_url_encoded”. The full email was still passed to Google Analytics!
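One plausible explanation (my assumption, not something Google confirms) is that a pattern looking for a literal “@” cannot match “%40”. The Python sketch below reproduces that behaviour with a simple illustrative pattern and shows that percent-decoding the value first lets the same pattern catch the email. Both helper functions are hypothetical.

```python
import re
from urllib.parse import unquote

# Illustrative pattern only; not Google's actual implementation.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(value: str) -> str:
    """Redact email-like text in the value exactly as it was received."""
    return EMAIL_PATTERN.sub("(redacted)", value)


def redact_emails_decoded(value: str, max_rounds: int = 3) -> str:
    """Percent-decode the value (a few rounds at most), then redact."""
    decoded = value
    for _ in range(max_rounds):
        next_value = unquote(decoded)
        if next_value == decoded:
            break  # nothing left to decode
        decoded = next_value
    return redact_emails(decoded)


encoded = "https%3A%2F%2Fezsegment.com%3Ftestemail%3Dhello%40ezsegment.com"
print(redact_emails(encoded))          # unchanged: there is no literal "@"
print(redact_emails_decoded(encoded))  # decoded URL with the email redacted
```

Note that the second function returns the decoded value, so it changes the format of the parameter. Whether that is acceptable depends on how you use the value in your reports.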

As for the regular “custom_url” parameter, the email is redacted thanks to the Email redaction setting, but URL query parameter redaction does not apply to it, since it’s not one of the default URL parameters listed above.
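The difference between the two settings can be modelled with a short Python sketch: email redaction runs on every parameter, while query parameter redaction runs only on the default URL parameters. This is a simplified model of the behaviour observed in my test, not GA4's real code. The function names are hypothetical, and the URL handling ignores edge cases such as fragments.

```python
import re

# Illustrative email pattern; an assumption, not Google's implementation.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Default URL-type parameters that query parameter redaction applies to.
URL_PARAMS = {"page_location", "page_referrer", "page_path",
              "link_url", "video_url", "form_destination"}


def redact_query(url: str, blocked: set[str]) -> str:
    """Replace the values of blocked query parameters in a URL string."""
    base, sep, query = url.partition("?")
    if not sep:
        return url
    pairs = []
    for pair in query.split("&"):
        key = pair.partition("=")[0]
        pairs.append(f"{key}=(redacted)" if key in blocked else pair)
    return base + "?" + "&".join(pairs)


def redact_event(params: dict[str, str], blocked: set[str]) -> dict[str, str]:
    """Apply both settings to a flat dictionary of string event parameters."""
    redacted = {}
    for name, value in params.items():
        if name in URL_PARAMS:  # query redaction: default URL parameters only
            value = redact_query(value, blocked)
        redacted[name] = EMAIL_PATTERN.sub("(redacted)", value)  # all parameters
    return redacted


url = "https://ezsegment.com?fullname=JohnConnor&testemail=hello@ezsegment.com"
result = redact_event({"page_location": url, "custom_url": url}, {"fullname"})
print(result["page_location"])  # fullname and the email are both redacted
print(result["custom_url"])     # only the email is redacted
```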

You can also double-check which values reach GA4 in DebugView. You should see the same picture as in the browser console and network logs.

Final thoughts

As you can see, the good news is that Google is moving in the right direction, and in some cases data redaction can prevent data leaks, so it won’t hurt to keep these settings enabled.

The bad news is that it won’t magically prevent all personal data capture, so you might want to set up an additional safety net in your tag management system or directly on the website (with the help of a web developer) to avoid sending personal data to GA4.

Also, in case you were wondering, this feature works only for web tracking, that is, events collected on the website. Data redaction is not applied to Measurement Protocol requests or data imports.
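Since Measurement Protocol requests bypass data redaction, one option is to scrub the payload yourself before it is sent. Below is a minimal Python sketch of such a safety net. The `scrub` helper, the email pattern, and the “contact” parameter are hypothetical and only for illustration. The payload follows the general Measurement Protocol shape (a client ID plus a list of events with params), and the sketch does not send anything.

```python
import re
from urllib.parse import unquote

# Illustrative email pattern; tune it to your own data.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrub(value):
    """Recursively redact email-like strings in nested dicts and lists."""
    if isinstance(value, str):
        decoded = unquote(value)  # so that %40-encoded emails are caught too
        if EMAIL_PATTERN.search(decoded):
            return EMAIL_PATTERN.sub("(redacted)", decoded)
        return value  # leave clean strings exactly as they were
    if isinstance(value, dict):
        return {key: scrub(item) for key, item in value.items()}
    if isinstance(value, list):
        return [scrub(item) for item in value]
    return value  # numbers, booleans, None


payload = {
    "client_id": "123.456",  # placeholder client ID
    "events": [
        {"name": "sign_up",
         "params": {"contact": "hello%40ezsegment.com", "value": 1}},
    ],
}

clean = scrub(payload)
print(clean["events"][0]["params"])
```

The same idea can be ported to a custom variable or template in your tag management system for events collected on the website.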

I hope you found this quick review useful, and let me know in the comments if you have any questions!
