
Collecting the site data you need – not the site data you want

Collecting loads of personal data on your site visitors won’t lead to the insights you need. You need to look at the bigger picture.

Collecting data is exciting. There’s a big buffet of data out there to choose from, with companies analysing everything from scroll speed to mouse movement. Finding new ways to learn about your site visitors is a great technical challenge. Okay, maybe that’s just me. 

But whether you find it exciting or excruciating, collecting data on your site visitors is useful. There’s just one problem – that little voice in your head wondering if you’re going too far. How much data is too much? 

It’s much harder to collect data on your site’s visitors than it used to be. Since 2019, UK law has required websites to get explicit permission from users before tracking their activities with cookies. Personally, I think that’s great. But does it make my job more difficult?

These regulations were the result of debates about privacy. In these debates, morality is often put head-to-head with efficacy. The received wisdom is that the more data you can collect on the individuals browsing your site, the easier it will be for you to understand your audience and meet your goals. By this logic, moral considerations about privacy get in the way of effective data collection. You’re on one side, or the other. 

It’s true that these regulations make collecting data more difficult. They certainly have for me. The 2019 regulations led to a 40-60% drop in recorded sessions across many of the sites we report on. That’s a lot of lost data.

It’s not true, however, that the more data you collect, the better. 

More data ≠ more insights

The idea that collecting as much personal information about your site’s visitors as possible is the best way to understand your audience leaves a lot of businesses chasing diminishing returns. 

That’s because collecting all the personal data you can from your site visitors doesn’t just come with moral dilemmas. It can also make it harder to draw useful, actionable conclusions from your data.

Because of cookie regulations, the data you collect from your site only reflects a portion of its visitors. This means that there will be certain demographics – like privacy-conscious people who never accept cookies – that you will never be able to see.

Not only will those groups be invisible to you, but learning more and more about the users who do accept cookies isn’t going to change that. Instead, it can create the illusion that you’re gaining more insights on your site visitors as a whole when in fact you’re learning personal and often irrelevant info about an unrepresentative portion of your traffic. 
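To make that concrete, here is a minimal sketch of the arithmetic. The two groups and all the numbers are invented for illustration. They are not taken from any real site, and real gaps may be larger or smaller.

```python
# Hypothetical traffic: two groups with made-up numbers, for illustration only.
groups = {
    "accepts_cookies": {"visitors": 500, "conversions": 50},
    "rejects_cookies": {"visitors": 500, "conversions": 20},
}

def conversion_rate(selected):
    """Conversion rate across the named groups."""
    visitors = sum(groups[name]["visitors"] for name in selected)
    conversions = sum(groups[name]["conversions"] for name in selected)
    return conversions / visitors

# What analytics can see: only the visitors who accepted cookies.
observed_rate = conversion_rate(["accepts_cookies"])
# What actually happened across all visitors.
true_rate = conversion_rate(groups.keys())

print(f"Observed: {observed_rate:.1%}, true: {true_rate:.1%}")
```

In this made-up case, adding more fields about the visitors who accept cookies would not move the observed figure any closer to the real one, because the missing group is still missing.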

By looking at your visitors’ genders, locations or other personal information, you’re unlikely to find anything that will blow things wide open and give you clear, actionable insights. 

I find diving into the minutiae of new data exciting, but is it the best use of my time? If you aren’t careful, you can find yourself wasting time and money adding more unnecessary info to the pile, distracting you from the big picture. 

This doesn’t mean that you can’t draw insights from the data you have. 

Doing more with less data

Site metrics that show you how visitors are interacting with your content can be really useful. 

How far are your visitors scrolling down a page or form? Is there one place where most visitors tend to stop or leave the site? Which pages are people visiting the most? Where are most of your conversions coming from? To answer these questions, you can use GA4 to look at trends among large groups of visitors, not specific personal information about individuals. 
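As a rough sketch of what answering those questions in aggregate can look like, the snippet below summarises a handful of page-view records by page. The records and the field names (`page`, `scroll_percent`, `converted`) are assumptions made up for this example, not GA4’s export schema. Nothing in it identifies an individual visitor.

```python
from collections import defaultdict
from statistics import median

# Hypothetical, anonymised page-view records; field names are assumptions,
# not GA4's export schema.
page_views = [
    {"page": "/pricing", "scroll_percent": 90, "converted": True},
    {"page": "/pricing", "scroll_percent": 50, "converted": False},
    {"page": "/pricing", "scroll_percent": 25, "converted": False},
    {"page": "/blog", "scroll_percent": 25, "converted": False},
    {"page": "/blog", "scroll_percent": 75, "converted": False},
]

def summarise(views):
    """Group page views by page and compute group-level metrics only."""
    by_page = defaultdict(list)
    for view in views:
        by_page[view["page"]].append(view)

    summary = {}
    for page, rows in by_page.items():
        summary[page] = {
            "views": len(rows),
            "median_scroll_percent": median(r["scroll_percent"] for r in rows),
            "conversions": sum(r["converted"] for r in rows),
        }
    return summary

summary = summarise(page_views)

# Most-visited pages first.
for page, stats in sorted(summary.items(), key=lambda kv: -kv[1]["views"]):
    print(page, stats)
```

Every figure here describes a page, not a person, which is the level at which these questions are usually worth asking.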

Even when dealing with data regarding large groups of visitors and their behaviour on your site, it’s still important to question the conclusions you draw from your data. You’re never going to have the full picture with data alone. It isn’t “data-driven” to assume that the limited picture you have can provide all the answers you need. 

If you, like me, find yourself wondering how much data is too much, that’s great news. You don’t have to dig deep into every visitor’s personal info to get the insights you need. In fact, you shouldn’t. 

It can be hard to admit, but data is just one piece of the puzzle. It’s great for complementing user research, but it can’t stand on its own. Speak to your customers directly about how they use your site, and verify that the data you’ve collected paints an accurate picture. 

Then, you can make sure your site is user-driven, making it as helpful as possible to your visitors, whether they’re represented in your site data or not. 


More help

For more help making sense of your site data, email me at [email protected].
