When looking at tracking technologies on the internet in 2022, we can still see a lot of uncertainty across the globe about what is allowed and what is not. GDPR and the EU have become synonymous with data privacy regulation’s war on cookies and on collecting user data in general.
This article is not meant as legal advice. Ethical tracking is a technical concept, aims to be globally adaptable, and is a work in progress. The words used in this article have not been optimized to be compatible with any specific privacy regulation around the world. If you need legal advice on data privacy, ask your lawyer.
The internet has always been a global and open place. With data privacy regulation stepping up, it feels like virtual barriers are being introduced. If we’re honest, it may not be the legislation that causes the problem. We, the digital marketers, have simply been using technology that is nothing other than mass surveillance methodology.
Don’t get me wrong. There is nothing shady about personalized content or smart ad targeting in general. The question is just how the data enabling these technologies will be collected and aggregated.
Is it really necessary that we’re storing information about every single interaction of every single internet user across every single website in centralized systems? How may this data potentially be misused? Do the visitors on your website have the right to be informed about what data will be collected? Does there have to be a possibility to use a website without sharing your personal data? What is even personal data? And why is there such a big obsession in digital marketing about collecting it?
These are the questions we’re tackling when discussing ethical tracking.
Ethical tracking is not a well-defined concept. The reason is that ethics in general is not defined in the same way by different groups of people.
Therefore, I want to propose some general pillars of ethical tracking. Feel free to share your remarks and proposals to modify this definition.
1. Transparency: Ethical tracking means collecting only personal data that the user is aware of. It must be visible and verifiable that there is no hidden collection of any other data. Any collection of personal data requires prior consent.
2. Less is more: In contrast to the old “measure everything”, ethical tracking follows the concept of “measure what matters”. The goal is to collect only the data that we need for the desired use. Sometimes that means intentionally skipping the collection of certain information, even if it is technically available.
3. Distributed data ownership (first-party data): In ethical tracking, any collected data is owned by the person or organization with a direct relationship to the data subject. Example: A website owns the tracking data collected about its visitors. If a third party is employed for data collection or processing, this must be implemented in a transparent way.
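To make the first two pillars a bit more tangible, here is a minimal Python sketch. It assumes a hypothetical allow-list of declared fields and a hypothetical consent flag; the names DECLARED_FIELDS, PERSONAL_FIELDS and minimize_event are made up for this illustration and are not taken from the Ethical Tracking Protocol. The idea is simply that anything not declared, and any personal data without consent, is dropped before it is stored.

```python
# Hypothetical sketch: field names and the consent flag are illustrative assumptions.

# Fields the website publicly declares it collects (transparency).
DECLARED_FIELDS = {"page_path", "referrer_domain", "email"}

# Declared fields that count as personal data and therefore need prior consent.
PERSONAL_FIELDS = {"email"}


def minimize_event(raw_event: dict, has_consent: bool) -> dict:
    """Keep only declared fields; drop personal fields unless consent was given."""
    kept = {}
    for key, value in raw_event.items():
        if key not in DECLARED_FIELDS:
            continue  # "less is more": undeclared data is never stored
        if key in PERSONAL_FIELDS and not has_consent:
            continue  # no personal data without prior consent
        kept[key] = value
    return kept


raw = {
    "page_path": "/pricing",
    "referrer_domain": "example.org",
    "email": "jane@example.org",
    "ip": "203.0.113.7",  # technically available, but intentionally not declared
}
print(minimize_event(raw, has_consent=False))
print(minimize_event(raw, has_consent=True))
```

In this sketch the IP address is never stored, even with consent, because it was never declared. Which fields belong on which list is a decision each website would have to make for itself.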
The current situation in tracking is very diverse across the globe. In some parts of the world, we see tons of tracking pixels implemented on websites without any notice. In other countries, data privacy regulation requires you to inform your visitors about such data collection. Some require you to offer opt-out possibilities. And under the strictest data privacy laws, you even need to ask your visitors for explicit consent (opt-in) to tracking before you fire any pixels.
This article does not want to judge, especially as data privacy regulation across the globe does not enforce identical handling of personal data. This is just an overview of the different ways of informing visitors. However, in the context of ethical tracking, probably only the last two options (opt-out and opt-in) fit the definition of ethics given above. Share your thoughts about this in the comments.
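The four situations described above can be written down as a small decision function. This is only a sketch under my own assumptions: the names NoticeMode and may_fire_pixel are hypothetical, and the mapping is a simplification for illustration, not a statement about what any specific law requires.

```python
from enum import Enum
from typing import Optional


class NoticeMode(Enum):
    NONE = "none"        # pixels fire without any notice
    INFORM = "inform"    # visitors are informed, but have no choice
    OPT_OUT = "opt_out"  # tracking runs until the visitor objects
    OPT_IN = "opt_in"    # tracking runs only after explicit consent


def may_fire_pixel(mode: NoticeMode, visitor_choice: Optional[bool]) -> bool:
    """visitor_choice: True = accepted, False = declined, None = no choice made yet."""
    if mode in (NoticeMode.NONE, NoticeMode.INFORM):
        return True  # the visitor's choice is never asked for
    if mode is NoticeMode.OPT_OUT:
        return visitor_choice is not False  # fires unless explicitly declined
    return visitor_choice is True  # OPT_IN: fires only after an explicit yes


# A first-time visitor who has not clicked anything yet:
print(may_fire_pixel(NoticeMode.OPT_OUT, None))
print(may_fire_pixel(NoticeMode.OPT_IN, None))
```

The interesting case is the visitor who has not made a choice yet: under opt-out the pixel already fires, under opt-in it does not.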
Discussing ethical tracking and personal data, we have to clarify what “personal data” even is. For this, I want to refer to the tracking parameter classification of the Ethical Tracking Protocol, an open-source project initiated by Qreuz, my startup. This classification is incomplete and I’m also only adding some examples here to keep this article short.
Name, email, address…: obviously personal data
IP: personal data as it can help providers to identify internet access points
Ad click id: personal data as it can help ad networks to identify (virtual) profiles
User agent: in most cases probably generic enough to be no personal data

One question I come across very often is why we even need personal data. Why does digital marketing seem to be absolutely obsessed about collecting and aggregating personal data?
One reason for this is definitely the evolution of digital marketing technologies. In the early days of the internet, there were so many hilarious ad placements that were not targeted well to you. Nowadays, we have personalization and smart ads. The goal of these smart systems is to show you only relevant ads and increase the chance for clicks and conversions.
But is there no alternative to aggregating behavioral profiles of human beings to build smart ad systems? Well, the profiling-based approach is clearly the more obvious one. If it were easy to find a competitive alternative, we’d already know about it.
After investing several years in experimenting with various approaches to the problem of tracking website activity from a privacy-first plus marketing perspective, I can definitely argue that there are some use cases and specific processes that won’t work without any personal data being involved.
Just think of tracking ad conversions and automatically reporting the conversions to ad networks. How would the ad network’s algorithm be able to learn if you would not report at least the ad click id so that they know which ad click was responsible for your conversion? The only solution I’d see here is to train the algorithm with sampled data instead of data about individual internet user behavior. But this would require the provisioning of an appropriate API by ad networks and a tool on your end to control reporting.
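To illustrate what reporting without individual click ids could look like, here is a minimal Python sketch. Note that I am using plain aggregation with a minimum group size here as a simple stand-in for sampling, and that no ad network API for this is assumed to exist. All names (aggregate_conversions, MIN_GROUP_SIZE, the field names) are hypothetical.

```python
from collections import Counter

# Assumption: groups smaller than this are suppressed so that a single
# conversion cannot be traced back to a single click.
MIN_GROUP_SIZE = 3


def aggregate_conversions(conversions: list) -> list:
    """Count conversions per (campaign, date) and drop the click ids entirely."""
    counts = Counter((c["campaign"], c["date"]) for c in conversions)
    report = []
    for (campaign, date), n in sorted(counts.items()):
        if n >= MIN_GROUP_SIZE:
            report.append({"campaign": campaign, "date": date, "conversions": n})
    return report


conversions = [
    {"click_id": "a1", "campaign": "spring", "date": "2022-05-01"},
    {"click_id": "b2", "campaign": "spring", "date": "2022-05-01"},
    {"click_id": "c3", "campaign": "spring", "date": "2022-05-01"},
    {"click_id": "d4", "campaign": "summer", "date": "2022-05-01"},
]
report = aggregate_conversions(conversions)
print(report)
```

The trade-off is visible right away: the single "summer" conversion is not reported at all, which is exactly the kind of signal an ad network's algorithm would lose.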
Another example would be personalized content on your website or in newsletters. If you don’t aggregate behavioral (virtual) profiles, you can’t know how to personalize this content. There are alternative approaches to this job but to my knowledge they can’t compete with the precision of content personalized based on personal data.
Besides these two applications, there are other reasons why we are collecting data on our websites. We may want to see metrics about our general website performance, acquisition channels, or content performance. For these numbers, we could employ alternative tracking solutions that are operating on anonymous data.
Typical tools for such a job can be found in the field of on-premise website statistics. There are many of them and most of them are mature enough to be valid options for website statistics.
One big problem of any tool working with anonymous data is that there is no default definition of what “anonymous” even means. A piece of information that is absolute nonsense to you may help a third party to easily identify a real human being.
To make it even more complicated, sometimes, the aggregation of initially anonymous data may lead to generated data which then exposes personal data. 🤯
This gets especially dangerous if the tracking tools we are using are processing both anonymous and personal data.
Having worked for more than 10 years in offices with various co-workers, I am still convinced that it is possible to identify any human being by tracking patterns in keystrokes and mouse movement. At least I have heard several times that the way I hit the keyboard is unmistakable.
Even without going into science-fiction, just imagine a set of data about an absolutely anonymous website visit session which will later be attributed to a specific purchase from an online shop because the order ID was part of the session data.
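The order ID scenario takes only a few lines to reproduce. The following Python sketch uses made-up data and hypothetical names (sessions, orders, link_sessions_to_customers) to show how a single shared key is enough to attach a customer to an "anonymous" session.

```python
# "Anonymous" session data from a statistics tool (made-up example data).
sessions = [
    {"session_id": "s-001", "pages": ["/", "/shoes", "/checkout"], "order_id": "ORD-1001"},
    {"session_id": "s-002", "pages": ["/", "/blog"], "order_id": None},
]

# Order data from the online shop, which naturally contains personal data.
orders = {"ORD-1001": {"customer_email": "jane@example.org"}}


def link_sessions_to_customers(sessions: list, orders: dict) -> dict:
    """Join both data sets on the order ID and return session_id -> customer email."""
    linked = {}
    for session in sessions:
        order = orders.get(session["order_id"])
        if order is not None:
            linked[session["session_id"]] = order["customer_email"]
    return linked


linked = link_sessions_to_customers(sessions, orders)
print(linked)
```

Neither data set looks problematic on its own. The session only becomes personal data once both sets end up in the same place.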
Collecting “anonymous” data is a complex topic and we have to consider possible uses and later aggregations of the collected data. Ideally, we even keep the tools collecting anonymous data separated from the ones we use to process personal data. That’s why I have launched Qreuz, a website monitoring solution focusing on collecting website metrics for marketing and aiming to completely avoid personal data.
We may agree on the fact that ethical tracking is more complex than it initially sounds. For many use cases in marketing, we still need personal data. With the tools digital marketing has to offer today, we can’t completely neglect the collection of any personal data, especially in bigger digital businesses and performance-driven e-commerce. The power behind personalization and smart targeting is just too big.
Ethical tracking does not ban the use of personal data. But it aims to foster a discussion around how, when, and why we can collect such information from internet users. There is a difference between tracking ethically and avoiding any personal data.
Further, there are some applications, especially when it comes to statistics and performance analytics, that could run without looking at personal data. With the latest trend towards first-party data, we will probably see more and more server-side tracking solutions entering the market. Migrating to “cookie-less” is already an important topic in digital marketing and we can see a lot of discussion around collecting data without the limitations introduced by “cookie banners”.
However, what sounds like a great thing may also turn out to be toxic for privacy. By moving tracking from the frontend to the server side, it may become invisible to internet users. It will be more complicated to protect yourself from being tracked. That’s the reason why we initiated the Ethical Tracking Protocol, an open-source project to define a set of rules for ethical tracking.
Ethical Tracking Protocol on GitHub: https://github.com/EthicalTrackingProtocol/EthicalTrackingProtocol
Qreuz on Indie Hackers: https://www.indiehackers.com/product/qreuz
Qreuz website: https://qreuz.com/
Thanks for sharing in detail. This is a helpful read. Have there been any standards around this earlier? What are your thoughts on privacy-focused analytics platforms capturing just the right amount of data?
There is no industry standard yet. Probably because in the past, everybody has just been tracking "everything" by default until data privacy regulation stepped up. Still today, I see "the law" as the biggest factor in limiting what website owners are doing. I haven't heard of "ethics" in this context from marketers very often. As data privacy regulation is very diverse around the world, there is no common standard. Just look at the question if you can track personal data from visitors BEFORE or AFTER they have clicked "yes" in a consent banner. Or if there even needs to be such a banner. Or what would be the consequences if you did not put a banner there but still get visitors from countries which require it.
Data privacy law enforcement on the internet is very complicated, and it is definitely slower than the industry is at releasing new technologies to get around the latest regulation updates.
There are a few privacy-friendly analytics tools that are really good. The problem here is that the law around the world is not unified. And there is a big difference between "privacy compliant" and "allowed to be used without asking a visitor for consent before". Out of all the tools that claim to be "privacy compliant", imho, I know exactly one which you can use without explicit prior visitor consent within the EU (not taking our own solution into consideration here).
But what is web analytics worth if visitors need to explicitly opt-in to it before? I feel like we need a separation of statistical analytics data from customer-centric profile aggregation.