The question I have isn't why you need analytics, but why you'd ever need any PII in the data. I don't care whether Bob clicked the button; I only care whether 1% or 50% of users click the button, or whether those who click button A are likely to click button B, so that the two should be placed closer together. Analytics should be anonymous usage statistics, not tracking of individuals. We are lumping two things together, where one is bad and the other is useful and mostly harmless to privacy.
That's the idea, but knowing that an anonymous user who clicked button A goes on to click button B requires you to track that user via some kind of random ID that uniquely identifies their browser or device. This new Swedish ruling says that the ID is itself personal data.
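To make the distinction concrete, here is a minimal sketch of the two kinds of question. The event log, the `v1`-style visitor tokens, and both function names are made up for illustration; nothing here reflects how GA actually stores data. The per-button totals need no identifier at all, while the "A clickers who also click B" figure cannot be computed without some key that links two events to the same visitor.

```python
from collections import defaultdict

# Hypothetical event log: (visitor_id, button). The IDs are random tokens
# with no name or email attached; all values are invented for illustration.
events = [
    ("v1", "A"), ("v1", "B"),
    ("v2", "A"),
    ("v3", "B"),
    ("v4", "A"), ("v4", "B"),
]

def aggregate_counts(events):
    """Per-button totals: computable without any visitor identifier."""
    counts = defaultdict(int)
    for _visitor, button in events:
        counts[button] += 1
    return dict(counts)

def share_of_a_clickers_who_click_b(events):
    """Needs a per-visitor key to link the two clicks (click order is ignored)."""
    buttons_by_visitor = defaultdict(set)
    for visitor, button in events:
        buttons_by_visitor[visitor].add(button)
    a_clickers = [b for b in buttons_by_visitor.values() if "A" in b]
    if not a_clickers:
        return 0.0
    return sum("B" in b for b in a_clickers) / len(a_clickers)

totals = aggregate_counts(events)
a_and_b = share_of_a_clickers_who_click_b(events)
```

Dropping the first column from `events` leaves `aggregate_counts` unchanged but makes the second function impossible to write, which is exactly the tension being discussed.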
How does the ruling come to that conclusion? How does an ID that uniquely identifies a user but can't be used to trace back to a physical person be PII?
Reading the linked rulings, it seems like it's not the IP address (even though only its last octet is blanked) but rather other cookie values, which may in turn be traceable to the user?
Of course if a cookie value is sent and in some other system that same cookie value is stored next to a user's name, then that cookie value is definitely PII and can't be sent via GA, that much I understand.
The key passage from the longest ruling (DI) seems to be (translated from Swedish):
These identifiers were created for the purpose of being able to distinguish individual visitors, such as
the complainant. The unique identifiers thereby make the visitors to the Website
identifiable. Even if such unique identifiers (according to point 1 above) were not in themselves
considered to make individuals identifiable, it must nevertheless be taken into account that these unique
identifiers can, in the present case, be combined with additional elements (according to
points 2–4 above) and that it is possible to draw conclusions in relation to
information (according to points 2–4 above) which mean that the data constitute
personal data, even if the IP address has not been transferred in its entirety
Basically: the random IDs aren't enough by themselves, nor is the IP, but the IDs together with partial IPs and some additional elements are.
I don't know what the bottom line is, though, and that worries me a bit. Any analytics will be at risk of doing this. In my desktop app analytics we blank IPs etc., but just storing some hardware data (amount of RAM, CPU frequency, Windows version, screen resolution...) means that we eventually have enough entropy to say with certainty that each user we have has a unique set of parameters in the data we log. It's almost impossible NOT to fingerprint perfectly if you gather even just basic hardware and OS info. But there is of course zero possibility that we could use the data backwards and ask "OK, which single physical person is it that has a 16-core machine and 16 GB of RAM?", which would make it "not PII"?
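A rough way to check this on your own logs is to count how many rows carry a combination of values that nobody else shares. The sketch below is only an illustration: the rows, the column order, and the `unique_fraction` helper are all hypothetical, and real telemetry would have far more rows and fields.

```python
from collections import Counter

# Hypothetical telemetry rows: (ram_gb, cpu_mhz, windows_build, resolution).
rows = [
    (16, 3600, "19045", "2560x1440"),
    (16, 3600, "19045", "1920x1080"),
    (8, 2400, "22631", "1920x1080"),
    (8, 2400, "22631", "1920x1080"),
    (32, 4200, "22631", "3840x2160"),
]

def unique_fraction(rows, columns):
    """Share of rows whose values in `columns` occur exactly once."""
    keys = [tuple(row[i] for i in columns) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(keys)

ram_only = unique_fraction(rows, [0])               # one coarse field
all_fields = unique_fraction(rows, [0, 1, 2, 3])    # every logged field
```

In this toy data a single field singles out one row in five, while all four fields together single out three in five. Each extra field can only keep that share the same or raise it, which is why even "basic" hardware info tends toward a perfect fingerprint.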
I think the key issue in these cases with GA is that there is a chain leading to actual PII. E.g. the cookie value that GA has access to can realistically be stored somewhere else alongside PII such as an email address. And that's enough to violate the GDPR.
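As a sketch of that chain, assume a second system (here a made-up `crm_sessions` table) happens to log the same cookie value next to an email address at login. The field names, the values, and the `reidentify` helper are all hypothetical and not taken from GA or the rulings.

```python
# Hypothetical data: neither the cookie values nor the tables come from GA.
analytics_events = [
    {"client_id": "c-81f3", "page": "/pricing"},
    {"client_id": "c-27aa", "page": "/docs"},
]
crm_sessions = {
    "c-81f3": "bob@example.com",  # same cookie value, logged at login
}

def reidentify(events, sessions):
    """Attach an email to every event whose cookie value appears in sessions."""
    return [
        {**event, "email": sessions[event["client_id"]]}
        for event in events
        if event["client_id"] in sessions
    ]

linked = reidentify(analytics_events, crm_sessions)
```

A single dictionary lookup is all it takes to turn the "anonymous" event into one tied to a named person, so the random ID is only as anonymous as every other system that ever sees it.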