It looks like a cookie prompt, so I assume "Lifespan" refers to cookie expiration and "retention" to how long the data (including geolocation) is retained on the spyware company's servers.
A lot of geolocation data on the market is anonymized, following medium-lived unique IDs that can't be mapped to other identifiers. The problem with that is that if you have precise locations, or enough samples that you can apply statistics to find precise locations, in many cases you can de-anonymize the IDs. You can purchase address and resident listings from a number of different data vendors, and by checking where the device returns to at night you can figure out its home address. Then if you find information on the residents (work locations, schools, etc.), you see if said device goes where each resident of the home address is likely to go, and you now have a pretty good idea of exactly who the device belongs to.
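To make the "where does the device return to at night" step concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name `infer_home_cell`, the 22:00 to 06:00 night window, the made-up pings, and the choice to bucket coordinates by rounding to three decimals (roughly 100 m of latitude). It only covers the home-inference step, not the matching against purchased resident listings.

```python
from collections import Counter
from datetime import datetime


def infer_home_cell(pings, night_start=22, night_end=6, precision=3):
    """Guess a home location for ONE pseudonymous ID.

    pings: iterable of (iso_timestamp, lat, lon) tuples (hypothetical format).
    Returns the rounded (lat, lon) cell seen most often at night, or None.
    """
    counts = Counter()
    for ts, lat, lon in pings:
        hour = datetime.fromisoformat(ts).hour
        # The night window wraps around midnight, hence the "or".
        if hour >= night_start or hour < night_end:
            counts[(round(lat, precision), round(lon, precision))] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Made-up pings for a single ID: three at night, two during the day.
pings = [
    ("2024-03-01T23:10:00", 40.71281, -74.00602),
    ("2024-03-02T02:40:00", 40.71279, -74.00598),
    ("2024-03-02T09:15:00", 40.75800, -73.98550),
    ("2024-03-02T23:55:00", 40.71283, -74.00601),
    ("2024-03-03T13:00:00", 40.75810, -73.98560),
]
home = infer_home_cell(pings)
print(home)
```

The point of the sketch is how little is needed: a few night-time samples per ID are enough to pick out one cell, and that cell is what gets joined against an address list.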
There is no such thing as anonymized location data when you have the location of something where and when they sleep and work.
It's a rhetorical fiction the ad industry tells itself.
And with LLMs now it's easier than ever to piece the parts together. Companies were doing it before we even knew what LLMs were capable of.
We should have learned this lesson 20 years ago when researchers were able to deanonymize a lot of the Netflix Prize dataset, which contained nothing except movie ratings and their associated dates.
https://arxiv.org/abs/cs/0610105
If movie ratings are vulnerable to pattern-matching from noisy external sources, then it should be obvious that location data is enormously more vulnerable.
Location and identity are inextricably linked. You can't destroy identity without also destroying location and location is critical for myriad purposes.
The analytic reconstruction of identity from location is far more sophisticated than the scenarios people imagine. You don't need to know where they live to figure out who they are. Every human leaves a fingerprint in space-time.
> and location is critical for myriad purposes.
It's not though.
Critical for myriad elective purposes? Sure.
Only if you consider the entire concept of logistics in civilization as "elective".
I don't follow what you mean by 'logistics in civilization' as that's pretty vague and amorphous.
Could you be more specific with maybe a single example of where my physical geographic location is electronically critical for a purpose that isn't elective/optional/avoidable?
(And I'm not just trying to be obtuse. I think you're touching on at least part of the 'heart' of both this conversation and that of digital ID verification.)
Companies exist that de-anonymize other data brokers' data. That lets the other data brokers claim they have anonymized data while end users get everything.
you could probably run an anonymization company at the same time you run a de-anonymization company
exactly. calling it 'anonymized' is pure security theater once you have enough data points to map out someone's daily routine.
waiting for legislation or eulas to fix this is a lost cause since adtech always finds a loophole. the fix has to be architectural. moving toward stateless proxies that strip device identifiers at the edge before they even hit upstream servers. if the payload never touches a persistent db there is literally nothing to de-anonymize. stateless infra is the only sane way forward
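A rough sketch of what "strip device identifiers at the edge" could look like, in Python. The field names in `IDENTIFIER_FIELDS` and both function names are hypothetical; a real payload schema would differ. The handler keeps no state and writes nothing to disk, it only forwards the cleaned payload.

```python
# Hypothetical identifier field names; adjust to the actual payload schema.
IDENTIFIER_FIELDS = {"device_id", "ad_id", "ip", "user_agent"}


def strip_identifiers(payload: dict) -> dict:
    """Return a copy of the payload without the identifier fields."""
    return {k: v for k, v in payload.items() if k not in IDENTIFIER_FIELDS}


def handle_at_edge(payload: dict, forward) -> None:
    """Stateless handler: clean the payload and pass it upstream.

    `forward` is any callable that sends data upstream (assumed here).
    Nothing is stored between calls.
    """
    forward(strip_identifiers(payload))


# Usage: collect what would be sent upstream in a list.
sent = []
handle_at_edge(
    {"device_id": "abc-123", "ip": "203.0.113.7", "lat": 40.713, "lon": -74.006},
    sent.append,
)
print(sent)
```

Worth keeping in mind from the comments above: this removes the explicit IDs only. If upstream can still link a series of coordinates together, the trace itself can act as the identifier.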
To be honest, I feel like this is where iOS and Android are failing us. Why is every app allowed to embed a bunch of trackers? Only blocking cross-app tracking on user request as iOS does is not enough (and data of different apps/websites can be correlated externally).
im not sure about allowed. perhaps required may be closer.
why would someone include tech that makes people think twice about using the app, unless it is required if you want to "sell" in a particular venue.
if you're developing geolocation based apps, location tracking is a core function.
a calendar absolutely does not require location tracking beyond what side of the prime meridian you are on.
Because we don’t enforce antitrust law in this country and the people that make those decisions profit from the ads.
In what sense can the latitude and longitude of my house be called anonymous data?
Ultimately, a map is anonymous data containing lat/lon of everyone's house
Alone, these points are not deanonymizing, it's when there's other data associated.
> A lot of geolocation data on the market is anonymized
A lot isn't good enough.
Yep. With side channels, or thinking one order above the laws, it's trivial to get around said laws. Need better laws.
IMO we should ban gathering this data without a warrant or specific contractual agreement between the device owner and entity aggregating the data. As much as congress loves to claim the interstate commerce theory of everything, this seems like a slam dunk.
Contractual agreement? Nobody reads things like EULAs or terms of service. It's probably in there already.
I should have been a bit more clear. We should ban retention for any purposes where it is not explicitly required for the intended function and clearly agreed to by all parties. Think something like Strava or asset tracking. You know it stores GPS data, and why.
There is no such thing as "clearly agreed to by all parties" when it comes to end users. Companies provide a one-sided, "take it or leave it" EULA, and if you don't agree to everything in it, you don't use the product. There is no meeting of the minds, there is no negotiation, and there is no actual agreement. It's a rule book dictated by one side.
Then it's not a valid contract and therefore does not absolve them of criminal liability for stalking you.
You click on “accept terms and conditions” which means you agree to the contract.
Contracts of adhesion can be valid contracts. The ability to negotiate or equal bargaining power is not a required element of a contract.
Furthermore, you cannot contract away criminal liability if any exists.
Even attempting to use a contract of adhesion to justify selling GPS location data to a third party should be a criminal act.
Yes, the US is in desperate need of better privacy laws.
You can't just bury literally anything in an EULA. There's a fair amount of case law establishing that EULA clauses that are surprising or illegal aren't enforceable.
That fact does not change the point of the individual to whom you replied. Regardless of whether the clauses in the EULA are 100% legal, some mixture, or 100% illegal, the entire EULA is a "one sided rule-book dictated completely by one side". You, the person held to the EULA's rules, do not get to negotiate on the individual points. You simply have a "take it or go away" set of options.
https://en.wikipedia.org/wiki/Shrinkwrap_(contract_law)
There is the GDPR.
if it were up to me i’d require a hand signed contract that explicitly, up front and in plain english gives permission and is not transferable to any “partners”.
Right, privacy terms are written to be vague and permissive. Even if you read them you can’t usually understand how the data will be used or opt out.
I think we should make this type of tracking opt-out by default. We should also ban the sale of its use to third parties and its use for purposes other than the specific functionality which required it to be enabled in the first place.
>I think we should make this type of tracking opt-out by default
That's opt-in, not opt-out.
https://en.wiktionary.org/wiki/opt-out
GP states correctly that they believe the default 'choice' of a user should be 'opting-out' of location tracking.
> IMO we should ban gathering this data without
GDPR tried. And the narrative around GDPR was deliberately completely derailed by adtech.
Lack of enforcement didn't help either
GDPR like all EU regulation is needlessly complicated and aimed at a compliance model that seems designed for SAP.
The compliance model is very simple. Do not collect data. Problem solved. If you need to collect data (e.g. because you are a webshop), only collect the minimum necessary.
The problem is not the GDPR, the problem is the surveillance industry that wants to grab as much data as possible and try to do as much malicious compliance as possible.
Designing around GDPR compliance shows up all over the place in industrial data collection. It doesn't only affect surveillance webslop.
The costs are often worse on industrial side because the data is so much larger and faster than web or mobile data.
Have you read it? It's not that bad, unless you're thinking like an adtech programmer trying to find the exact edge case for the maximal amount of tracking you're allowed to do, because such a bright line does not exist and that fact infuriates adtech professionals. It is vague because reality is vague and complex; each specific case of alleged violation has to be interpreted by multiple humans; there is no algorithm.
The law mandates a data protection officer with specific duties. It also establishes a board that "issue guidelines, recommendations, and best practices" which is where administrative complication and nonsense always creeps in.
It is regulation that imagines companies are a government bureaucracy.
I have read GDPR and don't work in adtech. It is vague and it is pretty easy to find pathological scenarios that don't make much sense or impose an unusually high burden for no benefit. Every European law firm seems to agree with this assessment despite what proponents assert. Consequently, it forces a lot of expensive defensive activity in practice.
To some extent, it was just a failure of imagination on the part of GDPR's authors. Many things are not nearly as simple as it seems to assume and it bleeds into data models that have nothing to do with people.
It is what it is but no one should pretend it is not a burden for companies that have nothing to do with adtech or even data about people.
You can literally read the entire "complicated" regulation in one sitting in an afternoon. There's literally nothing complex or complicated about it.
Congrats on gullibly believing the ad tech narrative.
Being able to read something in one sitting doesn't make it simple or obvious. The law establishes a board that gets to set new requirements.
As someone who has to implement it, it's really not bad at all: Ask the user for consent to use their data, and don't be misleading about it. That's it.
The rest of the "It'S So LaRgE AnD UndErSpEciFieD" is just FUD. The regulators don't just slap fines, they work with you to get you to comply, and they just want to see that you're putting in the effort instead of messing them about.
I have literally never been surprised by the GDPR. Whenever I thought "surely this is allowed" it was, whenever I thought "this can't be allowed", it wasn't. For everything in the middle, nobody will punish you for an honest mistake.
Anti GDPR people: "it's so complicated not being able to walk into someone's house and take their things! Which things can I not take? How about this? And now I need a lawyer if I take someone's things? Ridiculous!"
Just don't spy on people.
Yeah that's pretty much what it feels like, or sometimes it's "what if someone's stuff is lying on the street? Can I take it then?" and the regulator is kind of like "look around and ask if it belongs to anyone, and if not, sure".
The "GDPR is complicated" meme has been circulating among software developers since probably before it was even written. It's so wild that HN dunks on it so much: Here we have a societal problem in computing we've been complaining about for decades, someone offers an incremental but imperfect regulation to start taking steps to correct it, and everyone hates it!
Same with the California age input box.
There needs to be a believable legal framework behind this.
Imagine an option on your iPhone that says “Enable this to allow geo-location tracking for organisations registered under the NOADSJUSTPUBLICGOOD Act” - then any wifi endpoint could locate you based on signal strength etc., and that data could only be made available to people registered under the act.
Would we see new understanding of how people move around in cities? Would we see better traffic information? I think so - as long as people believe that there are real teeth to the laws and that they are enforced loudly and publicly.
We should embrace the benefits of a society-wide epidemiology experiment - the benefits for public health are incredible. (Add to that supply chain logistics on open ledgers and many of the new things that just were not possible before, and the future of open, transparent but well-regulated democracies is bright.)
Let me know if you spot one.
The problem with all these discussions about banning stuff is that privacy is always on the back foot. It's by design. People who want to surveil and manipulate us are actively investigating new ways of doing it, they get paid for it and they risk nothing in the long run. All of these discussions about specifics are just reactions. They aren't even reactions to the surveillance itself, but rather to a discovery by someone that a new surveillance machine has been constructed and launched.
So the current feedback process involves: construction → exploitation → reporting → public awareness → legislation. This is too slow. Moreover, operating in this environment is exhausting.
We need a different feedback loop altogether. I'm not sure which one would work best, but something different needs to be considered.
Let’s just stretch copyright to cover movement/location as a protected creative expression. It’s somewhat ridiculous but we’ve already established case law and technology for handling/mishandling protected assets.
More details are available here, including screenshots of the tool.
https://citizenlab.ca/research/analysis-of-penlinks-ad-based...
These people really have no idea about the level of data collection from Google's rootkit on Android known as "Google Play Services".
Does anyone know of any groups that are organizing and lobbying to get things like this into law? I know about the EFF but they seem to be more focused on documenting and reporting instead of lobbying and getting things passed.
Senator Wyden has been pretty focused on it. I think it's going to take some changes in Congress before it happens though.
How about we just ban the collection of precise geolocation? Wouldn't that be a better solution?
You can have legitimate use cases where it's a core functionality of the application to store it, so the user obviously knows it's being collected and agrees by using it.
I would expect such a law to be lobbied to death.
Just ban the sale of any kind of adtracking. That way we can get rid of the cookiewalls too.
Missed opportunity by the EU when they wrote GDPR.
GDPR literally prohibits the sale of user data and tracking without user consent (because yes, you want to give people the possibility to opt in for a variety of reasons).
GDPR has literally nothing to do with cookie popups. That was, and is, adtech
> prohibits [...] without user consent
that's what causes the popups.
it should prohibit it outright, consent or not.
But the only reason the popups are needed is the adtech tracking cookies. You don't need a popup for cookies that are related to essential site functionality.
yes, so if ad tracking is forbidden outright then asking for permission to do it is invalid too.
I think they are saying GDPR did not ban websites from noisily asking for consent and trying to trick you into giving consent.
My job was building cookie walls in response to GDPR. It might not have been the “intent” but it certainly was the consequence of that law.
Smartphones, mobile apps, mobile networks, and WiFi stopped being your friends around 2015-2016. Now it's just a matter of how much data can be harvested from device sensors in real time until reaching a pain point which doesn't exist.
Yep.
And the FLOSS/Linux phone hardware attempts have frankly sucked.
I was hoping that my PinePhone Pro would actually be usable. But no, it's a PineDoorstop.
Proper Linux would be a great 3rd choice. But yeah. We've got a duopoly and not much we can do about it.
Don't you want random companies to store your precise location for 12 years? https://x.com/dmitriid/status/1817122117093056541
Screenshot in that tweet says 13 months FYI
[Cookie] Lifespan: 13 Months
Data Retention: Standard Retention (4320 days)
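For anyone reconciling the "12 years" with the "13 months": they are two different numbers in the quoted prompt. A quick conversion of the figures above, as a small Python sketch (the variable names are mine, and 365.25 days per year is an assumed convention):

```python
RETENTION_DAYS = 4320  # "Data Retention" figure quoted above
COOKIE_MONTHS = 13     # "[Cookie] Lifespan" figure quoted above

# Convert both to years so they can be compared directly.
retention_years = RETENTION_DAYS / 365.25
cookie_years = COOKIE_MONTHS / 12

print(f"retention: {retention_years:.1f} years")
print(f"cookie lifespan: {cookie_years:.2f} years")
```

So 4320 days is a bit under 12 calendar years, and it is exactly 12 × 360, which suggests the figure may have been computed as 12 years of 360 days, though that is a guess.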