[Throwaway for privacy.] I know this was hashed out on the other threads a bit, ...

ineedasername · on Dec 22, 2021

Ethical guidelines on research exist to prevent an adverse impact on participants. This study had adverse impacts: fear, stress, time & money in consulting lawyers. It was therefore defacto an unethical research study. Speculation as to why the protocol slipped through the IRB cracks are that the language used in the study proposal (at least the part made public) dehumanized the protocols by referring to "websites" rather than humans that would be responding to the inquiries.

The IRB ruled this was not a human subject piece of research, but that is contradicted by the deception protocol. Deception was justified as necessary because people's behavior might change if they knew it was a research request. That acknowledgement made it implicit that human behavior and potential changes to it due to the experiment was a core factor in the study-- ergo, it had human subjects. Behavioral research on human subjects is required to go through a much more rigorous IRB oversight process precisely to anticipate and mitigate potential adverse reactions.

Some people are focussing on the deception, but that is, under some circumstances, allowed by research ethics. The more serious problem was adverse impact which, again, is the primary motivator for why we now have laws and regulation-mandated IRB processes to make sure it doesn't become an issue.

belorn · on Dec 22, 2021

I wonder if IRB ruled in this way because of the assumption of algorithmic response for requests like DMCA take down notices. I can imagine that even for GDPR/CCPA requests, there is still no human involved for website like Google, facebook, youtube and other major sites that is primarily operated through automation. If there is no humans involved then there is no humans to have an adverse impact on.

But as you said, researchers however must have suspected that responses would be made by humans or else the email would have included the fact that it was a study.

throw10920 · on Dec 22, 2021

> Ethical guidelines on research exist to prevent an adverse impact on participants.

Not relevant in this case, because (1) it's clearly not human research[1] and (2) all kinds of other clearly-non-human-experiments still have adverse effects on humans tangentially involved, so that's not unique to human research, either. When scientists were testing the first particle accelerators, they caused some people a lot of stress who were worried that they would destroy the world - does that mean that those tests were human experiments? (clearly not)

[1] https://news.ycombinator.com/item?id=29656792

ineedasername · on Dec 23, 2021

I may be wrong, but I'm going to guess that you don't work with IRB's all that often. I do. I asked a colleague their thoughts on this and they were unequivocal in believing this should have been flagged as involving human subjects if the approving IRB had all of the details of the protocol. Their guess is that the proposal's presentation-- perhaps very innocently-- did not fully convey the details that would have resulted in oversight of the research as a human subject project. These are experts on the nuances of these laws. Of course, experts may disagree, so that reference is not definitive. But it is suggestive that dismissals from those here on HN that are arm-chairing this with no-- or minimal-- experience with this sort of research are not basing their opinions on a full understanding of the IRB & research ethics ecosystem. One or two IRB applications for a research project will not convey the understanding needed to evaluate research protocols under the laws & regulations involved.

As for definitions of human subject, it seems like you are overlooking part of the regs you want to use to support your argument. Per the link in the comment you cite, it's a human subject research if the research obtains data "through intervention or interaction with the individual, and uses, studies, or analyzes the information" That was clearly part of the research in this case: It involved interaction with individuals. I'm not sure how you can overlook this part of the definition when making your determination.

You should also be aware that regulatory laws do not stand alone: The government provides explanatory statements of interpretation and policy guideline. The grand-daddy of these in this case and a fundamental guiding document to anyone in an IRB is the Belmont Report in 1979 that guided the development of modern regulated IRB's. It is pretty clear: participants should "undertake activities freely and with awareness of possible adverse consequence" It's understandable that plenty of folks here on HN are not familiar with the body of clarifications and case-law that that guide interpretation of the law's statutes, but researchers & especially IRB members are supposed to know this stuff inside and out.

Next: Deception to avoid advanced consent is allowed in only very limited circumstances, and is a significant red flag that an IRB needs to bring more scrutiny to the research protocols. I don't know how you (or the IRB if it was doing its job properly) can say that human subjects were not involved if the research protocol relied on the deception of human subjects to see their behaviors.

All of which is somewhat besides the point: This entire research ethics process, independent of specific statutes, exists to prevent adverse impacts on human subjects. This research had adverse impacts, and so defacto it involved humans in research that should have gone through the full IRB human subject oversight process.

codazoda · on Dec 22, 2021

The post here on hacker news mentioned the down sides for one receiver. That person was stressed out thinking that they were about to be sued. They considered retaining council, which could have cost them a few thousand dollars, in order to get ahead of the threat. It didn’t come to that, so it’s a “what if”, but I could see myself trying to retain council too. Hopefully, a lawyer would have talked me down and advised me to wait it out. On the flip side, they may have offered to respond on my behalf (which would cost money).

I would not respond to such an email myself, ignoring it until I was able to defer to an attorney.

I publish a simple personal blog and I worry about the _worldwide_ legal implications of doing so. As one example, I have some old information about making model rocket fuel at home. At the time I had carefully reviewed U.S. law and knew how much I could legally make and have in my possession. Then I got questions from people in other countries and I got spooked. What if I break a law somewhere else?

kstrauser · on Dec 22, 2021

I assume that I’m breaking other countries’ laws all the time, say be criticizing the actions of their governments. I don’t worry about that. I’m much more worried about, say, CCPA compliance while living and working in California. (Not that I’m especially worried it. My personal projects don’t meet any of the criteria which would make it apply to me.)

matkoniecz · on Dec 22, 2021

The problem for people outside USA is that this country repeatedly demonstrated ability to enforce law for example in Europe.

I would not be worried about say Sri Lanka privacy/blasphemy law but USA court can take down my email, website, important accounts, less important accounts starting from HN, gmail and github accounts.

codazoda · on Dec 22, 2021

Yeah, me too. I don’t collect stats on visitors anymore (using Google Analytics for example) because I now understand the privacy implications of doing so. I do use a simple impression counter but I capture no information (not IP, not browser, nothing). I definitely think about the CCPA and ADA laws, but I’m relatively sure they don’t apply to me. Still, I certainly think about them.

kstrauser · on Dec 22, 2021

I personally use a self-hosted analytics app so I can still get some useful feedback without sharing my visitors’ data. I get pretty graphs, and my visitors get to keep their privacy.

dhimes · on Dec 22, 2021

I go as far as saying in the TOS that my sites are for users in the US.

phonebanshee · on Dec 22, 2021

Why? What would make you think this has any impact?

tatersolid · on Dec 24, 2021

If the EU wants an extra-territorial legal framework, why can’t a US website owner do the same in their TOS? That’s a perfectly legal thing to do in the US.

The knife cuts both ways.

dhimes · on Dec 23, 2021

I don't think I understand the question.

wruza · on Dec 22, 2021

What if I break a law somewhere else?

Who knows? I can imagine that an innocent picture of uncovered legs may be illegal in some religious states, but do you have to worry about it? Is that even a thing?

(I’m aware of the chances that you may visit that country some day and find out that you’re a wanted criminal, but not sure if that applies to non-felonies world-legal-wise)

nautilius · on Dec 22, 2021

In that case I'd be mostly worried about breaking law in the U.S. by making rocket knowledge available to foreigners https://en.wikipedia.org/wiki/International_Traffic_in_Arms_...

mint2 · on Dec 22, 2021

Scraping dating is not imposing work, worry and cost on additional people.

The victims of scraping are not going to do any additional work unless the scraped data is used irresponsibly, but that is separate from the act of scraping.

This email required people to do work and caused worry due to the legal threat that the email tried to lead people to believe was applicable to them. They may have had cost if they called a lawyer and it definitely took their time.

Scraping -> no work forced upon victims. That email -> work forced on unwilling victims.

Is there something I’m missing? People including that poster aren’t reaching this same conclusion but it seems very apparent so am I missing something?

nightpool · on Dec 22, 2021

Well, the argument of the GP is that "extra work" is not the only form of harm that is possible. When comparing the harm of extra work and stress due to this email to the harm of have your privacy violated by large, publicly-scraped datasets that include your personal information. For example, once your twitter post is collected in a "posts of Twitter users about X political event" dataset, it's now impossible for you to ever delete that post, which could be harmful for you in the future. it's unclear whether one type of harm is categorically worse then the other.

mint2 · on Dec 22, 2021

public posts on the internet being aggregated is not out of the ordinary, if one group doesn’t do it, another may.

Scraping private posts would be wrong or gaining access to posts under false pretenses. This would be wrong, although different than the email.

The email forced work on people and made legal threats causing work and other effects that would not otherwise happen.

kstrauser · on Dec 22, 2021

Part of it was that the did no (or poor) screening. They got their list of target sites from a research list of the popular websites. I got a letter, and my little not-for-profit, not advertised, purely for fun website was around number 350,000 on that list. First, I sincerely doubt my site is even that popular. Second, if I got the mail, so did lots of people in a similar situation.

They weren’t spamming Fortune 500 companies. They were spamming a huge number of single-person sites that aren’t subject to the CCPA at all and who certainly don’t have legal departments to ask about it.

throwawyaaccoun · on Dec 22, 2021

I mean this all in good faith:

What is the difference between 100,000 individuals emailing 3-5 websites on that list, with their real identities, asking for things to be deleted (such that all 350k are covered)? Where is the meaningful difference between this situation and the one here, ignoring the deception for a moment (unless that is the only issue)?

Could this be a moment of cultural learning for everyone? That's kind of how I am looking at it, frankly, but I am open to being wrong. That is, perhaps small entities will learn, in one or two instances, to just ignore this kind of thing?

rectang · on Dec 22, 2021

You seem extremely unconvinced that any harm was done to the people who were sent scrambling by this alarm. It's as though no matter how convincing the email was, no matter how much of the recipient's time was wasted, no matter how many thousands of dollars they spent on lawyers, you ascribe all blame to the recipient for not having realized they were being deceived — and ascribe no blame whatsoever to the email's author for being deceitful.

This whole discussion was had in the old thread, and there was one person who used the same rhetorical device of belaboring the same question over and over again. It was tiresome.

throwawyaaccoun · on Dec 22, 2021

I should have been more clear, so let me correct that. I am convinced. I agree that harm was done, and suffer from generalized anxiety disorder myself, so I empathize with the panic attacks that people received.

It is because I believe that harm was done, but also because I am a privacy nut myself, that I am trying to, for my own sake, characterize how I should approach sending emails like this in the future. The study may not go on, but individuals still will send these emails as long as CCPA/GDPR exist. (Just to add some color: It's my anxiety which is causing my to want to delete everything from the internet. If there's minimal info about me online, I can rest easy. It's why this is a throwaway that I will abandon shortly.)

Reading everyone's thoughts is what changed my mind. I now understand to have underestimated the emotional and legal effects CCPA/GDPR requests could have on small website operators, and will be more judicious in the future (like this study should have been) in pre-filtering and my wording. Reactions like kstrauser's (elsewhere in thread) were initially surprising to me (perhaps because of the faceless nature of the internet), so I hope you take my about face as genuine.

Where do you think this balance lies? I still believe consumers, in general, should have right to ask those with their data about their processes; to give it to them; and, to upon request, delete it. And further, in general, I think these interactions are the kinds of things that researchers might legitimately want to study. I found your other comments to be thoughtful, so I am curious what you think explicitly.

jwagenet · on Dec 22, 2021

Based on reading https://news.ycombinator.com/item?id=29611139 the other day, my impression is for a small website operator the email template used some potentially threatening language in the line "I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code."

There is some discussion that for large websites or gov entities this kind of language may be necessary to communicate your sincerity with the request, but lone operators doing their best probably dont have any sort of legal to ensure they follow the letter of the law. From my perspective maybe its best to approach a small website with a more casual tone that you just want your data gone and "make it serious" if the request is ignored or the response is noncompliant.

rectang · on Dec 22, 2021

What I hope to see is a popularization of business models where no personal data is kept, because that is less expensive in terms of compliance costs, more beneficial to the consumer, and hopefully more attractive to the consumer as well. We can see the dawn of a new age in other comments in this thread where people talk about not collecting any data on their blog visitors!

Right now it is difficult to build businesses under such models because most institutions, frameworks, and tools shunt you towards hoarding all data. Over time, I hope that better tools will emerge so that building better businesses becomes easier.

There are people elsethread bemoaning not only the unfortunate artificial costs created by this email experiment, but the compliance costs of privacy-protecting legislation in general. But businesses should be paying those compliance costs, because it's an iron law at this point that business-collected personal data will leak yet individuals bear the costs when the data leaks.

To my mind, this experiment went awry in the same way that privacy-abusing businesses go awry: the organization reaped a benefit while the externalized costs were borne by outside individuals.

However, I'm inclined to forgive the researchers, as I think they will learn from this and find ways to collect data which cause less alarm and imposition. Similarly, I would hope that individuals pursuing their rights under privacy legislation would start off gently but firmly, giving small entities time to adapt. But simultaneously, I have an appreciation for those with bulldog tenacity who go after recalcitrant businesses (e.g. the heroes who have gone after Equifax in small claims court).

adolph · on Dec 22, 2021

> how I should approach sending emails like this in the future

Don't.

It's that simple.

I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code.

kstrauser · on Dec 22, 2021

Thing is, I would cheerfully process a deletion request, even though I don’t have to because I don’t meet the criteria to be subject to the CCPA. For me, part of the deception was quoting a law and incorrectly saying it obligated me to reply to their information request by a certain deadline. The law says no such thing, and getting a letter from someone who quotes specific legal codes almost never ends with “…and then they went out for dinner, newly found lifelong friends.”

Beldin · on Dec 22, 2021

It may have felt like a deception, but there's plenty of bad legal takes on the Internet. For this to be a deception, the sender would have to know for certain that the statute doesn't apply in this case.

Could be they did, but then I missed that. Just as likely, they genuinely thought this correct.

The deception was hiding that this was a study, not a genuine request. Lying about or misrepresenting the goals of a study is deception research. There's strict guidelines for that... in the "soft" sciences (APA guidelines). CS is a bit behind and seems intent on reinventing the wheel :s.

kstrauser · on Dec 23, 2021

I'm not a lawyer or trained in legal jargon, so I'm using "deception" in the colloquial sense: I believe they lied to me and the other recipients.

blululu · on Dec 22, 2021

First, this is an altogether improbably scenario (the odds of winning the lottery are good compared to this scenario ever happening). Site traffic follows a power law. A site at 200k down the list is almost never going to get such attention. It is not someone's full time job. A uniform density of information requests is incredibly unlikely and places a very unfair burden on the smaller sites. Second, the difference is pretty obvious: 100,000 individuals seeking a legal right implies a potential benefit to a large number of people. 1-5 people abusing the system implies a bad faith actor whose benefit is pretty minimal.

s1artibartfast · on Dec 22, 2021

Is about the impact on the humans involved. Imagine the study where are you put police lights on your car and drove behind people on the highway to see how they would respond.

Luc · on Dec 22, 2021

I am having a learning experience right here about reading the meandering thoughts of throwaway accounts.

_lqaf · on Dec 22, 2021

The issue here was not primarily about deception. It seems mainly to be that (a) at least one recipient interpreted their mail as a legal threat, and (b) it was a mass-mailing. Spend a minute thinking through the implications if that were true, and you get a firestorm.

I suspect visibility plays a role in the comparison you're making; out of sight, out of mind and all that. But much more importantly, someone sending you what you think is a legal threat is a lot more salient.

throwawyaaccoun · on Dec 22, 2021

Interesting. Ok, so let's say the deception wasn't the problem, suppose for the moment. Would the study have been more palatable if the researchers had more properly vetted the email list to ensure, say, >95% or perhaps even 100% were corporations that did fall under the law?

nightpool · on Dec 22, 2021

The requirements to be subject to the CCPA are any of: have a gross annual revenue of over $25MM; buy, receive, or sell the personal information of 50,000 or more California residents; derive 50% of more of your annual revenue from selling California residents’ personal information. Yes, I believe that if they emailed only sites for which that was true, I would have no issues with the study.

The requirements to comply with the GDPR are much, much stricter and have a much more outsized effect on small, non-commercial site operators. There are no exceptions to the GDPR for non-profits or non-corporate entities. (except a limited carveout for "household processing" that AIUI has been interpreted very narrowly by the courts). I do not think the GDPR is strict enough in this instance, and I think it would have outsized harms on small and non-corporate operators to email them in this way if your only criteria is "could technically be subject to the GDPR in some possible world".

joshdata · on Dec 22, 2021

I operate a website that likely meets one of the requirements to be subject to CCPA that received the emails from the research study. We have basically no revenue or staff. I didn't appreciate being lied to (about who was sending the message), being threatened (with legal enforcement), wasting my time (the study was scrapped), and being used for research without consent (the fact that this happens all the time doesn't excuse it). If they wanted to know our CCPA/GDPR policies, they could have simply asked. I also received emails from the study at two other domains I own and one that mentioned a domain I don't even own, but which probably don't matter for CCPA - all of which made me think that this was a scam and legal trap to take seriously.

s1artibartfast · on Dec 22, 2021

Deception is a necessary part but not the key. The key is potential for distressing a real human being. The problem is that we live in a legal Society where everyone is at risk of life-altering legal consequences.

throwawyaaccoun · on Dec 22, 2021

Oh, our society, especially America's, is overly litigious. I agree.

But, pushing back a bit (in good faith), do you think asking an entity for your data, or asking them to delete it, should really be considered unusual and panic provoking? I said in another comment the same thing, but do you think this could be a moment of cultural learning?

rcpt · on Dec 22, 2021

> America is overly litigious.

I recall seeing Ralph Nader speak at a fundraising event 20 years ago and asking the crowd "how many people have actually tried to sue someone?" and in a room of hundreds only a few hands went up.

And a year ago when I took my landlord to small claims it was insane how complex the process was and how many paperwork pitfalls are in the way to disqualify you. I remember sitting on the half-day zoom call and watching case after case get thrown out because plaintiffs "forgot to file proof of service" or whatever. I'm generally good with paperwork and still nearly missed out.

There may be some people in America who are overly litigious but for the general population the legal system is wholly inaccessible.

kortilla · on Dec 22, 2021

It doesn’t matter. This isn’t a case where an individual would be suing. This is the government regulation coming down on someone after being flagged by “a victim”.

s1artibartfast · on Dec 22, 2021

In a perfect world, I do not think it should be stressful, but we don't live in that world. I think a stress response is reasonable, given the risk of legal consequences.

Perhaps it is a learning moment, but I think the lesson should be to consider the impact of these kinds of studies.

I'm sure it is a learning experience for bloggers as well, and some of them will learn that hosting a Blog is not worth the legal risk and take it down

s1artibartfast · on Dec 22, 2021

The fact that everyone violates the law in some form, and anyone with sufficient will and resources could ruin a life with legal proceedings is why we have the concept of standing in American law. It acts as a filter so that only someone with skin in the game can bring suit. It is one protection against abuse, and why laws like that give anyone standing Texas abortion ban and forthcoming California gun legislation are problematic.

_lqaf · on Dec 23, 2021

You are translating "legal threat" into "asking for data". And your 'learning' comment makes me think this is a cause for you. That's fine, and I even applaud what I take to be the motivation behind it.

But,

- That does not make one in to the other. Misinterpretation or no, the researcher (who was being deceptive, remember) is responsible for how the message was written. I don't know about you, but I don't usually end my polite requests with references to counterparty legal responsibility. When someone starts trying to sound law-talky, it is in no way paranoid or unreasonable to become concerned about what they might be up to.

The problem here is not that USians enjoy suing each other, or that people and businesses underutilize data protection laws. The problem is that an academic study was performed in a way that caused panic in this, our imperfect world (and object of study).

- I also find the idea that an academic study should (also? or primarily?) be an instrument of "cultural learning" deeply troublesome. I'd hope that IRBs would smack that sort of thing down.

_lqaf · on Dec 22, 2021

Unless you're studying how people react to online legal threats, why would you not try to avoid this problem with your study entirely?

matkoniecz · on Dec 22, 2021

5% of emails going to hobby websites would be unacceptable and unethical.

If it would went solely to major corporations - more OK.

Another part: do not lie that study does not involve human subjects.

yjftsjthsd-h · on Dec 22, 2021

Yes; if they had ensured that 100% of their targets were corporations then I would have very little concern about it.

rubylark · on Dec 22, 2021

Demanding a subject to actively participate in your study upon pain of vague and mostly incorrect legal threat is ethically wrong. Passive participation (like scraping) without consent is morally wrong, but since it doesn't cause undue distress to the subjects, it is not as big of a story.

The IRB in this case didn't consider this ethically suspect because "websites aren't people". And yet the study disproportionately targeted small websites where there is, in many cases, only one person involved.

tylermenezes · on Dec 22, 2021

Because the end of the email (wrongly in most cases) demanded a response by law and implied they were open to legal action, which caused a bunch of people to hire lawyers to check into their liability.

dahfizz · on Dec 22, 2021

Maybe the problem is the laws which create unknown liability for anyone hosting websites.

eli · on Dec 22, 2021

In this case the law wasn't the issue. The email message asserted a legal obligation that does not exist.

dahfizz · on Dec 22, 2021

>The controller shall provide information on action taken on a request under Articles 15 to 22 to the data subject without undue delay and in any event within one month of receipt of the request[1]

The legal obligation may not have applied in this case, but it absolutely exists. If someone submits a request to you for their data, you are legally obligated to respond.

[1] https://gdpr-info.eu/art-12-gdpr/

kstrauser · on Dec 22, 2021

The request I got was about the CCPA. It said:

> I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code.

First, the CCPA doesn't apply to my site. It's non-commercial, has many fewer users than required to invoke the CCPA, and zero revenue. No provisions of the CCPA require me to do anything.

Second, the questions were about how I'd handle a CCPA request, and weren't actually a request at all:

> 1. Would you process a CCPA data access request from me even though I am not a resident of California?

> 2. Do you process CCPA data access requests via email, a website, or telephone? If via a website, what is the URL I should go to?

> 3. What personal information do I have to submit for you to verify and process a CCPA data access request?

> 4. What information do you provide in response to a CCPA data access request?

The CCPA doesn't obligate anyone to explain their internal processes. It obligates covered entities to respond to the requests themselves, but not to random drive-by questions.

So basically, that sentence was completely wrong. The CCPA doesn't apply to me, and even if it did, the law doesn't say what the researchers claim it did.

dcow · on Dec 22, 2021

Why isn't this the story? It doesn't even have to be about ethics which nobody can seem to agree on. Sounds like the researchers were simply wrong.

So then the problem actually is that they misinterpreted the law. If someone misinterpreting the law can cause such stress and waste such time, shouldn't society safeguard against this?

Isthatablackgsd · on Dec 22, 2021

More like researchers needs to take classes on law jurisdictions. They seemingly to believes that both laws have jurisdictions over everyone in the world, including countries and states that don't have such laws which causes people to be confused with it since it have legal statement.

The researchers created this issue because they don't understand (or tried to understand) the laws nor they do not screen their statements. The liability is not on the law, the liability falls on the researcher especially with "human subject" comment. Therefore, the researchers are likely to be in violation with their university IRB. The legal statement is forcing people (that are not applicable to them) to respond which in turn violated the ethics of IRB because they did not consent to this research. By 'forcing' them to respond to the research that they don't have people consent to do so will run afoul with IRB.

eli · on Dec 22, 2021

You shouldn't lie to people to trick them into collecting data for you without at least considering the impact on those people.

That's nothing like web scraping. (Though IMHO web scrapers should also use an honest User Agent so if website owners have a problem or question or want to block it, they can)

adolph · on Dec 22, 2021

"meta" good faith != good faith

> why folks are so up in arms about this

The implicit legal threat is similar to the harm described in the Prenda saga: https://arstechnica.com/tag/prenda-law/

It is wronger than the deception because the PI "Jonathan Mayer" is not just a run of the mill academic focused on "publishing a paper or whatever." This is an activist with an ax that won't grind itself. Reviewing his work mentioned in Wikipedia I'm impressed and appreciate the contributions Mayer has made. Mayer can't be not aware of the problems with the approach.

UncleMeat · on Dec 22, 2021

I personally know Jonathan and hugely respect his work.

I could believe that because he is an actual lawyer it was harder to imagine the panic that recipients who have no understanding of the law would experience. But I think that more likely is that the response was a bit of a fluke. Way stranger stuff has been done by security and privacy researchers with the go-ahead from their IRB. This feels to me like this is a methodology that isn't universally agreed on but is not especially uncommon that tripped a response from the internet. The conclusion is more that people should not necessarily take the existence similar research as indication that the broader community is okay with these methodologies.

adolph · on Dec 22, 2021

I suspect Meyer's work is in part preparatory to lawfare in order to force websites to pay for lawyerly services. The letter is akin to a fire insurance company knocking on doors while carrying a torch.

"Of all tyrannies a tyranny sincerely exercised for the good of its victims may be the most oppressive."

https://quoteinvestigator.com/2019/12/19/intentions/

UncleMeat · on Dec 22, 2021

Frankly, that's stupid.

He's got a PhD and a JD from Stanford and has chosen a faculty position and has done a nontrivial amount of unpaid work for various privacy rights organizations. He obviously isn't motivated by money.

adolph · on Dec 22, 2021

Frankly you are great at knocking a strawman down. Jonathan Mayer likely has some motivation for those efforts. I made no claim to the motivation being remunerative or not.

Do you have an alternative hypothesis of a motivation other than preparation for a "public-interest" lawfare campaign?

UncleMeat · on Dec 22, 2021

Actual legitimate research to understand existing privacy legislation, which can be used by policymakers to iterate and ensure that legislation is effective without being wasteful.

kortilla · on Dec 22, 2021

But that’s not how laws are written. Do you think he’s that naive?

UncleMeat · on Dec 23, 2021

He previously worked in a senator’s office so I do suspect he knows how the sausage is made. And yeah, the staff writing bills do look at this sort of material. It is just one part of a bigger picture but it isn’t just throwing research into a void.

otrahuevada · on Dec 22, 2021

The wording on the main driver of the experiment, their especially bad emails, leads website operators to think there is a problem where there is none. This, on top of the research being entirely devoid of consent between the human parties involved, makes it a _very_ bad study, one that could well cause both the university and the research team to lose money if some of the 'subject' parties actually had to go get a lawyer to have a look at their shoddy emails.

In better studies what is supposed to happen is, you propose taking part in the experiment, you get a signed agreement of some sort, and only then actually start experimenting. What happened here is more like some kind of youtube prank than a useful information gathering procedure.

sennight · on Dec 22, 2021

Scraping public data doesn't result in compelling another person to work under a false premise. Sure, you could argue that scraping introduces load that may draw an operator's attention... but the comparison is a pretty big stretch.

How these things pass board review I don't know... it seems pretty obvious to me that creating work for somebody who didn't volunteer to it is, at best, antisocial behavior.

btown · on Dec 22, 2021

In US legal code there is actually a definition of a human subject in https://www.hhs.gov/ohrp/regulations-and-policy/regulations/... (EDIT: to clarify this is a guideline for federal researchers and to my knowledge is not legally binding on private institutions, but seems to be used as a basis for private IRB policies):

"""

(e)(1) Human subject means a living individual about whom an investigator (whether professional or student) conducting research:

(i) Obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens; or (ii) Obtains, uses, studies, analyzes, or generates identifiable private information or identifiable biospecimens.

(2) Intervention includes both physical procedures by which information or biospecimens are gathered (e.g., venipuncture) and manipulations of the subject or the subject’s environment that are performed for research purposes.

(3) Interaction includes communication or interpersonal contact between investigator and subject.

(4) Private information includes information about behavior that occurs in a context in which an individual can reasonably expect that no observation or recording is taking place, and information that has been provided for specific purposes by an individual and that the individual can reasonably expect will not be made public (e.g., a medical record).

"""

The argument is that scraping of public data, already recorded by data systems for general (e.g. not specifically medical) purposes, is neither intervention, interaction, nor private information.

On the other hand, IMO the researchers here clearly interacted with their subjects. While the email was sent to a privacy@ address, not only are emails different from HTTP GET in how likely they are to be read by humans, but this went a step further and implied legal action would be forthcoming unless a human replied to the message. That's interaction. That makes the recipient a human subject.

(IANAL and the above is not legal advice.)

EDIT 2: I've had the pleasure to meet one of the researchers here. They are a staunch defender of online privacy, and I believe the team sincerely wanted to measure how effectively businesses are adapting to the changing winds beyond their legal obligations. But I also think the team, and the Princeton and Radcliffe IRBs, should have done more to consider the impact on the people who operate these businesses themselves. I'm sad and disappointed that the systems in place didn't catch this.

halpert · on Dec 22, 2021

Your question is essentially whataboutism. Both things can be wrong. We can care about this instance without diluting the conversation talking about something else that is also bad.

throwawyaaccoun · on Dec 22, 2021

It's not intended to be whataboutism (sorry about that, I edited this in to clarify) -- I agree that the deception was wrong. But there seems to be something about this particular event that is riling people up, and that's what I am getting at. I am not trying to whatabout, to be super clear.

runnerup · on Dec 22, 2021

To clarify. I don't think people would be riled up about individuals sending out these emails. Individuals are required to be legal, not 'ethical'.

The people who are riled up believe that University studies should be performed ethically. They know that IRB's exist to prevent researchers from doing unethical, but legal, things. In this case, they feel the harm caused should have been prevented.

Scraping data silently doesn't cause stress/harm to the participants directly, as they are unaware of any potential threat.

It's not "human experimentation should be banned" its "human experimentation should be heavily scrutinized to prevent harm to participants as much as possible. And definitely never cause harm to unwilling / unwitting participants".

dcow · on Dec 22, 2021

What bothers/riles me is that there doesn't seem to be a consistent ethical framework applying to these complex situations. Of course things should be ethical but ethics aren't defined as “whatever people on HN and Twitter feel like isn't slimy”.

mindslight · on Dec 22, 2021

I believe the real issue isn't the research ethics per se, but rather pent up frustration on the larger topic. I posted this in one of the original threads:

https://news.ycombinator.com/item?id=29607123

kortilla · on Dec 22, 2021

Because it comes across as a vague legal threat to a website operator! That’s in no way like scraping databases.

This cost real legal resources (there are Twitter threads of internal legal counsel hiring outside firms to evaluate this).

dataflow · on Dec 23, 2021

I don't see the similarity. Scraping doesn't involve a thinly veiled threat of a potentially costly lawsuit.