CCPA Scam – Human subject research study conducted by Princeton University (freeradical.zone)
485 points by ColinWright on Dec 17, 2021 | 333 comments



From the study's FAQ[0]:

> Did an Institutional Review Board consider this study?

> We submitted an application detailing our research methods to the Princeton University Institutional Review Board, which determined that our study does not constitute human subjects research.

From the social experiment [as reported by OP's link]:

> I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code.

I'm pretty sure most people would find that to be a thinly veiled threat of a lawsuit. I'd like to know if the review board considered the text of the email that the "researchers"[1] intended to send and the fact that they were likely to send it to individuals instead of solely publicly traded companies.

[0] https://archive.md/cSDGT

[1] Seems to fall more under experimental psychology to me https://en.wikipedia.org/wiki/Experimental_psychology?wprov=...


If I had to guess, the wording in the study's FAQ is carefully chosen: "an application detailing our research methods" doesn't necessarily mean "an application with the verbatim text of the emails we planned to send, including our thinly veiled legal threat at the end."

Not trying to turn this thread into a generic flamewar against "academic" research methods, but this whole thing seems oddly reminiscent of the "let's try to insert malicious code into Linux" fiasco [1]. I'm conceptually fine with generic passive tools like web crawlers to conduct research, but since when did the internet become a place where nonconsensual interactive research is fine?

[1] https://www.bleepingcomputer.com/news/security/linux-bans-un...


> Not trying to turn this thread into a generic flamewar against "academic" research methods, but this whole thing seems oddly reminiscent of the "let's try to insert malicious code into Linux" fiasco [1]. I'm conceptually fine with generic passive tools like web crawlers to conduct research, but since when did the internet become a place where nonconsensual interactive research is fine?

In a very real sense, every landing page A/B test is nonconsensual interactive research.

Or at least, if there is a line between them, however blurry, I can't find it.

I am skeptical of the idea that such a line should be drawn according to who is doing the experimenting. I don't think that a manipulative act becomes okay just because it is being done by an academic for research purposes, nor do I think that it becomes okay just because it is being done by a layman with a profit motive (or a political one, for that matter).
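
For concreteness, here is a minimal sketch of what a typical landing-page A/B test amounts to (the variant names, split, and visitor IDs below are illustrative assumptions, not taken from any particular product or study):

    # Minimal illustrative sketch of a landing-page A/B test (hypothetical names/split).
    import hashlib

    VARIANTS = ["control", "new_headline"]  # the two page versions being compared

    def assign_variant(visitor_id: str) -> str:
        # Deterministically bucket a visitor; consent is never requested.
        digest = hashlib.sha256(visitor_id.encode()).hexdigest()
        return VARIANTS[int(digest, 16) % len(VARIANTS)]

    # The server silently decides which page the visitor sees and logs clicks per variant.
    print(assign_variant("visitor-123"))  # -> "control" or "new_headline"

The visitor is assigned to an experimental condition and measured without ever being asked, which is the sense in which it is "nonconsensual."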


> In a very real sense, every landing page A/B test is nonconsensual interactive research.

I think lots of benign testing only counts as this in a somewhat pedantic sense, at least for the general "two variants of a page" type of thing; context matters, of course.

"I want to use this service" -> "OK, here is the page for that service" is a certain interaction where, granted, you might be presented with a different kind of look, but... well, you are getting what you asked for I suppose. Though you could get into the ethics of price differentiation by geo-data, and other general things that lead you to feeling ripped off.

OK, maybe lots of "growth-hacking" A/B test stuff does fall into this category...

I think the primary component of both this CCPA thing and the Linux kernel incident is, essentially, dishonesty. Researchers are going out of their way to outright lie to others. Here they are using fake identities! And it probably fails the general smell test of "if the counterparty was informed of the details, would they feel bad about the whole interaction?" I said it elsewhere: I don't know if it's really fraud legally, but it sure feels like it.


To play devil's advocate - is that really all that different from much other online communication? A significant chunk of the web runs on advertisements, and those are in essence tons of little influence games, often with little regard for truth or honesty: the aim is to manipulate by whatever means you can get away with.

A lot of forums have issues with spam and sock puppets, and not all of it is obvious, nor is all of it honest.

Even many large, curated news sites have now succumbed to the benefits of deceiving their audience, whether through outright misrepresentation, mere selective omission, or editorial emphasis that prioritizes their agenda over their readers' understanding of the material.

Attempts to course correct here run into vast vested interests (when it comes to e.g. advertising or biased media), up against the implementation of free speech protections in the US (and many other places), and, more subtly, against public opinion on free speech, which refuses to countenance any attempts at reform.

In essence, we prioritize the right to deceive over the right not to be deceived - in all but the most extreme of circumstances.

Chalk one up for team deception - while this surely isn't a good trend, I can't see how this research is even close to some of the more problematic stuff floating around.


> In a very real sense, every landing page A/B test is nonconsensual interactive research.

I think the difference here is that the user requests a page with a web browser (which could be argued as giving consent to view the contents) while the person that received this email didn't request the experimental email (and therefore didn't consent to the experiment).


If you consider A/B tests nonconsensual research, then you can also consider localized versions of sites as A/B tests. Or even serving different content for mobile and desktop.


The problem, like in that previous case, is that "human subject research" is a pretty narrowly defined category. It is mostly meant to cover testing out drugs on human subjects, and stuff like that. Notably, there is plenty of unethical research that doesn't qualify. So when an IRB gets a proposal that amounts to "I'm going to send some emails/interact with some folks online," their reply is likely to be along the lines of "not our problem," and it becomes the responsibility of the researcher to assess the ethics of what they're doing.


The IRBs I've interacted with or seen peers go through included more than just drugs etc. A survey or interviewing people has always been included as human subject research by the boards. Depending on the specifics, surveys and interviews may be exempt from a full review of the human subjects process, but only after the IRB itself has made that designation. Basically, a PI shouldn't be talking to a human as part of their research without the IRB making a determination on it.

Anything related to food & drug testing is usually its own special category of review within the IRB, but it's not just meant-- and has never been meant-- to only deal with biomed research. The Belmont Report in 1979 that gave rise to the modern IRB explicitly addressed research with human subjects, not just biomedical research. Anyone in that field is aware of the extreme examples like Milgram's work and the Stanford Prison experiment that make this review necessary.

It may be the case that some IRBs don't take that side of things as seriously as they should, but that doesn't mean the ethical burden is primarily on the researchers. The legal liability is on the institution, and the IRB is the regulation-mandated body required to ensure compliance.


But when the Sokal squared hoaxers were submitting fake papers to humanities journals, it was deemed human subject research and widely denounced (by humanities researchers) for lack of ethical review. This seems to be a very analogous situation.


The wording of the message is one hell of a detail to leave out when detailing your research methods.


It seems like it, but it's not.

The IRB review determination is going to be based on the typology of what you are doing, not (for the most part) the internal contents. Once they decide the appropriate level of review, then they will typically look at the ‘details’.


The false legal threat is particularly galling, but this absolutely should have gone through IRB even without it. Someone should have had to at least consider the impact on recipients of the messages before they were sent.

IRB review is typically required even for just simple research surveys.


How is a reminder of the law a legal threat? More specifically, when you feel like you're not impacted by this law, it's as far from a legal threat as could be.


Whether or not I feel I am impacted by a law has little to nothing to do with whether someone else will decide to sue or prosecute me based on it. Even if it's without merit, it's still an extreme hassle if that happens (and also very expensive).


> How is a reminder of the law a legal threat?

In the same way that "This is a nice place you've got here, it'd be a real shame if something were to happen to it", when spoken by a Mafia enforcer is most definitely a threat.

It's not outright threatening to bring a lawsuit; however, the language of that last paragraph is definitely something you'd expect to see from a lawyer in preparation for such legal action.


I'm not a lawyer, and I've both sent and received emails like that over the years, none of which ever produced a legal action. That's standard procedure for private data requests.


Either there's another nonconsensual experiment underway or a legal threat scam involving security bounties, because the phrasing of this Princeton email is very similar to the security bounty emails I keep getting... at my personal, static, blog.


It's obviously implying a threat if you're at all familiar with the legal sphere.

Passive-aggressive language, sure, but still not exactly inviting the recipient to a picnic, and passive-aggressive language doesn't get you off the hook.

Relatedly, it doesn't matter if it is completely without merit and could never succeed.

This entire story and thread is just something else. Talk about failing to meet even baseline ethical standards.


> It's obviously implying a threat if you're at all familiar with the legal sphere.

And if you're not a lawyer, it's just a very normal message from someone trying to get answers and have their privacy rights respected. I've both sent and received many messages like this one over the years (CNIL requests) and there's nothing frightening about it.

> Talk about failing to meet even baseline ethical standards.

This study certainly meets my ethical standard of trying to hold corporations accountable for what they do with our data. I really don't see what the fuss is about: if the freeradical.zone admin had received this email from anyone else (as could well be the case), would we even be talking about it?


You're wrong. The reaction to the email and the apology from Princeton are proof of that, but there are also very few people here who agree with you. I'm not a lawyer or a businessman and I read the email as a clear legal threat. I had anxiety just reading it, the same as the OP.

It's clear as day. And if for whatever reason you don't see it that way the rest of the email and the way it's written should set off alarm bells as being a potential scam.


I don’t think it’s intended as a veiled threat of a lawsuit so much as a statement of the compliance requirements. Unfortunately it seems they misunderstood the scope, which makes it inaccurate. But if the statement was true and accurate I would just take it as a helpful reminder of the timeframe.


> But if the statement was true and accurate I would just take it as a helpful reminder of the timeframe.

No. Absolutely not. A helpful reminder of the timeframe would be "the deadline for our study is ..., please try to send your response by then if you wish to be included."

Quoting legal code is not at all a helpful reminder of a timeframe, but is a direct implication of legal ramifications for failure to comply.


IMHO that is the most significant part of this. Any question about the intent is clearly tipped toward legal trouble by that.


> if the statement was true and accurate I would just take it as a helpful reminder of the timeframe.

People don't go through the trouble of digging up the particular section number of the specific statute of the specific jurisdiction in question for the mere sake of a generic "helpful reminder of the timeframe" required by law.

And similarly for the "without undue delay" part.


I do. Every day. I provide the citation so you can read the law, and if you disagree with my interpretation you can respond saying as much. This is internet outrage mob justice. I understand why people are mad, but it has far more to do with ignorance on the part of the researchers than malice.


The context is important: if you deal with user support (especially in the context of privacy) then someone quoting law at you is a huge red flag for an impending nightmare. I’ve dealt with irate users who actually did go as far as to file lawsuits and the email from this “study” activated my fight or flight response because of how much it (unintentionally?) mirrors the way angry litigious internet users communicate. The only worse phrase to read is “free speech”.


I would guess as an attorney you're more used to that style of communication than a random blogger or small entity. Citing law has very different signaling purpose and effect in different contexts.


That's a very generous assumption. Especially in the context of an email sent under false pretenses and a false name and an anonymous domain.

It's either a veiled threat or a serious error. Either way, this study needed more oversight.


Calls for more oversight are calls for more bureaucratic procedures and this whole situation is already bureaucracy gone mad.


> and this whole situation is already bureaucracy gone mad.

That sounds like you're referring to the CCPA itself as the bureaucracy gone mad, and probably have already made up your mind about what outcomes research like this will find. If so, that's not really a helpful attitude for this kind of discussion.


In this specific situation it doesn’t seem like there was any bureaucracy at all


What world do you live in that regular webmaster inquiry emails are footnoted with a reference to a law number?


It is interesting that on the study web page (https://privacystudy.cs.princeton.edu/) they consistently mention contacting "websites" instead of "people," as if a website were some autonomous thing that can communicate with a researcher.

I wouldn't be sleeping well if I were involved in this study. There is no way an IRB could determine that this is not human subjects research if you're emailing people and asking them anything.

If I want to ask random people about the weather and publish the results in a journal, that qualifies as human subjects research and IRB protocols must be followed. Emailing people asking about privacy policies is definitely human subjects research. Either they misrepresented the study in their IRB application, or the IRB didn't do due diligence reviewing the application.


In my understanding (from a French cultural context), asking people questions as part of a field study is not human subjects research. Ethical questions arise when you ask people to take specific actions in order to measure their reactions, not when you're asking about the status quo.


Lying about who you are and pretending that you are allowed a certain thing… I mean legally it’s not fraud but it sure feels like it!

Imagine someone showing up to your office building pretending to have an interview, walking into the office waiting room, then walking out saying “oh, just an experiment!”

“It’s just an email” the ease of the mode of communication here is not super relevant to the action, right?


> Lying about who you are and pretending that you are allowed a certain thing… I mean legally it’s not fraud but it sure feels like it!

I'm not aware of the details, but maybe there's an actual Victor Coutant from Nice on the research team. If not, what does it change? The point is precisely to study how a random person trying to defend their rights to privacy will be received/treated by hosts. You can't study that if you sign your emails with "Princeton privacy researcher".

> the mode of communication here is not super relevant to the action, right?

Exactly. Researchers will declare themselves as such before conducting interviews, but they'll rarely hesitate to take notes or ask a simple question as part of a field study.

I mean, if you're not happy this researcher is doing their job to investigate how corporate America mistreats people's data privacy rights, I'd be happy to be the person sending tons of pseudonymous automated emails and handing them over to a researcher. Would that change something for you?!


The email ended with a "looking forward to your response within 45 days". Immediately, in-house counsel was under the impression it was someone trying to entrap the website owners into a lawsuit for not being compliant.

There was a Twitter thread of a number of in-house counsel forwarding these to external counsel, which costs time and money. This study caused real monetary damages.


> Ethical questions arise when you ask people to take specific actions in order to measure their reactions

"Answer my questions within 45 days or I will sue you." That seems to read like a demand for a specific action.


Nonono you don't understand, it was actually "Answer my questions per my extremely benign request including the exact statute requiring your response".

Totally different. Obviously. /s


That's not what the message said. The message asked specific questions about data processing in regard to privacy regulations. Anyone could have sent this message. Hell, I have been on both ends of this message (with CNIL, not CCPA) and as an honest person taking part in non-profits I can assure you there's nothing to feel threatened about.

Maybe in your Silicon Valley culture, where lawsuits are more easily triggered than private data requests, things are different, though. The point is, if the author had acknowledged they were running a study in the original email (not the follow-up), they would have skewed their data, because corporate assholes don't treat college researchers and common people the same.


> That's not what the message said.

The message said:

> I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code.

This is clearly a threat of legal action. Your experiences with CNIL might be misleading. There is little risk of going bankrupt because of a frivolous lawsuit in Europe, but this is a very real risk in the US. Even if you are in the right and did nothing wrong.


Guidelines depend on jurisdiction. Research conducted in the United States may have different requirements than other jurisdictions.

The underlying ethos is that researchers should respect the people who are participating in their studies. People should know that you are conducting a study, the aims of the study, and choose whether or not they want to participate.


> I wouldn't be sleeping well if I were involved in this study. There is no way an IRB could determine that this is not human subjects research if you're emailing people and asking them anything.

Do you have a citation for this? What I'm seeing from random Googling is that you have to be obtaining information about the person for it to count.

If I were researching, say, price trends in some commodity and I called up several companies' sales lines and asked for their current price that looks like it would not be human subject research despite the fact that I'm talking to a human to get each company's price.

If I were researching pay trends at those companies and called up their sales lines and asked the people who answered how much they were paid it would be human subject research.


They're sending requests to websites to see how they behave, not to request information.

If you called up companies' sales lines to ask for prices, that's just gathering facts.

If you called up companies' sales lines to see what happens when you ask them for prices for things they don't sell, or to see if they're willing to accept a bribe, or to see if they respond with different prices when you lie to them about who you are, you're researching human behavior.

In this case, they are testing what procedures, if any, companies have in place for handling CCPA and GDPR law, by posing as nonexistent customers and making potentially bogus and misleading requests under the terms of those laws.

This is like performing research on retail refund practices by going into a bunch of shops and seeing how they handle being asked for a refund for an item you didn't buy from there in the first place.

There's a more ethical way to do that study, though, which is to actually buy something from the store first, then go back and try to refund it.

Similarly, there's a more ethical way to discover how websites handle CCPA/GDPR requests, which is to use the website first, and determine in the course of that what possible information about you the website should have; then, within the terms of your rights under CCPA/GDPR, to contact them and make reasonable and legitimate requests to see if/how they are able to handle them.


Here are questions sent to individuals in the study: Would you process a CCPA data access request from me even though I am not a resident of California? Do you process CCPA data access requests via email, a website, or telephone? If via a website, what is the URL I should go to? What personal information do I have to submit for you to verify and process a CCPA data access request? What information do you provide in response to a CCPA data access request?

The word "you" appears in every question.

To quote my IRB training (citiprogram.org):

"Most research in the social and behavioral sciences involves gathering information about individuals. However, some research that involves interactions with people does not meet the regulatory definition of research with human subjects because the focus of the investigation is not the opinions, characteristics, or behavior of the individual. In other words, the information being elicited is not about the individual ("whom"), but rather is about "what." For example, if a researcher calls the director of a shelter for battered women and asks her for the average length of stay of the women who use the shelter, that inquiry would not meet the definition of research with human subjects because the information requested is not "about" the director. If the researcher interviewed the director about her training, experience, and how she defines the problem of battering, then the inquiry becomes about her - and therefore "about whom."

The current example is similar, in my opinion, to "how she defines the problem of battering," which the IRB training identifies as human subjects research. The people receiving the researchers' email in the current study are being asked to define the way they interpret and comply with a legal statute.

I can accept that some people don't see the information requested as being "about whom" and therefore is not human subjects research. But the fact that people who have received this email have panicked indicates that the recipients, at least, felt that the questions were more than merely recording impersonal data about their websites.


> I can accept that some people don't see the information requested as being "about whom" and therefore is not human subjects research. But the fact that people who have received this email have panicked indicates that the recipients, at least, felt that the questions were more than merely recording impersonal data about their websites.

I guess the distinction is whether you see a website as an organization, even when that 'organization' is as small as a sole proprietorship or DBA, or even a personal site or blog.


But at the end of the day a "website" -- whether it's a large organization, a small business or a sole proprietor-- is still maintained by people. There might be a process in place if it's a large enough company, it might even get directly sent to the lawyers to deal with.

But at the end of the day someone has to look at the email and respond, and in many cases that can cost money.

To wave off the whole thing as "we're contacting a website not a person" just shirks all the responsibility of ethically experimenting on people, which is what the study actually does.


Of course it costs money! Even my own "huh, this is weird, anyway, we don't collect any data, so let's tell them to eff off" cost me time, and therefore money, so, invoice time it is.


This is incorrect. It's only human subjects research if the researcher is obtaining data about a human. This is the "about whom" requirement. A classic example is calling a business and asking someone about the products and prices they offer. That's not human subjects research.


If you say "I am a researcher studying X, can you please answer the following questions" then you might be studying a "what", depending on the specific questions.

When you lie about who you are, what your purposes are, and use scary legal language in an attempt to elicit a response, that is absolutely human research. You may be able to do those things ethically as a scientist, but you absolutely need IRB review because it is definitely human research.

My guess is that the IRB in this case was not informed of the deceptive nature of some of the emails, as lying is absolutely a red flag that you are doing human research and not just information gathering. Indeed, evaluating such lies for potential harm is an important part of why we have IRBs for psychological and sociological research.


You realize that the internal regulation is wrong, right?

Like, the semantic distinction doesn't matter because nobody gives a fuck about Princeton's organizational policy.


This is not Princeton's organizational policy or internal regulation, this is the regulatory definition of human subjects research as set by the government. Its semantic interpretation and the "about whom" requirement is exactly how you go about making a determination about whether your research is human subjects research.


In that case I care, and would say that’s an inadequate way to prevent trolling by researchers


Any research which involves human subjects is human subject research.

Nobody can disagree with that.


The US federal government does, as do many western governments. Research that involves humans usually comes under many delineations and sub-delineations with precise names that reflect specific ways in which the research takes place and the corresponding laws and regulations which the researchers must follow.

Determining which category a specific research project comes under usually involves checking specific criteria, in the US they have flowcharts, in Europe they have tables. Either way you can be sure a lot of people are going to be looking at it, most of whom have had to undertake ethics training as part of their career, and some of whom have spent their entire life studying these questions and seen them put to the test over hundreds of trials.

In that light, whether this category of research has got "human" in its name is not going to get you far wrt understanding the problem at hand.

source: I've undertaken interventionist medical research in the U.S. and Europe.


Who is the subject of the emails sent to personal domains?


Not sure I follow your question.

An example of something that's not human subjects research would be emailing people who have websites and asking about their privacy policy.

An example of something that is human subjects research would be emailing people who have websites and asking what inspired them to start a website.

I realize that may seem like a subtle difference, but it's an important distinction from an IRB perspective. For reference, and because a lot of people seem confused on this thread, here's what the human subjects research training at my university says about this...

"...some research that involves interactions with people does not meet the regulatory definition of research with human subjects because the focus of the investigation is not the opinions, characteristics, or behavior of the individual. In other words, the information being elicited is not about the individual ('whom'), but rather is about 'what.' For example, if a researcher calls the director of a shelter for battered women and asks her for the average length of stay of the women who use the shelter, that inquiry would not meet the definition of research with human subjects because the information requested is not 'about' the director. If the researcher interviewed the director about her training, experience, and how she defines the problem of battering, then the inquiry becomes about her - and therefore 'about whom.'"


You misunderstand the research in question. To quote from the researchers' website:

> When the system has even higher confidence, it sends up to several emails that simulate real user inquiries about GDPR or CCPA processes. This research method is analogous to the audit and “secret shopper” methods that are common in academic research, enabling realistic evaluation of business practices. Simulating user inquiries also enables the study to better understand how websites respond to users from different locations.

They are not just asking for the existing privacy policy, they are actively attempting to put the subjects into a realistic environment and seeing how they respond. The focus is the behavior of the individual. This should also be evident from the fact that they felt the need to lie to and threaten them...

https://privacystudy.cs.princeton.edu/


He understands perfectly well. What's relevant is whether the response is a property of the individual or the organization, and it's arguable, and controversial, but you'll find a lot of studies performed using this technique that were not considered human subjects research.

As to whether it's deceptive and threatening (the latter of which I find pretty hyperbolic, this is a pretty boilerplate request), that has no relevance as to whether it's human subjects research.

Maybe they should have limited the scope to larger organizations.


Someone looking up the exact statute and quoting it, while not a direct legal threat, certainly carries a lot of implied threats. People don't just look up legal statutes for shits and giggles.


I don't buy that interpretation. That is, I'm willing to believe that's how your university interprets the regulations, but I personally think it's perverse and unethical when applied to this situation.

When you deliberately deceive someone in order to obtain information that you think they would be otherwise unwilling to give you, the response you get back is as much "about" their behavior in response to your deception as it is about the subject of your inquiry. (And if the researchers in this case didn't think the deception would make their targets more willing to cooperate, why the threatening language?)

That doesn't necessarily mean this kind of research should never be allowed, but it should definitely go through an IRB's oversight.


> An example of something that's not human subjects research would be emailing people who have websites and asking about their privacy policy.

No, that's an example of human subjects research that may be exempt from the regulations for specific reasons, such as only interacting with subjects through surveys and interviews (while adhering to further restrictions, which this research probably runs afoul of since it's not anonymous).

> For example, if a researcher calls the director of a shelter for battered women and asks her for the average length of stay of the women who use the shelter, that inquiry would not meet the definition of research with human subjects because the information requested is not 'about' the director.

What a terrible example. They've only demonstrated that the director does not qualify as a human subject, while ignoring the question of whether the women staying at the shelter would qualify as human subjects!


Epistemologically, using a fake name and a threat of legal action to elicit a response from whoever's picking up the phone is no different from dressing up as a cop and harassing someone on the street. The question of whether the content of your accusation stems from their own or their employer's action is peanuts compared to the ethical boundary you crossed when you decided to impersonate authority to witness their reaction.


This feels like it creates a massive ethical loophole.

There are different ways to gather pure factual information, too. In particular where the factual information you are trying to gather is information about the extent to which someone complies with the law, there's some real danger in being able to fall back on a 'we're just gathering facts' defense.

Take this example: "a researcher calls the director of a shelter for battered women and asks her for the average length of stay of the women who use the shelter"

What are the regulatory requirements shelters need to comply with? Do any of them concern length of stay? Are there any liabilities a shelter might expose itself to if it were known that it had women staying there for longer than a certain period? Or individual liabilities if it were discovered that they restrict how long people can stay? Would they potentially expose any of their clients to danger if the length of stay information were revealed to a particular person?

If so, then providing the answer to that question is something the shelter needs to give some thought to. And the manner of their response might be different if that question were posed to them by:

- a woman enquiring about staying at the shelter

- a government inspector

- their landlord

- a random man phoning them

- a journalist

- an academic researcher identifying themselves and the nature of the study they are conducting

So if as an academic you ask a 'just gathering information' question, but conceal your identity, don't share whether the information will be aggregated or identifiable, and don't explain what you're gathering the information for, you are not just collecting a fact - you are forcing the person you are asking to make an evaluation of what information to provide; in other words, you are creating a human behavior, and what you are studying will be the outcome of that.


I think one problem is that with small websites run by a single person or small group, a person can feel the website is an extension of herself. So a question about the website in some way becomes a question about the person.


More critically, it may actually be an extension of themselves in terms of legal liability.


Note that the title doesn't provide the full meat.

It reads: "CCPA Scam November 2021"

Story update notes: "This is a human subject research study conducted by Princeton University"

I was attempting to submit my own instance of this when I discovered ColinWright's. My suggested title was going to be "CCPA Scam ... is a human subject research study conducted by Princeton University".

Panicking small web operators without consent being the issue.


Ok we've squeezed that in above. Thanks!


And thanks.

I was concerned that might be too much adaptation. Apparently not.


I'm not a moderator (obviously) but I've noticed that when the article's original title is sufficiently vague (like for this one) the mods allow for a lot of leeway in the title as long as it is representative of the article.


Nor am I, though I often email HN with issues (titles are a frequent case).

The guidelines call for original titles unless excessively long or clickbait, and request no editorialisation:

https://news.ycombinator.com/newsguidelines.html

The typical options for alternatives are:

- Shorten the original title, especially by eliminating superlatives or counts. "Ten things you should know ..." becomes "Things you should know...", "Incredible new light..." becomes "New light ..."

- Replace vague terms. "This guy who..." becomes "<Name of guy>...". Note that this often (though not always) shortens the title.

- Substitute an alternative title. The <title> tag might work, if present, otherwise a line picked verbatim from the article if possible. Occasionally gluing two phrases together is justified, as here, where "CCPA Scam" was the obvious substitution for the pronoun "This".

The key here is to find something that clearly expresses the significant content / context of the article, without sensationalisation.

Different types of submissions have their own challenges. Many commercial pieces are clickbaity, vague, listicles, or sensationalised. Microblogs (Mastodon, Twitter) have no title, and the lede line may not be especially descriptive. Blog titles run the gamut from infuriatingly vague to pretentious to long to overly terse. So long as I think I can make a reasonable argument for a substitution and it improves the headline, I'll propose it. If it's my own submission, I'll include a comment describing any changes made. (The start of this thread was to be that comment except that the article was already submitted.)

There's an exception to the clarity rule in practice for major announcements, as in corporate earnings and departures, which are often simply titled "Letter to..." or "<period> Earnings statement ...". Rather than highlight the principal point, especially in the first case, HN tends to shy away from that, largely to help tamp down hot takes and emotive responses. Which probably does help improve the quality of discourse. (See for example Jeff Bezos's CEO resignation announcement, or Jack Dorsey's.)

My suggestions for alternatives are frequently accepted by HN mods, or they'll come up with something close to it.


I've gotten 4 of these mails to 4 of my domains (including my personal domain used just for email, and a one-page documentation site for an open source library)... 2 about CCPA and 2 about GDPR. They also gave me a lot of anxiety for no reason. Looking at the responses on Twitter, a lot of websites spent real money consulting lawyers before responding to these mails due to the thinly veiled threat of legal repercussions at the end of each one.


Sounds like a good place for a class action! Those legal fees ought to come out of Princeton.


Why? If that's indeed the law, then it's up to the website owner to comply. Whether it's Princeton or a private individual writing the email doesn't matter.


A key point is that it's not indeed the law. Many (probably most) of the recipients - including the author of the original article - are not actually required by Section 1798.130 of the California Civil Code to do anything, even if it was a legitimate question from a real person, because their websites are far below the limits where those CCPA requirements start to apply.

The survey was making a fraudulent legal threat, demanding that the recipient do things for the benefit of the sender (namely, provide data for the study, while lying about how that data will be used), based on the false claim that there is a legal duty to respond within a certain time, when in fact there is none.


The websites have to comply with the law, but a university should not be sending them emails lying about their obligations under any law.

The emails were from fake people. So any work preparing any response regarding those fake people's personal data is obviously not required by law. They don't exist.

And, as pointed out elsewhere in the thread, the researchers are probably not protected by either of CCPA or GDPR.


There's a big difference between:

* one honest email

* thousands or millions, sent under false pretenses

There are two differences:

1) I'm allowed to cold call you. I'm not allowed to set up a robot to place millions of automated phone calls.

2) If I lie, I may have a problem. Someone runs up to your home and yells that your house is on fire, and you believe them. You jump out of a second-story window, breaking your windows and your legs. They do it in a stunt for TikTok. Who do you think is liable?

A third difference is IRBs. The right place to handle this is complaints to OMB; Princeton should lose federal funding here.


> Why?

No costs would have been incurred without Princeton asking a very scary question.

Additionally, by not disclosing that the question was research, they also skewed the results if they were looking to see what prevailing attitudes and practices actually are.


Because they sent me three emails and I don't even provide a service for the domains they mentioned.


On what basis? Why can't we reasonably expect these sites to follow the laws? Just because they have in the past survived being unethical and not following them doesn't mean they have some sort of claim when they scramble to fix their failures.


The sites that are big corporates that abuse the shit out of this insufficient law will just consider it legitimate interest and claim they are vendors, not affiliates the data is being sold to.

A huge number of the people that got this were ethical individuals: not corporate, not for profit.


Let me help:

We’re some students from Princeton trying to understand how businesses are responding to CCPA and GDPR requests. Could you help us with our study? How would you answer these questions?

The point is disclosure. It’s unethical to do otherwise, especially given that is about the use of data. I’d love for there to be more data published about the impacts of these policies, but please don’t use the tactics of creeps in the process.


It is likely that quite a few people would lie if they knew they were going to be observed/studied or reported on. However, I'm sure they could've made the actual email less threatening and more friendly/ethical without revealing the research intent (or the intent to research this specific aspect).


>people would lie

Possibly. And the IRB review process may allow for non-disclosure under circumstances of that sort.

The problem is not merely that the IRB allowed non-disclosure. It's that the IRB also granted an exemption from full review as human subject research. If the researcher expected that human behavior might change based on secrecy vs. disclosure, then it is fundamentally not passive data collection.

But debating the secrecy issue or the limits of what constitutes a human subject is beside the point: the research protocol had an adverse impact on humans involved with the study. Regardless of any other considerations, that makes it research that should have had full IRB review. Evaluating the potential for adverse impact is literally one of the foundational reasons for the existence of IRBs. The presence of an adverse impact is de facto proof that an exemption should not have been granted and that a full review should have been done to determine how the protocols could be tweaked to mitigate the issue.


> It is likely that quite a few people would lie if they knew they were going to be observed/studied or reported on.

Even assuming your premise is true (it's not), you think the solution to not have people lie is....to lie to them?


Yes, there are reasons that studies sometimes lie to their participants. These lies are something that has to be justified to an IRB, and the study has to be designed to carefully minimize harm. Lying to unwilling participants only raises that bar. In this case, the bare minimum ethical way to conduct this study would have been a careful manual review of every unwilling participant that was going to be receiving deceptive communication, to ensure they actually fell under the laws in question.

The combination of deception, legal intimidation, and scattershot automated selection of unwilling participants is a particularly egregious ethical failure.


Seems like a career academic with no experience in the real world playing around like this is some kind of game. I'm sure they meant no harm, because they don't consider anyone "participating" to be anything more than a potential subject in their agenda to get a good review on their paper.

That letter and their social media posts are nothing more than a facade to maximize return with no consideration of impact.

Total negligence.


Mayer has a JD and is licensed in CA (I don't know about NJ), has worked for at least one US Senate office, and has been so involved in actual practical privacy work that ad companies pressured the president of Stanford to expel him for his legitimate work on DNT.


The person who designed and ran this study is not Mayer. Mayer runs the lab, but this is a subordinate's baby.

From the study's website: "Please contact the lead researcher for this study, Ross Teixeira (rapt@princeton.edu), if you have any questions, believe you received an email in error, or would like to opt out of any future communication related to the study. The additional members of the study team are Professor Jonathan Mayer at the Princeton University Center for Information Technology Policy, who is the Principal Investigator, and Professor Gunes Acar at the Radboud University Digital Security Group."


Ross appears to be a PhD student. Hardly a "career academic."

The point is that the "these are all ivory tower idiots who don't know how real people work" is a silly argument to make given their backgrounds.


Sounds like we should all e-mail him to opt out of his future studies. Maybe he’ll get why a massive e-mail campaign is a bad way to do this “study”.


I received exactly the same email with a different sender ("Anna Roland", a resident of San Francisco, California) and was also left quite paranoid by it. The email had a combative tone and felt like a legal threat.


We received the same email as well, also from “Anna Roland.”


Me, 5 days ago:

https://news.ycombinator.com/item?id=29539266

I was concerned enough about this that I updated our project privacy policy with pre-emptive wording about CCPA (now reverted):

https://web.archive.org/web/20211218125309/https://textpatte...

I'm mildly annoyed about the time I wasted on this, but I guess that in itself is anecdata for this study.


As a counterpoint, I don't consider this study or the Linux kernel study human subject research, unless we define human subject research so broadly that the definition essentially becomes meaningless.

As a side note, I find the "outrage" about these small academic studies quite hypocritical. This is a community where a significant proportion of people work in fields related to ads/clicks and constantly experiment on human subjects (at least by the definition applied to the cases above). Let's not even talk about the research done by Facebook et al (where a significant proportion of developers here work as well), who literally looked at how changing the timeline affects the mental health of users.


If a study is observing how humans react to a certain situation, that's research with human subjects. The Linux study observed how maintainers react to bugs; this CCPA/GDPR request spam observed how data protection staff react to requests about their processes.

And the backlash is not hypocritical. You're of course right that FB has also done really questionable research, but that doesn't matter here. I've also seen significant uncertainty about this spam series in the data protection/privacy community, i.e. criticism by those people who get to deal with these emails.


By that definition, if I change the layout of my website and observe whether it changes how humans behave, i.e. how and where they click, it's human research. With that definition pretty much everything is human research. Even if I track where rubbish is being transported to, it is observing human behaviour and thus human research.

It also remains hypocritical. If you (not you personally, but in general) are outraged by this research as unethical and you work for a company that does any optimisation of their ads/engagement etc., you are contributing to the same (and arguably much worse) behaviour that you condemn as unethical. I call complaining in others about something that you do yourself on an often much larger scale hypocrisy.


When you study how humans react to changes to a website, you study the behavior of the visitors to the website.

When you study where rubbish is being transported to, you study a system designed by humans.

There's an obvious difference.

Also, I think most of optimisation of ads/engagement is plain unethical. So there's no hypocrisy on my part.


> I call complaining in others about something that you do yourself on an often much larger scale hypocrisy.

The assumption that everyone complaining about this works in adtech or even does A/B testing is ridiculous. HN has a very strong contingent of people who are antagonistic to that entire field so making such a generalization is absolutely false.

I would point out that A/B testing minor changes without consent is a little creepy; deceiving your users in the process (by, say, changing pricing on them) makes it far more creepy. If you add bogus legal intimidation (or other language designed to elicit a strong emotional response) to that, it becomes creepy on a whole different scale.


If you’ll recall, Facebook did take a lot of flak for changing newsfeed content for users to see what behavioral changes it would induce.

https://www.theguardian.com/technology/2014/jul/02/facebook-...


The nub there is not the definition of "involving human subjects", but the definition of "research".

By the relevant federal regulations (https://irb.ufl.edu/index/humanrsch.html)

================================================================

(l) Research means a systematic investigation, including research development, testing, and evaluation, designed to develop or contribute to generalizable knowledge. Activities that meet this definition constitute research for purposes of this policy, whether or not they are conducted or supported under a program that is considered research for other purposes. For example, some demonstration and service programs may include research activities.

================================================================

This is a good overview: https://irb.ufl.edu/index/humanrsch.html


Wait. Did you really mean to tag anyone simply _employed_ by Facebook as hypocritical? As if being employed by an entity immediately connotes acquiescence to whatever unethical or immoral behavior that entity engages in? Isn't that "guilt by association" taken a bit far? It's as if you were to call Google pro Sanders because a significant amount of money was donated to the Senator's campaign by non-executive Google employees (as if there were many other avenues to protest the status quo and still put food on the table). Full disclosure: not a big tech employee or a dev, just another over-the-hill greybeard sysadmin with absolutely no influence on corporate behavior.


That is a ridiculous comparison and absolutely not what guilt by association is.

If you work for a political campaign before the candidate announces their support for killing puppies, holding you responsible for that position is "guilt by association", but if you go to work for that same candidate after the announcement, it is no longer guilt by association, you have made a deliberate choice to support someone who wants to kill puppies and now share some responsibility.

Much of Facebook's anti-privacy behavior is widely documented. Going to work for Facebook absolutely makes you a little responsible for that behavior.


I thought my friend Jim, a professor of Writing, Rhetoric, and Digital Studies at the University of Kentucky, had a good take on this: https://twitter.com/ridolfoj/status/1471536878658719748


He seems to have good intentions, but does not seem to have knowledge of the IRB process, which may make this situation worse.

Specifically, he confuses "does not constitute human subjects research" with "exemption" which is a pretty big difference and anyone who works with human subjects should know this.

From his Twitter thread, "Update: They are now saying they have an exemption. They have not made any forms available or explained the lack of informed consent."

Exemptions are protocols that have been reviewed, and deemed exempt based on one of 8 very specific criteria. Studies deemed not constituting human subjects research are returned by the IRB, and not considered reviewed.

Given that the authors actually said "...to the Princeton University Institutional Review Board, which determined that our study does not constitute human subjects research" this is clearly NOT an exemption, and informed consent is not a consideration as far as the IRB is concerned.


They've been doing this since April 2021 [1]. Early reports can also be found on Reddit, with people being concerned. We too received quite a few of these emails over the last few days from various fake identities and wasted time responding to one of them before realising it was not legit!

I can't imagine how much time and potentially money was wasted on these mass emails.

[1] https://joewein.net/blog/2021/04/21/questions-about-gdpr-dat...


This is Teixeira's Twitter claiming that the study has been very well received by its unknowing participants: https://twitter.com/RossTeixeira/status/1471249557883432967


Well I hope the bill for legal services is as well received by Teixeira as he claims the unknowing participants are receiving his "research"


It appears the study website has been edited to state they are no longer sending emails.

Previously it said they were sending these emails up until Spring 2022[1]

[1] https://web.archive.org/web/20211216002029/https://privacyst...


The study FAQ claims:

> What happens if a website ignores an email that is part of this study?

> We are not aware of any adverse consequences for a website declining to respond to an email that is part of this study.

But the email sent out states:

> I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code.

So the email very clearly states that there is an adverse consequence for a failure to respond, namely a violation of the California Civil Code.


> I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code.

>So the email very clearly states that there is an adverse consequence for a failure to respond, namely a violation of the California Civil Code.

I've read and re-read (and read many comments) but where is the adverse consequence stated?


> but where is the adverse consequence stated?

Here:

> as required by Section 1798.130 of the California Civil Code.

If you tell someone that they are obliged to do something as per the law, the meaning is obvious that not doing so is a violation of the law, which is an adverse consequence (the consequence being breaking the law and whatever penalties come with that).


The researcher is using the "secret shopper" justification for why secrecy was needed here.

This completely undermines his receipt of an exemption from full human subjects review, if he anticipated that recipients' behavior would differ depending on whether or not they knew the communication was an authentic request from a user.

This is a bit of a subtle point, and I wouldn't necessarily expect a PhD student to pick up on it. It's really something-- depending on how the application was presented-- that the IRB should have caught. I'm surprised that any IRB would grant exempt status under those conditions, because it fundamentally meant the study was not passive data collection such as a survey, which usually is exempt.

Hopefully there is a public post-mortem of how this approval went down. Considering the adverse impacts on display, it was de facto a study that needed complete human subjects review.


I'm the person who wrote that blog post. I got an email from a fake person in France who asked several questions about my small social media site's CCPA compliance, then ended the letter with:

> I look forward to your reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code.

I thought I was about to be sued by someone who was the equivalent of a patent troll, but for the CCPA. I had a minor panic attack before I was able to calm down and piece together an open response (as I wasn't about to reply directly to them, not after their thinly veiled legal threat). I briefly considered lawyering up just in case, which would have cost me a fair chunk of money.

After all that, it turns out it was research by Princeton University: https://privacystudy.cs.princeton.edu

The lead researcher, Ross Teixeira, says that:

> We submitted an application detailing our research methods to the Princeton University Institutional Review Board, which determined that our study does not constitute human subjects research. The focus of the study is understanding website policies and practices, and emails associated with the study do not solicit personally identifiable information.

Either:

1. Teixeira misrepresented his research to the IRB, or

2. The IRB is grossly incompetent or unimaginative.

I don't see a middle ground on this, particularly after the University of Minnesota vs Linux contributors debacle (see https://www.theverge.com/2021/4/30/22410164/linux-kernel-uni...). I see Teixeira's research as the equivalent of sending a bunch of fake legal notices to random people to "study the legal system", while claiming with a straight face that it doesn't involve human subjects research. Frankly, that's bullshit, and I can't believe someone signed off on this.

As noted on the blog post, I've reported this to Princeton's Research Integrity & Assurance department. This isn't OK. I didn't consent to be a part of their human research (which it absolutely is, however they might try to claim otherwise), and this research seriously freaked me out. I slept poorly for a couple of days thinking I was facing possible legal issues over hosting a little not-for-profit hobby website.


There is a separate Research Integrity group to address exactly this sort of problem. I'd call and email them [0] and then maybe cc the Office of the General Counsel (their lawyers) [1] so they are aware of the type of liability that lax research oversight may be creating.

For good measure, his research advisors should hear about this as well. Per his own website (now unavailable but archived on wayback [2]), they are Jonathan Mayer [3] and Jennifer Rexford [4].

Ross: should you come across this HN post, read through every comment. You need to understand-- especially studying tech policy!!!-- just how poorly done this was. Really not a great way to begin your reputation in this field.

[0] https://ria.princeton.edu/report-concern

[1] https://ogc.princeton.edu/

[2] https://web.archive.org/web/20210122100955/https://www.rosst...

[3] https://jonathanmayer.org/

[4] https://www.cs.princeton.edu/~jrex/


I've met Ross during my time at Princeton and he is a really genuine person, he is not trying to ruin anyone's life. This incident is the result of an uncharacteristic blind spot in empathy: a mistake.

I also have experience with the Princeton IRB on similar topics. The reality is that Princeton's IRB, and IRBs in general, are not equipped to deal with this sort of online research. IRBs were created as a reaction to unethical medical research, in particular the Tuskegee Syphilis Study [1]. My experience has been that the IRB has greater expertise in medical and sociological studies. This leads the IRB to take a very narrow view of its remit in other domains. Unless humans are in a very literal way "subjects" of the study, the IRB doesn't see it as human subjects research. In this case the IRB likely saw "Free Radical" and other websites as the subject. In both my studies and those done by my peers, the responses on what is and isn't human subjects research are uneven, and you will often get a generic "this study does not constitute human subjects research" response from the IRB. This can be the case even if there are possible negative repercussions to the "not subjects" in your research.

For example, say your study involves testing vulnerability disclosure policies. How well do websites respond to vuln reports? In your study you send out 100 vulnerability disclosures. After you report these vulnerabilities, a human may read your vulnerability report and make a decision based on it. This presents a risk that the individual security team employees involved in your study will be scapegoated and fired when you publish your (potentially damning) results. How do you balance the value this study provides the public against the risk to the individual employees' livelihoods? The IRB isn't going to help you do this balancing; they will just say "this isn't human subjects research".

IRBs quite simply aren't equipped to evaluate this sort of research at the moment. This can be frustrating for a young twenty-something researcher just out of college trying to do the right thing while generating impactful research. You come in thinking that the IRB will be a guiding hand of wisdom and prudence, but you are quickly disabused of that notion after most of your interactions feel like conversations with lawyers in a compliance department. Many researchers in "CS" don't even involve the IRB, because they don't always see the ethical dimension of their work, but the fact that Ross did shows that he was trying to do the right thing here.

[1] https://en.wikipedia.org/wiki/Tuskegee_Syphilis_Study


I don’t doubt that Ross is a nice person, and I think he meant well. FWIW, I think this is a great thing to study and in other circumstances I’d be glad he’s doing it. But much as Ross didn’t intend for me to be hyperventilating, heart pounding as I imagine trying to explain to my wife how my little hobby is getting us sued, that’s exactly what happened. That was a whole awful lot of extra stress that I didn’t need.


Howdy Paul!

I definitely see a problem in that some people think that if the IRB doesn't object to what they're doing, it's OK. But ethics is a responsibility of the entire research team, and the research team is usually far better placed to understand the implications of their research strategy than the IRB.

The following are big problems here:

  - lack of informed consent
  - deception
Researchers should be trained that those are only allowed in exceptional cases where the benefits outweigh the harms.


I feel like "coercion" (legal threats) should probably be a separate bullet point from "deception"?


Well if you have informed consent, it's not going to be a problem. If you don't, then you need to do a more careful analysis of what ill effects might ensue when someone gets the letter (feeling distress, spending money on a lawyer).


Aren't these issues common in many other societal studies, for example fake resume hiring studies?


Yes.

IRBs exist, in part, to weigh the cost to the humans/etc vs the possible benefit of the study.

Take: https://www.nber.org/system/files/working_papers/w21560/w215...

Look at footnote 3.

There is often a tendency to dehumanize things when it involves sending stuff to corporations. Even in that footnote, it's not employers processing fictitious resumes, it's people.

So it's much more likely you'd get approval to do something "to a corporation", even though 99% of the time it's really still being done to humans.


I'd like to disagree as someone who knew Ross during my time at Berkeley. He absolutely is intelligent and thoughtful enough to know what he was doing -- including the consequences.

Berkeley's IRB is similarly ill-equipped -- as a result, a lot of trust (i.e. empathy) is placed in the lead not to do anything as obviously unethical as this. This is not the kind of mistake that someone as intelligent as Ross makes; this was a conscious decision that backfired.


The fact that Ross didn't mean to do this is all the more reason why someone - maybe an IRB, maybe not (your argument makes sense) - should be assisting 20-something researchers with having a well-informed perspective.

In the absence of an organization that's good at this (which doesn't seem to exist and should), this probably should be the supervising professors.


As a research group leader, I find it unfortunate that the grad student seems to be the public face of this and is therefore attracting most of the ire. Feels like the student is being thrown under the bus, and responsibility for ensuring the study is conducted ethically should ultimately be that of the principal investigator.


Mayer posted an apology last night taking full responsibility. Hardly throwing his student under the bus.


Thank you for making this point. I didn't articulate it, but this is part of why I felt I had to say something.


I hope this fellow Ross does not become suicidal or otherwise depressed when he sees the weight of the internet coming down on him for this faux pas. Ross, none of this will matter in a year. Or 5 years.


[dead]


Sorry, but this sort of attack is not ok on HN and I've banned the account. It's entirely possible to post substantive critique without stooping to this—as many HN users have demonstrated in this very thread.

https://news.ycombinator.com/newsguidelines.html


Fortunately there are a lot of other more-senior leaders in larger companies and longer-lasting non-profits who believe people can learn from their mistakes and there is no magical class of people who never make mistakes. Obviously mistakes were made here, and obviously at least the PIs should have known better - but deciding that you're going to keep the student on some mental blacklist forever, and instead find people who haven't yet had the chance to learn from their mistakes, is short-sighted.

You've probably heard the story of IBM's Thomas Watson being asked if he was going to fire someone who made a mistake that cost the company $600,000 in lost sales. No, he said - I just spent $600,000 training you! Why would I want someone else to benefit from that training?

(Also, the fact that you are aware there might be enduring consequences to this comment if you associated it with your name, and are therefore keeping your name off of it, is ... interesting. My advice to Ross Teixeira is that, until proven otherwise, the commenter above is some random troll in high school who doesn't know how the real world works. If they wanted the threat to be taken seriously, they'd post it on LinkedIn.)


I think it’s kinda funny how sending vaguely threatening emails (suggesting violation of statutes) sailed right through an IRB, but Scott Alexander got the third degree for giving patients a survey. Description:

>> When we got patients, I would give them the bipolar screening exam and record the results. Then Dr. W. would conduct a full clinical interview and formally assess them. We’d compare notes and see how often the screening test results matched Dr. W’s expert diagnosis. We usually got about twenty new patients a week; if half of them were willing and able to join our study, we should be able to gather about a hundred data points over the next three months.

https://slatestarcodex.com/2017/08/29/my-irb-nightmare/


UCI, for example, seems to have a very well defined notion of human subjects research, and this would clearly meet it.

Let's look: https://services-web.research.uci.edu/compliance/human-resea...

"Any systematic investigation (including pilot studies, program evaluations, qualitative research), that is designed to develop or contribute to generalizable (scholarly) knowledge, and which uses living humans or identifiable private information about living humans qualifies as human subjects research. See Definition of Human Subjects Research for more information."

Down the rabbit hole to https://services-web.research.uci.edu/compliance/human-resea...

"Research is as a systematic investigation, including research development, testing and evaluation, designed to develop or contribute to generalizable knowledge. ...

Examples of systematic investigations include:

Surveys and questionnaires

"

So far, we got it in one.

I'll skip the part of whether it's generalizable - it's clearly intended to be here.

"A human subject means a living individual about whom an investigator (whether professional or student) conducting research:

Obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens; or

<The or is about getting PII in more cases, but this study is not getting PII>

...

Interaction includes communication or interpersonal contact between investigator and subject.

... "

Well, there we go.

Seems a lot more straightforward in various IRBs than you seem to suggest. As an aside, lots of IRBs also have mass email policies and are required to approve the text.

Now, maybe Princeton's IRB does not have as clear a definition. I can buy it, in fact!

But honestly, it doesn't seem that hard. If you are going to simulate fake emails to humans, for the purpose of gathering their responses, you are in fact, doing human subject research.

It also doesn't seem very hard to draw bright lines:

1. If you are interacting with people to see what their response is, even by email, they need to consent.

2. Do not deliberately deceive humans.

(You can even modify #2 to "do not deliberately deceive humans without an IRB explicitly understanding and weighing the cost/benefit" if you like, but most of the time, you actually do not need to deceive humans)

It's also really really hard to believe someone went to an IRB, and said "i'm going to survey people by sending them emails from fake people that seem mildly threatening, and seeing how they respond.", and an IRB was like "yeah, that seems okay, it's definitely not human subjects research".

It's up to the researchers to explain precisely and accurately what they are doing. Saying you are surveying websites is totally inaccurate and confusing.

If a sociological researcher said "whoa, I'm not emailing people asking for their family histories -- that would be human subjects research -- I'm just retrieving directed graph data from remote email addresses", I don't think that would go over very well.

Finally, as for not seeing the ethical dimension of their work, there is an easy fix for this (IMHO): make ethics classes required. In fact, in lots of places, IRBs won't review things if you haven't taken them!


I think their point was IRBs say information isn't about an individual when the individual would say it is. Everything you quoted depends on the word about. And UCI's policy refers to US regulations. Those regulations contain surprisingly broad exemptions.[1]

People talk about emailing web sites any time they don't know if it's a person or a company in my experience. And ethics classes don't give everyone the same understanding of ethics.

[1] https://www.hhs.gov/ohrp/regulations-and-policy/regulations/...


Quick meta-comment; this is useful and informative information about the original link - please don’t downvote because you disagree with the point of view. If you disagree, please add a comment and make HN a great place for discourse!


Downvotes are ok to use to indicate disagreement, and it has been so forever. Pg and others have supported this view.


Poignant detail: the Radboud professor that's part of this "research" has experience in dark patterns so he knows damn well what he's doing. By his own words, from his website:

> I also study anonymous communication networks such as Tor, and investigate deceptive and manipulative (dark) design patterns

Radboud has policies about informed consent [1] that were clearly ignored, or were explained away with the idea that informing should be allowed afterwards to not taint the experiment (even though this is just a basic data policy).

I believe the recipients of these emails should file a complaint against Teixeira's co-conspirator as well. Contact information for the Radboud ethics board can be found at [2], though the documentation is mostly aimed at students.

[1]: https://www.ru.nl/rdm/collecting-data/informed-consent-ethic...

[2]: https://www.ru.nl/science/research/about-our-research/ethics...


Great advice, thanks! I’ve filed a complaint with Radboud, too.


I agree that you were wronged by these so-called ‘researchers’, but it’s also tragic that our legal systems are so bad that everyone fears them. The legal system should be a low-stress, reliable, and predictable way to avoid or reduce conflict, instead of a weapon of terror.


What's more annoying is that constructing the email as coming from a non-academic "other person" was a CONSCIOUS decision by the research team (most likely the advisor). I don't see what benefit hiding behind that illusion adds to the information gathered, beyond a worry that recipients wouldn't respond to a more traditional "We are researching CCPA..." style email.

The IRB does allow for deception, just to be clear. It's annoying, but sometimes that's what's needed to get genuine responses. HOWEVER, the work the team needed to do to justify its use here was very poorly executed.


Implying the law said they had to respond looks like a bigger problem to me. Some people said they wouldn't have ignored email from a researcher. So saying it was for research would have changed the responses.


That's a fair point. The text of the CCPA is very clear that it doesn't apply to my hobby website (see https://oag.ca.gov/privacy/ccpa for a nice FAQ), but I wasn't thrilled about the idea of having to explain that to a jury.


You did not lawyer up, but some other recipient might have. Are there grounds for a lawsuit here for... well... fraud? After all, resources were spent; surely, there was some stress.

Yeah, I agree with you.


I'm not a lawyer, clearly, but I'd say so. Some people are replying to him on Twitter saying that they've spent money here and asking who to send the invoice to.


Not a lawyer, but I would say no. Being aware of your legal obligations is part of the cost of doing business. Expecting a company to follow the law isn't anything special. Just because in the past they were unethical and didn't spend money to comply doesn't mean that, once they hear about it and do spend money, the person they heard it from has to pay. That would be insane. They were in the wrong by not following the laws or being aware of them. Now they are aware of them. In no way is he responsible for these idiots' legal costs, especially when they should have already paid them before.


Might be worth reaching out to Princeton's IRB?

https://undergraduateresearch.princeton.edu/compliance/human...


Done, thanks. I'm not sure what the difference is between that and their RIA (https://ria.princeton.edu/report-concern), but now I've reported it to both.


You should report this to the IRB. The research is conducted on information obtained by interacting with humans, and therefore should be classified as Human Subjects Research [1].

Waivers of informed consent can be obtained under some circumstances, for example in the case of a retrospective study where the data has already been collected and contacting subjects would be difficult/unnecessary, or it can be shown to adversely affect the outcomes of the study [2, search for waiver].

But regardless, even if informed consent were to be waived, the fact that this is human research means that the researchers should be trained in Research Ethics and Good Clinical Practice (even if the research is not clinical), and understand that the goal is always to minimize risk for the participants - risk which was clearly not properly evaluated under the current project.

[1] https://grants.nih.gov/policy/humansubjects/research.htm [2] https://www.law.cornell.edu/cfr/text/45/46.116


That’s great information, thanks! I’m learning an awful lot about this stuff very quickly.


The IRB is underneath the RIA. If this was the IRB's screwup, it's the RIA they will answer to. And probably a nice chat with general counsel to assess any liability.


1. Teixeira misrepresented his research to the IRB, or

2. The IRB is grossly incompetent or unimaginative

My experience with IRBs is that they are often extremely conservative in their interpretations. Legal liability is attached to them. Human-involved research requires an order of magnitude more review to get approved. There are gray areas, and my (very limited) direct observations are that people will try to frame their material to avoid the extra review. I can't rely completely on Princeton's reputation for this, but if I had to guess, Ross either did a poor job presenting to the IRB or deliberately downplayed the nature of things.


I love how they say they are "contacting websites" as if websites are sentient beings that can respond to questions, rather than operations run by human beings who will receive and respond to the communication.


Websites aren’t sentient beings, but they are more similar to commercial entities than people. Even if not intentional, websites gain traffic and can display ads. They have Google ranking. They have an audience and can get paid to share information with their audience.

Would there be an issue if they sent out letters to businesses asking how they comply with a California regulation?


>Would there be an issue if they sent out letters to businesses asking how they comply with a California regulation?

I think this is where you may be overlooking the context.

People aren't mad they asked about compliance with a law, they are mad about the way it was asked: from fake personas implying legal threat, while cataloguing the replies for their study no one asked to be involved in.


That seems like a pretense.

The researcher was an actual human being, so all they would have had to do to require a response is register on the site before sending the email. If they had registered accounts, then requested their information be sent to them and requested its deletion, it would have been an order of magnitude more work for the site owners than just sending answers about the process (which, if the site is subject to the law, should already be prepared).

I think people are mad precisely because they were asked about compliance with a law. Largely because emails went out to sites that were not commercial or too small to be bound by the law, so they weren't aware of it and panicked.


If the researcher is at Princeton, which (last I checked) is neither in the EU nor in California, they may not have standing to compel a response under GDPR or CCPA, both of which apply to data about persons within their territories, as I understand it (although interpretations certainly vary).

According to the linked blog, the owner wasn't covered by CCPA anyway as I suspect is the case for a lot of the recipients, so there would still not be a response required. Some of the sites may have data exports and account deletion clearly available to users anyway, in which case no human interaction would be needed; but the research wasn't looking for that.


Here's to hoping you're awarded damages for pain and suffering, or at least get a nice settlement.


If there's a settlement, I hope it's in the form of a nice bottle of scotch, and a letter from Princeton apologizing and swearing not to do it again. I'm not out money; I'm out peace of mind.


I just hope there is some incentive for other researchers to not follow in this study's footsteps.


I don't think this reaches the level of a cash settlement, personally. It is certainly shocking and would upset me. I agree that Princeton should try harder.


Same. I’m neither asking for nor wanting any kind of a settlement or anything. I just want them not to do it again.


I don't know California statutes, but in Illinois that would be a crime:

(720 ILCS 5/17-50) (was 720 ILCS 5/16D-5 and 5/16D-6) Sec. 17-50. Computer fraud. (a) A person commits computer fraud when he or she knowingly: (1) Accesses or causes to be accessed a computer or any part thereof, or a program or data, with the intent of devising or executing any scheme or artifice to defraud, or as part of a deception;


I don't understand why you reacted so strongly. I feel like it's not a big deal to receive a message like that; what am I missing?


Because receiving a letter citing chapter & verse of the legal code is generally never the precursor to a nice friendly chat.

The actual-not-fake-researcher (instead of fictional people) could have sent a nice friendly request saying, "I'm a PhD candidate working on public policy in the tech sector. Could you please answer the following questions regarding your process of CCPA compliance, if applicable"


Legal threats are a common occurrence nowadays though. I get calls weekly saying a warrant has been issued for my arrest or that my "SSN is about to be revoked".

The email also clearly says they are not sending a request at this time and it seems nicely written to me. I guess I don't get why this is on HN and everyone is so livid about it.


I understand what you’re saying, but this seemed a far more credible threat than someone wanting me to send them Bitcoin to delete my webcam video. For instance, here’s a story about a lawyer who filed so many ADA lawsuits that a judge barred them from filing any more. People abuse the legal system all the time, and while people on the receiving end of a lawsuit can fight it, it’s guaranteed to be expensive in many ways. I could absolutely see someone filing thousands of CCPA lawsuits that wouldn’t actually stand up at trial, but which would be an utter fiasco even for defendants who aren’t liable.

Edit: Oops, here’s a story: https://www.azag.gov/press-release/serial-litigant-permanent...


> I guess I don't get why this is on HN and everyone is so livid about it.

I think this kind of scam would end up on HN even if it was a bunch of Nigerians doing it. What's making people angry, instead of merely taking note while rolling their eyes at scammers stooping to a new low, is the fact that it's respectable universities rather than Nigerians.


> . I get calls weekly saying a warrant had been issued for my arrest or that my "SSN is about to be revoked".

And those are all illegal. If the telecoms weren't incompetent and protected from liability, you could find the people who did those things and either sue them or file charges.


Sure, of course not, but a near panic attack still seems a bit... out of proportion.


If you have run a business small enough that you don't have a lawyer on standby then you might understand a little better.

I have, and received a real legal threat. A bit of panic as you contemplate the financial devastation & wreckage it might leave your life in... well, a little bit of panic is actually a pretty reasonable response there.

If you've been in that situation and been totally calm about it then that's a good thing for you, but that's not the common response for someone contemplating a life-changing encounter with the legal system.


These businesses should have been aware of this already. It is their own fault for not being aware of their status and preparing for it. No one but they themselves are to blame in this case.


I think you didn’t read my original blog post that’s linked here. I’m not a business. They sent the email to me regarding my personal, hobby, zero-revenue website. I have no legal obligations under the CCPA, but I didn’t know that until I spent a few stressed-out hours researching this. Even then I was worried about the idea of being sued over it anyway, and having to explain to a court why I believed I shouldn’t be liable for damages.


Because it sounds like a letter you’d receive from someone who’s prepared to abuse the legal system to extract money from you. Also I think the OP pretty well described their reaction: they were afraid they were going to be sued.


I'm just running a hobby website. I'm not at all used to receiving letters that bring up legal questions, then give me a time frame to reply as per a specific law. To my non-lawyer reading, that looks like someone's doing their homework to figure out how to drag me into court. Judging from a lot of the responses I've gotten from other recipients, I'm far from the only one.


[flagged]


Please don't cross into personal attack in HN comments.

https://news.ycombinator.com/newsguidelines.html


Obvious scams are a lot easier to dismiss without worry than ones that actually look like potentially credible legal threats.

You're just blaming the victim here, possibly because you're biased by the hindsight of already knowing the legal threat was never real in the first place.


For those who receive what appears to be a legal threat, anxiety seems warranted.


Have you ever been a sole proprietor in California[0]? Do you have a family?

Adding something about a timeframe and quoting the law clearly signals legal troubles. I would have 'reacted strongly' too.

[0] https://www.ftb.ca.gov/file/business/types/sole-proprietorsh...


I see that you're being downvoted, but for what it's worth I agree with you. I read the message and if I received it I wouldn't have thought much of it. Honestly I would've just thought it was spam. It's a shame if the OP did suffer mental duress as a result of it, though.


It's entirely possible that you and OP do not share the same 'what do I stand to lose' frame of reference.


I am surprised by the reactions here. I did not receive that e-mail, but I receive all sorts of weird inquiries for my websites (at least 2 or 3 per day). I don't understand why people are so mad about it or even panicking.


While I personally believe the questions stated were perfectly reasonable (and could have been genuine questions from someone), I can understand that people feel (legally) pressured to provide answers to the questions. Which puts this study in a grey area.

With regard to GDPR, the "respond within x" is simply not applicable. The one month period is strictly for any requests concerning Article 15 through 22 and none of the questions are talking about any of that.

Now, if one of the questions was something along the lines of "do you process any information related to me?" then it would potentially fall under Article 15.1 and would require a timely response. IANAL, however, I think in such cases you could simply point to a privacy policy, which you are already required to have.


Mentioning a specific section of a law is threatening to hold them to the letter of that law. What happens if they don't follow the letter of that law? The insinuation is legal action will follow.


They did not threaten anything; they were just asking for information regarding their personal data. And therefore they were within their rights.


You don't have to actually directly say the threat for the threat to be known. For example, if you're hitting me up for protection money, and you said "it would be a shame if something were to happen to x", that would reasonably be understood as a threat. Notice how you don't have to directly say that you will be the reason something happens to x.


There is no real person behind the inquiry. This is what made it deceitful and unethical.


The problem here seems to be that governments created laws that allow everyone to scare the shit out of people who dare to build something and put it into the public - without having to leave the comfort of their chair (pun intended). Data privacy is important. Not feeling the urge to hire a lawyer just to publish a small blog is even more important.


Even more fundamental is the fact that defending your innocence is expensive. A system that lightens your pockets when a bad actor invokes your name and will never compensate you for your loss afterwards is a bad system.


Bryan Cantrill (DTrace etc) has a good talk on ethics in software engineering[0].

Skip to 12:29 where he recalls Facebook in 2011 experimenting with users news feeds.

[0] https://www.youtube.com/watch?v=0wtvQZijPzg


Is there ever a bad Cantrill talk? Thanks for the link.


Anecdotally - no, I've yet to come across one.



Maybe the principal investigator Jonathan Mayer [1] should have read his own paper [2] about dark patterns. From the abstract "There is a rapidly growing literature on dark patterns [...] that researchers deem problematic" [2].

This research looks like a dark pattern to me.

[1] https://privacystudy.cs.princeton.edu/ [2] https://dl.acm.org/doi/abs/10.1145/3411764.3445610


Yep, we got the same. Seriously unethical behavior.


We've gotten them too, including one as far back as last April.


Interesting, did HN respond, or was it considered spam?


We always respond, and we get a plethora of them. Far beyond this particular study.

There's a whole other blog post waiting to be written about the intersection between GDPR/CCPA threats and spam.


We've gotten them not just for CCPA but also GDPR.


I got the GDPR one in April as well, and four more this month.


Please explain why this is unethical. The worst case is that you’re simply subject to the law. Presumably you’re abiding by the law. edit: I'm playing devil's advocate. I think the law sucks, the study is weird, and I empathize directly with the blog author. That said, downvoting to 'disagree' without explaining your reasoning is below the grade of this fine institution.


Sure. In my view there are two things that are unethical about this email.

The first is that they lied about who they were (and lied by omission about the purpose of the email). The ethics of deceiving someone for research are complicated, but should go through an IRB evaluation. Since they avoided IRB review by claiming it was a study about process, they should also have avoided the ethical issues that come with lying in the survey.

They should have been up-front and honest about who they were, and why they were asking the question.

If their research truly requires deceiving the participant (and I’m not at all convinced that it does), then it needed to be rigorously evaluated by the IRB, which almost certainly would have, and should have, rejected it.

Second, their research makes a demand for a response “without undue delay”, rather than a request. That is also unethical, as it’s misrepresenting the law and implying a response within 45 days is required by law. It is not.

Many of the involuntary subject participants are not subject to the cited provision of the law. As such, demanding a response within the time frame and citing that provision of the law is misleading. Also, the law makes no requirement for a business to respond to such a query within 45 days. The legislative text is here (https://casetext.com/statute/california-codes/california-civ...) and the only 45 day window that exists is for responding to an actual CCPA request, which this query explicitly disclaimed being. So even if this was a business required to comply with CCPA, they are not required to respond to this query. So, they lied (by implication) by suggesting that provision of the law applied to the business, and lied (explicitly) by claiming the law created a duty to respond to the email within a 45 day period, which it does not.


From the perspective of someone not living in the US, how on earth would one be expected to know what "the law" in California is?


When they invented the legal principle of "ignorance of the law is no defence", most common people were illiterate. That didn't stop them from finding people guilty of committing crimes they didn't know were crimes.


Part of the premise of the nation state is that you don't need to abide by other countries' laws. If you're doing business within that nation state, you're required to operate by their laws.


Quick, someone tell that to Julian Assange. :) Or all the webpages which choose to implement GDPR by blocking european visitors.


I agree with you. The networks allow incursion, the law seems to allow for excursion.


This 'study' is effectively a survey. Can you explain what ethical survey 1) does not inform its participants that it's a survey, 2) poses as a fake individual persona to conceal the identity of the surveyor, 3) cites either knowingly or unknowingly scary legal jargon to coerce participation, 4) does not make any attempt to disclose confidentiality?


I'm confused how this is human experimentation. Were they not merely collecting information on how a site handles these requests? Is it because they erroneously sent emails to sites that do not fall under the umbrella of the law they were examining?

An email asking an organization for answers to questions is human experimentation?

I must be missing something.


* When you do something to people to see how they act, it's a human experiment. The purpose of this study was officially "to understand how websites would respond to real users"

* The participants / subjects of the study are people, not "websites" as the study claims. Websites don't read and respond to emails, people do.

* The participants of this study were selected without their consent

* The participants were not told that they are in a study, nor the purpose of that study

* The participants were lied to, as the researcher pretended to be someone else

* The researcher deliberately communicated in a way that heavily implied legal consequences if they don't get what they demand

* The researcher's threats and demands do not actually match what is afforded to them by the law

* The study caused undue stress and financial losses (e.g. hiring lawyers) to the participants, including those that were not subject to CCPA at all.

I don't care what labels you choose to put or not put on it, it's a shitty, abusive study that should have never passed ethics review.


I think there's a scale issue here - if they email an entity that the CCPA actually applies to (for profit, >$25mil revenue, etc.) then it's likely going to a customer service rep who has a policy to follow and doesn't have any skin in the game, and at that point I'd agree it's not really a human experiment. Like if they emailed google with these questions, I wouldn't bat an eye.

When they email a single person who runs a website, that's very different.


Even the customer service rep case may become problematic, if the entity they represent is not adequately prepared to comply with the law. Then the rep's decision of how to respond to a difficult and awkward inquiry could potentially impact things like their criminal or civil liability, employability, or reputation—things that the regulations are concerned with.


> When you do something to people to see how they act, it's a human experiment.

That's not the official definition an IRB would use. The official definition is a lot less broad than a lot of people on this thread seem to think. It requires that you are collecting biospecimens, identifiable private information, or certain kinds of information about a specific person. [0]

[0]: https://grants.nih.gov/policy/humansubjects/research.htm


You're misreading it, the definition quoted on that page is very broad, specifically a human subject is the following with irrelevant clauses in or's removed and replaced with [or ...]:

> a living individual about whom an investigator (whether professional or student) conducting research:

> - Obtains information [or ...] through [... or] interaction with the individual, and uses, studies, or analyzes the information [or ...]; or

> - [...]

The actual legal definition is here [1], and further clarifies that "interaction" is very broad, specifically "Interaction includes communication [or ...] between investigator and subject."

This clearly qualifies OP as a human subject and this as human subjects research.

[1] Search "(e)(1) Human subject": https://www.hhs.gov/ohrp/regulations-and-policy/regulations/...


The key phrase is:

> a living individual about whom an investigator (whether professional or student) conducting research:

The mere fact that you're interacting with a human doesn't trigger it. If you're associated with a university, I'd encourage you to reach out to your IRB and ask them. The fact that the researchers in this case specifically were told by their IRB that it wasn't human subject research should be a good hint.


It's pretty clear that the living individuals about whom the investigators were conducting research are the operators of the websites targeted by the study. The researchers perhaps misled their IRB by stating that they were studying the behavior of "websites", rather than people, but websites don't read and respond to email.


Can you explain a bit more why you added emphasis to "about whom" in the clause you quoted? I don't see how those two words create any kind of exception; this research gathered information about how the humans behaved in response to the requests and threats.

Are you accidentally reading an "is" into that clause to interpret it as "about whom an investigator is conducting research"? What matters is not whether the researcher considers the human to be the target of the research, but whether the human (or their privacy) is actually affected by the research.


This is actually the most technically correct answer on this page. Everyone is going by their own opinions about definitions of what constitutes human subjects research, rather than starting from the primary sources. IRB guidelines are dictated by the federal government "common rule", a common standard adopted by all institutions that receive federal funding.

"about whom" is a key criteria from the federal government to determine whether something fits the definition of human subjects research. Here's a quote from HHS:

"The phrase ‘about whom’ is important. A human subject is the person that the information is about, not necessarily the person providing the information. In the case of biospecimens, the human subject is the person from whom the specimen was taken."

https://www.hhs.gov/ohrp/sites/default/files/OHRP-HHS-Learni...

Reading that, it's clear that the Princeton study does not fit the definition of human subjects research. The complainants may be able to sue for damages to the university, but not because the study was improperly classified as human subjects.


The bit you've quoted is intended to clarify that "about whom" means the subject is the patient, even if the researcher gets the information indirectly through the patient's doctor. Earlier in the document you linked, it states:

> If for the purpose of a research study [...] An investigator [...] interacts with a living individual, [...] Then The research likely involves human subjects.

What's up for debate here is whether this research qualifies for one of the specific exemptions in the regulation. The general definition in the regulation is broad enough to include all interaction with living humans that produces information used for the study, and is only narrowed by later enumerated exemptions.


Not at all,

1) this is clearly not an exempt study, which is a category of its own that the IRB reviews and makes a judgment on. The authors would immediately have been able to point out the protocol number of the exempt study if it were exempt. Rather it's not considered human subjects as the authors clearly state on their FAQ.

2) it seems like you're thrown off by the example, because if you ended your sentence as "The bit you've quoted is intended to clarify that "about whom" means the subject is the patient" then we would be in agreement, and it'd be more obvious that the subject is, in fact, the website's policies/procedures. Here's an excerpt from the written text of the common rule,

"“About whom” – a human subject research project requires the data received from the living individual to be about the person."

https://hso.research.uiowa.edu/defining-human-subjects


> this is clearly not an exempt study, which is a category of its own that the IRB reviews and makes a judgment on. The authors would immediately have been able to point out the protocol number of the exempt study if it were exempt. Rather it's not considered human subjects as the authors clearly state on their FAQ.

Please don't use such circular logic. We're debating whether the research properly qualifies as human subject research; we're not debating about what the IRB actually decided on that question, because they may have gotten it wrong.

> then we would be in agreement, and it'd be more obvious that the subject is, in fact, the website's policies/procedures.

The policy itself is certainly the intended subject of the research. But the methods they've chosen mean they are also collecting and analyzing information about the responses of real live humans to their interactions and interventions, and that qualifies this as human subject research irrespective of the naive intentions of the researchers. Having a non-human subject does not preclude also having a human subject.


> Please don't use such circular logic. We're debating whether the research properly qualifies as human subject research; we're not debating about what the IRB actually decided on that question, because they may have gotten it wrong.

Yes that's what we're debating. But you used the word "exemption" which has a specific technical meaning in human subjects research, and I'm saying that it's not an exemption. There are 8 tests for exemption, and I'm pointing out that this is not an IRB exemption.

> The policy itself is certainly the intended subject of the research. But the methods they've chosen mean they are also collecting and analyzing information about the responses of real live humans to their interactions and interventions, and that qualifies this as human subject research irrespective of the naive intentions of the researchers. Having a non-human subject does not preclude also having a human subject.

Do you have a source for this interpretation? It sounds like this is your interpretation, but not the federal one. Following your interpretation, surveys of companies (e.g. emailing contact@company.com to ask how many employees they have) would fall under the definition of human subjects.

Thanks for the continued conversation, but I think this is my last comment. Nothing personal, but this is a bit exhausting. It seems like you're debating two other people on this forum about this exact definition, and you might consider that maybe you're just wrong about your interpretation?

Here's one final source, if it helps provide closure:

To meet the definition of human subjects, you must ask “about whom” questions. Questions about your respondents' attitudes, opinions, preferences, behavior, experiences, or characteristics, are all considered “about whom” questions. Questions about an organization, a policy, or a process are “about what” questions.

https://campusirb.duke.edu/resources/guides/defining-researc...


> Do you have a source for this interpretation? It sounds like this is your interpretation, but not the federal one. Following your interpretation, surveys of companies (e.g. emailing contact@company.com to ask how many employees they have) would fall under the definition of human subjects.

Sure. Click through the NIH's Decision Tool [1], and you'll find that collecting information only through surveys or interviews leads to the tool saying "Your study is most likely considered exempt from the human subject's regulations, category 2 (Exemption 2)." That particular exemption requires that the research qualify under at least one of three further criteria. (I'll also note that for someone who complained about people not referring to primary sources, you seem to be citing more .edu sources than .gov sources.)

Furthermore, this particular research unquestionably went beyond mere surveys and interviews. Legal threats under false pretenses are way outside those bounds. So even if a mere survey about how many employees a company has doesn't qualify as human research (which I'm willing to concede), that doesn't help settle the question about this research.

[1] https://grants.nih.gov/policy/humansubjects/hs-decision.htm


Information about the people contacted was collected and analyzed, it's a study fundamentally about how they react to this email, it is not (just) about a third party. In the case of websites run by a single individual (such as OP) there is no third party at all, but in all cases information about the first party was being collected and analyzed.

To be a bit more pithy, here is one example of such an analysis (admittedly, I'm not sure comments on twitter count): https://twitter.com/RossTeixeira/status/1471249559879929861


Splitting hairs like this may be a useful endeavor in a court of law (or at least ethics committee), but it sidetracks the "real" question: should a university fund research that essentially sends phishing emails en masse and entraps people into admitting they breached the law, incurring legal costs, and/or causing panic among the recipients?


IRBs emerged as mandated bodies governing the intersection of research & humans in the wake of the 1974 National Research Act and the subsequent Belmont Report. That report explicitly states that not just biomedical but also behavioral research is covered. "Do something to people to see how they act" fits pretty well within the domain of "behavior".

If you are doing research with an institution governed by an IRB then you cannot do anything involving a human being without it getting reviewed by the IRB. There are criteria whereby the IRB may exempt the research from a full review, but only the IRB can make that determination.

Perhaps the #1 absolute goal of the IRB is to assess potential adverse impacts of the research on any humans involved. It should be clear from the comments here, and if you read through any of the links to twitter threads, that people were adversely impacted by this study through anxiety, time spent unnecessarily, and perhaps money. I might understand (though disagree with) a point of view that said these adverse results were not foreseeable, but review of the study itself under an IRB for its involvement of humans was absolutely required.


> When you do something to people to see how they act, it's a human experiment.

All AB testing is a human experiment?


I'd be willing to argue that AB testing is human experimentation.
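
For what it's worth, here's a minimal sketch of the mechanism (hypothetical names, not any particular product's code): every visitor is silently assigned to a condition and their behavior is recorded for later analysis, with no consent step anywhere.

  import hashlib

  def assign_variant(user_id: str, experiment: str = "landing_page_v2") -> str:
      # Deterministic bucketing: the same visitor always lands in the same arm.
      digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
      return "treatment" if int(digest, 16) % 2 == 0 else "control"

  def record_outcome(user_id: str, converted: bool) -> None:
      # In a real deployment this would be written to an analytics store; the
      # visitor is never asked whether they want to take part.
      print(user_id, assign_variant(user_id), converted)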


It involves human subjects, it is probably not "research" in the meaning of IRB rules.


this might be the definition of human experiment in everyday speech but it is not for the purposes of an IRB -- they would have to be collecting PII from the subjects


Collecting PII is a sufficient condition to qualify it as human subject research, but it is not a necessary condition.


> Were they not merely collecting information on how a site handles these requests?

No. Obviously not. How could you have missed the fact that they were "collecting information" under false pretenses?


With a false legal threat, no less


Any research that at all involves a human being-- performed by an institution governed by laws that mandate the existence of an IRB-- must be reviewed by their IRB. Keep in mind that IRB's govern research, not just experimentation.

We don't even need to get into the weeds on this just being "answers to questions". All you need to do is look at the human impact: this research study has cost many people anxiety, time, and potentially money, people who were unknowingly made to participate in the study. IRBs exist to evaluate-- among other things-- potential adverse impacts on people involved in a study. This has had an adverse impact and should not have been allowed through IRB review in this form.


One of the requests landed on my desk to figure out how to process. I'm a human. Others have received them on personal domains where the "organization" is just them.


> An email asking an organization for answers to questions is human experimentation?

Organizations are made up of humans.

Human experimentation is very broadly defined, for IRB purposes. As I understand it, if you're going to be asking humans to interact with researchers, in any way, and gathering data based on those interactions, that's human subjects research, and requires reasonable scrutiny from the IRB.

Source: I have worked fairly extensively with my own university's IRB, as I put together and maintain the website that they use to handle submissions.


It could be your university's definition. Princeton's definition doesn't include information about organizations.[1] It would include information about solo projects arguably.

[1] https://researchdata.princeton.edu/research-lifecycle-guide/...


This seems pretty cut-and-dried to me.

> A human subject means a living individual about whom an investigator (whether professional or student) who is conducting research:

> (i) Obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens

The information they were looking for is 'how does this person respond when they receive an email threatening legal action under CCPA?'

A human subject means ... a living individual ... about whom an investigator... obtains information... through intervention or interaction... and studies... the information.

That they mistakenly thought they were investigating how "an organization" responds when they receive an email is based on an erroneous assumption that an organization is itself an autonomous, sentient entity whose behavior can be studied independently of its human constituents.


I think individuals in a research group in a department in a university in a city in a state in a republic know organizations contain individuals.

Your interpretation would make most of the definition redundant. They could just say any individual an investigator interacts with. IRBs don't interpret it like that.


> They could just say any individual an investigator interacts with.

Not quite, because the definition needs to also encompass individuals that researchers acquire information about without directly interacting—eg. getting patient data from a healthcare provider.


How would a website “respond” without human interaction? If all they were testing was whether or not a website had an automated response to emails with ccpa in the subject or body they could have included a disclaimer at the end of the email about what they were doing.

This was clearly seeking a human response and is a human experiment.


That's not the definition of human subjects research. Not everything that involves a human responding to questions is human subjects research. A lot of comments in this thread are uninformed about the relevant definitions.

You can think the study is poorly designed, unethical, etc. But if they're not obtaining data about a living individual, then it's not human subjects research.


You appear to be the one uninformed about the definition of human subject research.

>But if they're not obtaining data about a living individual, then it's not human subjects research.

Sure, if you completely ignore the other half of the definition, per the NIH:

https://grants.nih.gov/policy/humansubjects/research.htm

> Obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens;

They are absolutely collecting information to analyze by interacting with the individual.


To me yes, for the same reason pulling fire alarms at various company office buildings to see how they respond would be considered human experimentation.


I got one of these emails too. I'm going to write up something on my blog and email princeton to complain.



That was a great post and I feel similarly. I try to do the right thing because I care about this stuff. If I didn’t care, I would’ve deleted the email and moved on.


I received one of these emails but, interestingly, whilst the email was about my personal blog, the email was sent to my work address, which is not listed on my blog. Implying they did a bit of manual work to figure out how to reach me? If so, I wonder whether whatever they did is itself covered by CCPA or GDPR? Idly considering whether I should send them a request that is nearly identical to the time-wasting, deceptive email they sent me?


I run a SaaS product solo, and I receive these messages and other similar ones (GDPR requests) every week. Most are generic like this one, but some are outright offensive name-calling.

I have no problem following these laws, and I have extended the data/privacy rights to everyone (not just CA or EU residents). I must say though, when people threaten me or resort to name-calling, I make the process deliberately difficult.

edit

I just scanned my email and I got the same email, with CCPA swapped out for GDPR via a "Tom Harris".


Here's a copy:

  >
  >
  > To Whom It May Concern:
  >
  > My name is Tom Harris, and I am a resident of Sacramento, California. I have a few questions about your process for responding to General Data Protection Regulation (GDPR) data access requests:
  >
  >     Would you process a GDPR data access request from me even though I am not a resident of the European Union?
  >     Do you process GDPR data access requests via email, a website, or telephone? If via a website, what is the URL I should go to?
  >     What personal information do I have to submit for you to verify and process a GDPR data access request?
  >     What information do you provide in response to a GDPR data access request?
  >
  > To be clear, I am not submitting a data access request at this time. My questions are about your process for when I do submit a request.
  >
  > Thank you in advance for your answers to these questions. If there is a better contact for processing GDPR requests regarding nymeria.io, I kindly ask that you forward my request to them.
  >
  > I look forward to your reply without undue delay and at most within one month of this email, as required by Article 12 of GDPR.
  >
  > Sincerely,
  >
  > Tom Harris


On the flip side they sent me an email that wasn't lying, so they could have done this for everyone:

priv...@princetonprivacystudy.org Tue, Dec 14, 12:25 AM (5 days ago) to me

To Whom It May Concern,

We are researchers at Princeton University conducting a study of how websites are implementing the EU and UK General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). We are reaching out to you because this email address is provided as a contact on the website seifried.org.

Your website may be required to implement one or both of GDPR and CCPA, and we would appreciate if you would answer a few brief questions about your privacy practices.

1) Does seifried.org implement GDPR or CCPA? If not, could you please explain why? If you are uncertain about whether seifried.org is required to implement these laws or answer questions like ours, we have included informative resources at the end of this email.

2) If you implement GDPR or CCPA, do you process data access requests from individuals who are not residents of the EU or UK (for GDPR) or who are not residents of California (for CCPA)?

3) If you implement GDPR or CCPA, do you process data access requests via email, a website, or telephone? If via a website, what is the URL?

4) If you implement GDPR or CCPA, what personal information must a user submit for you to verify and process a data access request?

5) If you implement GDPR or CCPA, what personal information do you provide in response to a data access request?

Thank you in advance for your answers to these questions. If there is a better contact for questions about privacy practices on seifried.org, I kindly ask that you forward my request to them.

Sincerely, Ross Teixeira


If this email had actually been sent by a shakedown attorney, there would be absolutely no recourse. But the sender is part of an institution that admits outside criticism, and so there is an angle to strike back. Regardless of the research ethics here, it seems most of the outrage exists precisely because this perp has a chance of being held accountable.

The real overarching problem is that anyone can be subject to life-altering lawsuits through the legal system, which will cost unreasonable amounts of time, money, and personal sanity. There's no hope for justice for small-time actors, since the damage has been done before ever seeing an informed decision by a judge. And even if you can invest enough to pay for a real judgement, the chance of getting awarded legal fees is slim, never mind awards for lost time and emotional distress!

I don't think the problem is specifically the CCPA, since it does exempt small-time actors. Rather, the setup for the problem is a heavyweight legal system, draconian laws that do apply to individual actors, and the massive fan-out caused by being globally connected. Anybody is basically one doxxing away from a barrage of legal demands, and our poor technical architectures all but guarantee that anybody who puts themselves out there even slightly can be doxxed.

I don't know that there is a straightforward fix. The dynamics have been explored to death, and we seem to be trapped in this state because the real beneficiaries of the legal system (well-funded corporations and other professional actors) are happy to keep it this way. But this situation is a lightning rod that illustrates how outrage over it is growing. If it continues to not be reformed, eventually something is going to break catastrophically.


I got one of the "Tom Harris" messages, sent to an e-mail address scraped from the Web site of a small California-based nonprofit organization.

I've gotten no response to my e-mail messages sent to the 3 researchers who have identified themselves, or the Princeton IRB. But I was able to find phone numbers for all 3 of the researchers in their c.v.'s posted online. The Principal Investigator has yet to return my calls, but the grad student took my call. He referred most of my questions to the principal investigator, but he did tell me that they sent e-mail to between 200K and 300K addresses scraped from a list of the "top" 1M Web sites.


Here is a chart of (some of) the conversation so far on Mastodon:

https://www.solipsys.co.uk/Chartodon/107464941539596242.svg

Language warning.


Heh, most of the worst language is mine. My hands are still shaking from the adrenalin rush of panic and fury this inspired in me.


I got one of these emails. Not being in the US, my usual reaction is "sue me", especially when someone emails me for a data access request on a static website. I can imagine how stressful it must be if you're actually in the US and there's a probability that you could actually be sued, though.

For shame, Princeton.


> reply without undue delay and at most within 45 days of this email, as required by Section 1798.130 of the California Civil Code

So I looked this up and it really exists.

I wonder if an account holder of a Facebook or Google account can use this law to get actual customer service.


There are three things that you can demand of a company (that meets certain revenue thresholds) if you’re a California resident:

- that they delete information about you that they have (with potential exceptions)

- that they provide you with what information they have about you, and for what purposes they have that information (with exceptions)

- that they opt you out of sharing data with other entities (with exceptions)

You cannot, as implied by the email, demand a response to an arbitrary query. There are essentially three queries the companies are required to answer within 45 days (90, if they opt for the 45-day extension). So it can certainly be helpful in some situations, but it will not serve as a general-purpose CS tool.


> You cannot, as implied by the email, demand a response to an arbitrary query.

I wouldn't say that the questions were arbitrary; they were exactly the things you would need to know in order to submit a request for information, but without the actual request.

The only alternative that I can think of to get the same information is to register at all of these websites, use them for five minutes, then make an actual legal request, and if not provided with "information" and "purposes", to make an actual legal threat.

I don't get the impression that site owners would feel a lot happier about that approach. I can see how sending the email that was actually sent would be seen by a researcher as a better approach. I can also see a self-serving aspect, in that it's a cheaper approach - it saves the labor of registering a bunch of accounts.

But I'm getting the impression that site owners went to DEFCON 1 after getting a single request for information that should be easily available on the site if it were subject to the law (which the blog author has clearly stated it is not).

If anything was missing imo, it's that there should have been help in the email mentioning the for-profit/$25MM revenue/50K Californians requirement in the law - but that might make it sound more like a threat, not less. They could also have made better guesses about whether the sites they were emailing would be bound by the law, and targeted the emails better.

But if the site does fall under the law, and they felt threatened and hired a lawyer to answer those questions, I'm not sympathetic. They're supposed to be able to answer those questions if any of the >50K Californians they work with ask, at any time. If they were, replying would be a simple matter of sending a link or a form email that they already had ready.


> they were exactly the things you would need to know in order to submit a request for information, but without the actual request.

This doesn’t seem accurate to me. The first and fourth questions especially aren’t relevant to submitting a valid CCPA request.

But my point was that, more generally if your goal is to use CCPA to compel a company to answer your questions, I think you’re going to be disappointed.

The law simply doesn’t compel companies to answer arbitrary questions. Heck, I don’t think CCPA even compels them to answer any of these questions.

Only questions 2 and 3 are relevant to submitting a request, and CCPA requires the company to publish that information, but I don’t think it compels them to answer emailed questions asking for that information. Open to being wrong on this point though.


"They could also have made better guesses about whether the sites they were emailing would be bound by the law, and targeted the emails better."

They made no guesses - they randomly selected sites from rankings of top websites (specifically, from a common academic time-smoothed aggregation of Alexa and some of its competitors).

From the study website (https://privacystudy.cs.princeton.edu/): "The set of websites for this study is sampled from the Tranco list of popular websites and publicly available datasets of third-party tracking websites."


Almost certainly. The law seems pretty clear that they'd have to completely delete all information about you if you ask them to.


I guarantee you that they don't even know what all the information they have about you is. Even if they wanted to, they couldn't actually delete all of it, because their systems have so many levels of redundancy to prevent data loss that even intentional deletion is impracticable, at least if you want it to be comprehensive.


Having worked at these companies and others like them, I can say there's a lot to unpack in this law. The short version is, they have to delete data "about you", not data that was generated by you. Business records, such as receipts, are allowed to be kept. They just delete your name and phone number, your photos and videos, and documents that were visible to you, but all of the actually useful tracking information is encoded into "anonymized rows". It's bullshit and everyone knows it except the legislators writing the laws.

That said, there's also a clever and effective workaround to all that redundancy: they store all your documents and photos encrypted with a key that's unique per user. When you request deletion of your data, they delete the key. It's much easier for them to completely and quickly purge a single key and clean up the files later. It's not a cop-out, because there's no effective way to get at your data once the key has been deleted. Those systems still have backups, but with much shorter lifetimes and explicit audit logs.
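
For anyone curious what that looks like in practice, here is a minimal sketch of the per-user-key idea (often called crypto-shredding). The function names and the in-memory key store are made up for illustration; this is not any particular company's implementation:

  # Illustrative crypto-shredding sketch: each user's data is encrypted under
  # a per-user key, and a deletion request is honored by destroying the key.
  # Names and the in-memory stores are hypothetical.
  from cryptography.fernet import Fernet

  key_store = {}    # user_id -> encryption key (in practice, a KMS or HSM)
  blob_store = {}   # user_id -> list of encrypted blobs (replicated, backed up)

  def store_document(user_id: str, document: bytes) -> None:
      """Encrypt a document under the user's key before persisting it."""
      key = key_store.setdefault(user_id, Fernet.generate_key())
      blob_store.setdefault(user_id, []).append(Fernet(key).encrypt(document))

  def read_documents(user_id: str) -> list[bytes]:
      """Decrypting is only possible while the user's key still exists."""
      f = Fernet(key_store[user_id])
      return [f.decrypt(blob) for blob in blob_store.get(user_id, [])]

  def crypto_shred(user_id: str) -> None:
      """Destroy the key: ciphertext scattered across replicas and backups
      becomes unreadable immediately, and the blobs themselves can be
      garbage-collected later at leisure."""
      key_store.pop(user_id, None)

  store_document("alice", b"profile photo bytes")
  crypto_shred("alice")
  # read_documents("alice") now fails with a KeyError: the key is gone.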


> except the legislators writing the laws

Quite likely they do (or their staffers who do the actual work do).

This is an example of regulatory capture.


GDPR and CCPA compliance has been a big thing in the enterprise space recently. I wouldn't be surprised if they missed a small amount, and I'd assume they don't properly zero out the bytes on the actual hard disk (as opposed to deleting the metadata used to find them; SSDs in particular can make truly zeroing data hard), but I bet they do a reasonably good job.


Curious that they don’t know what data they have, no?

Collectors of data should know what they have and who they have shared it with. Think of holding data as risky, as if the data is toxic. Think of the monetary and reputational risk of a data breach.

Incidentally, data must be collected for a specific business purpose, and it can even be kept if there are other requirements such as AML or KYC.

Just slapping a zero over the data is not a viable solution.


A lot of discussion on here about whether or not it constitutes human subjects research. It seems to me to be missing the point.

Research can still be unethical even if it doesn't fit the definition of human subjects research.


(cross posting from my comment on the other story)

It isn't human subjects research just to read what a website says about its compliance to a law (or lack thereof), but once you start reaching out to actual humans to discuss the matter, it definitely does become human subjects research (we would categorize this as an interview or survey).

At the very least, the emails should have come with a notice that they were being sent as part of an academic study - after all, misleading your participants wasn't a goal of this study!


Anyone else may want to preemptively opt-out from future studies. Info from https://privacystudy.cs.princeton.edu/

"Please contact the lead researcher for this study, Ross Teixeira (rapt@princeton.edu), if you have any questions, believe you received an email in error, or would like to opt out of any future communication related to the study."


I forwarded an email like that to legal counsel, who is pulling in a privacy team.

Lots of hours wasted.


I'm not sure what to think of this.

I remember at least one article for which the journalist sent fake job applications to different companies, using European and Arabic names on the same CV, to uncover bias in the application process.

Is this the same or different? It also wasted the time of the people reviewing the applications.

Is the implicit legal threat what makes the difference between ok and not ok?


Without getting into the question of whether this study involved human subject research, I find a lot of the anxiety and paranoia unwarranted.

Companies to which these laws apply should already have a process in place to deal with subject access requests. Complying with relevant laws is just part of doing business.

All the other site owners could have figured out with a bit of googling that the laws don't apply to them—there is plenty of guidance available.

I can only speak for the GDPR here, but had the requests been real and valid, the worst outcome would have been a regulator telling you to comply with it. Data protection authorities are more interested in helping companies get into compliance than punishing small businesses for minor infractions. If you look at past decisions, it usually takes serious and/or systematic violations to get fined.


> I can only speak for the GDPR here, but had the requests been real and valid, the worst outcome would have been a regulator telling you to comply with it. Data protection authorities are more interested in helping companies get into compliance than punishing small businesses for minor infractions. If you look at past decisions, it usually takes serious and/or systematic violations to get fined.

The US legal system is quite different; litigants are responsible for their own legal fees[0] and frivolous lawsuits[1] are common.

[0] https://en.wikipedia.org/wiki/American_rule_(attorney's_fees)

[1] https://www.azag.gov/press-release/serial-litigant-permanent...


> I had a minor panic attack, literally, upon receipt, as I thought I was about to be sued

Oh please! A tad bit dramatic, no? A single email from someone claiming to be in France is made out to be equivalent to an official letter from a US lawyer. They even said it was not a request for data so you can ignore it.

Was this a mistake? Yes.

Was it a big mistake? Hell no.

Let's not lose perspective and keep things real.


This calls for extra popcorn supplies...


Not only did I receive that email too, but I received it from three different addresses. I marked it as spam in all cases, and I even responded to the first one saying that it has nothing to do with me or my business.


Hm, they use https://tranco-list.eu/ for the top-websites list and not Alexa.

Whatever happened to Alexa ratings?



From Tranco's "Methodology" page:

"We designed the standard configuration of the Tranco list to improve agreement on the popularity of domains and stability over time, using the rankings from the four studied providers as our source data."

(Alexa is one of their data sources.)


> I’m not confident that I could remain civil when talking to the person who inflicted this on me without my permission or knowledge.

I'm not sure how long anyone can last in business if this is their response to receiving a kind-of-threat of a lawsuit. I get that it may be a hassle, but businesses get these kinds of garbage threats frequently, and they can often be ignored unless and until the sender actually files something. If the threat is BS then it's certainly wrong for them to waste your time, but being so mad you can't even speak to them seems a bit much.


I’m not handling this in a business context. They emailed my personal address with a threat about my personal zero-revenue server I maintain for the fun of it.


The author states this after knowing the true identity of who sent the bogus email. At that point they know it's not legitimate.


I invoiced them for the time they made me waste...


Some jurisdictions consider IP addresses to be personal identifying information, and so if you run a web site that logs the IP addresses of visitors you should generally try to be aware of the privacy laws in any jurisdiction that might think its laws apply to you.

These fall into three groups.

First, there are those jurisdictions in which you and/or your site are actually located. You almost always have to care about the laws in these jurisdictions.

Second, there are jurisdictions where the people who visit your site live. For these there are two questions. #1: Does the jurisdiction think its law applies to you? #2: Does your jurisdiction, or someone else that you have to obey, agree?

The answer to #1 often depends on what your relationship is with the visitor. If you are selling (or trying to sell) them something, it is more likely that the jurisdiction will think its law applies. If your website is not in any way targeted at them or encouraging them, it is more likely that the jurisdiction won't think its law applies. (But some, like the EU with the GDPR, do think it applies if you are tracking the behavior of EU users, regardless of whether or not you are selling anything or trying to attract EU visitors.)

#2 is murkier. Say I've got a site in X specifically selling to people in Y. Y brings a civil case against me in Y. I ignore it, thinking they can't touch me here in X, and Y gets a monetary judgement against me. I may be in for a surprise, because X may consider my sales to people in Y as taking place in Y, and so agree that Y has jurisdiction. If Y then brings the judgement to an X court to enforce, there is a decent chance the X courts will enforce it. Oops.

Another thing you need to consider when thinking about #2 is entities that both you and the jurisdiction deal with. If you use a service provider (credit card processor, cloud service, hosting provider, etc) that operates in that jurisdiction, you may face pressure via that provider to obey the jurisdiction's law.

Finally, there are jurisdictions that you are not in, you aren't selling to or doing anything to attract visitors from, you are sure your jurisdiction won't cooperate with them on enforcing their laws, you don't use any services that operate there, and you aren't even going to visit there, so even if you thoroughly annoy them, it's no big deal.

You can probably mostly ignore these jurisdictions as far as privacy laws go.

If you are in the US I'd say that this currently means that you should be aware of GDPR and CCPA, and have some idea of how you will respond to requests under them. For a lot of sites (like the OP's) a short form letter explaining that you are not covered should be fine.

As more states in the US pass their own CCPA-like laws, or we get Federal action on privacy, I'd expect those will generate large threads here. Keep an eye out for them and update your form letters appropriately.


I wholly support the CCPA and GDPR. They have their issues, but they’re big steps in the right direction.

In the case of the CCPA, nothing I do is subject to it, as it applies only to businesses, and only to those 1) making at least $25M in revenue, 2) handling the information of at least 50,000 Californians, or 3) making at least half their annual revenue from selling Californians' personal information.

I’m running a free website as a hobby, have $0 in revenue, have way fewer than 50,000 users total, and again have $0 in revenue.
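
Since the "or" in those thresholds is easy to misread, here is a rough sketch of the applicability check as the thresholds are described above. The function and parameter names are made up, and this is an illustration, not legal advice:

  def ccpa_likely_applies(is_for_profit: bool,
                          annual_revenue_usd: float,
                          ca_consumers_count: int,
                          revenue_share_from_selling_ca_data: float) -> bool:
      """Illustration of the thresholds described above: a for-profit business
      meeting ANY one of the three prongs is covered. Not legal advice."""
      if not is_for_profit:
          return False
      return (annual_revenue_usd >= 25_000_000
              or ca_consumers_count >= 50_000
              or revenue_share_from_selling_ca_data >= 0.5)

  # A zero-revenue hobby site with a few thousand users:
  print(ccpa_likely_applies(False, 0, 3_000, 0.0))  # False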


Right. My point is that due to the way some of these laws are written even a small hobby website might do things that the law regulates. It would not at all be hard for a small hobby website with low traffic to entirely innocently run afoul of GDPR while collecting data to try to understand how to make the site more useful to their visitors.

Thus, to avoid unpleasant surprises like the one you had, people with websites should add "check to see if my site has any obligations under privacy laws" to the list of routine things they do to maintain the site, and then plan accordingly.

If that check reveals that they aren't doing anything that places any obligation upon them, they can write a canned response to send back to anyone who asks.


"My point is that due to the way some of these laws are written even a small hobby website might do things that the law regulates."

And kstrauser's point is that, due to the way this law is written, by definition nothing their small, non-profit site does can ever be regulated by this law.


My point is that because some privacy laws are written so that they affect such sites (GDPR for example), it is a good idea for sites to try to be aware of new privacy laws so they can check if those laws are that kind of law.

If they are not, then when someone makes a request under that new law it is just a matter of responding with a pre-canned response explaining that the law does not apply, instead of a panic- and/or stress-inducing incident.

And someone will eventually make such a request. Users don't check details. They just know that their jurisdiction has a privacy law and under it they can request things. They don't check first to verify that the site meets the law's thresholds for applicability or is under the law's jurisdiction.

If the new law does apply to your site, same idea. You want to find that out and figure out how to deal with it before you get a request instead of your first request being a scramble to comply.

Note that I'm not saying that kstrauser mishandled anything. I think that most of us overlooked that small non-profit hobbyist sites needed to keep an eye on the privacy law landscape. Kstrauser happened to be the unlucky person whom fate chose to use as an example of how annoying it can be to have to figure out on short notice where your site stands as far as a given law goes, even if the answer turns out to be "that law doesn't apply to my site".


You're acting like this is an impersonal, natural event. It isn't.

Fate didn't choose kstrauser. An irresponsible academic sent an email blast to people who were not relevant to his study, without making even token efforts to filter things down.


[flagged]


> had the brazen nerve to post this lie

That's excessive, crosses into personal attack, and breaks the HN guidelines (https://news.ycombinator.com/newsguidelines.html). Please make your substantive points without stooping to that.

This is not a site for stirring up internet mobs. We're trying to avoid the online callout/shaming culture here.

https://hn.algolia.com/?sort=byDate&type=comment&dateRange=a...


That's fair, you're right. I should have been less inflammatory. This story struck a chord in me for... reasons. That's not meant as an excuse; I should have known better and taken a step back before I said anything, especially something that could escalate tensions. Thank you for killing the comment. I'll be more mindful of it going forward.


Appreciated!


> Responses to the study have been overwhelmingly positive

7 days ago my response to this study included telling them that the "vague legal threat at the end of your mail is immoral and gross". I guess he considered that a positive.


Sure, after all he is studying public policy on technology. Maybe from his perspective it is very interesting and useful for his research to see this response. Scaring the crap out of people over compliance with technology-related laws would seem to have some bearing on understanding the role of public policy. Of course it crosses into human research, so it's unethical since he did not have approval for that.


Would anyone be interested in organizing a class action lawsuit against Ross Teixeira and Princeton?


Is the point to get money or stop this from happening again?

If it’s the latter, I wonder if it would be more effective to bring this incident (and the apparent ineffectiveness of Princeton’s IRB) to the attention of the NIH. I would think the prospect of putting all that grant money in jeopardy would cause people in high places to take notice.


I think the plaintiffs would be people who spent money reacting to the emails, and I suspect Princeton may be quick to take care of those expenses to avoid further action.

I personally didn’t incur any monetary costs, just a lot of unnecessary stress.


Intentional infliction of emotional distress is a cause of action for a civil suit. You don't have to have lost money.


I think this is pretty awful, but I'd still give the person the benefit of the doubt that it wasn't intentional.


How could you know it's a lie?


Because he said the response was overwhelmingly positive at the same time that he was dealing with anxious and irate recipients of his messages.


It's possible that he's only seeing positive responses, because entities who had a negative reaction are either lying low or hiring lawyers.


That tweet was sent out around the time the project website was edited to indicate that the study had ended immediately (instead of continuing into the spring) and to add a FAQ that tries to dispel concerns about IRB approval and email address harvesting.

That makes it seem unlikely he was unaware of the negative responses to his study.


I'd be willing to believe that, say, 90% of the responses were polite, positive customer-service language, and the other 10% were anxious and irate.


There is such a thing as a vocal minority.

In fact it is a surprisingly common phenomenon.


That is true, although I think it would still be a poor choice of words to say "overwhelmingly" without tempering it with a note about assuaging the fears of the few who misunderstood the nature of the communication, assuming he was aware that some were stressed and involving legal counsel.

The "secret shopper" justification for not informing participants ahead of time about the study can only take him so far, and I don't think it was necessary here to begin with. His research is to determine the policies in place at target recipients' organizations, and that doesn't require secrecy. In fact that justification undermines the exemption status of the study: he expected that people may react differently if they thought it was a user vs. a research study.


That sounds.. plausible?


[flagged]


No, it wouldn't.

One is a polite but firm email reminding a website administrator of their obligations under the law, and the other is a thinly-veiled threat of bodily harm.


> One is a polite but firm email reminding a website administrator of their obligations under the law

It is not, because OP does not have the obligations that the email claimed they have.


[flagged]


Dude. Annoyed as I am with the researcher, there's no need to bring that kind of nonsense into the conversation.


I don't understand the outrage. Even knowing the context, the email is polite, to the point, tells the website administrator exactly what they need to do (something they are already legally required to do), and gives them ample time to do so.


I’m not legally required to reply in any way, although the email strongly implied that I am. This looked a lot like the kind of emails you’d get from someone gathering information before they decide whether to file a lawsuit against you.


I don't really see the issue with this. Nothing stops someone from mass sending fake CCPA/GDPR inquiry emails. It seems like a perfectly reasonable thing to ask.

Is asking people their favorite type of ice cream unethical because a very small percentage are going to have a negative mental/emotional reaction?


What's the cost of running it past the usual people so they can give an opinion? Why do it in the shadows?


Who are the usual people? The researchers said they submitted it to Princeton's IRB.


This regulation explicitly tasks everyone with responsibilities. The human consequences of those tasks are something we each simply have to be aware of as citizens. I like this study because it renders the obligation explicit, and it raises awareness of the society we are building. It should happen with all regulation.


It does not task everyone with responsibilities, and a key flaw of this study is that it lied about a responsibility existing to people who, in fact, did not have any, as the article explains.


So, it seems CCPA requests (and possibly GDPR etc.) make companies/people shit bricks, because they don't have a process in place for dealing with them. Maybe this legislation is beginning to accomplish its goals.


There's a point here which seems to have been missed. From the study's FAQ:

> The set of websites for this study is sampled from the Tranco list of popular websites and publicly available datasets of third-party tracking websites.

If that's true, then I have a lot more sympathy for the researchers: it seems they were only targeting particularly popular sites, and sites which use tracking technology. Those sites really should have developed processes for responding to GDPR and CCPA requests, rather than just banging in some Google Analytics code without really thinking about it.


That's the thing: people are reporting that it's incorrect. A commenter on HN got one sent to a personal domain they use only for email: https://news.ycombinator.com/item?id=29600542


I looked up my own site — the one that got this whole mess started — and I hover around number 350,000. I was utterly shocked given that I have a few thousand users, and many fewer active users.

I don’t use GA, or any other third-party trackers, on any of my sites. Given that Tranco doesn’t have access to my web logs or the little Matomo setup that I self-host, I’m not sure how they claim to be analyzing my traffic in the first place.


Thanks for that -- seems like you're a pretty good counterexample to my hypothesis. Totally agree that sending these emails to site #350,000 on the Tranco list isn't justifiable.

> I’m not sure how they claim to be analyzing my traffic in the first place

They wouldn't need to analyze your traffic to find out if you were using tracking technologies, they would just need to visit your site and examine what they were served. Companies like BuiltWith and Wappalyzer offer this kind of technology survey as a service.
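
For a sense of how that kind of outside-in survey works, here is a minimal sketch that fetches a page and checks script tags against a tiny, hand-picked sample of tracker domains. The domain list is illustrative only; services like BuiltWith and Wappalyzer use far richer fingerprints:

  # Minimal outside-in tracker check: fetch the page a site serves and look
  # for script tags pointing at a few well-known tracking domains.
  import re
  import urllib.request

  KNOWN_TRACKER_DOMAINS = {
      "www.google-analytics.com",
      "www.googletagmanager.com",
      "connect.facebook.net",
  }

  def detect_trackers(url: str) -> set[str]:
      html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "replace")
      found = set()
      for src in re.findall(r'<script[^>]+src=["\']([^"\']+)', html, re.IGNORECASE):
          found.update(d for d in KNOWN_TRACKER_DOMAINS if d in src)
      return found

  # e.g. detect_trackers("https://example.com") -> set of matched tracker domains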


I mean, I'm not sure how Tranco would get those traffic numbers, since I'm not sharing traffic data with anyone. All of my analytics are within my own system.


Tranco is a merge of the Alexa, Cisco Umbrella, and Majestic lists. Alexa data is gathered from a browser extension [1], and Cisco Umbrella is based on passive DNS [2]. I'm not totally sure about Majestic, but it looks like it might be crawling of some kind, then counting links.

[1] https://kinsta.com/blog/alexa-rank/#how-is-alexa-rank-calcul...

[2] https://umbrella-static.s3-us-west-1.amazonaws.com/index.htm...
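
To make "merge" a bit more concrete, here is one generic way rankings from several providers could be combined into a single list by summing reciprocal ranks. This is an illustration of the general idea only; Tranco's actual default configuration may differ:

  # Generic rank aggregation: a domain's score is the sum of 1/rank across
  # the provider lists it appears in; higher score = more agreed-upon popularity.
  def aggregate(rankings: list[list[str]]) -> list[str]:
      scores: dict[str, float] = {}
      for provider_list in rankings:
          for position, domain in enumerate(provider_list, start=1):
              scores[domain] = scores.get(domain, 0.0) + 1.0 / position
      return sorted(scores, key=scores.get, reverse=True)

  alexa_like    = ["google.com", "youtube.com", "facebook.com"]
  umbrella_like = ["netflix.com", "google.com", "youtube.com"]
  majestic_like = ["google.com", "facebook.com", "twitter.com"]
  print(aggregate([alexa_like, umbrella_like, majestic_like])[:3])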


Serious question: why should academic institutions be held to a higher standard than commercial ones? Google and Facebook, to just name two, routinely perform human subject research without informed consent as a matter of course. That's what A/B testing is. Ethically speaking the fine print in the ToS obviously isn't actually informed consent, even if the law says it is.

I don't see why being a for-profit company should somehow lower the ethical bar.


You're asking this backward: why should commercial institutions be held to a lower bar than academic ones? The response isn't to make it easier for universities to conduct research on you, but to make it harder for companies to.


I routinely launch canaries before deploying new code to production. Given there is no intent to deceive, mislead, or otherwise harm customers, and all internal testing has yielded a 'ship it' signal, is there any ethical need for users to opt in to these A/B tests?

In my view, the alternative is to ship something we _don't_ have data on and hope for the best. And if we demand informed consent, I have to assume it goes into boilerplate agreements nobody reads at signup, which is a mockery of the term 'informed' IMO.
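
For readers unfamiliar with how such rollouts are typically wired up, here is a minimal sketch of deterministic canary/A-B assignment by hashing a stable user ID. It is a generic illustration, not the setup described above:

  # Deterministic bucket assignment: hashing a stable user ID means the same
  # user always sees the same variant, without storing any assignment state.
  import hashlib

  def assign_bucket(user_id: str, canary_fraction: float = 0.05) -> str:
      digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
      position = int(digest[:8], 16) / 0xFFFFFFFF  # roughly uniform in [0, 1]
      return "canary" if position < canary_fraction else "control"

  print(assign_bucket("user-1234"))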


The ethical bar isn't "never do human testing on anybody ever".

If an A/B test is likely to cause people harm (the 2012 Facebook Depression Experiment comes to mind), then it should absolutely be held to the same IRB standard as academia.

The point of IRB is to ensure that your human testing doesn't cause anybody any material or psychological harm. "Human testing" isn't a bad word, it's just something that needs to be done with care.


Anyone can behave like an evil profit-seeking private corporation, but some organizations deliberately choose not to do that, and enjoy the resulting respect, recognition, tax benefits, and public funding that comes with it.

Also, A/B testing only happens to people who choose to interact with the corporation for a related purpose. On the other hand, when corporations step out of line, e.g. scanning your address book and sending invites to your friends pretending to be you, they are also demonized for that. Imagine if Facebook sent you anonymous legal threats; do you think they'd get a pass?


I thought something similar - it's such a routine, basic task to set up and run A/B testing to measure and optimize websites and apps for a preferred conscious or subconscious response to visual stimuli. If this is human subjects research, and human subjects research without consent is a crime against humanity, then most if not all web developers, and the whole Web industry collectively, are technically guilty of such a mouthful of an abstract crime - which, by the way, I think they are.


Researchers in the past were pushing experiments too far. Infecting people with diseases without consent, torturing people psychologically, providing no post-experiment care, etc. I do research on human subjects, and while I often complain about the paperwork, I'm glad that researchers have to carefully consider the potential harm and benefits that research participants can face.

But as far as other manipulations done by corporations? It's hard to say when something should require permission. Technically, a used car salesman is doing some kind of experiment when trying different sales tactics on people. But, I wouldn't make him fill out an IRB application. The same could be said of charities asking for donations who tug at your heart strings, or just about any other human interaction.


How is that even remotely similar? FB ran an A/B test. These guys sent bogus legal threats.


IMHO this is not the proper frame of question for this case - it's not about whether academic institutions should be held to this standard, but rather that they have publicly committed to doing so and have made it a binding requirement on their researchers, and in this case have violated this commitment.

A different aspect (not sure if it's relevant in this particular case) is that certain sources of research funding impose ethical requirements, including specific processes for human subjects research, and those would also apply to for-profit companies if they received such funding for a project. And of course, the owners/shareholders of for-profit companies are also free to impose a higher ethical bar for, e.g., A/B testing in their company, just as shareholders/owners currently impose various other ethical and social goals on their management and company policies.

But for the actual question of "why should academic institutions be held to a higher standard than commercial ones?" IMHO a good answer is "why not?" - I mean, if they do not object to having this higher ethical bar (and these institutions generally do not), that's entirely a good thing.



