
The claim I've heard is that you're essentially feeding your own knowledge for free into a proprietary system that can be used to generate cash for whatever corporation owns that system (e.g. ChatGPT from OpenAI). I think it's pointless to redact content for this purpose as well, but clearly some people have strong takes against AI training.



It's funny, since Stack Overflow has done EXACTLY this since day one (i.e. generate cash, with user knowledge provided for free).

The only difference is SO uses community, gamification & reputation facades, to convince users to participate for free.

With OpenAI it's simply a black box; no credit is given.

So I guess the lesson is people are willing to participate and share things for free, as long as they're given credit, community standing or something along those lines.


For me it is less about credit and more about access. Stack Overflow is public and freely available - I’ll give answers for the benefit of the community. ChatGPT is a product; it’s locked behind accounts and limited unless you’re paying.

They changed the deal on their end? I’ll delete my posts.


But your answers will still be available on SO, unless you remove them. Your answers were free and publicly available until you removed them. Making them also available to paying customers of ChatGPT does not change that at all.

In fact, ChatGPT will probably still be able to answer those questions, so you removing your answers actually only removes them from the public, thus forcing people to use a paid product instead.

You had one goal, and your actions achieved the opposite.

Siuan Sanche's law of unintended consequences ought to be taught in primary school. Unfortunately it isn't.


That's a completely silly reply. SO didn't lock me out of the content I created there and charge me $20/mo to re-use it.


> The claim I've heard is that you're essentially feeding your own knowledge for free into a proprietary system that can be used to generate cash for whatever corporation owns that system

Which is exactly how Stack Overflow has operated from day 1: You feed your knowledge into a system owned by someone else.

Also, it’s ridiculous to think that the answers haven’t already been scraped and cataloged every which way for AI training purposes.

The only people who suffer at this point are the people trying to use Stack Overflow. Deleting posts now is an own goal. People will see the information missing from Stack Overflow and switch to asking ChatGPT.


That was always the situation.

The difference between Stack Overflow and a predecessor such as Experts Exchange was that SO explicitly wasn't making people pay to access that knowledge.

It was to make the internet better. And it did. I've learned a lot through SO sites and if the votes are to be believed, I've made the internet better for tens of thousands of people.

I don't know what AI having access to my content does but I don't think it changes the sums. I answer things, people benefit, SO makes money somehow.


> you're essentially feeding your own knowledge for free into a proprietary system that can be used to generate cash for whatever corporation owns that system

I've been contemplating this as well. There's a big difference between that quote and this quote:

> you're essentially feeding your own knowledge for free into an open system (the web) that can be used to generate cash for whatever corporation or person best utilizes that system


This feels more akin to a company mirroring Stack Overflow and passing it off as its own, which I would object to.


But that was also always allowed under Stack Overflow’s terms. That’s why there are so many Stack Overflow clones.

It's ironic that Stack Overflow went with very open and permissive licenses, which have normally been very popular with the community, yet people are outraged when the data is actually being used openly.


Aren't people already feeding knowledge for free into Stack Overflow, which is a proprietary system used to generate cash?


At least before, the answers would benefit the whole community, fulfilling the spirit of CC licensing. Once it's fed to an LLM, it's essentially a form of laundering, as it's dubious the output of the models will also be free under CC. The "open" in OpenAI is effectively false advertising. It's a proprietary enterprise misleading people by pretending to be open.


It still does tho. I don't see why "whole community" cannot include OpenAI? I mean, I've used knowledge gleaned from SO for the benefit of many large corporations who employed me in the past; that's not new.

"The whole community" and OpenAI are not mutually exclusive, despite what you may feel about it.


Because then OpenAI won't honor the licence once content goes through the LLM information laundering machine.


Yeah, but you get points and badges for feeding it free knowledge, they get the cash, go figure. It's the perfect pre-NFT grift.


>> feeding your own knowledge for free into a proprietary system

Isn't that exactly what Stack Overflow was?


I think people feel differently about contributing to SO or Wikipedia or even Quora than they would about labelling CIFAR images, for instance. Maybe it's a distinction without a difference, but people don't usually contribute to things like Stack Overflow with the objective of training an AI model.


It is an odd stance as their efforts still help others, though a different company benefits financially.


I don't find anything odd about it.

There is no (reliable) attribution.

Their efforts can be used to create a silo of information that others are required to pay for (e.g. when SO shuts down, is inaccessible, or otherwise made non-functional).

Their effort might be used to create completely wrong or even harmful content while only using the training material to learn to convince people to believe the AI output.

Yes, this was all possible before, without LLMs, done by humans (or machine translation for example).

But not at this scale. SO and the comments there were still the authoritative source, and written by humans (without the need for any proof...)

But there is a growing cohort of people who want to use AI as a knowledge black box, search engine, encyclopedia, even as an authoritative source.

And with it comes the intent to cleanse this "knowledge" of any individual authorship or traceable source.

It is not an odd stance to oppose this, even if the concrete actions expressing this stance might be futile in this particular case.



