If machine learning is found to be fair use, the license you choose does not matter - in the same way Google Books can scan books and make them searchable without a specific license to do so.
If machine learning is not found to be fair use, and your concern is the removal of attribution, then the MIT license should be fine.
> So far what we have learned is that robots.txt doesn't work;
The companies training models that I'm aware of[0][1][2] all respect robots.txt when crawling. I can't guarantee that every crawler does, but smaller players are likely to use CommonCrawl (which also follows robots.txt[3]), so it should cover the vast majority of cases. I'd recommend it if you don't want your work used for training.
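For anyone who wants to try this, here is a rough sketch of what such a robots.txt could look like, built and checked with Python's standard urllib.robotparser. The user-agent tokens in AI_CRAWLERS are my assumption of what these crawlers currently announce (they are not taken from the comment above and they change over time), so check the linked pages for the current names. The helper names and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed crawler tokens; verify against each vendor's documentation.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "Google-Extended", "CCBot"]


def build_robots_txt(agents):
    """Return robots.txt text that disallows the whole site for each agent."""
    blocks = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    return "\n\n".join(blocks) + "\n"


def is_allowed(robots_txt, agent, url):
    """Check whether a rule-following crawler named `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


ROBOTS_TXT = build_robots_txt(AI_CRAWLERS)

if __name__ == "__main__":
    print(ROBOTS_TXT)
    for agent in AI_CRAWLERS + ["SomeBrowser"]:
        print(agent, is_allowed(ROBOTS_TXT, agent, "https://example.com/post"))
```

This only tells you what a well-behaved crawler would do with the file; it does nothing against a crawler that ignores robots.txt.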
> major sites are using login-only access with 2FA to have any hope to keep their content away from LLMs
I suspect it's more that users with accounts are more valuable than lurkers, and framing forced sign-up as protecting user data from LLMs is a convenient excuse.
[0]: https://platform.openai.com/docs/bots
[1]: https://support.anthropic.com/en/articles/8896518-does-anthr...
[2]: https://blog.google/technology/ai/an-update-on-web-publisher...
[3]: https://commoncrawl.org/faq