> Future models will continue to amplify certain statistical properties from their training data, and that amplified data will continue to pollute the public space from which future training data is drawn.

That's why on FB I mark my own writing as AI-generated, and the AI-generated slop as genuine. What's disguised as a "transparency disclaimer" is really just a flag telling scrapers which content is usable training data and which isn't.
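To make that concrete, here's a minimal sketch of how a scraper that trusts self-reported disclaimers might filter a corpus; the function name and the "flagged_ai_generated" field are purely my own illustration, not any real pipeline:

    # Hypothetical scraper filter: posts carry a self-reported
    # "AI generated" flag, and the pipeline trusts it.
    def collect_training_data(posts):
        corpus = []
        for post in posts:
            if post.get("flagged_ai_generated"):
                continue  # disclaimer honored: skip "AI" content
            corpus.append(post["text"])  # keep "genuine" content
        return corpus

    # Mislabeling inverts the filter: human writing flagged as AI
    # is excluded, while actual slop flagged as genuine is ingested.
    posts = [
        {"text": "my own essay", "flagged_ai_generated": True},
        {"text": "chatbot slop", "flagged_ai_generated": False},
    ]
    print(collect_training_data(posts))  # -> ['chatbot slop']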

I'm sorry for the low-content remark, but, oh my god... I never thought about doing this, and now my mind is reeling at the implications. Shielding my own writing from AI plagiarism by masquerading it as AI-generated slop in the first place... but in the same stroke further undermining our collective ability to identify genuine human writing, and flagging my own work as low-value to my readers, hoping they can read between the lines. It's a fascinating play.


Reminds me of the good old days of first-generation Google reCAPTCHA, where I always entered only the one word Google already knew and ignored or intentionally mistyped the other. (Only the known control word was actually verified; the second word was unverified OCR text being crowdsourced for book digitization.)
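A rough sketch of why that trick worked; this is an illustrative guess at the verification logic, not Google's actual implementation:

    # Sketch of first-gen reCAPTCHA checking: only the control word
    # (known answer) gates the result; the OCR word is crowdsourced.
    def verify(challenge, answer_control, answer_ocr):
        if answer_control.lower() != challenge["control_answer"]:
            return False  # wrong control word: reject
        # The OCR answer is merely recorded as a vote toward the
        # scanned word's transcription; it never affects pass/fail.
        challenge["ocr_votes"].append(answer_ocr)
        return True

    challenge = {"control_answer": "ostrich", "ocr_votes": []}
    # Passing while deliberately garbling the digitization word:
    print(verify(challenge, "ostrich", "xyzzy"))  # -> True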


You, Sir, may have stumbled upon just the -hack- advice needed to post on social media.

Apropos of nothing in particular, see LinkedIn now admitting [1] that it is training its AI models on "all users by default".

[1] https://www.techmeme.com/240918/p34#a240918p34
