
Note that this is apparently a 7B version of a 104B model trained with the intention of competing with OpenAI's offerings on the Chinese market [1]. There are a number of such projects: Baichuan, ChatGLM2, InternLM and some more iirc, and they all have small-scale open-source versions.

For what it's worth, I've tried out ChatGLM2-6B and Baichuan converted to LLaMA (the architecture is literally identical in that case). They're okay, though underwhelming given their reported benchmarks; probably the main point of creating them is for the engineers to gain experience, and to get feedback from the wider community, which has less incentive to downplay their shortcomings.
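
(If anyone wants to reproduce the conversion: it's mostly a state-dict rename. A minimal sketch, assuming Baichuan-7B's one real difference from HF's LLaMA layout is the fused QKV matrix, W_pack; treat the key names and file paths as assumptions for your checkpoint:)

    # Sketch: convert a Baichuan-7B checkpoint to LLaMA layout.
    # Assumption: Baichuan fuses Q/K/V into a single W_pack matrix per
    # layer; every other tensor is named as in HF's LlamaForCausalLM.
    import torch

    state = torch.load("baichuan-7b/pytorch_model.bin", map_location="cpu")
    converted = {}
    for name, tensor in state.items():
        if name.endswith("self_attn.W_pack.weight"):
            # Split the fused QKV projection into LLaMA's three matrices.
            q, k, v = tensor.chunk(3, dim=0)
            prefix = name[: -len("W_pack.weight")]
            converted[prefix + "q_proj.weight"] = q
            converted[prefix + "k_proj.weight"] = k
            converted[prefix + "v_proj.weight"] = v
        else:
            converted[name] = tensor  # everything else maps one-to-one
    torch.save(converted, "llama-converted.bin")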

Surprisingly, they do not appear censored in any particularly "Chinese" political direction, but they share the sensibilities of ChatGPT and Claude.

1. https://github.com/InternLM/InternLM-techreport




Chinese regulation around generative AI, including its provisions for censorship, isn't yet formalized. The Cyberspace Administration of China published a set of draft measures [0] for public comment, but it doesn't seem like a revised version has been released.

The draft indicates that there will be some level of censorship, but it’s unclear what the scope will be. This analysis[1] suggests that generative AI for research purposes could be exempted (section 1). The same analysis points out that there are other government bodies at play that are more focused on advancing AI as an industry within China.

It does seem likely that there will be some kind of censorship carve-out for AI research, whereas companies offering generative AI products to the public will need to self-censor to avoid fines and/or prosecution.

[0] https://digichina.stanford.edu/work/translation-measures-for...

[1] https://fpf.org/blog/unveiling-chinas-generative-ai-regulati...


> Surprisingly, they do not appear censored in any particularly "Chinese" political direction, but they share the sensibilities of ChatGPT and Claude.

Perhaps they used GPT-4 responses for the instruct finetuning, as many LLaMA finetunes do?
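
(For anyone unfamiliar with that pattern: you collect a stronger model's completions for a pool of prompts, then do supervised finetuning on the pairs. A sketch of the Alpaca-style record format many of those finetunes use; the field names are the common convention, not anything from this paper:)

    # Sketch: Alpaca-style instruction data, as used by many LLaMA
    # finetunes; "output" holds a GPT-4 completion rather than a
    # human-written answer.
    import json

    records = [
        {
            "instruction": "Summarize rotary position embeddings.",
            "input": "",  # optional extra context for the task
            "output": "<the GPT-4 response collected for this prompt>",
        },
    ]
    with open("instruct_data.json", "w") as f:
        json.dump(records, f, indent=2)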

The paper doesn't say where they got the data from, other than "The pre-trained language model is further fine-tuned, following the mainstream procedure as in InstructGPT."

(Also, I don't like how they use raw LLaMA-65b as a baseline rather than an instruct-tuned derivative.)


I believe it's more likely that they used Anthropic's human preference data [1] or similar, and accordingly the Anthropic/progressive American notion of honest-helpful-harmless behavior. Thus I've seen these models misgeneralize towards prudish finger-wagging. For example, they parse bad words like "beat", "abuse", "steal" in morally neutral contexts ("beat a benchmark" or the like) as signifiers of substantial transgression, and spiral into telling me that, as language models, they insist it's never okay to etc. etc. This attitude was strikingly reminiscent of American models, even though other failure modes, like hallucinations, don't seem so similar.
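
(That dataset is easy to eyeball yourself; a quick look with the standard `datasets` library, assuming the split names haven't changed:)

    # Peek at Anthropic's human preference data (hh-rlhf): each record
    # pairs a "chosen" dialogue with a "rejected" one.
    from datasets import load_dataset

    hh = load_dataset("Anthropic/hh-rlhf", split="train")
    ex = hh[0]
    print(ex["chosen"][:500])    # the continuation raters preferred
    print(ex["rejected"][:500])  # the one they rejected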

Papers like Tulu [2] suggest that LLaMA-65b is indeed an appropriate baseline, given reasonable prompting. Instruct datasets only convey a flavor of responses, and for a strong foundation model that can infer the intended flavor on its own, naive finetuning seems to be detrimental. GPT-4 was much more powerful before finetuning, if reports from early testers and researchers are to be believed.
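
(Concretely, "reasonable prompting" here just means wrapping the query in a dialogue template the base model can continue, with no finetuning involved. A sketch with HF transformers; the checkpoint name and the template are my own choices, not Tulu's exact setup:)

    # Sketch: eliciting instruct-style behavior from a raw base model
    # purely via a dialogue-shaped prompt.
    from transformers import AutoTokenizer, AutoModelForCausalLM

    name = "huggyllama/llama-65b"  # any base (non-instruct) checkpoint
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")

    prompt = (
        "Below is a conversation between a curious user and a helpful,"
        " knowledgeable assistant.\n\n"
        "User: What does instruction tuning change about a model?\n"
        "Assistant:"
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=200, do_sample=True,
                         temperature=0.7)
    new_tokens = out[0][inputs["input_ids"].shape[1]:]
    print(tok.decode(new_tokens, skip_special_tokens=True))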

1. https://huggingface.co/datasets/Anthropic/hh-rlhf

2. https://arxiv.org/abs/2306.04751


I don't know anything about what you're talking about. Where do I start to learn some of the AI terminology, models, benefits and drawbacks of each, etc?


The most patient lecturer would probably be ChatGPT itself ...



