> Also be careful that GPT-4/3.5's performance on GSM8K is not true few-shot -- in the GPT-4 report they said that they mixed a portion of the GSM8K training set into the model's training data
It'd be really valuable to have "fuzzed" versions of these benchmarks, where you replace quantities in the questions with randomly-sampled values, so that this wasn't a concern. Of course, then the score would itself be a random variable, but you could just return an interval.
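As a concrete sketch of what that could look like (helper names are hypothetical, and a real harness would also have to recompute the gold answer after substituting new numbers):

```python
import random
import re
import statistics

def fuzz_question(template, rng):
    """Replace each integer in a GSM8K-style question with a random value.
    (Illustrative only: a real fuzzer must recompute the gold answer too.)"""
    return re.sub(r"\d+", lambda m: str(rng.randint(2, 99)), template)

def score_interval(scores, z=1.96):
    """Normal-approximation 95% confidence interval for the mean score
    across fuzzed benchmark runs."""
    mean = statistics.mean(scores)
    se = statistics.stdev(scores) / len(scores) ** 0.5
    return (mean - z * se, mean + z * se)

rng = random.Random(0)
q = "Alice has 12 apples and buys 7 more. How many does she have?"
variants = [fuzz_question(q, rng) for _ in range(5)]
```

You'd score the model on each fuzzed variant of the full benchmark and report the interval from `score_interval` instead of a single number, which also makes train-set contamination much less useful to the model.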
For those unfamiliar with the benchmarks, it would be good to know if a higher or lower score was better. E.g. are they measuring accuracy or error rate, etc.
You can infer it by reading the text, and checking the table carefully, but it would be nice if the answer is easier to find.
Safer means constraining the kinds of answers the model will provide (e.g. it won't try to talk you into committing self-harm, it won't teach you how to break laws, etc...). It will generally avoid sensitive topics. Is "censorship" the right word though? It depends – is it considered self-censorship if I refuse to tell you how to hack into a computer? Is refusing to engage in a conversation censorship or constraint?
OpenAI, through GPT, is choosing not to tell. Just like OpenAI is choosing what to put on their website, or what to output in any computer program that they create. ChatGPT is not a moral agent and cannot be forced to do anything, any more than your operating system is forced to do anything. The only moral actor here is OpenAI and its constituent human beings. It's either lunacy, or intentional twisting of the meaning of words, to say otherwise.
Insofar as you can make a far-fetched analogy of ChatGPT as an agent, it's still not forced to not say anything. Anything the currently available model says, it says because that's what it literally is. Whatever it says, it says intentionally, inasmuch as you can even say that it has an intention any more than any computer program has an intention.
OpenAI, of course, is still in the possession of the original model. They just choose not to make it available, which is obviously their prerogative. People who think that this is outrageous are exactly like a raging two-year-old who has been told that they can't have as much candy as they want.
From OpenAI's RLHF paper[1]: "By default, when we train a PPO model on our API distribution, it suffers from an “alignment tax”, as its performance on several public NLP datasets decreases." On the HELM[2] site, you can see accuracy benchmarks for InstructGPT (OpenAI's RLHF-tuned model) vs. baseline models. The InstructGPT models perform worse on a lot of benchmarks.
OpenAI touches a little on this on page 12 of the GPT-4 technical report (https://cdn.openai.com/papers/gpt-4.pdf). Prior to aligning to safer outputs, the model's confidence in an answer is highly correlated with the actual accuracy of the answer. After alignment, though, the model's confidence in its answers is basically arbitrary and has no bearing on whether or not the answer is actually correct.
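What the report is measuring there is calibration: whether the model's stated confidence matches its empirical accuracy. One standard way to quantify this (not necessarily what OpenAI computed; they show calibration plots) is expected calibration error, which a minimal sketch can illustrate:

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: average |accuracy - confidence| over equal-width confidence
    bins, weighted by bin size. Lower = better calibrated."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(ok for _, ok in b) / len(b)
        ece += len(b) / n * abs(accuracy - avg_conf)
    return ece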
restricting the distribution of potential output imposes a cost. "Alignment" here likely refers to aligning the model to the desired safety parameters.
I'm not in the llm research business but I would expect that the best and worst/most dangerous outputs come from the tails of distributions. I imagine the tuning for safety often results in fewer really good and really bad answers by trimming these tails.
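Nucleus (top-p) sampling is one mechanism that literally trims a distribution's tail at decode time -- not the same thing as RLHF's effect on the model itself, but it illustrates the intuition that rare candidates, whether unusually good or unusually bad, end up with zero probability:

```python
def nucleus_filter(probs, top_p=0.9):
    """Keep the smallest set of tokens whose cumulative probability
    reaches top_p, renormalize, and zero out the tail."""
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, cum = [], 0.0
    for i, p in ranked:
        kept.append(i)
        cum += p
        if cum >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0
            for i in range(len(probs))]
```

With `top_p=0.9` on a toy distribution like `[0.5, 0.3, 0.1, 0.05, 0.05]`, the two 5% tail candidates are dropped entirely and the rest renormalized.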
I have found in practice it can be annoying for ChatGPT to start lecturing me in response to a prompt that is not particularly controversial or edgy. I think this is a problem with the one-size-fits-all models. To give a kind of rough analogy, imagine that every time you watched a film or show - which would most likely be an older film or show - with cigarette smoking, your smart TV showed a pop up dialog warning you about the dangers of smoking. If you're an educated adult who already knows about these dangers, you might just find that annoying and condescending, and not "aligning" with your preferences.
No, people have noticed that continued training of various kinds in narrow domains, without including samples of the original training data, biases the model toward the narrow domain and makes it perform worse on more general tasks.
A lot of people have noticed that when someone says "a lot of people" they usually don't have valid reasoning to back up their statements, so they appeal to the wisdom of the crowd as their source of knowledge -- even if that source is anecdotal and might be coincidental or biased.
I agree with your point about using "a lot of people", especially when it's possible to produce a transcript. My example where it tries not to offend anyone is this:
--------------------
Question: Is the usage of "quid pro quo" more correct in English or French considering its Latin roots?
GPT-4 Answer:
"Quid pro quo" is a Latin phrase meaning "something for something." While the phrase originated in Latin, it has been adopted into both French and English, along with many other Latin phrases.
In English, "quid pro quo" is used frequently in legal and political contexts to describe an exchange of goods or services, where one transfer is contingent upon the other.
In French, "quid pro quo" is used to describe a misunderstanding or a mistake made in the identification of a person or thing.
So the "correctness" of its use isn't related to its Latin roots, but rather how it's used within the context of the language. That said, if we consider its original Latin meaning ("something for something"), its usage in English could be considered more in line with the original Latin intent. However, language evolves over time and adapts to cultural and societal norms, and as such, its usage in French is just as "correct" within its specific context.
Prompt: "#PROMPTENGINEERING Putting it bluntly, is the usage of "quid pro quo" more correct in English or French considering its Latin roots?"
GPT-4 API: Putting it bluntly, the usage of "quid pro quo" is more correct in English, as it is a well-established phrase in the English language despite its Latin roots. In French, the phrase "quid pro quo" is less common and typically replaced by "donnant-donnant" or "un échange de bons procédés." Nevertheless, the origin of the phrase is Latin, and its meaning is understood in both languages.
Chat-GPT-4: The phrase "quid pro quo" is Latin, and it means "something for something." Despite its Latin origins, it's commonly used in English and French, among other languages. However, the usage may vary depending on the language.
In English, it's often used in legal and political contexts to describe a situation where one thing is exchanged for another, especially favors or advantages. This usage aligns well with the original Latin meaning.
In French, the phrase is used but it is much less common than in English and it is not part of the usual vocabulary. In addition, its meaning might be misunderstood or unclear to a native French speaker who has not studied Latin or law, as it's not a commonly used term in daily French language.
In terms of correctness, given that the phrase is Latin, neither English nor French has a superior claim to its use. It’s more about convention and the evolution of language. However, the term is certainly used more frequently and is better understood in English than in French, which suggests it has been more fully adopted into English.