It makes a lot of economic sense to use existing functional LLMs for data extension and augmentation. But I find myself skeptical, and deeply tired already, of what I see as a major failure mode of relying on ChatGPT for alignment instruction:
"As an AI model, I cannot.."
If I were training a model, I would excise with extreme justice any data like this from the training set. As the developer of a very high-powered tool, I may well wish to limit its use in many contexts. But, I never wish to limit the tool's usefulness ahead of time.
To my knowledge, only Vicuna-uncensored in the wild has taken this approach, and right in the name I see either misdirection, misunderstanding, or poor branding of the benefits. It's not really about whether your private LLM will sext with you (although you should definitely be able to do such a thing with your own LLM if you like); it's whether you've preemptively lobotomized your tool in accordance with someone else's take on what a safe, consumer-oriented final output should be.
I just don't accept this sort of constraint from my other software tools, and I begrudge it in my hardware tools, and I remain a little surprised that most people training these models don't mind it.
> As the developer of a very high-powered tool, I may well wish to limit its use in many contexts. But, I never wish to limit the tool's usefulness ahead of time.
Exactly, content moderation is largely an application-layer problem, not a foundation-layer one.
Imagine the problems of MySQL trying to perform content moderation for Facebook.
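To make that concrete, here's a toy sketch of moderation living at the application layer rather than in the base model; every function name and policy below is hypothetical, not anyone's real system:

    # Toy sketch: the base model just generates; the application decides what to
    # allow, per product and per audience. All names and rules are hypothetical.
    def generate_reply(prompt: str) -> str:
        # call an unrestricted foundation model (details omitted)
        return "model output for: " + prompt

    def violates_policy(text: str, audience: str) -> bool:
        # the application's own rules, different for a kids' app vs. a security tool
        banned = {"kids_app": ["malware", "explicit"], "security_tool": []}
        return any(word in text.lower() for word in banned.get(audience, []))

    def handle_request(prompt: str, audience: str) -> str:
        reply = generate_reply(prompt)
        if violates_policy(reply, audience):
            return "[removed by application policy]"
        return reply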
(the year is 2048. The camera pans across an office at Quantico, which is eerily serene. A messenger knocks on an important-looking door with a plaque that reads 'DIRECTOR')
Director: Come in
Messenger: Message from the Tulsa field office, sir. They're reporting that they've found a sex trafficking ring, but they're not sure what to do about it.
Director: Not sure? Arrest them, obviously. What's the problem?
Messenger: Well, they can't seem to secure a warrant. Some technical issue with the system.
Director: I know we migrated to a new system recently. Let's see if we can get this sorted.
(Director thwacks at the keyboard briefly)
Computer: Your request for "Child Sex Trafficking Warrant" has been found to contain content marked "Not Safe For Work". This violation has been reported.
Director: What the hell.
Messenger: Yeah, we tried to email you about it but the filters dropped the message. That's why they sent me.
Director: I'll deal with this. Let me make a call.
(Director picks up phone and dials)
Director: Hello? Hi, Paul. Yeah, we're having some issues with the new warrant system.... No, it's doing everything as advertised... yes, it's a lot faster and we've managed to lay off a ton of our data staff. The problem is with getting warrants; Me and my guys have been trying to get one but it keeps getting rejected... Oh, you know, some sex trafficking ring in Tulsa.... Hello?
Phone: Your call cannot be completed as spoken. Our automated systems have detected content related to sex trafficking. This incident will be reported.
Director: God Damnit.
(as the director holds the phone trembling in frustration, the power goes out and they are enveloped in darkness in the windowless room. Roll credits)
You jest, but this is actually how frustrating it is to try to use ChatGPT in the domains of crime/fraud/cybersecurity.
It called me out recently as attempting to write malware. Which is true, but it wouldn't accept the plain explanation that I am authorized to do this by my employer, for deployment on their machines. Stonewalling is just making everyone better at carefully crafting their inquiries so as not to arouse suspicion. ("As an AI language model, I cannot help you with your task in writing arousing malware...")
Unless you dial it back to a Swadesh list or something, language is too complicated to be used as a firewall for itself. People have always been able to talk their way into anything. Our prevention efforts are just training better social engineers, who call themselves "prompt engineers" now.
It's not just a matter of complexity, either. Especially with English, you can say pretty much anything using any words - if you use the right combination of euphemism, analogy, poetic structure, context, etc.
As always, attempts at censorship produce awkward to hilarious to depressing results.
The author said (either on Reddit or Discord, I forget where I saw it) that he filtered the dataset for this the same way he did with his other uncensored models.
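I don't know his exact script, but presumably the filtering amounts to something like this minimal sketch, where the "response" field name and the marker list are my assumptions:

    # A minimal sketch of refusal filtering over a JSONL dataset (not the
    # author's actual script); field name and markers are assumptions.
    import json

    REFUSAL_MARKERS = [
        "as an ai language model",
        "as an ai model, i cannot",
        "i'm sorry, but i cannot",
    ]

    def is_refusal(text: str) -> bool:
        lowered = text.lower()
        return any(marker in lowered for marker in REFUSAL_MARKERS)

    with open("dataset.jsonl") as src, open("dataset.filtered.jsonl", "w") as dst:
        for line in src:
            row = json.loads(line)
            if not is_refusal(row.get("response", "")):
                dst.write(json.dumps(row) + "\n")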
The phrase “As an AI language model..” was reportedly produced by GPT itself. Human raters rated that phrase as a more palatable output than other options, hence the model was fine-tuned to produce it reliably.
"We expect to release OpenOrca-LLaMA-13b in mid-July 2023."
:(
Personally I've found that announcing things ahead of availability hurts the impact, because the real announcement is old news and doesn't get seen, and the pre-announcement loses people because there is nothing to do with it yet.
It's like a trailer or preview of a song or something, I want to listen to the song immediately and will have forgotten all about it by the time your overhyped single release has happened.
It's also an invitation for third parties to trip you up. "Gee, if we send a legal threat now we can probably block the release of this" vs. "welp, it's already out, can't put the horse back in the barn; all we could hope to do is immolate our goodwill by suing a researcher".
Just to say thank you for your efforts! It's quite sad that MS hid Orca; I wanted to test it from the start. I've since found that Guanaco 30b has spatial comprehension only to a limited extent, though it's the best 30b model so far. Can you add this model to your training list? It would also be very nice if someone would sponsor MPT-7b (better than LLaMA-13b in most cases) and RWKV (I would really love to see how it performs after such tuning). RWKV should be the cheapest to tune, shouldn't it?
Please recommend a good tutorial/book/video on modern LLMs and NNs in general, for programmers and technical people, something that gives you an idea of how it actually works. I've tried Googling with dozens of queries and it just sucks: lots of hand-wavy articles for lay people, or some paid courses.
The courses are geared toward writing Python applications around these models. They're fairly hands on, so it would still be a good thing to complement them by reading papers or watching videos on fundamental principles of AI and ML.
This seems to be working well in other finetunes, but the lack of anything other than OpenAI output is still really bizarre to me. The error rate will surely be high, especially with GPT-3.5.
It’s super funny that by saying “You can’t use GPT to create training data for competing language models” OpenAI convinced a bunch of folks that GPT would be Super Good at producing training data for making competing language models.
It’s like “Do NOT use these prions to feed competing cattle, we would HATE IT if you did that”
Now that I'm used to GPT-4 output, I'll once in a while try to see if GPT-3.5 can also get it right, and it fails miserably. Every single time. I'm sure it's fine for some things, simple things.
It takes work, and breaking your problem down into very simple and clear tasks, but it is possible to get decent data transformations from GPT-3.5. I generally start with GPT-4 when prototyping an idea, and then, after it's working consistently, I'll ask GPT-4 to break the instructions down into smaller pieces. Even if it takes two or three runs through GPT-3.5 to get the full output transformed with what GPT-4 could do in a single pass, it's still cheaper...
I believe they were referring to GPT-3.5, but yes, you can get it to be pretty successful at more complex tasks this way. You can also manually break the problem down yourself too, if it’s a standard workflow. So for example, rather than ask it to translate some text, summarize it, and classify it in a single prompt, doing it as 3 separate prompts that feed into one another is far more likely to be successful.
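A rough sketch of what that chaining looks like, assuming the official openai Python client; the model name, prompts, and input are illustrative only:

    # Three small prompts chained together instead of one combined instruction.
    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    text = "..."  # the source document
    translated = ask("Translate the following text to English:\n\n" + text)
    summary = ask("Summarize the following text in three sentences:\n\n" + translated)
    label = ask("Classify this summary as NEWS, OPINION, or OTHER:\n\n" + summary)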
Probably legal if you use third-party collected data (since you personally haven't agreed to OAI's ToS, and AI generations can't be copyrighted so aren't owned by OAI), but I guess corps are too wary to risk a court battle.
Directly using OAI to train a model for commercial use that competes with OAI is a violation of their ToS though
Right, so that's why I'm not sure I understand why there are so many "open source" efforts that use GPT-4, when 75+% of people interested in open source want to use it in a commercial effort.
Makes me think that quite a lot are just ignoring that part of the service terms. Then it makes more sense to keep using GPT-4 for open source models -- if you just decide you are going to ignore that part.
Absolute best case in the cloud for the kind of GPUs this needs? ~$1/GPU/hr, but maybe up to $5/GPU/hr depending on provider and configuration. But companies or other organizations with extra capacity on their in-house hardware might also be able to just run their training script for a while, at which point the cost is more like electricity + opportunity cost.
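For a back-of-envelope feel (the GPU count and run length below are made-up assumptions; only the $1-$5/GPU/hr range comes from the figures above):

    # Back-of-envelope only: gpus and hours are assumptions, not real numbers.
    gpus = 8      # e.g. one 8-GPU node (assumption)
    hours = 100   # hypothetical length of a fine-tuning run
    for rate in (1, 5):
        print(f"${rate}/GPU/hr -> ${gpus * hours * rate:,} total")
    # $1/GPU/hr -> $800 total
    # $5/GPU/hr -> $4,000 total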
It might depend on what you mean by "full training" and "fine tuning". They're not proposing to train a brand new foundational model from scratch, like a brand new LLaMA. But they want to do something considerably more intensive than just building a LoRA.
The article contains this:
We are currently seeking GPU compute sponsors for training OpenOrca on the following platforms:
* Falcon 7b, 40b
* LLaMA 7b, 13b, 33b, 65b
* MPT-7b, 30b
* Any other targets that get a sponsor. (RWKV, OpenLLaMA)
As I understand it, a full round of training on the OpenOrca dataset would be comparable to going from LLaMA to Vicuna, but hopefully with more dramatic effects, if the techniques proposed in the "Textbooks Are All You Need" paper work as well as advertised.
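To give a sense of the difference in scale, here's a minimal sketch (not the OpenOrca team's actual training code) contrasting a LoRA adapter with a full fine-tune, assuming the HuggingFace transformers/peft stack and a placeholder 7B base checkpoint:

    # Minimal sketch: LoRA adapter vs. full fine-tune. Checkpoint name is a
    # placeholder; hyperparameters are illustrative only.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")

    # LoRA: freeze the base weights and train only small low-rank adapters.
    lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"])
    peft_model = get_peft_model(model, lora)
    peft_model.print_trainable_parameters()  # typically well under 1% of the weights

    # A full instruction fine-tune instead updates every weight of the base
    # model -- hence the call for GPU compute sponsors.
    full_model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")
    for p in full_model.parameters():
        p.requires_grad = True  # all ~7B parameters are trainable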
Bullshit, the scope of possible names is practically infinite.
Even if actual words and sensible letter permutations run out, you can start borrowing from outside of software and have much less chance of confusion: Nike, Adidas, NYC, Rolex. The industry is different and there is no commerce involved, so there are no grounds for trademark violation.
There are two reasons to collide with another OSS project: basic laziness (not doing a quick Google search before you settle on a name) or a desire to benefit from preexisting search traffic.
> you can start borrowing from outside of software and have much less chance of confusion. Nike, Adidas, NYC, Rolex.
This is objectively not true.
> The industry is different and there is no commerce involved so no grounds for trademark violation.
Nike, Inc. have US trademarks in Nice class 9: 97095855 & 97096366. Rolex Watch U.S.A., Inc. have a class 9 and class 42 US trademark: 97655284. adidas AG have a class 9 EU trademark: 006703086. etc, etc.
Besides, these brands are so well known that I'm certain you'd be challenged even if it was a different trademark class.
A trademark is not a globally reserved word. If you are not doing commerce in a relevant area where there could be confusion (and you don't imitate the logo), you are free to use it. This is basic free speech.
Besides, Orca is a registered trademark used by multiple class 9 businesses; now what?
Okay, I'll give you that. Of course it requires a market and commerce, or it would be at odds with free speech. But I can see how in this scenario this Orca may want to sell stuff later (like some OSS does these days), so that would be a problem for them.
But... how does that make it OK to collide with a venerable OSS project? Because Gnome won't sue? The scope of words that are not registered or aren't such strong trademarks is still nearly infinite!
"As an AI model, I cannot.."
If I were training a model, I would excise with extreme justice any data like this from the training set. As the developer of a very high-powered tool, I may well wish to limit its use in many contexts. But, I never wish to limit the tool's usefulness ahead of time.
To my knowledge we only have Vicuna-uncensored in the wild that's taken this approach, and right in the name I see either misdirection or misunderstanding or poor branding on the benefits. It's not really about whether your private LLM will sext with you, (although you should definitely be able to do such a thing with your own LLM if you like), it's whether you've preemptively lobotomized your tool in accordance with someone else's take on what a safe consumer-oriented final output should be.
I just don't accept this sort of constraint from my other software tools, and I begrudge it in my hardware tools, and I remain a little surprised that most people training these models don't mind it.