Must say I'm genuinely thankful for what Meta has done on this front. Without their research release of LLaMA, I think things would have been substantially less democratic.
This could easily have gone in an "only orgs with billions can play" direction, with nobody else even trying, in the learned-helplessness sense. Instead we're ending up in a hybrid "OK, maybe we can't quite train from scratch, but we can still tinker" space, which is a lot healthier.
If they want to double down on that, then I applaud them.
Facebook isn't doing this out of the kindness of their hearts; it just makes business sense for them. AI seems to be in a smiling-curve situation right now, where only Nvidia on the hardware side and the consumer-facing products using AI as a feature are making money. The companies training the models and trying to sell API access look like they'll have a hard time not becoming replaceable commodities.
Zuckerberg talked about this in his interview with Lex Fridman.
From what I understand they already benefited from the OSS work on quantization, and they see themselves as well positioned to benefit from a world where there's a bunch of specialized AI models/assistants.
Does anyone remember Facebook M? I believe that was their first big foray into creating an intelligent chatbot/personal assistant. Although that may not have anything to do with LLaMA directly, it's still pretty cool to see that a crazy vision like that is so close to becoming reality.
I am genuinely excited about the positive impacts LLMs and their future derivatives can have in computing. We can now, for the first time, truly “program” a computer in natural language. It’s even intelligent enough to “fill in the gaps” using general intelligence. Just don’t rely on it for any niche topics without teaching it a thing or two, or you’ll get bull crap back.
Nope, it wasn't leaked... somehow the media latched on to that wording.
Initially it was behind a consent form for research purposes, i.e. you gave some basic deets and got access to the weights under a non-commercial license. FB shut that down after it got lots of attention.
That obviously got copy-pasted onto a torrent & grew legs from there. FB hassled some people with DMCA takedowns too, but it seemed pretty half-hearted & was too late at that stage.
[Sidequest: I believe the repo they used to distribute access also had a magnet link in it at one point, which further confused the narrative, but I'm not 100% sure of the precise details.]
Point is, at no stage was this 100% behind closed doors with someone leaking it in the "stolen" sense, as you & I would understand the word.
Legally it is still restricted to those who have been granted permission for research / non-commercial purposes. That license still applies. The fact that they have stopped actively enforcing it does not change the legal status. If Elon were to announce that Twitter was using it for commercial purposes, Meta could sue in court and would likely win.
I see. Thanks for explaining. Yeah I guess that re-distribution was a little rogue and could perhaps be called a leak. I personally dislike that interpretation but I can see it.
You can tell by Mark's wording and body language when he talks about it in the recent Lex Fridman episode. I got the impression from him that he would have released it in a manner closer to that of open source if there wasn't a question of legal liability.
Why couldn't they distribute it? They clearly could have. There's no law against it.
Perhaps you meant that they were nervous about companies using it commercially and either bringing them bad press or making money off their work? That's clearly why they only released it for researchers.
Don’t forget what you’re dealing with here: the faceless, amoral, infinitely ravenous maw of the most efficient personal-data succubus in history. Make no mistake, this is something like “goodwill capture” instead of “regulatory capture.”
I see no way that this diminishes Meta’s power in any way - arguably it strengthens it by making it easier to choose a Meta architecture instead of creating a competing FOSS architecture.
So arguably all this does is raise the FOSS bar technically while further entrenching Meta - AND, most importantly, it gets thousands of developers to prime their data architectures for Meta models that will eventually be served from a Meta account.
And once it’s widespread enough to lock you in, those commercial terms, whoops they changed!
A false dilemma, also referred to as false dichotomy or false binary, is an informal fallacy based on a premise that erroneously limits what options are available.[1]
These models cost millions to train. The only reason open-source LLMs have a heartbeat is that they’re standing on Meta’s weights. The only third path is a public option.
> The only reason open-source LLMs have a heartbeat is they’re standing on Meta’s weights.
Not necessarily.
RWKV, for example, is a different architecture that wasn't based on Facebook's weights whatsoever. I don't know where BlinkDL (the author) got the training data, but they seem to have done everything mostly independently otherwise.
disclaimer: I've been doing a lot of work lately on an implementation of CPU inference for this model, so I'm obviously somewhat biased, since this is the model I have the most experience with.
My personal bet is specialised models have a niche. Do you think one of these could compete with GPT if e.g. trained on a law firm’s correspondence and contracts?
Probably not, honestly. Because it's an RNN, old information gradually deteriorates as new information is fed into the model. That's undesirable compared to e.g. transformers, which can reference any part of the context without degradation but have a hard limit on context size. (RWKV can ingest a theoretically infinite number of tokens, but after around 16k it will start to degrade into madness until restarted, so practically it does sort of have a limit.)
(The reason it degrades is that a single internal state is updated in place per token, and the current models have only been trained with up to 8192 tokens of context, so once you get to roughly double that, the state starts to diverge from "sanity", with no known way to correct this. And then priming a new instance of the model with 8192 or so tokens of the new context takes a really long time, because you can't compute the next step of an RNN until you also have the previous one!)
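To make that serial-priming cost concrete, here's a toy sketch (not the actual RWKV code; rnn_cell and the state shape are made up purely for illustration) of why ingesting N tokens of context takes N sequential steps for an RNN:

    import numpy as np

    def rnn_cell(state, token_embedding):
        # Toy stand-in for one RWKV-style step: the single state vector is
        # updated in place for every token, so old information is gradually
        # overwritten as new tokens arrive.
        return np.tanh(0.9 * state + 0.1 * token_embedding)

    def prime(token_embeddings):
        # Priming a fresh instance on a long context is inherently serial:
        # step t cannot begin until step t-1 has produced its state.
        state = np.zeros_like(token_embeddings[0])
        for tok in token_embeddings:  # N tokens -> N sequential evaluations
            state = rnn_cell(state, tok)
        return state  # the only "memory" the model carries forward

A transformer, by contrast, attends over every stored context token at each step, which is why it can reference any part of its window without decay but pays for it with a hard window-size limit.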
With some fine-tuning (which, even that is ... still out of reach for most people, unfortunately, but I digress) it can be turned into a pretty good chat model, generate story completions, generate boilerplate code, etc., and the base model is already reasonably okay at most of these things.
I think it's definitely a competitor in some areas, though I don't remember whether there have already been benchmarks putting it up against the other models. I do know that it's better than the majority of other open-source models, including transformer-based ones, but that probably owes more to training data than architecture.
It is interesting how “catastrophic forgetting” is subtly different, technically, between these large-corpus LLMs and, say, a CNN, but the basic “the sequences you are looking for are not here” is the same.
Oh, you said trained. If trained, then the long-context issue may not be as severe. It might still go mad if you let it eat too much of a hundred-page lawsuit, but if you work with portions of it (like how transformers work; see the sketch below), RWKV can be vastly more economical than the larger models, requiring a much less powerful GPU, or even no GPU at all, thanks to rwkv.cpp.
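A minimal sketch of that chunked approach, assuming a generic model.forward(token, state) interface in the spirit of rwkv.cpp's Python bindings (the function names, the tokenizer, and the 8192 figure are illustrative placeholders, not the real API):

    # Hypothetical chunked ingestion: keep each pass well under the trained
    # context length so the state never drifts past "sanity".
    TRAINED_CTX = 8192               # context length the current models saw
    CHUNK = int(TRAINED_CTX * 0.75)  # stay comfortably below the danger zone

    def ingest_document(model, tokenizer, text):
        # Process a long document in independent chunks instead of one
        # ever-growing state; returns one final state per chunk.
        tokens = tokenizer.encode(text)
        states = []
        for start in range(0, len(tokens), CHUNK):
            state = None                         # fresh state for each chunk
            for tok in tokens[start:start + CHUNK]:
                _logits, state = model.forward(tok, state)
            states.append(state)                 # per-chunk "memory"
        return states

Each chunk can then be summarized or queried separately, much like the sliding-window workflows people already use with fixed-context transformers.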
rwkv.cpp in particular depends on a project that would not have existed in its current form without LLaMA, even though the project itself isn't LLaMA-specific. However, there are enough other implementations of CPU inference (at least two?) that I think RWKV could still exist even if LLaMA never had.
It kicks Google, a competitor for advertising dollars. Some people feel Google is under existential threat from AI (trawling through search results full of spam and ads sucks when an AI can just tell you the answer), and this release lets people build various forms of Google competitor without having to do the hard lifting of creating the LLM themselves.
It kicks OpenAI, too, though Microsoft is perhaps less obviously a competitor to Meta right now. But Microsoft has OpenAI, loads of money, loads of engineers, and lots of product lines, so they might leverage OpenAI's tech lead to _become_ more of a competitor to Meta. It's less of a risk to Meta if OpenAI doesn't have a tech lead anymore.
This is aimed squarely at OpenAI (and to a much lesser degree, Anthropic). Google is their own worst enemy in this space precisely because they are terrified of doing anything that will cannibalize their search business and the ad market built around it.
It also makes Meta more attractive for top research talent, because researchers _really_ like to publish and get credit for their work. As OpenAI and others batten down the hatches, this could give Meta an advantage.
How will AI affect Facebook? The number of fake news posts, even by ordinary people putting themselves in photos, etc., is going to really gut the platform of its value, won’t it? And what about when the flood of AI images hits Instagram? It’s going to be a weird decade or two for them.
It is certainly strategic to make their AI the most accessible platform for building with AI. Plus, with the reach of their social networks, AI improvements made elsewhere can flow back into their own models, then into their software, and on to end users. If company A finds a way to use Model X, that is more easily usable by Meta, since they know the model quite well, I would assume. Meta’s business thrives on free usage by billions of users; it needs people to keep using its platforms and not leave the networks. Maybe Google is the nearest competitor in terms of the ads business being so financially vital.
Given the choice between equal models from Meta and from the group that released Falcon (initially under a super shady royalty license that they then open-sourced, and whom nobody had ever heard of before), I'd personally go for Meta.
Of course, variety is good and I hope the UAE group continues to establish themselves as a credible model provider.
I'm surprised this opinion still persists. Royalty-based licenses have been used by major game engines [0] for a long time, so that's not unprecedented.
This isn't the first time I've seen this brought up. It's irrelevant here for so many reasons: it is unprecedented for ML models, the model was promoted as open source, and the terms were absurd (10% for anything related to it, plus some reporting requirements, for a foundation model that's not even tuned for anything).
In any event, bringing shitty practices from another industry into ML doesn't seem worth supporting.
> In any event, bringing shitty practices from another industry into ML doesn't seem worth supporting.
Why is this still an issue for you?
All players have made licensing blunders in the past, and the fine folks behind Falcon seem to have learned from their mistake by releasing their weights under Apache 2.0, a well-understood and respected permissive license.
Many major open source projects started as proprietary software that eventually went open source. Why hold a grudge against this project specifically? Yes, they made a mistake and learned from it. What more do you want?
See, when you have to convince corporate lawyers and security folks, that whole switching-licenses thing makes them uneasy. BD and legal are much happier dealing with Meta than with the UAE.
People relicense software all the time and the lawyers are usually fine with it (especially when the terms are more favorable). What am I missing here?
> It's not only free as in beer, but free as in libre
If by "it" you mean LLaMA 1, then I don't think the license allows using it for commercial projects, so it isn't really libre. That said, all indications are that LLaMA 2 will be fully FLOSS.
I love Meta. I thought they had turned to the dark side for the longest time. I've never been so wrong. When Meta released LLaMA it changed my life. I've never seen a company move more leverage to the edge at once. They must be taking notice of how much it's made folks like me adore them. So now we're getting commercial friendly models too? I didn't know Christmas could come twice in one year.
Meta is so big that I'm not even sure it's valid to refer to them as a whole. What I like in this context is Meta's ML research department, but to me, Meta as a whole is still the same old privacy-violating, dopamine-inducing, teenage-depression-and-suicide-causing company I've always known.
When LLaMA came out, I dropped everything I was doing to work on it. It's given me new hope for technological progress. Think about it. What were people focusing on before LLMs came out? The frontiers in software were cryptocurrency and wasm. Now we have something I can believe in and thanks to Facebook I'm able to actually use it on my own. I also got to feel like I was a part of its development when I changed llama.cpp to reduce its memory use by 2x and enable running multiple models in parallel.
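For those curious how a single change can both halve memory use and let multiple models run in parallel: one common mechanism (an assumption on my part, not necessarily the exact change, and llama.cpp itself is C++) is memory-mapping the weights file so the OS page cache backs them. The general idea, in an illustrative Python sketch with "weights.bin" as a placeholder path:

    import mmap

    # Mapping a weights file instead of read()-ing it avoids holding a second
    # heap copy alongside the OS page cache, which is roughly where a ~2x
    # saving can come from.
    with open("weights.bin", "rb") as f:
        weights = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    # Pages are faulted in lazily on first touch and stay file-backed, so
    # N processes mapping the same file still cost ~1x the file size in RAM.
    header = weights[:16]   # touching a slice loads just those pages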
Is there any information about when the model will be available or what it is capable of? Anything like a position on a leaderboard or a score related to reasoning or code generation?
Will it be able to run on AMD's new MI300X? I keep hoping that will put a "chip" in Nvidia's dominance since it seems more efficient.
I am not sure businesses would trust Meta with their sensitive data. Facebook/Meta brand equity in the B2B domain is severely tarnished, given their long history of disregard for data privacy laws and lack of transparency.
I just don’t see how Meta could possibly turn into a success in the B2B software business. They are a great advertising company, but they’ve never been successful in their other ventures…
Honestly, this feels like a good reason for them to open-source it. You trust the local model and may try using it instead of one of their competitors like OpenAI. If they took LLaMA in the state they released it in and hosted inference, nobody would use it, due to the lower quality and data privacy issues.