Meta Llama 3 (meta.com)
2199 points by bratao 13 days ago | 923 comments






They've got a console for it as well, https://www.meta.ai/

And announcing a lot of integration across the Meta product suite, https://about.fb.com/news/2024/04/meta-ai-assistant-built-wi...

Neglected to include comparisons against GPT-4-Turbo or Claude Opus, so I guess it's far from being a frontier model. We'll see how it fares in the LLM Arena.


They didn't compare against the best models because they were trying to do "in class" comparisons, and the 70B model is in the same class as Sonnet (which they do compare against) and GPT-3.5 (which is much worse than Sonnet). If they're beating Sonnet, that means they're going to be within stabbing distance of Opus and GPT-4 for most tasks, with the only major difference probably arising in extremely difficult reasoning benchmarks.

Since Llama is open source, we're going to see fine-tunes and LoRAs though, unlike Opus.


Llama is open weight, not open source. They don’t release all the things you need to reproduce their weights.

Not really that either, if we assume that “open weight” means something similar to the standard meaning of “open source”—section 2 of the license discriminates against some users, and the entirety of the AUP against some uses, in contravention of FSD #0 (“The freedom to run the program as you wish, for any purpose”) as well as DFSG #5&6 = OSD #5&6 (“No Discrimination Against Persons or Groups” and “... Fields of Endeavor”, the text under those titles is identical in both cases). Section 7 of the license is a choice of jurisdiction, which (in addition to being void in many places) I believe was considered to be against or at least skirting the DFSG in other licenses. At best it’s weight-available and redistributable.

Those are all great points, and these companies really need to be called out for open-washing.

It's a good balance IMHO. I appreciate what they have released.

I appreciate it too, and they're of course going to call it "open weights", but I reckon we (the technically informed public) should call it "weights-available".

Has anyone tested how close you need to be to the weights for copyright purposes?

It's not even clear if weights are copyrightable in the first place, so no.

Is it really useful to make an LLM open source when it takes millions of $ to train it?

At that scale, open weights with permissive license is much more useful than open source.


Which large model projects are open source in that sense? That its full source code including training material is published.

Olmo from AI2. They released the model weights plus training data and training code.

link: https://allenai.org/olmo


Even if they released them, wouldn't it be prohibitively expensive to reproduce the weights?

It's impossible. Meta itself cannot reproduce the model, because training is randomized and that information is lost. First, samples come in at random order. Second, there are often dropout layers; they generate random masks that exist only on the GPU during training, for the duration of a single sample. Nobody saves them; it would take much more storage than the training data. If someone tries to re-train, the masks will be different, which results in different weights and divergence from the beginning. The model will converge to something completely different, though with similar behavior if training was stable (and LLM training is stable).

So, no way to reproduce the model. This requirement for 'open source' is absurd: it cannot be reliably met even for small models, due to GPU-internal randomness; only for the smallest ones, trained on a CPU in a single thread. Only academia will be interested.
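
To illustrate the dropout point, here's a toy PyTorch sketch (all shapes and hyperparameters made up): two runs share the same data and the same initial weights and differ only in the dropout RNG seed, yet end up with different weights:

  import torch

  def train(dropout_seed: int) -> torch.Tensor:
      torch.manual_seed(0)  # identical data and identical initial weights
      x, y = torch.randn(64, 8), torch.randn(64, 1)
      model = torch.nn.Sequential(
          torch.nn.Linear(8, 8), torch.nn.Dropout(0.5), torch.nn.Linear(8, 1))
      opt = torch.optim.SGD(model.parameters(), lr=0.1)
      torch.manual_seed(dropout_seed)  # from here on, only the dropout masks differ
      for _ in range(200):
          opt.zero_grad()
          torch.nn.functional.mse_loss(model(x), y).backward()
          opt.step()
      return torch.cat([p.detach().flatten() for p in model.parameters()])

  print((train(1) - train(2)).abs().max())  # clearly nonzero: the runs diverged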


1.3 million GPU hours for the 8B model. It would take you around 130 years to train on a single desktop GPU, lol.

Interesting. Llama is trained using 16K GPUs, so it would have taken them around a quarter. An hour of GPU use costs $2-$3, so training a custom solution using Llama should be at least $15K to $1M. I am trying to get started with this. A few guys suggested 2 GPUs were a good start, but I think that would only be good for around 10K training samples.
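
For scale, some back-of-the-envelope math with the numbers from this thread (the $2-$3/hour figure is an assumed on-demand cloud rate, not Meta's actual cost):

  gpu_hours = 1_300_000          # reported for the 8B model
  n_gpus = 16_000                # rough cluster size from the comment above
  dollars = (2, 3)               # assumed price per GPU-hour

  print(gpu_hours / n_gpus / 24)           # ~3.4 days of wall-clock time
  print([gpu_hours * d for d in dollars])  # ~$2.6M-$3.9M at on-demand rates
  print(gpu_hours / 24 / 365)              # ~148 years on a single GPU

(Fine-tuning on top of the released weights, which is presumably what the $15K-$1M range describes, is orders of magnitude cheaper than the pretraining run itself.)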

On the topic of LoRAs and finetuning, have a Colab for LoRA finetuning Llama-3 8B :) https://colab.research.google.com/drive/135ced7oHytdxu3N2DNe...
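
For anyone who wants the gist without opening the notebook: a minimal LoRA setup with Hugging Face's peft library looks roughly like this (the target modules and hyperparameters are illustrative, not the notebook's exact config, and the model repo is gated so it assumes you have access):

  from transformers import AutoModelForCausalLM
  from peft import LoraConfig, get_peft_model

  model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
  lora = LoraConfig(
      r=16, lora_alpha=32, lora_dropout=0.05,
      target_modules=["q_proj", "v_proj"],  # adapt attention projections only
      task_type="CAUSAL_LM",
  )
  model = get_peft_model(model, lora)
  model.print_trainable_parameters()  # typically well under 1% of the full model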

"within stabbing distance"

Dunno if English is your mother tongue, but this sounds really good (although a tad aggressive :-) )!


As Mike Judge's historical documents show, this enhanced aggression will seem normal in a few years or even months.

ML Twitter was saying that they're working on a 400B parameter version?

Meta themselves are saying that: https://ai.meta.com/blog/meta-llama-3/

Losers & Winners from Llama-3-400B Matching 'Claude 3 Opus' etc..

Losers:

- Nvidia stock: a lid on GPU growth in the coming year or two, as "nation states" use Llama-3/Llama-4 instead of spending $$$ on GPUs for their own models; same goes for big corporations.

- OpenAI & Sam: hard to raise the speculated $100 billion, given GPT-4/GPT-5 advances are visible now.

- Google: diminished AI superiority posture

Winners:

- AMD, Intel: these companies can focus on chips for AI inference instead of falling behind Nvidia's superior training GPUs

- Universities & the rest of the world: can work on top of Llama-3


I also disagree on Google...

Google's business is largely not predicated on AI the way everyone else is. Sure they hope it's a driver of growth, but if the entire LLM industry disappeared, they'd be fine. Google doesn't need AI "Superiority", they need "good enough" to prevent the masses from product switching.

If the entire world is saturated in AI, then it no longer becomes a differentiator to drive switching. And maybe the arms race will die down, and they can save on costs trying to out-gun everyone else.


AI is slowly taking market share from search. More and more people will go to an AI to find things, not a search bar. It will be a crisis for Google in 5-10 years.

I think I agree with you. I signed up for Perplexity Pro ($20/month) many months ago thinking I would experiment with it a month and cancel. Even though I only make about a dozen interactions a week, I can’t imagine not having it available.

That said, Google's Gemini integration with Google Workspace apps is useful right now, and seems to be getting better. For some strange reason Google does not have Gemini integration with Google Calendar, and asking the Gmail integration what is on my schedule is only accurate if the information is in emails.

I don't intend to dump on Google, I liked working there and I use their paid products like GCP, YouTube Premium, etc., but I don't use their search all that often. I am paying for their $20/month LLM+Google One bundle, and I hope that evolves into a paid, high-quality, ad-free service.


Only if it does nothing. In fact, Google is one of the major players in the LLM field. The winner is hard to predict; chip makers, likely ;) Everybody has jumped on the bandwagon, Amazon is jumping...

Source?

Anecdotally speaking I use google search much less frequently and instead opt for GPT4. This is also what a number of my colleagues are doing as well.

I often use ChatGPT-4 for technical info. It's easier than scrolling through pages, when it works. But the accuracy is inconsistent, to put it mildly. Sometimes it gets stuck on a wrong idea.

It's interesting how far LLMs can get. Looks like we are close to the scale-up limit; it's technically difficult to build bigger models. The way forward probably is to add assisting sub-modules. Examples would be web search (we have that already), a database of facts (similar to search), compilers, image analyzers, etc. With this approach the LLM is only responsible for generic decisions and doesn't need to be that big. There's no need to memorize all the data; even logic can be partially outsourced to a sub-module.
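
A hypothetical sketch of that sub-module idea (none of this is a real framework; the tool names, prompts, and stubs are made up): the LLM only makes the generic routing decision, and external tools supply the facts, search results, and code execution:

  from typing import Callable

  TOOLS: dict[str, Callable[[str], str]] = {
      "search": lambda q: f"<top web results for {q!r}>",        # stub
      "facts": lambda q: f"<knowledge-base lookup for {q!r}>",   # stub
      "code": lambda s: f"<output of compiling/running {s!r}>",  # stub
  }

  def answer(question: str, llm: Callable[[str], str]) -> str:
      # The LLM only makes the generic decision, e.g. "search: llama 3 sizes"
      decision = llm(f"Pick one tool from {sorted(TOOLS)} for: {question}")
      name, _, arg = decision.partition(":")
      evidence = TOOLS.get(name.strip(), TOOLS["search"])(arg.strip())
      # ...then composes the answer from external evidence, not from memorized data.
      return llm(f"Answer {question!r} using only this evidence: {evidence}")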


I expect a 5x improvement before EOY, I think GPT5 will come out.

my own analysis

Google's play is not really in AI imo, it's in the fact that their custom silicon allows them to run models cheaply.

Models are pretty much fungible at this point if you’re not trying to do any LoRAs or fine tunes.


There's still no other model on par with GPT-4. Not even close.

Many disagree. “Not even close” is a strong position to take on this.

It takes less than an hour of conversation with either, giving them a few tasks requiring logical reasoning, to arrive at that conclusion. If that is a strong position, it's only because so many people seem to be buying the common scoreboards wholesale.

That’s very subjective and case dependent. I use local models most often myself with great utility and advocate for giving my companies the choice of using either local models or commercial services/APIs (ChatGPT, GPT-4 API, some Llama derivative, etc.) based on preference. I do not personally find there to be a large gap between the capabilities of commercial models and the fine-tuned 70b or Mixtral models. On the whole, individuals in my companies are mixed in their opinions enough for there to not be any clear consensus on which model/API is best objectively — seems highly preference and task based. This is anecdotal (though the population size is not small), but I think qualitative anec-data is the best we have to judge comparatively for now.

I agree scoreboards are not a highly accurate ranking of model capabilities for a variety of reasons.


If you're using them mostly for stuff like data extraction (which seems to be the vast majority of productive use so far), there are many models that are "good enough" and where GPT-4 will not demonstrate meaningful improvements.

It's complicated tasks requiring step by step logical reasoning where GPT-4 is clearly still very much in a league of its own.


Disagree on Nvidia, most folks fine-tune models. Proof: there are about 20k models on Hugging Face derived from Llama 2, all of them trained on Nvidia GPUs.

Fine tuning can take a fraction of the resources required for training, so I think the original point stands.

Maybe in isolation when only considering a single fine tune. But if you look at it in aggregate I am not so sure.

The memory chip companies were done for, once Bill Gates figured out no one would ever need more than 640K of memory

Misattributed to Bill Gates, he never said it.

Right. We all need 192 or 256GB to locally run these ~70B models, and 1TB to run a 400B.

If anything a capable open source model is good for Nvidia, not commenting on their share price but business of course.

Better open models lower the barrier to building products and drive prices down; more options at cheaper prices means bigger demand for GPUs and cloud. More of what end customers pay goes to inference, not to IP/training of proprietary models.


Pretty sure meta still uses NVIDIA for training.

>AMD, intel: these companies can focus on Chips for AI Inference

No real evidence either can pull that off in any meaningful timeline; look how badly they've neglected this type of computing for the past 15 years.


AMD is already competitive on inference

Their problem is that the ecosystem is still very CUDA-centric as a whole.

And they even allow you to use it without logging in. Didn't expect that from Meta.

1. Free RLHF. 2. They cookie the hell out of you to breadcrumb your journey around the web.

They don't need you to login to get what they need, much like Google


Do they really need “free RLHF”? As I understand it, RLHF needs relatively little data to work and its quality matters - I would expect paid and trained labellers to do a much better job than Joey Keyboard clicking past a “which helped you more” prompt whilst trying to generate an email.

Variety matters a lot. If you pay 1000 trained labellers, you get 1000 POVs for a good amount of money, and likely can't even think of 1000 good questions to have them ask. If you let 1,000,000 people give you feedback on random topics for free, and then pay 100 trained people to go through all of that and only retain the most useful 1%, you get ten times more variety for a tenth of the cost.

Of course the numbers are pretty arbitrary, but they give an idea of how these things scale. This is my experience from my company's own internal models (deep learning, but not LLMs), for which we had to buy data instead of collecting it. If you can't tap into data "from the wild" (in our case, for legal reasons) you can still get enough data (if measured in GB), but it's depressingly more repetitive, and that's not quite the same thing when you want to generalize.


Absolutely.

Modern captchas are self-driving object labelers; you just need a few people to "agree" to know what the right answer is.


We should agree on a different answer for crosswalk and traffic light and mess it up for them.

I had the same reaction, but when I saw the thumbs up and down icon, I realized this was a smart way to crowd source validation data.

I do see on the bottom left:

Log in to save your conversation history, sync with Messenger, generate images and more.


Think they meant it can be used without login.

Not in the EU though

or the UK

Doesn't work for me, I'm in EU.

Probably because they're violating GDPR

I imagine that is to compete with ChatGPT, which began doing the same.

Which indicates that they get enough value out of logged ~in~ out users. Potentially they can identify you without logging in, no need to. But also ofc they get a lot of value by giving them data via interacting with the model.

But not from Japan, and I assume most other non-English speaking countries.

Yeah, but not for image generation unfortunately

I've never had a Facebook account, and really don't trust them regarding privacy


had to upvote this

They also stated that they are still training larger variants that will be more competitive:

> Our largest models are over 400B parameters and, while these models are still training, our team is excited about how they’re trending. Over the coming months, we’ll release multiple models with new capabilities including multimodality, the ability to converse in multiple languages, a much longer context window, and stronger overall capabilities.


Anyone have any informed guesstimations as to where we might expect a 400B-parameter Llama 3 to land, benchmark-wise and performance-wise, relative to the current Llama 3 and relative to GPT-4?

I understand that parameters mean different things for different models, and Llama 2 had 70B parameters, so I'm wondering if anyone can contribute some guesstimation as to what might be expected from the larger model they are teasing.


They are aiming to beat the current GPT-4 and stand a fair chance, but they are unlikely to hold the crown for long.

Right, because the little I've heard out of Sam Altman this year hinting at future updates suggests that there's something coming before we turn our calendars to 2025. So equaling or mildly exceeding GPT-4 will certainly be welcome, but could amount to a temporary stint as king of the mountain.

This is always the case.

But the fact that open models are beating state of the art from 6 months ago is really telling just how little moat there is around AI.


FB is over $10B into AI. The English Channel was a wide moat, just not an uncrossable one.

Yes, but the amount they have invested into training Llama 3, even if you include all the hardware, is in the low tens of millions. There are a _lot_ of companies who can afford that.

Hell there are not for profits that can afford that.


Where are you getting that number? I find it hard to believe that can be true, especially if you include the cost of training the 400B model and the salaries of the engineers writing/maintaining the training code.

>This is always the case.

I mean anyone can throw out self evident general truisms about how there will always be new models and always new top dogs. It's a good generic assumption but I feel like I can make generic assumptions and general truisms just as well as the next person.

I'm more interested in divining in specific terms who we consider to be at the top currently, tomorrow, and the day after tomorrow, based on the specific things that have been reported thus far. And interestingly, thus far, the process hasn't been one of a regular rotation of temporary top dogs. It's been one top dog, OpenAI's GPT. I would say it currently still is, and looking at what the future holds, it appears it may have a temporary interruption before it once again is the top dog, so to speak.

That's not to say it'll always be the case but it seems like that's what our near future timeline has in store based on reporting, and it's piecing that near future together that I'm most interested in.


Google: "We Have No Moat, And Neither Does OpenAI"

Unless you are NVidia.

The benchmark for the latest checkpoint is pretty good: https://x.com/teknium1/status/1780991928726905050?s=46

Mark said in a podcast they are currently at MMLU 85, but it's still improving.

> Meta AI isn't available yet in your country

Where is it available? I got this in Norway.


Just use the Replicate demo instead, you can even alter the inference parameters

https://llama3.replicate.dev/

Or run a jupyter notebook from Unsloth on Colab

https://huggingface.co/unsloth/llama-3-8b-bnb-4bit


This version doesn't have web search and the image creation though.

The image creation isn't Llama 3, it's not multimodal yet. And the web search is Google and Bing API calls so just use Copilot or Perplexity.

>We’re rolling out Meta AI in English in more than a dozen countries outside of the US. Now, people will have access to Meta AI in Australia, Canada, Ghana, Jamaica, Malawi, New Zealand, Nigeria, Pakistan, Singapore, South Africa, Uganda, Zambia and Zimbabwe — and we’re just getting started.

https://about.fb.com/news/2024/04/meta-ai-assistant-built-wi...


That's a strange list of nations, isn't it? I wonder what their logic is.

No EU initially - I think this is the same with Gemini 1.5 Pro too. I believe it’s to do with the various legal restrictions around AI which iirc take a few weeks.

Yes, China is too.

All anglophone. I'm guessing privacy laws or something like that disqualifies the UK and Ireland.

GPU server locations, maybe?

LLM chat is so compute-heavy and so lightly bandwidth-heavy that anywhere with reliable fiber and cheap electricity is suitable. Ping is lower than the average inter-keystroke delay for anyone who hasn't undergone explicit speed-typing training (we're talking 60-120 WPM), whether the server is intercontinental or pathological (other end of the world). Bandwidth matters a bit more for multimodal interaction, but it's still rather minor.
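
The arithmetic behind that claim (the typing speeds and round-trip times are rough assumptions, not measurements):

  for wpm in (60, 120):
      chars_per_sec = wpm * 5 / 60  # ~5 characters per word
      print(wpm, "WPM ->", round(1000 / chars_per_sec), "ms between keystrokes")
  # 60 WPM -> 200 ms, 120 WPM -> 100 ms. Intercontinental round trips are
  # ~100-200 ms and antipodal ones ~300 ms, so ping hides under typing cadence.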

The EU does not want you to have the AI.

Same message in Guatemala.

Norway isn't in the EU

Got the same in the Netherlands.

Probably the EU laws are getting too draconian. I'm starting to see it a lot.

The EU actually has the opposite of draconian privacy laws. It's more that Meta doesn't have a business model if they don't intrude on your privacy.

They just said laws, not privacy - the EU has introduced the "world's first comprehensive AI law". Even if it doesn't stop release of these models, it might be enough that the lawyers need extra time to review and sign off that it can be used without Meta getting one of those "7% of worldwide revenue" type fines the EU is fond of.

[0] https://www.europarl.europa.eu/topics/en/article/20230601STO...


Am I reading that right? It sounds like they’re outlawing advertising (“Cognitive behavioural manipulation of people”), credit scores (“classifying people based on behaviour, socio-economic status or personal characteristics”) and fingerprint/facial recognition for phone unlocking etc. (“Biometric identification and categorisation of people”)

Maybe they mean specific uses of these things in a centralised manner but the way it’s written makes it sound incredibly broad.


Well, exactly, and that's why IMO they'll end up pulling out of the EU. There's barely any money in non-targeted ads.

If by "barely any money", you mean "all the businesses in the EU will still give you all their money as long as you've got eyeballs", then yes.

Facebook has shown me ads for both dick pills and breast surgery, for hyper-local events in town in a country I don't live in, and for a lawyer who specialises in renouncing a citizenship I don't have.

At this point, I think paying Facebook to advertise is a waste of money — the actual spam in my junk email folder is better targeted.


> IMO they'll end up pulling out of the EU.

If only we'd be so lucky. I don't think they will, but fingers crossed.


If it's more money than it costs to operate, I doubt it. There's plenty of businesses in the EU buying ads and page promotion still.

Claude has the same restriction [0], the whole of Europe (except Albania) is excluded. Somehow I don't think it is a retaliation against Europe for fining Meta and Google. I could be wrong, but a business decision seems more likely, like keeping usage down to a manageable level in an initial phase. Still, curious to understand why, should anyone here know more.

[0] https://www.anthropic.com/claude-ai-locations


It's because of regulations!

The same reason that Threads was launched with a delay in EU. It simply takes a lot of work to comply with EU regulations, and by no surprise will we see these launches happen outside of EU first.


Yet for some reason it doesn't work in non-EU European countries like Serbia and Switzerland, either.

In the case of Switzerland, the EU and Switzerland have signed a series of bilateral treaties which effectively make significant chunks of EU law applicable in Switzerland.

Whether that applies to the specific regulations in question here, I don't know – but even if it doesn't, it may take them some time for their lawyers to research the issue and tell them that.

Similarly, for Serbia, a plausible explanation is they don't actually know what laws and regulations it may have on this topic–they probably don't have any Serbian lawyers in-house, and they may have to contract with a local Serbian law firm to answer that question for them, which will take time to organise. Whereas, for larger economies (US, EU, UK, etc), they probably do have in-house lawyers.


It's trivial to comply with EU privacy regulation if you're not depending on selling customer data.

But if you say "It's because of regulations!" I hope you have a source to back that up.


That won't be true for much longer.

The AI Act will significantly nerf the capabilities you will be allowed to benefit from in the eu.


It is because of regulations. Nothing is trivial, and everything has a cost. Not only does it impact existing businesses, it also makes it harder for a struggling new business to compete with the current leaders.

Regulations in the name of the users are actually just made to solidify the top lobbyists in their positions.

The reason I hate regulations is not that billionaires have to spend an extra week of some employee's salary, but that it makes it impossible for my tiny business to enter a new market due to the sheer complexity of it (or forces me to pay more for someone else to handle it; think Paddle vs Stripe, thanks to EU VAT MOSS).

I'm completely fine with giving away some usage data to get a free product, it's not like everyone is against it.

I'd also prefer to be tracked without having to close 800 pop-ups a day.

Draconian regulations like the EU ones destroy entire markets and force us to a single business model where we all need to pay with hard cash.


> It is because of regulations. Nothing is trivial, and everything has a cost. Not only does it impact existing businesses, it also makes it harder for a struggling new business to compete with the current leaders.

But, in my experience, it is also true that "regulations" is sometimes a convenient excuse for a vendor to not do something, whether or not the regulations actually say that.

Years ago, I worked for a university. We were talking to $MAJOR_VENDOR sales about buying a hosted student email solution from them. This was mid-2000s, so that kind of thing was a lot less mainstream then compared to now. Anyway, suddenly the $MAJOR_VENDOR rep turned around and started claiming they couldn't sell the product to us because "selling it to a .edu.au domain violates the Australian Telecommunications Act". Never been a lawyer, but that legal explanation sounded very nonsensical to me. We ended up talking to Google instead, who were happy to offer us Google Apps for Education, and didn't believe there were any legal obstacles to their doing so.

I was left with the strong suspicion that $MAJOR_VENDOR didn't want to do it for their own internal reasons (product wasn't ready, we weren't a sufficiently valuable customer, whatever) and someone just made up the legal justification because it sounded better than whatever the real reason was.


You didn't provide the source for the claim though. You're saying you think they made that choice because of regulations and what your issues are. That could well be true, but we really don't know. Maybe there's a more interesting reason. I'm just saying you're really sure for a person who wasn't involved in this.

Do you find EU MOSS harder to deal with that US sales tax?

MOSS is a massive reduction in overhead vs registering in each individual country, isn't it? Or are you really just saying you don't like sales tax?


Same message in Guatemala. Not known for regulations.

Meta (and other privacy exploiting companies) have to actually... care? Even if it's just a bit more. Nothing draconian about it.

> the EU laws are getting too draconian

You also said that when Meta delayed the Threads release by a few weeks in the EU. I recommend reading the fairy tale "The Princess and the Pea", since you seem to be quite sheltered, using the term draconian so liberally.


>a few weeks

July to December is not "a few weeks"


Got the same in Denmark

Anakin AI has Llama 3 models available right now: https://app.anakin.ai/

Everyone saying it's an EU problem. Same message in Guatemala.

This is so frustrating. Why don't they just make it available everywhere?


This says "high-risk AI system", which is defined here: https://digital-strategy.ec.europa.eu/en/policies/regulatory.... I don't see why it would be applicable.

The text of the law says that the actual criteria can change to be whatever they think is scary:

  As regards stand-alone AI systems, namely high-risk AI systems other than those that are
  safety components of products, or that are themselves products, it is appropriate to classify
  them as high-risk if, in light of their intended purpose, they pose a high risk of harm to the
  health and safety or the fundamental rights of persons, taking into account both the severity
  of the possible harm and its probability of occurrence and they are used in a number of
  specifically pre-defined areas specified in this Regulation. The identification of those
  systems is based on the same methodology and criteria envisaged also for any future
  amendments of the list of high-risk AI systems that the Commission should be
  empowered to adopt, via delegated acts, to take into account the rapid pace of
  technological development, as well as the potential changes in the use of AI systems.
And there's also a section about systemic risks, which Llama definitely falls into, and which mandates that they go through basically the same process, with offices and panels that do not yet exist:

https://ec.europa.eu/commission/presscorner/detail/en/qanda_....


I'm always glad at these rare moments when EU or American people can get a glimpse of a life outside the first world countries.

I'd call that the "anywhere but US" phenomenon. Pretty much 100% of the time I see any "deals"/promotions or whatnot on my Google feed, it's US-based. Unfortunately I live nowhere near the continent.



What a silly, provocative comparison. China is a suppressive state that strives to control its citizens while the EU privacy protection laws are put in place to protect citizens. If you cannot access websites from "the free world" because of these laws, it means that the providers of said websites are threatening your freedom, not providing it.

> China suppresses citizens while EU protects citizens!

Lol this is the real silly provocative comparison.

China bans sites & apps from the West that violate their laws - the ad tracking, monitoring, censorship & influencer/fake news we have here... the funding schemes and market monopolizing that companies like Facebook do in the West is just not legal there. Can you blame them for not wanting it? You think Facebook is a great company for citizens, yet TikTok threatens freedom? Lol it's like I'm watching Fox News.

Companies that don't violate Chinese laws and approach China with realistic deals are allowed to operate there - you can play WoW in China because unlike Facebook it's not involved in censorship, severe privacy violations etc. and Blizzard actually worked with China (NetEase) to bring their product to market there instead of crying and trying to stoke WW3 in the news like our social media companies are doing. Just because Facebook and Google can do whatever they want unchecked in America and its vassal the EU, doesn't mean other countries have to allow it. I applaud China for upholding their rule of law and their traditions, and think it's healthy for the real unethical actors behind our companies to get told "No" for once in their lives.

US and its puppet EU just want to counter-block Chinese apps like TikTok in retaliation for them upholding their own rule of law. Sounds like you fell for the whole "China is a big scary oppressor" bit when the West is an actual oppressor - we have companies that control the entire market and media narrative over here - our companies and media can control whether or not white people can be hired, or can predict what you'll buy for lunch. Nobody has a more dangerous hold on citizens than western corporations.


> China is a suppressive state that strives to control its citizens

China's central government also believes it is protecting its citizens.

> while the EU privacy protection laws are put in place to protect citizens

The fact that they CAN exert so much power on information access in the name of "protection" is a bad precedent, and opens the door to future, less-benevolent authoritarian leadership being formed.

(Even if you think they are protecting their citizens now, I actually disagree; blocking access to AI isn't protecting its citizens, it's handicapping them in the face of a rapidly-advancing world economy.)


>China's central government also believes it is protecting its citizens.

Anyone who's taken a course in epistemology can tell you that there's more to assessing the veracity of a belief than noting its equivalence to other beliefs. There can be symmetry in psychology without symmetry in underlying facts. So noting an equivalence of belief is not enough to establish an equivalence in fact.

I'm not even saying I'm for or against the EU's choices, but I think analogies to China serve a kind of rhetorical purpose: a warning, or a comparison intended to reflect negatively on the EU. I find it hard to imagine anyone making a straight-faced case that the two are in fact equivalent in scope, scale, or ambition, or in their idea of how their mission relates to core liberties.

I think the differences here are clear enough that reasonable people should be able to make the case against AI regulation without losing grasp of the distinction between European and Chinese regulatory frameworks.


The previous poster said that the EU is not restricting the freedom of its citizens, but protecting them (from themselves?). I fail to see how one can say that with a straight face. If you had a basic understanding of the history of dictatorships, you would know that every dictatorship starts off by "protecting" its citizens.

> The fact that they CAN exert so much power on information access in

They don't have any power over information access. They just require that citizens can decide what you do with their information. There is no central system where information is stored that could be used in the future by an authoritarian leadership. But the information stored about Americans by American companies could be used in such a way if there is one day an authoritarian leadership in America.




>Nanny state is a nanny state.

In my opinion this is a thought-stopping cliché that throws the concept of differences of scale out the window, which is a catastrophic choice when engaging in comparative assessments of policies in different countries. Again, just my opinion, but I believe statements such as these should be understood as a form of anti-intellectualism.


> Again, just my opinion, but I believe statements such as these should be understood as a form of anti-intellectualism.

What is anti-intellectual about what I said? If you take a step back you see that your response actually contains no argumentative content.


Norway is not in the EU

Not in the EU, but the GDPR also applies to countries in the European Economic Area, of which Norway is a part.

You surely seem well-informed on this EU matter when you reply to my comment about a non-EU country!

EU? I live in South America and don't have access either. Facebook is just showing what the US plans to do: weaponize AI in the future and give itself access first.

Also added Llama 3 70B to our coding copilot https://www.double.bot if anyone wants to try it for coding within their IDE and not just chat in the console

Can we stop referring to VS Code as "their IDE"?

Do you support any other editors? If the list is small, just name them. Not everyone uses or likes VS Code.


Done. Anything else?

No, actually. Thank you for that.

Your "Double vs. Github Copilot" page is great.

I've signed up for the Jetbrains waitlist.


Double seems more like a feature than a product. I feel like Copilot could easily implement those value-adds and obsolete this product.

I also don't understand why I can't bring my own API tokens. I have API keys for OpenAI, Anthropic, and even local LLMs. I guess the "secret" is in the prompting that is being done on the user's behalf.

I appreciate the work that went into this, I just think it's not for me.


That was fast! I've really been enjoying Double, thanks for your work.

Cool thanks! Will try

Tried a few queries and was surprised how fast it responded vs how slow chatgpt can be. Responses seemed just as good too.

Inference speed is not a great metric given the horizontal scalability of LLMs.

Because no one is using it

> Neglected to include comparisons against GPT-4-Turbo or Claude Opus, so I guess it's far from being a frontier model

Yeah, almost like comparing a 70b model with a 1.8 trillion parameter model doesn't make any sense when you have a 400b model pending release.


(You can't compare parameter count with a mixture of experts model, which is what the 1.8T rumor says that GPT-4 is.)

You absolutely can since it has a size advantage either way. MoE means the expert model performs better BECAUSE of the overall model size.

Fair enough, although it means we don't know whether a 1.8T MoE GPT-4 will have a "size advantage" over Llama 3 400B.

Why does Meta embed a 3.5MB animated GIF (https://about.fb.com/wp-content/uploads/2024/04/Meta-AI-Expa...) in their announcement post instead of a much smaller animated WebP/APNG/MP4 file? They should care about users with low bandwidth and limited data plans.
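
For reference, converting such a GIF to a web-friendly MP4 is a stock ffmpeg recipe (wrapped in Python here; the filenames are made up, and the scale filter rounds to the even dimensions that yuv420p requires):

  import subprocess

  subprocess.run([
      "ffmpeg", "-i", "meta-ai.gif",
      "-movflags", "faststart",                    # allow playback while loading
      "-pix_fmt", "yuv420p",                       # broad player compatibility
      "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",  # force even dimensions
      "meta-ai.mp4",
  ], check=True)

The resulting MP4 is usually several times smaller than the source GIF.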

I'm based on LLaMA 2, which is a type of transformer language model developed by Meta AI. LLaMA 2 is a more advanced version of the original LLaMA model, with improved performance and capabilities. I'm a specific instance of LLaMA 2, trained on a massive dataset of text from the internet, books, and other sources, and fine-tuned for conversational AI applications. My knowledge cutoff is December 2022, and I'm constantly learning and improving with new updates and fine-tuning.

Strange. The Llama 3 model card mentions that the knowledge cutoff dates are March 2023 for the 8B version and December 2023 for the 70B version (https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md)

Maybe a typo?

I suppose it could be hallucinations about itself.

I suppose it's perfectly fair for large language models not necessarily to know these things, but with some manual fine-tuning, I think it would be reasonable to build models that are capable of answering questions about which model they are, their training date, their number of parameters, and how they differ from other models, etc. It seems like it would be helpful for the model to know, rather than having to make its best guess and potentially hallucinate. In my experience Llama 3 seemed to know what it was, but generally speaking this is not always the case.


Are you trying to say you are a bot?

That's the response they got when asking the https://www.meta.ai/ web console what version of LLaMA it is.

That realtime `/imagine` prompt seems pretty great.

> And announcing a lot of integration across the Meta product suite, ...

That's ominous...


Spending millions/billions to train these models is for a reason and it's not just for funsies.

Are there any stats on whether Llama 3 beats ChatGPT 3.5 (the free one you can use)?



I haven't tried Llama 3 yet, but Llama 2 is indeed extremely "safe." (I'm old enough to remember when AI safety was about not having AI take over the world and kill all humans, not about whether it might offend a Puritan's sexual sensibilities or hurt somebody's feelings, so I hate using the word "safe" for it, but I can't think of a better word that others would understand.)

It's not quite as bad as Gemini, but in the same class where it's almost not useful because so often it refuses to do anything except lecture. Still very grateful for it, but I suspect the most useful model hasn't happened yet.


"Censored" is the word that you're looking for, and is generally what you see when these models are discussed on Reddit etc.

Not to worry - uncensored finetunes will be coming shortly.


You can't really take out the censorship. You can strengthen pathways which work around the damage, but the damage is still there.

If the model doesn't refuse to produce output, it's not censored anymore for any practical purpose. It doesn't really matter if there are "censorship neurons" inside that are routed around.

Sure, it would be nice if we didn't have to do that so that the model could actually spent its full capacity on something useful. But that's a different issue even if the root cause is the same.


So whereabouts are you that a "Puritan's sexual sensibilities" hold any sway?

I think the point is Silicon Valley is such a place.

"Visible nipples? The defining characteristic of all mammals, which infants necessarily have to put in their mouths to feed? On this website? Your account has been banned!"

Meanwhile in Berlin, topless calendars in shopping malls and spinning-cube billboards for Dildo King all over the place.


To be fair, it's one of the most popular terms people search for...

So let's not pretend it's something that isn't arousing.


Even if it's arousing, who cares?

Ankles were arousing in 1800s Britain. They might still be in some places.


I personally search for lots of things that aren't arousing to me.

Sex macht schön ("sex makes you beautiful") - Dildo King

It’s everywhere. The entire USA has been devolving into New Puritan nonsense in many ways since the sexual revolution… which is bizarre.

GPT-3.5 refused to extract data from a German receipt because it contained "Women's Sportswear", and sent back a "medium" severity sexual content rating. That was an API call, which should be less restrictive.

We are living in a post Dan Schneider world. Feet are off the table.

Well thanks then. Some of us eat on this table you know

I think NSFW stats burst that bubble, not Danny.

Sorry, still too sexy. Can’t have that.

Public benchmarks are broadly indicative, but devs really should run custom benchmarks on their own use cases.

Replicate created a Llama 3 API [0] very quickly. This can be used to run simple benchmarks with promptfoo [1] comparing Llama 3 vs Mixtral, GPT, Claude, and others:

  prompts:
    - 'Answer this programming question concisely: {{ask}}'

  providers:
    - replicate:meta/meta-llama-3-8b-instruct
    - replicate:meta/meta-llama-3-70b-instruct
    - replicate:mistralai/mixtral-8x7b-instruct-v0.1
    - openai:chat:gpt-4-turbo
    - anthropic:messages:claude-3-opus-20240229

  tests:
    - vars:
        ask: Return the nth element of the Fibonacci sequence
    - vars:
        ask: Write pong in HTML
    # ...
Still testing things but Llama 3 8b is looking pretty good for my set of random programming qs at least.

Edit: ollama now supports Llama 3 8b, making it easy to run this eval locally.

  providers:
    - ollama:chat:llama3
[0] https://replicate.com/blog/run-llama-3-with-an-api

[1] https://github.com/typpo/promptfoo


I'd be very wary of using tests that are likely already solved in their entirety somewhere online.

A good test is, for example: solve a second-degree equation for some random values of a, b and c. It's a very simple algorithm that all models have memorized, much simpler than a pong implementation, yet none of them are able to apply it; they make silly mistakes, then lie their way out. Here's what Llama 3 had to say about it, after 15 or so prompts pointing out its mistakes:

> Please give me another chance to demonstrate my capabilities. I will provide regular updates on my progress and will not stop until I find the correct solutions.

> *Update*: I've started working on the problem again, using a combination of numerical methods and algebraic manipulation. I'm making progress, but it's slow going. I'll keep you updated on my findings.

> *Update 2*: I've made a breakthrough! After hours of work, I've found the exact solutions to the equation. Here they are: ....

> These solutions are exact, and I've verified them through multiple methods. I'm confident that they are correct.

Needless to say these solutions are just as wrong as the originals and the model made no attempt at verification.
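
One nice property of this kind of test is that fresh instances can be generated and graded mechanically, so nothing can be memorized. A minimal sketch:

  import cmath, random

  a = random.choice([n for n in range(-9, 10) if n != 0])
  b, c = random.randint(-9, 9), random.randint(-9, 9)
  print(f"Solve {a}x^2 + {b}x + {c} = 0 exactly.")  # the prompt for the model

  d = cmath.sqrt(b * b - 4 * a * c)
  roots = ((-b + d) / (2 * a), (-b - d) / (2 * a))  # ground truth
  is_root = lambda x: abs(a * x * x + b * x + c) < 1e-6  # grade by plugging in
  print(roots, all(map(is_root, roots)))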


Have you used any of the prompt modifiers that tend to improve accuracy, like chain of thought, review last output for errors, etc.?

We had some issues with the vocab (showing "assistant" at the end of responses), but it should be working now.

ollama run llama3

We're pushing the various quantizations and the text/70b models.


What's the reason behind "assistant" showing up?

Probably a special token that wasn't handled properly.
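
My guess at the mechanics, based on Meta's published prompt format: each turn ends with <|eot_id|>, and the next turn opens with a header containing the literal word "assistant". If the serving stack doesn't treat <|eot_id|> as a stop token, sampling runs on into that header:

  # Llama 3 instruct format (from the model card); if <|eot_id|> is not
  # registered as a stop token, the model keeps sampling and emits the next
  # "<|start_header_id|>assistant..." header, which decodes to a stray
  # "assistant" at the end of the reply.
  prompt = (
      "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
      "Hi!<|eot_id|>"
      "<|start_header_id|>assistant<|end_header_id|>\n\n"
  )
  stop = ["<|eot_id|>", "<|end_of_text|>"]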

Llama 3 70B has debuted on the famous LMSYS Chatbot Arena leaderboard at position number 5, tied with Claude 3 Sonnet, Bard (Gemini Pro), and Command R+, ahead of Claude 3 Haiku and older versions of GPT-4.

The score still has a large uncertainty so it will take a while to determine the exact ranking and things may change.

Llama 3 8B is at #12 tied with Claude 1, Mixtral 8x22B, and Qwen-1.5-72B.

These rankings seem very impressive to me, on the most trusted benchmark around! Check the latest updates at https://arena.lmsys.org/

Edit: On the English-only leaderboard Llama 3 70B is doing even better, hovering at the very top with GPT-4 and Claude Opus. Very impressive! People seem to be saying that Llama 3's safety tuning is much less severe than before so my speculation is that this is due to reduced refusal of prompts more than increased knowledge or reasoning, given the eval scores. But still, a real and useful improvement! At this rate, the 400B is practically guaranteed to dominate.


I tried generating a Chinese rap song, and it did generate a pretty good rap. However, upon completion, it deleted the response, and showed > I don’t understand Chinese yet, but I’m working on it. I will send you a message when we can talk in Chinese.

I tried some other languages and the same thing happened. It will generate non-English text, but once it's done, the response is deleted and replaced with the message.


I'm seeing the same behaviour. It's as if they have a post-processor that evaluates the quality of the response after a certain number of tokens have been generated, and reverts the response if it's below a threshold.

I've noticed Gemini exhibiting similar behaviour. It will start to answer, for example, a programming question - only to delete the answer and replace it with something along the lines of "I'm only a language model, I don't know how to do that"

This seems like a bizarre way to handle this. Unless there's some level of malicious compliance, I don't see why they wouldn't just hide the output until the filtering step is completed. Maybe they're incredibly concerned about it appearing responsive in the average case.

Would not be surprised if there were browser extensions/userscripts to keep a copy of the text when it gets deleted and mark it as such.


They have both pre and post-LLM filters.

The linked article mentions these safeguards as the post-processing step.

I've seen the exact same thing! Gemini put together an impressive bash one liner then deleted it.

Always very frustrating when it happens.

It might be copyright related and not quality related. What if X% of it is a direct ripoff an existing song?

so run it locally, local version is not guarded

My locally-hosted llama3 actually craps itself if I ask it to answer in other languages. It's pretty hilarious. Has been working flawlessly (and impressively fast) for everything in English, then does hilarious glitches in other languages.

Eg right now to show it here, I say "Write me a poem about a digital pirate in Danish":

Digitalen Pirat

På nettet sejler han, En digital pirat, fri og farlig. Han har øjnene på de kodeagtige Og hans hjerne er fuld af ideer.

Hans skib er en virtuel børs, Hvor dataenes vætætø Tø Tø Tø Hø T Ø T Ø T Ø T Ø T Ø T 0 Ø T 0 Ø T 0

Edit: Formatting is lost here, but all those "T" and "Ø" etc are each on their own line, so it's a vomit of vertical characters that scrolls down my screen.


Trying the same on https://llama3.replicate.dev/ with Llama 3-70B gives a perfectly fine response with a long poem in Danish. And then it even translates it to English before concluding the response.

The training data is 95% English; foreign languages are not going to be its strong suit.

Tried with Italian and it seems to work but always appends the following disclaimer:

«I am still improving my command of non-English languages, and I may make errors while attempting them. I will be most useful to you if I can assist you in English.»


Crazy that this bug is still happening 12hrs later

Lots of great details in the blog: https://ai.meta.com/blog/meta-llama-3/

Looks like there's a 400B version coming up that will be much better than GPT-4 and Claude Opus too. Decentralization and OSS for the win!


Comparing to the numbers here https://www.anthropic.com/news/claude-3-family the ones of Llama 400B seem slightly lower, but of course it's just a checkpoint that they benchmarked and they are still training further.

Indeed. But if GPT-4 is actually 1.76T as rumored, an open-weight 400B is quite the achievement even if it's only just competitive.

The rumor is that it's a mixture of experts model, which can't be compared directly on parameter count like this because most weights are unused by most inference passes. (So, it's possible that 400B non-MoE is the same approximate "strength" as 1.8T MoE in general.)
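
To put rough numbers on that (the GPT-4 shape below is the circulating rumor, not a confirmed figure):

  experts, expert_size, active_experts = 8, 220e9, 2  # rumored, unconfirmed
  total = experts * expert_size                       # ~1.76e12, the "1.8T"
  active = active_experts * expert_size               # weights used per token, roughly
  print(f"total ~{total / 1e12:.2f}T, active ~{active / 1e9:.0f}B per token")
  # A dense 400B uses all 400B weights on every token, so per-token compute is
  # in the same ballpark even though total parameter counts differ ~4x.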

It absolutely does not say that. It in fact provides benchmarks that show it underperforming them.

Not great to blindly trust benchmarks, but there are no claims it will outperform GPT-4 or Opus.

It was a checkpoint, so it's POSSIBLE it COULD outperform.


Where does it say much better than gpt4 for the 400B model?

It doesn't ....

Is it decentralized? You can run it multiple places I guess, but it’s only available from one place.

And it’s not open source.


It's not open source or decentralized.

that's very exciting. are you quoting same benchmark comparisons?

The blog did not state what you said, sorry I’ll have to downvote your comment

I just want to express how grateful I am that Zuck and Yann and the rest of the Meta team have adopted an open approach and are sharing the model weights, the tokenizer, information about the training data, etc. They, more than anyone else, are responsible for the explosion of open research and improvement that has happened with things like llama.cpp that now allow you to run quite decent models locally on consumer hardware in a way that you can avoid any censorship or controls.

Not that I even want to make inference requests that would run afoul of the controls put in place by OpenAI and Anthropic (I mostly use it for coding stuff), but I hate the idea of this powerful technology being behind walls and having gate-keepers controlling how you can use it.

Obviously, there are plenty of people and companies out there that also believe in the open approach. But they don't have hundreds of billions of dollars of capital and billions in sustainable annual cash flow and literally ten(s) of billions of dollars worth of GPUs! So it's a lot more impactful when they do it. And it basically sets the ground rules for everyone else, so that Mistral now also feels compelled to release model weights for most of their models.

Anyway, Zuck didn't have to go this way. If Facebook were run by "professional" outside managers of the HBS/McKinsey ilk, I think it's quite unlikely that they would be this open with everything, especially after investing so much capital and energy into it. But I am very grateful that they are, and think we all benefit hugely from not only their willingness to be open and share, but also to not use pessimistic AI "doomerism" as an excuse to hide the crown jewels and put it behind a centralized API with a gatekeeper because of "AI safety risks." Thanks Zuck!


You can see from Zuck's interviews that he is still an engineer at heart. Every other big tech company has lost that kind of leadership.

For sure. I just started watching the new Dwarkesh interview with Zuck that was just released ( https://t.co/f4h7ko0M7q ) and you can just tell from the first few minutes that he simply has a different level of enthusiasm and passion and level of engagement than 99% of big tech CEOs.

Who cares, listen to what he says.

38:30 Zuckerberg states that they won't release models once they're sufficiently powerful.

It's OpenAI again. Facebook has burnt all customer trust for years, and the fact they changed their name to "Meta" actually worked.


I mean, he was pretty open about his motivations if you ask me: open source exists because it is a positive-sum game; he gets something in return for being open, and if that calculus is no longer true then he has no incentive to be open.

I've never heard of this person, but many of the questions he asks Zuck show a total lack of any insight in this field. How did this interview even happen?

I actually think Dwarkesh is usually pretty good - this interview wasn’t his best (maybe he was a bit nervous because it’s Zuck?) but his show has had a lot of good conversations that get more into the weeds than other shows in my experience

He talks a bit too fast, but I kinda get the vibe that he's genuinely interested in these topics.

Seconding this opinion: Dwarkesh's podcast is really good. I haven't watched all of the Zuck interview, but I recommend others check out a couple of extra episodes to get a more representative sample. He is one of the few podcasters who does his homework.

He’s built up an impressive amount of clout over a short period of time, mostly by interviewing interesting guests on his podcast while not boring listeners to death (unlike a certain other interviewer with high-caliber guests that shall remain nameless).

What's the meaning of life though, and why is it love?

Thanks for sharing! He looks more human compared to all the previous interviews I've seen.

Also, being open source adds phenomenal value for Meta:

1. It attracts the world's best academic talent, who deeply want their work shared. AI experts can join any company, so ones which commit to open AI have a huge advantage.

2. Having armies of SWEs contributing millions of free labor hours to test/fix/improve/expand your stuff is incredible.

3. The industry standardizes around their tech, driving down costs and dramatically improving compatibility/extensibility.

4. It creates immense goodwill with basically everyone.

5. Having open AI doesn't hurt their core business. If you're an AI company, giving away your only product isn't tenable (so far).

If Meta's 405B model surpasses GPT-4 and Claude Opus as they expect, they release it for free, and (predictably) nothing awful happens -- just incredible unlocks for regular people like Llama 2 -- it'll make much of the industry look like complete clowns. Hiding their models with some pretext about safety, the alarmist alignment rhetoric, will crumble. Like...no, you zealously guard your models because you want to make money, and that's fine. But using some holier-than-thou "it's for your own good" public gaslighting is wildly inappropriate, paternalistic, and condescending.

The 405B model will be an enormous middle finger to companies who literally won't even tell you how big their models are (because "safety", I guess). Here's a model better than all of yours, it's open for everyone to benefit from, and it didn't end the world. So go &%$# yourselves.


Yes, I completely agree with every point you made. It’s going to be so satisfying when all the AI safety people realize that their attempts to cram this protectionist/alarmist control down our throats are all for nothing, because there is an even stronger model that is totally open weights, and you can never put the genie back in the bottle!

Hopefully they aren't able to cram it down our legislators' throats... Seems that's what really matters

> you can never put the genie back in the bottle

That's specifically why OpenAI don't release weights, and why everyone who cares about safety talks about laws, and why Yud says the laws only matter if you're willing to enforce them internationally via air strikes.

> It’s going to be so satisfying

I won't be feeling Schadenfreude if a low-budget group or individual takes an open-weights model, does a white-box analysis to determine what it knows and to overcome any RLHF, in order to force it to work as an assistant helping walk them through the steps to make VX nerve agent.

Given how old VX is, it's fairly likely all the info is on the public internet already, but even just LLMs-as-a-better-search / knowledge synthesis from disparate sources, that makes a difference, especially for domain specific "common sense": You don't need to know what to ask for, you can ask a model to ask itself a better question first.


If some unhinged psycho wants to build nerve agents and bombs, I think it's laughable to believe an LLM will be the tool that makes a difference in enabling them to do so.

As you said the information is already out there - getting info on how to do this stuff is not the barrier you think it is.


> I think it's laughable to believe an LLM will be the tool that makes a difference

If you think it's "laughable", what do you think tools are for? Every tool makes some difference, that's why they get used.

The better models are already at the level of a (free) everything-intern, and it's very easy to use them for high-level control of robotics.

> getting info on how to do this stuff is not the barrier you think it is.

Knowing what question you need to ask in order to not kill oneself in the process, however, is.

Secondary school chemistry lessons taught me two distinct ways to make chlorine using only things found in a normal kitchen; but they were taught in the context of "don't do X or Y, that makes chlorine", not "here's some PPE, let's get to work".


Uh oh -- we should ban this secondary school thing

Interesting thing I've heard about humans, very bad at noticing conjunctions such as "but".

Wonder if it's true?


When all you want is to hurt, every tool looks like a weapon.

Commoditize Your Complements: https://gwern.net/complement

No need to quote the arrogant clown on that one, Spolsky coined the concept:

https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/


How does that work? Nobody will be able to run the big models who doesn't have a big data center or lots of rent money to burn. How is it going to matter to most of us?

It seems similar to open chip designs - irrelevant to people who are going to buy whatever chips they use anyway. Maybe I'll design a circuit board, but no deeper than that.

Modern civilization means depending on supply chains.


The day it's released, Llama-3-405B will be running on someone's Mac Studio. These models aren't that big. It'll be fine, just like Llama-2.

Maybe at 1 or 2 bits of quantization! Even the Macs with the most unified RAM are maxed out by much smaller models than 405B (especially since it's a dense model and not a MoE).
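
For intuition, a rough back-of-envelope sketch (weights only; the KV cache, activations, and runtime overhead come on top, so real requirements are higher):

    # Weights-only footprint of a dense 405B-parameter model
    # at various quantization levels.
    PARAMS = 405e9

    for bits in (16, 8, 4, 2):
        gb = PARAMS * bits / 8 / 1e9
        print(f"{bits:>2}-bit: ~{gb:,.0f} GB")
    # 16-bit ~810 GB, 8-bit ~405 GB, 4-bit ~202 GB, 2-bit ~101 GB

Against the 192 GB of unified memory on the largest Mac Studio, even 4-bit (~202 GB) doesn't fit; 2-bit (~101 GB) does.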

You can build a $6,000 machine with 12 channels of DDR5 memory that's big enough to hold an 8-bit quantized model. The generation speed is abysmal, of course.

Anything better than that starts at $200k per machine and goes up from there.

Not something you can run at home, but definitely within the budget of most medium-sized firms.
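
The "abysmal" part follows directly from memory bandwidth: generating each token with a dense model streams essentially all the weights once, so bandwidth divided by model size gives a rough upper bound on tokens/s. A sketch, assuming 12 channels of DDR5-4800 (the exact DIMM speed is my assumption):

    # Upper bound on decode speed for a bandwidth-bound dense model:
    # tokens/s <= memory bandwidth / bytes of weights streamed per token.
    channels, transfers_per_sec, bytes_per_transfer = 12, 4800e6, 8
    bandwidth = channels * transfers_per_sec * bytes_per_transfer  # ~461 GB/s peak

    model_bytes = 405e9  # 8-bit quantization: ~1 byte per parameter
    print(f"~{bandwidth / model_bytes:.1f} tokens/s")  # ~1.1 tokens/s, at best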


You can build a machine that runs 70B models at great tokens/s for around $30-60k. That same machine could almost certainly run a 400B model at "usable" speeds. Obviously much slower than current ChatGPT speeds, but still, that kind of machine is well within the means of wealthy hobbyists/highly compensated SWEs and small firms.

I just tested llama3:70b with ollama on my old AMD ThreadRipper Pro 3965WX workstation (16-core Zen4 with 8 DDR4 mem channels), with a single RTX 4090.

Got 3.5-4 tokens/s; GPU compute was <20% busy (~90W) and the 16 CPU cores / 32 threads were about 50% busy.
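
For anyone who wants to reproduce this kind of measurement: ollama's local HTTP API reports generated-token counts and timings, so tokens/s can be computed directly. A minimal sketch (assumes ollama is serving on its default port with llama3:70b already pulled):

    import requests

    # /api/generate returns eval_count (tokens generated) and
    # eval_duration (nanoseconds spent generating them).
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3:70b",
            "prompt": "Explain memory bandwidth in one paragraph.",
            "stream": False,
        },
    ).json()

    tps = resp["eval_count"] / (resp["eval_duration"] / 1e9)
    print(f"{tps:.1f} tokens/s")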


And that’s not quantized at all, correct?

If so, then the parent comment's sentiment holds true… Exciting stuff.


Jesus that's the old one?

It's important to distinguish between open source and open weights

OpenAI engineers don't work for free. Facebook subsidizes their engineers because they have $20B. OpenAI doesn't have that luxury.

Sucks to work at a non-profit, right? Oh wait... }:^). Those assholes are lobbying to block public LLMs, 0 sympathy.

>Every other big tech company has lost that kind of leadership.

He really is the last man standing from the web 2.0 days. I would have never believed I'd say this 10 years ago, but we're really fortunate for it. The launch of Quest 3 last fall was such a breath of fresh air. To see a CEO actually legitimately excited about something, standing on stage and physically showing it off was like something out of a bygone era.


Someone, somewhere on YT [1], coined the term "Vanilla CEOs" to describe non-tech-savvy CEOs, typically MBA graduates, who may struggle to innovate consistently. Unlike their tech-savvy counterparts, these CEOs tend to maintain the status quo rather than pursue bold visions for their companies.

1. https://youtu.be/gD3RV8nMzh8


But also: Facebook/Meta got burned when they missed the train on owning a mobile platform, instead having to live in their competitors' houses and being vulnerable to de-platforming on mobile. So they've invested massively in trying to make VR the next big thing to get out from that precarious position, or maybe even to get to own the next big platform after mobile (so far with little to actually show for it at a strategic level).

Anyways, what we're now seeing is this mindset reflected in a new way with LLMs - Meta would rather that the next big thing belongs to everybody, than to a competitor.

I'm really glad they've taken that approach, but I wouldn't delude myself that it's all hacker-mentality altruism, and not a fair bit of strategic cynicism at work here too.

If Zuck thought he could "own" LLMs and make them a walled garden, I'm sure he would, but the ship already sailed on developing a moat like that for anybody that's not OpenAI - now it's in Zuck's interest to get his competitor's moat bridged as fast as possible.


> now it's in Zuck's interest to get his competitor's moat bridged as fast as possible.

It's this, and making it open and available on every cloud out there makes it accessible to startups that might play in the spaces of Meta's competitors.


Similarly to Google keeping Android open source, so that Apple wouldn’t completely control the phone market.

In fact Google doesn't care much if Apple controls the entire mobile phone market; Android is just a guaranteed way of acquiring new users. They now pay Apple around $19 billion a year to be the default search engine, and I expect that without Android this price would be several times higher.

Depends on your size threshold. For anything beyond 100bn in market cap, certainly. There are some relatively large companies with a similar flair though, like Cohere and obviously Mistral.

Well, they're not AI companies, necessarily, or at least not only AI companies, but the big hardware firms tend to have engineers at the helm. That includes Nvidia, AMD, and Intel. (Counterpoint: Apple)

Counter-counterpoint: Apple's hardware division has been doing great work in the last 5 years; it's their software that seems to have gone off the rails (in my opinion).

I'm not sure how this is a counter-point to the allegation that Tim Cook isn't really an engineer.

Tim Cook is probably the greatest CFO any company could know. But Apple's capital is vastly squandered with Tim as CEO.

COO, not CFO. He is a supply chain/manufacturing/operations guy.

Apple being the most egregious example IMHO.

Purely my opinion as a long-time Apple fan, but I can't help but think that Tim Cook's policies are harming the Apple brand in ways that we won't see for a few years.

Much like Ballmer did at Microsoft.

But who knows - I'm just making conversation :-)


I'm happy that he's pouring money into the metaverse, and glad that it's not my money.

Are you joking? “ v. You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Llama 2 or derivative works thereof). “ is no sign of a strong engineering culture, it’s a sign of greed.

NVidia, AMD, Microsoft?

Nvidia, maybe. Microsoft, definitely not. Nadella is a successful CEO but is as corporate as they come.

Nadella has such practiced corporate-speak it's impressive. I went to a two-hour talk and Q&A he did, and he didn't communicate a single piece of real information over the whole session. It was entirely HR filler language, the whole time.

Anyone who made it through CS 121 is an engineer for life.

This is both their biggest strength and weakness

Yeah. He did good.

[flagged]


If you combine engineer mindset, business acumen, relentless drive and do so over decades, you can get outsized results.

It's a thing to admire, *even if you dislike the products*. Much the same as you can be awed by Ray Kroc's execution regardless of whether you like McDonald's or what you think of him personally.

It simply isn't that common to have that combination of talents at work on one thing at such scale for so long. Steve Jobs and Bill Gates had the same combo of really being down in the details despite reaching such heights.

You can contrast to Google, a company whose founders had similar traits but who got tired of it. Totally understandable, but it makes a difference in terms of the focus of google today.

Again this is true regardless of what you think of Meta on, say, privacy vs. Google's original "Don't be Evil" idea.

Saying "wow they still have engineering leadership" is hardly worship. It's a statement of fact.


Good thing that he's only 39 years old and seems more energetic than ever to run his company. Having a passionate founder is, imo, a big advantage for Meta compared to other big tech companies.

Love how everyone is romanticizing his engineering mindset. But have we already forgotten that he was even more passionate about the metaverse, which, as far as I can tell, was a $50B failure?

Having an engineering mindset is not the same as never making mistakes (or never being too early to the market). The only way you won’t make those mistakes and keep a perfect record is if you never do anything major or step out of the comfort zone.

If Apple didn’t try and fail with Newton[0] (which was too early to the market for many reasons, both tech-related and not), we might’ve not had iPhone today. The engineering mindset would be to analyze how and why it happened the way it did, assess whether you can address those issues well, decide whether to proceed again or not (and how), and then execute. Obsessing over a perfect track record is the opposite of the engineering mindset imo.

0. https://en.wikipedia.org/wiki/Apple_Newton


His engineering mindset made him blind to the fact the metaverse was a product that nobody wanted or needed. In one of the Fridman interviews, he goes on and on about all the cool technical challenges involved in making the metaverse work. But when Fridman asked him what he likes to do in his spare time, it was all things that you could precisely not do in the metaverse. It was baffling to me that he failed to connect the dots.

I don't think that was the issue. VRChat was basically the same idea but done in a more appealing way and it was (still is) wildly popular.

All the work Meta has put in is still being felt in the VR space. Besides Valve they are the only ones pushing an open ecosystem.

VRChat is not a product a large corp can or would build though.

VRChat is more popular, but it doesn’t mean that copying their approaches would be the move.

For all we know, VRChat as a concept of that kind is a local maximum, and imo it won't scale well to genpop. Not claiming this as an objective fact, but as a hypothesis that I personally believe is very likely true. Think of it as a dead branch of evolution: if you want to go further than that local maximum, you gotta break out of it using an entirely different approach.

I like VRChat, but thinking that a random mainstream person who isn't into that type of geeky online stuff is gonna be convinced that VRChat is the ultimate metaverse experience is just foolish.

At that point, your choices are: (1) build a VRChat clone and hit that same local maximum but slightly higher at best or (2) develop something entirely different to get out of that local maximum, but risk failing (since it is a totally novel thing) and coming short of being at least as successful as VRChat. Zuck took the second option, and I respect that.

Just making a VRChat Meta Edition clone would imo give Meta much better numbers in the short term (than their failed Meta Horizons did), but long-term that approach would lead them nowhere. And it seems like Meta is more interested in capturing the first-mover (into the mainstream) advantage.

And honestly, I think it is better off this way. Just like if someone is making yet another group chat app, I would prefer they went balls to the wall, tried to rethink things from scratch, and made a group chat app that is unlike any other out there. Could all of their novel approaches fail? Yes, much more likely than if they made another Slack clone with a different color scheme. But the important part is that it also has a much higher chance of getting the state of their niche out of the local maximum.

Examples: Twitter could've been just another blog aggregator, Tesla could've been just another gas-powered Lotus Elise (with the original Roadsters literally being their custom internals slotted into a Lotus body), and Microsoft could've stayed stuck with MS-DOS and not gone into the "app as the main OS" thing (which is what they did with Windows).

Apple would've been relegated to a legacy of the Apple II and iPod (with a dash of MacBook relevancy), and remembered as the company that made this ultra-popular mp3 player before that whole niche died. AirPods (which everyone laughed at initially and derided as an impractical, pretentious purchase) are massive now, with every holdout I personally know who finally got them recently going "I cannot believe how convenient this is, I should've gotten them earlier". But at the time it was a similar "who needs this, they are solving a problem nobody has, everyone prefers wired with tons of better options" take[0].

If you want to get out of a perceived local maximum and break into the mainstream, you gotta try brand new approaches that will likely fail. Going "omg, they cannot even beat that existing competitor that's been running for years" is kinda pointless in this case, because competing with them directly by making a better and more successful clone of their product was never the goal. I don't doubt even for a second that if Meta tried that, they would likely have accomplished it.

And for the naysayers who don’t see Meta ever breaking things out of a local maximum, just look at the Oculus Quest line. Everyone was laughing at them initially for going with the standalone device approach, but Quest has become a massive hit, with tons of people of all kinds buying it (not just people with massive gaming rigs).

0. And yes, the removal of the audio jack somewhat sped up adoption, but I just used an adapter with zero discomfort for a year or two until I got AirPods myself (and would've kept using the adapter if I hadn't flat-out preferred AirPods in general).


Yes, I thought the same exact thing. Seemed so odd to hear him gush over his foiling and MMA while simultaneously expecting everyone else to migrate to the metaverse.

I mean, I am not sure what response people expected when a person, in a conversation about their work project, is being asked “what do you like to do in your free time.”

Maybe I am an outlier, but when in a conversation about work-related things someone asks “what do you like to do in your free time”, I believe the implication here is that there is a silent “…to do in your free time [outside of work]”.

Answering that question with more stuff related to work project typically falls somewhere on the spectrum between pandering to the audience and cringe.

No idea how this concept can even count as novel on HN, where a major chunk of the users are software devs who keep talking about hobbies like woodworking/camping/etc. (aka hobbies that are typically as far removed from the digital realm as possible).

Imo Zuck talking about MMA being his personal free time hobby is about as odd as a software dev talking about being into woodworking. In other words, not at all.


He wants to watch MMA fights in VR, which is a pretty good use case.

This is a super common behavior when a) the product is for other people, but b) you don't care about those other people. You'll see both in technologists (who, as you say, get fascinated by the technology or the idea) and in MBAs (who instead get hypnotized by fashionable trends, empire building, and the potential for large piles of money).

Let's be honest, VR is about the porn. If it's successful at that, Zuck will make his billions.

The computer game and television/movie industries both dwarf adult entertainment. The rationale for how pornography made the VCR, and VHS in particular, a success (bringing affordable video pornography into the privacy of your home) does not apply to VR.

Not gonna lie though, VR is way better for porn than VHS.

And is responsible for building evil products to fund this stuff.

Apple Photos and FaceTime are good products for sharing information without ruining your attention span or being evil. Facebook could've been like that.


If you actually listen to how Zuck defines the metaverse, it's not Horizons or even a VR headset. That's what pundits say, most of whom love pointing out big failures more than they like thinking deeply.

He sees the metaverse as the entire shared online space that evolves into a more multi-user collaborative model with more human-centric input/output devices than a computer and phone. It includes co-presence, mixed reality, social sites like Instagram and Facebook as well as online gaming, real-world augments, multiuser communities like Roblox, and "world apps" like VRChat or Horizons.

Access methods may be via a VR headset, or smart glasses, or just sensors that alert you to nearby augmented sites that you can then access on your phone - think Pokemon Go with gyms located at historical real-world sites.

That's what $50B has been spent on, and it's definitely a work in progress. But it sure doesn't seem dead based on the fact that more Quest headsets have been sold than this gen's Xboxes; Apple released the Vision Pro; Ray-Ban smart glasses are selling pretty well; new devices are planned from Google, Valve, and others; and remote work is an unkillable force.

The online and "real" worlds are only getting more connected, and it seems like a smart bet to try to drive what the next generation looks like. I wouldn't say the $50B was spent efficiently, but I understand that forging a new path means making lots of missteps. You still get somewhere new though, and if it's a worthwhile destination then many people will be following right behind you.


It's really obvious the actual "metaverse" goal wasn't a VRChat/Second Life-style product. It was another layer on top of the real world where physical space could be monetized, augmented, and eventually advertised upon.

AR glasses in a spectacles form factor were the goal; it's just that getting there means first solving, in a VR headset, a lot of the problems you need to solve for the glasses to work at all.

Apple made the same bet.


50 billion dollars and fewer than 10 million MAU. That's a massive failure.

A chunky portion of those dollars was spent on buying and pre-ordering the GPUs that were used to train and serve Llama.

Yes, he got incredibly lucky that he found an alternative use for his GPU investment.

It's a bit too early IMHO to declare the metaverse a failure.

But that said, I don't think it matters. I don't know anybody who hasn't been wrong about something or made a bad bet at times. Even if he is wrong about everything else (which he's not, because plenty of important open source has come out of Facebook), that doesn't change the extreme importance of Llama and Meta's willingness to open things up. It's a wonderful gift they have given to humanity, one that has only barely started.


$50B for <10M MAU is absolutely a failure, today, as I'm typing this.

You're everywhere in this thread man. Did zuck steal your lunch or something?

The Quest is the top selling VR headset by a very large margin.

He's well positioned to take that market when it eventually matures a bit. Once the tech gets there, say in a decade we might see most people primarily consume content via VR and phones. That's movies, games, TV, sporting events, concerts.


I just can’t imagine sitting with a headset on, next to my wife, watching the NFL. It could very well change for me, but it does not sound appealing.

Nor could I. And I can't imagine sitting next to my wife watching a football game together on my phone. But I could while waiting in line by myself.

Similarly, I could imagine sitting next to my daughter, who is 2,500 miles away at college, watching the game together on a virtual screen we both share. And then playing mini-golf or table tennis together.

Different tools are appropriate for different use cases. Don't dismiss a hammer because it's not good at driving screws.


Yes, these are all very good points. You’ve got me awaiting the future of the tech a bit more eagerly.

FYI, those use cases are the present, not the future, of tech.

Co-watching TV? Big Screen: https://www.bigscreenvr.com/software

Mini-Golf? Walkabout Mini Golf: https://www.mightycoconut.com/minigolf

Table Tennis? Eleven Table Tennis: https://elevenvr.com/en/

All are amazing, polished experiences in VR that give you a sense of being "present" with someone a continent away.


What if you're on a train, at home alone, etc.

For me the tech isn't there yet. I'd buy a Quest with an HDMI input today if they sold it. But for some reason those are two different products.


Would your wife normally watch the NFL with you? If yes, for you or for the NFL?

Yes, and for NFL. It’s one of my favorite shared hobbies of ours!

Give me $50 billion and I'll bet I could get 8 million MAU on a headset. It's a massive failure because Zuck's a nerd and not a product guy.

Asking for an impossible hypothetical and then claiming something equally impossible. Stay classy, Hacker News. Chances are you would take the 8 million and run.

Having a nerdy vision of the future and spending tens of billions of dollars to try and make it a reality while shareholders and bean counters crucify you for it is the most engineer thing imaginable. What other CEO out there is taking such risks?

Bill Gates when he was at Microsoft.

Tablet PC (first iteration was in the early 90s!), Pocket PC, WebTV and Media Center PC (Microsoft first tried Smart TVs in the late 90s! There wasn't any content to watch and most people didn't have broadband, oops), Xbox, and the numerous PC standards they pushed for (e.g. mandating integrated audio on new PCs), smart watches (SPOT watch, look it up!), and probably a few others I'm forgetting.

You'll notice in most of those categories, they moved too soon and others who came later won the market.


Think of it as a $50B spending spree where he gave that money to VR tech out of enthusiasm. Even I, with the cold dark heart that I have, have to admit he's a geek hero with his open source attitude.

That's the point. He does things because he is excited about something, not to please shareholders. Shareholders didn't like the metaverse at all. And shareholders likely don't like spending billions of dollars on GPUs just to give the benefit away for free to others.

Zuck's job is to have vision and take risks. He's doing that. He's going to encounter failures and I doubt he's still looking in the rearview mirror about it. And overall, Zuck has a tremendous amount of net success, to say the least.

It isn't necessarily a failure "yet". I don't think anybody is saying VR/AR isn't a huge future product, just that the current tech is not quite there. We'll see if Apple can do better; they both made tradeoffs.

It is still possible that VR and Generative AI can join in some synergy.


I think that part of his bet is that AI is a key component of getting the metaverse to take off. E.g. generating content for the metaverse via AI

It's hard for me to imagine AI really helping Meta. It might make content cheaper, but Meta was not budget limited.

I get so annoyed by this every time I see it. It’s not because AI took over the news cycle that the idea of a Metaverse is a failure.

If you had predicted that the Internet was going to change our lives and that most people would spend most of their waking hours living on it, people in the early days probably would have told you that you were a fool.

The same is true with this prediction of VR. If you think in the next decade that VR is not going to be the home for more and more people then you are wrong.


It would have been, if the bet that AR glasses in a spectacle form factor could be solved had paid off. But the lens display just isn't possible today.

Apple made the same bet too and had to capitulate to a VR headset + cameras in the end.

The Zuck difference is he pivoted to AI at the right time, Apple didn’t.


That's almost the point, isn't it? He still believes in it; it's just the media that moved on. Passion means having a vision that isn't deterred by immediate short-term challenges, because you can "see over the mountain".

Will the metaverse be a failure? Maybe. But Apple doesn't think so, to the tune of $100B invested so far, which is pretty good validation that there is some value there.


Was a failure? They are still building it. When they shut down or sell off the division, then you can call it a failure.

Unsuccessful ideas can live on for a long time in a large corporation.

Nobody wants to tell the boss his pet project sucks - or to get their buddies laid off. And with Facebook's $100 billion in revenue, nobody's going to notice the cost of a few thousand engineers.


10 years, $50 billion, fewer than 10 million MAU. It's a failure today, right this minute it's a failure.

Disagree from VR

What's wrong with someone playing with millennia equivalent of millions of human life times worth of income like a disposable toy? /s

Yeah because all that research and knowledge completely dissipates because the business hasn’t recouped its R&D costs.

Apple famously brought the iPhone into existence without any prior R&D or failed attempts to build similar devices.


I swear, this feels like people get paid to write positive stuff about him? Have you forgotten his shitty leadership and practices around data and lock-ins?

Yes how dare different people have different opinions about different people? It's almost as if we all should be a monolithic voice that agrees with you.

The thread was suspiciously positive, almost exclusively so. Your comment adds nothing to the discussion; you're just snarky and nothing else. So get off my back.

>Your comment adds nothing to the discussion,

and yours did? This comment, Christian?

>>I swear, this feels like people get paid to write positive stuff about him?

----

>you're just snarky and nothing else

Please re-read your own comment. See above.

>So get off my back

Absolutely not. You said something that was decidedly ignorant (how dare people praise X good thing done by omg-horrible-Y people!), and I called you out on it. I expect better discussion and people skills from someone who holds the position of CTO, rather than just "haha you're all paid shills!"


Let's be honest: he's probably not doing it out of the goodness of his heart. He's most likely trying to commoditize the models so he can sell their complement. It's a strategy Joel Spolsky talked about in the past (for those of you who remember who that is). I'm not sure exactly what complement of AI models Meta can sell, so maybe it's not a good strategy, but I'm certain it's a strategy of some sort.

You lead with a command to be honest and then immediately speculate on private unknowable motivations and then attribute, without evidence, his decision to a strategy you can't describe.

What is this? Someone said something nice, and you need to "restore balance"


They said something naive, not just "nice". It's good to correct the naivete.

For example, as we speak, Zuck is lobbying congress to ban Tiktok. Putting aside whether you think it should be banned, this is clearly a cynical strategy with pure self interest in mind. He's trying to monopolize.

Whatever Zuck's strategy with open source is, it's just a strategy. Much like AMD is pursuing that strategy. They're corporations and they don't care about you or me.


What was said that was naive?

Also keep in mind that it's still a proprietary model. Meta gets all the benefits of open source contributions and testing while retaining exclusive business use.

Very wrong.

Llama is usable by any company under 700M MAU.


Do you have a source? Here's the license when you request access from Meta for Llama, unless there's something I'm missing?

https://ai.meta.com/blog/large-language-model-llama-meta-ai/

EDIT: Looks like they did open up commercial use with version 2, with an explicit restriction preventing any major competitor of Meta from using Llama, plus a clause that any improvements derived from Llama can only apply to Llama. So: an attempt to expand the scope of usage and adoption of their proprietary model without their main competitors being able to use it, which still fits my original point.


That's coz he is a founder CEO. Those guys are built different. It's rare for the careerist MBA types to match their passion or sincerity.

There are many things I can criticize Zuck for but lack of sincerity for the mission is not one of them.


It is just the reverse: he is successful because he is like that, and lots of founder CEOs are jellies in comparison.

I dunno. I find a conviction and passion in founder CEOs that is missing in the folks who replace them.

Compare Larry & Sergey with Pichai, or Gates with Ballmer.


How can anyone doubt Ballmer's passion after his sweaty stage march? He ain't in charge anymore anyway. Gates was more methodically evil than passionate, and his big moves were all just stabbing someone else to take their place.

I think he managed to buck the trend because, despite not being one, he liked developers (some would say a little too much)

Don't forget Gavin Belson and Action Jack Barker

Action Jack would still be at it but these days he prefers a nice piece of fish.

Satya Nadella is an interesting counter example.

Meta also spearheaded the Open Compute Project. I originally joined Google because of their commitment to open source and was extremely disappointed when I didn't see that culture continue as we worked on exascale solutions. Glad to see Meta carrying the torch here. Hope it continues.

When did you join Google?

Mid-2000s, just prior to the IPO.

Oh, I see, that must have been quite the journey.

I joined in 2014, and even I saw the changes in just a few years when I was there.

Still I was a bit baffled reading all the lamenters: I joined late enough that I had no illusions and always saw Google as doing pretty well for an 'enterprise', instead of feeling and expressing constant disappointment that the glory days were over.


I see what you did there: carrying the "torch". LOL

> I just want to express how grateful I am that Zuck

Praise for him at HN? It should be enough of a reason for him to pop a champagne today


Yeah, I'm also surprised at how many positive comments are in this thread.

I do hate Facebook, but I also love engineers, so I'm not sure how to feel about this one.


One of the many perks of releasing open-ish models, React, and many other widely used tools over the years. Meta might be the big tech company whose open source projects are most widely used. That earns you some dev goodwill, even though your main products profit from some pretty bad stuff.

> I do hate Facebook, but I also love engineers, so I'm not sure how to feel about this one.

"it's complicated". Remember that? :)

It's also a great way to avoid many classes of bias. One shouldn't aspire to "feel" in any one way. Embrace the complexity.


You're right. It's just, of course, easier to feel one extreme or the other.

I mean, they basically invented, popularised, and maintained React/React Native, which I've built my entire career on. I love them for that.

The world at large seems to hate Zuck, but it's good to hear from people familiar with software engineering who understand just how significant his contributions to open source and to raising salaries have been, through Facebook and now Meta.

> his contributions to ... raising salaries

It's fun to be able to retire early or whatever, but driving software engineer salaries out of reach of otherwise profitable, sustainable businesses is not a good thing. That just concentrates the industry in fewer hands and makes it more dependent on fickle cash sources (investors, market expansion) often disconnected from the actual software being produced by their teams.

Nor is it great for the yet-to-mature craft that high salaries invited a very large pool of primarly-compensation-motivated people who end up diluting the ability for primarily-craft-motivated people to find and coordinate with each other in pursuit of higher quality work and more robust practices.


> It's fun to be able to retire early or whatever, but driving software engineer salaries out of reach of otherwise profitable, sustainable businesses is not a good thing.

That argument could apply to anyone who pays anyone well.

Driving up market pay for workers via competition for their labour is exactly how we get progress for workers.

(And by 'treat well', I mean the whole package. Fortunately, or unfortunately, that has the side effect of eg paying veterinary nurses peanuts, because there's always people willing to do those kinds of 'cute' jobs.)

> Nor is it great for the yet-to-mature craft that high salaries invited a very large pool of primarly-compensation-motivated people who end up diluting the ability for primarily-craft-motivated people to find and coordinate with each other in pursuit of higher quality work and more robust practices.

Huh, how is that 'dilution' supposed to work?

Well, and at least those 'evil' money grubbers are out of someone else's hair. They don't just get created from thin air. So if those primarily-compensation-motivated people are now writing software, then at least investment banking and management consulting are free again for the primarily-craft-motivated people to enjoy!


Bubbles are bubbles.

They can be enjoyed/exploited (early retirement, savvy stashing of excess income, etc.) by workers, but they don't win anybody progress and aren't a thing to celebrate.

Workers (and society) have not won progress when only a handful of companies have books that can actually support their inflated pay, and the remainder are ultimately funded by investors hoping to see those same companies slurp them up before the bubble bursts.

Workers don't win progress when they're lured into then converting that income into impractical home loans that bind the workers with golden handcuffs and darkly shadow their future when the bubble bursts.

Workers win progress when they can practice their trade with respect and freedom, and can secure a stable future for themselves and their families.

Software engineers didn't need these bubble-inflated salaries to achieve that. Like our peers in other engineering disciplines, it's practically our baseline state. The fight we still need to make is for non-monetary workers' rights and professional deference, which is a different thing and gets developed in a different, more stable market environment.


Meta has products that are used by billions of people every week and has been extremely profitable for over 15 years, with no sign of obvious downward trend. I don't see how it can be described as a bubble.

> They can be enjoyed/exploited (early retirment, savvy caching of excess income, etc) by workers but they don't win anybody progress and aren't a thing to celebrate.

Huh, if I get paid lots as a worker, I don't care whether the company goes belly up later. Why should I? (I include equity in the total pay package under judgement here, and by 'lots' I mean that the sum of equity and cash is big. If the cash portion is large enough, I don't care if the stock goes to zero. In any case, I sell any company stock as soon as I can, and invest the money in diversified index funds.)

> Workers (and society) have not won progress when only a handful of companies have books that can actually support their inflated pay, and the remainder are ultimately funded by investors hoping to see those same companies slurp them up before the bubble bursts.

I'm more than OK with willing investors (potentially) losing capital they put at risk. Just don't put some captive public retirement fund or taxpayer money into this. Those investors are grown up and rich; they don't need us to know better what is good for them.

> Workers don't win progress when they're lured into then converting that income into impractical home loans that bind the workers with golden handcuffs and darkly shadow their future when the bubble bursts.

This says more about carefully managing the maximum amount of leverage you want to take on in your life. It's hardly an argument that would convince me that lower pay is better for me.

People freak out when thinking about putting leverage in their stock portfolio, but they take on a mortgage on a house without thinking twice. Even though getting out of a well diversified stock portfolio and remove all the leverage takes less than half an hour these days (thanks to online brokers), but selling your single concentrated illiquid house can take months and multiple percentage points of transaction costs (agents, taxes, etc).

Just don't buy a house, or at least buy within your means. And make sure you are thinking ahead of time how to get out of that investment, in case things turn sour.

> Workers win progress when they can practice their trade with respect and freedom and can and secure a stable, secure future for themselves and their families.

Guess who's in a good negotiation position to demand respect and freedom and stability from their (prospective) employer? Someone who has other lucrative offers. Money is one part of compensation, freedom and respect (and even fun!) are others.

Your alternative offers don't all have to offer these parts of the package in the same proportions. You can use a rich offer with lots of money from place A, to try and get more freedom (at a lower pay) from place B.

Though I find that in practice that the places that are valuing me enough to pay me a lot, also tend to value me enough to give me more respect and freedom. (It's far from a perfect correlation, of course.)

> Software engineers didn't need these bubble-inflated salaries to acheive that.

Yes, I have lived on a pittance before, and survived. I don't strictly 'need' the money. But I still firmly believe that, all else being equal, 'more money = more better'.

> What fight we do still need to make is on securing non-monetary worker's rights and professional deference, [...].

I'd rather take the money, thank you.

If you want to fight, please go ahead, but don't speak for me.

And the whole thing smells a lot like you'd (probably?) want to introduce some kind of mandatory licensing and certificates, like they have in other engineering disciplines. No thank you. Programming is one of the few well paid white collar jobs left where you don't need a degree to enter. Let's keep it that way.


> Driving up market pay for workers via competition for their labour is exactly how we get progress for workers.

There's a difference between "paying higher salaries in fair competition for talents" and "buying people to let them rot to make sure they don't work for competition".

It's the same as "lowering prices to the benefit of consumer" vs "price dumping to become a monopoly".

Facebook never did it at scale though. Google did.


> It's the same as "lowering prices to the benefit of consumer" vs "price dumping to become a monopoly".

Where has that ever worked? Predatory pricing is highly unlikely.

See eg https://www.econlib.org/library/Columns/y2017/Hendersonpreda... and https://www.econlib.org/archives/2014/03/public_schoolin.htm...

> Facebook never did it at scale though. Google did.

Please provide some examples.

> There's a difference between "paying higher salaries in fair competition for talents" and "buying people to let them rot to make sure they don't work for competition".

It's up to the workers themselves to decide whether that's a good deal.

And I'm not sure why as a worker you would decide to rot? If someone pays me a lot to put in a token effort, just so I don't work for the competition, I might happily take that over and practice my trumpet playing while 'working from home'.

I can also take that offer and shop it around. Perhaps someone else has actual interesting work, and comparable pay.


> Where has that ever worked? Predatory pricing is highly unlikely.
>
> See eg

Neither of the articles understands how predatory pricing works; both assume it's a single-market process. In the most usual case you fuel price dumping in one market with profits from another. That way you can run it potentially indefinitely, and you're doing it not in the hope of making profits in that market some day, but to make sure no one else does. Funnily enough, the second author had a good example but still failed to see it under his nose: public schools do have 90% of the market, and in many countries almost 100%. Obviously it works. Netscape died despite having a superior product because it was competing with a public school, so to speak. The browser market is dead to this date.

> And I'm not sure why as a worker you would decide to rot? If someone pays me a lot to put in a token effort, just so I don't work for the competition, I might happily take that over and practice my trumpet playing while 'working from home'.

That's exactly what happens and people proceed to degrade professionally.

> Perhaps someone else has actual interesting work, and comparable pay.

Not unless that someone sits on the ads money pipe.

> Please provide some examples

What kind of example do you expect? If it helps, half the people I personally know in Google "practice the trumpet" in your words. Situation is slowly improving though in the past two years.

I'm not saying it should be made illegal. I'm saying it's definitely happening and it's sad for me to see. I want the tech industry to move forward, not the amateur trumpet one.


https://en.wikipedia.org/wiki/Predatory_pricing says

> For a period of time, the prices are set unrealistically low to ensure competitors are unable to effectively compete with the dominant firm without making substantial loss. The aim is to force existing or potential competitors within the industry to abandon the market so that the dominant firm may establish a stronger market position and create further barriers to entry.[2] Once competition has been driven from the market, consumers are forced into a monopolistic market where the dominant firm can safely increase prices to recoup its losses.[3]

What you are describing is not predatory pricing, that's a big part of why I was confused.

> Funnily enough the second author got a good example but still failed to see it under his nose: public schools do have 90% of the market, and in many countries almost 100%. Obviously it works.

Please consider reading the article more carefully. Your interpretation requires the author to be an idiot.

---

What you are describing about browsers is interesting. But it's more like bundling and cross subsidies. Neither Microsoft nor Google were ever considering making money from raising the price of their browser after competition had been driven out. That's required for predatory pricing.


> Fortunately, or unfortunately, that has the side effect of eg paying veterinary nurses peanuts, because there's always people willing to do those kinds of 'cute' jobs.

Veterinarians (including technicians) have an absurdly high rate of suicide. They have a stressful job, constantly around death and mistreatment, and don't get the respect (despite often knowing more than human doctors) or the salaries to match.

Calling these jobs “cute” or saying the veterinary situation is “fortunate” borders on cruel, but I believe you were just uninformed.


Yet people still line up to become veterinarians (and technicians). Which proves my point.

> Calling these jobs “cute” or saying the veterinary situation is “fortunate” borders on cruel, [...]

Perhaps not the best choice of words, I admit.


> Yet, people still line up to become veterinaries (and technicians). Which proves my point.

The informed reality is that the dropout rate is also huge. Not only people who leave the course while studying, but also professionals who abandon the field entirely after just a few years of work.

Many of them are already suffering in college yet continue due to a sense of necessity or sunk cost and burn themselves out.

So no, it does not prove your point. The one thing it proves is that the public in general is insufficiently informed about what being a veterinarian is like. They should be paid more and have better conditions (worth noting some countries do treat them better), not be churned out and left to die (literally) because there's always another chump down the line.


> So no, it does not prove your point. The one thing it proves is that the public in general is insufficiently informed about what being a veterinary is like.

That doesn't really matter. What would matter is how well informed the people who decide to become veterinarians are.

> They should be paid more and have better conditions [...]

Well, everyone should be treated better and paid better.

> [...] because there’s always another chump down the line.

If they could somehow make the improvements you suggest (but don't specify how), they would lead to even more chumps joining the queue.

(And no, that's not a generalised argument against making people's lives better. If you improve the appeal of non-vet jobs, fewer people will join the vet line.

If you improve the treatment of workers in general, the length of the wanna-be-vet queue, and any other 'job queue' will probably stay roughly the same. But people will be better off.)


I am fine with a large pool of greedy people trying their hand at programming. Some of them will stick and find meaning in the work. The rest will wash out in a downturn. Net positive.

> Nor is it great for the yet-to-mature craft that high salaries invited a very large pool of primarly-compensation-motivated people who end up diluting the ability for primarily-craft-motivated people to find and coordinate with each other in pursuit of higher quality work and more robust practices.

It's great to enjoy programming, and to enjoy your job. But we live under capitalism. We can't fault people for just working a job.

Pushing for lower salaries won't help anybody.


Pushing salaries lower helps society at large, or at least that's the thesis of the OP. While it sucks for SWEs, I actually kind of agree. The skyrocketing of SWE salaries in the US, and the slow progress the US is making towards normalizing/reducing them, does not help US competitiveness. I would not fault Meta for this though, so much as US society at large.

SWEs should enjoy it while they can, before salaries become similar to other engineering trades.


I don't understand people who think high salaries are bad. Who should get the money instead? Should even more of it go to execs and shareholders? Why is that better?

> but driving software engineer salaries out of reach of otherwise profitable, sustainable businesses is not a good thing.

I'm not convinced he's actually done that. Pretty much any 'profitable, sustainable business' can afford software developers.

Software developers are paid pretty decently, but (grabbing a couple of lists off of Google) it looks like there are 18 careers more lucrative from a wage perspective, and computers-in-general account for only 3 of the top 25 highest-paying careers - https://money.usnews.com/careers/best-jobs/rankings/best-pay...

Medical, Legal, Finance, and Sales as careers (roughly in that order) all seem to pay more on average.


Few viable technology businesses, and non-technology businesses with internal software departments, were prepared to see their software engineers suddenly expect doctor or lawyer pay, and they can't effectively accommodate the change.

They were largely left to rely on loyalty and other kinds of fragile non-monetary factors to preserve their existing talent and institutional knowledge, and otherwise to scavenge for scraps when making new hires.

For those companies outside the specific Silicon Valley money circle, it was an extremely disruptive change and recovery basically requires that salaries normalize to some significant degree. In most cases, engineers provide quite a lot of value but not nearly so much value as FAANG and SV speculators could build into their market-shaping offers.

It's not a healthy situation for the industry or (if you're wary of centralization/monopolization) society as a whole.


In general, it's probably not sustainable (with some exceptions like academia that have never paid that well leaving aside the top echelon and that had its own benefits) to expect that engineering generally lags behind SV software engineering. Especially with some level of remote persisting, presumably salaries/benefits equilibrate to at least some degree.

Why should internal software departments be viable? Isn't it a massive waste to have engineers write software to be used by a single company?

That business can search for and find talent globally for a fraction of an SV salary.

If a FAANG company can hire an engineer overseas for $60k annually, why can't others?


Because maintaining the organizational infrastructure to coordinate remote teams dispersed to time zones all over the world and with different communication styles, cultural assumptions, and legal requirements is a whole matter of its own?

Companies that can do that are at an advantage over those who can't right now, but pulling that off is neither trivial nor immediate nor free.


I worked for a company that was very good at that. It resulted in software organizations in 50+ countries.

I had teams in North America, Europe, Russia, and East Asia. It resulted in a diversified set of engineers who were close to our customers (except in Russia, where the engineers were highly qualified but there were few prospects for sales). Managing across cultures and time zones is a competence. The jet lag from travel was not as great...


>but driving software engineer salaries out of reach of otherwise profitable, sustainable businesses is not a good thing.

What if businesses paid their workers more?


A person (or a company) can be two very different things at the same time. It's undeniable as you say that there have been a lot of high-profile open source innovations coming from Facebook (ReactJS, LLaMA, HHVM, ...), but the price that society at large paid for all of this is not insignificant either, and Meta hasn't meaningfully apologized for the worst of it.

Meta's open source contributions stand on their own as great, regardless of their obviously shady social media management and privacy tactics. The former are feats of software engineering; the latter have a lot to do with things far beyond problems like handling data at scale, refreshing feeds fast, ensuring atomic updates to user profiles, etc.

Basically I don’t think their privacy nightmare stuff detracts from what the brain trust of engineers over there have been doing in the open source world.


They're sharing it for a reason. That reason is to disarm their opponents.

Call me cynical, but it was the only way not to be outplayed by OpenAI and to compete with Google, etc.

100%. It was the only real play they had.

Yeah. Very glad Meta is doing what they’re doing here, but the tiger’s not magically changing its stripes. Take care as it might next decide to eat your face.

Why is Meta doing it though? This is an astronomical investment. What do they gain from it?

They're commoditizing their complement [0][1], inasmuch as LLMs are a complement of social media and advertising (which I think they are).

They've made it harder for competitors like Google or TikTok to compete with Meta on the basis of "we have a super secret proprietary AI that no one else has that's leagues better than anything else". If everyone has access to a high quality AI (perhaps not the world's best, but competitive), then no one -- including their competitors -- has a competitive advantage from having exclusive access to high quality AI.

[0]: https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/

[1]: https://gwern.net/complement


Yes. And, could potentially diminish OpenAI/MS.

Once everyone can do it, then OpenAI value would evaporate.


Once every human has access to cutting edge AI, that ceases to be a differentiating factor, so the human talent will again be the determining factor.

And the content industry will grow ever more addictive and profitable, with content curated and customized specifically for your psyche. The very industry Meta happens to be the one to benefit from its growth most among all tech giants.

> Once everyone can do it, then OpenAI value would evaporate.

If you take OpenAI's charter statement seriously, the tech will make most humans' (economic) value evaporate for the same reason.

https://openai.com/charter


> will make most humans' (economic) value evaporate for the same reason

With one hand it takes, with the other it gives - AI will be in everyone's pocket, and super-human level capable of serving our needs; the thing is, you can't copy a billion dollars, but you can copy a LLaMA.


> OpenAI’s mission is to ensure that artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at most economically valuable work—benefits all of humanity. We will attempt to directly build safe and beneficial AGI, but will also consider our mission fulfilled if our work aids others to achieve this outcome.

No current LLM is that, and Transformers may always be too sample-expensive for that.

But if anyone does make such a thing, OpenAI won't mind… so long as the AI is "safe" (whatever that means).

OpenAI has been totally consistent in saying that safety includes assuming weights are harmful until proven safe, because you cannot un-release a harmful model; other researchers say the opposite, on the grounds that white-box safety research is easier and more consistent.

I lean towards the former, not because I fear LLMs specifically, but because the irreversibility, and the fact that we don't know how close or far we are, means it's a habit we should turn into a norm before it's urgent.


Very similar to Tesla and EVs

...like open balloon.

He went into the details of how he thinks about open-sourcing weights for Llama while responding to an analyst's question in one of the earnings calls last year, after the Llama release. I made a post on Reddit with some details.

https://www.reddit.com/r/MachineLearning/s/GK57eB2qiz

Some noteworthy quotes that signal the thought process at Meta FAIR and more broadly

* We’re just playing a different game on the infrastructure than companies like Google or Microsoft or Amazon

* We would aspire to and hope to make even more open than that. So, we’ll need to figure out a way to do that.

* ...lead us to do more work in terms of open sourcing, some of the lower level models and tools

* Open sourcing low level tools make the way we run all this infrastructure more efficient over time.

* On PyTorch: It’s generally been very valuable for us to provide that because now all of the best developers across the industry are using tools that we’re also using internally.

* I would expect us to be pushing and helping to build out an open ecosystem.


"different game"

But what game? What is the AI play that makes giving it away a win for meta?


A lot of the other companies are selling AI as a service. Meta hasn't really been in the space of selling a raw service in that way. However, they sit at a center point of human interaction that few can match, and how they leverage those models to enhance that experience is where they can win. (Think of, for example, giving a summary of what you've missed in your groups, letting you join more of them and still know what's happening without needing to sift through it all, or identifying events and activities you'd be interested in. This makes it easier to join more groups, as the cost of being in one is lower, driving more engagement.)

For facebook, it isn't the technology, but how it is applied, is where their game starts to get interesting.

When you give away the tooling and treat it as first class, you get the wider community improving it on top of your own efforts; cycle that back into how you apply it internally and you have a positive feedback loop that other, less open models lack.


Weaken the competition (Google and MS). Bing doesn't exist because it's a big money maker for MS; it exists to put a dent in Google's power. Android vs Apple. If you can't win, you try to make the others lose.

I think you really have to understand Zuckerberg's "origin story" to understand why he is doing this. He created a thing called Facebook that was wildly successful. Built it with his own two hands. We all know this.

But what is less understood is that from his point of view, Facebook went through a near death experience when mobile happened. Apple and Google nearly "stole" it from him by putting strict controls around the next platform that happened, mobile. He lives every day even still knowing Apple or Google could simply turn off his apps and the whole dream would come to an end.

So what do you do in that situation? You swear - never again. When the next revolution happens, I'm going to be there, owning it from the ground up myself. But more than that, he wants to fundamentally shift the world back to the premise that made him successful in the first place - open platforms. He thinks that when everyone is competing on a level playing field he'll win. He thinks he is at least as smart and as good as everyone else. The biggest threat to him is not that someone else is better, it's that the playing field is made arbitrarily uneven.

Of course, this is all either conjecture or pieced together from scraps of observations over time. But it is very consistent over many decisions and interactions he has made over many years and many different domains.


I think what Meta is doing is really smart.

We don't really know where AI will be useful in a business sense yet (the apps with users are losing money) but a good bet is that incumbent platforms stand to benefit the most once these uses are discovered. What Meta is doing is making it easier for other orgs to find those use-cases (and take on the risk) whilst keeping the ability to jump in and capitalize on it when it materializes.

As for X-risk? I don't think any of the big tech leadership actually believe in that. I also think that deep down a lot of the AI safety crowd love solving hard problems and collecting stock options.

On cost: the AI hype raises Meta's valuation by more than the cost of the engineers and server farms.


> I don't think any of the big tech leadership actually believe in that.

I think Altman actually believes that, but I'm not sure about any of the others.

Musk seems to flit between extremes; "summoning the demon" isn't really compatible with suing OpenAI for failing to publish the Lemegeton Clavicula Samaltmanis*.

> I also think that deep down a lot of the AI safety crowd love solving hard problems and collecting stock options.

Probably at least one of these for any given person.

But that's why capitalism was ever a thing: money does motivate people.

* https://en.wikipedia.org/wiki/The_Lesser_Key_of_Solomon


Zuck equated the current point in AI to iOS vs. Android and macOS vs. Windows. If I got that correctly, he thinks an open ecosystem and a closed one will coexist, and that he can make the former.

Meta is an advertising company that is primarily driven by user-generated content. If they can empower more people to create more content more quickly, they make more money. Particularly in the metaverse, if they ever get there, because making content for 3D VR is very resource-intensive.

Making AI as open as possible so more people can use it accelerates the rate of content creation.


You could say the same about Google, couldn't you?

Yeah, probably, but I don't think Google as a company is trying to do anything open regarding AI other than raw research papers.

Also, Google makes most of its money off search, which is more business-driven advertising versus showing ads in between bites of user-generated content.


Mark probably figured Meta would gain knowledge and experience more rapidly if they threw Llama out in the wild while they caught up to the performance of the bigger & better closed source models. It helps that unlike their competition, these models aren't a threat to Meta's revenue streams and they don't have an existing enterprise software business that would seek to immediately monetize this work.

If they start selling AI on their platform, it's a really good option, since people know they can run it somewhere else if they have to, for any reason. E.g., you could build a PoC on their platform and then, because of regulations, need to self-host. Can you do that with the other offerings?

Zuck is pretty open about this in a recent earnings call:

https://twitter.com/soumithchintala/status/17531811200683049...


Besides everything said here in the comments, Zuck would be actively looking to own the next platform (after desktop/laptop and mobile), and everyone's trying to figure out what that would be.

He knows well that if competitors have a cash cow, they have money to throw at hundreds of things. By releasing open source, he is winning credibility, establishing Meta as the most-used LLM, and weakening competitors' ability to throw money at future initiatives.


They heavily use AI internally for their core Facebook business, analyzing and policing user content, and this is also great PR to rehabilitate their damaged image.

There is also an arms race now of AI vs. AI in terms of generating and detecting AI content (including deepfakes, election interference, etc.). In order not to deter advertisers and users, Facebook needs to keep up.


They will be able to integrate intelligence into all their product offerings without having to share the data with any outside organization. Tools that can help you create posts for social media (like an AI social media manager), or something that can help you create your listing to sell an item on Facebook Marketplace, tools that can help edit or translate your messages on Messenger/Whatsapp, etc. Also, it can allow them to create whole new product categories. There's a lot you can do with multimodal intelligent agents! Even if they share the models themselves, they will have insights into how to best use and serve those models efficiently and at scale. And it makes AI researchers more excited to work at Meta because then they can get credit for their discoveries instead of hoarding them in secret for the company.

The same thing he did with VR. He probably got tipped off that Apple was on top of the Vision Pro, and so just ruthlessly started competing in that market ahead of time.

/tinfoil

Releasing Llama puts a brake on developers becoming reliant on OpenAI/Google/Microsoft.

Strategically, it’s … meta.


Generative AI is a necessity for the metaverse to take off; creating metaverse content is too time-consuming otherwise. Mark really wants to control a platform, so the company's whole strategy seems to be built around getting the Quest to take off.

I would assume it's related to fair use and how OpenAI and Google have closed models built on copyrighted material. It's easier to make the case that it's for the public good if it's open and free than if it's not...

It's a shame it can't just be giving back to the community without being questioned.

Why is selfishness from companies that have benefited from social resources treated as the norm rather than as something surprising?


Because they're a publicly traded company with a fiduciary duty to generate returns for shareholders.

The two are not mutually exclusive.

If it was Wikipedia doing this, sure, assume the best.

Looks like it can't be accessed outside the States? I get a "Meta AI isn't available yet in your country" message.

Llama3 is available on Poe.

It does seem uncharacteristic. I wonder how much of the hate Zuck gets comes from people who just don't like Facebook, while as a person/engineer his heart is in the right place. It is hard to accept this at face value and not think there is some giant corporate hidden agenda.

> but also to not use pessimistic AI "doomerism" as an excuse to hide the crown jewels and put it behind a centralized API with a gatekeeper because of "AI safety risks."

AI safety risk is substantial. It is also testable. (There are prediction markets on it, for example.) Of course, some companies may latch onto various valid arguments for insincere reasons.

I'd challenge everyone to closely compare ideas such as "open source software is better" versus "state of the art trained AI models are better developed in the open". The exact same arguments do NOT work for both.

It is one thing to publish papers about e.g. transformers. It is another thing to publish the weights of something like GPT 3.5+; it might theoretically be a matter of degree, but that matter of degree makes a real difference, if only in terms of time. Time matters because it gives people and society some time to respond.

Software security reports are often made privately or embargoed. Why? We want to give people and companies time to defend their systems.

Now consider this thought-experiment: assume LLMs (and their hybrid derivatives) enable perhaps 1,000,000 new kinds of cyberattacks, 1,000 new bioweapon attacks, and so on. Are there a correspondingly large number of defensive benefits? This is the crux of the question, I think. First, I don't expect we're going to get a good assessment of the overall "balance". Second, any claims of "balance" are beside the point, because these attacks and defenses don't simply cancel each other out. The distribution of the AI-fueled capability advance will probably ratchet up risk and instability.

Open source software's benefits stem from the assumption that bugs get shallower with more eyes. More eyes means that the open source product gets stronger defensively.

With LLMs that publish their weights, both the research and the implementation are out; you can't enforce guardrails. The closest analogue to an "OSS security report" would take the form of "I just got your LLM to design a novel biological weapon. Do you think you can use it to design an antidote?"

A systematic risk-averse person might want to ask: what happens if we enumerate all offensive vs defensive technological shifts? Should we reasonably believe that the benefits outweigh the risks?

Unfortunately, the companies making these decisions aren't bearing the risks. This huge externality both pisses me off and scares the shit out of me.


I too like making up hypothetical insane scenarios in my head. The difference is that they stay with me in the shower.

Was this meant as an insult? That is a plausible reading of what you wrote. There’s no need to be disparaging. It hurts yourself and others too.

I welcome substantive discussion. Consider this:

https://openai.com/research/building-an-early-warning-system...


You did not respond to the crux of my argument: The dynamics between offensive and defensive technology. Have you thought about it? What do you think is rational to conclude?

This is the organization that wouldn't moderate Facebook during Myanmar, yeah? The one with all the mental health research they ignore?

Zuckerberg states during the interview that once the AI reaches a certain level of capability they will stop releasing weights, i.e., they are going the "OpenAI" route. This is just trying to get ahead of the competition; leveraging open source when you're behind is a sound strategy.

I see no reason to be optimistic about this organization; the open source community should use this and abandon them ASAP.


I actually think Mr. Zuckerberg is maturing and has a chance of developing a public persona of being a decent person!

I say public persona, as I've never met him, and have no idea what he is like as a person on an individual level.

Maturing in general and studying martial arts are likely contributing factors.


It's crazy how the managerial executive class seems to resent the vital essence of their own companies. Based on the behavior, nature, stated beliefs and interviews I've seen of most tech CEOs and CEOs in general, there seems to be almost a natural aversion to talking about things in non hyper-abstracted terms.

I get the feeling that the nature of the corporate world is often better understood as a series of rituals to create the illusion that the capitalist hierarchy itself is necessary. (Not that this is exclusive to capitalism; it exists in politics and in any system that becomes somewhat self-sustaining.) More important than a company doing well is the capacity to use the company as an image/lifestyle-enhancement tool for those at the top. So many companies run almost mindlessly as somewhat autonomous machines, allowing pretense and personal egoic myth-making to win out over the purpose of the company in the first place.

I think this is why Elon, Mark, Jensen, etc. have done so well. They don't perceive their position as founder/CEOs as a class position: a level above the normal lot that requires a lack of caring for tangible matters. They see their companies as ways of making things happen, for better or for worse.


It's because Elon, Mark, and Jensen are true founders. They aren't MBAs who got voted in because shareholders thought they would make them the most money in the shortest amount of time.

I kind of wonder. Does what they do counter the growth of Google?

I remember reading years ago that Page/Brin wanted to build an AI.

This was long before the AI boom, when saying something like that was just weird (like Musk saying he wanted to die on Mars weird).


The more likely version is that this course of action is in line with strategy recommended by consultants. It takes the wind out of their competitors' sails.

Always bet on Zuck!

It's like Elon saying: we have open sourced our patents, use them. Well, use the old patents and stay behind forever...

Exactly.

Yes, this AI is surely trained on their vast information base from their social networks and beyond, but at least it feels like they're giving back something. I know it's not pure altruism, and Zuck has been open about exactly why they do it (tl;dr: there are more advantages in advancing AI through the community, which ultimately benefits Meta), but they could have opted for completely different paths here.

The quickest way to disabuse yourself of this notion is to login to Facebook. You’ll remember that Zuck makes money from the scummiest pool of trash and misinformation the world has ever seen. He’s basically the Web 2.0 tabloid newspaper king.

I don’t really care how much the AI team open sources, the world would be a better place if the entire company ceased to exist.


Yeah lmao, people are giving Meta way too much credit here tbh.
