Meta does stellar work in AI. I’m quite certain the recent DMCAs were a case of ...

KaiserPro · on April 6, 2023

> left hand not knowing what the right hand is doing.

No, it was trying to stop abuse. Unlike OpenAI, Facebook can't get away with releasing something that is more than capable of making up libellous, racist, or otherwise illegal or PR-decimating content.

The point of the really restrictive release was to at least try to limit that kind of issue.

Fortunately for Meta, most people who use the weights aren't end users with a large Twitter following, so the risk is (now) low.

NelsonMinar · on April 6, 2023

The Economist had a recent overview of the AI competitive landscape that highlights Meta's outsize role thanks to their embrace of openness. https://www.economist.com/business/2023/03/26/big-tech-and-t...

kristopolous · on April 6, 2023

They're well on their way to yahoo'ing themselves - valuable and significant software contributions for a platform nobody is using.

lacker · on April 6, 2023

I agree, Meta's work in AI has been really impressive, and it's encouraging how they are open-sourcing so much.

It's funny, back in the 90s I disliked Microsoft and thought they were inherently culturally opposed to open source. But big corporations really just follow their business incentives when it comes to open source.

It's not like Meta "loves to be open". But they would hate a world where all the powerful AI development happened on one of the big clouds. Imagine if every researcher used some closed-source API from Google or Azure or AWS. Meta would try to hire some AI people, and they'd be like, ugh, if we take the job we have to use this weird Facebook-specific thing. (<cough>Flow</cough>)

So, supporting PyTorch and providing open source models is just a good business strategy for Meta. I'm glad to see it.

tomcam · on April 6, 2023

I’m glad Microsoft is open sourcing a lot of things.

I have never, ever seen a convincing business case for their doing so, however.

Not your opinion. A compelling business case.

lacker · on April 7, 2023

It's pretty simple, they make a lot of money off Azure, and open source software is a complement to Azure. See the classic:

https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/

In particular, open source software prevents people from getting locked into any of the AWS, Linux, iOS, or Android ecosystems. All of those are popular software development ecosystems that Microsoft does not own or influence very much.

tomcam · on April 8, 2023

Not convincing or even relevant IMHO. Azure infra is largely closed source and Microsoft is making a ton of money off it.

SimonPStevens · on April 6, 2023

The business case for open sourcing dev tooling is that more developers will use them and build on them and that brings more people to your platforms.

tomcam · on April 7, 2023

Sure. Any proof at all for that? I mean, it's the obvious answer, but generally, in business, people who don't pay for things tend not to pay for things. In fact, in the enterprise, having a big budget is the way you keep your corporate power. Free stuff isn't a way to keep your budget high.

Again, I would like to agree with your thinking. I just don't think the real world backs it up.

kouteiheika · on April 6, 2023

> I agree, Meta's work in AI has been really impressive, and it's encouraging how they are open-sourcing so much.

My usual reaction to anything Facebook releases is "yawn, they released yet another model that is practically useless because it's released under a non-commercial license", but I'm pleasantly surprised that this one seems to be actually liberally licensed! Hopefully this continues into the future.

kamranjon · on April 6, 2023

Are the weights open source?

polygamous_bat · on April 6, 2023

Right here: https://github.com/facebookresearch/segment-anything#model-c...

neuronexmachina · on April 6, 2023

For anyone curious about file sizes for the PyTorch models:

* default/sam_vit_h_4b8939.pth: 2.4GB

* sam_vit_l_0b3195.pth: 1.2GB

* sam_vit_b_01ec64.pth: 358MB
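
As a rough sanity check on those sizes, here is a back-of-the-envelope sketch that turns file size into an approximate parameter count. It assumes every parameter is stored as a 4-byte float32 and that the checkpoint holds essentially nothing but weights; neither assumption is confirmed in the thread, and the helper name `estimate_params` is made up for illustration.

```python
# Rough estimate only. ASSUMPTIONS (not stated in the thread):
#   - each parameter is a 4-byte float32
#   - the file contains almost nothing besides the weights
#   - "GB"/"MB" are decimal units
BYTES_PER_PARAM = 4

# File sizes as quoted above, converted to bytes.
checkpoint_sizes_bytes = {
    "sam_vit_h_4b8939.pth": 2.4e9,
    "sam_vit_l_0b3195.pth": 1.2e9,
    "sam_vit_b_01ec64.pth": 358e6,
}


def estimate_params(size_bytes, bytes_per_param=BYTES_PER_PARAM):
    """Hypothetical helper: approximate parameter count from file size."""
    return size_bytes / bytes_per_param


estimates = {
    name: estimate_params(size) for name, size in checkpoint_sizes_bytes.items()
}

for name, count in estimates.items():
    print(f"{name}: roughly {count / 1e6:.0f}M parameters")
```

Under those assumptions the largest checkpoint works out to several hundred million parameters, which is why it is shipped as a separate download rather than checked into the repository.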

woopwoop · on April 6, 2023

lordswork · on April 6, 2023

Where? From the article:

>Currently, the code (without the weights) is available on GitHub

dragonwriter · on April 6, 2023

The weights are linked from the readme; avoiding large binary resources in GitHub repositories is fairly normal.

wongarsu · on April 6, 2023

In the README on said GitHub repo:

https://github.com/facebookresearch/segment-anything#model-c...

Name_Chawps · on April 6, 2023

The weights are not "source".

sokoloff · on April 6, 2023

That is likely true in the most pedantic sense, but in practice, if I create an algorithm that works by using a series of matrix transformations against a set of carefully chosen (read: "trained") matrices and I open-source only the matrix manipulation code but not the specially chosen matrices, I think there's a fair argument to say that I haven't open-sourced the entire algorithm.

In the phrase "the model is open-source in every sense of the word", that, IMO, must include the weights.
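
To make that concrete, here is a toy sketch (not Meta's code; the function name and both sets of matrices are invented for illustration). The "open" matrix-manipulation code is identical in both calls, and only the matrices differ:

```python
def apply_model(weights, x):
    """Chain of matrix-vector products with a ReLU between layers."""
    for i, matrix in enumerate(weights):
        x = [sum(w * v for w, v in zip(row, x)) for row in matrix]
        if i < len(weights) - 1:  # no ReLU after the final layer
            x = [max(0.0, v) for v in x]
    return x


# Hypothetical "carefully chosen" matrices: together they compute |a - b|.
trained = [
    [[1.0, -1.0], [-1.0, 1.0]],  # layer 1: (a - b, b - a)
    [[1.0, 1.0]],                # layer 2: sum of the two ReLU outputs
]

# Placeholder matrices of the same shape, as if the weights were withheld.
withheld = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0]],
]

print(apply_model(trained, [3.0, 1.0]))   # [2.0]
print(apply_model(withheld, [3.0, 1.0]))  # [0.0]
```

The code alone tells you the shape of the computation, but the behaviour you actually care about lives in the matrices.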

Name_Chawps · on April 6, 2023

I'd be curious to know if you think open source companies should share all of their database records too.

sokoloff · on April 6, 2023

I think of two broad categories of database records: transactional data (data created while running the system) and domain data (data created during development and shipped to production as part of the release process).

The former type of data I wouldn't expect to ever be open-sourced. The latter type might or might not be, depending on the intent of open-sourcing the related system.

If I created a human language translation system that used a SQL database to store the dictionary (domain) data and claimed the system was open-source without shipping the domain data, I think people would rightly say that the system was not fully opened.

flangola7 · on April 6, 2023

The actual source code is not important. The source code can be printed on a single A4 page; the valuable final product is the weights you get after running the code for fifty million dollars of compute time.

Name_Chawps · on April 6, 2023

If they start claiming they're "open weights" or "open final product" I'll be up in arms.

sokoloff · on April 6, 2023

They are claiming that. They are also delivering on that claim: https://github.com/facebookresearch/segment-anything#model-c...

dragonwriter · on April 6, 2023

The weights seem to be under the same license, just distributed separately because it doesn’t make sense for the giant binary artifacts of training to be part of the source repository.

jtsiskin · on April 6, 2023

If I autogenerated a huge amount of C based on the weights, which added/multiplied variables the same way the existing code+weights does, would it then be “source”?
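
A minimal sketch of what that autogeneration might look like, assuming a single dot product stands in for the whole model. The generator `emit_c`, the C function name, and the example weights are all hypothetical:

```python
def emit_c(weights, name="generated_dot"):
    """Emit C source for a dot product with the weights baked in as literals."""
    terms = " + ".join(f"{w!r} * x[{i}]" for i, w in enumerate(weights))
    return f"double {name}(const double *x) {{\n    return {terms};\n}}\n"


# Made-up weights purely for illustration.
c_source = emit_c([0.5, -1.25, 2.0])
print(c_source)
```

The result is valid C that a compiler will accept, yet nobody would edit it by hand: to change its behaviour you would retrain and regenerate, which is the crux of the "is it source?" question.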