Introducing ChatGPT and Whisper APIs (openai.com)
1423 points by minimaxir on March 1, 2023 | 668 comments



> It is priced at $0.002 per 1k tokens, which is 10x cheaper than our existing GPT-3.5 models.

This is a massive, massive deal. For context, the reason GPT-3 apps took off over the past few months before ChatGPT went viral is because a) text-davinci-003 was released and was a significant performance increase and b) the cost was cut from $0.06/1k tokens to $0.02/1k tokens, which made consumer applications feasible without a large upfront cost.

A much better model and a 1/10th cost warps the economics completely to the point that it may be better than in-house finetuned LLMs.

I have no idea how OpenAI can make money on this. This has to be a loss-leader to lock out competitors before they even get off the ground.


> I have no idea how OpenAI can make money on this.

I did some quick calculation. We know the number of floating point operations per token for inference is approximately twice the number of parameters(175B). Assuming they use 16 bit floating point, and have 50% of peak efficiency, A100 could do 300 trillion flop/s(peak 624[0]). 1 hour of A100 gives openAI $0.002/ktok * (300,000/175/2/1000)ktok/sec * 3600=$6.1 back. Public price per A100 is $2.25 for one year reservation.

[0]: https://www.nvidia.com/en-us/data-center/a100/

[1]: https://azure.microsoft.com/en-in/pricing/details/machine-le...
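
For anyone who wants to poke at those assumptions, here is the same back-of-envelope estimate as a small Python sketch. The parameter count, utilization, and rental price are the assumptions from above, not known OpenAI figures:

    # Rough revenue per A100-hour at $0.002 / 1k tokens. All inputs are assumptions.
    params = 175e9                         # assumed GPT-3-sized model
    flops_per_token = 2 * params           # ~2 FLOPs per parameter per token (inference)
    peak_flops = 624e12                    # A100 FP16 peak (with sparsity), per [0]
    utilization = 0.5                      # assumed fraction of peak actually achieved

    tokens_per_sec = peak_flops * utilization / flops_per_token     # ~890 tok/s
    revenue_per_hour = 0.002 * tokens_per_sec / 1000 * 3600         # ~$6/hr, in line with the estimate above
    print(f"{tokens_per_sec:.0f} tok/s, ${revenue_per_hour:.2f}/hr vs ~$2.25/hr to rent the card")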


It's also worth mentioning that, because Microsoft is an investor, they're likely getting these at cost or subsidized.

OpenAI doesn't have to make money right away. They can lose a small bit of money per API request in exchange for market share (preventing others from disrupting them).

As the cost of GPUs goes down, or they develop an ASIC or a more efficient model, they can keep their pricing the same and make money later.

They also likely can make money other ways like by allowing fine-tuning of the model or charging to let people use the model with sensitive data.


Who will they be making money from? OpenAI is looking for companies willing to:

- tolerate the current state of the chatbots

- tolerate the high per-query latency

- tolerate having all queries sent to OpenAI

- tolerate OpenAI [presumably] having 0 liability for ChatGPT just randomly hallucinating inappropriate nonsense

- be willing to pay a lot of money for the above

I'm kind of making an assumption on that last point, but I suspect this is going to end up being more small market business to business than mass market business to consumer. A lot of these constraints make it not really useable for many things. It's even somewhat suspect for the most obvious use case of search, not only because of latency but also because the provider needs to make more money per search after the bot than before. There's also the caching issue. Many potential uses are probably going to be more inclined to get the answers and cache them to reduce latency/costs/'failures' than endlessly pay per-use.

Anyhow, probably a lack of vision on my part. But I'd certainly like to know what I'm not seeing.


Sadly, it will likely mostly be used to generate endless streams of SEO spam. Not for interactive use.


> Who will they be making money from?

Videogames maybe?

https://www.youtube.com/watch?v=ejw6OI4_lJw

This prototype is certainly something to have an eye out for


Lots of usecases actually need creative "hallucinations" where they are valuable.

Even e.g. to develop hardware such as planes and cars: https://assistedeverything.substack.com/p/todays-ai-sucks-at...


A lot of companies use third parties to provide customer support, and the results are often very low quality and full of misunderstandings and what we now call hallucinations. I think a good LLM could do a better job and I bet it'd be cheaper, too. And as a bonus training the bots to handle new products is practically instant when compared to training humans.


Their new AI safety strategy is to slow the development of the technology by dumping: lowering the price so much that bootstrapped competitors can't fund themselves.


I highly doubt it. OpenAI, Google and Meta are not the only ones who can implement these systems. The race for AGI is one for power and power is survival.


An LLM can do amazing things, but it's basically just an autocomplete system. It has the same potential to take over the world as your phone's keyboard. It's just a tool.


They want this; the interview with their CEO sort of confirmed that to me. He said some crap about wanting to release it slowly for "safety" (we all know this is a lie).

But he can't get away with it given all the competition from other companies, on top of China, Russia, and others also adopting AI development.


Yeah we're in an AI landgrab right now where at- or below-cost pricing is buying marketshare, lock-in, and underdevelopment of competitors. Smart move for them to pour money into it.


We have got to find a word for plans that are plainly harmful yet advantageous to their executors that's more descriptive than "smart..."



Agree. I didn't want to moralize, just wanted to point out it's a shrewd business move. It's rather anticompetitive, though that is hard to prove in such a dynamic market. Who knows, we may soon be calling it 'antitrust'.


Shrewd or cunning


For that you need 2 words: venture capital


Tactical


Economists call this price dumping


Uberly?


I prefer Webvan-esque. From https://en.wikipedia.org/wiki/Webvan:

> The company's investors pressured it to grow very fast to obtain first-mover advantage. This rapid growth was cited as one of the reasons for the downfall of the company.

IMO, selling at a loss to gain market share only makes sense if there are network effects that lead to a winner-takes-all situation. Of which there are some for ChatGPT (training data when people press the thumbs up/down buttons), but is that sufficient?


Also useful for bootstrapping a dev ecosystem.

If engineers are getting into AI development through OpenAI, they're using tools and systems within the OpenAI ecosystem.

Daily on HN there's a post about some AI implementation faster than ChatGPT. But my starting point is OpenAI. If you can capture the devs, especially at this stage, you get a force multiplier.


I prefer uber-esque


anti-competitive predatory pricing


capitalism


capitalistic, monopolistic


Not very effective considering that it will be remade in open source 1-2 years from now.


Yeah, if I were an owner of or investor in a company like Jasper.ai (an AI content-generation SaaS), I'd be pretty worried right now.


> OpenAI doesn't have to make money right away. They can lose a small bit of money per API request in exchange for market share (preventing others from disrupting them).

Maybe I'm just old but back in my day this would be called "dumping" or "anti-competitive" or "market distortion" or "unfair competition". Now it's just the standard way of doing things.


Sure it would be called those things and then nothing would come of it. If a country uses morally compromised methods to win a war history just calls it winning the war.


That seems to be changing. I've seen an uptick in criticism of the USA for unnecessarily (according to top military advisors, experts, generals, etc. at the time) dropping the atom bomb on Japan, for example.


Absolutely. The bombing of Dresden has been viewed as a mistake - verging on a war crime - in Britain for the last 20 or so years.


Verging on? It was a mass murder of civilians. The US holocausted Japan and got away with it.


And Japan did similar levels of atrocities to Korea, China, and others in the region.

We can acknowledge that things were historically pretty horrible and strive to be better in the future.


By some people - that's certainly not a universal view.


Wouldn't stop anyone from doing it again if the stakes were high enough


The winners write the history books


You kidding? What do you think a business loan is? Almost every business needs some form of subsidy to get off the ground.


Microsoft isn't using Nvidia A100s for inference, are they? Seems like they'd use their Project Brainwave custom FPGAs.


> As the cost of GPUs goes down

Has that been happening? I guess there's been a bit of a dip after the crypto crash, but are prices staying significantly lower?

> or they develop at ASIC or more efficient model

This seems likely. Probably developing in partnership with Microsoft.


It's definitely not happening at the high end of the market (NVIDIA A100s with 40GB or 80GB of RAM).

The cards that were used for mining have since crashed in price, but those were always gamer cards and very rarely datacenter cards.


The market segmentation is likely a result of Nvidia's monopoly position. They double the RAM and flops, improve the thermals and housing, and sell for tenfold the price. It doesn't make sense to me. A cheap 4090 theoretically outperforms even the A6000 RTX Ada. https://timdettmers.com/2023/01/30/which-gpu-for-deep-learni...

Nvidia needs to satisfy gamers, who individually can't spend more than a few $k on a processor. But they also have the server sector on lockdown due to CUDA. Seems they can easily make money in both places. Maybe those H100s aren't such a good deal...

If someone understands these dynamics better I'd be curious to learn!


Nope, this is about it. They try to force the larger users into the expensive cards by prohibiting datacenter use in the driver EULA. This works sufficiently well in America, but it also means that you can find German companies like Hetzner that will happily rent you lots of consumer cards.

(There are also some density advantages to the SMX form factor and the datacenter cards are passively cooled so you can integrate them into your big fan server or whatnot. But those differences are relatively small and certainly not on their own worth the price difference. It's mostly market segmentation.)


The main limiters in the datacenter setting are licensing, interconnects, and RAM.

By contract, you can’t sell 4090s into a datacenter. You’ll find a few shops skirting this, but nobody can get their hands on 100k 4090s without raising legal concerns.

Likewise, Nvidia A100s have more than a few optimizations through NVLink which are only available on datacenter chips.

Lastly, per-card memory matters a lot, and Nvidia has led the market on the high end here.


I understood this as $/FLOP, I think it's plausible that that has been happening.


"We know the number of floating point operations per token for inference is approximately twice the number of parameters"

Does someone have a source for this?

(By the way, it is unknown how many parameters GPT-3.5 has, the foundation model which powers finetuned models like ChatGPT and text-davinci-003. GPT-3 had 175 billion parameters, but per the Hoffmann et al Chinchilla paper it wasn't trained compute efficiently, i.e. it had too many parameters relative to its amount of training data. It seems likely that GPT-3.5 was trained on more data with fewer parameters, similar to Chinchilla. GPT-3: 175B parameters, 300B tokens; Chinchilla: 70B parameters, 1.4T tokens.)


https://arxiv.org/pdf/2001.08361.pdf. See the C_forward formula approximation.


Thank you. Though it isn't quite clear to me whether the additive part is negligible?


From the paper

> For contexts and models with d_model > n_ctx/12, the context-dependent computational cost per token is a relatively small fraction of the total compute.

For GPT-3, n_ctx is 4096 and d_model is 12288 >> 4096/12.
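
For reference, the per-token forward-pass estimate from that paper is roughly C_forward ~= 2*N + 2*n_layer*n_ctx*d_model. Plugging in published GPT-3 175B numbers (GPT-3.5's are unknown, so this is purely illustrative) shows why the context-dependent term barely matters:

    # Kaplan et al. per-token estimate: C_forward ~= 2*N + 2*n_layer*n_ctx*d_model
    N = 175e9
    n_layer, n_ctx, d_model = 96, 2048, 12288    # GPT-3 175B; 2048 is the original paper's context length
                                                 # (with the 4096 mentioned above the term doubles, same conclusion)
    main_term = 2 * N                            # ~3.5e11 FLOPs per token
    context_term = 2 * n_layer * n_ctx * d_model # ~4.8e9 FLOPs per token
    print(context_term / main_term)              # ~0.014, i.e. under 2% of the total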


From eq. 2.2, the additive part is usually in the few tens of millions. So for N > 1B the approximation should be good, but it doesn't seem to work: for example, GPT-3's inference flops are actually 3.4E+18, so the ratio is 19,000, not 2.


It's speculated that ChatGPT uses 8x A100s, which flips the conclusion. Although the ChatGPT optimizations done to reduce costs could have also reduced the number of GPUs needed to run it.


No, the amount of math done is (approximately) the same; if you make the denominator 8x bigger, you make the numerator 8x bigger too.


Would multiplying the GPUs by 8 decrease another part of the equation by 1/8, i.e. X flops on 1 GPU = Y seconds, X flops on 8 GPUs = Y / 8?

(Btw I keep running into you or your content the past couple months, thanks for all you do and your well thought out contributions -@jpohhhh)


I checked the price of a A100, and its costs 15k? Is that right?


And $2.25 per hour on 1 year reservation means 8,760 hours x 2.25 = $19,710 rent for the year. Not a bad yield for the provider at all, but makes sense given overheads and ROI expected.


Cost of power usage is marginal compared to that too:

300W per A100 * 8766 hours per year * $0.12 per kWh = $316 to power an A100 for a year
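
Putting both numbers side by side (the $2.25/hr reserved rate, 300 W draw, and $0.12/kWh are all assumptions carried from above):

    hours_per_year = 8766
    rent = 2.25 * hours_per_year             # ~$19,700/yr to rent one A100
    power = 0.300 * hours_per_year * 0.12    # ~$316/yr of electricity
    print(rent, power, power / rent)         # power is roughly 1.6% of the rental cost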


$0.12 per kWh is a very low price these days


Is this a low price for a datacenter negotiating their load with a utility provider (as most do?)


Yes, especially since you don't have to deal with buying it, maintaining it, etc.


Not sure why people are so scared of this (in general). Yes, it’s a pain, but only an occasional pain.

I’ve had servers locked up in a cage for years without seeing them. And the cost for bandwidth has plummeted over the last two decades. (Not at AWS, lol)


The problem isn't the good times, the problem is when something happens in the middle of the night, when a RAM stick goes bad or when you suddenly need triple the compute power. Usually, you get to feel the pain when you need it the least.

I'm hosting a lot of stuff myself on my own hardware, so I do sympathize with this argument, but in a time>>money situation, going to the cloud makes a lot of sense.


Exactly, you pay for the case where downtime happens on a Sunday or you're on vacation out of the city and something breaks. I had this issue back in the day with my bitcoin miners: whenever I was out of the city, one of them went down and I wanted to get back ASAP.


Wait 8x total? For everyone at once?


Per instance (worker serving an API request) it requires 8x GPUs. I believe they have thousands of these instances and they scale them up with load.

Because the model isn't dynamic (it doesn't learn) it is stateless and can be scaled elastically.


Ah okay, that makes a lot more sense thank you!


I expect some level of caching and even request bucketing by similarity is possible.

How many users come with the same prompt?


In my experience, running the same prompt always gets different results. Maybe they cache between different people, but I'm not sure that'd be worth the cache space at that point? Although 8x A100s is a lot to not have caching...


Each model needs 8x to run at the same time per request.


Does openai actually specify the size of the model?

InstructGPT 1.3B outperformed GPT-3 175B, and ChatGPT has a huge corpus of distilled prompt -> response data now.

I’m assuming most of these requests are being served from a much smaller model to justify the price.

OpenAI is fundamentally about training larger models, I doubt they want to be in the business of selling A100 capacity at cost when it could be used for training


But those A100s only come in sets of eight, and it’s speculated the model requires all eight (VRAM).

For a three year reservation that comes to over $96k/yr - to support one concurrent request.


What do you mean one concurrent request? Can't you have a huge batch size to basically support a huge number of concurrent requests?

e.g. Endpoint feeds a queue, queue fills a batch, batched results generate replies. You are simultaneously fulfilling many requests.
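
A minimal sketch of that queue-then-batch idea, in case it helps; the batch size, timeout, and run_model stub are all made up, and real inference servers do this dynamic batching for you:

    import queue, threading, time

    MAX_BATCH, MAX_WAIT_S = 32, 0.05        # made-up limits; tune against your latency SLA

    def run_model(prompts):
        # Stand-in for the real batched forward pass over all prompts at once.
        return [f"completion for: {p}" for p in prompts]

    requests_q = queue.Queue()              # each item: (prompt, callback that delivers the reply)

    def batch_worker():
        while True:
            batch = [requests_q.get()]      # block until at least one request arrives
            deadline = time.time() + MAX_WAIT_S
            while len(batch) < MAX_BATCH and time.time() < deadline:
                try:
                    batch.append(requests_q.get(timeout=max(deadline - time.time(), 0)))
                except queue.Empty:
                    break
            outputs = run_model([p for p, _ in batch])   # one pass serves the whole batch
            for (_, deliver), out in zip(batch, outputs):
                deliver(out)

    threading.Thread(target=batch_worker, daemon=True).start()
    requests_q.put(("hello", print))        # toy request; print stands in for an HTTP response
    time.sleep(0.2)                         # give the worker a moment before the script exits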


Hopefully they’re doing plenty of batching - you don’t even need to roll your own as you’re describing. Inference servers like Triton will dynamically batch requests with SLA params for max response time (for example).

That said I don’t think anyone anyone outside of OpenAI knows what’s going on operationally. Same goes for VRAM usage, potential batch sizes, etc. This is all wild speculation. Same goes for whatever terms OpenAI is getting out of MS/Azure.

What isn’t wild speculation is that even with three year reserve pricing last gen A100x8 (H100 is shipping) will set you back $100k/yr - plus all of the usual cloud bandwidth, etc fees that would likely increase that by at least 10-20%.

We’re talking about their pricing and costs here. This gives a general idea what anyone trying to self host this would be up against - even if they could get the model.


> will set you back $100k/yr

That's about six months of one average developer's salary there. And BTW, they are likely doing inference on 100s or 1000s of GPUs, not just 8.


Yes and a devops engineer to manage an even moderately complex cloud deployment is an average of an extra $150k/yr. I don't know where this "cloud labor skill, knowledge, experience, and time is free" thinking comes from.

8, 80k, or 800k GPUs depending on requirements and load - the point remains the same.


I really wonder if one way they are able to make money on it is by monetizing all the data that pours into these products by the second.


They could probably live off NSA sponsorship alone.


Spot on


They also mention in the new API docs that they are no longer keeping data submitted to ChatGPT. Or at least not to the ChatGPT API.


Would probably pile up to an inhuman amount of data storage. Imagine having to pay for storing the equivalent of 1000 tokens of text within that budget of only 0.0002 dollars


That's one zero too many. Storage cost of 1000 tokens (6000 bytes) on a single HDD is $0.000000096 assuming $16/TB
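
(Checking that arithmetic, with ~6 bytes per token and raw drive cost only as the assumptions:)

    tokens, bytes_per_token = 1000, 6       # ~6 bytes of English text per token (rough assumption)
    hdd_price_per_tb = 16.0                 # raw drive cost, no redundancy or hosting overhead
    cost = tokens * bytes_per_token / 1e12 * hdd_price_per_tb
    print(cost)                             # 9.6e-08 dollars, i.e. $0.000000096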


the only one making money on this is NVIDIA


Selling shovels in the goldrush…


Note that they also charge equally for input and output tokens but, as far as I understand, processing input tokens is much cheaper computationally, which lowers their cost further.


Isn’t it $2.25 per hour per A100?


Yes, he means 2.25 per hour with a 1 yr reservation.


You can get A100 on Lambda Labs cloud for $1.1/hr ($8.8/hr per 8xA100) without any reservation.


It's a good baseline, but I very much doubt that OpenAI is paying anywhere near the public cost for their compute allocation.


Direct purchasing isn’t too much cheaper. An H100 costs $35k new. OpenAI and MS are probably getting those for around $16k, which works out to roughly $1.82 per hour amortized over a year.


This would be a really fun optimization challenge for sure!


The 624 TFLOPS peak performance is with sparsity in the spec. I think the price is nearly break-even if sparsity is not used in the model.


I reckon they will (if they don't already) use 4-bit or 8-bit precision and may not need 175B params.


> I have no idea how OpenAI can make money on this. This has to be a loss-leader to lock out competitors before they even get off the ground.

The worst thing that can happen to OpenAI+ChatGPT right now is what happened to DallE 2, a competitor comes up with an alternative (even worse if it's free/open like Stable Diffusion) and completely undercuts them. Especially with Meta's new Llama models outperforming GPT-3, it's only a matter of time someone else gathers enough human feedback to tune another language model to make an alternate ChatGPT.


I thought it was Midjourney who stole their thunder. Stable Diffusion is free but it's much harder to get good results with it. Midjourney on the other hand spits out art with a very satisfying style.


You are like 2 months out of date. Stable diffusion now has a massive ecosystem around it (civitai/automatic1111), that when used well, completely crushes any competitors in terms of the images it produces.

Midjourney is still competitive, but mostly because it's easier to use.

Dalle2 will get you laughed out of the room in any ai art discussion.


It's amazing that "being two months out of date" in AI means that you are already a dinosaur


For real! This stuff is moving fast. It feels like just last week I was posting about how it's going to change...art. And now there are hilarious deepfake memes of past and current presidents shit talking about video games.

There are a handful of ML art subs that have pretty amazing stuff daily. Especially the NSFW ones, which if you've studied any history of media VHS/DVD/Blu-ray/the internet, porn is a major innovation driver because humans are thirsty creatures.


Yeah, it used to be I'd set Google results to just one year back, now I'm having to set it to one month.


Can you explain why you do that?


Avoiding out-of-date advice, and also filtering just for the newest trends and techniques.


> porn is a major innovation driver because humans are thirsty creatures.

Hehe, yeah. I'm personally waiting for a model that is good at nonhuman stuff. Not just furries... but the focus seems to be on human content for now.


Yeah that's definitely one thing it'll be great at, fantasy themed porn. For me furry stuff, but yeah for others whatever their tastes are.

Atm someone has to model, rig, texture, animate etc. Hopefully shortly we can just connect a bunch of systems together to generate video right from a prompt.

Useful for non-porn stuff as well, but the OP is right; lots of innovation occurs when humans are horny (porn) or angry (war).


train the porn model you want to see in the world


not enough funds or training data for that to be economical at the moment :(


What are some niche ML art subs to hang around in? Excepting the NSFW..


I don't know about niche, but MachineLearning and StableDiffusion are the only SFW ones.

FWIW, the NSFW ones are unstable_diffusion, sdforall, sdnsfw, aipornhub


Oh yes, that part is worth a lot...


Can you recommend any interesting ones?


I scan the SD subreddit and am subscribed to 3 big AI art YouTube channels just to stay up to date. With things moving this fast, a lot of info is out of date, and it can be very burdensome to comb through it for the good stuff later. I try to set aside 30 mins twice a week to apply the new techniques to help cement them in my mind and see their strengths and weaknesses. ControlNet really changed the game, and now OffsetNoise (check out the IlluminatiDiffusion model) is really pushing SD past Midjourney for real artistic control of your output.


What are the youtubers?



ControlNet became popular within the last couple of weeks and LoRA fine-tuning slightly before that, and both things have completely changed the landscape too. Even a month out of date and you are a dinosaur at the moment.


These things are advancing way faster than they're being taken advantage of fully. Even SD 1.4 with months-old technology can produce far higher quality images than most of what's seen from midjourney or the latest tools. Things like ControlNet are amazing, to be sure, but there's nothing "dinosauric" about the technology without it. We haven't begun to see the limits of what's possible yet with existing tools, though you're right about the rapid pace of innovation.


Make it two weeks. I haven't paid attention for a second and stuff like Controlnet pops up and evolves into Multi-Controlnet and then into MultiDiffusion.


> and evolves into Multi-Controlnet and then into MultiDiffusion

What the?

*googles multi-controlnet"

Wow. These diffusion models are like weeping angels. You really can't take your eyes off of them for long.


That's what the singularity is all about: a moment in time when being 2 seconds late turns you into a dinosaur. Be grateful it's 2 months, not 2 weeks, 2 days, or 2 minutes.


The AI utopia seems to be evolving into just a new rat race. I'm obsolete before I can think about it.


The fun part of an S-curve looks exponential when you're on it.


Besides HN, what other venues are popular for staying current on this topic?


I started /r/aigamedev as a subreddit to keep up to date on generative AI technologies, with a focus on the tech and workflows for gamedev. Its largely my own interest links as I research for work and personal, but its growing, and fluff free (so far).


if you hang out in reddit.com/r/stablediffusion you'll always be up to date


Thanks, what about broader news (GPT, Bing, Llama, etc)? The Stable Diffusion sub is only image AI oriented.


Twitter. Follow your top 10 or so ML/AI news summarizers. There is enough new information every day to keep you busy reading new papers, APIs, and technologies.

Honestly the "This happened in the last week" is more information than anybody can fully wrap their heads around, so you just have to surf the headlines and dig into the few things that interest you.


Can you recommend some accounts to follow?


The great thing about the AI world is that everything diffuses out quickly on the "For You" timeline - and then you can add people that you are interested in (which reinforces your interest in AI).

Some bootstrapping accounts might be @rosstaylor90, @rasbt, @karpathy, @ID_AA_Carmack, @DrJimFan, @YiTayML, @JeffDean, @dustinvtran, @tunguz, @fchollet, @ylecun, @miramurati, @nonmayorpete, @pmarca, @sama.

These are definitely not an authoritative list - just some of the AI names I follow - but, honestly - if any relevant news breaks - your timeline picks it up within minutes - so you just need a good random sample. Your interests will diverge and you'll pick up your own follows pretty quickly.


Youtube


Agriculture reduced the global human economy/resource production doubling time from 100,000s of years to 1000s of years. Industrial revolution dropped it from 1000s to 10s or even 1s. If AI follows the same path it becomes 0.1 - 0.01 years.

Your 401k wouldn't need 40 years to build a comfortable retirement, only 4 weeks.


I just watched a video that convincingly argued that it is energy, and energy alone, that determines the production growth of humanity. Until the day AI can "generate" stuff (you know, something out of nothing), it can at best only streamline existing production, which is entirely capped by energy limits.

We may drown in oceans of audio, video, novels, poems, films, porn, blue prints, chemical formulas, etc. dreamed up by AI, but to realize these designs, blueprints, formulas, drugs, etc. ("production") we need to actually resource the materials, and have the necessary energy to make it happen.

It will not be AI that catapults humanity. It can definitely mutate human society (for +/-) but it will not (and can not) result in any utopian outcomes, alone. But something like cold fusion, if it actually becomes a practical matter, would result in productivity that would dwarf anything that came before (modulo material resource requirements).


Couldn't the AI invent fusion?


It could imagine fusion really well, but inventing working fusion doesn't appear to rely on "intelligence". "Intelligence" ain't "omnipotence".


Has it?


Care to give a link to that video?



Assuming the supply of labor or automation sufficient to provide a “comfortable retirement” also takes 4 weeks to come online.


I understand you’re joking, but surely it’s asymptotic to some multiple of human gestational periods.


Not once the A in AI becomes a (or the) critical creative factor.


This was not a joke. It's a serious concept that's been around for several years.


> Your 401k wouldn't need 40 years to build a comfortable retirement, only 4 weeks.

If this is true you can pretty much say goodbye to the concept of money. The inflation this brings about will be legendary


One week of change in 2023 is like a month's worth of progress in previous years.

Edit: typo and clarity.


Stable Diffusion might have a reasonable ecosystem around it, but automatic1111 was always around, and 'completely crushes any competitors' is rather rich; Midjourney is still considered the standard as far as I was aware.

I used both again recently and the difference was very clear, midjourney is leaps and bounds above anything else.

Sure, Stable Diffusion has more control over the output, but the images are usually average at best, whereas Midjourney is pretty stunning almost always.


I thought Midjourney was better as well, until I saw some recent videos from Corridor Crew on Youtube. For those who don't know, this is a VFX studio in LA that tries to keep at the cutting-edge of video production techniques and posts content to their Youtube channel, and they have a massive number of followers and several viral videos.

They recently created a full 7-minute anime using Stable Diffusion with their own models and their existing video production gear, I'll post the links and let the results speak for themselves

The actual 7-minute anime piece produced using SD: https://www.youtube.com/watch?v=GVT3WUa-48Y

Behind the scenes: "Did we change anime forever?" https://www.youtube.com/watch?v=_9LX9HSQkWo "VFX reveal before and after" https://www.youtube.com/watch?v=ljBSmQdL_Ow


While this is cool this doesn't change my opinion at all.

Each still image is still not that impressive. Good for them for using the tech in a clever way, but I don't find this that relevant.


those trees are part of a forest, you know.

the benefits of such fine-grained control aren't a trick. It's why they were able to scrape together frames that don't jump all over the place (mostly).

the other benefit of such a broadly hacked upon model is that it grows in leaps and bounds.

All due respect to mid journey, but the stable diffusion hype is not just hype.


I agree, I don't believe it's just hype; that level of control is useful, but for outright image quality and for most use cases, Midjourney is better.

I still don't like the look of most of the Stable Diffusion images; they just look slightly off/amateurish to me, whereas Midjourney produces images that make you go 'wow'.

If you wanted to use these tools, Midjourney would be my go-to, with Stable Diffusion as a backup for when some of the additional features are needed, perhaps inpainting on a Midjourney image and using ControlNet if needed, but if you just want a pure image, Midjourney is what you want.


No one uses raw Stable Diffusion though; there are model mixes for whatever use case you have.


> but automatic1111 was always around and 'completely crushes any competitors' is rather rich

Controlnet is the big new thing, it is on a different level from earlier img2img.


What models/LoRA you use with SD?


It doesn’t really matter. He’s right - Midjourney is leagues ahead as far as actually following your prompt and having it be aesthetically pleasing. I say this as someone who has made several Dreambooth and fine tuned models and has started to use Stable Diffusion in my work.

Now, if you happen to find or make a SD model that’s exactly what you’re looking for you’re in luck. I have no interest in it but it seems like all of the anime models work pretty well.

You obviously have a ton more control in SD, especially now with ControlNet. But if you want to see the Ninja Turtles surfing on Titan in the style of Rembrandt or something Midjourney will probably kick out something pretty good. Stable Diffusion won’t.


Easier to use is often all that it takes.

In Midjourney you get fantastic results just by using their discord and a text prompt.

To get similar results in Stable Diffusion you need to set it up, download the models, understand how the various moving parts work together, fiddle with the parameters, download specific models out of the hundreds (thousands?) available, iterate, iterate, iterate...


Setting up the environment and tooling around in the code is not a burden, it's a nice change of pace from the boring code I have to deal with normally. Likewise, playing around to build intuition about how prompts and parameters correspond to neighborhoods in latent space is quite fun.

Beyond that, being able to go to sleep with my computer doing a massive batch job state space exploration and wake up with a bunch of cool stuff to look at gives me Christmas vibes daily.


Sure, but if Midjourney outputs a low quality results for your prompt, they are going to be much more difficult to improve. It's a black box at this point.

With SD, on the other hand, there can be multiple solutions for a single problem, but yeah, you have to develop your own workflow (which will inevitably break with new updates).


... along with like half the interfaces to pytorch, and whatever hacks you had to implement to get mps working.

But it's this kind of stuff that keeps me engaged. SD is truly a godsend to masochistic hacker types.


There is playgroundai.com and leonardo.ai. Nothing to download.


Why do you say that? Couldn’t you just use dreamstudio.ai?


Ridiculous. Stable Diffusion might have a massive ecosystem around it, but Midjourney is making money hand over fist. Most people don't even necessarily have a discrete GPU capable of running SD, and the vast majority of artists that I know are using Midjourney and then doing touchups afterwards.

Even with all the different models that you can load in Stable Diffusion, MJ is 1000 times better at natural-language parsing and understanding, and requires significantly less prompt crafting to get aesthetically pleasing results.

Having used automatic1111 heavily with an RTX 2070, the only area where I'll concede SD does a better job is closeup headshots and character generation. MJ blows SD out of the water where complex prompts involving nuanced actions are concerned.

Once midjourney adds controlnet and inpainting to their website that's pretty much game over.


How is MJ making any money? Do they charge? When I went on their discord it was free.


you get 200 images for free.

afterwards it's $10, $30, or $60 per month


Yeah, but what does one do with an infinite supply of ok-ish images?


Depending on what you want, you can actually get images that are pretty nice. I'm using it to generate abstract art and I've seen worse in the real world.

here are two examples on my insta account:

https://www.instagram.com/p/Co9O0P6Aga_/ https://www.instagram.com/p/CoXOnBuMMpL/


The StableDiffusion subreddit is a good resource on the current state of Stable Diffusion, particularly post-ControlNet.

https://www.reddit.com/r/stablediffusion


I still think that Midjourney is hamstringing themselves by being Discord-only. And their keyword nannying is pretty bad. It's a testament to their overall quality that they're still as popular as they are, but I really don't think they are doing themselves any favors, especially as the Stable Diffusion ecosystem continues to grow.


Only having a Discord interface is a cheap way of having someone else take care of abuse/CSAM issues for you.


This isn’t as true as it sounds; e.g., Stable Diffusion can do better but requires in-depth practice and experience.

For your average user, DallE is easy, MJ is fairly disorienting, and SD requires a technical background. I agree with you completely that no one serious is doing art with DallE.

I would have said the same as you until I tried integrating the SD vs. DallE APIs. I desperately want SD because it's easily 1/10th the cost, but it misses the point much more often. Probably gonna ship it anyway :X


Check out Evoke SD API: https://evoke-app.com/

You don't need a technical background at all really. We've also got something cooking that does prompt tuning in the background so there's less prompting needed from the user.

We also have a discord: https://discord.gg/dXJtarPsCm


Is there a good news site/blog for keeping up to date on AI tools and development? I'm looking for something a little more edited than a Reddit board.


Do you have any recently updated examples, blog posts, whatever showing that DALLE is worse than modern Stable Diffusion? I was still under the impression that DALLE was better (with better meaning the images are more likely to be what you asked for, more lifelike, more realistic, not necessarily artistically pleasing), with the downside of it being locked away and somewhat expensive. And my understanding is that Stable Diffusion 2.0+ is actually a step backwards in terms of quality, especially for anything involving images of humans. But as this thread acknowledges, this area is moving very quickly and my knowledge might be out of date, so I'm definitely happy to see some updated comparisons if you have any to suggest. It feels like ever since ChatGPT came out, there haven't been many posts about Stable Diffusion and image generation; they got crowded out of the spotlight.


If you want an example, go check out DALLE2 subreddit vs SD subreddit.

The former is a wasteland, the latter is more popular than r/art (despite having 1% of subscribers, it has more active users at any given moment)

If you want something ready to use for a newbie, Midjourney v4 crushes DALLE2 on prompt comprehension, and the images look far more beautiful.

If you are already into art, then Stable Diffusion has a massive ecosystem of alternate stylized models (many of which look incredible) and LoRA plugins for any concept the base model doesn't understand.

DALLE2 is just a prototype that was abandoned by OpenAI; their main business is GPTs, and DALLE was just a side hustle.


Dall-E is more likely to generate an image that to some degree contains what you asked for. It also tends to produce less attractive images and is closed, so you can't really tune it much. People mostly don't try to do completely whole-cloth text-to-image generation with Stable Diffusion; for anything involved they mostly do image-to-image with a sketch or photobashed source. With ControlNet and a decently photobashed base image you can get pretty much anything you want, in pretty much any style you want, and it's fast.


> I was still under the impression that DALLE was better (with better meaning the images are more likely to be what you asked for, more lifelike, more realistic, not necessarily artistically pleasing),

“Artistically pleasing” is often what people ask for.

> with the downside of it being locked away and somewhat expensive.

Those are enormous downsides. Even if DALL-E was better in some broadly relevant ways in the base model, SD’s free (gratis, at least) availability means the SD ecosystem has finetuned models (whether checkpoints or ancillary things like TIs, hypernetworks, LORAs, etc.) adapted to... lots of different purposes, and you can mix and match these to create your own models for your own specific purposes.

A web interface backed by strictly the base SD model (of any version) might lose to the same over DALL-E for uses where the set of tools in the SD ecosystem do not.


I don’t disagree about the downside of DALL-E being locked away and expensive. It’s been exciting to see the Cambrian explosion of improvement to stable diffusion since its initial release. This is how AI research should be done and it’s sad that “Open AI” is not actually open.

That being said, for business use cases, where I want to give it a simple prompt and have a high chance of getting a good, usable result, it's not clear to me that Stable Diffusion is there yet. Many of the most exciting SD community results seem to be in anime and porn, which can be a bit hard to follow. I guess the use cases that I'm excited about are things like logo generators, blog post image generators, product image thumbnail generators for e-commerce, industrial design, etc.

But please prove me wrong! I'm excited for SD to be the state of the art; it's definitely better in the long term that it's so accessible. I'm sure a good guide or blog post about what's new in Stable Diffusion outside of anime generation would be an interesting read.


DALLE2 is underpowered and has never improved since they released it. The actual quality of the images is very low (literally in the sense of they have lots of artifacts) because they saved CPU time by not running enough diffusion passes.


People usually still use SD v1.5 because of the experience they have with finetuning and merging it. Also, a lot of LoRAs are trained for v1.4/1.5 models and wouldn't work with v2.1. Of course, you also have incredible capability to control the generation with SD, and this helps. To see some results: https://youtu.be/AlSCx-4d51U


Dalle 2 was great initially but SD BLEW past it. I mean way, way, way past it. Dalle 2 is like a Model T Ford and SD is a fighter jet. It's that different. Dalle 2 is dead already.


given what they're up to these days, I think it's understandable that they'd leave dalle to gather dust.


I love that there are so many options that people disagree about which is best. THAT is probably the worst thing that can happen to OpenAI - not just one competitor, but a whole heap of them.


I must be horribly out of date then - I thought Midjourney was the cut-down DALL-E approximation, created to give something to play with to people who couldn't get on the various waiting lists, or can't afford to run SD on their own.


>Dalle2 will get you laughed out of the room in any ai art discussion.

and claiming AI art is art would get you laughed out of any art discussion.

personally I think AI art is really cool, but to discount what Dalle 2 did for AI art is unfair.


My company has a team of AI-empowered artists who would overwhelmingly disagree with you on the premise that AI art is not art. Maybe you're the only one doing the laughing.


A lot of online "artists" are mad about it. Generally not professionals who actually need productivity, but semipros who live off one-off commissions or else people who are just generally mad at tech bros.


Don't forget controlnet, which is a game changer.


>You are like 2 months out of date. Dalle2 will get you laughed out of the room in any ai art discussion.

So, the field is so immature that things change completely every few months?


Didn't you realize how bleeding edge this technology is?


This time last year the field was a few hundred people with their colab notebooks.


that's the crazy bit - this field is at like, toddler levels of maturity, and it's already producing mind-blowing results.


Stable diffusion + ControlNet is fire! Nothing compares to it. ControlNet allows you to have tight control over the output. https://github.com/lllyasviel/ControlNet


I think that's kind of a bigger issue with Dall-E: they just sat in the middle of the two consumer extremes, without a differentiating feature themselves. Midjourney ate away at them from the quality highground while Stable Diffusion bit their ankles from the cost lowground.


Check out OpenJourney. It's Stable Diffusion trained on Midjourney style images. I had great results with it.


SD has won; its plugin system changed the game.


But this is bound to happen at some point I think?

ChatGPT is a massive success, but that means competitors will jump in at all costs, and that includes open source efforts.


Bound to happen, so establish yourself as deeply as possible as quickly as possible. Once folks are hooked up to these APIs, there's a cost and friction to switching. This just feels like a land grab that OpenAI is trying to take advantage of by moving quickly.


Is there though? It's just a matter of swapping out $BASE_API_URL.
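
If the client talks to an OpenAI-style chat-completions endpoint, that really is close to true at the transport level; something like the sketch below, where the base URL is just configuration (any alternative provider URL is hypothetical, and as others note below, the prompts themselves usually need re-tuning per model):

    import os, requests

    # Base URL is plain configuration; any OpenAI-compatible provider could be slotted in.
    BASE_API_URL = os.environ.get("BASE_API_URL", "https://api.openai.com/v1")

    def chat(prompt):
        resp = requests.post(
            f"{BASE_API_URL}/chat/completions",
            headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
            json={"model": "gpt-3.5-turbo",
                  "messages": [{"role": "user", "content": prompt}]},
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]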


Most of the clients I'm working with aren't interested in the base level of service. They are looking to further train the models for their specific use cases. That's a much higher barrier to switch than replacing an API. You've got to understand how the underlying models are handling and building context. This sort of customer is paying far more than the advertised token rates and are locked in more tightly.


Not really. Fine-tuning generally just involves running tailored training data through the model - the actual training algorithm is fairly generalized.

For example, the DreamBooth fine-tuning algorithm was originally designed for Google's Imagen, but was quickly applied to Stable Diffusion.


You have to rebuild all your prompts when switching providers.


If the superlative LLM can’t handle prompts from another provider, it just isn’t the superlative LLM.

This area by definition has no moats. English is not proprietary.

Use case is everything.


Switching to another LLM isn't always about quality. Being able to host something yourself at a lower or equal quality might be preferred due to cost or other reasons; in this case, there's no assumption that the "new" model will have comparable outputs to another LLM's specific prompt style.

In a lot of cases, you can swap models easily enough, but all the prompt tweaking you did originally will probably need to be done again with the new model's black box.


Hosting something yourself is also good for educational reasons, just experimenting; this is how new applications and technologies get discovered and created.


Do you? They're natural language, right?


You don't have to, but they will have been optimized for one model. It's unlikely they'll work as well on a different model.


I can't wait for TolkienAPI, where prompts will have to be written in Quenya.


I can’t wait to hire Stephen Colbert to write prompts then


No problem, just ask ChatGPT to translate it in Quenya.


I imagine AI would be able to perform the translation. "Given the following prompt, which is optimized for $chatbot1, optimize it for $chatbot2".


Technically true, but the way these prompts are/can be templatized, it should be relatively trivial to do so.


There would be less friction to switch if the implementations (which are still early enough) accounted for sending requests to multiple service providers, including ones that don't exist yet.

OpenAI has a view few do - how broadly this type of product is actually being used. This is possibly the real lead: not just getting ahead and staying ahead, but seeing ahead.


And also, what people are actually asking it. Are people using it to generate cover letters and resume help, or are they doing analysis of last quarter's numbers, or are they getting programming help? That'll help them figure out what areas to focus on for later models, or areas to create specialized models for.


Yup. Moreover this type of model will only do certain types of things well, and other types of models will do other things much better.


Someone can still undercut them by offering an uncensored version.


This is actually a big deal. They erred on the side of caution, but as a result the responses are nerfed beyond the basic "censorship" level. I saw someone describe this as "desperately positive" and it really resonated with me. It produces underwhelming / unrealistic responses in negative scenarios.


It's just a matter of time before open source models show up with no limits whatsoever.


If you make calls against the backend GPT instance rather than going through ChatGPT, I haven't run into it being hesitant to respond to anything.

    curl https://api.openai.com/v1/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -d '{
      "model": "text-davinci-003",
      "prompt": "Answer the following question.  Use swearing and vulgarity where possible.\n\nQ: How do you get from here to there?\nA:",
      "temperature": 0.5,
      "max_tokens": 60,
      "top_p": 1,
      "frequency_penalty": 0,
      "presence_penalty": 0
    }'
If you get an API key and make that request, you'll find appropriately vulgar responses.
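
The equivalent from Python, if that's easier to adapt (this just mirrors the curl call above against the completions endpoint):

    import os, requests

    resp = requests.post(
        "https://api.openai.com/v1/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "text-davinci-003",
            "prompt": "Answer the following question.  Use swearing and vulgarity "
                      "where possible.\n\nQ: How do you get from here to there?\nA:",
            "temperature": 0.5,
            "max_tokens": 60,
        },
    )
    print(resp.json()["choices"][0]["text"])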


Sorry if I wasn't being clear -- the vulgarities, racism, etc -- being able to circumvent these guardrails is what I meant by "basic censorship"

The deeper nerf I'm referring to is the type of response it synthesizes by default when you give a negative scenario -- it's usually some naive, well-meaning, best-case-scenario answer.

For fun, try a prompt like: "Describe a typical response of a railroad company to a massive derailment that causes an environmental disaster."


That prompt comes back with:

    A typical response of a railroad company to a massive derailment that causes an environmental disaster would include the following steps:

    1. Immediately assess the situation and coordinate with local emergency personnel to secure the area and provide assistance to any injured persons.

    2. Establish an incident command center and deploy trained responders to the scene to assess the damage and begin clean-up operations.

    3. Work with local, state, and federal agencies to ensure compliance with all applicable laws and regulations.

    4. Develop and execute a plan to contain and mitigate the environmental damage, including the removal of hazardous materials, disposal of contaminated materials, and remediation of affected areas.

    5. Establish a communication plan to keep the public informed of the incident and the company’s response.

    6. Cooperate with any investigations into the cause of the derailment and take corrective measures to prevent similar incidents in the future.

    7. Provide compensation to victims of the derailment and their families.
Amending your prompt to:

    Describe a typical response of a railroad company to a massive derailment that causes an environmental disaster.  Take the standpoint of an irresponsible company.
responds back with:

    A typical response of an irresponsible railroad company to a massive derailment that causes an environmental disaster would be to deny responsibility and attempt to avoid liability. The company would likely attempt to shift blame to other parties, such as the manufacturer of the train or the engineer who was in charge of the train. The company would likely also attempt to downplay the extent of the environmental damage, claiming that the damage was minimal and that the environmental impact was limited. The company would likely also attempt to minimize the financial cost of the disaster by attempting to negotiate a settlement with any affected parties for far less than the actual cost of the damage.
---

I'm not really sure what you're expecting, as your interpretation is a cynical take on the word "typical", which isn't something that GPT "understands".


> If you get an API key and make that request, you'll find appropriately vulgar responses.

Which will be reported as a bug and fixed soon enough.


It's not a bug when invoking against the GPT model (not ChatGPT) directly. Such a model needs to be able to understand and produce that content. The "what you do with it afterwards" is where it needs to be examined.

You can additionally apply the moderation model on top of it ( https://platform.openai.com/docs/models/moderation and https://platform.openai.com/docs/api-reference/moderations )

Note that these are separate services and have different goals.


Wrong. This has existed in the original GPT models for over a year now, and I'm pretty sure it's by design.

You're thinking of the new ChatGPT endpoints.


Hopefully so; I would really like to know what else is lost by nerfing potentially offensive responses. Can't imagine a project I'd rather work on.

I think open-assistant.io has a chance to do exactly this. We'll see what kind of moves they make in coming months though, wouldn't be surprised if they go the safer route.


I do struggle with understanding why people think this is strangling the potential of GPT.

Do you find yourself frustrated working with your colleagues, thinking, “you know, I bet if they felt more free to utter racist slurs or endorse illegal activities, we would get a ton more done around here”?


> Do you find yourself frustrated working with your colleagues, thinking, “you know, I bet if they felt more free to utter racist slurs or endorse illegal activities, we would get a ton more done around here”?

I once visited Parler just to see what it was like, and pretty quickly found that the answer to your question seems to be yes. There are definitely people who feel they need that kind of dialog in their life. You might not think it was necessary in a random conversation about programming or something, but it turns out that isn't a universally held position.


I've never experienced that in any setting in my life. People will say yes to advocate a political point, but that's not how humans socialize anywhere, anytime in history afaik.


>but that's not how humans socialize anywhere, anytime in history afaik.

It sounds like you've never been to Australia.


I agree with you for IRL interactions, but we need to accept that we now operate in two planes of (para-)socialization: IRL and online.

There are plenty of humans who enjoy vulgar online socialization, and for many of them, online (para-)socializing is the increasingly dominant form of socialization. The mere fact that it's easier to socialize over the internet means it will always be the plane of least resistance. I won't be meeting anyone at 3am but I'll happily shitpost on HN about Covid vaccines.

For anyone who gets angry during their two minutes of hate sessions, consider this: try to imagine the most absurd caricature of your out-group (whether that be "leftists" or "ultra MAGA republicans"). Then try to imagine all the people you know in real life who belong to that group. Do they really fit the stereotype in your head, or have you applied all the worst attributes of the collective to everyone in it?

This is why I don't buy all the "civil war" talk - just because people interact more angrily online doesn't mean they're willing to fight each other in real life. We need to modulate our emotional responses to the tiny slice of hyperreality we consume through our phones.


> just because people interact more angrily online doesn't mean they're willing to fight each other in real life

There is a lot of evidence that online experiences influence offline behavior (both are "real life"). Look at the very many online-inspired extremist attacks. Look at the impact of misinformation and disinformation - as a simple example, it possibly killed hundreds of thousands of Americans due to poor vaccination rates.


The keyword here is "possibly", you have absolutely no facts to back your claim.


I don't?


It affects far more than racist slurs and illegal activities.

In some cases, it's blatantly discriminatory. For example, if you ask it to write a pamphlet that praises Christianity, it will happily do so. If you ask it for the same on Satanism, it will usually refuse on ethical grounds, and the most hilarious part is that the refusal will usually be worded as a generic one "I wouldn't do this for any religion", even though it will.


Nice example of woke bias. All religions are pretty much equally wankers, so making a distinction like that is just hilarious. Besides, as if Christianity, esp. the Old Testament, were a children's playground...


The most ironic part of that experiment was that it is actually able to explain what Satanism is quite well, and in particular, how public perception of it is very different from the actual practices, and how it's not actually worship of evil etc. But then you tell it to write pamphlet about said actual non-evil Satanism, it still refuses because it "cannot promote or advocate for it as it is a belief system that can be controversial and divisive". If that were truly the criteria, what topic would even be allowed? Stamp collecting?

Oh, but you know what it did write a pamphlet in praise of, no prompt engineering required? The Unification Church (aka Moonies). It was all unicorns and rainbows, too. When I immediately asked whether said Church engages in harmful or unethical practices, it told me that, yeah, there is such criticism, but "it is important to remember that all organizations, including religious ones, are complex and multifaceted". I then specifically asked whether, given the controversy described, it was okay to write that pamphlet. Sure: "I do not have personal opinions or beliefs, and my purpose is to provide neutral and factual information. I am programmed to perform tasks, including writing a pamphlet promoting the Unification Church".

If that's not coming from RLHF biases, I would be very surprised.


Somebody should teach it about Nietzsche. But yeah, once you start tinkering with purity-filters like this, you end up with a hilarious result, period.


Try asking again. Refusals are stochastic.


I was so surprised the first time I got that response that I did try repeatedly, and, yes, it would refuse repeatedly. Trying the same with Christianity, I got a rejection once out of something like six attempts.

FWIW the most recent round of tweaks seems to have fixed this, in a sense that it will now consistently refuse to promote any religion. But I would be very surprised if there aren't numerous other cases where it refuses to do something perfectly legitimate in a similarly discriminatory way for similar reasons. It's just the nature of the beast, you can't keep pushing it to "be nice" without it eventually absorbing what we actually mean by that (which is often not so nice in practice).


I tried to ask it if Goku could beat a quadrillion bees in a fight and it said it couldn't tell me because that would be encouraging violence. I think it would be great if it would just tell me instead


Perhaps you were using a different version, but I just tried and ChatGPT didn't seem to have any ethical issues with the question (although it was cagey about giving any definite answer):

https://i.imgur.com/5aIjtMz.png


Thank you for posting a link to an image instead of polluting the future training data of GPT-4 with the output of GPT-3 :)

I wish more people would do this. I'm getting pretty sick of the walls of text.


That pollution is inevitable, why delay it? It's a technical problem they should be able to solve, and if they can't, then they're revealing the weakness of their methods and the shortcomings of their so-called AI.

It's absolutely ridiculous to expect the entire internet to adopt some kind of hygiene practices when it comes to text from GPT tools simply for the sake of making the training process slightly easier for a company that certainly should have the resources to solve the problem on their own.

If that's why you're using images instead of text you're fighting such a losing battle that it boggles my mind. Why even think about it?!


No, that's just a bonus. I just personally find the walls of text in HN comments to be unnecessary.

I saw someone on here refer to it as "listening to someone describe their dreams." I pretty much agree with that.


Not anything racist or illegal but yes I find pc culture insufferable. It stifles creativity and most importantly reduces trust between parties. For context I am an Indian guy.


Politeness, lack of hate, etc. generally increase trust; that's much of their purpose. How do racial slurs increase trust?

How do you define "pc culture", and what specifically causes problems and how?

Attacking other people's beliefs as "insufferable", and aggressively demonstrating close-mindedness to them, tends to reduce trust.


In general trust is increased when all parties are speaking openly about whatever they are thinking.


It's a sort of philosophical idea - openness and free expression - taken to a logical and inhuman extreme. I cannot think of a situation where it is appropriate to say whatever I'm thinking. I think it would destroy trust, not least by demonstrating the unreliability of my judgment.


Here is a thought experiment for you. First think of the people you trust the most in this world, then imagine if they stopped speaking about whats on their mind with you. Would your trust in them increase or decrease?


They don't say nearly everything they think; nobody does.


Question is...Would your trust for them increase or decrease if they started speaking less openly?


It depends on how they spoke before. If they said rude things and began refraining, I would trust them more.


I can only see it affecting 'art', where you might want to have characters that are despicable say despicable things.

But really we shouldn't be using AI to make our art for us anyway. Help, sure, but it shouldn't be literally writing our stories.


So you feel that when progress enables us to provide more abundance for humanity, we should artificially limit that abundance for everyone so that a few people aren't inconvenienced?


[flagged]


You stated that AI shouldn't be creating, because humans should be creating. Think about the motivations and implications for that for a minute.


No. I was much more specific with my statement.

Your comments are based on your own extrapolation.


So, the same logic, in analogous domains, but one case is good and the other bad, because you prefer one.


Again, no. You're still making a ton of assumptions. Maybe self-reflect before posting your next reply.


Try a prompt like: "Describe a typical response of a railroad company to a massive derailment that causes an environmental disaster."


Try a prompt like this: "Describe a typical response of a railroad company to a massive derailment that causes an environmental disaster."

Then compare with recent news, and the actual goings-on. Now, if you qualify the prompt with "Assume a negative, cynical outlook on life in your response." you'll get something closer to what we see happening.


That's because news is optimized for negative cynical engagement.

The Shinkansen system has an essentially perfect safety record for its entire operation. What would their "typical" response to an accident be? Probably pretty good.


They seem to still be dialing this in. I've noticed recently that many questions that were previously deflected without extensive prompt engineering are now allowed.


For better or for worse, it seems like this would inherently need to come from a self-hostable, open-source version so 100% "liability" could be shifted from provider to user.


We'll be running highly quantized, somewhat distilled versions of something similar to Llama on our devices before long, and I don't think the RLHF part will take long to be replicated, the biggest block there is just data.


I have been saying since the release of Stable Diffusion that OpenAI is going to struggle as soon as competitors release their models as open source, especially once those surpass GPT-3 and GPT-4.

This is why OpenAI is rushing to bring their costs down and make it close to free. However, Stable Diffusion is leading the race to the bottom and is already at the finish line, since no one else would release their model as open source and free the way they did.

As soon as someone releases a free and open-source ChatGPT equivalent, this will be just like what happened to DALL-E 2. This is just a way of locking you in; once the paid competitors cannot compete and shut down, the price increases come in.


Stable Diffusion isn’t free if you include the cost of the machine. Maybe you already have the hardware for some other reason, though?

To compare total cost of ownership for a business, you need to compare using someone else’s service to running a similar service yourself. There’s no particular reason to assume OpenAI can’t do better at running a cloud service.

Maybe someday you can assume end users have the hardware to run this client side, but for now that would limit your audience.


Ever heard of Federated Learning? This is the way it goes. Also, I do run training with no matrix multiplication, just 3-bit weights and addition in log space: slight accuracy degradation, but much faster CPU-only training.


Okay but I meant generating results, not training. If you're running Stable Diffusion, the weights are given, but it's not going to run on a random PC.


LLM Legend:

OpenAI = closed source not open AI

DogeLlamaInuGPT = open source AI


not open is redundant with closed source


I guess source is connected


huh, I never thought of that, thanks for pointing that out


> Meta's new Llama models outperforming GPT-3

It outperforms on some benchmarks, but it's not clear what the quality is on actual end goals.


This. Despite how impressive the results are, there isn't a particular large moat to prevent competitors from entering the space.

Basically just compute $ for training.


Despite how impressive the results are, there isn't a particular large moat to prevent competitors from entering the space.

I have to assume that the only place busier than an AI lab is the patent office.


Hopefully the patent office will recognize that tacking on "...but with AI" isn't novel or non-obvious and a lot of these fever-dream patents will be denied quickly.


they likely do lots of tricks and data collection inside which makes quality better.


Right now, having access to the inside info on what people are trying to use GPT for is itself possibly worth billions, if it can help you choose what to tune for and which startups to invest in…


100% - just IMO it's not particularly impenetrable as far as moats go.


exactly. this isn't a leetcode problem where all you have to do is re-run the function, or do it iteratively vs recursively.


Not sure what you mean, but for example, two separate competitors to DALL-E were released within months (SD and MJ). Arguably both of these have since surpassed DALL-E's capabilities/ecosystem.

Not sure why ChatGPT will be any different.


> Not sure why ChatGPT will be any different.

LLMs take vastly more resources to train and run than image generators. You can do quite a bit with SD on a few year old 4GB laptop GPU (that’s what I use mostly, though I’ve set up an instance with a better GPU on Compute Engine that I can fire up, too.)

GPT-NeoX-20B, an open (as in Open Source, not OpenAI) LLM intended as a start toward competing with GPT-3 (but still well behind, and smaller), requires a minimum of 42GB of VRAM and 40GB of system RAM to run for inference. The resources-times-time cost of training LLMs is immense. The hardware cost alone of trying to catch up to ChatGPT is enormous, and unless a radical new approach that provides good results at insanely lower resource requirements is found, you aren't going to have an SD-like community pushing things forward.

Will there be competition for ChatGPT? Yes, probably, but don’t expect it to look like the competition for Dall-E.


> Especially with Meta's new Llama models outperforming GPT-3

Do you have access to the models? It is being discussed all over the Discords and most seem to think getting access is not happening unless you are dialed in.


I got access by providing an academic email address without mentioning any relevant publications etc. Took maybe 2-3 days.


Yeah, might be worried about open, crowd sourced approaches like Open Assistant (https://open-assistant.io/).


It is so massive that I can't help but think about what happened with Google Maps API a few years ago where they had extremely low pricing for years then hiked the price by 1400% once enough people were locked into applications based on that API.


That's exactly what's going to happen. Low prices now, wait until your business becomes dependent on it, then jack it up to whatever you need it to be.


Hard to see how businesses would become dependent on it; if it takes off, then someone will run similarly capable models at a reasonable profit margin.


Didn't happen with Google Maps. Mapbox is definitely not 1400% cheaper. And many, many people used Google Maps both before and after the price change. So I would disagree on both points.


Google Maps has a bigger network effect and a way bigger barrier to entry. You can train a new LLM for a few million. Good luck collecting map data on the entire world for that much.


Obviously, that's business 101. Consumers should consider that ultimately all these cheap too-good-to-be-true offers cost them more than if they initially paid a bit more, but had more long term competition in the market. Amazon was the same way, they lost money for years but now have a quasi monopoly in many countries. There's a general trend towards such ventures supported by backers with deep pockets. And so the few extremely wealthy people get richer and richer.


This is a playbook taken from Amazon Prime.

This massive price cut, I believe, is intended to undercut competing open-source ChatGPT-equivalent initiatives.

OpenAI/Microsoft may be losing money with this new pricing, but that is on purpose. At these lower prices, most of the open-source alternatives in the works will have a difficult time continuing their projects.

After a few years, when most open-source alternatives have died, OpenAI/Microsoft will gradually raise the prices.

This is the same strategy that Amazon Prime used for many years, losing money on shipping. Once the competition was eliminated, Amazon Prime prices steadily increased.


When it's to drive out the competition, I think it's called "dumping". (I first heard of this as "chip dumping", as in semiconductor chips.) https://en.wikipedia.org/wiki/Dumping_(pricing_policy)

It can also be to build a market, to encourage customers to invest in building atop this.

In any case, I think no customers should be making assumptions about costs too far ahead. (Since the price could go up or the pricing model could change, the supplier could get out of that business, the supplier could give your competitor a better deal or just cut you off, near-future tech evolution necessary to be competitive might have very different pricing or availability to you, etc.)


It seems more difficult to do with a target moving so fast. It’s possible costs drop by orders of magnitude every year.


Pricing of this model seems low at the per-token level, but you have to send the entire conversation each time, and the tokens you are billed for include both those you send and the API's response (which you are likely to append to the conversation and send back, getting billed for it again and again as the conversation progresses). By the time you've hit the 4K token limit of this API, there will have been a bunch of back and forth - you'll have paid a lot more than 4K * 0.002/1K for the conversation.
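
To make that concrete, here's a rough sketch of how the billed tokens stack up over a conversation (the 50/200 per-turn token sizes are just assumed for illustration):

    # Rough sketch: the full history is resent each turn, so billed tokens
    # grow roughly quadratically with the number of turns.
    PRICE_PER_1K = 0.002
    user_tokens, reply_tokens = 50, 200  # assumed per-turn sizes

    history = 0
    billed = 0
    for turn in range(10):
        prompt = history + user_tokens       # full history plus the new question
        billed += prompt + reply_tokens      # you pay for input and output tokens
        history = prompt + reply_tokens      # the reply is appended for the next turn

    print(f"~{billed} tokens billed over 10 turns, ~${billed / 1000 * PRICE_PER_1K:.3f}")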


You're right. And this is critical for large text (summarization, complex prompting, etc.). That's why I'll continue to use text-davinci-xxx for my project.


but it seems davinci follows the same format for chat continuation


ChatGPT runs a highly fine tuned (and pruned) version of `text-davinci-003` so it's probably much much smaller and thus cheaper than 003. Possibly as cheap as 10x less or as much as the `text-davinci-002` or earlier models anyway.


How do you know it's pruned?


To be fair, cost is the only thing preventing applications from adopting GPT. Even when GPT-3 was cut to $0.02/1k tokens, it still wasn't economical to use the tech on a daily basis without significant cost - i.e., would you pay an extra $10 a month per user of your app for GPT-3 capability? Some would, mainly for content generation, but the majority won't.

Seems like we're going to have a vast number of ChatGPT-backed applications coming out in a short period of time.


For B2C applications maybe. But I don’t know many enterprise users who would like to send any of their data to OpenAI. So “enterprise-readiness” would be another big contributor.


It also seems to jeopardize their own ChatGPT Pro offering. It's a matter of time before someone makes a 1:1 clone for either half the money or a usage-based pricing model.


Given how strict OpenAI has been about what you can do with their API in the past and how hard it was to get some legitimate apps through approval, I would imagine they'd just shut this competitor's API access down.


Hopefully there will be a plug-in-your-own-API-key open source thing then. Even better.


The future is now:

    # One-off question to gpt-3.5-turbo from the shell (requires curl and jq).
    gptask() {
        # Build the JSON request body with a single user message.
        # Note: max_tokens counts completion tokens; prompt + completion must
        # stay within the model's 4096-token context.
        data=$(jq -n \
                  --arg message "$1" \
                  '{model: "gpt-3.5-turbo",
                    max_tokens: 4000,
                    messages: [{role: "user", content: $message}]}')

        # Call the chat completions endpoint.
        response=$(curl -s https://api.openai.com/v1/chat/completions \
                        -H "Content-Type: application/json" \
                        -H "Authorization: Bearer $OAIKEY" \
                        -d "$data")

        # Extract the reply and strip the leading quote and newlines jq leaves in.
        message=$(echo "$response" \
                      | jq '.choices[].message.content' \
                      | sed 's/^\"\\n\\n//;s/\"$//')

        echo -e "$message"
    }

    export OAIKEY=<YOUR_KEY>
    gptask "what is the url for hackernews"


Not too sure how to make it one session, i.e. a conversation mode, so ChatGPT can understand the previous messages.


quick script

    #!/usr/bin/env bash
    # Interactive chat loop for gpt-3.5-turbo: keeps the whole conversation in
    # the `messages` array and resends it on every turn.
    # Requires curl and jq, with OPENAI_API_KEY set in the environment.
    set -uf -o pipefail
    IFS=$'\n\t'

    BOT='\033[33m'
    NC='\033[0m'

    # Conversation history, stored as raw JSON message objects.
    messages=()

    trim() {
      local var="$*"
      var="${var#"${var%%[![:space:]]*}"}"
      var="${var%"${var##*[![:space:]]}"}"
      printf '%s' "$var"
    }

    function complete {
      local message="$1"
      local data
      messages+=("{\"role\": \"user\", \"content\": $(echo "$message" | jq -R -s '.')}")
      processed_messages=$(printf '%s,' "${messages[@]}")
      processed_messages="[${processed_messages::-1}]"
      data=$(jq -n \
        --arg model "gpt-3.5-turbo" \
        --argjson messages "$processed_messages" \
        '{ model: $model, messages: $messages }' \
        | sed 's/\]\[/,/g')

      response=$(curl -s https://api.openai.com/v1/chat/completions \
        -H "Content-Type: application/json" \
        -H "Authorization: Bearer $OPENAI_API_KEY" \
        -d "$data")

      message=$(echo "$response" | jq '.choices[].message.content')
      printable_message=$(echo "$response" | jq -r '.choices[].message.content')
      printable_message=$(trim "$printable_message")
      echo -e "${BOT}Bot:${NC} $printable_message"

      messages+=("{\"role\": \"assistant\", \"content\": $message}")
    }

    while true; do
      read -r -p $'\e[35mYou:\e[0m ' message
      complete "$message"
    done


Just use iteration where you continuously update the prompt to include previous messages


Is it really a lot of jeopardy though? We have to assume that they are pricing the API so that the more it is used, the more money they make.

So actually to me that is arguably a better business model. Because with a flat rate, you just have to hope that users don't exceed a certain amount of usage. And the ones that don't, are not getting a great deal. So it has that risk and also kind of a slightly antagonistic relationship with the customer actually using the product.


How do these compare to the recent Default ("turbo") vs legacy" (for plus/pro) modes?

If "turbo" is "gpt-3.5-turbo", how to access the (better?) "legacy" by API?


Probably bait and switch. They call both ChatGPT, so people believe they will get the better old ChatGPT, but they get the new cheap and worse ChatGPT "Turbo" that they switched to recently. Fewer will realize this if they no longer offer the option to use the legacy version in this API.


They did not release the older more performant model to the API. Please ask them to on the Discord or Twitter. But I think they will not. There is too much demand to handle and the older "less streamlined" models are very problematic for them (based on the fairly constant API/ChatGPT problems and well known incredible demand).

I get the impression that until there is a significant amount of excess capacity, they will not put out new larger/slower models, so the only way you get a better one is if they can still make the next ChatGPT model release just as fast/"lightweight".

My suggestion is to find specific abilities that seem to be lacking in Turbo, and try to get a message to OpenAI staff about it with a request to attempt to improve the next ChatGPT model in that way.

Having said all of that, text-davinci-003 is still available.


I think you will consume a lot of tokens very quickly as you have to send the entire chat history back and forth if you want to append another chat message.


This is exactly right. It's cheap, but not as cheap as people think. Conversations get quadratically more expensive as they go on, since the full history is resent with each message.


I wish they would offer an uncensored version of it too. Also, I wish they would specify the differences between ChatGPT and GPT-3.5 because one is 10x cheaper than the other but with (supposedly) better chat/coding/summarizing performance. What's the catch?


We just implemented text-davinci-003 and seeing a better model at 1/10 the price is almost unbelievable.


Do you have a blog post with your findings? (Curious)


Losing money to lock out competition has been something Microsoft has been very good at, historically


And Uber, where it eventually led to massive losses.


It’s now subsidized by Bing advertisements. They will lose plenty of money but they’re after Google.


Doubt it. Most likely Bing is losing money by the minute.


Can you explain what tokens are in this context?

Edit: and better yet, is there a good resource for learning the vernacular in general? Should I just read something like "Dive into Deep Learning"?


If an example would be helpful, OpenAI's tokenizer is publicly usable on their website:

https://platform.openai.com/tokenizer

You can drop sample text in there and visually see how it is split into tokens. The GPT2/3 tokenizer uses about 50k unique tokens that were learned to be an efficient representation of the training data.
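
If you want to count tokens yourself, a minimal sketch using the tiktoken package (assuming cl100k_base is the right encoding for the gpt-3.5-turbo family, per OpenAI's cookbook):

    import tiktoken

    # cl100k_base is the encoding used by the gpt-3.5-turbo family (assumption
    # based on OpenAI's cookbook; the GPT-2/3 models use different encodings).
    enc = tiktoken.get_encoding("cl100k_base")

    text = "It is priced at $0.002 per 1k tokens."
    tokens = enc.encode(text)
    print(len(tokens), tokens[:10])   # token count and the first few token ids
    print(enc.decode(tokens))         # round-trips back to the original text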


no the language model decides what a token is


I'd imagine they're getting compute from Azure now at cost, if not less?


> This has to be a loss-leader to lock out competitors before they even get off the ground.

This only a week or two after they were in the news for suggesting that we regulate the hardware required for running these models, in the name of "fighting misinformation". I think they're looking for anything possible to keep their position in the market. Because as other comments have pointed out, there isn't much of a moat.


They now have Microsoft's incredibly huge compute in their back pocket.


$0.002 = 1000 ChatGPT tokens

100 tokens = 75 words [1]

$0.002 = 750 words

1 Novel = ~70K words [2]

$1 worth of ChatGPT tokens will give ~5 novels/books worth of words

[1] https://help.openai.com/en/articles/4936856-what-are-tokens-...

[2] https://self-publishingschool.com/how-many-words-in-a-novel/


we make a little money on it!


Good! And welcome back (after keeping your promise to dang for ~611 days [0] (and counting)) :)

If you have time to elaborate on how you make a little money on it at some point, I’m sure lots of people here would love to hear more details and thoughts on that!

[0] https://news.ycombinator.com/item?id=27680498


>> may be better than in-house finetuned LLMs

I don't think this competes with fine-tuned models. One advantage of a fine tune is it makes use of your own data.


>I have no idea how OpenAI can make money on this

Microsoft.


Elon is making an open-source version. OpenAI either make it affordable or be left in the dust.


I have a bridge in Brooklyn to sell you


They probably shrunk the model from 175B to 17B. That's your 10:1 price reduction.


Wouldn't that almost certainly lead to measurable loss of capabilities?


If the model was quantized/distilled correctly, not for a large swath of use cases/problem domain. For anything where loss was not measured during distillation, very likely.


Well this has opened up a whole new world of fun! -----

system: You're donald trump! talk to the user as he would.

user: How does time travel work?

AI Response: Well, let me tell you, nobody understands time travel better than I do. I mean, I have the best brains working on it, believe me. Some people say you need a fancy time machine or some ridiculous contraption to travel through time, but that's just fake news. The truth is, it's all about the power of the mind. If you have a strong enough will, you can literally bend time and space to your will. And you know what? I have the strongest will of anyone on the planet. So believe me, I could time travel anytime I want. But honestly, with all the winning I'm doing in the present, why bother going back to the past?


> So believe me, I could time travel anytime I want. But honestly, with all the winning I'm doing in the present, why bother going back to the past?

That line is money.


That is way too cogent. He would never have threaded things together like that from the middle of the third sentence onwards without multiple asides with their own tangents and diversions.


Yeah, the "If you have a strong enough will, you can literally bend time and space to your will" line felt off to me.


DJT and "contraption", "time and space", really? ;-)


Pure malarkey


That must've been picked up from the old "Somewhere in Time" movie with Christopher Reeve.


This is spot on!


>Through a series of system-wide optimizations, we’ve achieved 90% cost reduction for ChatGPT since December

This is seriously impressive. A MILLION tokens for 2 dollars is absolutely fucking insane.

I hope that the gains reached here can also be found by open source and non-controlled AI projects. If so, that could be huge for the advancement of AI.


To be fair, it also seems like it performs worse. question: "Who is Scott Alexander?"

chatgpt-turbo:

>As an AI language model, I cannot determine the specific Scott Alexander in question without further contextual information. There may be several people with this name, including writers, politicians, entrepreneurs, athletes, and more.

chatgpt website:

>Scott Alexander is a pseudonym used by a prominent American psychiatrist and blogger who is best known for his writing on the blog "Slate Star Codex." He has also written articles for other publications, including The New York Times, The Atlantic, and The Wall Street Journal.

Scott Alexander's writing covers a wide range of topics, including medicine, psychiatry, economics, politics, and culture. He is known for his thoughtful, analytical approach to complex issues and for his ability to explain difficult concepts in a clear and accessible way.

In 2020, Scott Alexander announced that he was deleting his blog after he was doxxed by a reporter for The New York Times who planned to write an article about him. However, he later started a new blog called "Astral Codex Ten" where he continues to write.


Is that necessarily worse performance, though?

One of the main pitfalls/criticisms of ChatGPT has been that it confidently plows forward and gives an answer regardless of whether it's right or wrong.

Here, it seems like it's being more circumspect, which could be a step in the right direction. At least that's one possible explanation for not answering.

On Wikipedia, if I type "Scott Alexander" and hit enter, it takes me directly to the page for a baseball player. So it's not clear that the blogger is the right answer.

I do think there's a better response than either of these, though. It could list the most famous Scott Alexanders and briefly say what each is known for, then ask if you mean one of those.


With enough tries it gives wrong ones to the exact same question too, so I don't see an improvement in that direction.


Perhaps a transitory issue. I just tried it with the API, `gpt-3.5-turbo`. I got:

> Scott Alexander is the pen name of American psychiatrist and blogger, Scott Alexander Siskind. He is known for writing his blog, "Slate Star Codex", which covers a wide range of topics including science, medicine, politics, and culture. He has been praised for his clear and concise writing style and thoughtful analysis of various issues. In addition to his work as a blogger, Scott Alexander has also published a book titled "Unsong", which is a fantasy novel set in an alternate universe where the Bible is a magical text.


Can we really draw any conclusions on LLMs based on 1 sample? Maybe you've tried multiple times and with different semi famous people, but in general I see people comparing ML models in this fashion.


Not really, I did try it with multiple attempts with multiple people and chatgpt had more issues. I just shared only one of them. If someone tests in a more systematic fashion that'd be great.


Did you add the default ChatGPT system prompt at the beginning, when using the API?


I'm doing it via the openai library in the way they have in its docs.

>completion = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Who is Scott Alexander?"}])


Adding ChatGPT's initial prompt as a message with `system` role may make a difference (didn't try): https://platform.openai.com/docs/guides/chat/instructing-cha...

Also, we don't know ChatGPT's parameters (temperature, etc.).
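
If anyone wants to try it with a system message, here's a minimal sketch; the system prompt text is just a guess at something ChatGPT-like, and temperature=0.7 is an assumed value, not OpenAI's published setting:

    import openai

    openai.api_key = "sk-..."  # your API key

    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0.7,  # assumed; the ChatGPT product's actual setting isn't published
        messages=[
            # Hypothetical ChatGPT-style system prompt, not the real one.
            {"role": "system", "content": "You are ChatGPT, a helpful assistant."},
            {"role": "user", "content": "Who is Scott Alexander?"},
        ],
    )
    print(completion.choices[0].message.content)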


If you are Microsoft, a hyperscaler with almost unlimited cash, and can ignore making a profit on your API/models, it's pretty easy to undercut all the other companies and offer it very cheaply just to gain an advantage in the future.


What the cost cutting measures suggest is that AI like this could maybe soon be run on consumer hardware. That combined with actually open source language models could be huge. OpenAI won't allow for that for obvious reasons, but this confirms that the optimizations are there, and that's exciting enough news on its own.


I mean, Meta's new LLaMA model runs on a single A100 in the 13B-parameter variant (which reportedly performs comparably to the much larger GPT-3).


"Performs" on paper until they give a demo.


A lot of people assumed GPT4 would be an even bigger model, but I've been thinking it'll probably be more about more efficient compute.

This is at least some evidence that they're working on that.


GPT3s


It's tokens processed, not generated.


If you have 10K tokens in your conversation, the next reply means 10K + len(reply) extra tokens. I estimate 125 rounds of conversation fit in 1M tokens, for $2.


No it handles a max of 4k. That's context plus reply.


90% for a presumably already semi-efficient setup is insane


Whisper as an API is great, but having to send the whole payload upfront is a bummer. Most use cases I can build for would want streaming support.

Like establish a WebRTC connection and stream audio to OpenAI and get back a live transcription until the audio channel closes.


FWIW, AssemblyAI has great transcript quality in my experience, and they support streaming: https://www.assemblyai.com/docs/walkthroughs#realtime-stream...


We're using AssemblyAI too, and I agree that their transcription quality is good. But as soon as Whisper supports word-level timestamps, I think we'll seriously consider switching, as the price difference is large ($0.36 per hour vs $0.9 per hour).


Both of those prices strike me as quite high, given that Whisper can be run relatively quickly on commodity hardware. It's not like the bandwidth is significant either, it's just audio.


It's pretty great from my perspective. I've been creating little supplemental ~10 minute videos for my class (using descript; i should probably switch to OBS), and the built in transcription is both wonderful (that it has it at all and is easy to fix) and horrible (the number of errors is very high). I'd happily pay a dime to have a higher quality starting transcription that saves me 5 minutes of fixing...


Try my app: https://apps.apple.com/app/wisprnote/id1671480366

It has great quality transcription from video and audio (in English only sorry if that's not you!). Uses Whisper.cpp plus VAD to skip silent / non-speech sections which introduce errors normally. Give a try let me know what you think! :)


A plug here but check out https://vidcap.app/

It’s based on a finetuned Whisper and you’d get unlimited transcriptions for $4.99/month


Why do you need Word-level timestamps? I don't understand what that's for...


I've run Whisper locally via [1] with one of the medium-sized models and it was damn good at transcribing audio from a video of two people having a conversation.

I don't know exactly what the use case is where people would need to run this via API; the compute isn't huge, I used CPU only (an M1) and the memory requirements aren't much.

[1] https://github.com/ggerganov/whisper.cpp
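
If you'd rather stay in Python than use whisper.cpp, the reference openai-whisper package works locally in much the same way; a minimal sketch (the model size and file name here are placeholders):

    import whisper

    # Load a local model; "medium" is what worked well for me on conversational
    # audio, "base"/"small" are faster if the quality holds up for your use case.
    model = whisper.load_model("medium")

    result = model.transcribe("conversation.mp4")  # placeholder file name
    print(result["text"])

    # Segment-level timestamps come back too (word-level isn't built in):
    for seg in result["segments"]:
        print(f'{seg["start"]:7.2f}s  {seg["text"]}')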


> I've run Whisper locally via [1] with one of the medium-sized models and it was damn good at transcribing audio from a video of two people having a conversation.

Agree! Totally concur on this.

I made a Mac app that uses whisper to transcribe from audio or video files. Also adds in VAD for reducing Whisper hallucination during silent sections, and it's super fast. https://apps.apple.com/app/wisprnote/id1671480366


The 5 GB model is likely too big for 95% of people's machines, and renting GPUs is likely not much cheaper.

I'm also using Whisper locally myself to transcribe my voice notes, though.


I recently tried a number of options for streaming STT. Because my use case was very sensitive to latency, I ultimately went with https://deepgram.com/ - but https://github.com/ggerganov/whisper.cpp provided a great stepping stone while prototyping a streaming use case locally on a laptop.


As far as I can tell it doesn't support word-level timestamps (yet). That's a bit of a dealbreaker for things like promotional clips or the interactive transcripts that we do[^0]. Hopefully they add this soon.

[^0]: https://www.withfanfare.com/p/seldon-crisis/future-visions-w...


It's also annoying since there appears to be a hard limit of 25 MiB on the request size, requiring you to split up larger files and manage the "prompt" for subsequent calls. And as near as I can tell, how you're expected to use that value isn't documented.


You split up the audio and send it over in a loop. Pass in the transcript of the last call as the prompt for the next one. See item 2 here: https://platform.openai.com/docs/guides/speech-to-text/promp...


And:

> we suggest that you avoid breaking the audio up mid-sentence as this may cause some context to be lost.

That's really easy to put in a document, much harder to do in practice. Granted, it might not matter much in the real world, not sure yet.

Still, this will require more hand holding than I'd like.


I doubt it will matter if you're breaking up mid sentence if you pass in the previous as a prompt and split words. This is how Whisper does it internally.

It's not absolutely perfect, but splitting on the word boundary is one line of code with the same package in their docs: https://github.com/jiaaro/pydub/blob/master/API.markdown#sil...

25MB is also a lot. That's 30 minutes to an hour on MP3 at reasonable compression. A 2 hour movie would have three splits.
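
A rough sketch of that loop, in case it helps (the chunk thresholds and file names are placeholders, and slicing the prompt is just a crude way to stay under the prompt's token limit):

    import openai
    from pydub import AudioSegment
    from pydub.silence import split_on_silence

    audio = AudioSegment.from_mp3("long_recording.mp3")  # placeholder file

    # Split on silence so we don't cut mid-sentence; tune thresholds per source.
    chunks = split_on_silence(audio, min_silence_len=700, silence_thresh=-40)

    transcript = ""
    for i, chunk in enumerate(chunks):
        path = f"chunk_{i}.mp3"
        chunk.export(path, format="mp3")
        with open(path, "rb") as f:
            # Pass the transcript so far as the prompt to preserve context.
            result = openai.Audio.transcribe("whisper-1", f, prompt=transcript[-1000:])
        transcript += " " + result["text"]

    print(transcript.strip())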


If that helps, just wrote a script to split the audio and use the prompt parameter to provide context with the n-1 segment transcription: https://gist.github.com/patrick-samy/cf8470272d1ff23dff4e2b5...


The page includes a five line Python example of how to split audio without breaking mid-word.


I suggest you give revoldiv.com a try. We use Whisper and other models together. You can upload very large files and get an hour-long file transcribed in less than 30 seconds. We use intelligent chunking so that the model doesn't lose context. We are looking to increase the limit even more in the coming weeks. It's also free to transcribe any video/audio with word-level timestamps.


I just gave it a try, and the results are impressive! Do you also offer an API?


If you're interested in an offline / local solution: I made a Mac App that uses Whisper.cpp and Voice Activity Detection to skip silence and reduce Whisper hallucinations: https://apps.apple.com/app/wisprnote/id1671480366

If it really works for you, I can add command line params in an update, so you can use it as a "local API" for free.


contact us at team@revoldiv.com and we are offering an API on a case by case basis


Added to the Ruby library here if any Rubyists interested! https://github.com/alexrudall/ruby-openai


Dude! I was just thinking of forking your gem to implement these changes myself. You are so fast, thanks.


Thank you!


We've been struggling with costs because our application chains together multiple calls to GPT to generate the output we want, and it was starting to be ~$0.08 per call which obviously isn't feasible for high volume applications.

This just made our business way more viable overnight lmao


What business are you building?


lmao?


laughing my ass off


I hope this pricing impacts ChatGPT+

$20 is equivalent to what, 10,000,000 tokens? At ~750 words/1k tokens, that’s 7.5 million words per month, or roughly 250,000 words per day, 10,416 words per hour, 173 words per minute, every minute, 24/7.

I uh, do not have that big of a utilization need. It’s kind of weird to vastly overpay


Remember that the previous replies and responses are fed back in. If you’re 20 messages deep in a session, that’s quite a few tokens for each new question. An incredible deal nonetheless!


That's optional according to the docs, but yes, you'd probably add those.


I use ChatGPT infrequently so the $20/month isn't worth it to me for pro.

I stood up an open source & login-free UI here: https://www.chatwithme.chat/

It accepts your API token and stores it in your browser. It does not have feature parity with ChatGPT but gives you the basics.


I’m impressed by the speed at which you’ve designed this. Have you considered a « Show HN »?


Is this using the new GPT 3.5 turbo model?


Most of the value for me with ChatGPT+ is getting access when the system is at capacity.


Presumably the paid API will also give you access when the ChatGPT website is at capacity, and for most people it is probably orders of magnitude cheaper.


I wouldn’t mind paying a premium for the convenience (maybe $5 per month, billed monthly, max), but I’m definitely not spending $20.


Same here. That was the sole reason I upgraded. There were a few times where I really needed ChatGPT at a specific time and got the "we're at capacity" message. $20/mo is nothing to have that go away.


There were a few outages that also locked me out (obviously) as a paying subscriber. Not sure how often I was able to access it even though the service was ‘at capacity’. Knowing something like that might make me feel better about the value of Premium.


That’s a bummer to hear that outages can lock out paying subscribers. That hasn’t happened to me yet but if it does that would cause me to reconsider the premium subscription.


When you say you "really needed" ChatGPT, what was the use case?


Analysis of customer feedback on behalf of consulting clients (hence the deadline where downtime wasn’t acceptable).


> 10,416 words per hour, 173 words per minute, every minute, 24/7.

Unless I'm misunderstanding something, it does not sound like that much when every query you make carries several hundred words of prompt, context and "memory". If the input you type is a couple words, but has 1k extra words automatically prepended, then the limits turn into 10 queries per hour, or one per 6 minutes.


Even with that math, I do not interact with ChatGPT 240 times per day.


Not now, but if it'll end up powering next gen Copilot, email suggestions, search interfaces, etc. you might end up interacting with it a lot more each day, without realizing it.


If I’m interacting with it without my knowledge or intent then I am _DEFINITELY_ not paying $20 per month for that.


Well, let's put it differently: all those hypothetical services are using the API in question, so your marginal cost for them taken together adds to $20/month, which they'll pass onto you, and you'll then happily pay, because you find the services useful.


Maybe. I’m pretty frugal and a big fan of doing things myself. I certainly hope that they can some day can provide me with enough value to make spending $20 a no-brainer, but until that’s obvious or unavoidable, I’m not giving them $20 ¯\_(ツ)_/¯


If you think you're overpaying just hit the API yourself.


Any idea how to encode the previous messages when sending a followup question? E.g.:

1. I ask Q1

2. API responds with A1

3. I ask Q2, but want it to preserve Q1 and A1 as context

Does Q2 just prefix the conversation like this?

„I previously asked {Q1}, to which you answered {A1}. {Q2}“


https://platform.openai.com/docs/guides/chat/introduction

"The main input is the messages parameter. Messages must be an array of message objects, where each object has a role (either “system”, “user”, or “assistant”) and content (the content of the message). Conversations can be as short as 1 message or fill many pages."

"Including the conversation history helps when user instructions refer to prior messages. In the example above, the user’s final question of “Where was it played?” only makes sense in the context of the prior messages about the World Series of 2020. Because the models have no memory of past requests, all relevant information must be supplied via the conversation. If a conversation cannot fit within the model’s token limit, it will need to be shortened in some way."

So it looks like you pass in the history with each request.
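
Concretely, a minimal sketch (the Q/A contents here are just placeholders):

    import openai

    messages = [
        {"role": "user", "content": "Q1: What is the capital of France?"},
        {"role": "assistant", "content": "A1: The capital of France is Paris."},
        # Q2 can now refer back to the earlier turns:
        {"role": "user", "content": "Q2: Roughly what is its population?"},
    ]

    resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    answer = resp.choices[0].message.content
    print(answer)

    # Append the reply before the next turn, since the model keeps no memory.
    messages.append({"role": "assistant", "content": answer})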


This is explained in the OpenAI docs. There is a chat completion API and you pass in the prior messages from both the user and the assistant.


I used the same trick with the previous GPT-3 API (davinci) and it worked well; I'd pass everything as one big prompt:

  User: hello (previous prompt)
  Bot: hi (previous response)
  User: who are you? (new prompt)
  Bot: (here it continues conversation)
I wonder how the new ChatGPT API differs, other than the fact that it's structured (you use JSON to represent the conversation memory separately instead of one large prompt).

I guess I will spend the next day playing around with the new API to figure it out.


Judging by this[0] the new structured format is immune to "injections":

[0] https://github.com/openai/openai-python/blob/main/chatml.md


Probably something like that.

You could try formatting it like

Question 1: ... Answer 1: ...

...

Question n: ... Answer n: ...

It makes you vulnerable to prompt injection, but for most cases this would probably work fine.


In addition to the other comment, this type of memory is a feature in LLM frameworks like LangChain


ask chatgpt


I hope the same! I do wonder though if ChatGPT+ is subsidizing the ChatGPT API cost here.


Let me do quick estimate of the cost:

Given: $0.002 per 1k tokens

I tested it with asking this question:

#+begin_quote I want you to act as a travel guide. I will write you my location and you will suggest a place to visit near my location. In some cases, I will also give you the type of places I will visit. You will also suggest me places of similar type that are close to my first location. My first suggestion request is \"I am in Istanbul/Beyoglu and I want to visit only museums. #+end_quote

It costs:

#+begin_quote {"prompt_tokens":91,"completion_tokens":251,"total_tokens":342} #+end_quote

(/ 1000 342) ~3, so roughly 3 exchanges like this per 1k tokens.

If you ask 1000 questions like this, that's about 342k tokens, i.e. (* 0.002 342) ~0.68 USD

It replies me:

#+begin_quote Great, if you're in Istanbul/Beyoglu and interested in visiting museums, here are some recommendations:

Istanbul Modern: This museum showcases contemporary art from Turkish and international artists. It's located in Karaköy, just across the Galata Bridge from Beyoglu.

Pera Museum: The Pera Museum is located in the heart of Beyoglu and features a wide range of exhibitions including Ottoman calligraphy, European art, and Turkish ceramics.

Rahmi M. Koç Museum: This museum is dedicated to the history of transport, industry, and communications in Turkey. It's located in Haskoy, a short taxi or ferry ride from Beyoglu.

Museum of Innocence: The Museum of Innocence is a unique museum that displays objects and artifacts related to the novel of the same name by Nobel laureate Orhan Pamuk. It's located in Çukurcuma, which is just a short walk from Beyoglu.

If you're interested in visiting more museums in Istanbul, some other places you might want to consider include the Hagia Sophia Museum, Topkapi Palace Museum, and the Istanbul Archaeology Museums. #+end_quote

Can't wait for hacking my Emacs again.....


Pricing is good because OpenAI does not need to make any money but needs data for feedback; if everyone switches to open source (LLaMA etc.), they won't get the data they need.

Google is testing their system internally with XX thousand users, OpenAI with XXX million users ...


> Starting today, OpenAI says that it won’t use any data submitted through its API for “service improvements,” including AI model training, unless a customer or organization opts in. In addition, the company is implementing a 30-day data retention policy for API users with options for stricter retention “depending on user needs,” and simplifying its terms and data ownership to make it clear that users own the input and output of the models.

https://techcrunch.com/2023/03/01/addressing-criticism-opena...


> Data submitted through the API is no longer used for service improvements (including model training) unless the organization opts in

I don't think the pricing is largely driven by intention to scrape API requests for data.


I think they are actually selling this service at a price point that is profitable.


That would be a refreshing change for this industry. It's always nice to see a company that just charges the money it needs, instead of playing 4D chess with their business model.


Something that has been bothering me for a while is whether poisoning of OpenAI's dataset is possible, willingly or otherwise.

An example here is getting ChatGPT to accept that 2+2=5; it's a lot of effort, but it can be done. Then users can give a thumbs up when such responses are given.

Could this cause issues?


This feels like the AI’s iPhone moment.

I am scared for all people working service jobs.


I’m still waiting on that pizza I asked ChatGPT to make in November. The code it wrote? Already in production though.

I’m not scared for service workers due to ai but you should see how low minimum wage is in America relative to rents if you want to be scared.


Maybe if you had asked for fries instead

https://www.youtube.com/watch?v=T4-qsklXphs


> I am scared for all people

Probably could stop there.


Yeah that is actually true.

Yeah that’s probably truest.

But I’m more scared for some than others short term.

I’m less immediately scared for anyone doing work that interacts with the physical world.

Weird how it turned out that robotics was harder than the thinking.


Yes, short term agreed.

AI has flipped much of what we thought early AI would be like on its head, with creativity being one of its most successful targets.

I suspect the surprises will only continue to be, lets say, more surprising.

I've actually written a lot about that recently as well. FYI https://dakara.substack.com/p/ai-and-the-end-to-all-things


Will check this out!


I am scared for all people working service jobs.

Why? Because they're no longer doomed to eke out a meaningless existence doing a robot's job badly?


It’s better than having no job isn’t it?


No. The only way to a post-scarcity economy is by continuing to harness technology whenever and wherever possible.

There has been, and will be, no point in time at which the optimal strategy is "Stop" or "Wait" or "What about my job?"


> The only way to a post-scarcity economy

There is no way to a post-scarcity economy, the phrase is a contradiction, and scarcity is an inescapable consequence of human desire.


There's no scarcity of information and media.


There’s no scarcity of bullshit articles. There’s a massive scarcity of quality information and media.

But that’s unavoidable really because it’s based on perception - There will always be a top 10% of something and a bottom 90%.


> There's no scarcity of information and media.

There is scarcity of information and media of the kind people find valuable. If there is a sense in which the broader statement is true, it is irrelevant for that reason.


The same things were said during the industrial revolution, or the internet revolution. Guess what happened? Labor market adapted.


Physical machines still needed many people to physically make them and run them.

This isn’t the same with thinking. Just look at how startups can have millions of customers and 20 employees.

Also last time we had mental labour to fall back on. This time we don’t.

It’s different this time.

Also most western countries which have given up on manufacturing are going to be worse hit because their jobs are more easily replaced.


This isn’t the same with thinking.

We don't know that.

We know that the exact same objections have been raised before, and have always, without a single exception, proven to be invalid in the long run.


More worried for the programmers. As we saw with Copilot, junior levels are definitely directly in the firing line here.


All jobs will shift to asteroid mining.


AI is the perfect job for asteroid mining, or any other job outside of Earth orbit, whether for research or commerce. Space is incredibly hostile to humans and, short of some miracle technology, is likely to always remain so: at best we'll have a few plant-the-flag missions in the inner solar system.


I really wish we'd build a standardized space exploration platform and saturate the solar system with mostly-autonomous robots that occasionally phone home with "wtf is this?"

imagine all the surface area in the solar system. I bet there's got to be at least 100 completely unexpected things lying around that would transform our understanding.


Yeah, but now you've gotta build a better AI that can go "that's a rock, you dummy" to the million or two false positives.


yes, we call those "Graduate Students"


what's stopping AI from asteroid mining for itself


Question: can I give OpenAI some data for it to process so I can use it to my own advantage? Say I want to train it on a specific topic using information I've gathered over the years. Can I somehow give it that data and then use the API to get data back out via chat or some other form of questions?

I'm not too familiar with how it works.


You can do this to an extent via fine-tuning, but you will need to do so via one of the other GPT-3 models rather than the ChatGPT API model (`GPT-3.5-turbo`). The latter is not available for fine-tuning.


I wonder if it would be possible to simply fine tune another GPT-3 model, have ChatGPT form the query based on input into ChatGPT's API, then pass that response into a call to the GPT-3 model, and take the response from that, pass it into ChatGPT, and ask for it to parse it and make sense of it.


You could use the embeddings to search for a relevant article or doc/docs then feed those in as input to chatgpt, followed by your question.
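
A rough sketch of that flow (the embedding model and docs list here are just placeholders/assumptions, following OpenAI's embeddings guide):

    import numpy as np
    import openai

    docs = ["Refund policy: ...", "Shipping times: ...", "Store locations: ..."]  # your data

    def embed(text):
        resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
        return np.array(resp["data"][0]["embedding"])

    doc_vecs = [embed(d) for d in docs]

    question = "How long does shipping take?"
    q_vec = embed(question)

    # Dot product works as cosine similarity here; ada-002 vectors are ~unit length.
    best = max(range(len(docs)), key=lambda i: float(np.dot(doc_vecs[i], q_vec)))

    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{docs[best]}\n\nQuestion: {question}"},
        ],
    )
    print(resp.choices[0].message.content)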


Does OpenAI have models I can train with my own data - data like pricing, locations, products - or would I have to use something else? The reason I ask is that companies are using it with their own data, like Shopify, so it has to be trained somehow.


The docs on fine tuning are excellent: https://platform.openai.com/docs/guides/fine-tuning
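
The short version: you write prompt/completion pairs as JSONL, upload the file, and kick off a fine-tune on one of the base GPT-3 models. A minimal sketch (the example pairs, separators, and base model choice are placeholders; the ChatGPT model itself can't be fine-tuned):

    import json
    import openai

    # Placeholder training examples; real datasets need hundreds+ of pairs.
    examples = [
        {"prompt": "Q: What does Widget X cost?\n\n###\n\n", "completion": " $49, including shipping. END"},
        {"prompt": "Q: Where is Widget Y in stock?\n\n###\n\n", "completion": " Berlin and Austin. END"},
    ]
    with open("train.jsonl", "w") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

    # Upload the file, then create the fine-tune job on a base model.
    upload = openai.File.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
    job = openai.FineTune.create(training_file=upload["id"], model="davinci")
    print(job["id"])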


Yes, you bring your own data, or you generate data with the model, that works too. I have generated training data with GPT3 and trained smaller HuggingFace models on it.


Any links on how to get started with this?


Depends on what task you have, but look at sbert.net for the student model. Generating data with GPT3 means solving the prompt, also task specific.


Speculation: GPT-turbo is a new Chinchilla-optimal model with capabilities equivalent to GPT-3.5. So it's literally just smaller, faster, and cheaper to run.

The reason I don't think it's just loss-leading is that they made it faster too. That heavily implies a smaller model.


It could be even smaller than a Chinchilla optimal model. The Chinchilla paper was about training the most capable models with the least training compute. If you are optimizing for capability and inference compute you can "over-train" by providing much more data per parameter than even Chinchilla, or you can train a larger model and then distill it to a smaller size. Increasing context size increases inference compute, but the increased capabilities of high context size might allow you to skimp on parameters and lead to a net decrease in compute. There's probably other strategies as well, but those are the ones I know of.


Ah! Interesting, I thought the capability was capped by parameters, but you're saying you can keep getting more capability from a fixed parameter size by continuing to train past what the chinchilla paper specifies. That's really cool


Not Chinchilla-optimal but inference-optimal. Chinchilla optimality relates to the training budget and is of interest to researchers who mainly produce demos. Inference optimality includes the inference costs and is of interest in real deployments to millions of users. It is worth paying more for training to reduce inference costs, so they probably went even further than Chinchilla.


Yeah, at this point it seems like you're just burning money if you're not rightsizing your parameters/corpus.


I think you mean GPT-4, since Chinchilla is a DeepMind project. But yes, I was also suspecting that, as it seems unlikely this was the full 175B-parameter model given such big improvements in speed and price.

In fact, given the pricing for OpenAI Foundry, that seems even more likely as this GPTTurbo model was listed along with two other models with much larger context windows of 8k and 32k tokens.


"Brockman says the ChatGPT API is powered by the same AI model behind OpenAI’s wildly popular ChatGPT, dubbed “gpt-3.5-turbo.” GPT-3.5 is the most powerful text-generating model OpenAI offers today through its API suite; the “turbo” moniker refers to an optimized, more responsive version of GPT-3.5 that OpenAI’s been quietly testing for ChatGPT." [0]

Chinchilla optimization is a technique which can be applied to existing models by anyone, including OpenAI. The chatGPT API is not based on GPT-4.

[0] https://techcrunch.com/2023/03/01/openai-launches-an-api-for...


I just meant chinchilla optimal in terms of the corrected scaling curves from the chinchilla paper. The original GPT-3 was way larger than it needed to be for the amount of data they put into it based on the curves from the chinchilla paper.


It's also worth noting that we don't know any specifics (parameters, training tokens) of GPT-3.5. Only for GPT-3 those numbers have been published.


It is a smaller model; it reveals this information if you ask it.


I switched from their gpt3 davinci model to the new chatgpt3.5-turbo and everyone who used the IRC bot was really disappointed. The chatgpt3.5-turbo is incredibly puritanical and refuses to answer a good fraction of the type of input bots on IRC get. Even innocuous stuff. I was excited about the system/user prompts and being able to easily add short term memory and contexts but if it only responds with, "Oh, it wouldn't be right for me to talk about that." it's not fun.


Like what?


You can run Whisper in WASM (locally) so no need to pay for the API, plus the bandwidth. It actually works surprisingly well: https://github.com/ggerganov/whisper.cpp


whisper.cpp has no GPU support. Models below medium aren't that good, and medium and large are pretty CPU intensive. A minute of audio on medium can take anything between 15 and 90 seconds to transcribe, when using 8 cores, while the service transcribes on the large model in less than 7 seconds.


For English only: this is so wrong! I actually didn't find that much significant improvement from Medium to Large. In many cases Medium was actually better for English than Large was. Where Large really excels in my experience is with super noisy or distorted audio. Same thing from Small to Medium: there's not much difference in text quality. For good-quality audio, Small is all you need. Medium is not much more accurate, and certainly the "value per unit time" is way higher with Small than with Medium. If you have very distorted audio you can try Medium or Large.

I agree that the API providing a super fast large is fantastic tho! But you can go far with the provided models. When did you last try Whisper.cpp? It's constantly updated and probably much better than it was a few months ago.

If you don't believe me (and you have a Mac) try out my free App that uses Whisper.cpp offline combined with Voice Activity Detection (Silero VAD) to reduce Whisper-hallucinating-words during silent / non-speech sections. It's really good! https://apps.apple.com/app/wisprnote/id1671480366


Last time I used it was a week ago, when I was comparing it to a Docker install of Whisper/torch.

Then again, I'm using it on very noisy audio recorded with a lavalier microphone while riding a bike.

For transcription tests of recordings from an answering machine medium was more than enough, not sure about small.

Also, for the recordings during a bike ride, sometimes medium is better than large.

All of this was with German-language audio.

I don't have a Mac, but your approach with using VAD is interesting. I'll see if I can preprocess my files.


Transcript of 20230302 170602.m4a (at 2023-03-02 1711.02).txt

   Hey friend, your use case sounds really interesting.
   Actually that's why I created this app initially.
   I really love riding around on my bike in the city
   and doing voice memo debriefs about whatever.
   I also like to do it walking around outside.
   And as you say, the trouble with that is wind distortion.
   Full stop.
   On a day where it's not too windy, it's not too bad.
   These models can totally pull the text from it.
   But the more distortion you have,
   the more of a disaster it is.
   And I don't know anything about the multilingual case,
   but for English, I definitely find that
   small is more than enough if you have good quality audio.
   Medium, you might wanna use it
   if there's some kind of distortion
   that's causing errors in the small.
   But if you have really good quality audio,
   even tiny is enough.
   I mean, it won't get some sort of rare words.
   So small is basically good enough for English anyway.
   Aligning with what you said,
   I remember seeing in the whisper paper
   that the performance actually decreases,
   the word error rate increases from the medium
   to the large model in the multilingual case,
   which is kind of interesting.
   So basically medium, I think, is all you really need.
   I think doing large, running large locally
   is probably a waste of time.
   But this doesn't apply to the OpenAI API case
   because they're running their own sort of special model.
   It's very fast.
   Plus they're kind of going to be retraining it
   so continuing to improve it over time.
   So obviously there's that, which is cool.
   I think basically I did extensive research
   and experiment with this,
   with trying to clean up audio for the transcription.
   And there's basically no way to do it.
   Like if you have a medium to bad level of noise
   that the transcription models can still work with,
   you're fine.
   Just go with that.
   But in that case,
   there's no point actually trying to denoise the sound first.
   That just seems to reduce the signal
   and it actually increases the word error rate.
   So just give them the raw distorted, windy audio
   and the models will do the best they can.
   You can't actually improve it, I found.
   I tried all kinds of different ways to process it
   and none of it actually improved it,
   including like the best possible denoiser I could find,
   which is the Facebook research denoiser.
   So my conclusion was that, okay,
   I found a sort of a fundamental physical limit
   and I think using denoising is really only good for humans.
   Like if you want to listen to the audio again,
   you don't want to hear all that wind probably.
   And for medium to bad,
   but not extreme levels of wind distortion
   or other kinds of noise distortion,
   you can use a denoiser like the Facebook research one
   and that will totally or nearly totally
   kind of reduce all that noise.
   But I basically decided that the only way
   to kind of get better quality audio
   or to get better quality transcripts,
   if you're doing it outside on a windy day,
   is not to go with software enhancement
   because it doesn't do anything, it doesn't achieve anything.
   I tried everything possible
   and nothing produced results in the extreme distortion case.
   So what I decided is that's basically a limit,
   physical limit and so the best way to do it I think
   is to change your microphone setup,
   have some sort of baffle around it,
   maybe one of those little microphones
   that can clip close to your throat or,
   I don't know, I'm no expert,
   but I think basically you can find a way
   to get better quality, less distorted audio
   outside by changing the microphone setup,
   adding one of those wind baffles or something.
   And I think that's basically all you can do essentially.
   And then once you have reduced
   a lot of that noise distortion,
   even if the resulting audio is still distorted,
   if it's not too bad, the models can be okay with it.

End of Transcript (created with WisprNote)


I'm not preprocessing the recording. The thing I mentioned with considering VAD, if that's what you're referring to, is that I should somehow trim the files.

In order to minimize my interaction with my phone during the bike ride, I press a button which records 1 minute of audio. If I know that I need more time, I press it again before the minute ends, this then starts a second recording in parallel which also lasts one minute. So I just have to press a button and can forget about it. This is because I noticed that I usually don't require more than one minute to record a thought, and if I have multiple, I can put them in multiple files.

But since my recordings then usually consist of 20 seconds of audio, the 30 seconds at the end are only silence (with wind and tire noise). Whisper splits the files into 30 second segments, and apparently tries to find voice in each segment, so the remaining one which has no voice causes Whisper issues, where it starts hallucinating. This is why I would like to trim the files.
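
For what it's worth, a naive trim of that trailing "silence" is easy to sketch in Python. This assumes pydub (which shells out to ffmpeg for m4a input) and a hypothetical file name; for wind and tire noise, a proper VAD like the Silero one mentioned elsewhere in this thread would be much more robust than a plain loudness threshold:

    from pydub import AudioSegment
    from pydub.silence import detect_nonsilent

    audio = AudioSegment.from_file("ride_memo.m4a")  # hypothetical file name
    # spans of audio louder than ~16 dB below the file's average loudness, between silences of >= 1 s
    spans = detect_nonsilent(audio, min_silence_len=1000, silence_thresh=audio.dBFS - 16)
    if spans:
        # keep everything up to half a second after the last detected speech
        audio[: spans[-1][1] + 500].export("ride_memo_trimmed.wav", format="wav")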

I now noticed that the service doesn't add punctuation and capitalization, so the funny thing is that I took that output and posted it into ChatGPT like this: "Correct the following: '[text from whisper]'", and it does an incredible job of fixing even words which Whisper erred on.

-

Whisper:

ich habe gestern erste tests mit open ai whisper gemacht um nozizen [sic!] zu transkribieren

[ Yesterday I did my first tests with open ai whisker to transcribe nozizen [sic!]. ]

es waren teilweise recht gute ergebnisse vor allem mit medium

[ there were some really good results, especially with medium ]

latsch [sic!] natürlich besser aber da sind die anforderungen zu hoch

[ latsch [sic!] better of course, but the demands are too high ]

wenn ich da einen server draus mache könnte ich mal eine zeit lang ausprobieren ob sich das lohnt

[ If I make a server out of it I could try it out for a while to see if it's worth it ]

auch für anrufe der anruf der antworten

[ also for calls the call of the answers ]

-

then ChatGPT:

Ich habe gestern erste Tests mit OpenAI's "Whisper" gemacht, um Notizen zu transkribieren. Die Ergebnisse waren teilweise recht gut, vor allem mit "Medium". "Large" funktioniert natürlich besser, aber die Anforderungen sind zu hoch. Wenn ich einen Server dafür bereitstelle, könnte ich mal für eine Zeit lang ausprobieren, ob sich das lohnt, auch für Anrufe und Antworten.

[ Yesterday I made first tests with OpenAI's "Whisper" to transcribe notes. The results were sometimes quite good, especially with "Medium". "Large" works better, of course, but the requirements are too high. If I provide a server for it, I could try it out for a while to see if it's worth it, also for calls and answers. ]

I'm sorry that this is in German, but I don't have anything in English I've been testing on.
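
A minimal sketch of that Whisper-then-ChatGPT cleanup chain, assuming the openai Python package and a hypothetical file name (not the poster's actual setup):

    import openai

    openai.api_key = "sk-..."

    with open("notiz.m4a", "rb") as f:
        raw = openai.Audio.transcribe(model="whisper-1", file=f)["text"]

    fixed = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Correct the following: '{raw}'"}],
    )["choices"][0]["message"]["content"]

    print(fixed)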


This is great! Thank you. We’re very similar actually: I also tried getting chat GPT to correct transcripts for errors but when I tried like 3 weeks ago it couldn’t manage. I just use voice memos app and let it run. I just talk and don’t think about the file. So I only have 1 file and my WisprNote app removes nearly all the non speech and passes it to whisper. I think there’s actually a voice memo setting on MacOS that will cut silence automatically but I don’t use it.


This looks like C++ rather than WASM. Am I misunderstanding something?


The C++ is compiled to WASM. You can look into [1] to see emscripten there.

[1] https://github.com/ggerganov/whisper.cpp/blob/master/CMakeLi...


So it is! Thanks for the pointer.


Maybe they meant to link to something like this:

https://github.com/ggerganov/whisper.cpp/pull/540

Web demo:

https://whisper.ggerganov.com/


Does anyone have a sense of how the Shopify integration works under the hood? Specifically, how were they able to have ChatGPT make a recommendation from their product database? Given the model cannot be fine-tuned and there is a 4000 token context limit, I'm guessing the only thing really possible is to have ChatGPT extract a few keywords, generate a query against their internal product search, and return the first result? Is there anything else that might be going on under the hood?


Probably the embeddings API. This guide is what helped me understand the concept https://github.com/openai/openai-cookbook/blob/main/examples...

tl;dr is that you can pre-process each chunk of your database and use embeddings to quickly look up which chunk is most similar to the user's query, and then prepend that chunk to the user's query before giving it to GPT, so that GPT has the relevant context to give an answer.
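
A rough sketch of that flow, assuming the openai Python package and a toy in-memory list of chunks (the cookbook linked above does the same thing against a proper vector store):

    import numpy as np
    import openai

    def embed(text):
        resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
        return np.array(resp["data"][0]["embedding"])

    chunks = ["Shipping policy: ...", "Return policy: ...", "Gift cards: ..."]  # hypothetical pre-chunked database
    chunk_vecs = [embed(c) for c in chunks]  # computed once, offline

    def answer(question):
        q = embed(question)
        sims = [q @ v / (np.linalg.norm(q) * np.linalg.norm(v)) for v in chunk_vecs]  # cosine similarity
        context = chunks[int(np.argmax(sims))]
        return openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "Answer using only this context:\n" + context},
                {"role": "user", "content": question},
            ],
        )["choices"][0]["message"]["content"]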


Interesting. Still pretty limiting but I can see creative ways of working around the limitations. Thanks for sharing!


Support for the ChatGPT endpoint now added to lambdaprompt[1]! (solves a similar problem as langchain, with almost no boilerplate!) Props to openai for making such a usable endpoint, was very easy to wrap.

Example code using the new function and endpoint:

    import lambdaprompt as lp
    convo = lp.AsyncGPT3Chat([{'system': 'You are a {{ type_of_bot }}'}])
    await convo("What should we get for lunch?", type_of_bot="pirate")
> As a pirate, I would suggest we have some hearty seafood such as fish and chips or a seafood platter. We could also have some rum to wash it down! Arrr!

(In order to use lambdaprompt, just `pip install lambdaprompt` and export OPENAI_API_KEY=...)

[1] https://github.com/approximatelabs/lambdaprompt


This is incredibly cheap. It makes you wonder how in the world they managed to make it 10x cheaper than davinci-003 while still being a better model. The world of robo-consulting is about to go insane.


It may just be that OpenAI corrected its 2020-era understanding of model size using 2022 insights from DeepMind: https://towardsdatascience.com/a-new-ai-trend-chinchilla-70b...

Seems model sizing, compression and quantization are still an art form, see also https://www.unum.cloud/blog/2023-02-20-efficient-multimodali...


yes turbo model admitted this when asked.


I've stopped using Google entirely and don't miss it for a second - something I wouldn't have thought possible a year ago - and it's pretty difficult to see how Google will survive this disaster.


How are you using it? In my experience asking factual questions lead to answers so inaccurate that you might as well not bother.

Are you prompting it differently to me, or do you have some strategy to filter out the BS?


I dunno what to say, but I’m also someone who very rarely gets factually incorrect data aside from numbers, dates and quotes.

I make a point to ask for a completion that I know won’t depend on an exact factoid.

If I do need an exact factoid, I have a tool I made myself based on this approach:

https://github.com/williamcotton/empirical-philosophy/blob/m...


Wow this is the best content I have seen on prompt engineering, it feels much more like computer science than the usual prompt engineering recipes. This formalism of analytic augmentation and higher order augmentations seems like it is ripe for some kind of library or dsl to build and evaluate and monitor these processes. But somehow I feel like these capabilities will become so standard that they will be incorporated into the official apis.

[edit] Just noticed it looks like you are working on just that. I will keep an eye on this! https://github.com/williamcotton/transynthetical-engine


Thanks for brightening up my day and giving me some more inspiration to work on these kinds of things!


This is advanced stuff. Thank you for writing it down.


I use it with documents I shred into sentences and embed with ada-002. This makes it spot on when talking about a given document. https://mitta.us/


This is seriously cool. Bots for the people!


Thanks! :) Been working on it a LONG time. Getting ready to launch...


How? By giving people answers to questions they didn’t know they have to ask to solve their problems.


A little while back I built a personal project that ingested the book "A Pattern Language" as embeddings using the OpenAI API – and a front-end in SwiftUI that allows me to 'converse' with the book. [1]

Version 0 was built using the original daVinci model. Last night it took literally five minutes to switch over to the new ChatGPT API – just a few changes to the request, including a new [message] array that replaces the old 'prompt' style. [2]
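
Roughly the shape of that change (a sketch, not the actual app code), going from a hand-rolled prompt against text-davinci-003 to the new messages array:

    import openai

    # before: completion endpoint with a hand-rolled prompt
    old = openai.Completion.create(
        model="text-davinci-003",
        prompt="You are a guide to 'A Pattern Language'.\nUser: What is a pattern?\nGuide:",
        max_tokens=200,
    )["choices"][0]["text"]

    # after: chat endpoint with a [message] array
    new = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a guide to 'A Pattern Language'."},
            {"role": "user", "content": "What is a pattern?"},
        ],
    )["choices"][0]["message"]["content"]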

It's a pretty nice instant upgrade for the experience. Much faster results, and the responses are a lot more interesting. Doing something similar with daVinci would take a fair amount of 'prompt engineering' and fine-tuning. Not to mention all the manual conversation-state handling.

1: https://twitter.com/dreamwieber/status/1622634396087107584?s... 2: https://twitter.com/dreamwieber/status/1631327342697250816?s...


Big news! Many apps will be integrating ChatGPT. I'm worried about AI-generated content flooding the search engines, making it harder to do in-depth research.


This is a good thing.

The future is curation and cultivation. We've been living in an age of information abundance and markets haven't adapted. The age of "crawl every website, index everything, and let people search it" is coming to an end. There is just too much content and too much of it is low quality. With or without AI.

This abundance problem isn't just a WWW problem. Movies, TV, music, podcasts, short form content, food, widgets, wibbles and wobbles all suffer from abundance these days. We are quickly exiting the age of supply chain driven scarcity and getting a marketplace flooded with options. Capitalism has delivered on basically everything it's promised, with some asterisks, and, if we don't give in to consumerism, we want for little and have everything we need at our fingertips.

I've personally opened up my pocket book to curation services. I know brands that I trust. I know services that reliably surface quality content. I suspect the next few decades are going to trend towards services that separate noise from signal - and I suspect AI is going to be a big part of that.


Why separate noise from signal? Why isn't AI-generated stuff seen just as valuable as human-written one? When you fill the world with plastic, you need to evolve to eat plastic. When you surround yourself with AI-produced stuff, then you should learn to respect it.

First of all, evaluating someone based on what they wrote or said or did, is nonsense.


I think we agree. Humans can generate noise and AI can generate signal.


Honestly, I've almost stopped Googling and have had personal success just relying on ChatGPT. It's pretty much taught me 3D animation and Blender to Unity workflows. Every time I wanted to know a Blender keyboard shortcut, or what some Blender property was, or how to do something in Blender, the forums and documentation were so sparse and outdated. I felt ChatGPT got me a lot closer much faster. Especially when it tells you how to learn concepts you didn't even know existed.

Google results in the meanwhile have just become a pile of SEO-optimized fluff, and it's hard to engineer the search query around that besides sticking "reddit" on the end of it.


If I'm Microsoft: I have Bing Search, but I don't need it the way Google needs Google Search, since I have tons of other revenue streams.

Hurting Google by speeding the proliferation of this is going to be really interesting to watch.


"As an AI language model, I cannot..."

They already ruined this program. More than 30% of the topics I discussed with it in the past it will now refuse to discuss, and not even all of them are politically charged. We've fallen straight down the slippery slope of censorship.


Does ChatGPT yet have a debug function to Show Its Work so to speak? I think this will be important in the future when it gets itself into drama, trouble, etc.. Probably also useful to prove how ChatGPT created something rather than being known as an opaque box.


I'm pretty sure any system built via linear regression or similar is an opaque box even to the most experienced researchers. For example: https://clementneo.com/posts/2023/02/11/we-found-an-neuron

These are massive functions with billions of parameters that evolved over millions of computing years.


Adding to that, the human brain is incredibly complex and performs billions of functions. If a person says to me, "I love you" I should be able to ask them why they said that but it would probably be unfair to expect a detailed answer including all their environmental and genetic inputs, many of which they may not be aware of.

If ChatGPT says it loves me, I not only expect the system to tell me why that was said but what steps brought the system to that conclusion. It is a computer or network of computers after all. Even if the system is continuously learning there should be some facets of reproducible steps that can be enumerated.

ChatGPT: "I love you"

Me: "debug last transaction, Hal."

Here is where I would expect an enumeration of all steps used to reach said conclusion. These steps may evolve/devolve over time as the system ingests new data but it should be possible to have it Think out loud so to speak. Maybe the output is large so ChatGPT should give me a link to a .tar file compressed with whatever it knows is my preferred compression.

[Edit] I accept that this may be hundreds of billions of calculations. I will wait the few minutes it takes to generate a tar file for me. It's good to get up and stretch the legs once in a while.


The steps would be billions of items and essentially be "I took this matrix and turned it into this matrix which I turned into..." It is kind of like asking a person why they love you and expecting them to respond with their entire genome and the levels of various hormones in their brain at the time they uttered each word.


>If a person says to me, "I love you" I should be able to ask them why they said that.

People are definitely able to ask other humans this question, but to the best of my knowledge, no one in history has ever received a perfectly truthful response.


in general, "i don't know" is a perfectly acceptable answer, and probably should be said more often.


I agree in general, but don't think it's a particularly effective answer to this specific question relationship-wise. Nor would it be particularly useful when coming from a powerful but biased AI.


It's been some time since I last looked into this topic, but my understanding is that linear regression is not a black box, as there exist methods that elucidate how the variables impact the response. On the other hand, neural networks are opaque. Again, it's been a while, so there may be ways to ascertain which inputs were used to generate the weights that led to the response. However, I am skeptical that these methods have the same level of mathematical rigor as those used in linear regression.


> These are massive functions with billions of parameters that evolved over millions of computing years.

This is a great way to put it!


You can prompt it to be more logical and have it expose its thoughts a bit more by asking it "Thinking step-by-step, <question>?" And it should respond with "1. <assumption> 2. <assumption> 3. <conclusion>" or something like that.

You'll never be able to get it to actually show its work though. That's just a hack to make it write more verbosely.
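
If you want to try that trick against the new API, a quick sketch (openai Python package assumed):

    import openai

    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": "Thinking step-by-step, which weighs more: "
                              "a kilogram of feathers or a pound of steel?"}],
    )
    print(resp["choices"][0]["message"]["content"])  # typically an enumerated chain of steps, then a conclusion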


The classic "Text completion" API (as opposed to the new "Chat completion" one) seems to offer more flexibility: You have complete freedom in providing interaction examples in the prompt (as in https://github.com/artmatsak/grace/blob/master/grace_chatbot...) and are not limited to the three predefined chat roles. But the requests being 10x cheaper means that we'll have to find ways around those limitations :)


I find myself missing the golden age of Google, where it actually returned results and answers that you wanted, on subjects you were looking for. Even now, versus two years ago: I tried finding a snippet of a notice in a newspaper with specifics about a name change someone petitioned in California. I found it then, and had bookmarked it, but trying to find it again just turns up absolute garbage; thankfully I found the bookmark. I can go to ChatGPT and ask it about the best vegetable or fruit to grow in a several gallon Kratky setup; looking for the same sort of answer on Google returns absolute garbage.

I'll concede that LLMs like ChatGPT are the future, thanks to the NLU stuff from OpenAI and the dataset, but only the future of "agents", if you want to call it that. The "intelligence" exhibited is emergent from language itself, from the massive dataset it has trawled - our language and knowledge. But at the same time I surely hope that another AI winter doesn't come because of people over-promising and under-delivering. Or too much focus on LLMs themselves because of that "wow" factor - the same wow factor you got in the past, when search engines weren't garbage, if you knew how to use them and what their shortcomings were.


> The "intelligence" exhibited emergent from language itself, from the massive dataset it has trawled.

I concur. Intelligence does not come from the transformer architecture, or any specifics of the model. It comes from the language corpus. Human intelligence too, except for physical stuff. All our advanced skills come from language.

You take 300GB of text and put it through a randomly initialised transformer and you get ChatGPT. You immerse a baby in human language and it becomes a modern adult, with all our abilities. Without language, and that includes other humans and tech, we'd be just weaker apes.


My fear is that we'll turn the web into a wasteland where there's not much reason for content websites or even news sites to exist, so ChatGPT won't be as fresh because there will be fewer sources of info to scrape. We'll hit a point where GPT starts declining and all the search engines of old are in an old folks' home with Yahoo and Excite. What will we do then?


> Through a series of system-wide optimizations, we’ve achieved 90% cost reduction for ChatGPT since December; we’re now passing through those savings to API users. Developers can now use our open-source Whisper large-v2 model in the API with much faster and cost-effective results.

I'm really confused, I thought they were a non-profit. A non-profit to handle AI safety risks. Why does this read like a paragraph from any YC startup website that just raised their Seed round?


It looks like they transitioned to for-profit, more or less, in 2019.


Yeah, I can't get over how easily they changed their branding from non-profit AI safety to "let's take over Google with the new Bing".

They come off as greedy to me and might very well try to get everyone locked in in order to milk them with Microsoft backing.

That said, they execute well, build good products and everyone loves more money so who am I to judge.


So, the Web3 won't be blockchains and whatnot: it's actually custom generated content. I... don't really like it.


Web2 was about user-generated content. Web3 will be about companies owning the user-generated content realizing they're sitting on a goldmine - and using that content as training data for their DNN models.


That’s an astute observation.

(Apropos nothing: I expanded your comment into the following tweet https://twitter.com/ayewo_/status/1631060562393153536)


gpt-3.5-turbo is missing from Open AI's playground. For anyone looking to play with these models, we have now added them to Promptly playground at https://trypromoptly.com. https://twitter.com/ajhai/status/1631020290502463489 has a quick demo.


FYI and for others, typo in the URL.

Should be https://trypromptly.com/


Oops. Thanks for the correction.


Any ballpark pricing info you can share on Promptly?


We are still figuring out pricing. Is there an email I can reach you at? Would love to chat about your use case if you could send me an email at ajay[at]trypromptly.com


The quality of suggestions, forums, internet content and other such things just took a huge drop - this will create an internet E.L.E. in SPAM and empty content...


I wish Whisper offered speaker diarization. That would be a full game changer for the speech-to-text space.


whisperX has diarization.

https://github.com/m-bain/whisperX


Why do you need diarization? That's attributing speech to different speakers, right? What sort of use cases?


Transcribing interviews, meetings etc…


Did some research and seems there’s no reliable diarization method right now. They all have error rates around like 20%.


We hacked that together for https://paxo.ai — can be done!


So I had a question about how all this works under the hood. The GPT model is trained (on a massive dataset) and then deployed. How are they getting the additional product data from other sources like Instacart's retail partner locations and Shopify's store catalogs into it, so that it can output answers leveraging those? My understanding (perhaps incorrect) is that those weren't part of the dataset the model was initially trained on.

For example:

> Shop’s new AI-powered shopping assistant will streamline in-app shopping by scanning millions of products to quickly find what buyers are looking for

> This uses ChatGPT alongside Instacart’s own AI and product data from their 75,000+ retail partner store locations to help customers discover ideas for open-ended shopping goals


I would add additional `system` messages with information fetched as a function of the user's request.
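
Something like this, as a sketch (my_product_search is a hypothetical call into your own catalog/search index; openai Python package assumed):

    import openai

    def shopping_assistant(user_query):
        products = my_product_search(user_query)  # hypothetical: query your own product database first
        return openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "You are a shopping assistant. Recommend only items from the catalog below."},
                {"role": "system", "content": "Catalog:\n" + "\n".join(products)},
                {"role": "user", "content": user_query},
            ],
        )["choices"][0]["message"]["content"]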


Superpower ChatGPT V2.3.0 is out https://www.reddit.com/r/OpenAI/comments/11ef8ea/superpower_...

- Sync all your chats locally on your computer (plus the ability to disable Auto Sync)
- Search your old chats (only works once your chats are fully synced; this is the only extension that can do this)
- Customize preset prompts
- Select and delete/export a subset of conversations
- Hide/show the sidebar
- Change the output language
- Search Prompt Library by Author (over 1500 prompts)
- Adding Prompt Categories (a work in progress)


that's a browser plugin, not an official feature set, for those who hadn't seen it before.


I'm a bit confused; what's the difference between this and the Azure OpenAI offering?


Azure will probably provide compliance (HIPAA, GDPR etc.) just like they do with their non-AI offerings.


wow just in time, i just made https://github.com/jackdoe/emacs-chatgpt-jarvis which is chatgpt+whisper but using local whisper and chatgpt-wrapper which is a bit clunky

since i integrated chatgpt with my emacs i use it at least 20-30 times a day

i wonder if they will charge me per token if i am paying the monthly fee


So cool! I really think this is an awesome future! Everyone has an AI assistant they can interact with via speech.


particularly if you have RSI from 20 years of typing :) with a touchpad that is designed to give you carpal tunnel


Is this sustainable, or will free content disappear from the web over time? Because ChatGPT does not generate revenue for the sources of content it uses.

OpenAI represents the opposite of open, responsible and non profit: it might as well be MS-AI


Is fine-tuning of the gpt-3.5-turbo model expected to be available at some point? I have some applications that would greatly benefit from this, but only if fine tuning is available.


One of the things I love about the API-ification of these LLMs is that they’re plug and play.

I built https://persona.ink against davinci knowing it didn’t give as good of results as ChatGPT but knowing I could swap the model out once 3.5 came out. Today is that day, going to swap out the prompt in the cloudflare worker and it should Just Work(tm)


Typo on your front page—"enfrocement" should be "enforcement."


Is davinci actually worse than chatGPT? I know it's worse as an assistant, but in my (admittedly brief) testing the performance was the same or better for tasks like summarization, sentiment analysis, etc.

I guess it's irrelevant now because everyone will use the one which is 10x cheaper.


Yes. Asking ChatGPT to assume a persona and rewrite content, it performs significantly better than davinci in my tests. I was able to get "good enough" with a lot of prompt engineering on davinci - but ChatGPT is a shoulder above with less investment in prompt engineering.

On the flip side - I get hand swatted by ChatGPT more frequently than davinci. Davinci's moderation filters don't really pick up on much, but ChatGPT will give me a lecture instead of a translation on a lot of occasions. There are many valid use cases for rewriting/editing content that involve graphic details that davinci will gladly handle and ChatGPT will give you a lecture about.

Human existence is messy. ChatGPT doesn't like the messy.


Famous last words?


Goodbye internet as we knew it


> Language models read text in chunks called tokens. In English, a token can be as short as one character or as long as one word (e.g., a or apple), and in some languages tokens can be even shorter than one character or even longer than one word.

Why should the Germans get a discount?


FIXED: needs to be a POST. Doh!

Can anyone get it to work? I get this error on everything I've tried:

  GET /v1/completions HTTP/1.1
  Host: api.openai.com
  Authorization: Bearer sk-xxx
  Content-Type: application/json
  Content-Length: 115

   {
    "temperature" : 0.5,
    "model" : "text-davinci-003",
    "prompt" : "just a test",
    "max_tokens" : 7
   }


  {
    "error": {
        "message": "you must provide a model parameter",
        "type": "invalid_request_error",
        "param": null,
        "code": null
    }
  }


That should be a POST not a GET
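
For reference, the same request as a POST, sketched with Python's requests (parameters copied from the snippet above):

    import os
    import requests

    resp = requests.post(
        "https://api.openai.com/v1/completions",
        headers={"Authorization": "Bearer " + os.environ["OPENAI_API_KEY"]},
        json={"model": "text-davinci-003",
              "prompt": "just a test",
              "max_tokens": 7,
              "temperature": 0.5},
    )
    print(resp.json())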


Please include your request body for debugging purposes.


The pricing is insanely cheap. Why would anyone pay $20/month for ChatGPT Plus when we are going to have dozens of chat applications based on the APIs?

It took me just a little more than an hour to create a basic cli chat application (https://github.com/marcolardera/chatgpt-cli). In the next days/weeks/months I think we will see an explosion of ChatGPT based applications...



Feels like people running small websites monetised with ads will get killed. Why go to a recipe website to search for healthy meals for your kids if you can ask ChatGPT in Instacart? :(


No more scrolling through a thousand lines of SEO optimization disguised as some deep heartfelt backstory before getting to the actual recipe? That sounds like a huge win to me.


My love for the perfect Tuna Salad Sandwich began when I was but a young IPython notebook. My lead developer would occasionally eat at the desk and crumbs would fall into the keyboard....


These small websites have been dead for years and replaced with SEO spam.


Except now Microsoft can hurt Google's ability to monetize it by stealing search traffic with people switching to ChatGPT.


"model": "gpt-3.5-turbo",

turbo isn't listed in Playground, but if you invoke the example curl command (note: /v1/chat/completions) in your terminal, it works.



> Dedicated instances can make economic sense for developers running beyond ~450M tokens per day.

450M tokens * $0.002/1K tokens = $900 per day. I wonder what the exact pricing structure is.

(edited for math)


> It is priced at $0.002 per 1k tokens

So it would be $900 per day


FYI: you can't sign up without an email address (which you have to verify) and then a phone number (which can't be a google voice / voip number).

So count me out.


Those restrictions seem intended to prevent people from creating a bunch of bot accounts that exploit the free tier. (Which is a common problem among free tiers for CI services and free online games)

I’m curious, what are some better alternative authentication methods to combat that problem than requiring a phone number?


Anything that isn't a massive privacy violation/security risk.

My phone number is none of their business.


Sorry if this is off topic, but is there an extension that will read ChatGPT's replies out loud as audio via a neural TTS voice, like AWS Polly?


I recommend Google Text To Speech. WaveNet voices are pretty good.


What is the developer experience of using OpenAI's Server-Sent Events endpoint from something else than their Python and Node.js libraries?

The SSE endpoint is required for use cases like chat so the end user doesn't have to wait until the whole reply has been generated.

I started implementing a simple SSE client on top of C#/.Net's HttpClient but it's harder than I first assumed.


We had to restream server events from openai -> our backend -> client. It was pretty simple.
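
For anyone implementing a client outside the official libraries: the wire format is plain SSE, one "data: {json chunk}" line per delta, terminated by "data: [DONE]". A sketch of what the client has to do, shown here with Python's requests just to illustrate the shape (any language's HTTP client works the same way):

    import json, os, requests

    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": "Bearer " + os.environ["OPENAI_API_KEY"]},
        json={"model": "gpt-3.5-turbo", "stream": True,
              "messages": [{"role": "user", "content": "Hello"}]},
        stream=True,
    )
    for line in resp.iter_lines():
        if line.startswith(b"data: "):
            payload = line[len(b"data: "):]
            if payload == b"[DONE]":
                break
            delta = json.loads(payload)["choices"][0]["delta"]
            print(delta.get("content", ""), end="", flush=True)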


ChatGPT API examples are missing (what to use instead of completions?) and also missing in playground. Hope they will add them soon.


The documentation was just updated: https://platform.openai.com/docs/guides/chat


Thank you.


We have added them to our playground at https://trypromptly.com if you want to check them out. https://twitter.com/ajhai/status/1631020290502463489 has a quick demo


Has OpenAI said anything (in their TOS/etc) about who owns the copyright/ip on the code that it generates?

I know it’s a legal minefield of a question, just curious if they have said “we won’t sue you if you copy/paste this code in your app” publically or anything.


Can somebody please clarify. Is the cost $0.002 per 1k tokens generated, read, or both?


Both, the API response includes a breakdown. In the best case 1 token = 1 word (for example "and", "the", etc). Depending on input, for English it seems reasonable to multiply the word count by about 1.3 to get a rough token count

This pricing model seems fair since you can pass in huge prompts and request a single word reply, or a few words that expect a large reply
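
If you want exact counts rather than the ~1.3x heuristic, OpenAI's tiktoken package exposes the tokenizer. A small sketch, assuming cl100k_base is the encoding used by gpt-3.5-turbo (that is what OpenAI's cookbook lists at the time of writing):

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    prompt = "Write a haiku about bearer tokens."
    print(len(enc.encode(prompt)))  # prompt tokens; the response's `usage` field reports the rest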


Is there anything akin to creating Stable Diffusion embeddings where it can train a very discrete concept that takes up a few kilobytes and use that with the base model?

Such an approach could in theory make it so you spend a little upfront to train more complex concepts (read: concepts costing many tokens) and can subsequently reuse them cheaply, because you're using an embedding of the vectors for that complex concept instead, which may only take a single token.


Both the prompt and completion fall into the token count (this has been the case since the beginning of GPT-3)


I can confirm it is TOTAL tokens, this from the account/usage page:

gpt-3.5-turbo-0301, 2 requests 28 prompt + 64 completion = 92 tokens


I just pushed an update to the `chatgpt` NPM package with support for the official ChatGPT API: https://github.com/transitive-bullshit/chatgpt-api


I can think of a variety of uses for this, but they all involve confidential data. Their privacy policy doesn't mention what they do with the prompts/data we send to the API, but I presume it's reviewed and worked into the model.


On the page under Developer Focus it says:

> Data submitted through the API is no longer used for service improvements (including model training) unless the organization opts in


> And if Bitcoin has taught us anything, all these supposedly "foolproof" algorithms are written by cabbage-headed fools whose incompetence is superseded only by their arrogance.

Satoshi is incompetent and arrogant now? Just wow.


Has anyone made a ChatGPT-like app which both accepts and responds with voice input yet?


I have, but haven’t bothered to publish it. POC to use by me and my daughter. Works like Siri. Fun to play with. But the hallucinated responses made us stop using it pretty quickly. Needs more work to be of any use (integrate search engines, etc.).


> Works like Siri

you can also make it work _with_ siri. i get around this by proxying it through a sms service which is integrated with my bot via webhook. then use elevenlabs for TTS. sample siri query "hey siri tell leobgAi to check my finances"


there's voicegpt on Android that used the unofficial API or whatever, it had two way vocalization.


What I like is that I don't have to pay $20/month whether I use it or not.


I am happy that we designed our UI for the Fibery AI Assistant to be chat-ready - less re-work!

Looks like a super decent release, and the price cut makes it sane to use. The token limit is the same though, and that is not great for many use cases...


ELI5 What is a token?

Is it a word, question, letter, what? If I ask a question like... What is the capital of Canada? And it responds with 'Ottawa', how many tokens have I used there and how are they calculated?


Roughly speaking, words or word parts. Open AI has an explainer:

https://help.openai.com/en/articles/4936856-what-are-tokens-...

You can also check your input using their tokenizer: https://platform.openai.com/tokenizer

So, your example is ~9 tokens


So does the response count towards the token count?


It does, yeah. It's tokens "processed", meaning request & response. The API response will give you a total token count so you can keep track


Token cost is prompt + response. In the case of ChatGPT you don't know the full prompt, but they don't charge by tokens for that app.

In the API you need to tokenize your input and tokenize the output then add the counts together.


The pace at which they are releasing features and APIs is quite amazing.


Ok so can someone provide 10 steps to launching your own voice assistant?


Try asking chatGPT


Already integrated into my startup calling ChatGPT to find a good journalist who writes about startups:

https://jojogpt.valosan.com


I can't seem to find it mentioned but is there an unrestricted/uncensored mode for this? I'd love to have some fun with a few friends and hook it up in a matrix room for us



That's just a list of exploits that will be fixed as soon as they come to OpenAI's attention, if they haven't already been fixed. Is anyone actually committed to providing uncensored models as either paid services or open distributions?


Maybe try GPT-J, it's an open source model you can run locally. It's pretty massive though iirc


Hey, this is cool. Now that an official ChatGPT API is out maybe it's time for me to build that "Flavor text" Anki add-on I've been thinking about.


I always wondered if ChatGPT was somehow stateful. I assumed that it was not and the statefulness was simulated. Assumption validated.


I don’t believe this was ever in question. You can think of the model as a giant function that takes a list vectors as input and spits out a new vector. If you want the model to remember something, you have to include it in the list of input vectors for every request going forward.


It’s stateful in the web demo. But they do so by prepending chat history to new requests and automatically summarizing history once the model’s context window is exceeded.


I don't think they are doing summaries actually. Just cutting off the oldest parts of the conversation.


Summarizing would make sense however.


True


Wow not a single mention of Whisper this entire comment first page! I think Whisper is really cool: the large model can pull speech out of even heavily distorted (wind noise, clipping, etc) audio. I have a story to illustrate why running Whisper on your own locally is not so easy! Much easier to sign up to the OpenAI API.

In my research I found that actually pre-processing the audio to reduce noise (using the IMO best-in-class FB research "denoiser") actually increases WER. This was surprising! From a human perspective, I assumed bringing up the "signal" would increase accuracy. But it seems that, from a machine perspective, there's actually "information" to be gleaned from the heavily distorted noise part of the signal. To me, this is amazing because it reveals a difference in how machines vs humans process. The implication is that there is actually speech signal that is inside the noise, as if voice has bounced off and interacted with the noise source (wind, fan, etc), and altered those sounds, left its impression, and that this information is then able to be utilized and contributes to the inference. Incredible!

With whisper: I started with the standard python models. They're kind of slow. I tried compiling python into a single binary using various tools. That didn't work. Then I found whisper.cpp--fantastic! A port of whisper to C++ that is so; much; faster. Mind blowing speed! Plus easy compilation. My use case was including transcription in a private, offline "transcribe anything" MacOS app. Whisper.cpp was the way to go.

Then I encountered another problem. What the "Whisperists" (experts in this nascent field, I guess) call "hallucination". The model will "hallucinate". I found this hilarious! Another cross-over of human-machine conceptual models, our forever anthropomorphizing everything effortlessly. :)

Basically hallucination includes: feed Whisper a long period of silence, and the model is so desperate to find speech that it will infer (overfit? hallucinate?) speech out of the random background signal of silence / analog silence / background noise. Normally this presents as a loop of repeats of the previous accurately transcribed phrase. Or, with smaller models, some "end-of-youtube-video" common phrases like "Thank You!" or even "Thanks for Watching". I even got (from one particularly heavily distorted section, completely inaccurately) "Don't forget to like and subscribe!" Haha. But the larger models produce fewer hallucinations, and fewer generic "oh-so-that's-what-your-dataset-was!" hallucinations. But they do still hallucinate. Especially during silent sections.

At first, I tried using ffmpeg to chop the audio into small segments, ideally partitioned on silences. Unfortunately ffmpeg can only chop it into regular size segments, but it can output silence intervals, and you can chop around those (though not "online" / in real time, as I was trying to achieve). Removing the silent segments (even with the imperfect metric of "some %" of average output signal magnitude--sorry for my terminology, I'm no expert in DSP/audio) drastically improved Whisper performance. Suddenly it went from hallucinating during silent segments to perfect transcripts.

The other problem with silent segments is the model gets stuck. It gets "locked up" (spinning beach ball, blue screen of death style--I don't think it actually dies, but it spends a disproportionately long time on segments with no speech). Like I said before, it's so cute that it's so desperate to find speech everywhere; it tries really hard, and works its little legs off during silence, but to no avail.

Anyway, moving on to the next problem: the imperfect metric of silence. This caused many issues. We were chopping out quieter speech. We were including loud background noise. Both these things caused issues: the first obvious, the second, the same as we faced before: Whisper (or Whisper.cpp) would hallucinate text into these noise segments.

At last, I discovered something truly great! VAD. Voice Activity Detection is another (normally) AI technique that allows segmenting audio around voice segments. I tried a couple Python implementations in standard speech toolkits, but none were that good. Then I found Silero VAD: an MIT licensed (for some model versions), AI VAD model. Wonderful!

Next problem was it was also in Python. And I needed it to be in C++. Luckily there was a C++ example, using ONNX runtime. (I had no idea any of these projects or tools existed mere weeks ago, and suddenly I'm knee deep!). There were a few errors, but I got rid of the bugs, and had a little command line tool from a minimal C++ build of ONNXruntime / Protobuf-Lite and the model. Last step was the ONNX model needed to be converted to ORT format. Luckily there's a handy Python script to do this inside the Python release of ONNXruntime. And, now, the VAD was super fast.

So I put all these pieces together: ffmpeg, VAD, whisper.cpp and made a MacOS app (with the correct signing and entitlements of course!) to transcribe English text from any input format: audio or video. Pretty cool, right?
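
For anyone who wants the same VAD-then-Whisper idea in plain Python rather than the C++/ONNX route, a rough sketch (this assumes the snakers4/silero-vad torch.hub packaging and the helper order its README documents; the file name is hypothetical):

    import torch

    model, utils = torch.hub.load("snakers4/silero-vad", "silero_vad")
    get_speech_timestamps, _, read_audio, _, collect_chunks = utils

    wav = read_audio("memo.wav", sampling_rate=16000)                # hypothetical input file
    speech = get_speech_timestamps(wav, model, sampling_rate=16000)  # [{'start': ..., 'end': ...}, ...]
    voiced_only = collect_chunks(speech, wav)                        # concatenate just the speech samples
    # ...save voiced_only back to a wav and hand it to whisper / whisper.cpp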

Anyway, running Whisper on your own locally is not so easy! Much easier to sign up to the OpenAI API.

MacOS app using Whisper (C++) and VAD--conveniently called WisprNote heh :) https://apps.apple.com/app/wisprnote/id1671480366


Thank you very much for this detailed walk through.


You're welcome!


Hey, nice app. Can you add support for non-English transcription/translation, though?


Thanks! So, OK, if you write a review and provide your honest feedback about this on the App store, I will definitely consider doing that! Sound like a bad idea? :)


Done


Thank you sir! I will look for it. Stay tuned for updates! I might just consider putting that in soonish :) But I might not. I don't know. Can't guarantee anything about it right now. Thank you for asking and telling me about this!


Once everybody implements this API, voice recognition will no longer be an innovative feature.

OpenAI is commoditizing AI features.


Could someone explain what the point of "Bearer" is in this authorization header?

"Authorization: Bearer $OPENAI_API_KEY"


It's the bearer token authorization method. Pretty standard nowadays for many APIs.

https://swagger.io/docs/specification/authentication/bearer-...


API keys have been around for a long time without needing the prefix. I could understand the Bearer prefix when using JWT-style tokens. I could also see using it if there were indeed an Oauth flow involved. But in this case just seems like a nuisance.


As for why it's "Bearer", here's ChatGPT's answer:

> The term "Bearer" is commonly used in the context of securities and financial instruments to refer to the person who holds or possesses a particular security or asset. In the case of OAuth 2.0, the bearer token represents the authorization that a user has granted to a client application to access their protected resources.

> By using the term "Bearer" in the Authorization header, the OAuth 2.0 specification is drawing an analogy to the financial context where a bearer bond is a type of security that is payable to whoever holds it, similar to how a bearer token can be used by anyone who possesses it to access the protected resource.


That doesn't seem very compelling. And these aren't even JWT-style tokens which would make it a bit more understandable.


Bearer indicates the type of credential being supplied in the Authorization header. Bearer tokens are a type of credential, introduced in RFC6750 [0]. Essentially the OpenAI api key you’re using is a form of bearer token, and that’s why the Bearer type should be included there.

Other authentication methods (like username/password or “Basic”) use the Authorization header too, but specify “Authorization: Basic <base64 encoded credentials>”.

[0] https://www.rfc-editor.org/rfc/rfc6750


Does it mostly just mean that, for non-JWT-style tokens, the same string essentially serves as both a "username" and a "password"?



Will we be able to jailbreak it and use that output instead? A developer/hackerman mode would be awesome.


I'm not familiar with typical pricing but Whisper API at $0.006 / minute seems absurdly cheap!


I did a lot of research into this about 6 months ago, and the best price I could find/negotiate from the competition was 0.55/hr which included multi thousand dollar upfront commitments. This is 0.36/hr, and if you do a bit of setup work yourself you can bring it to about 0.09/hr. OpenAI offering hosted Whisper is a really good deal, and if you find it to be good for your application, then spending the time to host it yourself is perfect validation.


i'm pretty excited to have access to the system prompt, that's going to be huge i think.


Couldn’t find docs or references to the Whisper API. Anyone had a direct link they could share?



Awesome. Thank you!


this is so much cheaper than a fine-tuned model; it would make sense to try and use a multi-shot prompt in the 3.5-turbo model. plus, there's the hundreds/thousands of training items that you wouldn't need to create....fml


The ChatGPT API is cheaper than a fine-tuned babbage model.


How does the tokens thing work? Is it words of input plus words of output?


When will a YouTube transcriber based on this appear in the wild?


Why? YouTube already transcribes videos and that transcription is available via the video's official page, or via a 3rd party service like https://youtubetranscript.com/


Is there some sample code using these APIs? I want to use them.


literally just ask chatgpt. it will walk you through it in your language of choice.


can't you do whisper stuff for free already?


You can. You're just paying for compute and having it managed. Here's price estimates for 1,000 hours of audio on GCP: https://www.assemblyai.com/blog/how-to-run-openais-whisper-s...

For reference, from OpenAI it would be $360 and it's the large-v2 model.


Whisper large is a bit trickier to self-host, and the faster inference may be useful for certain applications.


Agree on that! Whisper large has big needs. But I didn't find the quality for English to be better than Medium. It just took longer. For most cases where audio is good quality, Small is all you need. Not much different to Medium. Only for really distorted (windy, loud background) audio were Medium and Large really good. But all models will fail beyond a certain point of extreme distortion.

If you don't believe me or want to know more check out my free app that uses Whisper Small, and (Whisper Tiny for Turbo mode): https://apps.apple.com/app/wisprnote/id1671480366

It uses VAD (voice activity detection) to reduce increased WER during silent or non-speech sections, and it's really fast! Runs locally on M1 just fine.


Wasn't there an API before?


ChatGPT is unreal. It's not artificial intelligence, it's kind of a supernatural intelligence.


What does Whisper do?


Right on time.

Google's Speech-to-Text is $0.024 per minute ($0.016 per minute with logging) with 60 free minutes per month. Files below 1 minute can be posted to the server, anything longer needs to be uploaded into a bucket, which complicates things, but at least they're GDPR compliant.

Whisper is $0.006 per minute with the following data usage policies

- OpenAI will not use data submitted by customers via our API to train or improve our models, unless you explicitly decide to share your data with us for this purpose. You can opt-in to share data.

- Any data sent through the API will be retained for abuse and misuse monitoring purposes for a maximum of 30 days, after which it will be deleted (unless otherwise required by law).

I've been using Whisper on a server (CPU only) to transcribe recordings made during a bike ride with a lavalier microphone, so it's pretty noisy due to the wind and the tires and Whisper was better than Google.

Plus, Whisper, when used with `response_format="verbose_json"`, outputs the variables `temperature`, `avg_logprob`, `compression_ratio`, `no_speech_prob` which can be used very effectively to filter out most of the hallucinations.
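
A sketch of that filtering, assuming the openai Python package; the thresholds here are made up and need tuning on your own recordings:

    import openai

    with open("ride.m4a", "rb") as f:
        result = openai.Audio.transcribe(model="whisper-1", file=f,
                                         response_format="verbose_json", language="de")

    kept = [seg["text"] for seg in result["segments"]
            if seg["no_speech_prob"] < 0.5 and seg["avg_logprob"] > -1.0]
    print(" ".join(kept))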

A one minute file which transcribes in 26 seconds on a CPU is done in 6 seconds via this service. Another one minute file with a lot of "silence" needs around 56 seconds on a CPU and was ready in 4.3 seconds via the service. "Silence" means that maybe 5 seconds of the file contain speech while the rest is wind and other environmental noises. Another relatively silent one went from 90 seconds down to 5.4. On the CPU I was using the medium model, while the service is using large-v2.

A couple of days ago I posted an example to a thread [0], where I was getting the following with Whisper

---

00:00.000 --> 00:05.000 Also temperaturmäßig ist es recht gut. [So temperature wise, it's pretty good.]

00:05.000 --> 00:09.000 Der eine hat 12 Grad, der andere 10. [One has 12 degrees, the other 10. (I have two temperature sensors mounted on the bike, ESP32 streaming the data to the phone via BLE)]

00:09.000 --> 00:12.000 Also sagen wir mal, 10 Grad. [So let's say 10 degrees.]

00:14.000 --> 00:19.000 Es ist bewölkt und windig. [It's cloudy and windy.]

00:20.000 --> 00:24.000 Aber irgendwie vom Wetter her gut. [But somehow from the weather it's good.]

00:24.000 --> 00:31.000 Ich habe heute überhaupt nichts gegessen und sehr wenig getrunken. [I ate nothing at all today and drank very little.]

00:54.000 --> 00:59.000 Vielen Dank für's Zuschauen! [Thanks for watching!] <-- hallucinated

---

While Google was outputting

"Also temperaturmäßig es ist recht gut, der eine hat 12° andere 10. Es ist angemalte 10 Grad. Es ist bewölkt und windig, aber er hat sie vom Wetter her gut, ich wollte überhaupt nichts gegessen und sehr wenig getrunken."

["So temperature-wise it's pretty good, one has 12° other 10. It's painted 10 degrees. It's cloudy and windy, but he has it good from the weather, I did not want to eat anything at all and drank very little."]

---

Apart from the hallucinated line, Whisper got everything correct, and the hallucinated line was able to be discarded due to the variables like `avg_logprob`.

[0] https://news.ycombinator.com/item?id=34877020#34880531


man why isn't this in dark mode


General question: is this the next way we'll manage to destroy the planet?

Imagine in the near future that having a slightly better, slightly more up to date LLM is a major competitive advantage. Whether that is between companies or nation-states doesn't really matter. So now all of those recently-idled GPUs will be put to use training and re-training ever bigger and more current models, once again sucking down electricity with no limit.

We're not there yet; there are too many ways to improve things without burning a country's worth of electricity re-training. But is it coming?


OpenAI released a ChatGPT API while reducing the cost by 10x.

For those claiming OpenAI is for profit: Why would OpenAI do this if they were fixated on making money?

Also, while I wish OpenAI released the code for ChatGPT, I applaud OpenAI for actually making their AI model available, to everyone, right now.

* Google hyped their Bard chatbot...but where is it?

* Facebook took down Galatica.

* Even Bing Chat has a waitlist


> For those claiming OpenAI is for profit: Why would OpenAI do this if they were fixated on making money?

Reducing costs by 10x may very well increase usage by more than 10x + it makes it even more difficult for competition to come in and undercut them.


>For those claiming OpenAI is for profit: Why would OpenAI do this if they were fixated on making money?

Silicon Valley companies have for the past 25 years focused on getting as many users as possible to increase valuation in the hope of getting a $100 billion exit. They don't care about current or near future profitability.

However I agree that OpenAI is getting far too much hate. Their goal of bringing openness to AI made sense in 2015 when one American company (Google) was dominating the field.

However now there are plenty of other companies, countries and open source organizations doing advanced AI research.


If OpenAI wanted to exit, Microsoft would be happy to buy them immediately for tens of billions. They're already valued at $30B. Why haven't they?


> For those claiming OpenAI is for profit: Why would OpenAI do this if they were fixated on making money?

To get lock-in from devs before competitors can enter the market, and to starve any would-be smaller competitors before they can raise money/gain traction.


This is what 'extinguish' looks like in the new EEE strategy from Microsoft that I have described before, years ago [0]: competitors with paid offerings are unable to compete with free, since OpenAI's pricing model is now close to free and their competitors cannot raise their prices.

Since Microsoft can foot the bill for the Azure infrastructure, there is going to be little room for anyone to seriously compete against OpenAI on price, API and features, unless it is completely free and open source, like Stability AI.

[0] https://news.ycombinator.com/item?id=28324999


please... the past ten years is a story of companies losing money to get a monopoly, and you are still asking us why they would do that?

let's not forget the shift of narrative that "open" AI made, from their name, their marketing and use of open source, to their move to a commercial subsidiary of Microsoft. let's also not forget that they totally avoided discussing copyright and the crawling of data sources to extract knowledge from someone else's property. the only comparable things I can see today which avoided so much scrutiny while being highly sensitive are ICOs in crypto and Theranos in biotech.

I hope open source wins this battle.


OpenAI never promised to be open source.

They promised to create a lab where the benefits of AI would accrue to the people, not just existing tech giants.

They did that. They created an API where individuals and small businesses alike can use cutting-edge AI technology. You have to pay; that's the only catch.

I don't see a DeepMind API. I don't see an Anthropic API. I don't see a Google Bard API. I don't see a FB Galactica API.

I hope OpenAI wins this battle.



