Meta Movie Gen

syntaxing · 2024-10-04T13:21:20 1728048080

I find the edit video with text the most fascinating aspect. I can see this being used for indie films that don’t have a CGI budget. Like the scene with the movie theater, you can film them on lounge chairs first and then edit it to seem like a movie theater.

gen3 · 2024-10-04T15:07:00 1728054420

100% agree, the background replace that puts the guy into a stadium would be fully usable as a cut in a movie/tv show, and the background is believable enough that no one would bat an eye. If you use it properly, I expect a quality uplift on indie films/shorts. Your limit is your creativity

jeltz · 2024-10-05T10:59:55 1728125995

I personally expect a decrease in quality. Without limits people tend to get less creative. Sure, there is some balance here in that tools also enable new things to be done which are not possible without tools but working around limits has often inspired some of the most creative works.

amelius · 2024-10-05T11:49:40 1728128980

I don't think that is necessarily true. Right now movies are so expensive that they can be created only by a few handfuls of people. But those people might not be the most creative people around. If thousands of people can create movies, we might find out that some people we didn't know of are far more creative.

Also "creation by committee" isn't a thing when somebody can produce a movie in their basement.

Anyway, I look forward to people using this tech to create alternative endings of existing movies.

fallous · 2024-10-05T12:08:12 1728130092

So expensive? It has never been cheaper to create movies thanks to digital cameras and non-linear editors, digital audio workstations, etc. You are no longer encumbered by the costs of film, development, renting an edit bay, requiring an audio editing studio to mix audio and maintain a tape library of special effects or hire foley artists, no need for an optical printer to layer visual effects, etc.

You already can produce a movie in your basement, many of which can be found on YouTube.

williamcotton · 2024-10-05T12:51:51 1728132711

Sure, the cost of making Clerks has dropped, but not the cost of making Dune.

xoac · 2024-10-05T17:05:10 1728147910

You aren’t going to be making dune with this. Maybe a “we have dune at home version”…

littlestymaar · 2024-10-05T18:22:39 1728152559

If the results aren't cherry-picked it looks like more than good enough to make any high budget movie from the early 2010s if not Dune.

Com60Score · 2024-10-05T22:39:13 1728167953

Like with all generative AI this comes down to how specific you can get, and how consistent you can be. If you can art direct each element of the frame, down to the design of the individual props and scenic items, and have those items remain consistent from shot to shot. Then do the same with lighting, actors, camera characteristics (eg lens, focus, position in the scene, framing), etc, etc, etc, then maybe you’ve got a chance of making a ‘high budget movie from the early 2010s’. But I haven’t seen any generative ai that comes even close to this level of control or consistency… They’re closer to slot machines than anything consistent…

norswap · 2024-10-07T16:38:50 1728319130

Yup, you can notice some issues even in their picked example. e.g. the prompt for the video of the painting woman says "there is a bear cub at her feet" and it quite clearly is not "at her feet" in the video.

alickz · 2024-10-05T20:06:44 1728158804

> Without limits people tend to get less creative

But lowering those limits allows for more people to get creative

How many beautiful stories never left their author's head because their author couldn't afford it? Either the monetary cost or the opportunity cost

Considering how many movies come from one place in one country (Hollywood), we haven't even scratched the surface of human creativity

fragmede · 2024-10-05T11:11:04 1728126664

And without being forced to interact with other people. The movie made by one creative and 100 automatons does not at all compare to the one where there are multiple brilliant creatives butting heads and personalities and choosing never to work with each other again but the show must go on.

How many movie lines have been adlibbed but are absolute classics? Sonofabitch, he stole my line!

XorNot · 2024-10-05T15:22:47 1728141767

What a bizarre statement in an age when the phrase "executive meddling" can describe the sameness of so much content output, and most of the greatest flops have a story which goes "yeah there were too many people involved".

Like the second Avengers movie had this problem in spades.

i-LINK · 2024-10-05T18:29:06 1728152946

It's not a bizarre statement at all. An executive meddling with something because X or Y element of a piece of art doesn't align with A or B market trend is not the same thing as people working together and sometimes clashing due to creative differences. You'll find that most works that you, or other people, like weren't the result of a sole individual's creative decisions going completely unchallenged. Others suggested, or revised, or fought. There can be too many cooks in the kitchen, of course, but that's an entirely different issue from executive meddling.

fragmede · 2024-10-05T23:19:21 1728170361

I'm not sure why you jump to executive involvement when I specifically stated two creatives butting heads. There's all sorts of stories where execs or someone came in and forced a movie to have, e.g., a giant robot spider in a wild west setting that didn't make sense, but they really wanted one in some movie so they forced it to happen in one of the projects they were overseeing. But the sum of us is better than the individual, so while there are solo artists out there, they're the exception rather than the rule.

golergka · 2024-10-05T17:00:07 1728147607

The era of easily available game engines has brought to life hundreds of thousands of garbage games, but that doesn't matter. What matters are the hundreds of really innovative ones that we wouldn't get otherwise.

bbqfog · 2024-10-05T17:11:37 1728148297

Using AI has all kinds of new and unusual limits. It's hard to get exactly what you want and you often get unexpected results along the way.

mrandish · 2024-10-05T06:06:30 1728108390

> Your limit is your creativity

In the professional creative tools business, "Now the only limit is your creativity" has been a popular marketing tagline for decades, especially for products based on new enabling technologies. It's common enough that a wry corollary has developed in response, which goes: "Unfortunately, for a lot of people that's a pretty big limit."

redundantly · 2024-10-05T00:09:41 1728086981

> Your limit is your creativity

And how many tokens you can afford.

gen3 · 2024-10-05T00:31:57 1728088317

You’re not wrong, but the comment somewhat misses the point. A shot like that would require you to rent a stadium (generally not cheap) along with paying for a believable number of extras. That would put the shot out of budget for most indie filmmakers. Spending $20 on tokens to get “good enough” is totally worth it, and allows you to get shots that were previously out of reach.

ErigmolCt · 2024-10-05T10:30:08 1728124208

In some ways it opens up creative possibilities

ForHackernews · 2024-10-04T13:36:25 1728048985

Why bother? Actors cost money and scheduling is difficult. Do the whole thing in AI - the model will be trained on better actors than your indie cast, anyway.

M4v3R · 2024-10-04T13:46:56 1728049616

It will be a loong time before AI can produce lead actors that are believable, act exactly the way you as a director want and tell the story you want to tell, so I think at least for now you'll still need the actors for the lead roles. But I can totally see this being used for generating people/stuff in the background of certain shots in a low budget movie.

noch · 2024-10-05T10:09:20 1728122960

> It will be a loong time before AI can produce lead actors that are believable.

A "loong time" will be sooner than most of us think.

The way this is done currently is similar to motion capture except that the tools are gradually becoming democratized: A single actor can act all the roles you need (You could even act the scenes and roles yourself). That footage is then fed into a model that generates an actor with the appearance and voice that you desire.

As a random on the internet, my prediction is that within a year, you'll be able to produce lead actors that are believable using movie generation plus smartphone footage of yourself acting the scene.

Initially it will be expensive to make a feature length film. But from 2025 onward, the cost will come down as the tools improve. These will be a different type of movie for sure, but every advancement in film technology has always led to films that seem strange compared to what came before.

XorNot · 2024-10-05T15:24:17 1728141857

You're getting downvoted, but I agree with you except for your timeline. This won't be possible in a year. What's here is a concept demo, but the gulf between "that's neat" and "you can make a decent 10 minute short film" is pretty vast.

noch · 2024-10-06T01:00:11 1728176411

> But the gulf between "that's neat" and "you can make a decent 10 minute short film" is pretty vast.

Agreed. I expect the tools I described to be prohibitively expensive for the average person for some time. By the end of 2025, probably only well-funded studios will be able to use such tools and probably not economically.

But I'd be quite surprised if Hollywood studios/publishers aren't using their immense back catalogue to train private models right now. I don't think they'll ever allow royalty-free movie generation tools to be used that were trained on their catalogues. So perhaps there'll be a cottage industry of stock footage by amateur and professional actors for training/augmenting the movie generation models and tools that will be available for general use, royalty free.

Or perhaps these tools will simply emerge as just another TikTok filter and we'll see goofy couples who filmed a skit in their living room presented as gladiators arguing on the surface of Mars while their dog runs back and forth between them unable to choose a favourite.

bboygravity · 2024-10-05T11:36:13 1728128173

"long time" in LLM land = 2 years tops

mikae1 · 2024-10-05T18:24:53 1728152693

Seems you drank way too much of the Altman Kool-Aid. :-D

How have LLMs actually developed in the last two years?

Have you noticed movies and TV series use multiple cameras to capture a scene from different angles?

When it comes to video these things can't get consistency between angles or scenes.

Add to that that the results are full of glitches and the resolution is equivalent to a CRT screen in the year 2000. Same resolution as s

Fixing these limitations would equate to a revolution rather than a steady evolution.

And let's not forget that these systems are also hugely unprofitable at this stage.

gloflo · 2024-10-05T12:54:32 1728132872

Anything is possible at zombo.com

ipaddr · 2024-10-04T21:54:48 1728078888

Isn't that a core problem now. Getting actors to act exactly how you want was never a solved problem.

But this limits promotion where actors do interviews and sell the movie to the public. It also limits an actor doing something crazy that tanks a movie like a tweet.

l33tbro · 2024-10-04T22:18:18 1728080298

The answer is that it depends on the director. For David Fincher or the Coen brothers, having this level of exactitude and precision is what their craft is all about.

But for plenty of other masters - think Cassavetes, Mike Leigh, even PTA - the actor's outstanding talent and instincts bring something to the script and vision that is outside of their prescriptive powers. Their focus is essentially setting up a framework for magic to happen inside of.

lagadu · 2024-10-05T11:13:44 1728126824

> Getting actors to act exactly how you want was never a solved problem.

As a choreographer myself, that's not necessarily a problem but rather a feature: it depends on how the director creates. Often you want what's unique to the performer, you don't want them to do something that's exactly like what you envisioned but whatever their interpretation/vision of it is, the "imperfectness" is what makes it interesting and rich.

squarefoot · 2024-10-05T05:38:13 1728106693

> Getting actors to act exactly how you want was never a solved problem.

Also, some great lines in movies came from actors' ad-libs. I hope there will always be some space for mild hallucination; without improvisation we wouldn't even have jazz.

mrandish · 2024-10-05T06:15:41 1728108941

Yep. Including one of the best moments in one of the most influential sci-fi films of all time. https://en.wikipedia.org/wiki/Tears_in_rain_monologue

lloeki · 2024-10-05T07:38:15 1728113895

"I know."

https://www.snopes.com/fact-check/harrison-ford-i-know/

zappchance · 2024-10-04T13:47:09 1728049629

Consistency between scenes is one possible reason.

deng · 2024-10-04T14:22:34 1728051754

These are not movies, these are clips. The stock photo/clip industry is surely worried about this, and probably will sue because 100% these models were trained on their work. If this technology ever makes movies, it'll be exactly like all the texts, images and music these models create: an average of everything ever created, so incredibly mediocre.

sethammons · 2024-10-05T05:03:34 1728104614

I imagine a movie maker where you say "use model A and put them in scene 32f, add a crowd and zoom in on A. They should look very worried." Then they can just play with it. Then save a scene, onto the next. Since AI can continue an animation, I don't see why it can't faithfully recreate given models with more development

DirkH · 2024-10-05T16:34:09 1728146049

What'll happen in both industries is the same that will happen everywhere else: adopt or die. The huge winners will be those that creatively use this new tool without 100% relying on it to do everything.

jkolio · 2024-10-05T14:19:17 1728137957

There have been several AI short film festivals, as well as several AI music videos that have been produced. The caveats are that quality varies, the best ones simply employ solid production in general (good editing, strong directorial vision, etc), and I don't know that anything feature length is out or even in the works.

spaceman_2020 · 2024-10-05T17:15:20 1728148520

the problem is that these stock footage companies are up against the richest corporations to ever exist. Legal recourse will take a monumental amount of money and time.

Hate to say this, but as things stand, tech companies stand to become all pervasive and all powerful if AI keeps growing the way it has

Aeolun · 2024-10-05T12:03:38 1728129818

Why are there so many websites that are essentially static HTML that make my phone stutter?

The videos look cool, but I can’t really enjoy reading about them if my phone freezes every 2 seconds.

_heimdall · 2024-10-05T12:30:25 1728131425

I'm seeing weird jank on a Pixel 6a / Chromium browser as well. I'm on mobile so I can't check the source, but this can't just be static HTML.

When I scroll the page, sections of text are missing then pop in, randomly though not as a scroll driven animation. It almost feels like something is blocking the browser's render loop and it can't catch up to actually paint the text. That'd be an insane bug on such a simple page, though I put nothing past react these days if they used it here.

joquarky · 2024-10-06T02:43:28 1728182608

I'm also having trouble with the page.

I wish web browsers had a "pause" button for scripts so I can just scroll to the bottom, let everything load and then hit pause and "freeze" the page contents so I can read without distractions.

I also feel like the quality of web UX is in rapid decline, and nobody wants to hire competent web developers who grok the fundamentals anymore.

Kailhus · 2024-10-05T21:01:44 1728162104

Not so much stutter here but definitely some layout shifts as images/video elements load :/

rudasn · 2024-10-05T18:14:23 1728152063

It's actually quite usable and fast if you turn javascript off.

hnben · 2024-10-08T12:42:00 1728391320

Maybe the companies that make them don't have enough know-how in web development.

runeks · 2024-10-05T12:13:46 1728130426

Which browser?

arendtio · 2024-10-05T13:56:00 1728136560

And which device.

Over the years, adding many simultaneous videos to websites has become quite common, and I have always marveled at how well many devices can handle this.

Nevertheless, it is pretty demanding for the hardware, and many smartphones are not made for such tasks.

kosolam · 2024-10-05T15:20:03 1728141603

Must be because Facebook use php

aDyslecticCrow · 2024-10-05T16:17:32 1728145052

Your comment makes no sense

selbyk · 2024-10-05T18:32:50 1728153170

I think that's the joke

Eikon · 2024-10-05T18:45:17 1728153917

Your comment uses php.

fasa99 · 2024-10-05T14:05:35 1728137135

Q: Why are there so many websites that are essentially static HTML that make my phone stutter?

A: because your phone is a potato

The first free tech support from HN is free, subsequent questions will be $29.99 per.

baxuz · 2024-10-05T17:43:16 1728150196

Took over 20s to settle on a Galaxy s21.

reneberlin · 2024-10-05T02:40:39 1728096039

We humans are so excessively dependent on vision input and with entertaining through visuals, too. But more and more all those visuals become meaningless to me and it all just feels like fast-food-junk to me.

That any pre-schooler will be able to produce anything imaginable in seconds (watch out, parents) doesn't make it better to me or give it any real value.

Ok, i needed to edit it again to add: maybe this IS the value of it. We can totally forget about fantasizing stories with visuals (movies) because nobody will care anymore.

ShrigmaMale · 2024-10-05T08:39:54 1728117594

they’re junk food-y visuals too. i don’t know how to describe it beyond looking like a cross between fisher-price and a light dose of shrooms.

baby · 2024-10-05T04:08:25 1728101305

Yeah I agree, I never understood the appeal of photography, it's so easy, you don't need to paint for hours to produce something original, you just need to buy a camera and click on a button. That's it. And people pay for that, I don't get it.

lurking_swe · 2024-10-05T10:01:45 1728122505

is it easy? i’m not a photographer but i enjoy taking pictures as a hobby. Your opinion baffles me haha. People always wonder why their iphone can’t shoot photos that look as good as the iphone camera reviews.

https://www.austinmann.com/trek/iphone-16-pro-camera-review-...

Understanding why your camera performs well in certain conditions and how to tweak the camera, is a mini-science all of its own. Also i think the reason photography is appreciated is because of composition.

Knowing where to position yourself, and at what moment, can make photographs turn out magical. Your eyes are processing millions of frames a second, but a photographer often gets just ONE chance to capture some scenes. Some people appreciate the “art” in that. :)

Here’s a thought experiment. Do you consider all wedding photographers to be of equal value? If not, why? Anyone can do it according to you. just click a button. Something to think about…

NibblesMeKibble · 2024-10-05T12:00:24 1728129624

> Knowing where to choose the right words, and at what moment, can make video gen turn out magical. The AI is processing millions of tokens a second, but a prompter often finds just ONE perfect prompt to capture some scenes. Some people appreciate the “art” in that. :)

I do. It can take hours to make the vision in your head be replicated by the AI. Sure you can spit out generic scenes, but the game changes when the goal is expressing a specific vision with the prompt; choosing the right synonyms, phrasing, etc to make it spit out the closest image.

Do I think I'm an artist for that? Not particularly. I like to think of myself as a prompt engineer, a completely different skill for sure, after all I do think overly logical and practical in life in general so it meshes well with my skills plus my artistic background.

jkolio · 2024-10-05T14:13:12 1728137592

I dislike the term "prompt engineer," particularly if you're not setting up systems on a technical level. And it's still artistic, without necessarily making prompters artists. Because the process is one closer to curation rather than creation, I like to think of prompters as "curators". You're reaching into latent space (a collective visual history), pulling out a collapsed possibility, and deciding if it fits your needs or not.

lurking_swe · 2024-10-05T19:56:24 1728158184

there’s a reason i put “art” in quotes. I don’t mean the traditional definition, but rather the less commonly used definition from the dictionary:

“a skill at doing a specified thing, typically one acquired through practice.”

I agree that a photographer is not a true artist. It’s not hard to understand why people might appreciate their skill and creativity though?

qup · 2024-10-05T11:43:07 1728128587

It's sarcasm

amelius · 2024-10-05T11:56:55 1728129415

Well to be honest I could have written that without sarcasm.

qup · 2024-10-05T12:16:58 1728130618

You don't understand why people pay for cameras?

acjohnson55 · 2024-10-05T14:55:41 1728140141

I honestly thought that, too, until I got my first digital camera and started trying to take photos that I liked as much as other ones I saw. Then I realized how much of a craft it is and I gained a much deeper appreciation.

golergka · 2024-10-05T17:02:33 1728147753

If this is sarcasm which aims to highlight the problem with parent comment, I wholeheartedly agree.

heurist · 2024-10-04T15:34:24 1728056064

I've been saying for years that generated content is an impending tsunami that's going to drown out all real human voices online. The internet may become effectively unusable as a result for anything other than entertainment.

boogieknite · 2024-10-04T16:14:51 1728058491

This is interesting and i see some of this now. Even here on HN and other forums i thought were mostly "human". Even one of my group chats i can tell one of my friends is using ai responses, but one of the other members cant tell and replies earnestly.

I am grossed out by this. My instinct is to avoid ai slop. The interesting part to me is: What next? Where do we go? Will it be that "human" forums are pushed further into the obscurity of the internet? Or will it go so far that we all start preferring meeting in person? I'm clueless here

the_gipsy · 2024-10-05T07:41:41 1728114101

> Even one of my group chats i can tell one of my friends is using ai responses, but one of the other members cant tell and replies earnestly.

Too bad she won't live! But then again, who does?

StefanBatory · 2024-10-05T17:45:20 1728150320

Vetting people into groups will become much more common, I think. Unless you can verify that person, ideally by knowing them irl, don't talk to them online.

sethammons · 2024-10-05T04:57:56 1728104276

Your droids, they'll have to wait outside

whiplash451 · 2024-10-04T18:57:29 1728068249

Cryptography-secured/signed generated content / interactions?

93po · 2024-10-04T19:00:42 1728068442

worldcoin project solves a lot of this when combined with web of trust, however everyone's knee jerk reaction to worldcoin is pretty bad and so it's annoying to even mention it

whiplash451 · 2024-10-04T19:09:47 1728068987

I’m not knowledgeable in crypto/worldcoin.

I was rather thinking classical cryptography baked into generative networks.

tim333 · 2024-10-04T20:48:32 1728074912

Worldcoin was intended to solve it but in practice it's not being used that way. Not enough adoption by users or websites.

Maybe in the future?

I'm a worldcoiner but so far it's just been free money.

93po · 2024-10-05T20:24:43 1728159883

i mean it's in its infancy, if there's a critical need for it, which i think there will be soon, i think it'll get more popular

Andrex · 2024-10-05T22:08:28 1728166108

US citizens still use paper Social Security Cards. The point where there is a "critical need" recognized by the relevant parties may be further off than you believe.

Aeolun · 2024-10-05T12:08:59 1728130139

Humans will start to notice this shit. I used openai to help me edit my stories originally, but then when I started reading other stories it quickly became evident to me that people just generated them entirely with AI.

ChatGPT is way too happy to overuse the word cacophony.

jowea · 2024-10-05T14:06:08 1728137168

My hot take is that we will have some small obscure forums with people, some social media flooded with AI content and other social media where you need to register with government ID and facescan.

shortrounddev2 · 2024-10-05T15:24:40 1728141880

I am 10,000% ready for forums to make a comeback. The internet hasn't been good since 2010

danlugo92 · 2024-10-05T04:45:22 1728103522

I got into hiking.

solardev · 2024-10-05T15:05:44 1728140744

Maybe that's a good thing. The internet never reached its potential as being the connective fabric of humanity. Mostly it's just marketing and spam. If the internet died and we all went back to smaller communities, that really wouldn't be the worst thing IMO. We're not really evolved for global communications at scale anyway.

danielbln · 2024-10-05T18:23:49 1728152629

We're not evolved for most things in modern life, that's not really much of an argument.

solardev · 2024-10-05T18:31:28 1728153088

Shrug. Maybe it's an argument to de-modernize more things in order to bring daily life back to the environments we're healthier and happier in.

Edit: In particular, I'm not convinced the internet was a net positive. Running water and sewage systems, sure. But what has 24/7 smartphone access actually done for our societies? Most people today don't seem any better off than they were in the 80s and 90s, and in many ways that actually matter, they seem worse off. Sure, they have access to way more information than our predecessors ever did... but it's not like we built a better world off it. Mostly the internet has accelerated the concentration of wealth towards the top, increased anxiety across the world, and significantly contributed to the global downfall of representative democracies, to name a few. Sometimes modernity can just be a collection of pathologies with a few beneficial side effects.

nl · 2024-10-05T12:47:26 1728132446

Why should I care?

Have you seen what most humans say? If an AI says more intelligent things I'm all for it.

jowea · 2024-10-05T14:04:24 1728137064

AIs say what the people giving them orders tell them to.

And until we get AGI, AI talk will necessarily be some combination of hallucinations, unintelligent, vapid, etc.

spaceman_2020 · 2024-10-05T17:16:31 1728148591

There could be AI agents in this thread right now

And you wouldn’t even know it

alickz · 2024-10-05T20:08:59 1728158939

If you can't tell, does it matter?

spaceman_2020 · 2024-10-06T04:17:56 1728188276

I really don’t think it does

nojs · 2024-10-05T12:58:01 1728133081

Mission fucking accomplished?

https://xkcd.com/810/

shortrounddev2 · 2024-10-05T15:22:53 1728141773

Would be nice if we were able to go to communities of human verified users. Smaller in scope than social media

cedws · 2024-10-05T12:46:03 1728132363

The Internet used to be a sort of hideaway for nerdy people to hang out and have fun. Ever since the invention of the smartphone, possibly before (see “Eternal September”) it’s gone to shit. These days I would rather spend time offline.

Are there any other Internet-based hideaways to retreat to? Somewhere where ads, clout chasing, and AI slop doesn’t exist?

shortrounddev2 · 2024-10-05T15:25:40 1728141940

IRC is more active than forums, but I miss forums

GreenWatermelon · 2024-10-06T10:08:34 1728209314

Come to Tildes.net! We're a comfy small community with a single rule: don't be an asshole.

TiredOfLife · 2024-10-05T11:34:30 1728128070

> The internet may become effectively unusable as a result for anything other than entertainment.

That already happened, even without AI.

okdood64 · 2024-10-05T19:41:55 1728157315

What was useful and usable about the internet before that isn't now?

chpatrick · 2024-10-05T11:21:41 1728127301

I think it will just become the new baseline and people will still value anything better than that.

Andrex · 2024-10-05T22:06:10 1728165970

Maybe we should abandon HTTP and create a new protocol just for humans. HHTTP.

skywhopper · 2024-10-04T23:09:12 1728083352

The only problem is that all the AI slop is not actually entertaining either.

danielbln · 2024-10-05T09:45:22 1728121522

Beware of sampling bias. Slop will always be slop.

TrackerFF · 2024-10-04T13:13:42 1728047622

All the vids have that instantly recognizable GenAI "sheen", for the lack of a better word. Also, I think the most obvious giveaway are all the micro-variations that happen along the edges, which give a fuzzy artifact.

lopis · 2024-10-04T13:40:13 1728049213

I assure you that's not enough. These are high quality videos. Once they get uploaded to social media, compression mostly makes imperfections go away. And it's been shown that when people are not expecting AI content, they are much less likely to realize they are looking at AI. I would 100% believe most of these videos were real if caught off guard.

jetrink · 2024-10-04T18:48:18 1728067698

A friend who lives in North Carolina sent me a video of the raging floodwaters in his state- at least that's what the superimposed text claimed it was. When I looked closer, it was clearly an Indian city filled with Indian people and Indian cars. He hadn't noticed anything except the flood water. It reminded me of that famous selective attention test video[1]. I won't ruin it for those who haven't seen it, but it's amazing what details we can miss when we aren't looking for them. I suspect this is made even worse when we're casually viewing videos in a disjointed way as on social media and we're not even giving one part of the video our full attention.

1. https://www.youtube.com/watch?v=vJG698U2Mvo

jsheard · 2024-10-04T20:34:03 1728074043

For the entire duration of the Russia/Ukraine war "combat footage" that is actually from the video game ARMA 3 has gone viral fairly regularly, and now exactly the same thing is happening with Israel/Iran.

aguaviva · 2024-10-04T21:18:12 1728076692

And which YouTube happily promotes straight to the top, of course -- thanks to the efforts of its rocket-science algorithm team. (Not sure whether the ones I've been seeing were generated by that particular platform, but YT does seem to promote obviously fake and deceptively labelled "combat" footage with depressing regularity).

ynniv · 2024-10-04T22:36:33 1728081393

The willingness of people to believe that combatants are wearing cinematic body cams for no tactical reason can only be matched by their willingness to assume people meticulously record every minute of their lives just so they can post a once-in-a-lifetime event on TikTok.

Who even needs AI generated videos when you can just act out absurdity and pretend it's real?

Seanambers · 2024-10-04T22:53:27 1728082407

As far as I know, most of the viral stuff has been active air defence, CIWS and the like, which can be hard to discern.

There's a morbid path from the grainy Iraq war and earlier shaky footage, through IS propaganda which at the time had basically the most intense combat footage ever released to the Ukraine war. Which took it to the morbid end conclusion of endless drone video deaths and edited clips 30+ mins long with day long engagements and defending.

And yes, to answer your belief that there is none - there is loads of "cinematic body cam footage" out there now.

daemoens · 2024-10-05T01:33:43 1728092023

Thousands of combatants are wearing bodycams, and pretty regularly, there are videos released by Russians of a dead Ukrainian's last moments taken from their corpse and the same happens vice versa.

baby · 2024-10-05T04:10:27 1728101427

Dude I clicked on some random YouTube accounts that were streaming the World Cup live, and it took me a while to realize that they were actually just streaming a video game replica of the actual game (at least, I think they were simulating the actual game with a video game, but I'm not sure as I didn't compare closely)

riffraff · 2024-10-05T04:32:09 1728102729

I've seen that a bunch of times, there's CGI highlights of most football matches.

I still don't know if it's autogenerated from the original video or recreated manually but yeah it's pretty realistic for the first few seconds.

jsheard · 2024-10-05T10:30:29 1728124229

Someone once did the opposite - streamed a real pay-per-view UFC match on Twitch and pretended it was a game he was playing. It actually worked for a while before the Twitch mods realized what was going on.

https://www.theverge.com/2017/12/4/16732912/ufc-video-stream...

stavros · 2024-10-04T22:40:39 1728081639

It's kind of sad that we don't even need AI to create misinformation, the bar for what people will fall for is really low.

sufficer · 2024-10-05T00:14:19 1728087259

I've shown my own videos I made in DCS World to idiots at the bar in airports and they believed I was the Ghost of Kyiv lmao

fossuser · 2024-10-04T20:41:26 1728074486

People believe false things easily if it confirms their priors. Confirmation bias is strong.

Fake images play into that, but they don't need to be AI generated for that to be true, it's been true forever.

wpietri · 2024-10-04T21:25:47 1728077147

And let's not forget the paper that goes with the video, which has a stellar title: http://www.chabris.com/Simons1999.pdf

itslennysfault · 2024-10-04T19:12:34 1728069154

hmmm... Maybe it's because I knew it was testing me, but I noticed it right away and counted the right count.

I could see it being pretty shocking if I hadn't, but I honestly can't imagine how I'd miss that.

jetrink · 2024-10-04T19:24:13 1728069853

It probably doesn't work if you're primed to look for hidden details. I took the test along with my Psychology 101 class of about 30 people and no one noticed anything amiss.

gitaarik · 2024-10-05T05:20:43 1728105643

Once you see it you can indeed not imagine how you couldn't. Some people see it the first time, but it's a small number of people. This video just demonstrates how humans can only focus on one thing at a time, and when we're multitasking, we're actually doing little parts of different tasks one at a time but very quickly after each other, kind of like a single CPU core. And if we tightly focus our attention on one point, we are not aware of other things that might be relatively close to that point.

That is also how magicians work, drawing your attention to one particular thing, hiding the secret of the trick from you, sometimes even in plain sight, like in the video.

Or pickpockets, who might bump into you and pick your pocket at the same time, where your attention is focussed on the sudden impact, keeping your attention away from your wallet being taken.

firebaze · 2024-10-04T19:18:25 1728069505

> hmmm... Maybe it's because I knew it was testing me, but I noticed it right away and counted the right count.

> I could see it being pretty shocking if I hadn't, but I honestly can't imagine how I'd miss that.

The point of the video wasn't to count correctly, but to see the gorilla

bee_rider · 2024-10-04T20:16:28 1728072988

99% the person was playing along for the rest of us, so we get a chance to enjoy the video as intended.

yunwal · 2024-10-04T19:33:32 1728070412

cool, he noticed it right away

andrewinardeer · 2024-10-04T20:50:31 1728075031

I believe them. Why would people lie on the internet?

xsmasher · 2024-10-04T20:51:10 1728075070

> I noticed it right away

lz400 · 2024-10-05T01:28:18 1728091698

I was focused on counting. I counted very wrong, but caught the gorilla right away.

scotty79 · 2024-10-04T20:11:01 1728072661

If you see a text accompanying some content you can de-prime yourself by saying "nuh-uh, that's exactly what it's fscking not."

hackernewds · 2024-10-04T19:22:06 1728069726

I do not see how the examples you mentioned are related to the topic? What does selective attention have to do with the video looking AI generated in all the frames?

CSSer · 2024-10-04T19:25:08 1728069908

Their argument is that if someone is affected by confirmation bias, they likely won’t notice these kinds of details.

Essentially, send me a video of something I care about and I will only look for that thing. Most people are not detectives, and even most would-be detectives aren’t yet experts.

szundi · 2024-10-04T20:26:44 1728073604

probably people will soon develop a habit of verifying every detail in videos of interest haha

szvsw · 2024-10-04T21:45:15 1728078315

Cause people are well known for verifying every detail in most other forms of media already right?

gitaarik · 2024-10-05T05:23:12 1728105792

Verifying what?

paul7986 · 2024-10-04T20:45:00 1728074700

Indeed watching Reels or Tiktok videos is an exercise in testing your bullshit meter and commenting accordingly to let the uninformed know hey this is most likely fake.

asveikau · 2024-10-05T15:11:34 1728141094

Facebook is mostly this now too. Long comment threads of boomers thanking AI images for their military service or congratulating it on a long marriage.

mikae1 · 2024-10-04T18:42:03 1728067323

> it's been shown that when people are not expecting AI content, they are much less likely to realize they are looking at AI.

At this point, looking at a big tech SoMe feed I would expect that everything is, or at least could be, gen AI content.

ddtaylor · 2024-10-04T15:44:01 1728056641

I regularly catch my kids watching AI generated content and they don't know it.

dham · 2024-10-04T19:58:05 1728071885

It's kind of an interesting phenomenon. I read something on this. Basically being born between ~1980 and ~1990 is a huge advantage in tech.

ben_w · 2024-10-04T21:31:23 1728077483

The only generation that ever knew how to set the clock on a VCR: our parents needed our help; our kids won't have even seen a VCR outside of a museum, much less used one.

wsintra2022 · 2024-10-05T00:40:08 1728088808

Very interesting point. Wonder about the generation before and what skills they had to share with their parents who were most likely traumatised from a world war or two. I remember setting the VCR clock and tuning the new TV with the remote. I’m sure the adults could have figured it out but they probably got more from seeing their ‘smart’ kids figuring it out in time for the cartoons!

driverdan · 2024-10-05T03:13:43 1728098023

The parents of those of us who grew up in the 80's and 90's invented the VCR, they could use it just fine.

WhyOhWhyQ · 2024-10-05T00:37:57 1728088677

The Zoomers have the advantage that the bar is pretty low these days.

throwup238 · 2024-10-04T16:28:36 1728059316

A surprising amount of it is really popular too. I recently figured out that the Movie Recaps channel was all AI when the generated voice slipped and mispronounced a word in a really unnatural way. They post videos almost daily and they get millions of views. Must be making bank.

the_af · 2024-10-04T21:08:52 1728076132

A group I follow about hobby/miniatures (as in wargaming miniatures and dioramas) recently shared an "awesome" image of a diorama from another "hobby" group.

The image had all the telltale signs of being AI generated (too much detail, the lights & shadows were the wrong scale, the focus of the lens was odd for the kind of photo, etc). I checked that other group, and sure enough, they claim to be about sharing "miniature dioramas" but all they share is AI-generated crap.

And in the original group, which I'm a member of and is full of people who actually create dioramas -- let's say they are "subject matter experts" -- nobody suspected anything! To them, who are unfamiliar with AI art, the photo was of a real hand-made diorama.

inkcapmushroom · 2024-10-04T18:29:05 1728066545

I was watching UFC recaps on Youtube and the algorithm got me onto AI generated MMA content, I watched for a while before realizing it. They were using old videos which were "enhanced" using AI and had an AI narrator. I only realized it when the fight footage got so old, and the AI had to do so much work to touch it up, that artifacts started appearing in the video. Once I realized it I rewatched the earlier clips in the video and could see the artifacts there too, but not until I was looking for them.

araes · 2024-10-05T19:26:50 1728156410

There's already rabbit holes of fake MMA fighting you can fall into online? Even if you're a "fan" and relatively aware of what to look for ... still difficult to spot? Horribly, had the same sensation while watching UFC at a bar. "Haven't I seen this match where they fall on the ground and hug for hours before?" Mostly empty background audience with limited reactions.

Somebody took AI video editing, and in a year or two, we're already at entire MMA rabbit holes of fake videos.

Commenting mostly as a personal evidence reference of how crazy the World Wide Web has gotten from anecdotal sources.

Twisell · 2024-10-04T18:42:26 1728067346

Most probably they employ overseas, underpaid workers with non-standard English accents and so they include text-to-speech in the production process to smoothen the end result.

I won't argue whether text to speech qualifies as an AI but I agree they must be making bank.

bee_rider · 2024-10-04T20:20:25 1728073225

I wonder if they are making bank. Seems like a race to the bottom, there’s no barrier to entry, right?

atomic128 · 2024-10-04T20:41:39 1728074499

Right, content creators are in a race to the bottom.

But the people who position themselves to profit from the energy consumption of the hardware will profit from all of it: the LLMs, the image generators, the video generators, etc. See discussion yesterday: https://news.ycombinator.com/item?id=41733311

Imagine the number of worthless images being generated as people try to find one they like. Slop content creators iterate on a prompt, or maybe create hundreds of video clips hoping to find one that gets views. This is a compute-intensive process that consumes an enormous amount of energy.

The market for chips will fragment, margins will shrink. It's just matrix multiplication and the user interface is PyTorch or similar. Nvidia will keep some of its business, Google's TPUs will capture some, other players like Tenstorrent (https://tenstorrent.com/hardware/grayskull) and Groq and Cerebras will capture some, etc.

But at the root of it all is the electricity demand. That's where the money will be made. Data centers need baseload power, preferably clean baseload power.

Unless hydro is available, the only clean baseload power source is nuclear fission. As we emerge from the Fukushima bear market where many uranium mining companies went out of business, the bottleneck is the fuel: uranium.

valval · 2024-10-04T23:12:46 1728083566

You spent a lot of words to conclude that energy is the difference maker between modern western standards of living and whatever else there is and has been.

atomic128 · 2024-10-04T23:50:28 1728085828

Ok, too many words. Here's a summary:

Trial and error content-creation using generative AI, whether or not it creates any real-world value, consumes a lot of electricity.

This electricity demand is likely to translate into demand for nuclear power.

When this demand for nuclear power meets the undersupply of uranium post-Fukushima, higher uranium prices will result.

defrost · 2024-10-04T23:58:27 1728086307

Continuing that thought, higher uranium prices and real demand will lead to unshuttering and exploiting known and proven deposits that are currently idle and increase exploration activity of known resources to advance their status to measured and modelled for economic feasibility, along with revisiting radiometric maps to flag raw prospects for basic investigation.

More supply and lower prices will result.

Not unlike the recent few years in (say) lithium, anticipated demand surged exploration and development, actual demand didn't meet anticipated demand and a number of developed economically feasible resources were shuttered .. still waiting in the wings for a future pickup in demand.

atomic128 · 2024-10-05T00:14:52 1728087292

Spend a few months studying existing demand (https://en.wikipedia.org/wiki/List_of_commercial_nuclear_rea...), existing supply (mines in operation, mines in care and maintenance, undeveloped mines), and the time it takes to develop a mine. Once you know the facts we can talk again.

Look at how long NexGen's Rook 1 Arrow is taking to develop (https://s28.q4cdn.com/891672792/files/doc_downloads/2022/03/...). Spend an hour listening to what Cameco said in its most recent conference call. Look at Kazatomprom's persistent inability to deliver the promised pounds of uranium, their sulfuric acid shortages and construction delays.

Uranium mining is slow and difficult. Existing demand and existing supply are fully visible. There's a gap of 20-40 million pounds per year, with nothing to fill the gap. New mines take a decade or more to develop.

It is not in the slightest like lithium.

defrost · 2024-10-05T00:23:33 1728087813

> Spend a few months studying existing demand

Would two decades in global exploration geophysics and being behind the original incarnation of https://www.spglobal.com/market-intelligence/en/industries/m... count?

> Once you know the facts we can talk again.

Gosh - that does come across badly.

atomic128 · 2024-10-05T00:33:19 1728088399

Apologies.

When someone compares uranium to lithium, I know I'm not talking to a uranium expert.

All the best to you, and I'll try to be more polite in the future.

defrost · 2024-10-05T00:39:13 1728088753

Weird .. and to think I spent several million line kms in radiometric surveys, worked multiple uranium mines, made bank on the 2007 price spike and that we published the definitive industry uranium resources maps in 2006-2010.

Clearly you're a better expert.

> when someone compares uranium to lithium, I know I'm not talking to a uranium expert.

It's about boom bust and shuttering cycles that apply in all resource exploration and production domains.

Perhaps you're a little too literal for analogies? Maybe I'm thinking in longer time cycles than yourself and don't see a few years of lag as anything other than a few years.

atomic128 · 2024-10-05T00:58:15 1728089895

Once again, allow me to offer my sincere apologies.

You are well-prepared to familiarize yourself with the current supply/demand situation. It's time to "make bank", just like you did in 2007... only more so. The 2007 spike was during an oversupplied uranium market and mainly driven by financial actors.

I invite you to begin by listening to any recent interview with Mike Alkin.

Good night and enjoy your weekend.

derefr · 2024-10-04T19:35:12 1728070512

> Most probably they employ overseas, underpaid workers with non-standard English accents and so they include text-to-speech in the production process to smoothen the end result.

Might also be an AI voice-changer (i.e. speech2speech) model.

These models are most well-known for being used to create "if [famous singer] performed [famous song not by them]" covers — you sing the song yourself, then run your recording through the model to convert the recording into an equivalent performance in the singer's voice; and then you composite that onto a vocal-less version of the track.

But you can just as well use such a model to have overseas workers read a script, and then convert that recording into an "equivalent performance" in a fluent English speaker's voice.

Such models just slip up when they hit input phonemes they can't quite understand the meaning of.

(If you were setting this up for your own personal use, you could fine-tune the speech2speech model like a translation model, so it understands how your specific accent should map to the target. [I.e., take a bunch of known sample outputs, and create paired inputs by recording your own performances of them.] This wouldn't be tenable for a big low-cost operation, of course, as the recordings would come from temp workers all over the world with high churn.)

spywaregorilla · 2024-10-04T20:11:19 1728072679

Can you identify any of these models?

asveikau · 2024-10-05T15:18:12 1728141492

I think it's unusual to assume they are based in the US and employ/underpay foreigners. A lot of people making the content are just from somewhere else.

mystifyingpoi · 2024-10-04T19:56:40 1728071800

But it uses AI only for audio, right? Script for the vid seems to be written by human, given the unusual humor type of this channel. I started watching this channel some time ago.

throwup238 · 2024-10-04T20:50:27 1728075027

It's hard to tell whether they use AI for script generation. After having seen enough of those recaps, the humor seems to be rather mechanical and basic humor is relatively easy to get from an LLM if prompted correctly. The video titles also seem as if they were generated.

That said, this channel has been producing videos well before ChatGPT3.5/4 so at the very least they probably started with human written scripts.

mewpmewp2 · 2024-10-04T23:37:02 1728085022

I thought it was just text to speech when I happened to see some of those videos. And it seems to have been consistently similar since before ChatGPT etc. Why do you think titles are AI generated?

I feel like it might actually be quite complex for AI to pull up the perfect clips and edit them with the script, including timing and everything. Maybe it could be made automatic, but nonetheless it would be a complex process and I don't think it was possible a few years ago. I know Gemini and possibly some others can analyze video if fed to them, but I'm still skeptical that this channel in particular would have done it, when they have always had this frequency of uploads and similar tone.

Also I think there's far better TTS now with ElevenLabs and others so it could be made much more human like.

valval · 2024-10-04T23:05:55 1728083155

The way I see it, it won’t take long before human eyes won’t be able to distinguish AI generated content from original.

The only regret I have about that is losing video as a form of evidence. CCTV footage and the like are a valuable tool for solving crimes. That’s going to be out the window soon.

akoboldfrying · 2024-10-05T00:23:42 1728087822

Trust can be preserved by adding PKI at the hardware level. What you said about CCTV is true; once the market realises and demand appears, camera manufacturers will start making camera modules that, e.g., sign each frame with the manufacturer's private key, enabling Joe Public to verify that that frame came from a camera made by that manufacturer. Reputational risk makes the manufacturer store the private key in the device in a secure, tamper-proof way (like TPMs do now), which (mostly) prevents those private keys from leaking.

Does this create difficulties if you want to modify the raw video data in any way? Yes it does, even if you just want to save it in a different lossy compression level or format. But these problems aren't insurmountable. Essentially, provenance info can be added for each modification, signed by the entity that made the change, and the end viewer can then decide if they trust the full certificate chain (just as they do now with HTTPS).
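To make the signing-and-provenance idea above a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names (`make_keypair`, `add_step`, `verify_chain`), the record layout, the party names, and above all the toy textbook-RSA signature built from small well-known primes. That signature is not secure and only stands in for whatever scheme a real camera module would use. The sketch shows a camera signing the raw bytes, an editor signing a re-encoded version with a link back to the previous record, and a viewer checking the whole chain against the keys they choose to trust.

```python
import hashlib

E = 65537  # common RSA public exponent


def make_keypair(p, q):
    """Toy RSA keypair from two primes. Illustration only, NOT secure."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse (Python 3.8+)
    return (n, E), (n, d)  # (public key, private key)


def _digest(data, n):
    # Hash the bytes and reduce into the range of the toy modulus.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data, private_key):
    n, d = private_key
    return pow(_digest(data, n), d, n)


def verify(data, signature, public_key):
    n, e = public_key
    return pow(signature, e, n) == _digest(data, n)


def add_step(chain, video_bytes, signer, private_key):
    """Append a provenance record for the current version of the video."""
    prev_sig = chain[-1]["signature"] if chain else 0
    content_hash = hashlib.sha256(video_bytes).hexdigest()
    # Each record covers the previous signature, so records cannot be reordered.
    payload = f"{prev_sig}|{content_hash}".encode()
    chain.append({
        "signer": signer,
        "content_hash": content_hash,
        "signature": sign(payload, private_key),
    })


def verify_chain(chain, video_bytes, trusted_keys):
    """True only if every record is signed by a trusted party and the
    last record matches the bytes the viewer actually has."""
    prev_sig = 0
    for step in chain:
        public_key = trusted_keys.get(step["signer"])
        if public_key is None:
            return False  # the viewer does not trust this signer
        payload = f"{prev_sig}|{step['content_hash']}".encode()
        if not verify(payload, step["signature"], public_key):
            return False
        prev_sig = step["signature"]
    final_hash = hashlib.sha256(video_bytes).hexdigest()
    return bool(chain) and chain[-1]["content_hash"] == final_hash


# Hypothetical parties; the primes are small known Mersenne primes.
camera_pub, camera_priv = make_keypair(2**127 - 1, 2**89 - 1)
editor_pub, editor_priv = make_keypair(2**107 - 1, 2**61 - 1)

raw = b"raw frame bytes"
reencoded = b"same frame, recompressed"

chain = []
add_step(chain, raw, "camera-maker", camera_priv)
add_step(chain, reencoded, "editor", editor_priv)

trusted = {"camera-maker": camera_pub, "editor": editor_pub}
print(verify_chain(chain, reencoded, trusted))
print(verify_chain(chain, b"doctored", trusted))
```

Note what the chain does and does not prove in this sketch: it shows who vouched for each version, not that the edit was faithful. Whether to trust the editor is still the viewer's decision, which is the HTTPS-like part of the idea.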

gitaarik · 2024-10-05T05:36:15 1728106575

Oh wow, that's a great idea. Isn't this already happening maybe?

Recently someone said here that it's noticeable that videos from CCTV cameras are often filmed with a phone or camera on a screen instead of using the original video, and people were speculating that it might be hard or impossible to get access to the original recording because of bureaucracy or something, but that recording a playback on a screen with a phone or camera or something is then often allowed. Maybe they also do this partly so that the original can't be easily messed with by other people.

But yeah if you can verify that a certain video was filmed at a certain time by a certain camera, that is great. Of course the companies providing these cameras should be trustworthy, and that the cameras are actually really sending what they actually record, and that the company itself doesn't mess with the original recordings.

akoboldfrying · 2024-10-05T09:19:42 1728119982

>Isn't this already happening maybe?

I recall an article posted 1-2 years ago about a camera company (Kodak? Can't remember) which was starting to offer something along these lines.

>the companies providing these cameras should be trustworthy, and that the cameras are actually really sending what they actually record, and that the company itself doesn't mess with the original recordings.

I agree. We can't guarantee any of these things, but on the bright side, the incentives are pointing in the right direction to make self-interested companies choose to behave the right way.

It will complicate things and make the hardware more expensive, so I doubt it will sweep through all consumer camera tech unless the "Is this photo real?" question becomes a crisis. There's also the fact that it would be possible to give individual cameras different private keys, with certificates signed by the manufacturer: This would enable non-repudiation (you would not be able to plausibly deny that you had taken a particular photo/video), which has potentially big upsides but also privacy downsides. I think that could be solved by giving the user the option of signing with their unique camera private key (when the user wants to prove to others that they took the photo themselves) or with the manufacturer's key (when they want to remain anonymous).
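The per-camera key idea can be sketched the same way. This is a hedged illustration, not how any real camera works: the toy textbook-RSA helper (insecure, small known primes), the `device_cert` layout, and the `sign_photo` / `verify_photo` names are all assumptions. It shows a manufacturer signing a device's public key as a "certificate", and the user choosing between signing with the device key (attributable to one camera) or with the shared manufacturer key (anonymous).

```python
import hashlib

E = 65537  # common RSA public exponent


def make_keypair(p, q):
    """Toy RSA keypair from two primes. Illustration only, NOT secure."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse (Python 3.8+)
    return (n, E), (n, d)  # (public key, private key)


def _digest(data, n):
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data, private_key):
    n, d = private_key
    return pow(_digest(data, n), d, n)


def verify(data, signature, public_key):
    n, e = public_key
    return pow(signature, e, n) == _digest(data, n)


def encode_public_key(public_key):
    n, e = public_key
    return f"{n}:{e}".encode()


# Hypothetical manufacturer key and one per-device key.
maker_pub, maker_priv = make_keypair(2**127 - 1, 2**89 - 1)
device_pub, device_priv = make_keypair(2**107 - 1, 2**61 - 1)

# The "certificate": the manufacturer's signature over the device's public key.
device_cert = {
    "device_key": device_pub,
    "maker_signature": sign(encode_public_key(device_pub), maker_priv),
}


def sign_photo(photo, identify_device):
    """Sign with the unique device key, or anonymously with the maker key."""
    if identify_device:
        return {"mode": "device",
                "signature": sign(photo, device_priv),
                "cert": device_cert}
    return {"mode": "anonymous", "signature": sign(photo, maker_priv)}


def verify_photo(photo, tag, maker_public_key):
    if tag["mode"] == "anonymous":
        return verify(photo, tag["signature"], maker_public_key)
    cert = tag["cert"]
    # First check the manufacturer really vouched for this device key...
    if not verify(encode_public_key(cert["device_key"]),
                  cert["maker_signature"], maker_public_key):
        return False
    # ...then check the photo against that device key.
    return verify(photo, tag["signature"], cert["device_key"])


photo = b"holiday photo"
print(verify_photo(photo, sign_photo(photo, True), maker_pub))
print(verify_photo(photo, sign_photo(photo, False), maker_pub))
```

In the anonymous mode every camera from that manufacturer produces signatures that verify against the same public key, which is what keeps the photographer unidentifiable; the cost is that one leaked shared key undermines all of them.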

arcastroe · 2024-10-05T00:55:13 1728089713

It's sad that almost AS SOON as we acquired the ability to record real-life moments (with the promise of being able to share undeniable evidence of events with one another), we also acquired the ability to doctor it, negating that promise.

jowea · 2024-10-05T14:10:43 1728137443

I'm not sure we should have been trusting images for the previous decades either. Photoshop has been a thing for a long time already. I mean, there's those famous photos that Stalin had people removed from.

acdha · 2024-10-05T14:27:32 1728138452

Your mention of Stalin is I think stronger as an argument that there’s been a significant change. Those fakes took lots of time by skilled humans and were notoriously obvious - what made them effective was the crushing political power preventing them from receiving critical analysis in public.

Similarly, while Photoshop made it easier it happened at a time where technical advances made the problem harder because everyone’s standards for photos went up dramatically, and so producing a realistic fake was still a slow process for a skilled worker.

Now, it’s increasingly available to everyone and that means that we’re going to see a lot more scams and hoaxes as people without artistic talent or willingness to invest time can make realistic fakes even for minor things. That availability is transformative enough to merit the concern we’ve been seeing here.

jowea · 2024-10-05T15:02:19 1728140539

The glass half-full in me feels that the advantage to this is that in a few years the average person will know better than to trust anything that could be faked like that, instead of the old situation where someone who was willing to put in that effort could actually trick a lot of people.

acdha · 2024-10-05T16:49:58 1728146998

I think that’s true, but it’s kind of like the trade offs during the pandemic where we knew it would eventually settle into a stable state but still wanted to reduce the harm getting there. We basically need some large fraction of the global population to level up in media literacy all at once.

jowea · 2024-10-05T14:09:28 1728137368

I don't think it goes out the window completely. You need just the owner of the CCTV to stand up in court and say "yes this is the CCTV footage I personally copied from storage and I did not manipulate it".

mikehollinger · 2024-10-05T12:46:07 1728132367

> compression mostly makes imperfections go away

The ultimate compression is to reduce the video clip to a latent space vector representation to be rendered on device. :)

Just give us a few more revs of Moore’s law for that to be reasonable.

edit: found a patent… https://patents.google.com/patent/US11388416B2/en

dekhn · 2024-10-04T18:50:27 1728067827

That sheen looks (to me) like some of the filters that are used by people who copy videos from TV and movie and post them on (for example) facebook reels.

There's an entire pattern of reels that are basically just ripped-off-content with enough added noise to (I presume) avoid content detection filters. Then the comments have links to scam sites (but are labelled as "the IMDB page for this content").

CSSer · 2024-10-04T19:16:29 1728069389

The idea that Meta’s effectively stolen content is tainted by a requirement to avoid collecting stolen content is laughably ironic.

kylebenzle · 2024-10-04T19:36:08 1728070568

Yes, but that's just a hypothesis. Have we seen any evidence that shows the cause of the "AI sheen" is bad training data, or is it more likely just a shortcoming of generating realistic photos from text at this early stage?

newaccount74 · 2024-10-04T13:41:57 1728049317

I thought the movements were off. The little girl on the beach moves like an adult, the painter looks like a puppet, and everything is in slow motion?

declan_roberts · 2024-10-04T15:02:09 1728054129

They look like some commercial promo video, which makes sense since that's probably what they were trained on.

the_af · 2024-10-04T21:11:29 1728076289

To me they seem off, but off in the same sense real humans in ads always seem off. E.g. the fake smile of the smiling girl. That's what people look like in ads.

DebtDeflation · 2024-10-04T13:23:51 1728048231

At least all the humans in these videos seem to have the correct number of fingers, so that's progress. And Moo Deng seems to have a natural sheen for some reason so can't hold that against them. But your point about the edges is still a major issue.

blargey · 2024-10-04T13:29:39 1728048579

I wonder how much RLHF or other human tweaking of the models contributes to this sort of oversaturation / excess contrast in the first place. The average consumer seems to prefer such features when comparing images/video, and use it as a heuristic for quality. And there have been some text-to-image comparisons of older gen models to newer gen, purporting that the older, more hands-off models didn't skew towards kitschy and overblown output the way newer ones do.

Rinzler89 · 2024-10-04T13:18:01 1728047881

>All the vids have that instantly recognizable GenAI "sheen"

That's something that can be fixed in a future release or you can fix it right now with some filters in post in your pipeline.

atrus · 2024-10-04T13:25:19 1728048319

I think the big blind spot people have with these models is that the release pages only show just the AI output. But anyone competently using these AI tools will be using them in step X of a hundred step creative process. And it's only going to get worse as both the AI tools improve and people find better ways to integrate them into their workflow.

Rinzler89 · 2024-10-04T13:35:50 1728048950

Yeah exactly. Video pipelines that go into productions we only see the end results of have a lot of steps to them beyond just the raw video output/capture. Even Netflix/Hollywood productions without VFX have a lot of retouching and post processing to them.

derefr · 2024-10-04T19:38:26 1728070706

Not even filters; every text2image model ever created thus far can be very easily nudged with a few keywords into generating outputs in a specific visual style (e.g. artwork matching the signature style of any artist it has seen some works from.)

This isn't an intentional "feature" of these models; rather, it's kind of an inherent part of how such models work — they learn associations between tokens and structural details of images. Artists' names are tokens like any other, and artists' styles are structural details like any other.

So, unless the architecture and training of this model are very unusual, it's gonna at least be able to give you something that looks like e.g. a "pencil illustration."

surfingdino · 2024-10-04T13:19:45 1728047985

> "That's something that can be easily fixed in a future release (...)"

This has been the default excuse for the last 5+ years. I won't hold my breath.

Rinzler89 · 2024-10-04T13:22:21 1728048141

5 years ago there were no AI videos. A bit over a year ago the best AI videos were hilarious hallucinations of Will Smith eating spaghetti.

Today we have these realistic videos that are still in the uncanny valley. That's insane progress in the span of a year. Who knows what it will be like in another year.

Let'em cook.

authorfly · 2024-10-04T13:25:28 1728048328

Disco Diffusion was a (bad) thing in 2021 that led to the spaghetti video / weird Burger King ads level quality. But it ran on consumer GPUs / in a Jupyter notebook.

2 years ago we had decent video generation for clips

7 months ago we got Sora https://news.ycombinator.com/item?id=39393252 (still silence since then)

With these things, like DALL-E 1 and GPT-3, the original release of the game changer often comes ca. 2 years before people can actually use it. I think that's what we're looking at.

I.e. it's not as fast as you think.

bbor · 2024-10-04T13:37:41 1728049061

What video generation was decent 2 years ago? Will smith eating spaghetti was barely coherent and clearly broken, and that was March 2023 (https://knowyourmeme.com/memes/ai-will-smith-eating-spaghett...).

And isn’t this model open source…? So we get access to it, like, momentarily? Or did I miss something?

authorfly · 2024-10-06T09:35:52 1728207352

So you're right to be excited, I agree. And I don't know, Meta, like OpenAI, seems to release conditionally, though yes, more. I doubt it before the election.

When the Will Smith one was released, it was kind of a parody though. Tech had already been able to produce that level of "quality" for about 2 years at the time of its publishing. The Will Smith one is honestly something you could have created with Disco Diffusion in early 2021, I used to do this back then...

2022 saw: https://makeavideo.studio/ (coherent, but low res - it was possible to upscale at extreme expense) https://sites.research.google/phenaki/ https://lumiere-video.github.io/

It was more like 18-20 months ago, sorry, so early 2023, but https://runwayml.com/research/gen-1 was getting there, as was https://pika.art/home - Sora obviously changed the game, but I would say these two were great.

AJ007 · 2024-10-04T19:39:27 1728070767

The subtle "errors" are all low hanging fruit. It reminds me of going to SIGGRAPH years back and realizing most of the presentations were covering things which were almost imperceptible when looking at the slides in front. The math and the tech were impressive, but qualitatively it might not have even mattered.

The only interesting questions now have nothing to do with capability but with economics and raw resources.

In a few years, or less, clearly we'll be able to take our favorite books and watch unabridged, word-for-word copies. The quality, acting, and cinematography will rival the biggest budget Hollywood films. The "special effects" won't look remotely CG like all of the newest Disney/Marvel movies -- unless you want them to. If publishers put up some sort of legal firewall to prevent it, their authors, characters, and stories will all be forgotten.

And if we can spend $100 of compute and get something like I described above, why wouldn't Disney et al. throw $500m at something to get even more out of it, and charge everyone $50? Or maybe we'll all just be zoo animals soon. (Or the zoo animals will have Neuralink implants and human-level intelligence, then what?)

ben_w · 2024-10-04T21:49:09 1728078549

> In a few years, or less, clearly we'll be able to take our favorite books and watch unabridged, word-for-word copies. The quality, acting, and cinematography will rival the biggest budget Hollywood films. The "special effects" won't look remotely CG like all of the newest Disney/Marvel movies -- unless you want them to. If publishers put up some sort of legal firewall to prevent it, their authors, characters, and stories will all be forgotten.

I'm also expecting, before 2030, that video game pipelines will be replaced entirely. No more polygons and textures, not as we understand the concepts now, just directly rendering any style you want, perfectly, on top of whatever the gameplay logic provided.

I might even get that photorealistic re-imagining of Marathon 2 that I've been wanting since 1997 or so.

lancesells · 2024-10-04T21:20:53 1728076853

> In a few years, or less, clearly we'll be able to take our favorite books and watch unabridged, word-for-word copies. The quality, acting, and cinematography will rival the biggest budget Hollywood films. The "special effects" won't look remotely CG like all of the newest Disney/Marvel movies -- unless you want them to. If publishers put up some sort of legal firewall to prevent it, their authors, characters, and stories will all be forgotten.

I don't think so at all. You're thinking a movie is just the end result that we watch in theaters. Good directing is not a text prompt, good editing is not a text prompt, good acting is not a text prompt. What you'll see in a few years is more ads. Lots of ads. People who make movies aren't salivating at this stuff but advertising agencies are because it's just bullshit content meant to distract and be replaced by more distractions.

ben_w · 2024-10-04T21:55:27 1728078927

Indeed, adverts come first.

But at the same time, while it is indeed true that the end result is far more than simply just making good images, LLMs are weird interns at everything — with all the negative that implies as well as the positive, so they're not likely to produce genuinely award winning content all by themselves even though they can do better by asking them for something "award winning" — so it's certainly conceivable that we'll see AI indeed do all these things competently at some point.

surfingdino · 2024-10-04T20:34:38 1728074078

> "In a few years, or less, clearly we'll be able to take our favorite books and watch unabridged, word-for-word copies."

That would be a boring movie.

ekianjo · 2024-10-04T13:20:29 1728048029

You had AI videos 5 years ago?

surfingdino · 2024-10-04T13:26:16 1728048376

AI in general.

bbor · 2024-10-04T13:42:43 1728049363

…I mean, it was advancing slowly for linguistic tasks until late 2022, that’s fair. That’s why we’re in such a crazy unexpected rollercoaster of an era - we accidentally cracked intuitive computing while trying to build the best text autocomplete.

AI in general is from 1950, or more generally from whenever the abacus was invented. This very website runs on AI, and always has. I would implore us to speak more exactly if we’re criticizing stuff; “LLMs” came around (in force) in 2023, both for coherent language use (ChatGPT 3.5) and image use (DALLE2). The predecessors were an order of magnitude less capable, and going back 5 years puts us back in the era of “chatbots”, aka dumb toys that can barely string together a Reddit comment on /r/subredditsimulator.

surfingdino · 2024-10-04T19:03:45 1728068625

AI so far has given us ability to mass produce shit content of no use to anybody and the next iteration of customer support phone menu trees that sound more convincing yet remain just as useless. That and another round of IP theft and mass surveillance in the name of progress.

kelseyfrog · 2024-10-04T19:49:57 1728071397

This is a consequence of a type of cognitive bias - bad examples of AI are more easily detectable than good examples of AI. Consequently, when we recall examples of AI content, bad examples are more easily accessible. This leads to the faulty conclusion that:

> AI so far has given us ability to mass produce shit content of no use to anybody

Good AI goes largely undetected, for the simple reason that it closely matches the distribution of non-AI content.

Controversial aside: This is the same bias that results in non-passing trans people being taken as representative of the whole. Passing trans folk simply blend in.

bear141 · 2024-10-06T00:25:38 1728174338

This basic concept can be applied in many places. Do you ever wonder why social movements seem to never work out well and demands are never met? That’s because when they do work out, and demands are met, those people quickly become the “oppressor” or the powerful class from which others are fighting to receive more rights or money.

All criminals seem so incredibly stupid that you can’t understand why anyone would ever try since they all are caught? The smart ones don’t get caught and no one ever hears about them.