Meta Movie Gen (meta.com)
1107 points by brianjking 32 days ago | 1008 comments



I find the edit video with text the most fascinating aspect. I can see this being used for indie films that don’t have a CGI budget. Like the scene with the movie theater, you can film them on lounge chairs first and then edit it to seem like a movie theater.


100% agree, the background replace that puts the guy into a stadium would be fully usable as a cut in a movie/tv show, and the background is believable enough that no one would bat an eye. If you use it properly, I expect a quality uplift on indie films/shorts. Your limit is your creativity


I personally expect a decrease in quality. Without limits people tend to get less creative. Sure, there is some balance here in that tools also enable new things to be done which are not possible without tools but working around limits has often inspired some of the most creative works.


I don't think that is necessarily true. Right now movies are so expensive that they can be created only by a few handfuls of people. But those people might not be the most creative people around. If thousands of people can create movies, we might find out that some people we didn't know of are far more creative.

Also "creation by committee" isn't a thing when somebody can produce a movie in their basement.

Anyway, I look forward to people using this tech to create alternative endings of existing movies.


So expensive? It has never been cheaper to create movies thanks to digital cameras and non-linear editors, digital audio workstations, etc. You are no longer encumbered by the costs of film, development, renting an edit bay, requiring an audio editing studio to mix audio and maintain a tape library of special effects or hire foley artists, no need for an optical printer to layer visual effects, etc.

You already can produce a movie in your basement, many of which can be found on YouTube.


Sure, the cost of making Clerks has dropped, but not the cost of making Dune.


You aren’t going to be making Dune with this. Maybe a “we have Dune at home” version…


If the results aren't cherry-picked it looks like more than good enough to make any high budget movie from the early 2010s if not Dune.


Like with all generative AI this comes down to how specific you can get, and how consistent you can be. If you can art direct each element of the frame, down to the design of the individual props and scenic items, and have those items remain consistent from shot to shot. Then do the same with lighting, actors, camera characteristics (eg lens, focus, position in the scene, framing), etc, etc, etc, then maybe you’ve got a chance of making a ‘high budget movie from the early 2010s’. But I haven’t seen any generative ai that comes even close to this level of control or consistency… They’re closer to slot machines than anything consistent…


Yup, you can notice some issues even in their picked example. e.g. the prompt for the video of the painting woman says "there is a bear cub at her feet" and it quite clearly is not "at her feet" in the video.


> Without limits people tend to get less creative

But lowering those limits allows for more people to get creative

How many beautiful stories never left their author's head because their author couldn't afford it? Either the monetary cost or the opportunity cost

Considering how many movies come from one place in one country (Hollywood), we haven't even scratched the surface of human creativity


And without being forced to interact with other people. The movie made by one creative and 100 automatons does not at all compare to the one where there are multiple brilliant creatives butting heads and personalities and choosing never to work with each other again but the show must go on.

How many movie lines have been adlibbed but are absolute classics? Sonofabitch, he stole my line!


What a bizarre statement in an age when the phrase "executive meddling" can describe the sameness of so much content output, and most of the greatest flops have a story which goes "yeah there were too many people involved".

Like the second Avengers movie had this problem in spades.


It's not a bizarre statement at all. An executive meddling with something because X or Y element of a piece of art doesn't align with A or B market trend is not the same thing as people working together and sometimes clashing due to creative differences. You'll find that most works that you, or other people, like weren't the result of a sole individual's creative decisions going completely unchallenged. Others suggested, or revised, or fought. There can be too many cooks in the kitchen, of course, but that's an entirely different issue from executive meddling.


I'm not sure why you jump to executive involvement when I specifically stated two creatives butting heads. There's all sorts of stories where execs or someone came in and forced a movie to have, eg a giant robot spider in a wild west setting that didn't make sense but they really wanted one in some movie so they forced it to happen in one of the projects they were overseeing. But the sum of us is better than the individual, so while there are solo artists out there, they're the exception rather than the rule.


The era of easily available game engines has brought to life hundreds of thousands of garbage games, but that doesn't matter. What matters are the hundreds of really innovative ones that we wouldn't get otherwise.


Using AI has all kinds of new and unusual limits. It's hard to get exactly what you want and you often get unexpected results along the way.


> Your limit is your creativity

In the professional creative tools business, "Now the only limit is your creativity" has been a popular marketing tagline for decades, especially for products based on new enabling technologies. It's common enough that a wry corollary has developed in response, which goes: "Unfortunately, for a lot of people that's a pretty big limit."


> Your limit is your creativity

And how many tokens you can afford.


You’re not wrong, but the comment somewhat misses the point. A shot like that would require you to rent a stadium (generally not cheap) along with paying for a believable number of extras. That would put the shot out of budget for most indie filmmakers. Spending $20 on tokens to get “good enough” is totally worth it, and allows you to get shots that were previously out of reach.


In some ways it opens up creative possibilities


Why bother? Actors cost money and scheduling is difficult. Do the whole thing in AI - the model will be trained on better actors than your indie cast, anyway.


It will be a loong time before AI can produce lead actors that are believable, act exactly the way you as a director want and tell the story you want to tell, so I think at least for now you'll still need the actors for the lead roles. But I can totally see this being used for generating people/stuff in the background of certain shots in a low budget movie.


> It will be a loong time before AI can produce lead actors that are believable.

A "loong time" will be sooner than most of us think.

The way this is done currently is similar to motion capture except that the tools are gradually becoming democratized: A single actor can act all the roles you need (You could even act the scenes and roles yourself). That footage is then fed into a model that generates an actor with the appearance and voice that you desire.

As a random on the internet, my prediction is that within a year, you'll be able to produce lead actors that are believable using movie generation plus smartphone footage of yourself acting the scene.

Initially it will be expensive to make a feature length film. But from 2025 onward, the cost will come down as the tools improve. These will be a different type of movie for sure, but every advancement in film technology has always led to films that seem strange compared to what came before.


You're getting downvoted, but I agree with you except for your timeline. This won't be possible in a year. What's here is a concept demo, but the gulf between "that's neat" and "you can make a decent 10 minute short film" is pretty vast.


> But the gulf between "that's neat" and "you can make a decent 10 minute short film" is pretty vast.

Agreed. I expect the tools I described to be prohibitively expensive for the average person for some time. By the end of 2025, probably only well-funded studios will be able to use such tools and probably not economically.

But I'd be quite surprised if Hollywood studios/publishers aren't using their immense back catalogue to train private models right now. I don't think they'll ever allow royalty-free movie generation tools to be used that were trained on their catalogues. So perhaps there'll be a cottage industry of stock footage by amateur and professional actors for training/augmenting the movie generation models and tools that will be available for general use, royalty free.

Or perhaps these tools will simply emerge as just another TikTok filter and we'll see goofy couples who filmed a skit in their living room presented as gladiators arguing on the surface of Mars while their dog runs back and forth between them unable to choose a favourite.


"long time" in LLM land = 2 years tops


Seems you drank way too much of the Altman Kool-Aid. :-D

How have LLMs actually developed in the last two years?

Have you noticed movies and TV series use multiple cameras to capture a scene from different angles?

When it comes to video these things can't get consistency between angles or scenes.

Add to that that the results are full of glitches and the resolution is equivalent to a CRT screen in the year 2000. Same resolution as s

Fixing these limitations would equate to a revolution rather than a steady evolution.

And let's not forget that these systems are also hugely unprofitable at this stage.


Anything is possible at zombo.com


Isn't that a core problem now. Getting actors to act exactly how you want was never a solved problem.

But this limits promotion where actors do interviews and sell the movie to the public. It also limits an actor doing something crazy that tanks a movie like a tweet.


The answer is that it depends on the director. For David Fincher or the Coen brothers, having this level of exactitude and precision is what their craft is all about.

But for plenty of other masters - think Cassavetes, Mike Leigh, even PTA - the actor's outstanding talent and instincts bring something to the script and vision that is outside of their prescriptive powers. Their focus is essentially setting up a framework for magic to happen inside of.


> Getting actors to act exactly how you want was never a solved problem.

As a choreographer myself, that's not necessarily a problem but rather a feature: it depends on how the director creates. Often you want what's unique to the performer, you don't want them to do something that's exactly like what you envisioned but whatever their interpretation/vision of it is, the "imperfectness" is what makes it interesting and rich.


> Getting actors to act exactly how you want was never a solved problem.

Also, some great lines in movies came from actors' ad libs. I hope there will always be some space for mild hallucination; without improvisation we wouldn't even have jazz.


Yep. Including one of the best moments in one of the most influential sci-fi films of all time. https://en.wikipedia.org/wiki/Tears_in_rain_monologue



Consistency between scenes is one possible reason.


These are not movies, these are clips. The stock photo/clip industry is surely worried about this, and probably will sue because 100% these models were trained on their work. If this technology ever makes movies, it'll be exactly like all the texts, images and music these models create: an average of everything ever created, so incredibly mediocre.


I imagine a movie maker where you say "use model A and put them in scene 32f, add a crowd and zoom in on A. They should look very worried." Then they can just play with it. Then save a scene, onto the next. Since AI can continue an animation, I don't see why it can't faithfully recreate given models with more development


What'll happen in both industries is the same that will happen everywhere else: adopt or die. The huge winners will be those that creatively use this new tool without 100% relying on it to do everything.


There have been several AI short film festivals, as well as several AI music videos that have been produced. The caveats are that quality varies, the best ones simply employ solid production in general (good editing, strong directorial vision, etc), and I don't know that anything feature length is out or even in the works.


the problem is that these stock footage companies are up against the richest corporations to ever exist. Legal recourse will take a monumental amount of money and time.

Hate to say this, but as things stand, tech companies stand to become all pervasive and all powerful if AI keeps growing the way it has


Why are there so many websites that are essentially static HTML that make my phone stutter?

The videos look cool, but I can’t really enjoy reading about them if my phone freezes every 2 seconds.


I'm seeing weird jank on a Pixel 6a / Chromium browser as well. I'm on mobile so I can't check the source, but this can't just be static HTML.

When I scroll the page, sections of text are missing then pop in, randomly though not as a scroll driven animation. It almost feels like something is blocking the browser's render loop and it can't catch up to actually paint the text. That'd be an insane bug on such a simple page, though I put nothing past react these days if they used it here.


I'm also having trouble with the page.

I wish web browsers had a "pause" button for scripts so I can just scroll to the bottom, let everything load and then hit pause and "freeze" the page contents so I can read without distractions.

I also feel like the quality of web UX is in rapid decline, and nobody wants to hire competent web developers who grok the fundamentals anymore.


Not so much stutter here but definitely some layout shifts as images/video elements load :/


It's actually quite usable and fast if you turn javascript off.


Maybe the companies that make them don't have enough know-how in web development.


Which browser?


And which device.

Over the years, adding many simultaneous videos to websites has become quite common, and I have always marveled at how well many devices can handle this.

Nevertheless, it is pretty demanding for the hardware, and many smartphones are not made for such tasks.


Must be because Facebook use php


Your comment makes no sense


I think that's the joke


Your comment uses php.


Q: Why are there so many websites that are essentially static HTML that make my phone stutter?

A: because your phone is a potato

The first free tech support from HN is free, subsequent questions will be $29.99 per.


Took over 20s to settle on a Galaxy s21.


We humans are so excessively dependent on vision input and with entertaining through visuals, too. But more and more all those visuals become meaningless to me and it all just feels like fast-food-junk to me.

That any pre-schooler will be able to produce anything imaginable in seconds (watch out, parents) doesn't make it better to me or give it any real value.

Ok, I needed to edit it again to add: maybe this IS the value of it. We can totally forget about fantasizing stories with visuals (movies) because nobody will care anymore.


they’re junk food-y visuals too. i don’t know how to describe it beyond looking like a cross between fisher-price and a light dose of shrooms.


Yeah I agree, I never understood the appeal of photography, it's so easy, you don't need to paint for hours to produce something original, you just need to buy a camera and click on a button. That's it. And people pay for that, I don't get it.


is it easy? i’m not a photographer but i enjoy taking pictures as a hobby. Your opinion baffles me haha. People always wonder why their iphone can’t shoot photos that look as good as the iphone camera reviews.

https://www.austinmann.com/trek/iphone-16-pro-camera-review-...

Understanding why your camera performs well in certain conditions and how to tweak the camera, is a mini-science all of its own. Also i think the reason photography is appreciated is because of composition.

Knowing where to position yourself, and at what moment, can make photographs turn out magical. Your eyes are processing millions of frames a second, but a photographer often gets just ONE chance to capture some scenes. Some people appreciate the “art” in that. :)

Here’s a thought experiment. Do you consider all wedding photographers to be of equal value? If not, why? Anyone can do it according to you. just click a button. Something to think about…


> Knowing where to choose the right words, and at what moment, can make video gen turn out magical. The AI is processing millions of tokens a second, but a prompter often finds just ONE perfect prompt to capture some scenes. Some people appreciate the “art” in that. :)

I do. It can take hours to make the vision in your head be replicated by the AI. Sure you can spit out generic scenes, but the game changes when the goal is expressing a specific vision with the prompt; choosing the right synonyms, phrasing, etc to make it spit out the closest image.

Do I think I'm an artist for that? Not particularly. I like to think of myself as a prompt engineer, a completely different skill for sure, after all I do think overly logical and practical in life in general so it meshes well with my skills plus my artistic background.


I dislike the term "prompt engineer," particularly if you're not setting up systems on a technical level. And it's still artistic, without necessarily making prompters artists. Because the process is one closer to curation rather than creation, I like to think of prompters as "curators". You're reaching into latent space (a collective visual history), pulling out a collapsed possibility, and deciding if it fits your needs or not.


there’s a reason i put “art” in quotes. I don’t mean the traditional definition, but rather the less commonly used definition from the dictionary:

“a skill at doing a specified thing, typically one acquired through practice.”

I agree that a photographer is not a true artist. It’s not hard to understand why people might appreciate their skill and creativity though?


It's sarcasm


Well to be honest I could have written that without sarcasm.


You don't understand why people pay for cameras?


I honestly thought that, too, until I got my first digital camera and started trying to take photos that I liked as much as other ones I saw. Then I realized how much of a craft it is and I gained a much deeper appreciation.


If this is sarcasm which aims to highlight the problem with parent comment, I wholeheartedly agree.


I've been saying for years that generated content is an impending tsunami that's going to drown out all real human voices online. The internet may become effectively unusable as a result for anything other than entertainment.


This is interesting and I see some of this now. Even here on HN and other forums I thought were mostly "human". Even in one of my group chats I can tell one of my friends is using AI responses, but one of the other members can't tell and replies earnestly.

I am grossed out by this. My instinct is to avoid AI slop. The interesting part to me is: What next? Where do we go? Will it be that "human" forums are pushed further into the obscure corners of the internet? Or will it go so far that we all start preferring meeting in person? I'm clueless here


> Even one of my group chats i can tell one of my friends is using ai responses, but one of the other members cant tell and replies earnestly.

Too bad she won't live! But then again, who does?


Vetting people into groups will become much more common, I think. Unless you can verify that person, ideally by knowing them irl, don't talk to them online.


Your droids, they'll have to wait outside


Cryptography-secured/signed generated content / interactions?


worldcoin project solves a lot of this when combined with web of trust, however everyone's knee jerk reaction to worldcoin is pretty bad and so it's annoying to even mention it


I’m not knowledgeable in crypto/worldcoin.

I was rather thinking classical cryptography baked into generative networks.


Worldcoin was intended to solve it but in practice it's not being used that way. Not enough adoption by users or websites.

Maybe in the future?

I'm a worldcoiner but so far it's just been free money.


i mean it's in its infancy, if there's a critical need for it, which i think there will be soon, i think it'll get more popular


US citizens still use paper Social Security Cards. The point where there is a "critical need" recognized by the relevant parties may be further off than you believe.


Humans will start to notice this shit. I used openai to help me edit my stories originally, but then when I started reading other stories it quickly became evident to me that people just generated them entirely with AI.

ChatGPT is way too happy to overuse the word cacophony.


My hot take is that we will have some small obscure forums with people, some social media flooded with AI content and other social media where you need to register with government ID and facescan.


I am 10,000% ready for forums to make a comeback. The internet hasn't been good since 2010


I got into hiking.


Maybe that's a good thing. The internet never reached its potential as being the connective fabric of humanity. Mostly it's just marketing and spam. If the internet died and we all went back to smaller communities, that really wouldn't be the worst thing IMO. We're not really evolved for global communications at scale anyway.


We're not evolved for most things in modern life, that's not really much of an argument.


Shrug. Maybe it's an argument to de-modernize more things in order to bring daily life back to the environments we're healthier and happier in.

Edit: In particular, I'm not convinced the internet was a net positive. Running water and sewage systems, sure. But what has 24/7 smartphone access actually done for our societies? Most people today don't seem any better off than they were in the 80s and 90s, and in many ways that actually matter, they seem worse off. Sure, they have access to way more information than our predecessors ever did... but it's not like we built a better world off it. Mostly the internet has accelerated the concentration of wealth towards the top, increased anxiety across the world, and significantly contributed to the global downfall of representative democracies, to name a few. Sometimes modernity can just be a collection of pathologies with a few beneficial side effects.


Why should I care?

Have you seen what most humans say? If an AI says more intelligent things I'm all for it.


AIs say what the people giving them orders tell them to.

And until we get AGI, AI talk will necessarily be some combination of hallucinations, unintelligent, vapid, etc.


There could be AI agents in this thread right now

And you wouldn’t even know it


If you can't tell, does it matter?


I really don’t think it does


Mission fucking accomplished?

https://xkcd.com/810/


Would be nice if we were able to go to communities of human verified users. Smaller in scope than social media


The Internet used to be a sort of hideaway for nerdy people to hang out and have fun. Ever since the invention of the smartphone, possibly before (see “Eternal September”) it’s gone to shit. These days I would rather spend time offline.

Are there any other Internet-based hideaways to retreat to? Somewhere where ads, clout chasing, and AI slop don’t exist?


IRC is more active than forums, but I miss forums


Come to Tildes.net! We're a comfy small community with a single rule: don't be an asshole.


> The internet may become effectively unusable as a result for anything other than entertainment.

That already happened, even without AI.


What was useful and usable about the internet before that isn't now?


I think it will just become the new baseline and people will still value anything better than that.


Maybe we should abandon HTTP and create a new protocol just for humans. HHTTP.


The only problem is that all the AI slop is not actually entertaining either.


Beware of sampling bias. Slop will always be slop.


All the vids have that instantly recognizable GenAI "sheen", for lack of a better word. Also, I think the most obvious giveaway is all the micro-variations that happen along the edges, which give a fuzzy artifact.


I assure you that's not enough. These are high quality videos. Once they get uploaded to social media, compression mostly makes imperfections go away. And it's been shown that when people are not expecting AI content, they are much less likely to realize they are looking at AI. I would 100% believe most of these videos were real if caught off guard.


A friend who lives in North Carolina sent me a video of the raging floodwaters in his state- at least that's what the superimposed text claimed it was. When I looked closer, it was clearly an Indian city filled with Indian people and Indian cars. He hadn't noticed anything except the flood water. It reminded me of that famous selective attention test video[1]. I won't ruin it for those who haven't seen it, but it's amazing what details we can miss when we aren't looking for them. I suspect this is made even worse when we're casually viewing videos in a disjointed way as on social media and we're not even giving one part of the video our full attention.

1. https://www.youtube.com/watch?v=vJG698U2Mvo


For the entire duration of the Russia/Ukraine war "combat footage" that is actually from the video game ARMA 3 has gone viral fairly regularly, and now exactly the same thing is happening with Israel/Iran.


And which YouTube happily promotes straight to the top, of course -- thanks to the efforts of its rocket-science algorithm team. (Not sure whether the ones I've been seeing were generated by that particular platform, but YT does seem to promote obviously fake and deceptively labelled "combat" footage with depressing regularity).


The willingness of people to believe that combatants are wearing cinematic body cams for no tactical reason can only be matched by their willingness to assume people meticulously record every minute of their lives just so they can post a once-in-a-lifetime event on TikTok.

Who even needs AI generated videos when you can just act out absurdity and pretend it's real?


As far as I know, most of the viral stuff has been active air defence CIWS and the like, which can be hard to discern.

There's a morbid path from the grainy Iraq war and earlier shaky footage, through IS propaganda which at the time had basically the most intense combat footage ever released to the Ukraine war. Which took it to the morbid end conclusion of endless drone video deaths and edited clips 30+ mins long with day long engagements and defending.

And yes, to answer your belief that there is none - there is loads of "cinematic body cam footage out there now".


Thousands of combatants are wearing bodycams, and pretty regularly, there are videos released by Russians of a dead Ukrainian's last moments taken from their corpse and the same happens vice versa.


Dude I clicked on some random Youtube accounts that were streaming the world cup live, and it took me a while to realize that they were actually just streaming a video game replica of the actual game (at least, I think they were simulating the actual game with a video game, but I'm not sure as I didn't compare closely)


I've seen that a bunch of times, there's CGI highlights of most football matches.

I still don't know if it's autogenerated from the original video or recreated manually but yeah it's pretty realistic for the first few seconds.


Someone once did the opposite - streamed a real pay-per-view UFC match on Twitch and pretended it was a game he was playing. It actually worked for a while before the Twitch mods realized what was going on.

https://www.theverge.com/2017/12/4/16732912/ufc-video-stream...


It's kind of sad that we don't even need AI to create misinformation, the bar for what people will fall for is really low.


I've shown my own videos I made in dcs world to idiots at the bar in airports and they believed I was the ghost of kiev lmao


People believe false things easily if it confirms their priors. Confirmation bias is strong.

Fake images play into that, but they don't need to be AI generated for that to be true, it's been true forever.


And let's not forget the paper that goes with the video, which has a stellar title: http://www.chabris.com/Simons1999.pdf


hmmm... Maybe it's because I knew it was testing me, but I noticed it right away and counted the right count.

I could see it being pretty shocking if I hadn't, but I honestly can't imagine how I'd miss that.


It probably doesn't work if you're primed to look for hidden details. I took the test along with my Psychology 101 class of about 30 people and no one noticed anything amiss.


Once you see it you can indeed not imagine how you couldn't. Some people see it the first time, but it's a small number of people. This video just demonstrates how humans can only focus on one thing at a time, and when we're multitasking, we're actually doing little parts of different tasks one at a time but very quickly after each other, kind of like a single CPU core. And if we tightly focus our attention on one point, we are not aware of other things that might be relatively close to that point.

That is also how magicians work, drawing your attention to one particular thing, hiding the secret of the trick from you, sometimes even in plain sight, like in the video.

Or pickpockets, who might bump into you and pick your pocket at the same time, where your attention is focussed on the sudden impact, keeping your attention away from your wallet being taken.


> hmmm... Maybe it's because I knew it was testing me, but I noticed it right away and counted the right count.

> I could see it being pretty shocking if I hadn't, but I honestly can't imagine how I'd miss that.

The point of the video wasn't to count correctly, but to see the gorilla


99% the person was playing along for the rest of us, so we get a chance to enjoy the video as intended.


cool, he noticed it right away


I believe them. Why would people lie on the internet?


> I noticed it right away


I was focused on counting. I counted very wrong, but caught the gorilla right away.


If you see a text accompanying some content you can de-prime yourself by saying "nuh-uh, that's exactly what it's fscking not."


I do not see how the examples you mentioned are related to the topic? What does selective attention have to do with the video looking AI generated in all the frames?


Their argument is that if someone is affected by confirmation bias, they likely won’t notice these kinds of details.

Essentially, send me a video of something I care about and I will only look for that thing. Most people are not detectives, and even most would-be detectives aren’t yet experts.


probably people will soon develop a habit of verifying every detail in videos of interest haha


Cause people are well known for verifying every detail in most other forms of media already right?


Verifying what?


Indeed watching Reels or Tiktok videos is an exercise in testing your bullshit meter and commenting accordingly to let the uninformed know hey this is most likely fake.


Facebook is mostly this now too. Long comment threads of boomers thanking AI images for their military service or congratulating it on a long marriage.


> it's been shown that when people are not expecting AI content, they are much less likely to realize they are looking at AI.

At this point, looking at a big tech SoMe feed I would expect that everything is, or at least could be, gen AI content.


I regularly catch my kids watching AI generated content and they don't know it.


It's kind of an interesting phenomenon. I read something on this. Basically being born between ~1980 and ~1990 is a huge advantage in tech.


The only generation that ever knew how to set the clock on a VCR: our parents needed our help; our kids won't have even seen a VCR outside of a museum, much less used one.


Very interesting point. Wonder about the generation before and what skills they had to share with their parents who were most likely traumatised from a world war or two. I remember setting the vcr clock and tuning the new tv with the remote. I’m sure the adults could have figured it out but they probably got more from seeing their ‘smart’ kids figuring it out in time for the cartoons!


The parents of those of us who grew up in the 80's and 90's invented the VCR, they could use it just fine.


The Zoomers have the advantage that the bar is pretty low these days.


A surprising amount of it is really popular too. I recently figured out that the Movie Recaps channel was all AI when the generated voice slipped and mispronounced a word in a really unnatural way. They post videos almost daily and they get millions of views. Must be making bank.


A group I follow about hobby/miniatures (as in wargaming miniatures and dioramas) recently shared an "awesome" image of a diorama from another "hobby" group.

The image had all the telltale signs of being AI generated (too much detail, the lights & shadows were the wrong scale, the focus of the lens was odd for the kind of photo, etc). I checked that other group, and sure enough, they claim to be about sharing "miniature dioramas" but all they share is AI-generated crap.

And in the original group, which I'm a member of and is full of people who actually create dioramas -- let's say they are "subject matter experts" -- nobody suspected anything! To them, who are unfamiliar with AI art, the photo was of a real hand-made diorama.


I was watching UFC recaps on Youtube and the algorithm got me onto AI generated MMA content, I watched for a while before realizing it. They were using old videos which were "enhanced" using AI and had an AI narrator. I only realized it when the fight footage got so old, and the AI had to do so much work to touch it up, that artifacts started appearing in the video. Once I realized it I rewatched the earlier clips in the video and could see the artifacts there too, but not until I was looking for them.


There's already rabbit holes of fake MMA fighting you can fall into online? Even if you're a "fan" and relatively aware of what to look for ... still difficult to spot? Horribly, had the same sensation while watching UFC at a bar. "Haven't I seen this match where they fall on the ground and hug for hours before?" Mostly empty background audience with limited reactions.

Somebody took AI video editing, and in a year or two, we're already at entire MMA rabbit holes of fake videos.

Commenting mostly as a personal evidence reference of how crazy the World Wide Web has gotten from anecdotal sources.


Most probably they employ overseas, underpaid workers with non-standard English accents and so they include text-to-speech in the production process to smoothen the end result.

I won't argue whether text to speech qualifies as an AI but I agree they must be making bank.


I wonder if they are making bank. Seems like a race to the bottom, there’s no barrier to entry, right?


Right, content creators are in a race to the bottom.

But the people who position themselves to profit from the energy consumption of the hardware will profit from all of it: the LLMs, the image generators, the video generators, etc. See discussion yesterday: https://news.ycombinator.com/item?id=41733311

Imagine the number of worthless images being generated as people try to find one they like. Slop content creators iterate on a prompt, or maybe create hundreds of video clips hoping to find one that gets views. This is a compute-intensive process that consumes an enormous amount of energy.

The market for chips will fragment, margins will shrink. It's just matrix multiplication and the user interface is PyTorch or similar. Nvidia will keep some of its business, Google's TPUs will capture some, other players like Tenstorrent (https://tenstorrent.com/hardware/grayskull) and Groq and Cerebras will capture some, etc.

But at the root of it all is the electricity demand. That's where the money will be made. Data centers need baseload power, preferably clean baseload power.

Unless hydro is available, the only clean baseload power source is nuclear fission. As we emerge from the Fukushima bear market where many uranium mining companies went out of business, the bottleneck is the fuel: uranium.


You spent a lot of words to conclude that energy is the difference maker between modern western standards of living and whatever else there is and has been.


Ok, too many words. Here's a summary:

Trial and error content-creation using generative AI, whether or not it creates any real-world value, consumes a lot of electricity.

This electricity demand is likely to translate into demand for nuclear power.

When this demand for nuclear power meets the undersupply of uranium post-Fukushima, higher uranium prices will result.


Continuing that thought, higher uranium prices and real demand will lead to unshuttering and exploiting known and proven deposits that are currently idle and increase exploration activity of known resources to advance their status to measured and modelled for economic feasibility, along with revisiting radiometric maps to flag raw prospects for basic investigation.

More supply and lower prices will result.

Not unlike the recent few years in (say) lithium, anticipated demand surged exploration and development, actual demand didn't meet anticipated demand and a number of developed economically feasible resources were shuttered .. still waiting in the wings for a future pickup in demand.


Spend a few months studying existing demand (https://en.wikipedia.org/wiki/List_of_commercial_nuclear_rea...), existing supply (mines in operation, mines in care and maintenance, undeveloped mines), and the time it takes to develop a mine. Once you know the facts we can talk again.

Look at how long NexGen's Rook 1 Arrow is taking to develop (https://s28.q4cdn.com/891672792/files/doc_downloads/2022/03/...). Spend an hour listening to what Cameco said in its most recent conference call. Look at Kazatomprom's persistent inability to deliver the promised pounds of uranium, their sulfuric acid shortages and construction delays.

Uranium mining is slow and difficult. Existing demand and existing supply are fully visible. There's a gap of 20-40 million pounds per year, with nothing to fill the gap. New mines take a decade or more to develop.

It is not in the slightest like lithium.


> Spend a few months studying existing demand

Would two decades in global exploration geophysics and being behind the original incarnation of https://www.spglobal.com/market-intelligence/en/industries/m... count?

> Once you know the facts we can talk again.

Gosh - that does come across badly.


Apologies.

When someone compares uranium to lithium, I know I'm not talking to a uranium expert.

All the best to you, and I'll try to be more polite in the future.


Weird .. and to think I spent several million line kms in radiometric surveys, worked multiple uranium mines, made bank on the 2007 price spike and that we published the definitive industry uranium resources maps in 2006-2010.

Clearly you're a better expert.

> when someone compares uranium to lithium, I know I'm not talking to a uranium expert.

It's about boom bust and shuttering cycles that apply in all resource exploration and production domains.

Perhaps you're a little too literal for analogies? Maybe I'm thinking in longer time cycles than yourself and don't see a few years of lag as anything other than a few years.


Once again, allow me to offer my sincere apologies.

You are well-prepared to familiarize yourself with the current supply/demand situation. It's time to "make bank", just like you did in 2007... only more so. The 2007 spike was during an oversupplied uranium market and mainly driven by financial actors.

I invite you to begin by listening to any recent interview with Mike Alkin.

Good night and enjoy your weekend.


> Most probably they employ overseas, underpaid workers with non-standard English accents and so they include text-to-speech in the production process to smoothen the end result.

Might also be an AI voice-changer (i.e. speech2speech) model.

These models are most well-known for being used to create "if [famous singer] performed [famous song not by them]" covers — you sing the song yourself, then run your recording through the model to convert the recording into an equivalent performance in the singer's voice; and then you composite that onto a vocal-less version of the track.

But you can just as well use such a model to have overseas workers read a script, and then convert that recording into an "equivalent performance" in a fluent English speaker's voice.

Such models just slip up when they hit input phonemes they can't quite understand the meaning of.

(If you were setting this up for your own personal use, you could fine-tune the speech2speech model like a translation model, so it understands how your specific accent should map to the target. [I.e., take a bunch of known sample outputs, and create paired inputs by recording your own performances of them.] This wouldn't be tenable for a big low-cost operation, of course, as the recordings would come from temp workers all over the world with high churn.)


Can you identify any of these models?


I think it's unusual to assume they are based in the US and employ/underpay foreigners. A lot of people making the content are just from somewhere else.


But it uses AI only for audio, right? Script for the vid seems to be written by human, given the unusual humor type of this channel. I started watching this channel some time ago.


It's hard to tell whether they use AI for script generation. After having seen enough of those recaps, the humor seems to be rather mechanical and basic humor is relatively easy to get from an LLM if prompted correctly. The video titles also seem as if they were generated.

That said, this channel has been producing videos well before ChatGPT3.5/4 so at the very least they probably started with human written scripts.


I thought it was just text to speech when I happened to see some of those videos. And it seems to have been consistently similar since before ChatGPT etc. Why do you think titles are AI generated?

I feel like it might actually be quite complex for AI to pull up the perfect clips and edit them with the script, including timing and everything. Maybe it could be made automatic, but nonetheless it would be a complex process and I don't think it was possible a few years ago. I know Gemini and possibly some others can analyze video if fed to them, but I'm still skeptical that this channel in particular would have done it, when they have always had this frequency of uploads and similar tone.

Also I think there's far better TTS now with ElevenLabs and others so it could be made much more human like.


The way I see it, it won’t take long before human eyes won’t be able to distinguish AI generated content from original.

The only regret I have about that is losing video as a form of evidence. CCTV footage and the like are a valuable tool for solving crimes. That’s going to be out the window soon.


Trust can be preserved by adding PKI at the hardware level. What you said about CCTV is true; once the market realises and demand appears, camera manufacturers will start making camera modules that, e.g., sign each frame with the manufacturer's private key, enabling Joe Public to verify that that frame came from a camera made by that manufacturer. Reputational risk makes the manufacturer store the private key in the device in a secure, tamper-proof way (like TPMs do now), which (mostly) prevents those private keys from leaking.

Does this create difficulties if you want to modify the raw video data in any way? Yes it does, even if you just want to save it in a different lossy compression level or format. But these problems aren't insurmountable. Essentially, provenance info can be added for each modification, signed by the entity that made the change, and the end viewer can then decide if they trust the full certificate chain (just as they do now with HTTPS).
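A rough sketch of the provenance-chain idea above, in Python. Big assumptions: the signer names and keys are made up, and HMAC with shared secrets from the standard library stands in for real asymmetric signatures (an actual camera would sign with a private key held in tamper-resistant hardware, and viewers would verify with the manufacturer's public certificate). It only illustrates the chaining: each modification adds a record that covers the new data and the previous record's signature.

```python
import hashlib
import hmac

# Hypothetical demo keys. HMAC is only a stand-in here: a real scheme would
# use asymmetric signatures so that verifiers never hold the signing key.
KEYS = {
    "camera-maker": b"camera-maker-demo-key",
    "video-editor": b"video-editor-demo-key",
}


def sign(key: bytes, message: bytes) -> str:
    """Return a hex signature over the message (HMAC-SHA256 stand-in)."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def add_record(chain: list, signer: str, data: bytes) -> list:
    """Append a record covering the data and the previous record's signature."""
    prev_sig = chain[-1]["signature"] if chain else ""
    digest = hashlib.sha256(data).hexdigest()
    message = f"{signer}|{digest}|{prev_sig}".encode()
    record = {
        "signer": signer,
        "digest": digest,
        "prev": prev_sig,
        "signature": sign(KEYS[signer], message),
    }
    return chain + [record]


def verify(chain: list, final_data: bytes) -> bool:
    """Check every link in the chain, then check the data we were handed."""
    prev_sig = ""
    for record in chain:
        key = KEYS.get(record["signer"])
        if key is None or record["prev"] != prev_sig:
            return False
        message = f'{record["signer"]}|{record["digest"]}|{record["prev"]}'.encode()
        if not hmac.compare_digest(sign(key, message), record["signature"]):
            return False
        prev_sig = record["signature"]
    return bool(chain) and chain[-1]["digest"] == hashlib.sha256(final_data).hexdigest()


raw = b"raw sensor frame"
recompressed = b"same frame, lossy re-encode"

chain = add_record([], "camera-maker", raw)
chain = add_record(chain, "video-editor", recompressed)

print(verify(chain, recompressed))
print(verify(chain, b"frame swapped in later"))
```

The first check should pass and the second should fail, since the swapped-in bytes don't match the last signed digest. Whether the viewer trusts every signer in the chain is a separate decision, same as with HTTPS certificates.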


Oh wow, that's a great idea. Isn't this already happening maybe?

Recently someone said here that it's noticeable that videos from CCTV cameras are often filmed with a phone or camera on a screen instead of using the original video, and people were speculating that it might be hard or impossible to get access to the original recording because of bureaucracy or something, but that recording a playback on a screen with a phone or camera or something is then often allowed. Maybe they also do this partly so that the original can't be easily messed with by other people.

But yeah if you can verify that a certain video was filmed at a certain time by a certain camera, that is great. Of course the companies providing these cameras should be trustworthy, and that the cameras are actually really sending what they actually record, and that the company itself doesn't mess with the original recordings.


>Isn't this already happening maybe?

I recall an article posted 1-2 years ago about a camera company (Kodak? Can't remember) which was starting to offer something along these lines.

>the companies providing these cameras should be trustworthy, and that the cameras are actually really sending what they actually record, and that the company itself doesn't mess with the original recordings.

I agree. We can't guarantee any of these things, but on the bright side, the incentives are pointing in the right direction to make self-interested companies choose to behave the right way.

It will complicate things and make the hardware more expensive, so I doubt it will sweep through all consumer camera tech unless the "Is this photo real?" question becomes a crisis. There's also the fact that it would be possible to give individual cameras different private keys, with certificates signed by the manufacturer: This would enable non-repudiation (you would not be able to plausibly deny that you had taken a particular photo/video), which has potentially big upsides but also privacy downsides. I think that could be solved by giving the user the option of signing with their unique camera private key (when the user wants to prove to others that they took the photo themselves) or with the manufacturer's key (when they want to remain anonymous).


It's sad that almost AS SOON as we acquired the ability to record real-life moments (with the promise of being able to share undeniable evidence of events with one another), we also acquired the ability to doctor it, negating that promise.


I'm not sure we should have been trusting images for the previous decades either. Photoshop has been a thing for a long time already. I mean, there's those famous photos that Stalin had people removed from.


Your mention of Stalin is I think stronger as an argument that there’s been a significant change. Those fakes took lots of time by skilled humans and were notoriously obvious - what made them effective was the crushing political power preventing them from receiving critical analysis in public.

Similarly, while Photoshop made it easier it happened at a time where technical advances made the problem harder because everyone’s standards for photos went up dramatically, and so producing a realistic fake was still a slow process for a skilled worker.

Now, it’s increasingly available to everyone and that means that we’re going to see a lot more scams and hoaxes as people without artistic talent or willingness to invest time can make realistic fakes even for minor things. That availability is transformative enough to merit the concern we’ve been seeing here.


The glass half-full in me feels that the advantage to this is that in a few years the average person will know better than to trust anything that could be faked like that, instead of the old situation where someone who was willing to put in that effort could actually trick a lot of people.


I think that’s true, but it’s kind of like the trade offs during the pandemic where we knew it would eventually settle into a stable state but still wanted to reduce the harm getting there. We basically need some large fraction of the global population to level up in media literacy all at once.


I don't think it goes out the window completely. You need just the owner of the CCTV to stand up in court and say "yes this is the CCTV footage I personally copied from storage and I did not manipulate it".


> compression mostly makes imperfections go away

The ultimate compression is to reduce the video clip to a latent space vector representation to be rendered on device. :)

Just give us a few more revs of Moore’s law for that to be reasonable.

edit: found a patent… https://patents.google.com/patent/US11388416B2/en


That sheen looks (to me) like some of the filters that are used by people who copy videos from TV and movie and post them on (for example) facebook reels.

There's an entire pattern of reels that are basically just ripped-off-content with enough added noise to (I presume) avoid content detection filters. Then the comments have links to scam sites (but are labelled as "the IMDB page for this content").


The idea that Meta’s effectively stolen content is tainted by a requirement to avoid collecting stolen content is laughably ironic.


Yes, but that's just a hypothesis. Have we seen any evidence that shows the cause of the "AI sheen" is bad training data, or, more likely, just a shortcoming of generating realistic photos from text at this early stage?


I thought the movements were off. The little girl on the beach moves like an adult, the painter looks like a puppet, and everything is in slow motion?


They look like some commercial promo video, which makes sense since that's probably what they were trained on.


To me they seem off, but off in the same sense real humans in ads always seem off. E.g. the fake smile of the smiling girl. That's what people look like in ads.


At least all the humans in these videos seem to have the correct number of fingers, so that's progress. And Moo Deng seems to have a natural sheen for some reason so can't hold that against them. But your point about the edges is still a major issue.


I wonder how much RLHF or other human tweaking of the models contributes to this sort of oversaturation / excess contrast in the first place. The average consumer seems to prefer such features when comparing images/video, and use it as a heuristic for quality. And there have been some text-to-image comparisons of older gen models to newer gen, purporting that the older, more hands-off models didn't skew towards kitschy and overblown output the way newer ones do.


>All the vids have that instantly recognizable GenAI "sheen"

That's something that can be fixed in a future release or you can fix it right now with some filters in post in your pipeline.


I think the big blind spot people have with these models is that the release pages show only the AI output. But anyone competently using these AI tools will be using them in step X of a hundred step creative process. And it's only going to get worse as both the AI tools improve and people find better ways to integrate them into their workflow.


Yeah exactly. Video pipelines that go into productions we only see the end results of have a lot of steps to them beyond just the raw video output/capture. Even Netflix/Hollywood productions without VFX have a lot of retouching and post processing to them.


Not even filters; every text2image model ever created thus far can be very easily nudged with a few keywords into generating outputs in a specific visual style (e.g. artwork matching the signature style of any artist it has seen some works from.)

This isn't an intentional "feature" of these models; rather, it's kind of an inherent part of how such models work — they learn associations between tokens and structural details of images. Artists' names are tokens like any other, and artists' styles are structural details like any other.

So, unless the architecture and training of this model are very unusual, it's gonna at least be able to give you something that looks like e.g. a "pencil illustration."


> "That's something that can be easily fixed in a future release (...)"

This has been the default excuse for the last 5+ years. I won't hold my breath.


5 years ago there were no AI videos. A bit over a year ago the best AI videos were hilarious hallucinations of Will Smith eating spaghetti.

Today we have these realistic videos that are still in the uncanny valley. That's insane progress in the span of a year. Who knows what it will be like in another year.

Let'em cook.


Disco Diffusion was a (bad) thing in 2021 that led to the spaghetti video / weird Burger King ads level quality. But it ran on consumer GPUs / in a Jupyter notebook.

2 years ago we had decent video generation for clips

7 months ago we got Sora https://news.ycombinator.com/item?id=39393252 (still silence since then)

With these things, like DALL-E 1 and GPT-3, the original release of the game changer often comes ca. 2 years before people can actually use it. I think that's what we're looking at.

I.e. it's not as fast as you think.


What video generation was decent 2 years ago? Will Smith eating spaghetti was barely coherent and clearly broken, and that was March 2023 (https://knowyourmeme.com/memes/ai-will-smith-eating-spaghett...).

And isn’t this model open source…? So we get access to it, like, momentarily? Or did I miss something?


So you're right to be excited, I agree. And I don't know, Meta, like OpenAI, seems to release conditionally, though yes, more. I doubt it before the election.

When the Will Smith one was released, it was kind of a parody though. Tech had already been able to produce that level of "quality" for about 2 years at the time of its publishing. The Will Smith one is honestly something you could have created with Disco Diffusion in early 2021, I used to do this back then...

2022 saw: https://makeavideo.studio/ (coherent, but low res - it was possible to upscale at extreme expense) https://sites.research.google/phenaki/ https://lumiere-video.github.io/

It was more like 18-20 months ago sorry so early 2023, but https://runwayml.com/research/gen-1 was getting there as was https://pika.art/home - Sora obviously changed the game, but I would say these two were great.


The subtle "errors" are all low hanging fruit. It reminds me of going to SIGGRAPH years back and realizing most of the presentations were covering things which were almost imperceptible when looking at the slides in front. The math and the tech were impressive, but qualitatively it might not have even mattered.

The only interesting questions now have nothing to do with capability but with economics and raw resources.

In a few years, or less, clearly we'll be able to take our favorite books and watch unabridged, word-for-word copies. The quality, acting, and cinematography will rival the biggest budget Hollywood films. The "special effects" won't look remotely CG like all of the newest Disney/Marvel movies -- unless you want them to. If publishers put up some sort of legal firewall to prevent it, their authors, characters, and stories will all be forgotten.

And if we can spend $100 of compute and get something I described above, why wouldn't Disney et al throw $500m at something to get even more out of it, and charge everyone $50? Or maybe we'll all just be zoo animals soon (Or the zoo animals will have neuralink implants and human level intelligence, then what?)


> In a few years, or less, clearly we'll be able to take our favorite books and watch unabridged, word-for-word copies. The quality, acting, and cinematography will rival the biggest budget Hollywood films. The "special effects" won't look remotely CG like all of the newest Disney/Marvel movies -- unless you want them to. If publishers put up some sort of legal firewall to prevent it, their authors, characters, and stories will all be forgotten.

I'm also expecting, before 2030, that video game pipelines will be replaced entirely. No more polygons and textures, not as we understand the concepts now, just directly rendering any style you want, perfectly, on top of whatever the gameplay logic provided.

I might even get that photorealistic re-imagining of Marathon 2 that I've been wanting since 1997 or so.


> In a few years, or less, clearly we'll be able to take our favorite books and watch unabridged, word-for-word copies. The quality, acting, and cinematography will rival the biggest budget Hollywood films. The "special effects" won't look remotely CG like all of the newest Disney/Marvel movies -- unless you want them to. If publishers put up some sort of legal firewall to prevent it, their authors, characters, and stories will all be forgotten.

I don't think so at all. You're thinking a movie is just the end result that we watch in theaters. Good directing is not a text prompt, good editing is not a text prompt, good acting is not a text prompt. What you'll see in a few years is more ads. Lots of ads. People who make movies aren't salivating at this stuff but advertising agencies are because it's just bullshit content meant to distract and be replaced by more distractions.


Indeed, adverts come first.

But at the same time, while it is indeed true that the end result is far more than simply just making good images, LLMs are weird interns at everything — with all the negative that implies as well as the positive, so they're not likely to produce genuinely award winning content all by themselves even though they can do better by asking them for something "award winning" — so it's certainly conceivable that we'll see AI indeed do all these things competently at some point.


> "In a few years, or less, clearly we'll be able to take our favorite books and watch unabridged, word-for-word copies."

That would be a boring movie.


You had AI videos 5 years ago?


AI in general.


…I mean, it was advancing slowly for linguistic tasks until late 2022, that’s fair. That’s why we’re in such a crazy unexpected rollercoaster of an era - we accidentally cracked intuitive computing while trying to build the best text autocomplete.

AI in general is from 1950, or more generally from whenever the abacus was invented. This very website runs on AI, and always has. I would implore us to speak more exactly if we’re criticizing stuff; “LLMs” came around (in force) in 2023, both for coherent language use (ChatGPT 3.5) and image use (DALLE2). The predecessors were an order of magnitude less capable, and going back 5 years puts us back in the era of “chatbots”, aka dumb toys that can barely string together a Reddit comment on /r/subredditsimulator.


AI so far has given us the ability to mass produce shit content of no use to anybody and the next iteration of customer support phone menu trees that sound more convincing yet remain just as useless. That and another round of IP theft and mass surveillance in the name of progress.


This is a consequence of a type of cognitive bias - bad examples of AI are more easily detectable than good examples of AI. Subsequently, when we recall examples of AI content, bad examples are more easily accessible. This leads to the faulty conclusion that:

> AI so far has given us ability to mass produce shit content of no use to anybody

Good AI goes largely undetected, for the simple reason that it closely matches the distribution of non-AI content.

Controversial aside: This is the same bias that results in non-passing trans people being representative of the whole. Passing trans folk simply blend in.


This basic concept can be applied in many places. Do you ever wonder why social movements seem to never work out well and demands are never met? That’s because when they do work out, and demands are met, those people quickly become the “oppressor” or the powerful class from which others are fighting to receive more rights or money.

All criminals seem so incredibly stupid that you can’t understand why anyone would ever try since they all are caught? The smart ones don’t get caught and no one ever hears about them.


You're making an unverifiable claim. How are we supposed to know that the undetected good AI exists at all? Everything I've seen explicitly produced by any of these models is in uncanny valley territory still, even the "good" stuff.


Don't care. Every request for verification will eventually reach the Münchhausen trilemma


Okay. So you are a person who does not care if what they are saying is true. Got it!


Verificationism[1] is a failed epistemology because it breaks under the Münchhausen trilemma. It's pseudo-scientific like astrology, four humors, and palm reading. Use better epistemologies.

[1] https://en.wikipedia.org/wiki/Verificationism


The core use case is as a small part of larger programs. It’s just computer vision but for words :)


We don't have AI in general today


I'm thankful to be able to recognize that sheen, though I think it will go away soon enough


I don't think that's a bug. I think that helps us separate truth from fiction as we navigate the transition to this new world.


Ever heard of post processing? Because no, you can't trust these signals to always exist with AI content.


It is maybe recognizable in most cases, but definitely not instantly nor easily. I could definitely see nobody noticing one of those clips used in an otherwise non-AI video production.


I did some images generation and found a LORA for VHS footage. It's amazing what "taking away the sheen" can do to make an image look strikingly real.


The ATV turning in mid air was a giveaway as well. Physics seems to be a basic problem for this type of video.


The bubble released into the air is also pretty good until at the end where bubbles appear out of thin air.

But overall the physics are surprisingly good. In the videos from text we see a person moving covered in a bedsheet, a mirror doing vaguely mirror-like things, a monkey moving in water and creating plausible waves, shadows moving over a 3d object with the sloth in the pool and plausible fire. Those are all classic topics to tackle in computer-generated graphics, all casually handled by a model that isn't explicitly trained on physical simulation.

In a twist of irony it's the simplest of those (the mirror) that's the most obviously wrong.


Video autotune.


A lot look like CGI, but I wouldn't be able to tell that they weren't created by an actual animator.


I think that's because they're still using mean-squared error in their loss function.
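For anyone wondering why a mean-squared-error loss would be blamed for a smoothed-over look, here is a toy sketch. It assumes nothing about how Movie Gen is actually trained (the thread doesn't establish that); it only shows the general property that when two sharp answers are equally plausible, the single prediction that minimizes MSE is their average.

```python
def mse(prediction: float, targets: list) -> float:
    """Mean-squared error of one constant prediction against all targets."""
    return sum((prediction - t) ** 2 for t in targets) / len(targets)


# Toy setup: a pixel that is black (0.0) in half the plausible outputs
# and white (1.0) in the other half.
targets = [0.0, 1.0]

# Brute-force search over candidate predictions from 0.00 to 1.00.
candidates = [i / 100 for i in range(101)]
best = min(candidates, key=lambda p: mse(p, targets))

print(best)                 # the MSE-optimal guess
print(mse(best, targets))   # its error
print(mse(0.0, targets))    # error of committing to one sharp answer
```

The search settles on mid-grey, which matches neither plausible outcome. Averaged over a whole frame, that kind of hedging is one commonly cited explanation for soft, washed-out output.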


Yeah but... it's good enough?

There were movies with horrible VFX that still sold perfectly well at the time.


An important contrast is that early VFX offered strong control with weak fidelity, and these prompt-based AI systems offer high fidelity with weak control. Intent matters if you want to make something more than a tech demo or throwaway B-roll and you can't communicate much intent in a 30 word prompt, assuming the model even follows the prompt accurately.


This is such an important problem of the entire genAI idea. It's absurd that people keep focusing on quality instead of talking about it.

But then, a lot of people have financial reasons to ignore the problem. Which is too bad, because it's hindering the creation of stuff that is actually useful.


> AI systems offer high fidelity with weak control

You are spot on. I've been involved in creating technologies used by film and video creators for decades, so I have some understanding of what would be useful to them. The best video AIs I've seen only seem capable of replacing some stock video clip creation because, so far, I haven't seen any ability to maintain robust consistency from shot to shot and scene to scene. There's also no granular control other than the text prompt. At first glance, these demos are very impressive but when you try to map the capability shown to a real production workflow for a movie, TV show or commercial, they're not even close because they aren't even trying to solve the problem.

To be clear, I think it's probably possible to create a video AI that would be truly useful in a real production workflow, it's just that I haven't seen anything working in that direction yet.


> You are spot on. I've been involved in creating technologies used by film and video creators for decades, so I have some understanding of what would be useful to them. The best video AIs I've seen only seem capable of replacing some stock video clip creation because, so far, I haven't seen any ability to maintain robust consistency from shot to shot and scene to scene. There's also no granular control other than the text prompt. At first glance, these demos are very impressive but when you try to map the capability shown to a real production workflow for a movie, TV show or commercial, they're not even close because they aren't even trying to solve the problem.

Yeah it's really hard to get across to a lot of folks that are really amped up about these tools that what they're focused on refining is not getting them any closer to their imagined goal in most professional workflows. This will be great right off the bat for what most developers would need images for-- making a hero image for a blog post, making a blurb of video for a background, or a joke, or making assets for their video game that would never cut it for a non-cheapo commercial project but are better than what they'd have been able to cobble together themselves. But those workflows are fundamentally so different from the very first steps in the process. It's a larger-scale version of trying to explain to no-compromise FOSS zealots 20 years ago that Gimp was nowhere near able to replace Photoshop in a professional toolkit because they're completely disinterested in taking feedback about professional use cases, and that being able to write your own filters in Perl doesn't really help graphic designers-- well 20 years later, the gap is as wide as it ever has been, and there are even more people, almost exclusively FOSS nerds with no professional visual work experience, that insist it's better.

That said, it's nearly as hard to get this across to ADs who are like "what do you mean this shot is going to take you 3 days? I just made these stills which are like 70% there in midjourney in 10 minutes."

> To be clear, I think it's probably possible to create a video AI that would be truly useful in a real production workflow, it's just that I haven't seen anything working in that direction yet.

I think that neural networks, generally, are already fantastically useful in tools like Nuke's Copycat node. Nobody misses masking frame-by-frame if they don't have to do it. But prompt-based tools? Nah. If even 200 words in a prompt was enough information to convey work that needed to be done, why do creative workflows need so many revisions and why are there so many meetings with sketches and mood boards and concept art among career professionals? Text prompts are great for people that are working with a medium they don't really know how to create in because the real artistic decisions are already made by the artists whose artwork was ingested into the models. If you don't understand that level of nuance, you don't see how unbelievably consequential it is to the final product, and not having granular control of it seems nearly inconsequential. Most professionals look at it and see a toy because they know it will never be capable of making what they want it to make.


> neural networks, generally, are already fantastically useful in tools

Yes, I agree. You've highlighted the distinction I should have included of "prompt-based".

There's a vast gulf between these AI-researcher-based concept demos on one side and the NN-based features slowly getting implemented in real production tools. Like you, I've found it challenging to have constructive conversations about AI tooling with anyone not versed in real production workflows. To anyone with real industry experience it's obvious that so far these demos don't represent a threat to real production workflows or the skilled career professionals making a good living. It's not that they're not threatening, they're just threatening to replace a different type of job entirely. If you're one of the poor souls in an off-shore locale doing remote low-end piece-work like manning a stock photo/video clip farm or doing <$100 per piece gigs on Fiverr - then, yeah, you should feel "threatened".

A meta-point I try to make in these conversations is that, at least so far, every actual paying creative job I've seen AI threaten is, IMHO, work I wouldn't wish on my worst enemy. These are low-paid entry-level sweatshop gigs and everyone doing them aspires to do something else as soon as they can. The two analogies I use are: 1) How the "threat" of robotics to jobs is actually playing out. So far, in industrial applications robots are replacing Amazon warehouse and manufacturing assembly line workers, literally today's equivalent of 1920s sweatshop work. Much like the heart-wrenching videos of children in Calcutta earning pennies sifting through junk piles for metal scraps, it'll be a better world when robots replace those jobs and humans have jobs designing, installing, programming and servicing the robots. Likewise in consumer robotics applications, so far, the robots in our house only vacuum the floors, change the cat litter box, and wash the dishes/clothes. Growing up my family spent a couple years living in Asia in the 1970s and we actually had a "wash ama" who came twice a week and washed our clothes manually with a washboard and a tub. Sounds quaint but in reality it was grueling labor. She was a lovely lady but I'm glad Maytag replaced that job.

The second analogy I often use is observing that self-driving cars are mainly a threat to Uber and Lyft drivers who often barely earn minimum wage and have no job security to start with. Career professionals actually working in real video and film production workflows feel as "threatened" by prompt-based AIs as Formula 1 drivers feel about self-driving cars. Why does current F1 champion Max Verstappen never get asked how he feels about AI self-driving cars coming for his job? :-) As you observed, anyone who understands the thousands of creative choices which comprise any shot in a quality film doesn't even see these prompt-based AI demos as relevant. Once you've heard a skilled cinematographer, colorist or director of photography spend over an hour deconstructing and debating the creative choices made in a single shot or scene from a film, it's hard to even imagine these demos as a threat to that level of creative skill. But being able to crudely copy the traits of a composite of a thousand exemplars of the craft without understanding any of the interactions between those thousands of creative choices does make for impressive demos. Even though the fidelity of the crude copy is amazing, the fact is such shots are a random puree of a thousand different creative choices pulled from a thousand different great shots. That's the root of what unskilled people call the "AI-clip sheen". It won't be easy to eliminate from prompt-based clip generators because the nature of the NN is it doesn't understand the interactions of all those subtle creative choices it's aping.
Mashing together one cinematographer's lens choice from one shot with another cinematographer's filter choice from another shot with a third cinematographer's film stock choice from another film and a colorist's palette from a fourth unrelated work and then training the output filter only against broad criteria like "looks good" or "like a high-quality art film" is not a strategy that, IMHO, will ever produce a true threat to skilled top-level production workflows.

At the same time, as you observed, NNs are already delivering tremendous value eliminating labor-intensive, repetitive manual production work like frame-by-frame rotoscoping and animation tweening, work no one actually in the industry is sorry to see humans being relieved of. While I think NN-based features in production tools will continue to expand the use cases they can assist, I'm not sure AI tools will ever completely replace high-skill production professionals. I've already mentioned the technical challenges based on how NNs work but even if these challenges are someday overcome, there's a more fundamental limitation which is economic. Although feature film, network-level television and high-end commercials have massive cultural reach and are huge industries, the overall economic value of the entire technical production workflow and related tooling isn't as large as most people imagine. From Panavision cameras, Zeiss film lenses and Adobe Premiere to Chapman camera cranes, Sachtler tripods and Kino Flo lights, it's a relatively small industry with no unicorn-level startups. Even assuming one could license all the necessary content and manually tag it, it's hard to imagine a viable business plan which justifies investing the hundreds of millions required to recruit top-level AI researchers, thousands of H100 GPUs, etc. to create and train a tool that could really replace the top 1000 career production pros working in Hollywood. There are so many other markets AI can target which are potentially far more lucrative than high-end film and video production workflows. Even the handful of blockbuster Summer tent pole movies made each year that cost $200M to make only spend somewhere around $10M or $20M on production labor and tooling below the department head level. That's not enough money to fund AI replacement anytime in the foreseeable future.
The total addressable market of high-end film and video production just isn't big enough to be an attractive target for investors to fund going after it.


I think the most vulnerable spots in the industry are in concept art and matte painting, though I also think companies are starting to realize it's not all it's cracked up to be. A colleague that also contracts for [big famous FX and animation house we all know and love] said they fired their entire concept art department last year and replaced them with prompt jockeys.... for a few weeks. The prompters could bang out a million "great start" rough drafts in an hour, but then their boss came around and inevitably said "oh, this one is the one to stick with. Just move this to the right and that to the left and make this bigger and that smaller and make this cloth purple" and they were cooked. They didn't even have the comparatively basic Photoshop skills to do a hack job there, let alone make changes by hand-- so they'd struggle with control nets and inpainting and more prompts but the whole thing was one gigantic failure and they were begging the centuries of concept art expertise they unceremoniously booted out the door for forgiveness. And those workflows don't require anywhere near the control that, say, compositing does.

My biggest hope for the professional use of these things is in post-render pre-comp polishing for simulations and pyro. They're so good at understanding patterns and having smooth transitions that they can make a nonsense, physically absurd combination of images blend together perfectly... one of my favorites was a background guy's nose in a sepia toned video was neatly melded into a distant oncoming train. I think that could be really great for smoothing out volume textures and things like that. Given, that probably has more to do with my specialty than anything.

My main problem is that I'm just starting out my career in this field after switching from a decade of python dev work, and then doing some visual design before going to art school where I graduated at the top of my program having mostly concentrated in making cool shit at the Houdini/UE confluence. Two years ago everyone was saying "holy crap you've got the golden skillset," and now everyone's like "oof... hang in there... I guess..." Even aside from the strike aftermath, nobody in the market has any idea what to do right now, especially with juniors, let alone a really weird mixture of junior + senior dev that I am with a few contracts under my belt and a ton of really solid coding experience, but nothing really impressive in the industry itself. Who fucking knows. I think a lot of people in charge of hiring are waiting for a moment where it's going to just be sort of obvious what they need to do, and don't want to hire people into FTEs that are going to be eliminated through ai efficiency gains in 6 months. I don't have a lot of insight into the hiring side of the business though.


Wow, your story about the "FX and animation house" is funny, sad and unsurprising - all at the same time. I'm just surprised they didn't actually test the full workflow before leaping. It reminds me of this tale from actual production people working with Sora https://www.fxguide.com/fxfeatured/actually-using-sora/ which I also found completely unsurprising. It still took a team of three experienced pros around two weeks to complete a very modest 90 second video and they needed to reduce their expectations to "making something out of the clips the AI gave us" instead of what they actually wanted. And even that reduced goal required using their entire toolbox of traditional VFX tools to modify the clips the AI generated to match each other well enough. Sure, it's early days and Sora is still pre-alpha but, while some of these problems are solvable with fine-tuning, retraining and adding extensive features for more granular control, some other aspects of these workflow gaps are fundamental to the nature of how NNs work. I suspect the bottom line is that solving some key parts of real-world high-end film/video workflows with the current prompt-based NNs is a case of "you can't get there from here."


For sure. Tooling on top of the core model functionality will absolutely increase the utility of the existing prompt-based workflows, too, but my gut says the diminishing returns on model training are going to keep the "good enough" goalposts much much further into the future with video than with text and still images.


Just need to wait for someone to develop a version of ControlNet that works with this system.


Yeah controlnet-style conditioning doesn't solve for consistent assets, or lighting, framing etc. Maybe it's early but it seems hard to get around traditional 3D assets + rendering, at least for serious use-cases.

These models do seem like they could be great photorealism/stylization shaders. And they are also pretty good at stuff like realistic explosions, fluid renders etc. That stuff is really hard with CG.


Yeah, that's a fair point.


It's my understanding that the AI sheen is done on purpose to give people a "tell". It is totally possible right now to at least generate images with no discernible tell.


> It is totally possible right now to at least generate images with no discernible tell.

I have yet to find examples of this


There are numerous tricks and LORAs to make realistic images without the overpolish you get by default:

* https://www.reddit.com/gallery/1fvs0e1

* https://old.reddit.com/r/StableDiffusion/comments/1fak0jl/fi...


Haha, I think I can maybe tell on like one or two of those


In the linked webpage, the following videos would be good enough to trick me:

- The monkey in hotspring video, if not for its weird beard...

- The koala video I would have mistaken for hollywood-quality studio CGI (although I would know it's not real because koalas don't surf... do they?)

- The pumpkin video if played at 1/4 resolution and 2x speed

- The dog-at-Versailles video edit

If the videos are that good, I'm sure I already can't distinguish between real photos and the best AI images. For example, ThisPersonDoesNotExist isn't even very recent, but I wouldn't be able to tell whether most of its output is real or not, although it's limited to a certain style of close-up portrait photography.

https://this-person-does-not-exist.com/en


> limited to a certain style of close-up portrait photography

Not to take away from your point but it's more limited than one might think from this phrase. As an exercise, open that page and scroll so the full image is on your screen, then hover your mouse cursor within the iris of one of the eyes, refresh and scroll again. (Edit: I just noticed there's a delayed refresh button on the page, so one can click that and then move their mouse over the eye to skip a full page refresh.) I've yet to see a case where my mouse cursor is not still in line with the iris of the next not-person.


So I’m probably going to be too closed minded about this: but who the f*ck asked for this and did anyone consider consequences of easily accessible AI slop generation?

It’s already nearly impossible to find quality content on the internet if you don’t know where to look.


It's only going to be worse and aggregators aka gatekeepers will increase in value immensely.


> who the f*ck asked for this

Have you heard of the quip "Because we can"?


I did and I'm quite happy that this is happening :) It's unleashing a new computing era when you just have to lean back, close your eyes and your vision can materialize without a Hollywood production crew.


And it's great as anyone can use it in whichever way they want since machine generated content does not have copyright protections.

We will finally achieve the dream of everything being in public domain!


My kids both have creative hearts, and they are terrified that A.I. will prevent them from earning a living through creativity. Very recently, I've had an alternate thought. We've spent decades improving the technology of entertainment, spending billions (trillions?) of dollars in the process. When A.I. can generate any entertainment you can imagine, we might start finding this kind of entertainment boring. Maybe, at that point, we decide that exploring space, stretching our knowledge of physics and chemistry, and combating disease are far more interesting because they are real. And, through the same lens, maybe human-created art is more interesting because it is real.


> And, through the same lens, maybe human-created art is more interesting because it is real.

Conversations I have with people in real life almost always come back to this point. Most people find AI stuff novel, but few find it particularly interesting on an artistic level. I only really hear about people being ecstatic about AI online, by people who are, for lack of a better term, really online, and who do not have the skills, know-how, or ability, to make art themselves.

I always meet the breathless joy that some people express at this stuff with confusion. To me, the very instant someone mentions "AI generated" I just instantly find it un-interesting artistically. It's not the same as photoshop or using digital art suites. It's AI generated. Insisting on the bare minimum human involvement as a feature is just a non-starter for me if something is presented as art.

I'll wait to see if the utopian vision people have for this stuff comes to fruition. But I have enough years of seeing breathless positivity for some new tech curdle into resignation that it's ended up as ad focused, bland, MBA driven, slop, that I'm not very optimistic.


> I only really hear about people being ecstatic about AI online, by people who are, for lack of a better term, really online, and who do not have the skills, know-how, or ability, to make art themselves.

Yes, I've noticed this. The people who are excited about it usually come off as opportunistic (hence the "breathless joy"), and not really interested in letting whatever art/craft they want to make deeply change them. They just want the recognition of being able to make the thing without the formative work. (I hesitate to point this out, anticipating allegations of elitism.)

Plus, really online people tend to dominate online discussions, giving the impression that the public will be happy to consume only AI generated things. Then again, the public is happy to consume social media engagement crap, so I'm very curious what the revealed preference is here.

The value in learning this stuff is that it changes you. I'll be forever indebted to my guitar teacher partially because he teaches me to do the work, and that evidence of doing the work is manifest readily, and to play the long, long game.


> Insisting on the bare minimum human involvement as a feature is just a non starter for me if something is presented as art

You can make the guidance as superficial or detailed as you like. Input detailed descriptions, use real images as reference, you can spend a minute or a day on it. If you prompt "cute dog" you should expect generic outputs. If you write half a screen with detailed instructions, you can expect it to be mostly your contribution. It's the old "you're holding it wrong" problem.

BTW, try to input an image in chatGPT or Claude and ask for a description, you will be amazed how detailed it can get.


You need an image for an ad. You write a brief and send it to an artist who follows your brief and makes the image for you. You make more detailed briefs, or you make generic briefs. You receive an image. Regardless, did you make that image or just get a response to your brief?

You want a painting of your dog. You send the painter dozens of photos of your dog. You describe your dog in rapturous, incredible, detail. You receive a painting in response. Did you make that painting? Were you the artist in any normal parlance?

When you use ChatGPT or Claude you're signing up to getting/receiving the image generated as a response to your prompt, not creating that image. Your involvement is always lessened.

You might claim you made that image, but then you would be like a company claiming they made the response to their brief, or the dog owner insisting they were the painter, which everyone would consider nonsensical if not plain wrong. Are they collaborators? Maybe. But the degree of collaboration in making the image is very very small.


> Did you make that painting? Were you the artist in any normal parlance?

The symphony conductor just waves her hands reading the score, does she make music? The orchestra makes all the sounds. She just prompts them. Same for a movie director.


The analogy isn't quite right. The conductor and director spend days collaborating with the symphony and the actors/crew. Parent's example is them literally prompting - via a creative brief - the artist or agency.


The symphony conductor gets credit for being the conductor-- not for being Beethoven. A film director has a thousand times more influence on their final product than a conductor has on theirs, and they still don't try to take credit for the writing, costume, set design, acting, score, special effects, etc. etc. etc. I've yet to see stable diffusion spit out a list of credits after generating an image.


It's still very different. What you describe is exactly what an art director does, which is creative and difficult— there's a good reason many commercial artists end their careers as art directors but none start there. Anybody that says making things that look good and interesting using generative AI is easy or doesn't require genuine creativity is just being a naysayer. However, at most, the art director is credited with the compilation of other people's work. In no situation would they claim authorship over any of the pieces that other people made no matter how much influence they had on them. This distinction might seem like a paperwork difference to people outside of the process, but it's not. Every stroke of the pen or stylus or brush, scissor snip, or pixel pushed is specifically informed by that artist's unique perspective based on their experience, internal state, minute physical differences, and any number of other non-quantifiable factors; there's no way even an identical twin that went to the same school and had the same work experience would have done it exactly the same way with the same outcome. Even using tools like Photoshop, which in professional blank-canvas art creation context use little to no automation (compared to finishing work for photography and such that use more of it.) And furthermore, you can almost guarantee that there's enough consistency in their distinctions that a knowledgeable observer could consistently tell which one made which piece. That's an artistic perspective— it's what makes a piece that artist's own piece. It's what makes something someone's take on the mona lisa rather than a forgery (or, copy I guess if they weren't trying to hide it) of the mona lisa. It's also what NN image generators take from artists. Artists don't learn how to do that— they learn broad techniques— their perspective is their humanity showing through in that process. 
That's what makes NN image generators' learning process different from humans', and why it can make a Polaroid look like a Picasso in his synthetic cubist phase but gets confused about the upper limit for human limb counts. I think generative AI could be used to make statements with visual language, closer to design than art. I definitely think it could be used to make art by making images and then physically or digitally cutting pieces out and assembling them. But no matter how detailed you get in those prompts, there aren't enough words to express real artistic perspective and no matter what, you're still working with other people's borrowed humanity usefully pureed and reformed by a machine. These tools are fundamentally completely different than tools like Photoshop. In art school I worked with both physical media and electronic media and the fundamental processes are exactly the same. Things like typography in graphic design are much easier, but you're still doing the same exact process and reasoning about the same exact things on a computer that you do working on paper and sending it to a "paste up man," as they did until the 80s/90s. People aren't just being sourpusses about this amazing new art tool-- it's taking and reselling their humanity. I actually think these image generators are super neat-- I use them to make mood boards and references all the time. But no matter how specific I get with those prompts, I didn't make any of that. I asked a computer and that computer made it for me out of other people's art. A lot of people who are taken by their newfound ability to make polished images on command refuse to believe it, but it's true. It's a fundamentally different activity.


> your still working with other people's borrowed humanity usefully pureed and reformed by a machine

Exactly, isn't it amazing? You can travel the latent space of human culture in any direction. It's an endless mirror house where you can explore. I find it an inspiring experience, it's like a microscope that allows zooming into anything.


Sure it's a lot of fun. I also find it very useful for some things like references and mood boards. No matter how granular you get with control nets or LORAs and how good the models get, you just can't get the specificity needed for professional work and the forms it gives you are just too onerous to mold into a useful shape using professional tools. It's still, fundamentally, asking another thing to make it for you, like work for hire or a commission. Software like Nuke's copycat tool or Adobe's background remover or content-aware fill were professionally useful right off the bat because they were designed for professional use cases. Even then, text prompt image generators are more useful than not in low-effort, high-volume use cases where the extremely granular per-pixel nuance doesn't really matter. I doubt they'll ever be useful enough for anything higher-level than that. It's just fundamentally the wrong interface for this work. It's like saying a bus driver on a specific route with a bus is equally useful to a cab driver with a cab. There are obviously instances where that's true, but no matter how many great things you can show are on that bus route, and no matter how many people it's perfectly suited for, there's just no way a FedEx driver could use it to replace their van.


Very well said. Agree completely.


Just keying on one comment here, which perhaps no one will read:

I was, in fact, a paste-up man in the early 1990s, slapping together copy and ads for a magazine. As such, I was a ping-pong ball in the battle between account management and creative arts - each of them wanted to be the originator of the big and clever ideas. (This is pretty widespread in the industry, and was even a recurring theme in "Mad Men.")

The takeaway here is, people like to be creative. People need to be creative. There will always be an implacable drive to create, one which DALL-E can never satisfy. Gen AI is the artificial sweetener that might temporarily satisfy those cravings, but ultimately artists want to create something from nothing. There's some hope to be found in that, amid the tsunami of AI slop.


Well I really hope that you were easily able to transition out of paste-up because it kind of blows me away how quickly that whole craft just got clobbered. Just like my uncle that specialized in atlas publishing-- luckily he was able to hang on long enough to retire.

I agree that people do want to be creative, and I don't think that people are going to let Gen AI supplant that for them. However, the lower-end of the creative markets doing the low-end high-volume work-- think folks shotgunning out template-based logos on Fiverr-- are the ones that have already been displaced in large numbers, and there are far more of them. While they generally don't have the right skillset to do the higher-end work, their seeing that as the only viable career move is majorly fucking up companies' ability to find workers and vice versa, and for employers that don't know any better, they think the market is saturated which is bringing down wages.

Also, clueless executives just don't realize that having a neural network generate an "80% right" version of your work in a flat PNG file will take more effort to mold into shape for higher-end work than starting from scratch, so they've been making big cuts. A coworker on a contract also works in an animation house that fired their entire concept art department and replaced them with prompt monkeys making half as much money-- the problem was that standard art director changes-- e.g. I want this same exact image and garment, just make those lapels look a little fuller and softer but with sharper angles at the end, and change the piping on that jacket from green to purple-- might have been half an afternoon for a professional concept artist but would be DAYS of work to get art-director right using neural network tools... if for no other reason than that the prompt writers just don't have the traditional visual art sophistication to even realize when they've got an appropriate solution, because learning that is a lot harder than learning to draw, and you learn that when you learn how to draw. So all the time they saved on the initial illustration was totally sucked up by art directors not being able to iterate even a tenth as quickly as they used to, and fast iteration was the major selling point for Gen AI to begin with. It simply does not do the task if you absolutely require specificity, and having a raster non-layered png that looks like it already went through post is a beast to edit, even for a skilled post-prod person. Well, three months later, they canned the prompt engineers and were begging their concept artists to come back and work for them again. What a waste of everything.

Why do I even bother torturing myself in forums like this by giving a real-world creative industry counterpoint to the tech crowd perspective, despite many of the most vocal ones being smug, patronizing, and self-aggrandizing? Maybe one executive out there will read this stuff and say "Hmm... maybe I should actually talk to people that work in this field that I trust to see if it's really beneficial to replace our [insert creative department] rather than relying on what software execs and their marketing people say is feasible."


> "AI generated" I just instantly find it un-interesting artistically

How familiar are you with what is possible and how much human effort goes towards achieving it?

https://civitai.com/images

Photography, digital painting, 3D rendering -- these all went through a phase of being panned as "not real art" before they were accepted, but they were all eventually accepted and they all turned out to have their own type of merit. It will be the same for AI tools.


I'll be blunt, all of those images look comically generic and extremely "AI".

> Photography, digital painting, 3D rendering

Those are not the same as AI. Using AI is akin to standing beside a great pianist and whispering into his ear that you want "something sad and slow" and then waiting for him to play your request. You might continue to give him prompts but you're just doing that. In time, you might be called a "collaborator" but your involvement begins at bare minimum and you have to justify that you're more involved --- the pianist doesn't, the pianist is making the music.

You could record the song and do more to the recording, or improv along with your own instrument. But just taking the raw output again and again is simply getting a response to your prompt again and again.

The prompts themselves are actually more artistic as they venture into surrealist poetry and prose, but the images are almost always much less interesting artistically than the prompts would suggest.


> I'll be blunt, all of those images look comically generic and extremely "AI".

Ok, now I know you're watching through hate goggles. Fortunately, not everyone will bring those to the party.

> Using AI is akin... [goes on to describe a clueless iterative prompting process that wouldn't get within a mile of the front page]

You've really outed yourself here. If you think it's all just iterative prompting, you are about 3 years behind the tools and workflows that allow the level of quality and consistency you see in the best AI work.


I scrolled through and...have to agree with their impression. I'm confused as to what you thought is being demonstrated by images on https://civitai.com/images of all things, since it's all very high-concept/low-intentionality, to put it nicely. Did you mix it up with a different link?


My litmus test is to simply lie. It weeds out the people hating AI simply because they know or think it is AI. If you link directly to an AI site they're already going to say they hate it or that it all "looks like AI slop". You won't get anywhere trying to meet them at a middle ground because they simply aren't interested in any kind of a middle ground.

> https://www.reddit.com/r/greentext/comments/zq91wm/anons_dis...

Which is exactly the opposite of what the artists claim to want. But god is it hilarious following the anti-AI artists on Twitter who end up having to apologize for liking an AI-generated artwork pretty much as a daily occurrence. I just grab my popcorn and enjoy the show.

Every passing day the technologies making all of this possible get a little bit better and every single day continues to be the worst it will ever be. They'll point to today's imperfections or flaws as evidence of something being AI-generated and those imperfections will be trained out with fine tuning or LoRA models until there is no longer any way to tell.

E: A lot of them also don't realize that besides text-to-image there is image-to-image for more control over composition as well as ControlNet for controlling poses. More LoRA models than you can imagine for controlling the style. Their imagination is limited to strictly text-to-image prompts with no human input afterwards.

AI is a tool not much different than Photoshop was back when "digital artists aren't real artists" was the argument. And in case anyone has forgotten: "You can't Ctrl+Z real art".

Ask any fractal artist the names they were called for "adjusting a few settings" in Apophysis.

E2:

We need more tests such as this. The vast majority of people can't identify AI nearly as well as they think they can identify AI - even people familiar with AI who "know what to look for".

https://www.tidio.com/blog/ai-test/

Artworks (3/4) | Photos (6/7) | Texts (3/4) | Memes (2/2)

Fun excerpt by the way:

> Respondents who felt confident about their answers had worse results than those who weren’t so sure

> Survey respondents who believed they answered most questions correctly had worse results than those with doubts. Over 78% of respondents who thought their score is very likely to be high got less than half of the answers right. In comparison, those who were most pessimistic did significantly better, with the majority of them scoring above the average.


"Rap isn't even music, they aren't even singing!"

You are just expressing the same, uncreative, ignorant opinion that is always expressed when we come upon a NEW ART FORM.


I would say the difference here is with these:

> Photography, digital painting, 3D rendering

You still make these. You sit down and form the art.

When you use AI you don't make anything, you ask someone else to make it, i.e. you've commissioned it. It doesn't really matter if I sit down for a portrait and describe in excruciating detail what I want, I'm still not a painter.

It doesn't even matter, in my eyes, how good or how shit the art is. It can be the best art ever, but the only reason art, as a whole, has value is because of the human aspect.

Picasso famously said he spent his childhood learning how to paint professionally, and then spent the rest of his life learning how to paint like a child. And I think that really encapsulates the meaning of art. It's not so much about the end product, it's about the author's intention to get there. Anybody can paint like a child, very few have the inclination and inspiration to think of that.

You can see this a lot in contemporary art. People say it looks really easy. Sure, it looks easy now, because you've already seen it and didn't come up with it. The coming up with it part is the art, not the thing.


When I make 3D art I instruct a lot of things, how the renderer is configured, lighting details, various systems that need to be tweaked to get the final render to look good.

Using the AI tool chains, you'd start with some generation either via text or image input, then modify various settings, model, render steps, sampler, LoRAs, then a generative upscaling pass, control nets to extract and apply depth, pose, outlines, etc. A colourful mix of systems and config, not unlike working 3D tool chains.

It's also not unusual to mix and match: handcrafted geometry but projection-mapped generated textures, and then a final pass in Photoshop or what have you.

Typing "awesome art piece" into ChatGPT is like rendering a donut.


> You still make these. You sit down and form the art.

When you use a camera you don't make anything. You press a button and the camera makes it. You haven't even described it.

When you use photoshop you don't make anything. You press buttons and the software just draws the pixels for you. It doesn't make you a painter.

When you use 3D rendering software you don't make anything. You tell the computer about the scene and the computer makes it. You've barely commissioned it.

It's easy to be super reductive. Easy but wrong.


Sorry, I don't think it's the same because making physical specifications via modifying pixels, or 3D art, or forming a shot is something you do.

It's the difference between making a house with wood and making a house by telling someone to make a house. One is making a house, one isn't.

The problem with AI is that it's natural language. So there's no skill there, you're describing something, you're commissioning it. When I do photoshop, I'm not describing anything, I'm modifying pixels. When I do 3D modeling, I'm not describing anything, I'm doing modeling.

You can say that those more formal specifications are the same as a description. But they're not. Because then why aren't the business folks programmers? Why aren't the people who come up with the requirements software engineers? Why are YOU the engineer and not them?

Because you made it formally, they just described it. So you're the engineer, they're the business analysts.

Also, as a side note, it's not at all reductive to say people who use AI just describe what they want. That is literally, actually, what they do. There's no more secret sauce than that - that is where the process begins and ends. If that makes it seem really uninspired then that's a clue, not an indicator that my reasoning is broken.

You can get into prompt engineering and whatever, I don't care. You can be a prompt engineer then, but not an artist. To me it seems plainly obvious nobody has any trouble applying this to everyone else, but suddenly when it's AI it's like everyone's prior human experience evaporates and they're saying novel things.


Try it sometime. Don't just type one prompt and declare the job done. Try to make something that invokes a reaction in yourself.

AI makes it easy to generate ten thousand random images. Making something of interest still requires a lot of digging in the tools and in your self.


Right, it can require describing and refining over and over. I still don't think that means you did the thing. Otherwise, the business analysts who have to constantly describe requirements would be software engineers, but they're not.

Not that that isn't a skill in and of itself. I just don't think it's a creative skill. What you're creating is the description, not the product.


You are creating the product but have to go through an unclear layer and through trial and error you try to reach your original vision. No different from painting a picture for an amateur.

The better you get the closer you can get to your original vision.


Best reply I can give ya I already typed up for someone else here: https://news.ycombinator.com/item?id=41743680


If I were trying to convince people that AI art is interesting and creative then I would not choose to highlight the site dedicated to strip-mining the creativity of non-AI artists, to produce models which regurgitate their ideas ad infinitum.


Not to mention extremely suspicious checkpoints that produce imagery of extremely young women. Or, in other words, women with extremely childlike features, presented in ways kids should not be presented.


Sorry, but there's nothing interesting or unique about the images on that site.


I think the main point is that art is interesting precisely because it can transmit human experience. It's communication from another human being. AI "media" completely lacks that. It's more of an expression of the machine-soul, which is tempting us to continue its development until it takes over.


For me, art is more interesting, moving, and soul-connecting the fewer people it is made by. Art by one person gives me a unique perspective into the artist's mind. AI generated art is the opposite of being created by one person. It's an amalgamation of millions or billions of people's input. To me that's uninteresting, not novel and not mind-expanding at all.


> I only really hear about people being ecstatic about AI online, by people who are, for lack of a better term, really online, and who do not have the skills, know-how, or ability, to make art themselves.

Well put. This is also my experience. And I'm no AI doom-monger or neo-Luddite.


I think a key piece here is that I often consume art from the mindset of, "What was the creator thinking?" What is their worldview? What social situations pushed them to express things in this way?

For video, it's possible AI can feed into the overall creative pipeline, but I don't see it replacing the human touch. If anything, it opens up the industry to less-technical people who can spend more time focusing on the human touch. Even if the next big film has AI generation in it, if it came from someone with a fascinating story and creative insight, I'll still likely appreciate it.


I feel the opposite. I don't care how the sausage was made as long as it's a good sausage. Art was never about the creation process. In fact, before the internet most would never see the process at all. Just go to your local museum and you'll never know how most of the pieces were made, and that's a good thing. Art is all about the effect on the viewer.


This requires such a shallow definition for the word "art." I think you're more just talking about images. Art is about more, including the process.


Nah, meta-art is not that interesting tbh and the meta art culture is quite shallow in itself. Real art has always been about sharing an experience or an idea with the viewer and the production is completely irrelevant for this.

In other words, if I take an AI drawing and lie, the end user would still have the same experience as if I was telling the truth. My lie would only affect the meta culture, not the art piece or the viewer's experience.


> I only really hear about people being ecstatic about AI online, by people who are, for lack of a better term, really online, and who do not have the skills, know-how, or ability, to make art themselves.

I generate a lot of art using Stable Diffusion/Flux of my spouse, kids, friends, etc. I was a professional photographer for nearly 10 years - I quit just last year.


People find even randomly generated stuff artistic. I remember the San Francisco Chronicle review of an art piece, which was random cracks in rock caused by heating.

I sort of wondered how you could claim to be the creator of the art when your kiln did all the work, but I suppose they did the important labor of putting it in there.


Consider another angle.

I follow a lot of the new AI gen crowd on Twitter. This community is made up of a lot of creative industry people. One guy who worked in commercials shared a recent job he was on for a name brand. They had a soundstage, actors, sound people, makeup, lighting, etc. set up for 3 days for the shoot. Something like 25 people working for 3 days. But behind that was about 3 months of effort if one includes pre-production and post-production. Think about editing, color correction, sound editing, music, etc.

Your creative children may live in a world where they can achieve a similar result themselves. Perhaps as a small team, one person working on characters, one person doing audio, one person writing a script. Instead of needing tens of thousands of dollars of rented equipment and 25 experts, they will be able to take ideas from their own head and realize them with persistence and AI generation.

I honestly believe these new tools will unlock potential beyond what we can currently imagine.


We have already been through this with music.

It doesn't really work that way. Over time, it really does just devalue the art form in a sense because now anyone can make a recording.

Electronic music is really the best example. In 1995 it took thousands of dollars to have a fully working studio to even produce any track. By 2005, anyone could do this in their bedroom for basically nothing. In 1995 the cost acted as a filter so only those with talent would bother. Once anyone could do it, all electronic music recordings were devalued by the infinite supply.

I thought there would be 1000 Richard Jameses once this happened. Maybe there even are but I have never heard them because there is so much shit to sift through I really don't even listen to electronic music anymore. I don't think there are though. 900 of them probably are doing something else because there is no money left in the art form, 90 are making some other style of music with better financial prospects and the 10 that are, I will never hear of or be able to find.


If the barrier to entry is low for high quality production and anyone is able to make good looking videos, I wonder how audience perception would evolve for judging and valuing what is considered 'good'.


Keep in mind: one of the top selling games for children is Roblox. Our perception of what is "good" is very open to reinterpretation by the coming generations.


That will be the end of creative work. Marketing and promotion is already the most difficult part of any creative endeavor. With literally unlimited trash being produced, it'll become impossible.


There's a term to describe this: creative destruction, literally.

We are at the cusp of a full scale commoditization stage of generative AI that will impact all aspects of the creative/software fields.

If you want to know what this creative destruction will look like, look no further than previous centers of innovation like Detroit, the emptying naval shipyards of Busan, the zombie game studios around Osaka as a sign of things to come.

TLDR: AI is going to destroy a lot of white collar, high creativity, high intellect jobs that aren't protected by a union or occupational collective associations, which were all created to keep creative destruction from taking people's livelihoods away.

Unfortunately, 10 years ago when I tried to create a union organization for software engineers/designers and creative workers, it was sabotaged by fellow software engineers who seem highly susceptible to psyops much more than any other group.

We might see a repeat of what happened in Japan after the mid 90s, when many of the country's stable and ample jobs disappeared thanks to the internet and globalized financiering backed by an authoritarian labour market.

Instead, this time it's not a communist country working together with bankers; rather, it's a small group of technology companies pushing out bankers and creating a sort of dystopian AI-dominated labour field where humans no longer dumpster dive for wages but for any remaining labour industry that AI cannot infiltrate, aka ppl literally switching careers to stay employed because their old jobs were outsourced to AI.

I didn't even talk about the impact on wages (spoiler: it will enrich the 0.1% while shunting the 99.9% to temporary gigs and unstable employment, not unlike regions which experienced similar creative destruction back in the 90s and early 2000s).

It's hard to see a future without some sort of universal basic income and increased taxation on billionaires who will no longer be able to hide their assets offshore without facing serious headwinds not unlike how Chinese billionaires fear the CCP.


Or maybe, the limiting factor in one's ability to create art will be... creativity rather than the technical skills necessary to make movies, draw, or pluck strings.


Creativity isn’t magic, it’s a skill. There is no creativity without the application of it. By definition creativity produces something. Without skills it’s not possible to produce anything.

The act of creating teaches you to be better at creating, in that way and in that context. This is why people with practice and expertise (e.g., professional artists, like screenwriters and musicians) can reliably create new things.


To an extent. Take cooking for example though- I don't doubt that writing recipes and trying them builds one's creative muscle; on the other hand, I don't think we'd be at a loss for great chefs if we were to automate the cutting of onions, the poaching of eggs, and the stirring of risotto.


Of course we would. That’s my entire point.

Take poaching eggs for example. Let’s say you automate that 100% so as a human you never need to do it again. Well, how good are your omelettes then? It’s a similar activity — keeping eggs at the right temperature and agitation for the right amount of time. Every new thing you learn to do with eggs — poaching, scrambling, omelettes, soft-cooking for ramen — will teach you more about eggs and how to work with them.

So the more you automate your cooking with eggs the worse you get at all egg-related things. The KitchenBot-9000 poaches and scrambles perfect eggs, so why bother? And you lose the knowledge of how to do it, how to tell the 30-second difference between “not enough” and “too much.”


> Creativity isn’t magic, it’s a skill

I don't agree. There's some skill, some theory, behind it. But mastering this alone is almost worthless.

There's a huge overlap between creatives and mental illness, particularly bipolar disorder. It seems perfectly mentally stable people lack that edge and insight. To me, that signals there is some magic behind it.

And it's magic because then it must not be rational and it must not make sense, because the neurotypical can't see it.

I think it's sort of like how you can beat professional poker players with an algorithm that's nonsensical. They're professionals so they're only looking at rational moves; they don't consider the nonsensical.


All artists I have known have spent most of their lives practicing. Just as I have practiced programming.

That's the biggest edge, commitment.

To think that you _need_ to be neurodivergent to be an artist is non-sensical and stating mastering the craft itself is worthless is indicative of a lack of respect for their work.

I'm baffled by this type of comment here in all honesty. Really, broaden your horizons.


Certainly, life-long commitment to some discipline is not something that is in the middle of the bell curve.

I don’t know if neurodivergence might have any overlap, but I wouldn’t be surprised if a study revealed it to be as correlated as the fact that most rich people were born into wealthy families.


> To think that you _need_ to be neurodivergent to be an artist

You will notice I never said this.

All I said, and it is true, is that there is a correlation between being an artist and being neurodivergent.

> stating mastering the craft itself is worthless

Where did I say this too?

It appears you're having an argument with a ghost. You're correct, that argument is baffling! I wonder then why you made it up if you're just gonna get baffled by it? Seems like a waste of time, no?

Look, art is two things: perspective and skill. One without the other is worthless.

I can have near perfect skill and recreate amazing works of art. And I will get nowhere. Or, I can have a unique and profound perspective but no skill, and then nobody will be able to decipher my perspective!


I'm sorry if I misunderstood, but please clarify how these two quotes don't align with what I said?

> But mastering this alone is almost worthless.

> And it's magic because then it must not be rational and it must not make sense, because the neurotypical can't see it.

Not trying to take them out of context, but specifying them. You mention, from my understanding, that mastering is almost worthless without the magic, and the magic only being there if you're neurodivergent.

This implies one cannot be a proper artist if not neurodivergent. Now, I could be misinterpreting it, so I apologize in advance.


I never said the magic is "only" there if you're neurodivergent, I said it seems to me neurodivergent people seem to be more likely to have the magic.

> There's a huge overlap between creatives and mental illness

Keyword overlap, but I don't think it's 100%

Magic is maybe not the right word here, but I do think it's indescribable. It's some sort of perspective.

But I stand by this: > that mastering is almost worthless without the magic

How, exactly, you obtain the magic is kind of unknown. But I do think you need it. Because skill alone is just not worth much outside of economics. You can make great corporate art, but you're not gonna be a great artist.

I think if you're perfectly rationally minded, you're going to struggle a lot to find that magic. I shouldn't say it's impossible, but I think it's close to.


Fair, I think the "magic" depends on other factors that may or may not lead to neurodivergence. Those being:

- Life experience

- Exposure/education when young

Of course, these might lead to neurodivergence or might not. The key thing is that the magic is a very unique, personal thing. Human, one can say. Also, through practice you come to understand new perspectives, something that is perhaps lessened in your view.

Either way, I've misunderstood your take to a degree, and had a much more radical interpretation of it.


You spent your life in *your* lane. Why don't you stay there and keep committing.

We'll be over here trying new things, making new art, and expanding the horizon. Bye.


I'm not sure I completely agree. In some ways, developing technical skills can drill creativity out of you and condition you to think in ways that are really quite rigid and formulaic.


99% of humanity have very little interest in creating. They're mimics, they're fine with copying, hitting repost, et al. You see this across all social media without exception (TikTok being the most obvious mimic example, but it's the same on Reddit as well). You see it in day to day life. You see it in how people spend their time. You see it in how people spend their money. And none of this is new.

The public can create vast amounts of spectacular original content right now using DALL-E, Midjourney, Stable Diffusion - they have very little interest in doing so. Only a tiny fraction of the population has demonstrated that it cares whatsoever about generative media. It's a passing curiosity for a flicker of an instant for the masses.

The hilariously fantastical premise of: if we just give people massive amounts of time, they'll dedicate their brains to creativity and exploration and live exceptionally fulfilling lives - we already know that's a lie for the masses. That is not what they do at all if you give them enormous amounts of time, they sit around doing nothing much at all (and if you give them enormous amounts of money to go with it, they do really dumb things with it, mostly focused on rampant consumerism). The reason it doesn't work is because all people are not created equal, all people are not the same, all brains are not wired the same, the masses are mimics, they are unable & unwilling to originate as a prime focus (and nothing can change that).


That's simply untrue. Children have a natural inclination to create art. It is slowly drilled out of them by various factors, in large part, economic pressures. One of my best friends has a natural talent for drawing. He even made a children's book. Guess what? He became a cop because being a graphic artist is too precarious. If we alleviate the pressures that cause people to become closed off to the possibility of creating art, more people will be open to it.


A lot of creativity is generated by spending countless hours sharpening

> the technical skills necessary to make movies, draw, or pluck strings

AI will (hopefully) be an accelerator for the people still putting in the hours. At least it is for coding


Nah, creativity cannot be separated from the means. "The medium is the message". It is precisely the interaction of technical skill and the mind that creates something truly wonderful.


That's not exactly what McLuhan meant by that statement. "The medium is the message" refers more to how the medium itself influences the way a message is perceived by an audience. It is not an assessment of the creative process itself. It's not as though I disagree entirely with what you're saying though. There are certainly ways in which the medium is highly influential over the process of creating something. But it's a mixed bag, and technical skill is not something to be celebrated in all cases. A technically accurate painting is oftentimes quite dull and uninspired. One could argue that creativity isn't just the interaction of skill and mind, but rather the ability to think beyond the medium, to embrace accidents, imperfections, and impulsive decisions.


You don't need any special technical skills to write the next great American novel. Few people actually do it. Talent and dedication are as elusive as ever.


You: escape the oppressive technical limitations of scoring a piece for an orchestra through novel use of technology.

Csound: To make a sine tone, we'll describe the oscillator in a textfile as if it were a musical instrument. You can think of this textfile as a blueprint for a kind of digital orchestra. Later we'll specify how to "play" this orchestra using another text file, called the score.


The issue is that the human performance of those things is precisely how creativity is expressed. You can tell an AI to write a story you envision but if there’s nothing unique in the presentation (or it copies the presentation from existing media to a large extent) you still end up with boring output.


The discipline and care to get good at it are the things that spur creativity.


Paint didn't replace charcoal. Photography didn't replace drawings. Digital art didn't replace physical media. Random game level generation didn't replace architecture.

AI generated works will find a place beside human generated works.

It may even improve the market for 'artsy' films and great acting by highlighting the difference a little human talent can make.

It's not the art that's at risk, it's the grunt work. What will shift is the volume: from human-created dreck that employed millions to AI-created dreck that employs tens.


Earning a living through creativity doesn't work for the majority of people anyway even without AI in the picture. Creative expression is a thing that exists for its own sake, the people who make a living out of it are lucky outliers.


And so what if they are outliers? It is precisely the outliers that spice up our artistic wealth to make it truly interesting.


"So what" is that OP's children shouldn't be terrified about the prospects of an artistic career because of AI. It is not going from "good career choice" to "long shot", more like "long shot" to "somewhat longer shot".


I suspect the demand for human creative output will shrink, as AI generated content will be so cheap and prevalent, even as it will only ever be an imitation of human art. The same way that most people eat terrible, flavorless tomatoes from the supermarket, instead of the harder to grow heirloom varieties.

But I don't think human creativity is going anywhere. Unless there is some breakthrough that moves it far beyond anything we've seen so far, AI will always be trailing behind us. Human creativity might become a more boutique product, like heirloom tomatoes, but there will always be people who value it.


There might be more creating it than there are those valuing it


I had a similar thought. I knew someone who lived a life of crime. For a long time he was very poor, like most criminals, but for a while he made it big. He could buy anything he wanted; he always liked suits, so he bought very nice suits. But they meant nothing to him. He couldn't enjoy them, as he didn't earn them.

I wonder if it will be the same with AI. When you can have anything for nothing, it has no value. So the digital world will have little meaning.


He might be an exception because most people would have no problems riding around in a million dollar car whether they earned it or not.


Nobody cares about driving around in a million-dollar car. They want the money/power/status of the person who owns the million-dollar car. An unearned million-dollar car is practically a liability instead of an asset.


That's my optimistic belief as well but I've also been disappointed at every turn. The future feels like a nihilistic joke constantly competing to plot the most disappointing course forward.

More likely the average person will happily lap up AI generated slop.


If I imagine a random person on the street, they certainly aren’t enjoying fine human arts because it’s made by a real person. They are scrolling TikTok and don’t care if it’s AI generated or not, if they even notice. The people actually caring about art because it is art are maybe 20% of the population.


I think 20% is being generous… more like 2%.


Creativity is about having original ideas. So far, AI isn’t that good at that, nor at maintaining a consistent idea throughout a production. Will AI be able to come up with a compelling novel series, music album, video game, movie or TV series in ten years? Possibly, but there’s also a good chance that it won’t.


Most creatives work in at-risk jobs like freelance writing, SEO, digital advertising, logo design


The same can be said about plain old I.


Cheaper more effective entertainment is likely to only cause more problems: it will be more addictive, better at hijacking our brains and attention, better at pushing the propaganda goals of the author, better at filling traditional "human needs" of relationships that forever separates us from each other into a civilisation of Hikikomori.

I have little faith in an optimistic view of human nature where we voluntarily turn more toward more intellectual or worthy pursuits.

On one hand, entertainment has often been the seed that drives us to make the imagined real, but the adjacent possible of rewarding adventure/discovery/invention only seems to get more unaffordable and out of reach. Intellectual revolutions are like gold rushes. They require discovery, that initial nugget in a stream, the novel idea that opens a door to new opportunities that draws in the prospectors. Without fresh opportunity, there is no enthusiasm and we stew in our juices.

I suspect the only thing that might save us from total solipsistic brain-in-vat immersion in entertainment... is something like GLP-1 type agonists. If they can help us resist a plate of Danish maybe they can protect us from barrages of Infinite Jest brain missiles from Netflix about incestuous cat wizards or whatever. Who knows what alternatives this new permanently medicated society, Pharma-Sapiens, might pursue instead though.


I believe you're right too. The internet and smartphones are great technology in general, and can do pretty great things but what they've ended up doing was screwing with the reward mechanisms in my brain since I was a teenager. Most optimized use case.

Reading these threads sometimes feels like a bad idea, because you just get new sad ideas about how things will almost certainly be used to make it worse, beyond the ones you can come up with on your own.


We'll be able to start fuzz testing the human brain. A horror film that uses bio-feedback to really push the bits that are actually terrifying you, in real-time. Campaign videos that lean in to the bit that your lizard brain is responding to.


The Onion was ahead of the curve with "New Live Poll Lets Pundits Pander To Viewers In Real Time". https://youtu.be/uFpK_r-jEXg


We heard this same argument when cameras were invented. Yet some of the most valuable paintings in the world were created in the 20th century.

We heard it again when electronic music started becoming a thing.

Formula 1 wouldn’t exist if the blacksmiths had their way.

The unknown scares people because they are afraid of their known paradigms being shattered. But the new things ahead are often beyond anything of which we could ever dream.

Be optimistic.


One must not use analogy to analyze individual technologies. People were afraid of the camera, yes, but the camera does not attempt to replace painting. AI attempts to replace photography, painting, and all sorts of art with something that looks like the real thing. Photography never tried to do that, as photographs don't look anything like paintings.


When the camera was invented, it did replace what paintings were used for at the time. Photographs don't look like paintings, but up until the camera paintings were trying to look like photographs. It's no coincidence that impressionism arrived at the same time as the camera.


There is a difference between replacing usage and replacing the exact art and the people who make it. Yes, the camera influenced painting, but it did not destroy it. AI attempts to destroy natural human expression.


You are just wrong.

Before the camera, portrait painting was how most painters would make their living and the camera upended that completely.

On the last line, look up hyperrealist painter on a search engine. That is the reverse, an artistic movement in painting inspired by the photograph.


Is anyone working on a painting robot that would use colors, strokes and textures based off of great painters?


Science is never going to supplant art. They serve two very different functions in society. What I hope is that performance art and experiences that can't be easily replicated by AI become more mainstream. Things like ARGs and multimedia storytelling, where there is a back and forth participatory sort of process between the audience and the creator.


> Maybe, at that point, we decide that exploring space, stretching our knowledge of physics and chemistry, and combating disease are far more interesting because they are real.

It's a compelling thought - we all like hope - and I think it might be realistic if all of humanity were made up of the same kind of people who read hacker news.

But is this not what the early adopters of the internet thought? I wasn't there - this is all second hand - but as far as I know people felt that, once everyone gained the ability to learn anything and talk to anyone, anywhere, humanity would be more knowledgeable, more thoughtful, and more compassionate. Once everyone could effortlessly access information, ignorance would be eliminated.

After all, that's what it was like for the early adopters.

But it wasn't so in practice.

I worry that hopeful visions of the future have an aspect of projecting ourselves onto humanity.


"And, through the same lens, maybe human-created art is more interesting because it is real."

Most human-created art is rather bad. I used to go to a lot of art openings, and we'd look at some works and ask "will this have been tossed in five years?"


Being pleasing to the eye is often not the point. Technical ability is a small part of the art experience. That's one reason a lot of people hate calling image gens "art" - it's so flashy without substance. But it's also a reason I don't think generative AI is much of a threat to the human practice of art-making.

That said, AI is probably a threat to roles in the entertainment industry. But it's also worth noting that much of the creativity was being sucked out of entertainment well before AI arrived.


I'm hopeful the US will have some subsidy for real creative works like I've seen in Europe.

My limited understanding is that AI could generate Netflix top 10 hits that mostly recycle familiar jokes. The creators made a great product, but I expect anyone who attended film school would rather try something new. The only issue is Netflix won't foot the bill (I know, they take a few Oscar swings a year now).

Recent examples: TV Glow, Challengers, Strange Darling. All movies with specific, unique perspectives, visuals, acting choices, scripts, shots, etc. Think about the perspective in The Wire, The Sopranos, Curb Your Enthusiasm. There is plenty of great work that obviously is nearly impossible to reproduce by an AI and I hope that AI "art" is taxed in a way that funds human projects.


It is easy to do really creative work now but it is even easier to just browse Instagram or TikTok. The real winners in the new world will be people with discipline who can use this tech to create stuff without too much capital or resources.


Why would anybody create stuff that the AI companies are just going to instantly subsume and reproduce far more cheaply? There is not going to be a meaningful economy based on creativity in the next 50 years, any more than there is one now. And then it is going to be far worse. The actual "winners" are just going to be the people who through arbitrary processes like fortunate birth or lucky circumstances are granted admission into elite institutions.


That is a very pessimistic take. In my opinion things are much more accessible now. Nobody cares about your background on the internet as long as you can provide value.

For example, you can build a wrapper over an LLM focussed on a niche this weekend (using Cursor/Copilot) and launch it on Twitter. It is very much possible with the tools we have. If you market it hard and provide value, consumers will line up. This kind of power was not available 50 years back.

Things are easier if you want to hit big. But also it is easy to just be a consumer of media and social media. Depends on which side of the algorithm you are.

Also it does not help to think like a victim of your circumstances. You need to start where you are and try to keep pushing what is possible.


I said 50 years from now.

People like you who think you're going to come out on top of this by throwing everyone else under the bus, treating them like consumers, are a huge part of the problem. That kind of parasitic behavior is not "providing value" except in the cynical way that it lets you devalue other people's lives for your own gain.


So we’ll automate away entertainment jobs but none of the cool science jobs will be automated? I don’t understand how this proposed world will have available work for scientists but not entertainers.


At least for Meta, this has implications for keeping people engaged in their metaverse.


Recently I've been cutting back on TV in favor of non-fiction reading and I feel you have a point. Entertainment comes in many forms and tbh all of them are interesting and rewarding in their own ways, so I'm not worried about AI ruining entertainment for us. That's the least of our actual worries and I'm honestly surprised people find this issue so important.

I guess visualizing AI doing political or social damage or AGI mind control is a bit harder than your favorite show being gone.


Most of my entertainment is watching dudes sitting in their chairs talking into a microphone. I find it more entertaining than the billion dollar entertainment industry.


How would you know it's real? AI art could be portrayed as real and most people wouldn't care if it has a stronger emotional effect.


They will be creating for a very small crowd. It will be nice for me, because I can't stand all the blockbuster movies that prioritize stretching physics with unrealistic special effects over plot and dialog.

I think the musicians that are barely hanging on at this point would prefer to create over having to slog around on tours to pay their health insurance. But nobody is paying for creation.


It really bugs me that the first bits AI has targeted are the parts people actually enjoy doing for fun.

As things stand, AI is okay at writing, art/photos, coding, and now, videos.

These are all things people like doing. Even coding is something a lot of people get a ton of pleasure from.


Unless we have god-like robotics I don't see AI making physical art any time soon. We can print out photos but people still buy paintings. We can 3D print but people still buy sculptures. People are paid to design and build beautiful buildings and interiors.

And of course if you can combine skills with sculpture with graphic design you're getting more specialized and are more likely to make a living - even if the field of graphic design is decimated by AI. That's generally how I feel about my skills as a programmer. I'm not just a programmer. So even if AI does most of the work with coding I can still write code for income as long as it's not the only reason I'm getting paid.


I think there will be a body that certifies artistic content as organic, similar to food. This will create a premium offering for organic content and a lower-tier AI/uncertified level.


I have yet to see any AI produce a single morsel of content of any kind that I would class as even remotely entertaining. So we'll see.


Art and entertainment are different things.


The idea that we won’t care about art is frankly strange. But I think people will still need to make interesting art regardless of the tools.

So far AI doesn’t seem very good at the creative element.


Why would humans explore space when AIs are more intelligent and more physically able to?

Seems more likely we'll just plug ourselves into ever more addicting dopamine machines. That's certainly the trend so far anyway.


Are they gonna stay scared as adults? Lmao

Are you?


AI content is already very dull: the text is dull, the music is dull, the images and videos are also dull. No one is interested in AI Seinfeld or this short movie that AI created. Their only audience is people admiring what the machines have come to be able to do.

Any AI content that's good, and there are a few of them, actually has plenty of human creativity in it.

There are some AI artists beginning to emerge, and there are some AI-generated personas out there who are interesting, but they are interesting only because the people behind them made them interesting.

I am not fatalistic at all for the creatives. AI is going to wipe out the producers and integrators (people that specialize in putting things together, like coders who code when tasked, painters who paint when commissioned, musicians that play once provided with the score), not the creatives.

The GOTCHA, IMHO, will be people not developing skills because the machine can do it, but I guess maybe they will develop the skills that make the machine sing.


This is really something. The spatial and temporal coherence is unbelievable.


Likely results:

- Every script in Hollywood will now be submitted with a previs movie.

- Manga to anime converters.

- Online commercials for far more products.


Previs and storyboarding will benefit tremendously from this, though ultimately it will be usable for B-roll or second unit stuff. And then? We'll see if this tech levels out or up.


Scripts with AI low quality “movie” with blocking etc is an interesting concept.

Manga to anime already exists.

Commercials, particularly for social/online, already happening as well.


Why do these video generation ones never become usable to the public? Is it just that they had to create millions of videos and cherry pick only a handful of decent generations? Or is it just so expensive there's no business model for it?

My mind instantly assumes it's a money thing and they just want to charge millions for it, putting it out of reach for the general public. But then with Meta's whole stance on open AI models, that doesn't seem to ring true.


There are a few available to the public. runway.ai and kling are a couple that I see heavily used on Twitter.

I pay for Runway right now for experiments and it works. The problem is that maybe 1 out of 10 prompts results in something useable. And when I say useable I have pretty low standards. Since the model pumps out 5 or 10 second clips you have to be pretty creative, since the models still struggle with keeping any kind of consistency between shots. Things like lighting, locations, characters can all morph within/between clips.

The issue isn't quality exactly, it is like 80% there. When it works, it is capable of blowing your mind. You can get something that looks like it is a bonafide Hollywood shot. But that is a single 5 second or 10 second clip. So far there is no easy way to reliably piece those together to make even a 1 minute long TikTok.

The real problem is the cost. Since you have to sometimes do 10 prompts to get a single acceptable shot it is like a 10x multiplier on the cost per second of video. That can get very expensive for even short experiments.
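
To put rough numbers on that multiplier, here is a minimal sketch. It assumes each generation is an independent try with a fixed chance of being usable, so the expected number of tries per keeper is 1 / success rate. The function names and the $0.50 per-clip price are made up for illustration; they are not Runway's actual pricing.

```python
def expected_attempts(success_rate: float) -> float:
    """Mean number of generations needed to get one usable clip.

    Assumes independent attempts with a fixed success probability
    (a geometric distribution), which is a simplification.
    """
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return 1.0 / success_rate


def cost_per_usable_second(cost_per_clip: float,
                           clip_seconds: float,
                           success_rate: float) -> float:
    """Expected spend per second of footage that is actually kept."""
    if clip_seconds <= 0:
        raise ValueError("clip_seconds must be positive")
    return cost_per_clip * expected_attempts(success_rate) / clip_seconds


if __name__ == "__main__":
    # Hypothetical inputs: $0.50 per 5-second generation, 1 in 10 usable.
    print(expected_attempts(0.1))
    print(cost_per_usable_second(0.50, 5, 0.1))
```

The point is only that the hit rate divides straight into the price: halve the number of wasted prompts and the effective cost per second halves too.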


Hi zoogeny (and anyone else here) — you can try our new app Nim to address the Runway problems you describe https://alpha.nim.video

We offer both image-to-video (same situation as Runway, need a few attempts to make something awesome) and video-to-video (under the name "Restyle 2.0") - this is our newest tool and is highly reliable, i.e. you can get complex motion (kissing, handshakes, boxing, skateboarding, etc) with controllable changes to input video (changing outfits, characters, backgrounds, styles).

Unlike Runway and Kling, we currently offer a simple UNLIMITED plan for just $10/mo. Check it out! https://alpha.nim.video


Thanks - will look into this more deeply once I am ready to start integrating generation into my tool.

Do you have an API that can be called? Are you interested in reselling your technology through 3rd party tools?


What are the maximum video dimensions your service can output? With a 1024x1024 image it exports 512x512 on the free plan.


Kling’s new 1.5 model is WAY better than anything else I’ve tried. Makes Runway look terrible. Really good temporal consistency and even gets hair and clothes and stuff right.

They also just added the ability to do lip sync to a moving head and it gets the lighting right too - Runway’s lip sync breaks if there’s any movement at all.

I’m gonna stop pumping Kling on this comment thread now - until they start paying me to advertise!


GTA IV Real Life - Runway Gen 3 AI shows the potential to turn a low-fidelity source into something life-like https://youtu.be/FGBSzSO8k6A It would be really cool to get this to work locally at playable rates.


How much do you pay? Imagine if they could charge premium prices to studios, like $100k/user

that's probably where the quality is, but not the billions


At “the public” Internet scale, if a hundred million people click Generate, imagine if Meta ends up paying a million dollars instantaneously.

- How many clicks of Generate are budgeted for?

- How many clicks should each user’s quota be?

- How much advertising revenue will be earned per click?

- Why should they give away a million dollars?

Right now, AI costs for this are so high that offering this feature ‘for free’ would bankrupt a small country in a matter of days, if everyone on Meta used it once. It doesn’t particularly matter what the exact cost is: it’s simply not tolerable to anyone who owes payment for the services provided.

This is also why the AI industry is trying to figure out how to shift as much AI processing as possible to devices without letting users copy their models to profit off of the training research spend.
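
As a back-of-the-envelope version of the questions above, here is a small sketch. The only figures taken from this comment are the illustrative "hundred million clicks" and "a million dollars"; the budget in the example and both function names are hypothetical, not anything Meta has published.

```python
def cost_per_click(total_cost: float, clicks: int) -> float:
    """Average cost of one Generate click."""
    if clicks <= 0:
        raise ValueError("clicks must be positive")
    return total_cost / clicks


def clicks_per_user_within_budget(budget: float,
                                  users: int,
                                  per_click_cost: float) -> float:
    """How many clicks each user could be given before the budget runs out."""
    if users <= 0 or per_click_cost <= 0:
        raise ValueError("users and per_click_cost must be positive")
    return budget / (users * per_click_cost)


if __name__ == "__main__":
    # Figures from the comment: $1M spread over 100M clicks.
    per_click = cost_per_click(1_000_000, 100_000_000)
    print(per_click)
    # Hypothetical: a $5M budget shared by the same 100M users.
    print(clicks_per_user_within_budget(5_000_000, 100_000_000, per_click))
```

Advertising revenue per click would have to exceed that per-click figure for a free tier to break even, which is the third question in the list.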


Meta owns their data centers, so I don't think that framing is quite right. Increased traffic might cost marginally more in terms of electricity usage, but I think mostly what would happen is the service would degrade.


The hardware serving web requests on Facebook is very different from the hardware used to generate these videos. It’s different kit, that is currently quite expensive and power intensive.

Facebook absolutely does not have a fleet of GPUs idling that could suddenly spring into action to generate a billion of these videos, nor do they have power stations on standby ready to handle the electricity load.


Right, my point is that "paying a million dollars instantaneously" isn't something that Meta would face the way a company with a public cloud infra would, and as a result their motivations / concerns are probably more along the lines of bad user experiences (due to performance bottlenecks) hurting public perception rather than runaway costs bankrupting the company.


Having recently seen cost analysis for hosted enterprise generative AI, we’ll continue to disagree on this point. You certainly are describing valid concerns but Meta never struck me as being particularly worried about how people think of them; and, I am certain this doesn’t have the ’degrade’ capability at the billion users scale — it would have work queue lengths measured in weeks or more, which is useless for social media.


Just release the model and anyone can run locally, there is no cost except for the end user. Meta has the cash flow to do this if they wanted.


Meta probably doesn't want people generating porn (and worse) with their models or derivations of their models, for obvious reputational reasons.


They are in the wrong business if that's the main concern and will get overshadowed by others as time goes on.


Consistency and continuity are the main problems. Take a look at the “Super Panavision” AI videos on YouTube.

Those videos are a good measure for monitoring AI video improvement.


I'd guess 1 in 10 model demos turn out to be useful product, at best.

This and Sora are particularly annoying, though, for how they put together these huge flashy showcases like they're announcing some kind of product launch and then... nothing. Apparently there's value in just flexing your AI-making muscle now and then.


to be fair, Sora was one of the most mind blowing technology showcases of my life, and openai is successful at raising tons of money


Cost vs profitability is a big factor and those that don't have a product on the market are heavily cherry picking their demos.


There are usable ones

runwayml.com

pika.art

hailuoai.com


klingai.com

lumalabs.ai


When I see lists of URLs like that I can only wonder what a future post archeologist, coming upon this long dusty thread half a decade from now, will find when they try to go to those sites.


I'm confused. The demo let me press a button and generate a video; was it not supposed to?


I didn't see a button for that. Just "download paper". Did I miss it?


KlingAI is pretty good - but only 5 second clips for their v 1.5 model which is much better than 1.0

I made this with it (after training a Flux Lora on myself)

https://vm.tiktok.com/ZGdJ6uSh1/

Also interesting - blog post from someone who actually got to use Sora https://www.fxguide.com/fxfeatured/actually-using-sora/

TLDR; it’s still quite frustrating to use


Came here to say this... These companies all want patted on the back for how cool their video models are but we're still waiting on Sora since like last year. More and more publish these "look at us" papers but don't publish the models or even give us access to them.

They do exist: Luma AI DreamMachine is pretty cool, as well as Kling, Minimax, etc. But they aren't anything like what Sora or this appear to be. They work, but these, while likely cherry-picked, are still a whole new breed of video generation. But who knows if we'll ever actually get to use them or if we're just supposed to reflect on them and think about how cool and impressive Facebook and OpenAI are.


A lot of folks in this thread have mentioned that the problem with the current generation of models is that only 1 in (?) prompts returns something useful. Isn't that exactly what a reward model is supposed to help improve? I'm not an ML person by any means so the entire concept of reward models feels like creating something from nothing, so very curious to understand more.


Bear in mind these systems have already been through the reward-based training, and these are the results that are good enough to show in public.


Which AIs allow you to keep training the model? Most are pre-trained without you knowing how. If you use an open source LLM, you could probably do it, but then you already need to have a lot more understanding of it, be more technical, and have proper hardware. Most AIs I have seen and worked with don't have an option to keep training them. You just use the model as-is, possibly with an initial prompt to tell the AI in what kind of fashion it should respond.


I'm not so sure how much that is relevant to Meta Movie Gen. I've tried all the tools: Luma, Runway, Kling

Luma is by far the worst: compared to Runway and Kling, it produces the lowest-quality and least stable video. Runway has that distinctive "photo in the foreground with animated background" signature that turns many off.

Kling and Runway share that same "picture stability" issue that is rampant requiring several prompts before getting something usable (note I don't even include Luma because its output just isn't competitive imho).

This Meta Movie Gen seems to make heavy usage of SAM2 model which gets me super excited as I've always thought that would bring about that spatiotemporal golden chalice we always wanted, evident by the prompt based editing and tracking of objects in the scene (incredible achievement btw).

Until I have the tool ready to try I will withhold any prejudgements but from my own personal experiences with generative video, this Meta movie gen is quite possibly SOTA.

I simply have not seen this level of stability and confidence in output. Resolutional quality aside (which already Kling and Runway are at top of the game), the sheer amount of training data that Meta must have at disposal must be far more than what Kling (scrapes almost the entirety of Western content, copyrights be damned) and Runway can ever hope to acquire, plus the top notch talented researchers and deep learning experts they house and feed, makes me very optimistic that Meta and/or Google will achieve SOTA across the board.

Microsoft on the other hand has been puttering along by going all in on OpenAI (above, below and beside) which has been largely disappointing in terms of deliverability and performance and trying to stifle competition and protect its feeble economic moat via the recently failed regulatory capture attempt.

TLDR: this is quite possibly SOTA and Meta/Google have far more training data than anybody in the existing space. Luma is trash.


Off topic but some day you could live off grid with your own solar fusion mini reactor powering your own hardware that enables creating your own stories, movies and tales. No more need of streaming services. Internet would be to obtain news, goods and buy greatest and latest (or not) data to update your models. Decentralization could be for once not as painful as it is now; however, I still believe every single hardware vendor would try to hook to the internet and make you install an app. Looking forward to this AI revolution for sure.


To me, this feels like a very dystopian take. I watch movies, read books, and listen to music because they are a way to connect with fellow human beings. Taking the human out of the equation also removes any meaning for me.

I get that this is kind of a fundamental line in the sand for most of the "AI art" going around, and it seems like most people fall on one side or the other. "I consume art for entertainment" vs "I interact with art to experience the human condition".

I also don't want to say that AI Art has no value, because I think as a tool to help artists realize their vision it can be very useful! I just don't think that art entirely made by AI is interesting.


Surely if you’ve watched an amazing show you’re likely to share it with a friend, no? I see this bringing likeminded people together in tight, niche communities.


I'm not. The likelihood that such movies (for example) would have anything significant to say about being human seems very low.

If one watches movies, reads books, etc. just to pass the time, maybe this would be some kind of boon. But for those of us looking for meaningful commentary on life, looking to connect with other human beings, this would be some circle of hell. It's some kind of solipsism.


95% of television and movies, to me, are completely uninteresting and not worth watching. the property of being human-made has a pretty low success rate for basically anyone


I wouldn't be so sure. AI can ingest far more information about humans than a human ever could. It has read our stories and understands our languages. AI might have more to say about humans than we do ourselves.

Of course AI can never truly experience being human, it has no emotions, but it is excellent at mimicry and it can certainly provide a meaningful outside perspective.

Is there anything to say about humanity that is not in the training corpus already?


Every new novel of any merit shows that there is. And the world keeps changing. The experience of being human keeps changing.

Nothing AI has yet done has demonstrated anything at the level of art or mastery. I guess I'm unconvinced that throwing a million stories into the blender and synthesizing is going to produce a compelling one.


Maybe people with good story literacy and cultural comprehension will be able to tell the difference for much longer, maybe even indefinitely. But the majority of people, and I dread that includes me, won't, at some point. I've already fallen for some AI generated music and thought "hey, that sounds pretty good, I'll bookmark it". It's genuinely scary.


That doesn't follow at all. If I come up with a meaningful story and use AI to generate clips and stitch them together to tell it, that's real art.

If you disagree with that, you're basically saying La Jetee isn't art, which would be a hard sell.


I don't have anything against stitching together clips to tell your story, but I'm unconvinced that these demonstrate anything like that. As I said in another comment, it seems like you'd need to write a screenplay PLUS all the information the director, cinematographer, etc. use to create an actual movie -- everything from direction for how actors portray scenes to decisions on exactly how shots are constructed, to blocking for multiple actors in a scene, to color schemes...

There are a LOT of choices in making a movie, and if you just let the AI make them, you are getting "random" (uncontrolled) choices. I don't think that is going to compare favorably to the real thing.

If you can specify all that, then it's just a tool. Cool. But it's still going to take pro-level skills to use it.


If La Jetee was just some photos stitched together plus meaningful narration, then of course, you could use AI-generated photos.

But would AI be able to quote Vertigo, like La Jetee does? Doesn't art, at least to some degree, require intent (including all intentional subversions of that intent dogma, of course)?


They used to say that the Internet would make people smarter and more knowledgeable.

That prediction became true for like 5% of the population, everyone else is probably stupider than they were before, thanks to social media.

Similarly, I think your prediction will apply to a small subset of humanity.


Impressive.

Always important to bear in mind that the examples they show are likely the best examples they were able to produce.

Many times over the past few years a new AI release has "wowed" me, but none of them resulted in any sudden overnight changes to the world as we know it.

VFX artists: You can sleep well tonight, just keep an eye on things!


Yes, and like pretty much every AI release I've seen, even these cherry-picked examples mostly do not quite match the given prompt. The outputs are genuinely incredible, but if you imagine actually trying to use this for work, it would be very frustrating. A few examples from this page:

Pumpkin patch - Not sitting on the grass, not wearing a scarf, no rows of pumpkins the way most people would imagine.

Sloth - that's not really a tropical drink, and we can't see enough of the background to call it a "tropical world".

Fire spinner - not wearing a green cloth around his waist

Ghost - Not facing the mirror, obviously not reflected the way the prompter intended. No old beams, no cloth-covered furniture, not what I would call "cool and natural light". This is probably the most impressively realistic-looking example, but it almost certainly doesn't come close to matching what the prompter was imagining.

Monkey - boat doesn't have a rudder, no trees or lush greenery

Science lab - no rainbow wallpaper

This seems like nitpicking, and again I can't overstate how unbelievable the technology is, but the process of making any kind of video or movie involves translating a very specific vision from your brain to reality. I can't think of many applications where "anything that looks good and vaguely matches the assignment" is the goal. I guess stock footage videographers should be concerned.

This all matches my experience using any kind of AI tool. Once I get past my astonishment at the quality of the results, I find it's almost always impossible to get the output I'm looking for. The details matter, and in most cases they are the only thing that matters.


The one thing that immediately stood out to me in the ghost example was how the face of the ghost had "wobbly geometry" and didn't appear physically coupled to the sheet. This and the way the fruit in the sloth's drink magically rested on top of the drink without being wedged onto the edge of the glass as that would require were actually some of the more immediate "this isn't real" moments for me.


The ghost is insanely impressive; it's the example that gave me a "wow" effect. The cloth physics look stunning. I never thought we would reach such a level of temporal coherence so fast.


I think those types of visual glitches can probably be fixed with more or better training, and I have no doubt that future versions of this type of system will produce outputs that are indistinguishable from real videos.

But better training can't fix the more general problem that I'm describing. Perfect-looking videos aren't useful if you can't get the model to follow your instructions.


VFX artists cannot sleep well, they're already being displaced with AI or being forced to use it to massively increase their output.

Here's an example thread: https://www.reddit.com/r/vfx/comments/1e4zdj7/in_the_climate...

I am not trying to be negative, however it is the reality that ML/LLM has eliminated entire industries. Medical transcription for example is essentially gone.


That thread you linked doesn’t seem to align at all with your claims though? The majority of comments do not make the claim that they’re using any GenAI elements.

As someone who’s worked in the industry previously and is still quite involved, I can say very few studios are using it, because of the lack of direction it can take and the copyright quagmire. There are lots of uses of ML in VFX but those aren’t necessarily GenAI.

GenAI hasn’t had an effect on the industry yet. It’s unlikely it will for a while longer. Bad business moves from clients are the bigger drain, including not negotiating with unions and a marked decline in streaming to cover lost profits.


I don't see it as that much of a problem. It's like washing machines taking away people's job of washing clothes, what are they gonna do with their time now? Maybe something more productive.


… something more productive than art?

that’s quite a productive thing. art has tremendous value to society.

why don’t we automate the washing machine more instead of automating the artist?


Washing machines and roombas were the low hanging fruits in the real world.

Automating more in the real world is much (much) harder than grabbing the low-hanging fruits in the digital world.


Well we already automated all the easy stuff (washing machines for example), and now we’re automating more stuff as we get better at it.


Because some companies stumbled on this treasure first. Need to milk immediately.


We really have a problem once there are no more jobs left for us humans, and only the people who own capital (stocks, real estate etc) will be able to earn money from dividends.


> We really have a problem once there are no more jobs left for us humans

What is the required amount of labor humans should have to do?


The amount required to pay rent on their continued survival, which in a capitalist society, and excluding members of the capitalist class, will never be zero.


Tbf, the biggest private infrastructure project in the history of humanity is now underway (Microsoft GPU centers), the fastest app to reach #1 on the App Store was released (ChatGPT), and it’s dominating online discourse. Many companies have used LLMs to justify layoffs, and /r/writers and many, many fanart subreddits already talk of significant changes to their niches. All of this was basically at 0 in 2022, and 100 by early 2023. It’s not normal.

Everyone should sleep well tonight, but only because we’ll look out for each other and fight for just distribution of resources, not because the current job market is stable. IMO :)


>Many companies have used LLMs to justify layoffs

they are not allowed to call the recession a recession until January 20


On the contrary, I don't think there's anything special about their examples. They probably represent most of the output well. Think of image generation and the insane stuff people can produce with it. There's no "oh yeah this is just cherry picked".


Are any image / video generation tools giving just the output or the layers, timelines, transitions, audio as things to work with in our old fashioned toolsets?

The problem: In my limited playing of these tools they don't quite make the mark and I would easily be able to tweak something if I had all the layers used. I imagine in the future products could be used to tweak this to match what I think the output should be....

At least the code generation tools are providing source code. Imagine them only giving compiled bytecode.


Keep in mind that these technologies produce more stuff like what they've been trained on, and they need tremendous amounts of training data to pull that off.

It so happens that there are innumerable samples of prose and source code and rendered songs and videos and images to use as this training data.

But that's not so much the case for professional workflows (outside of software development).

If the tools can evolve to generating usefully detailed and coherent media projects instead of just perceptually convincing media assets, it's going to be a while before they get there.


There are some approaches that use an LLM to generate “scripts” (you can think of them as a DSL) for composing/arranging media, essentially driving other models to generate parts of the media. One example is WavJourney: https://audio-agi.github.io/WavJourney_demopage/


They definitely do not give you an Adobe After Effects project. This is because of the way they are trained. I suspect a vast proportion of its training data is not annotated with the corresponding layers, timelines, etc so the model is unable to reproduce it like that. You basically just get video AFAIK.


If you have experience as a graphic designer, you can get very far with any layer based graphic tools like Krita or Affinity in conjunction with proper inpainting against generative image models - in fact that's InvokeAI's entire target user base.


Photo sharing websites (including Facebook) used to be wrappers around ImageMagick with extra features. I love how the backbone of their training involves calling out to ffmpeg. It gives a little hope to those of us who, too, are working on a smaller scale but with similar techniques.

Scale? I have access to an H100. Meta trained their cat video stuff on six thousand H100s.

They mention that these consume 700W each. Do they pay domestic rates for power? Is that really only $500 per hour of electricity?
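
As a rough sanity check on that figure, here is a back-of-the-envelope sketch that uses only the numbers quoted above (six thousand H100s, 700 W each, $500 per hour). It assumes the GPUs are the only load, which ignores cooling, networking, and host machines, and the variable names are made up for illustration.

```javascript
// Back-of-the-envelope check using only the figures quoted in the comment.
const gpuCount = 6000;       // "six thousand H100s"
const wattsPerGpu = 700;     // "700W each"
const dollarsPerHour = 500;  // the "$500 per hour" being questioned

// Total draw in kilowatts, GPUs only (assumption: no cooling or host overhead).
const totalKw = (gpuCount * wattsPerGpu) / 1000;

// Price per kWh that would make the hourly bill come out to $500.
const impliedDollarsPerKwh = dollarsPerHour / totalKw;

console.log(`${totalKw} kW total`);
console.log(`$${impliedDollarsPerKwh.toFixed(3)} per kWh implied`);
```

So $500 per hour corresponds to about 4.2 MW at roughly 12 cents per kWh. Whether that counts as a domestic rate depends on where you live, and the real bill would be higher once everything besides the GPUs is included.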


They're all investing into their own sources of power. I'm sure Zuck has a few deals in place too.

https://www.businessinsider.com/google-considering-nuclear-p...

> Both its competitors, Amazon and Microsoft, have already announced electricity deals with nuclear power stations.

> In March, Amazon inked a $650 million deal to buy electricity from the Susquehanna nuclear power station, per the Financial Times.

> Then, in September, Microsoft signed a 20-year deal to purchase energy from Pennsylvania's Three Mile Island nuclear plant, the plant's owner, Constellation Energy, said in a statement.


Things are about to get weird. We can't control this at any level:

At the level of image/video synthesis: Some leading companies have suggested they put watermarks in the content they create. Nice thought, but open source will always be an option, and people will always be able to build un-watermarked tools.

At the level of law: You could attempt to pass a law banning image/video generation entirely, or those without watermarks, but same issue as before– you can't stop someone from building this tech in their garage with open-source software.

At the level of social media platforms: If you know how GANs work, you already know this isn't possible. Half of image generation AI is an AI image detector itself. The detectors will always be just about as good as the generators- that's how the generators are able to improve themselves. It is, I will not mince words, IMPOSSIBLE to build an AI detector that works longterm. Because as soon as you have a great AI content classifier, it's used to make a better generator that outsmarts the classifier.

So... smash the looms..?


My favorite idea that nobody is talking about is how news organizations are about to get a second life. As soon as it becomes actually impossible to distinguish AI content from human content, news organizations will have the opportunity to provide that layer of analysis in a way that potentially can't be (easily) automated. They are ironically against it but IDK maybe they should be excited about it. Would love someone to poke holes in this.


I have the same suspicion, though I wonder if they won't immediately try to "cut open the golden goose" and decide that misusing that trust for short-term gain is favorable (for the person making the decision, if not the organization).


Just stop taking any video you see at face value? People managed without videos before video cameras were available, and the written word was never reliable to start with. Maybe the future won’t be that different?


Except that time "before video cameras" didn't coincide with a time in which everyone had a magic device in our pocket that allowed anyone to send a firehose of propaganda our way.

If yellow newspapers were able to push us to war despite us knowing that "the written word was never reliable to start with", what will be the impact of the combination of this technology and the internet used against a population that has been conditioned over generations to trust video?


If “fake news” is anything to go by, the population will quickly be de-conditioned from trusting video.


Absolutely not. You can just go to Twitter or Reddit, like https://www.reddit.com/r/pics/, to see an image with a (e.g. political) caption that purports something to be true and thousands of people will take it onboard as truth. Nobody asks for a source, or they are admonished when they do for apparently disagreeing with the political claim.

You can go on Youtube to see charlatans peddle all sorts of convenient truths with no evidence.

You don't even need AI. The bug is in the human wetware.


So this is basically a regression to a 19th-century level in terms of being able to trust and understand reporting on the world beyond our own front door. People managed before photographic and video evidence was a thing; you could use eyewitness reports from trusted friends and news on the official telegraphs, to the extent that those were trustworthy. But it's certainly still a big step backward from the 20th century, that brief window of time where it was much easier to record physical evidence of an event than to fake it.


Photographic evidence has been subject to manipulation before computers were even a thing, more so after Photoshop became widely available. There has always been forensics for that, which will continue to evolve.

I think the issue with trust is rooted elsewhere - in social relations, politics, and not in AI generated content.


It has, but it used to take a lot more skill to manipulate a photo than to take a photo, and convincing video manipulation was even harder. I'm also skeptical that forensics will be able to keep up, because of the basic principle of antagonistic training -- any technique forensics can use can be applied back into improving the pipeline that generates the image, defeating the forensic tool. That certainly wasn't the case in the 20th century.


What remaining institutions still command any trust?


... Most of them?

Do you read the news at all? If you can't trust any of them, then why even bother?


Such as?


I'm confused. Do you not trust any mainstream media? Where do you get your news? World and local? Eyewitness accounts only?


good advice for internet citizens (too bad the uptake will be too slow). but doesn't address how courts and law should function.


I think pretty soon we will get to the point where there’s some sort of significant boundary at all levels between online and real life because the only way to be sure you’re seeing something real is to be interacting with it in real life. The internet will not be something you visit on a web browser to get information but will become a place you go where you will simply have to acknowledge that nothing is real. Obviously that’s a concern now but I wonder if we’ll get to a point where it’s taken for granted at large that whatever you see on the internet just isn’t real. And I wonder what implications that will have.


> IMPOSSIBLE to build an AI detector that works longterm

    return Math.random() < Math.pow(0.5, (new Date()).getFullYear() - 2023) ? "Not AI" : "AI";
This should increase in accuracy over time.


It turns out that "return 'AI'" is a better strategy when the probability is above 50%: https://www.lesswrong.com/posts/msJA6B9ZjiiZxT6EZ/lawful-unc...
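
A minimal sketch of why that holds, assuming a fraction `p` of all content is AI-generated and the detector answers "AI" with probability `q` regardless of what it is shown. Both names and the 0.8 figure are made up for illustration.

```javascript
// p: fraction of content that really is AI-generated (hypothetical input).
// q: probability that the detector answers "AI", independent of the content.
function expectedAccuracy(p, q) {
  // Correct when it says "AI" on AI content, or "Not AI" on human content.
  return p * q + (1 - p) * (1 - q);
}

const p = 0.8;
const matching = expectedAccuracy(p, p); // answer "AI" as often as AI content occurs
const alwaysAi = expectedAccuracy(p, 1); // always answer "AI"

console.log(matching.toFixed(2), alwaysAi.toFixed(2));
```

With 80% AI content, matching the frequency is right about 68% of the time, while always answering "AI" is right 80% of the time. For any `p` above 0.5 the expected accuracy increases with `q`, so `q = 1` wins.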


Good point. Here's a patch:

    Math.random = () => 1;


The challenge is to determine what is real, not what is fake.

I think cryptographic signing and the classic web of trust approaches are going to prove the most valuable tools in doing so, even if they're definitely not a panacea.


This comes up a lot. Because synthesis is so generally feasible plus the existence of very powerful editing tools for things like movies and whatnot, I'm guessing that it will simply become the norm to assume that any image, sound, movie, or whatever may be fake. I expect there won't be a way to verify something was synthesised or "real-synthesized" (since images and videos are ultimately synthesized themselves, just from reality instead of other synthesized content). Even with signing and web of trust we can only verify who is publishing something, but not the method of synthesis.


Trusted entities could vouch for the veracity (or other aspects) of things, especially if they are close to the source.

We already implicitly do this: if a news outlet we trust publishes a photo and does not state that they are unsure of its veracity we assume that it is an authentic photo. Using cryptographic signing that news outlet could explicitly state that they have determined the photo to be real. They could add any type of signed statement to any bit of information, really. Even signing something as being fake could be done, with the resulting signed information being shareable (although one would imagine that any unsigned information would be extremely suspect anyway).

The web of trust approach is to have a distributed system of trust that allows for less institutional parties to be able to earn trust and provide 'trusted' information, but there are also plenty of downsides to it. A similar distributed system that determines trustworthiness in a more robust way would be preferable, but I am not aware of one.
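
To make the signed-statement idea concrete, here is a minimal sketch using Node.js's built-in `crypto` module with an Ed25519 key pair. The outlet, the statement format, and the `claim` field are all hypothetical; a real scheme would also need a trusted way to distribute the public key.

```javascript
const crypto = require("crypto");

// Hypothetical news outlet key pair; in practice the public key would be
// published somewhere readers already trust.
const { publicKey, privateKey } = crypto.generateKeyPairSync("ed25519");

// Stand-in for the raw bytes of a photo file.
const photoBytes = Buffer.from("raw bytes of the photo");

// The statement binds a claim to this exact file via its hash.
const statement = Buffer.from(JSON.stringify({
  sha256: crypto.createHash("sha256").update(photoBytes).digest("hex"),
  claim: "verified-authentic",
}));

// Ed25519 takes no separate digest algorithm, hence the null.
const signature = crypto.sign(null, statement, privateKey);

// Anyone holding the public key can check the statement.
const ok = crypto.verify(null, statement, publicKey, signature);

// Any change to the statement invalidates the signature.
const tampered = Buffer.from(statement.toString().replace("authentic", "fake"));
const stillOk = crypto.verify(null, tampered, publicKey, signature);

console.log(ok, stillOk); // true false
```

Note what this does and does not show: the signature proves which key holder made the statement about that exact file, not that the statement is true.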


It can be verified if the resulting video contains signed metadata with all the intermediate steps needed to produce the video from the original recording (which is digitally signed by the camera).

The downside is that large original video assets would need to be published for such verification to work.


You won't be able, as some average person, to trust that what gets to Twitter, Instagram, or whatever image and video hosting platform gets popular in the future is real, but 1) I'm not sure you can today anyway, 2) plenty of people don't consume anything from these platforms and get by fine, and 3) what are you even relying on this information for?

Are you concerned about predicting the direction or "real" state of your national economy? Videos aren't going to give you that. Largely, you can't know. Heavily curated statistical reports compiled and published by national agencies can only give you a clear view in retrospect. Are you concerned that a hurricane might be heading your way and you need to leave? Don't listen to videos on social media. Listen to your local weather authority. Are you concerned about whether X candidate for some national office really said a thing? Why? Are any of these people's characters or policy positions really that unclear that the reality or unreality of two seconds worth of words coming out of their mouths are going to sway your overall opinion one way or another?

Things you should actually care about:

- How are your family and friends doing? Ask them directly. If you can't trust the information you get back, you didn't trust them to begin with.

- How should you live your life? Stick with the classics here, man. Some combination of Aristotle, Ben Graham, and the basic AHA guidelines on diet and exercise will get you 95% of the way there.

- How do you fix or clean or operate some equipment or item X that you own? Get that information from the manufacturer.

Things you shouldn't care about:

- Is the IDF or Hamas committing more atrocities?

- Does Kamala Harris really support sex changes for convicted felons serving prison sentences funded by public money?

- Can Koalas actually surf?

Accept at some point that you can't know everything at all times and that's fine. You can know the things that matter. Get information from sources you actually trust, as in individual people or specific organizations you know and trust, not anonymous creators of text on Reddit. If you happen to be a national strategic decision maker that actually needs to know current world events, you're in luck. You have spy agencies and militaries that fully control the entire chain of custody from data collection to report compilation. If they're using AI to show you lies, you've got bigger problems anyway.


The web of trust doesn't seem to scale! All of the online social platforms trend towards centralization for identity verification.

In my (historically unpopular) opinion we have two optional choices outside of but still allowing for this anonymous free-for-all:

A private company like Facebook uses a privileged system of identification and authentication based on login/password/2FA and relying on state-issued identification verification,

OR, what I feel is better, a public institution that uses a common system based on PKI and state-issued identification, eg, the DMV issuing DoD Common Access Cards.

Trusting districts and nation-states could sign each other's issuing authorities.

The benefits are multifaceted! It helps authenticate the source of deep fakes. It helps fight astroturfing, foreign or otherwise. It helps to remove private companies fueled by advertising revenue from being in a privileged position of identification, etc, etc.

I totally understand any downvotes but I would prefer if you instead engaged me in this conversation if you disagree.

I'd love to have this picked apart instead of just feeling bummed out.


I agree the cat is out of the bag, but GANs do not work like that. One of the common failure modes in training a GAN is that the discriminator gets too powerful too quickly and the generator then can no longer learn.

Hard to say anything is impossible off of one point - but discrimination afaik is generally seen as the easier problem of the two, given you only need to give a binary output as opposed to a continuous one.
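
That failure mode can be seen in a few lines, assuming the original minimax generator loss `log(1 - D(G(z)))` with a sigmoid output on the discriminator. This is only an illustration of the gradient shrinking, not a GAN implementation, and many practical GANs use other losses specifically to avoid it.

```javascript
// Discriminator output D = sigmoid(logit) is its belief that a sample is real.
const sigmoid = (a) => 1 / (1 + Math.exp(-a));

// The original minimax generator loss on a fake sample is log(1 - D).
// Its derivative with respect to the discriminator's logit is -sigmoid(logit).
const generatorGradient = (logit) => -sigmoid(logit);

// More negative logit = discriminator more certain the sample is fake.
for (const logit of [0, -2, -5, -10]) {
  console.log(logit, generatorGradient(logit).toExponential(2));
}
```

As the discriminator becomes more confident that a generated sample is fake, the gradient reaching the generator goes to zero, so the generator gets almost no signal about how to improve.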


> It is, I will not mince words, IMPOSSIBLE to build an AI detector that works longterm

Like pretty much any tool involving detection of / protection from erroneous things, it's forever a cat and mouse game. There will always be new viruses, jailbreaks, banned content, 0-days etc. AI detection is no different.


> Nice thought, but open source will always be an option, and people will always be able to build un-watermarked tools.

That's why you make it punishable by potential prison time if you create/disseminate a non-watermarked video generated in this way.


A possible option is for cameras to digitally sign the original video as it is being recorded.


Oi mate, you 'ave a license for producing cryptographic signatures to embed on that footage?


Any chance of this being released open weights? Or is the risk of bad PR too high (especially near a US election)?

It being 30B gives me hope.


Meta's text-to-image model CM3leon[0] was announced in July 2023. It hasn't been released yet, so I think this one might take a while.

[0] https://ai.meta.com/blog/generative-ai-text-images-cm3leon/


I don't think they will ever release it considering it's likely much worse than flux.


> Any chance of this being released open weights?

Considering that Facebook/Meta releases blog posts titled "Open Source AI Is the Path Forward" but then refuses to actually release any Open Source AI, I'm guessing the answer is a hard "No".

They might release it under usage restrictions though, like they did with Llama, although probably only the smaller versions, to limit the output quality.


They have released a ton of open source? Llama 3 includes open training code, datasets, and models. Not to mention open-sourcing the foundation of most AI research today, pytorch.


Llama 3 is licensed under "Llama 3 Community License Agreement" which includes restrictions on usage, clearly not "Open Source" as we traditionally know it.

Just because pytorch is Open Source doesn't mean everything Meta AI releases is Open Source, not sure how that would make sense.

The dataset for Llama 3 is "A new mix of publicly available online data.", not exactly open or even very descriptive. That could be anything.

And no, the training code for Llama 3 isn't available, response from a Meta employee was: "However, at the moment-we haven't open sourced the pre-training scripts".


Sure, the Llama 3 Community License agreement isn't one of the standard open licenses, and it sucks that you can't use it for free if you're an entity the size of Google.

Here is the Llama source code, you can start training more epochs with it today if you like: https://github.com/meta-llama/llama3/blob/main/llama/model.p...

It's rumored Llama 3 used FineWeb, but you're right that they at least haven't been transparent about that: https://huggingface.co/datasets/HuggingFaceFW/fineweb

For models I prefer the term "open weight", but to assert they haven't open sourced models at all is plainly incorrect.


> Here is the Llama source code

Correct me if I'm wrong, but that's the code for doing inference?

Meta employee told me just the other day: "However, at the moment-we haven't open sourced the pre-training scripts", can't imagine they would be wrong about it?

https://github.com/meta-llama/llama-recipes/issues/693

> For models I prefer the term "open weight"

Personally, "Open" implies I can download them without signing an agreement with LLama, and I can do whatever I want with it. But I understand the community seems to think otherwise, especially considering the messaging Meta has around Llama, and how little the community is pushing back on it.

So Meta doesn't allow downloading the Llama weights without accepting the terms from them, doesn't allow unrestricted usage of those weights, doesn't share the training scripts nor the training data for creating the model.

The only thing that could be considered "open" would be that I can download the weights after signing the terms. Personally I wouldn't make the case that that's "open" as much as "possible to download", but again, I understand others understand it differently.


The source I linked is the PyTorch model, should be all you need to run some epochs. IDK what the pretraining scripts are.


Doesn't the training script need to have a training loop at least? Loss calculation? An optimizer? The script you linked contains none of these; pretty sure that's for inference only.


Oof you're right - no loss function or optimizer in place, so you'd need to add that plus pull in data + tokenizer to get a training loop going.

Apologies - you are right and I was wrong. I would edit my comments but they're past the edit window, will leave a comment accordingly.
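
For anyone unsure what is missing, here is a toy sketch of the pieces a model definition alone does not give you: a loss, a gradient, and an optimizer step wrapped in a loop. It is plain JavaScript fitting a single weight to made-up data, not PyTorch and nothing like Meta's actual pre-training scripts.

```javascript
// Toy data for y = 2x; a real run would stream tokenized text instead.
const xs = [1, 2, 3];
const ys = [2, 4, 6];

// "Model architecture": the only part an inference-only file gives you.
const forward = (w, x) => w * x;

let w = 0;                 // the single trainable weight
const learningRate = 0.05; // hypothetical hyperparameter

for (let step = 0; step < 100; step++) {
  // Gradient of the mean squared error loss with respect to w.
  let grad = 0;
  for (let i = 0; i < xs.length; i++) {
    const error = forward(w, xs[i]) - ys[i];
    grad += (2 * error * xs[i]) / xs.length;
  }
  // Optimizer step: plain gradient descent.
  w -= learningRate * grad;
}

console.log(w.toFixed(4));
```

The `forward` function plays the role of the shared model file; everything inside the loop is what you would have to write yourself.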


Past the edit window - want it to be higher up that only the model architecture is shared, no training scripts, as diggan correctly points out.


That and the NSFW finetunes that will inevitably follow; unlike the text-gen finetunes these could really cause trouble with deepfakes.


Deepfakes are already a reality, the technology is already there and good enough for harm, the genie is not going back to the bottle.

In fact, the more realistic the deepfakes become, the less harmful actual revenge porn and stolen sex videos can be, because of plausible deniability.


We live in a world where you can just say dumb bullshit about Haitians and millions of people will insist it's real.

This "good deepfakes will prevent harm because of plausible deniability" is absurd copium, and utterly divorced from reality.

Speak to victims some time. You are not helping them.


The porntential is immense.

Seriously though. This is the company that is betting hard on VR goggles. And these are engines that can produce real time dreams, 3d, photographic quality, obedient to our commands. No 3d models needed, no physics simulations, no ray tracing, no prebuilt environments and avatars. All simply dreamed up in real time, as requested by the user in natural language. It might be one of the most addictive technologies ever invented.


Or a multi billion dollar fluke like the Metaverse. Time will tell.


All the pieces are in place to create a primitive holodeck. I hope meta continues to invest/burn billions in this space.


Meta is already a target for regulators - they are going to have to be very careful around this. I think this is why the "metaverse" is still more likely to be decentralized than created by a tech giant. Even if Meta wanted to take a libertarian, "dream whatever you want", stance or even a "dream whatever you want so long as it is more or less legal" stance, they would see a regulatory deluge come pouring down on them. There is no way VR will be able to go mainstream without a drawn out fight over content prohibitions. I think the early internet was a bit of a historical outlier in this sense, where it happened to come about when a relatively laissez-faire attitude towards censorship was prevailing and people did not realize the full impact it would have. That is not the case now. People understand on all sides that this technology has the potential to revolutionize our systems of social relations once again, and I suspect that they will be fighting tooth and nail to shape that outcome as they most desire.


> There is no way VR will be able to go mainstream without a drawn out fight over content prohibitions

Could be, but it's a bit dystopian to imagine that the government would have a say on the images you can generate- locally and in realtime- and send straight to your own eyes, don't you think? Dystopian and very difficult to enforce, too.


Did you just pornify a word?


Just wait until content is generated based on dopamine release rates or brain signals instead of through text.


Sorry, but it's probably just going to be used for ads.


Hahahaha. I think Websters Dictionary may be interested in hiring you.


That’s not how dictionaries work.


Ya don't say.


If social media was the scourge of the last decade, the next decade's scourge will be artificial content.

Digital minimalism is looking more and more attractive.


Sitting in my workshop drinking a cup of tea on a break from making some new saw horses and reading these comments. I’m just so grateful I can do something with my hands, I’m delighted, couldn’t be happier.

If the world ends tomorrow, I’m OK so long as I’m not wasting it on Instagram shitposts filled with fake content when it goes down.


Absolutely terrifying. Please stop.


Absolutely agree. It's very terrifying and will likely cause mass disruption because it will disintegrate the social fabric that is held together by people needing other people for stuff.


It is also an automated loom. But now we techno creatives are the luddites.


The luddites wanted sane working conditions and to be trained on the use of newly introduced, dangerous machinery that was maiming and killing people and was meant to replace swathes of them overnight to make some rich capitalists richer.

The end result for them wanting humane conditions was getting murdered by the machine owners & the state.

Perhaps the rabidly pro-AI people shouldn't be on the side of the murderous psychopaths who wanted to extract maximum profit by employing children and displacing thousands of people if they don't want to be viewed as similarly psychotic.


Absolutely. What's even the purpose of this thing? Who is it really serving?


To all the folks with negative opinions of this work: you guys are nuts! This work is incredible. Is it the end of the line yet? Of course not, but come on! This is unbelievably cool, and who of you would have predicted any of this ten years ago?


For me, peace in society, a nice world where humans can share what they create, and nature outside and preserved are all much better than "cool", and this "cool" tool threatens all of the above.


It's incredible in the same way an AK-47 is incredible. This sort of thing is going to uproot all of culture and god knows what happens after that.


One thing I've noticed with the set of music generation tools (eg Udio, Suno) is that there's a sort of profound attachment to songs that you create. I've never made music the old fashioned way so I'm guessing the same could be true for that as well, but there are songs I've made on Udio that I personally think are amazing but nobody else really responds to. Conversely I can see similar levels of pride and attachment from others for songs they have created that don't do anything for me.

It's going to be interesting to see how that plays out when you can make just about any kind of media you wish. (Especially when you can mix this as a form of 'embodiment' to realize relationships with virtual agents operated by LLMs.)


"Music is about communication" (John Lennon, IIRC). Don't expect people to profoundly connect to music that is nothing more than a collection of regurgitated ideas.

Not to sound too crass, but a parallel could be drawn to smelling one's own farts and wondering why no one else appreciates the smell.


> Don't expect people to profoundly connect to music that is nothing more than a collection of regurgitated ideas.

Music is one of the worst examples to pick for claiming that people don't regurgitate in art. Everything in music builds off one another, and a lot of music (especially music that's seen as lower quality) is described as being just collections of cliches. The reason why "sad music" sounds sad isn't because there's something about instrument choices, key, chords, melody, tempo etc that is measurably intrinsically "sad" - it's because these are stereotypes that the creator has combined together to invoke a certain association in the listeners. If you were extra cynical, you could describe the entire musical field as people largely conditioning themselves over generations to like certain qualities of sound and hate others.

And that applies to almost all art. Basically everything people make is based on stuff that came before that - and it's frustrating to encounter hubris that assumes there's some magical creative process going on inside human brains that will never ever be even approximated by any other means.


> it's frustrating to encounter hubris that assumes there's some magical creative process going on inside human brains that will never ever be even approximated by any other means

Maybe we will, maybe we won't. In the same way maybe we will be able to create life artificially, maybe not. The AI I am critiquing today is what Tamagotchi is to human life. Sure, you can get attached to it and think it's expressing real emotions and wonder why other people are being "hubris" by not realizing how wonderful it is.


"I microwaved this frozen pizza while I was desperately craving pizza, so it was perfect to me. Why does my friend who usually eats hand-crafted pizza not think it's as good?!"


This is not unique to AI. People simply don't care about your stuff. Ask any regular artist or game developer.


> I've never made music the old fashioned way so I'm guessing the same could be true for that as well

Yes, it is. You should try it.


It’s the same feeling. No different from rebuilding that crazy synth you made one night, succeeding, and then being able to improv/vamp with it during a live session. It is a creative process and I urge anyone who enjoys the high level aspect of music creation to pursue the lower levels.


> I personally think are amazing but nobody else really responds to.

Welcome to making music lol. Since there is so much of it, you have to make the absolute best to even be considered. And then, because so many people make the absolute best, people only care about the persona making the music (as great as you are, you aren’t Taylor Swift, Kendrick Lamar, Damon Albarn). Your friends will never care about your music just because you are friends, don’t fall into that trap. Also nobody cares about music without good lyrics, because again, there is just so much instrumental content out there that sounds the same, lyrics differentiate it with a human, emotional element.

Just make stuff for fun. Listen to it every now and then and feel the magic of “hehe I made that”


> Also nobody cares about music without good lyrics

Well, that's an exaggeration if I've ever seen one. Firstly, so much of current chart music has atrocious lyrics. And secondly, instrumental music is very popular.


You got me, I exaggerated on the internet. Sorry.


U good?


> One thing I've noticed with the set of music generation tools (eg Udio, Suno) is that there's a sort of profound attachment to songs that you create.

With all due respect, how could there be when at the click of a button you can generate entire songs? You didn't come up with the chord progression, the structure, the melodic motifs, or the lyrics.

My attachment to my works is directly proportional to the amount of effort it took to create them.


Imagine you've had an idea bouncing around in your head, or even an emotion, for a long time and you've never been able to express it. Then one day you push a button and a piece of art captures what you've been feeling perfectly.

It's not the craft that drives attachment in this case but the emotional resonance of something that you think should exist finally existing.


AI mentioned above is not at the level of capturing and expressing ideas or emotions beyond "a sad rock song about a breakup". Try guiding it to express any clearly formed musical idea.

Author's attachment is to a large degree based on the false notion that they somehow contributed to the creation process.

The generic, frigid, un-interesting "product" that is produced by said AI is why no one other than the prompter is moved by the result.


That’s just not true. I’ve used Suno to generate songs where I have provided all the lyrics. Those lyrics came from a combo of LLMs and my steering/direct edits and then I ran the lyrics through Suno multiple times until I got something I wanted.

I can agree that:

> "a sad rock song about a breakup"

Is probably not going to capture or express your ideas or emotions because you haven’t given it enough. In contrast, writing the lyrics or giving the model a ton more context can absolutely produce something that captures and expresses your ideas and emotions.

At the end of the day I don’t make music for the masses (hell, I’ve only generated a handful of final songs that I’ve liked) but the people I have made them for (or the ones just for me) have enjoyed them quite a bit.

I’m not a songwriter nor am I a musician and I never will be. That’s not where my skills lie and it’s not a skillset I want to learn and hone. AI/LLM tools give me the ability to express myself in a medium that previously was effectively impossible and it makes people I care about smile and that’s good enough for me.


Providing lyrics is called "writing lyrics". If you think that pasting lyrics into a prompt makes you somehow more involved in the process of writing music I don't know what to say.

> can absolutely produce something that captures and expresses your ideas and emotions

The right analogy here is to imagine an infinite museum where you can wander until you find a piece that expresses your emotions. It has nothing to do with the act of your expression, and everything to do with you resonating with a piece produced by someone/something else.

> At the end of the day I don’t make music for the masses

Fair. But you also don't "make music" for yourself. At most you write lyrics.


This is a tad overwrought. There is a creative process, but it’s much more akin to simple producing rather than composing.

My point wasn’t to debate the merit of generated music, it was simply to highlight the effect I described.


It's not closer to producing than it is to composing. In fact, I would say it's closer to composing in the sense that you can at least add lyrics and pick a genre.

Production requires specifying very precise requirements, which the current gen AI is unable to follow. Even at the most fuzzy production level like "a song with strings and a choir", Suno will generate something completely irrelevant. And if you will try to go deeper -- use a classic Moog synth line in the chorus -- don't expect to generate something meaningful.

I won't argue that in the most broad sense, prompt engineering is a creative process. Picking which shoes to wear to work is also a creative process. My argument is that this has barely anything to do with the process of music composition or production. You can literally reuse the same prompt to generate an image or a poem.


I have no explanation.

It’s not a sense of pride or accomplishment. I don’t know what it is. Maybe a small amount of pride. It’s hard to say. But there is a definite connection that feels different listening to songs i requested vs those that other people have.


> You didn't come up with the chord progression, the structure, the melodic motifs, or the lyrics.

Both Suno and Udio allow paid subscribers to upload their own clips to extend from. It works for setting up a beat or extending a full composition from a DAW.

Suno's is more basic than Udio's, which allows inpainting and can create intros as well as extensions, but the tools are becoming more and more powerful for existing musicians. With Udio you can remix the uploaded clip so you can create the chord progression and melody using one set of instruments or styles (or hum it) and transform it into another.

I also use this feature all the time to move compositions from one service to the other. Suno is better at generating intros and interesting melodies while Udio is better at the editing afterwards.


I can imagine a painter two hundred years ago saying the same thing about photographs. How can you feel attachment to a picture when you did not make each brush stroke?


I think OP is saying they really enjoy the song. Not that they feel it is their magnum opus.


You can absolutely specify your own lyrics and structure to Suno.


If by "structure" you mean "add a verse and a chorus" then sure. Music composition goes slightly beyond that.


link a song you prompted that you personally think is amazing


https://www.udio.com/creators/jcims

yeah its not surprising that you wouldn't volunteer this, man


This is some juvenile shit, man.

What did you hope to prove? That I'm right? That songs that I feel a connection to sound terrible to you? An astute reader with some basic emotional intelligence would notice that is the entire point of my comment. Not that 'mouthbreathers on the Internet don't appreciate my art', but that things I 'create' by typing stuff in on a keyboard still manifest this odd connection that is just going to be enhanced as these capabilities increase.

Those songs were something a friend of mine asked for, not the ones I was talking about. Don't worry, I've protected the internet from my poor musical taste.


you'll notice i didn't use the word "create" or "make" in my reply


I made a silly 1-hour long movie with friends +/- 20 years ago, on DV tape. I would love to use this to actually be able to implement all the things we wanted to achieve back then


The problem with gen AI right now is it still feels fairly obvious. There are numerous YouTube channels that primarily rely on gpt for the visuals. And I don't like them.


I know people who use it day to day in their production workflow for ads and installations. You'd never know if they didn't break it down for you. Imagine 1 second scenes which happen so fast your brain just accepts them as "hand made" or a professional job. 90% of it was generative AI, but the "good news" is that it still required a human editor who just happened to save a ton of time to make something that wasn't commercially viable because the client wouldn't have paid for it otherwise.


Obvious to many, but not most, if Facebook is anything to go by


It is important to note, no matching audio dialog, or even an attempt at something like dialog. This seems to be way beyond current full video generation models.


Some of these look really obviously bad, like the guy spinning the fire and the girl running along the beach. And it completely failed at the bubbles


Interesting perspective, considering a paper ByteDance just released yesterday [1] has much worse video quality. If your comparison is to real videos, then for sure the quality isn't great. If instead you compare to other released research, then this model is one of the best released thus far.

[1]: https://epiphqny.github.io/Loong-video/


Okay, let's give it a participation trophy for being the best of the slop category.


doesn't need to be movie quality, just needs to be tiktok quality and this totally passes the bar.

Are you ready to become a penguin in all of your posts to maximise aquatic engagement? I am.


I've become a robot and a demon to maximise engagement, it's called being a vtuber


The spinning fire was one that could easily fool me if a 0.5 shot was in a music video. Context is everything.


I have not had the same feeling as you, and I have looked at AI art for quite a while.

Are you still impressed though?


Yeah, some were impressive, but others looked quite bad. The guy running in the desert looked like a guy floating over the ground only sporadically making contact with the sand. The footfalls in a lot of these videos look pretty janky or "soft".

The clothing changes also have pretty rough edges, or just look like they're floating over the original model. The 3D glasses one looked atrocious. The lighting changes are also pretty lacking.


The most powerful example for me is actually the rain one because it will probably be good enough for lower key effects shots like that to replace a lot of jobs there. More complex generations might look goofy for a while, but if it's good for sky replacement, pyrotechnics, and other line of work effects shots it's going to be heavily disruptive.


Facebook just spent 40 billion dollars on their AI infrastructure. Can they recoup those costs with stuff like this (especially after the VI debacle)? I doubt it. AI has been a wild ass jagged wasteland of economic failure since the 1950's and should be used with extreme caution by these companies... Like, is it worth people's time to spend ten/fifteen dollars (they have to eventually charge for this) to let AI create a, to be frank, half-assed valley of the uncanny movie? I respect the technology and what they're trying to accomplish but this just seems like they're going completely all in on an industry that's laid waste to smarter people than Mark.


In Facebook's case they have more to gain than just nifty gen AI features - better ads, content recommendations, etc. The investment in AI infra is a moat, and is why FB's ad platform has proven to be much more resilient to tracking changes than their competitors (e.g. Snap).


You're right, Meta shouldn't invest in a breakthrough technology. They should focus on what really matters: delivering short term value to shareholders


Hippos don't float.


Incredible, simply incredible. You know a paper is seminal when all the methods seem obvious in hindsight! Though I’m not caught up on SOTA, so maybe some of this is obvious in normal-sight, too.

RIP Pika and ElevenLabs… tho I guess they always can offer convenience and top tier UX. Still, gotta imagine they’re panicking this morning!

> Upload an image of yourself and transform it into a personalized video. Movie Gen’s cutting-edge model lets you create personalized videos that preserve human identity and motion.

Given how effective the still images of Trump saving people in floodwater and fixing electrical poles have been despite being identifiable as AI if you look closely (or think…), this is going to be nuts. 16 seconds is more than enough to convince people, I’m guessing the average video watch time is much less than that on social media.

Also, YouTube shorts (and whatever Meta’s version is) is about to get even worse, yet also probably more addicting! It would be hard to explain to an alien why we got so unreasonably good at optimal content to keep people scrolling. Imagine an automated YouTube channel running 24/7 A/B experiments for some set of audiences…


I was looking for that landslide effect (as seen even in Sora and Kling) where land seems moving very disproportionally to everything else. It makes me motion sick. I have not seen those Sora demo videos a second time for that reason.

These are smooth, consistent, no landslide (except sloth floating in water, the stones on right are moving at much higher rate than the dock coming closer), no things appearing out of nowhere. Editing seems not as high quality (the candle to bubble example).

To me, the fact that these didn't induce nausea while being very high quality makes it the best among current video generators.


Why does it look ... fake?

Before you downvote, don't take this as belittling the effort and all the results, they are stunning, but as a sincere question.

I do plenty of photography, I do a lot of videography. I know my way around Premiere Pro, Lightroom and After Effects. I also know a decent amount about computer vision and cg.

If I look at the "edited" videos, they look fake. Immediately. And not a little bit. They look like they were put through a washing machine full of effects: too contrasty, too much gamma, too much clarity, too low levels, like a baby playing with the effect controls. Can't exactly put my finger on it, but comparing the "original" videos to the ones that simply change one element, like the "add blue pom poms to his hands", it changes the whole video, and makes the whole video a bit cartoony, for lack of a better word.

I am simply wondering why?!

Is that a change in general through the model that processes the video? Is that something that is easy to get rid of in future versions, or inherently baked into how the model transforms the video?


The models produce a form of average video, but with artificial sharpness added for a form of consistency. A truly and consistently original image requires something these models do not have, which is a world model.


While we're still a fair distance away from creating polished products capable of replacing Hollywood gatekeeping, the bursting of the creative dam is on the horizon and it's exciting! I'm looking forward to when you can write a script and effectively make your own series or movie, tweaking it as you go to fit your vision without exhausting the large amount of resources, capital, and human networking needed to produce similar products pre-AI.


I haven’t had any luck being able to effectively generate compositions with text to image / text to video. Prompts like “subject in the lower third of the frame” have thus far completely failed me. I’m sure this will change in the future but this seems pretty fundamental for any ‘AI Powered Film’ to function the way a film director would.

Curious if anybody has a solution or if this works for that


Hippos can't swim. Things are about to get weird where people will start believing strange things. We already have people believing Trump helped people during the hurricane, with images of him wading through water (that are clearly AI generated if you look close enough). We are going to get a form of model collapse at not just the AI level, but societal one.


This is just the landing page for a research paper? It's hard to understand what the actual production capabilities of this are.


does anyone have an example of an AI generated video that's more than 10 seconds long that doesn't look like garbage? All of these tools seem to generate a weirdly zooming shot of something that turns a little bit and that's about it.

Anything longer than a single clip is just a bunch of these clips stitched together.


Did I miss it or did they not say anything about letting people actually use these models, let alone open sourcing them?


I wonder how they will package this as a product. I mean, there is some advantage to keeping the tool proprietary and wrapping it in a consumer product for Instagram/Facebook.

What I hope (since I am building a story telling front-end for AI generated video) is that they consider b2c and selling this as a bulk service over an api.


The obvious use case for Meta is content generation. They provide the tools to content creators who create new content to post on Facebook/Instagram which increases Meta’s ad inventory


Very cool.

But I'm worried about this tech being used for propaganda and disinformation.

Someone with a 1K computer and enough effort can generate a video that looks real enough. Add some effects to make it look like it was captured by a CCTV or another low res camera.

This is what we know about, who knows what's behind NDAs or security clearances.


Same was thought to happen about images but it hasn't. People quickly debunk AI generated content presented as real in replies or community notes. Not a real issue.


Maybe, maybe not. As it is, simple false journalism has caused issues on Facebook in certain countries (to put it lightly).

It's only going to look more realistic in time...


It is really amazing how consistent this model is in demo videos about world object details over time. This spatial comprehension is really spooky and super amazing at the same time. I hope Meta will release this model with open weights and open code, as they have done for the LLaMA models.


Problem is, the moment they release weights someone will fine tune it to generate porn, including CP. So I wouldn’t hold my breath for the weights release - no legal dept will sign off on something with this much fallout potential.


What can you even say about this stuff? It's another incremental improvement, good job Mark. These new video clips of yours are certainly something. I don't know how you do it. Round of applause for Mark!

I will now review some of the standout clips.

That alien thing in the water is horrifying. The background fish look pretty convincing, except for the really flamboyant one in the dark.

I guess I should be impressed that the kite string seems to be rendered every frame and appears to be connected between the hand and the kite most of the time. The whole thing is really stressful though.

drunk sloth with weirdly crisp shadow should take the top slot from girl in danger of being stolen by kite.

man demonstrates novel chain sword fire stick with four or five dimensions might be better off in the bin...

> The camera is behind a man. The man is shirtless, wearing a green cloth around his waist. He is barefoot. With a fiery object in each hand, he creates wide circular motions. A calm sea is in the background. The atmosphere is mesmerizing, with the fire dance.

This just reads like slightly clumsy lyrics to a lost Ween song.


it's totally wild that your first response is shitting on flaws rather than having your jaw drop at machines producing coherent videos from text.

This is _the worst that machines will ever be at this task_, and most of the improvements that need to be made are a matter of engineering ingenuity, which can be translated to research dollars.


This is Hacker News. That comment was way more positive than I expected for something like this and so I assumed this must be pretty awesome


The fact that this is the worst machines are at something doesn’t necessarily imply they will eventually get much better at it.


It certainly wasn't my intent to trash the whole thing, so I'm sorry it came across that way. They've done well. They combined a whole bunch of techniques in a new way, or at least in a better way than we've seen before. I don't think you should be surprised to see these results today.

> This is _the worst that machines will ever be at this task_

This is wrong. We've seen worse and we've seen far, far worse -- what I mean is that we've seen plenty of iterative development in video generation. Even if you only consider machine-learning based video from text prompts. Then consider other generative systems as well as other video research and technology like motion interpolation, depth map generation, etc.. It's an extremely active field.


Round of applause for this useless unsubstantiated comment



When I was little I used to think it was a shame that I could not show my dreams. I could tell my parents what I dreamt but not show them what I saw (or thought I saw while dreaming). Getting closer


Thinking about abuse potential, is there such a thing as irreversible finger-printing of media generated like this? So that even bad actors couldn't hide the fact that it was generated by AI.


> is there such a thing as irreversible finger-printing

No.



That's very impressive.


Impressive, yet more burning GPUs and pushing CO2 in the atmosphere just for stupid stuff that is only of interest to rich western people...

I'd rather have those people work on climate change solutions


At least companies are behind it, which can actually put the money where it's needed and compensate for it.

At least Microsoft and Google are in a race to become CO2 neutral.

And all of these clusters can also be used for research, and partially are.

It's valid criticism, but we need to stop CO2 production in a lot of other industries before we do that for datacenters. Datacenters save a lot more CO2 (just think about not having to drive to a bank to do bank business).


> burning GPUs and pushing CO2 in the atmosphere

My startup develops AI for the nuclear power industry to drive process, documentation, and regulatory efficiency. We like to say "AI needs nuclear and nuclear needs AI".

Big tech has finally realized/gone public that casually saying things like "we're building our next 1GW datacenter" is uhh, problematic[0].

For some time now there has been significant interest/activity in wiring up entire datacenters to nuclear reactors (existing Gen 2, SMRs, etc):

https://finance.yahoo.com/news/nvidia-huang-says-nuclear-pow...

https://www.ans.org/news/article-5842/amazon-buys-nuclearpow...

https://www.yahoo.com/news/microsoft-signs-groundbreaking-en...

https://www.cnbc.com/2024/09/10/oracle-is-designing-a-data-c...

https://thehill.com/policy/technology/4913714-google-ceo-eye...

[0] - https://www.npr.org/2024/07/12/g-s1-9545/ai-brings-soaring-e...


No thank you, I'm not going to let my beloved progressives be dragged into Luddism just so you can feel a little better about yourself through insignificant changes without a meaningful impact on the environment. Anything with true lasting effects will have to be top down like renewable resources/energy, EVs, viable alternatives to plastic, nuclear energy etc.

The argument should never be about reducing energy usage, rather it should be about how we generate that energy in a clean, renewable way.


> Impressive, yet more burning GPUs and pushing CO2 in the atmosphere just for stupid stuff that is only of interest to rich western people

Better to spend 10x amount of energy on humans that will give the same result?


utter loser mentality


Looking at bolt.new I think all the Studio/IDE type of apps are going to look like that. Could be video or code or docs etc.

I can see myself paying a little too much to have a local setup for this.


Website doesn't work on Firefox and videos don't play on Edge. They should consider asking the AI to make a correct website before having it make hippos swim.


I use Edge at work: the videos played without issue (version 129.0.2792.65 on Windows).

I use Firefox on my personal device: the website worked fine though took an extra "hiccup" to load compared to Edge (version 131.0 on Windows).


The entire page load is completely broken on Edge for me. Bizarre


It works fine for me on Firefox on Linux, weird.


Also Firefox 127.0.1 on Linux, works perfectly (using an NVIDIA GPU).


That's a bit old, isn't it?


It is only a little more than 3 months old, so I would not call that old.

I avoid updating to each new Firefox version, because from time to time they break some features important for me.


All works fine for me here in Edge, odd.


Not working in Safari on my MBP either.


Yeah it doesn't play the video for me on S10+. I can't imagine what they're doing to break that. It's just another disposable consumerist craze anyway.


Alright, I may or may not be a moron, but none of my versions of Firefox can connect to this site because 'some HSTS shit'.

Anyone able to update/inform a dinosaur?


I wonder if one day we’ll have generative recommender systems where, instead of finding videos the algorithm thinks you’ll like, it just generates them on the spot.


I’ve long ago heard it said that the two drivers of technology innovation are the military and porn. And, welp, I don’t see any use of this to the military.


> Upload an image of yourself and transform it into a personalized video. Movie Gen’s cutting-edge model lets you create personalized videos that preserve human identity and motion.

A stalker’s dream! I’m sure my ex is going to love all the videos I’m going to make of her!

Jokes aside, it’s a little bizarre to me that they treat identity preservation as a feature while competitors treat that as a bug, explicitly trying not to preserve identity of generated content to minimize deepfake reputation risk.

Any woman could have flagged this as an issue before this hit the public.


Among pretty much everyone I’ve talked to who works in or around the AI industry, the attitude is “let it rip right now, and deal with the consequences as it’s going to happen one way or another”. I’m not sure where I stand on this issue, but the reality is, it’s inevitable whether we want it or not.


> but the reality is, it’s inevitable whether we want it or not.

The "inevitability" of it is mostly a function of the (self-serving) belief that it is inevitable.

Basically, you just cited a bunch of moral cop-outs.


What sort of actions do you think we can take where the dangerous side effects (like creating deepfake pornography) won’t be as easily accessible as illegal streaming of TV shows? Only let big private companies train models? Make open sourcing of weights illegal? Make usage of LLM tools generally illegal? All those are as enforceable as torrenting around the world.


Ah, the usual "if we don't do it, someone else will".


When I read that text my first thought was making some videos of my mom that passed away, since so few videos of her exist and pictures don't capture her personality


How would videos created from photos, photos that didn't capture her personality, show her personality?


This is a Black Mirror episode: https://en.wikipedia.org/wiki/Be_Right_Back


The fact that your first thought was how you could use this amazing tech to remember a lost family member who you love, and OP's first thought was that it could be used for evil so it shouldn't exist says a ton about each of you.


Well I think the second use sounds creepy, too. I'm sure that says a ton about me.


Nah u not wrong

Both unhealthy


If you put a piece of technology into the world you should spend more time on what consequences that has for the living in the future, not the dead.

As someone who has worked on payments infrastructure before, it's probably nice if your first thought is what great things an aunt can buy for her niece, but you're better off asking what bad actors can do with your software, or you're in for a bad surprise.


Meta aren’t exactly known for responsible use of technology.

I would expect nothing less of Zuck than to imbue a culture of “tech superiority at all costs” and only focus on the responsible aspect when it can be a sales element.


Your dream $11.99 kitchen knife! Perfect for stabbing! IT IS REAL!

* product made without use of AI or any unnatural components. pure mountain iron


joking aside, i do think that as AI continues to pervade public consciousness, we will start to see some creators market their products as being "AI-free" or somesuch, for better or worse. It's like lemon market virtue signaling.


I'm surprised this hasn't already been done (or I'm not aware of it)...

Step 1. Train AI on pornographic videos

Step 2. Feed AI images of your ex

Step 3. Profit




why be weird and use real people for reference?

why be extra weird and use a personal reference?


One positive aspect of it is that at some point people will just not care about nudes. Which is better for the victims of revenge porn, not worse.


At what point did someone look at this and think: "Ah yes, this will be good for humanity right now" ?


Seems to have great potential in the VFX industry, for one thing.


I don't think it works like that. It's more "Hey! This tech can make funky videos"


That person would have been fired :-(


Most of the comments here are talking about bad actors using this for misinformation, but they're ignoring what Meta does: it collects your information and it sells ads.

Especially based on the examples on this site, it's not a far reach to say that they will start to generate video ads of you (yes, YOU! your face! You've already uploaded hundreds of photos for them to reference!) using a specific product and showing how happy you are because you bought it. Imagine scrolling Instagram and seeing your own face smelling some laundry detergent or laughing because you took some prescription medicine.


The kid's kite is flying backwards....


FAQs I found:

Is it available for use now? Nope

When will it be available for use? On FB, IG and WhatsApp in 2025

Will it be open sourced? Maybe

What are they doing before releasing it? Working with filmmakers, improving video quality, reducing inference time


Wonder what an AI-generated movie from the same script as the original would look like.


I'm not impressed with the quality. Did they mean to make it look so cartoony?


A lot of them don't look cartoony to me. Better than previous video generators


It feels like in the field of AI, a major advancement happens every month now!


Where can I download this model? Meta is the open source AI company right?


Meta is not the open source AI company. LLaMa (or whatever capitalization that was) was leaked. They ran with that because hey it makes them stand out vs your "Open""AI" and Anthropic etc. But if strategy changes they will happily close the drawbridge. In other words it is a corporation and has no inherent persistent ethics.


The paper that comes with this is nearly as crazy as the videos themselves. At a cool 92 pages it's closer to a small book than a normal scientific publication. There's nearly 10 pages of citations alone. I'll have to work through this in the coming days, but here's a few interesting points from the first few sections.

For a long time people have speculated about The Singularity. What happens when AI is used to improve AI in a virtuous circle of productivity? Well, that day has come. To generate videos from text you need video+text pairs to train on. They get that text from more AI. They trained a special Llama3 model that knows how to write detailed captions from images/video and used it to consistently annotate their database of approx 100M videos and 1B images. This is only one of many ways in which they deployed AI to help them train this new AI.

They do a lot of pre-filtering on the videos to ensure training on high quality inputs only. This is a big recent trend in model training: scaling up data works but you can do even better by training on less data after dumping the noise. Things they filter out: portrait videos (landscape videos tend to be higher quality, presumably because it gets rid of most low effort phone cam vids), videos without motion, videos with too much jittery motion, videos with bars, videos with too much text, videos with special motion effects like slideshows, perceptual duplicates etc. Then they work out the "concepts" in the videos and re-balance the training set to ensure there are no dominant concepts.
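
A rough sketch of what that kind of pre-filtering and concept re-balancing could look like. Everything here is an assumption for illustration: the `Clip` fields, the thresholds, and the simple per-concept cap are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Clip:
    # All fields are hypothetical stand-ins for signals a real pipeline would compute.
    clip_id: str
    width: int
    height: int
    motion_score: float   # 0 = static, 1 = extremely jittery
    text_coverage: float  # fraction of the frame covered by overlay text
    has_bars: bool        # letterbox / pillarbox bars detected
    concept: str          # dominant concept label


def keep(clip: Clip) -> bool:
    """Apply the filters listed above; thresholds are illustrative guesses."""
    if clip.height > clip.width:   # drop portrait videos
        return False
    if clip.motion_score < 0.05:   # drop videos without motion
        return False
    if clip.motion_score > 0.9:    # drop overly jittery videos
        return False
    if clip.has_bars:              # drop videos with bars
        return False
    if clip.text_coverage > 0.2:   # drop videos with too much text
        return False
    return True


def rebalance(clips: list[Clip], cap_per_concept: int) -> list[Clip]:
    """Cap each concept so none dominates (a simple stand-in for re-balancing)."""
    by_concept: dict[str, list[Clip]] = defaultdict(list)
    for clip in clips:
        by_concept[clip.concept].append(clip)
    balanced: list[Clip] = []
    for group in by_concept.values():
        balanced.extend(group[:cap_per_concept])
    return balanced


def build_training_set(clips: list[Clip], cap_per_concept: int = 2) -> list[Clip]:
    # Filter first, then re-balance what survives.
    return rebalance([c for c in clips if keep(c)], cap_per_concept)
```

The point of the ordering is that re-balancing only makes sense on the clips that survive filtering; otherwise a concept could look dominant purely because of junk that gets thrown away anyway.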

You can control the camera because they trained a dedicated camera motion classifier and ran that over all the inputs; the outputs are then added to the text captions.

The text embeddings they mix in are actually a concatenation of several models. There's MetaCLIP providing the usual understanding of what's in the request, but they also mix in a model trained on character-level text so you can request specific spellings of words too.

The AI sheen mentioned in other comments mostly isn't to do with it being AI but rather because they fine-tune the model on videos selected for being "cinematic" or "aesthetic" in some way. It looks how they want it to look. For instance they select for natural lighting, absence of too many small objects (clutter), vivid colors, interesting motion and absence of overlay text. What remains of the sheen is probably due to the AI upsampling they do, which lets them render videos at a smaller scale followed by a regular bilinear upsample + a "computer, enhance!" step.
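
For anyone unfamiliar with the "regular bilinear upsample" part, here is a minimal pure-Python sketch on a single-channel frame. It covers only the plain interpolation step, not the learned "enhance" step, and the corner-aligned coordinate mapping is my own choice for the example rather than something the paper specifies.

```python
def bilinear_upsample(image: list[list[float]], out_h: int, out_w: int) -> list[list[float]]:
    """Resize a 2D grid of pixel values with bilinear interpolation."""
    in_h, in_w = len(image), len(image[0])
    # Map output coordinates onto the input grid so that the corners line up.
    scale_y = (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
    scale_x = (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
    out = []
    for y in range(out_h):
        src_y = y * scale_y
        y0 = int(src_y)
        y1 = min(y0 + 1, in_h - 1)
        wy = src_y - y0  # vertical blend weight
        row = []
        for x in range(out_w):
            src_x = x * scale_x
            x0 = int(src_x)
            x1 = min(x0 + 1, in_w - 1)
            wx = src_x - x0  # horizontal blend weight
            # Blend horizontally on the two neighbouring rows, then vertically.
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out
```

Each new pixel is just a weighted average of its four nearest source pixels, which is why the result looks soft and why a learned sharpening step on top is attractive.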

They just casually toss in some GPU cluster management improvements along the way for training.

Because MovieGen was trained on Llama3 generated captions, it's expecting much more detailed and high effort captions than users normally provide. To bridge the gap they use a modified Llama3 to rewrite people's prompts to become higher detail and more consistent with the training set. They dedicated a few paragraphs to this step, but it nonetheless involves a ton of effort with distillation for efficiency, human evals to ensure rewrite quality etc.

I can't even begin to imagine how big of a project this must have been.


Having read the paper, I agree that this is an enormous effort, but I didn't see anything that was particularly surprising from a technical point of view - and nothing of Singularity-level significance. The use of AI to train AI - as a source of synthetic data, or as an evaluation tool - is absolutely widespread. You will find similar examples in almost any AI paper dealing with a system of comparable scale.


Yeah I know, but you sometimes see posts on HN that talk as if AI isn't already being used for self-improvement. I guess the subtlety is that people tend to imagine some sort of generic recursive self-improvement, and overlook the more tightly focused ways it's being used.


When the comments range from "it's the demise of the world" to "it doesn't look quite right" (and everything in-between) you get a sense of just how early we are into this decade's "big new tech thing".


Impressive but meh.

Impressive on the relative quality of the output. And of the productivity gains, sure.

But meh on the substance of it. It may be a dream for (financial) producers. For the direct customers as well (advertisement obviously, again). But for creators themselves (who are to be their own producers at some point, for some)?

On the maker side, art/work you don't sweat upon has little interest and emotional appeal. You shape it about as much as it shapes you.

On the viewer side, art that's not directed and produced by a human has little interest, connection and appeal as well. You can't be moved by something that's been produced by someone or something you can't relate to. Especially not a machine. It may have some accidental aesthetic interest, much like generative art had in the past. But uninhabited by someone's intent, it's just void of anything.

I know it's not the mainstream opinion, but Generative AI every day sounds more and more like cryptocurrencies and NFTs and these kinds of technologies that have not _yet_ found the defining problem to which they could be a solution.


I'm sick of seeing this generative stuff. It's at least 50% of the content I see online these days. At this point it's so refreshing to see real photography and art, made by real humans. I hope we never lose that.


You’re not sick of it. Because 50 percent of the stuff you think isn’t generative actually is.

You love a lot of this generative stuff, you just hate the idea of knowing it’s generated.


Sorry, but I've been doing photography for so many years I can easily tell the difference. I don't care if you don't believe me though. But yes, this is also about authenticity. There's a big difference between something that happened and something that didn't.


I don't believe you at all. I think you're either a liar or lying to yourself.

https://www.gounfaked.com/

I think it's possible you may think you would be able to differentiate AI generated photos and real photos. But if you looked at that site and told me the SAME thing, then I would know that you actually are lying to me.


This just seems like a serpent eating its own tail, and dystopian to me. Facebook, a company where people share their own content like videos and pictures, is now generating content from nothing but AI. To what end?


Was this trained on personal facebook video data?


Harry Potter-style moving pictures are now a reality.


It should be federal law that any video created with GenAI should be watermarked both stenographically and visually. (Same goes for images and audio.. not sure what can be done about ascii)


Stenography is writing in shorthand. What you mean is steganography.

You can also watermark plain text by generating "invisible" patterns.

Of course, in all these cases, the watermarks are trivial to remove: just re-encode the output with an open model. Which is why I hope there will be no federal law that tries to enforce something that is categorically unenforceable.
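
To make the plain-text point concrete, here is a toy sketch of one possible "invisible" scheme: hiding a tag in zero-width Unicode characters. This is an assumption about what such a watermark could look like, chosen because it is easy to show; it is not the statistical token-level watermarking that LLM vendors have discussed, and the function names are made up for the example. It also shows how little it takes to remove.

```python
ZERO = "\u200b"  # zero-width space, encodes bit 0
ONE = "\u200c"   # zero-width non-joiner, encodes bit 1


def embed(text: str, tag: str) -> str:
    """Append the tag to the text as an invisible run of zero-width characters."""
    bits = "".join(f"{byte:08b}" for byte in tag.encode("utf-8"))
    hidden = "".join(ONE if bit == "1" else ZERO for bit in bits)
    return text + hidden


def extract(text: str) -> str:
    """Recover the hidden tag, or an empty string if there is none."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    usable = len(bits) - len(bits) % 8  # ignore any trailing partial byte
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")


def strip_watermark(text: str) -> str:
    """Removing the mark is a one-liner, which is the whole problem."""
    return "".join(ch for ch in text if ch not in (ZERO, ONE))
```

With this, `extract(embed("hello", "ai"))` gives back the tag, while `strip_watermark` returns the visible text with nothing left to detect.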


yes sorry, i think autocorrect got me when I wasn't looking.

If the feds catch you removing watermarks you go to prison. Problem solved.


They didn't post any examples where it fails?


It will be as interesting as our dreams. So maybe personally interesting, like for a small group sitting around a table and taking the piss. But it’s not gonna make a global sensation.


I can finally watch Star Wars the Smurfs edition


Periodic generative AI reminder:

It will not make you creative. It will not give you taste or talent. It is a technical tool that will mostly be used to produce cheap garbage unless you develop the skills to use it as a part of your creative toolkit -- which should also include many, many other things.


So can we try it? Announcements are no good.


Facebook is already flooded with very strange (to put it kindly) AI-generated boomer engagement bait images. I cannot help but think about how much worse the problem could get with AI generated videos. But they are not cheap to make right now.


Did this website kill anyone else’s phone?


We live in the future. I just hope we consumers get easy access to these video tools at some point. I want to make personal movies from my favorite books


The text to modify a video looks so cool


Seems like it mops the floor with Sora


This is totally awesome - the tech is out there and whether you use it to make videos or solve long-standing human / world problems is up to you.

Yeah, we might get the bad killer robots. But it's more likely this will make it unnecessary to wonder where on this blue planet you can still live when we power the deserts with solar and go to space. Getting clean nutrition and environment will be within reach. I think that's great.

As with all technology: Yes a car is faster than you. And you can buy or rent one. But it's still great to be healthy and able to jog. So keep your brains folks and get some skills :)


« out there » is slightly optimistic.

The model is not released and probably won’t be for a while.

And it probably costs Meta-scale infra to fine-tune to your needs.


The giant crab-like thing in the background of the Hippo swimming (if a hippo could swim) is the stuff of nightmares.


Why don't videos like this ever trend?

#cabincrew

#scarletjohanson

#amen


Just feed it a book?


Student here, I study CS and management. And I'm really puzzled about whether what I learn now can help me have a better life in this era of rapid development of technology.


McDonald’s art.


those penguins are incredibly buoyant.


(commented on wrong thread somehow)


Wrong thread :)


Yet another one-shot, single-clip Instagram machine that can't do a follow-on shot natively.

As it stands, the only chance you have of depicting a consistent story across a series of shots is image-to-video, presuming you can use LoRAs or similar techniques to get the seed photos consistent in themselves.


wow more useless tech


This is great. Honestly imagine we get to a point this technology makes most things so demystified we move on to things that are more difficult.

Like cool a movie doesn’t need to cost $200 million or whatever.

Imagine if those creative types were freed up to do something different. What would we see? Better architecture and factories? Maybe better hospitals?


change style to pencil sketch = absolute gamechanger. (penguins vid)

thats the most amenable approach to ai filmmaking ive seen available yet.

id have to see wayyy more pencil sketch conversions to see exactly whats going on....

...but that right there is the easiest way to hack making movies - with the most control.....so far...


everyone who worked on this should be ashamed


I've kinda given up on the internet at this point. It's sad but comforting. My social networks are just my friends and I've started to get back into reading books and long form blogs. Don't want to be exposed to this endless slop. Every day it gets harder to find something that was so easy before. It's all being buried by endless content. I'm hoping some non AI generative content branch of the internet will be created. Don't know if something like that is possible. Curation seems like the next best step.


It's just a technology. Much like you can still buy "hand made" shoes, there will be people that create curated hand made content. And if they use AI and you can't tell the difference, does it really matter?

I really don't understand why there is so much negativity for a new technology. It's never explained, just taken as a fact and people bemoaning the new state of the world.


> And if they use AI and you can't tell the difference, does it really matter?

It does, yes. To use your own analogy, if one pays for an artisanal product and is served something out of a production line in a factory, that is fraud.

> It's never explained

You’re not paying attention, it is often explained. One huge reason is that it devalues the work of artists who are already struggling to make any money by using their own work as a base without compensation. It’s not hard to find explanations if you really care to spend 5 seconds typing it into a search engine. Heck, I bet that if you asked an LLM, it’d tell you.

https://www.reddit.com/r/Fantasy/comments/zn2e3c/eli5_why_do...


Did the washing machine devalue the work of people washing by hand?

Did cameras devalue the work of portrait artists?

ATMs to bank tellers? Tractor to farmer? Car to horses?

Again, it's technology. I guess I shouldn't have said I never heard an argument. I just never heard a good one that can't be applied to pretty much any other technology that came before.


Human attention doesn’t get freed up by creating more content. It gets consumed.

In all your examples -

1) Yes. It was a good thing

2) Yes. It is now a thing done to learn how to draw, and a niche skill

3) Yes, yes, yes.

If people are bemoaning the devaluing of certain activity, yup it’s true. It happens. There are fewer horses than there were yesterday.

Certain forms of activity get devalued. They are replaced by an alternative that creates surplus. But life goes on to bigger things.

The same with GenAI. Content is increasingly easy to create at scale. This reduced cost of production applies for both useful content and pollution.

Except if finding valid information is made harder, then life becomes more complex and we don’t go on to bigger and better things.

The abundance of fabricated content which is indistinguishable from authentic content means that authentic content is devalued, and that any content consumed must now wait before it is verified.

It increases the cost of trusting information, which reduces the overall value of the network. It’s like the cost of lemons for used cars.

This is the looming problem. Hopefully something appears that mitigates the worst case scenarios, however the medium case and even bad case are well and truly alive.


Who's stopping you from living an Amish lifestyle? The problem seems to be that people want authentic, hand-made 'art' but at the price of mass-manufactured tech.


For art - I can get the dissonance. It’s inherently subjective.

I’m concerned with facts and science.

I have to talk past a litany of falsehoods about mental health with my dad before I can get to the actual science that will help him.

I have to remind people about things they studied in the 6th grade about history, to counter whatever hate group BS that has whatsapped itself into their heads.

This is what I am concerned about.

If it was cheaper to create true content, and more expensive to create non factual content, I wouldn’t be arguing about this with people who write code.

It’s just cheaper to create content and more expensive to identify content.


Things like washing machines and cars liberated people from labor and saved their time, which they could then use to pursue other goals. For many, making art is their goal - generative AI isn't liberating people from the burden of making art, it's making art a non-viable endeavour for millions of people.


Generative AI lets me make films from my desk instead of 7 AM call times, steep bills at the rental houses, and unreliable post editors who fail to meet deadlines.

This isn't just a net win, this puts power in my hand I've never had before.

It's still work and art. No LLM is going to make a compelling story or make the right artistic choices. I do that. But now I get to control way more myself and I'm empowered to see the entire vision through.


I always felt that the beauty of film is in the collaboration between all the people involved to create something larger than themselves. Everybody in the credits brings something irreplaceable to a film, and together they transcend a singular vision to build a work of art.

This generative stuff feels reductive to me. There are no actors, no set designers, prop masters, musicians. Nobody’s bouncing ideas off a colleague or working with three other departments to bring a scene together. It feels less like art when it’s a computer using models based on real stolen art to generate content off of a prompt.


> But now I get to control way more myself.

You are not. The LLM does the bulk of the work for you and will choose a lot of things for you in the movie you are making.

>I'm empowered to see the entire vision though.

I think you fail to grasp one important thing about art in general - it is non-verbal by its nature. You can't go and explain in LLM input some famous painting, it's not how it works.


My partner is a graphic designer and now uses a lot of generative AI in her work. It’s very much an iterative process. She would run hundreds (sometimes thousands) of prompts over the course of compositing a single image. There’s huge amounts of editing involved, too. It doesn’t take less time than before when she was primarily an illustrator. But it enables her to do different types of artwork she wasn’t previously able to.

It’s definitely different. And has some bad sides for sure. But professionals using LLMs for creative work tend to be a lot more involved than just typing a prompt.


That's like how LLMs help me with software development. I don't work less, instead I produce more and what I produce is of greater benefit to my clients.


Diffusion models aren't LLMs and aren't necessarily text prompted. You can paint with them.

> You can't go and explain in LLM input some famous painting, it not how it works.

When directing a film, you're issuing verbal commands to your team. It's actually quite similar to prompting. And I almost never get what I envision. Diffusion in a way gets me closer to what's in my head.


I thought that was one of the problems with some of the art models - that you could input the right sequence of words and get exact copies of famous copyrighted images out.


> But now I get to control way more myself

Is this a good thing? To some extent sure, but I feel like limitations and collaboration lead to better art in many instances.


It's like saying that cameras take away oil painting as a viable endeavor. You can still be an oil painter in today's world. It's just a bit more niche, and not many people can make a living doing it. Plenty of people are still going to make art. They're just going to have to get day jobs like the rest of us to support that activity. Or they're going to have to figure out how to do something so weird and unique that AI can't possibly replicate it. Things like multimedia storytelling, ARGs, and performance art could enter a golden age as TV and movies become outmoded.


> Plenty of people are still going to make art. They're just going to have to get day jobs like the rest of us to support that activity.

You make it sound like producing art isn’t a real job and artists are just fucking around all day. Art takes work, it is a job. One where few can earn a living.

Most artists already have “day jobs like the rest of us”. What’s happening now is that even fewer people can afford to even begin to learn or improve their artistry, they have to give up before they start. Which, by the way, will in turn reduce what image generators can consume.


No, it's just that pursuing what you love as a job is a rare privilege. That doesn't mean it isn't work. But it's different from a day job, and if you expect me to believe that pursuing art is not more fulfilling than some bullshit white collar job, that's laughable. It's kind of a moot point, because these jobs are going to be displaced. There's no sense in fighting technological innovation; history shows you can't stop it. If you try, some other joker is going to come in and take advantage of the technology, putting you at a disadvantage. This is how capitalism drives innovation, and historically, it's one of the few good things about this particular economic system.

It's a double edged sword, having what you love be tied to a wage. Capitalism mediates the scope of what you are allowed to depict. Market forces have created a scenario where existing IP is a safer bet. Just get out of the game. It rarely produces anything worthwhile anyway. Most art is garbage in this system. Better to do independent things on the side.


> Did cameras devalue the work of portrait artists?

Actually yes.

Ask any professional photographer what they think of the selfie generation.

Yes, you can argue "the best camera is the one you have with you".

But ultimately it has devalued the work of the professional photographer.

I know photographers who, for example, spend their life swatting away phone camera users who turn up behind them at the spot where they've set up their camera etc. It's like, I've taken the time to find the spot, the angle, the light and you turn up and devalue all that....


I don't understand. Why would people taking pictures for themselves devalue the work of a professional photographer?


Think of anything you can do well and with nuance. People pay you to do it professionally and properly. Now imagine there’s a tool that kind of does what you do, for free but sloppily. Suddenly people no longer pay you to do what you did, and all around you see the output of that subpar tool and the inferior result it produces. Your work has been devalued. No one is paying for it and most don’t understand why it was better, despite the fact that it was.

Have you ever seen those “graphic design is my passion” memes? That’s a good analogy. Effective graphic design isn’t just plastering some words on a page, it’s understanding where those words go, where to break the lines, which font to use, what background is balanced… All things the tool doesn’t do for you but that affect the final result and how it’s perceived at a subconscious level. There is a difference between a good and a bad poster. Even if both have the same information, one of them will be better at its job (e.g. making people pay attention and pay for the concert ticket) even if the bad designer doesn’t understand what or why.


These are false equivalences, or whataboutisms. The hand washers provided a very different ‘thing’ to that of artists. You mention things like getting clean clothes, getting from a to b in a car or on a horse, getting cash from an ATM instead of from the teller at a bank branch. These are all completely different to the work of an artist or craftsperson that expresses something. So it’s very reasonable for people to seek insight/inspiration/catharsis/etc from things that they understand to be fully crafted by human minds and hands.


> These are all completely different to the work of an artist or craftsperson that expresses something

The thing is, as was the case with many goods where individual craftspeople were largely displaced by industrialized mass production, the expression of the individual artist or craftsperson was often not what the market was paying for anyway. They were paying for something utilitarian, but limited by the available methods of production.


Do photographers express less than painters? Paint and brush are technologies. They enhance paintings.

I expect there will be AI based art, not sure in which form, and people will still find joy in it.


Considering a significant portion of art appreciation revolves around the artist themselves, the talent and story they possess, and the level of effort and expression that went into the piece… I don’t know about that.

There’s plenty of visually impressive AI art out there right now. No one is celebrating it.


Some is also conceptually interesting and celebrated, for example the niceaunties project out of Singapore: https://www.theguardian.com/world/article/2024/jul/19/auntie...


A quick glimpse and it seems wildly uninteresting, conceptually and visually.


This is awesome, thanks for the link.


> Do photographers express less than painters? Paint and brush are technologies. They enhance paintings.

I didn’t mean to imply artists don’t use technology. And I certainly include photography in my definition of things made by the hands and minds of craftspeople. I have a degree in photography to back that up. But as with all technologies, individuals will choose where their line of interest is drawn. There will be AI based art, and there will also be an audience who don’t want it.


It's not any different in principle than any other technological innovation. Arguments to the contrary amount to special pleading. The jobs of many, many craftspeople were displaced in the wake of industrialization and the digital revolution. We all buy and consume things that are mass produced all the time and don't bat an eye. Some artisanal version of these things is available in many cases. You're free to buy the mass produced version or the artisanal version. Technological progress is the engine of capitalism. History shows that you can't stop it, and slowing it just enables competitors to swoop in and eat your lunch. But Capitalism has a shelf life. Feudalism came to an end, and so will the market system. In practice, the implementation into all aspects of production will lead to an insurmountable economic crisis. What comes after is up to us.


My argument is this: technology has been driving down the market value of human labor for the last century at least, and soon the market value of human labor will be pennies.

I specifically use market value because the market valuation mechanism is irrational and its meaning of "value" is different from what people often mean when they talk about value. E.g. the contributions of a fireman are far more valuable to a society than those of a mutual fund manager, but the market value of the fireman's labor is far lower.

Anyway I don't know when it'll happen but the cost of doing many things with AI, automation, etc will soon be very low, and having a human do the thing, maybe worse, maybe better, will be prohibitively expensive. Even if you argue that in the past new technologies have created new jobs, one can look at the labor market value trend of the last 70 years to see that it seems we've passed some turning point where human labor market value in emerging skillsets can't surpass the ability of technology to make them redundant - by which I mean take a look at them wages, they are slip sliding into oblivion.

Maybe not all jobs, but if even 20% of people can't justify their existence under capitalism with labor in a way that feeds and houses them, that's a historic national crisis.

Stopping technological development probably isn't the long term solution. Unions are great but also just a bandaid on a broken system - don't we want to enjoy the benefits of decreased scarcity?

It's time to start having real conversations about how to organize our society as scarcity diminishes to near 0, or if you don't like that, how to organize our societies with the expectation that 20% of people or more simply won't be able to justify their existence through labor.

If you can think of a way to make capitalism work in such a way without enforcing artificial scarcity, I'm all ears, but I'm skeptical. Probably we need to stop requiring people to justify their existence with labor and just let people have food, shelter, medical care etc in return for nothing at all. We almost certainly already have the resources to allow this, and if we don't today we definitely will in 50 years.


> One huge reason is that it devalues the work of artists who are already struggling to make any money by using their own work as a base without compensation.

All productivity technology marginalizes sellers in the same field that choose not to use it. And somehow, I don't think the objections would be any less if models were trained exclusively on material in the public domain.


> It does, yes. To use your own analogy, if one pays for an artisanal product and is served something out of a production line in a factory, that is fraud.

Isn’t this what happened to Etsy? You really can’t tell, so the artisanal goods became mostly factory produced in China. But beyond the romantic and ethical concerns, I wish there was a real tangible advantage to buying from an artisan rather than a factory. At least interior designers and custom cabinets are impossible to mass produce…so far.


Ngl I think many of those artists’ work deserves to be devalued. As someone involved in art, there’s a huge amount of art that’s overly concerned with perfecting technique like a robot.

Entire movies are made just to be “one single take without any edits” and no one stops to ask themselves whether or not that’s actually the most impactful way to tell the story. The vast majority of digital art was just filled with all these people who mindlessly copied the same styles of vaguely realistic characters in the same action hero poses and then demanded praise/money for it. It’s like I was meant to praise the artistic talent of someone just because their process was laborious… and the truth was it wasn’t ever good art.

IMO AI art is just forcing many of these artists to take a look in the mirror, and they don’t like what they see.


> As someone involved in art, there’s a huge amount of art that’s overly concerned with perfecting technique like a robot.

>

> Entire movies are made just to be “one single take without any edits” and no one stops to ask themselves whether or not that’s actually the most impactful way to tell the story.

Yeah, craft is an important thing.

I don’t want photorealistic tattoos, personally, but I appreciate that people have the talent and ability to do them.


Craft isn't labor though, and there are plenty of people who think that just doing a laborious task makes their art interesting. It doesn't.


It's that I can't be forced to care about stuff that the people who commissioned it didn't even bother to read, review or fact check.

Everything feels like the laziest grind imaginable. For every Harry Potter goes to Berghain, there is an infinite amount of slop making it harder to find good content. Not even mentioning the biases baked into the model and copyright/IP issues and limitations.

To me generated content triggers the same cringy feelings as those unrequested flash holiday e-cards in your mailbox in 2010. At least those well-meaning people didn't know better back then.


Subjectively, I find that the infusion of AI content makes the general internet experience worse. Have you ever researched something to find a video that is AI generated with an AI generated voice? It is the SEO spam of video. I think that platforms will go through several phases in their relationship to AI: first, they will like the artificially increased "content creator" numbers. Second, they will like the increase in short term engagement. Third, they will very much dislike the sharp decrease in long term engagement as the market finds whoever can filter low-quality results the best.


Pretty much any "top 10" product video on YouTube.


Well ok, but there's plenty of crap knockoffs that you need to worry about when buying shoes. It's not about being anti-technology. It's about new content being lower quality. You can tell but it takes effort that wasn't necessary before.


I distinguish between content and high effort content. I like both! AI generated content is fine, whatever, but slap an AI generated scene into a woodworking video and you bet it matters to me.

I'm surprised you've never heard any explanations. Here are my personal thoughts on the matter (har har):

https://chatgpt.com/share/6700aac4-8854-8004-b785-0784c6bfee...


Technology is not neutral.


It is not about the technology being bad. IMO it is just a tool. The problem is that the tool enables the pollution of good content because it makes it so easy to spam everything. It feels like it will undo all the progress of PageRank and original Google, and we will be back to the AltaVista/internet portals days with curation.


Never before has there been a technology that allows one to fake competency to this degree. This is occurring at a time when some people are bemoaning the general state of knowledge, the capacity for institutions to deliver on their goals and their mission, and the overall intelligence of the people supposedly in charge.

Along comes a tool that promises to take some of the hardest, most challenging tasks, like photorealistic art, or high quality code engineering, and replace them with slop.

Experts look at that slop and see garbage. Only people without experience, without a trained eye, without taste in code, will think it's great.

It represents a tyranny of juniors and entryists. Where "fake it until you make it" will never even reach the second part of that phrase.

The main thing that AI does, is make anything that comes out of it worthless. Infinite supply with nobody curating it.


> if they use AI and you can't tell the difference

Except you can tell the difference and you will always be able to tell the difference. It's the nature of the beast.

Throwing a bunch of content at AI and telling it to curate it is never going to be the same as a human that intimately knows it.

It's the same with GPT coding. You can always tell the difference between what the machine spat out in 20 seconds and what a knowledgeable human would produce.

Artificial Intelligence is by definition artificial.


> And if they use AI and you can't tell the difference, does it really matter?

Yes. It devalues artists further (on top of only existing due to grand scale theft of art). It also is slop devoid of intention.

If that's not enough, it's ecologically devastating. I would rather not waste millions of gallons of water on slop that gets generated and thrown away.


Human information networks are also an ecosystem. Having an abundance of slop stops the flow of information and knowledge.

You have to filter everything that is shared to make sure you aren’t being predated upon.


Precisely - the amount of damage that OpenAI and LLMs have done to human knowledge sharing and discourse is incalculable. Now we not only have to deal with humans manually writing junk due to perverse incentives, we also have to deal with people passing off machine generated junk at scale.

And we're destroying the environment through rapidly increasing resource consumption to do so. This situation is incredibly fucked.


I'll tell you why. I think it's fair to say that most of the people who visit this site are technologists.

Most of us love technology. So why are people reacting like this?

Because we know how this will be used. We know of grifters and quick-buck schemes. We know that, unfortunately, the web as we know it is rapidly dying. We know of the scammers calling you with your loved ones' cloned voices.

We know how powerful this tech is, how easy it is, and how it will be misused.


Yeah, I have done some grifting myself. I quit, but I still know a lot of really creative grifters, and I know what a tool like this can do in their hands. These people managed to ruin a lot of Google searches with just human labour and some very basic automation. I know what evil they can do with good generative AI.


You are saying reality and fake reality are alike, because hey can you spot the difference?

The whole premise of the Matrix movies was about that: the remainder of human civilization would rather go into a suicidal fight than go back to a more comfy, fake illusion of reality.

When I watch videos from friends or people in general, I am interested in their actual experiences, not lies they try to push about how they wanted their experiences to look. That's frankly pathetic. You may of course not share the same view, but luckily it's the dominant view in this world.

When the Mad Max movies shot most of their fights in a real desert, and it showed, god damn I had huge respect for all involved; that's art. A green background just doesn't cut it.

Do you also recharge in the same way looking at a phone wallpaper (or looking at it in VR) of some mountains or forest as you do actually being there and touching the trees and rock? One is cheap fake junk, the other is proper awesome reality that makes you feel alive like nothing else. Can't grok why the fuck I even need to explain this to somebody; my 4 year old son gets it very well.


I feel as though people are concerned over a hypothetical.

Can anyone show me AI generated content that has fully replaced the alternative?

If this does exist, what exactly is the value that was lost if the “real” thing was so easily replaced?


Yeah, there's literally no reason why anyone should be wary of the harmful effects of social media and of AI slop.


I was thinking exactly that: a coalition of people who refuse to use AI, and who refuse to interact with or support others who do use AI. I actually work for Photography Life, and we have already committed to 100% AI free: no generative AI for articles and no gen AI for photos either. I also have a 100% AI-free commitment on my YouTube channel. Procreate for iPad believes in no AI as well.

But we need more supporters. Place AI-free banners on your site if you have one and send me the link. Encourage others to declare their support against AI. Writing and art is about communicating what humans make because it transmits experience! We have to come together and refuse as much as possible to support those who use AI! Feel free to contact me if you want to collaborate!


"Coalition" makes it sound like it's in the millions, all coordinating together like these guys: https://en.wikipedia.org/wiki/Luddite


I am happy to declare that I am myself, a Luddite in many ways. No problem with that. I like lots of technology, not gonna lie. I'm a programmer and have a PhD in math, but I think it's gone too far in many respects. And if I have to build a coalition into millions, I'll do it one step at a time.


where do I sign up?

early 2000s style amish would do it for me...


No problems with that. I wouldn't even call you a Luddite. I wouldn't look down on you and call what I don't like anti-art.


Where do I sign?


Mill and factory owners took to shooting protesters and eventually the movement was suppressed by legal and military force, which included execution and penal transportation of accused and convicted Luddites.


The outcome of the times. Likewise it is far more likely harmless AI (like adding dumb AI stuff to a family video) will have stronger suppression by current power systems than society suppressing critics of generative AI. Mostly because the latter boils down to cultural preferences and protectionism, not the real sort of harm that would build collective mass to threaten progress. And the former because people are heavily motivated these days by outrage and abstract future threats, well before the tangible evidence exists of widespread harm.


How do you protect your community from the inevitable arrival of trolls who try to blend in and poison your photo collections with AI generated ones, only so they can say "see? it's so good, they don't even notice!"


The community has to be based on actually knowing people and a progressive level of trust. Just like other things in society: doctors and accreditations make sure that fake doctors don't practice, keeping sensitive information is based on trust, etc. An anti-AI community has to be a real community that is based on knowing and actually working with people, and not some online thing where anyone can sign up.

In other words, it's not a social network or an amusing website with a free account. Nor is it just about photo collections. It's about supporting real people who do not want to support or use AI.


Turning it into a prohibition-style movement, where using AI taints you forever unless you confess your sins and pledge total abstinence, seems like the opposite of what you'd want from a group that presents itself as a return to sanity and moderation. It would be better to say, "Do what you want elsewhere, but we don't do that here."

If you want to go full-radical, bombing a few data centers would probably be more effective than being mean to randos who used Midjourney a few times.


Hey, I am not trying to be mean to anyone. We all are forced to use technology more than we would choose to. Some friends and family of mine use AI and I am not mean to them. I just gently talk about it. Even my MacBook has some AI processor in it, though I don't use it.

There's no condemnation whatsoever, except for some who are really pushing AI.


>who refuse to interact with

Zero-tolerance ostracization is mean. If you're not doing that, and simply setting your boundaries - "I don't like the technology, let's talk about something else" - that's one thing. That's not what you were encouraging people to do in GP. I think we should be precise in our rhetoric, because it does have the potential to create needless conflict.

"But you just said-"

I did. People keep saying that AI is an existential threat requiring the most extreme action to stop. So... do people really believe that? Or are they just saying it? The treehuggers have a whole file on their actions[1]; where are the luddites? Is there really conviction, or just people letting off steam at convenient targets?[2]

[1]: https://www.dhs.gov/archive/science-and-technology/publicati...

[2]: I am not encouraging nor do I condone illegal or destructive acts, I'm analogizing a similar movement and its tactics.


I do think there has to be some responsibility taken, rather than being "nice" all the time. And I was speaking of a business sense. If someone uses AI in their own work, I fully support not supporting them. That's not social ostracization, that's just good business sense in not supporting a technology that I don't believe in.

So yeah, let me be clear: I absolutely support boycotting people who use AI when it comes to business decisions, and I absolutely support an economic war of attrition against them.

You know what's mean? Creating a technology that takes other people's jobs en masse like AI. But refusing to do business with those who use AI? I think that's fair play, and if it is at all mean, then it's just karma.


You're conflating the totally sound business decision not to use AI (as it currently exists, in its current legal limbo) in production, with an emotion-driven appeal to normalcy (which itself you have conflated with justice).

Support, boycott, war: these are words you use when your intent is to not associate with someone at all, not just in a business sense. You're trying to present your argument as soft and hard at the same time, and I'm saying that it's completely disingenuous if you don't choose one or the other: either this is an existential crisis that deserves radical action, or it's not and therefore not worth ruining relationships or unintentionally presenting yourself as hysterical or a less-than-rational bully. I'm asking you to choose for your sake, and for the sake of the argument you end up actually wanting to stand behind.


Imagine a herd of AI agents, indistinguishable from real humans, interacting with you and the content you create. Are you ok with it? Would you even care that they are not real people? Or maybe you'd even prefer them sometimes?

Because that's where we are headed.


If the AI agents are indistinguishable from humans then it doesn’t matter how you feel about them.


But wouldn’t you want to know that something you made affected actual people in meaningful ways? I’d hate to have LLM bots drown out real people with actual opinions and experiences, and never know if I was actually reaching anyone.


I’ve heard similar equivalencies in the past and I don’t know what the root cause of someone feeling this way is? Apathy?

I’ve heard coworkers argue that enjoyment from video games is equally valuable to enjoying time with your family or enjoying a walk. I’m a lifelong gamer and it’s still heartbreaking to hear people say the grass outside is no more valuable to them than what’s going on in a digital world.

What is going on with us?


> I’m a lifelong gamer and it’s still heartbreaking to hear people say the grass outside is no more valuable to them than what’s going on in a digital world.

Digital is just another part of the human experience, albeit much further removed from the surviving-in-the-savannah experience most of our ancestors evolved in.

For folks with significant limitations, screen-based experiences can be a huge enhancement and even a lifeline. A balanced life is such a subjective concept. IME it's easy and natural to judge, but not always fair or productive.

By all means, let's encourage people to try a wide variety of experiences in all available mediums. Yet without scolding or pity when they make choices we don't relate to.


Would you say the same about any other drug?

Not experiencing reality is a bit like dreaming no?


We all have a different experience of reality, even if differences are usually small. So the digital realm is an extension of reality. They aren't mutually exclusive. Ask any kid who's been cyberbullied.

Drugs alter one's perspective, so IMO a case can be made that they too are a legitimate part of life. Dreaming too.

What people choose to do with the short time they exist is up to them. I hope my kids have a happy and fulfilling life, ideally giving as much as they take and leaving the world better than they found it. Yet it's not my place to dictate what sources of fulfillment are more or less legit, or even what 'better' looks like. (Of course I still teach them some fundamentals, to think critically, to make informed choices, and to try many things.)


What I mean is if you can’t tell then it doesn’t matter because as far as you’re concerned you could believe you’re talking to an AI or believe you’re talking to a human and be right either way.

It’s kind of like the “brain in the vat” theory. Whether your brain is in a vat or not doesn’t really matter to you. If it is or if it isn’t is meaningless to you because there’s no difference as far as your existence is concerned.


At least until you do something that distinguishes them like try to sell merch or host a meetup.


Anyhow... all of us hate lies told by real humans. But you are okay with something that is not human pretending to be one?


Well, yes: I prefer interacting with a human person to doing so with a machine. Simply because of the value I assign to one and the other.


Forums linked to real-world activities, with people posting real pictures, are doing fine. For example I'm on several car forums and life is all good there.

Bonus when there are "verified members" zones: for example on a car forum where another owner of a car of the same brand has to vet you in, certifying he saw you in real-life with a car from that brand.

> Curation seems like the next best step.

I don't disagree.


Focus on in real life experiences, stop consuming low effort media.


I feel that there will always be people loving their craft. Maybe their price will go higher, but it also means they might work on more interesting projects, because the crap ones will be generated by AI. It also means that commercials and ads will be customized for each person using all the information gathered on that person, maybe every time. Dynamic video ads: AI will unlock this potential (and I hope AI will also unlock the potential to definitively filter itself).


> I've started to get back into reading books and long form blogs

How do you know the books and blogs you're reading weren't generated by an AI?


Well there's countless books before AI, before the internet, before the computer to read. Blogs are a bit tricky but I just have to trust that the blogs I follow are writers that love writing and don't care about making mass appeal content.


That was unavoidable. Your (and mine, btw.) reaction is healthy. I hope more people start to withdraw more from the virtual world. Once they do, many will recognize that it's become an addiction: the good feelings of discovering new things on the internet are long gone. What is left is the bad feeling when we don't distract ourselves with it. A typical sign of addiction.

Back to the topic: I'm still worried by the huge danger of mass delusion. Commercially, but much more politically. How easy will it be to create fake movies of war scenes? There have been fake photos around for a long time, made by setting up real scenarios and of course by "photoshopping". Now, there's not much left we can trust.

The only good way out of this is to learn to not trust anything you see on the internet or television. We need new ways of trusted communication.


You're just getting old, man.


> reading books

The harder part to me: "books" aren't great on average.

There's an ocean of garbage books basically mass produced by ghost writers, solely targeted at milking a socially hot topic or grifting.

There's an obesity epidemic? Let's make 3000 new books repackaging classic diets under new catchy titles and see which sticks. AI is coming? Let's throw money at anyone with a vaguely related blog and expand their 3 hot takes into 300 pages with a nice title and cover.

At the end of the day what we get in long form is often a stretched version of what was fitting into 3 tweets before they tried to monetize it in a book deal.


> The harder part to me: "books" aren't great on average.

If all you're looking for is any book, then sure. But if you have specific interests in mind, or you actually take the time to get beyond an HBR listicle of the "5 best books for [x] topic", you can find so many good books. Too many to read in a lifetime. It's just insane how many great books there are given how many books there are already and are published every year. Even just subscribing to a pub like the NYRB or LRB and reading the reviews/essays has made my TBR list explode with interesting books. Then you have the smaller academic or niche presses that are publishing some great, serious, academic or weirdo fiction stuff, stuff that has no marketing whatsoever.

> At the end of the day what we get in long form is often a stretched version of what was fitting into 3 tweets before they tried to monetize it in a book deal.

Those books are very easy to spot. You can safely ignore 99% of Business, Productivity, Nutrition (add fitness), and pop-science books, as you already noted. A lot of pre-internet books on those topics are fascinating, though, because they're usually written nothing at all like a blog post, and because they show that very little world-changing has been written on those topics in the last 30-40 years.


You're right. And then the same thing applies to any generic medium: looking beyond the bulk of it and focusing on the good parts will yield life changing discoveries.

I think the main points are whether there's an incentive to communicate, and whether filtering mechanisms can surface the interesting parts. For a long time, book deals and publications were basically the only venue to monetize ideas at scale.

Nowadays monetization can happen differently and across more diverse media. Also a well trained algorithm will often be better than having only human curation, all biases included.

All in all, I've learned more and read more research papers starting from online videos and discussions in the last decade than from any of the books I read in my entire life. That's where I see the obsession with a single medium (books) missing the mark.


There's more than a lifetime of good literature, good non-fiction, good genre fiction to read. No one is making you read dross.


It's the same for every medium though: good stuff will be good, dross will be dross. So why books in particular?


The drama on this site regarding AI is absolutely hilarious. You'd think this group would be excited about game changing technology but well over half the crowd is crying over it. You deciding the Internet is broken or whatever is a you thing. The rest of humanity will continue to use, enjoy, and create new interesting stuff with it. As if most shit on the Internet hasn't been mostly trash to begin with. Even in the late 90s most of it was useless. Learn to adapt or drop out and cry I guess. Get a hobby. These tears over mind blowing tech are pathetic.


Sounds like they struck a nerve to make you this aggressive... I think the tech is cool. I also think it's making the web worse, because people use that tech to abuse the system.


It's possible to accept that a new tool is incredibly technically impressive while expressing concerns over how it may be deployed. Now feels like the right time to be having those conversations.


Indeed, seems like a lot of people are getting old and conservative here ;)


It's cool and I use it sometimes. But I am jaded and have seen the next big thing come and go time and time again. I'll partake but I'm not going to be the one that eats the biscuit.


Please try to take a less aggressive tone on HN, lest this become reddit


> I'm hoping some non AI generative content branch of the internet will be created.

Social media has been filled with disinformation and bot generated slop for well over a decade. AI or not, the internet is turning into shit until we come up with ways to filter it out. Maybe we need to build a section of the internet where we use identity verification; the anonymous internet has all been turned suspect by malicious actors.


Sadly, something I've realized is that, as the junk gets more abundant, most people's taste seems to get worse.

It's not only because of AI, it was a pre-existing trend. Just look at music, books, movies (can't really call them films anymore). Humans are becoming as soulless as machines. The machines are shaping us in their image, human consciousness will be extinct long before humanity becomes extinct physically.


Is this a quote from Plato? Sure reads like something every single generation that aged out of the prime time has said.


The cost of 'quality' is going up and people don't fully realize that they can't afford it anymore because it is being hidden from them.

Put another way: anything that has benefited from Moore's law or the drop in energy prices (which itself benefits from Moore's law) has been deflationary, while anything that requires humans in the loop has become inflationary.

[1]:https://www.visualcapitalist.com/wp-content/uploads/2023/02/...

Music/Movies: In the future, the common person will have an endless amount of AI slop to meet whatever they feel like watching. Sounds like abundance, right? Well, the rich will have real human actors/singers do performances at their behest, and only they will be able to afford it.

Books: This is already happening: AI will provide the answer to any question, but people will not have the mental fortitude to derive the solution by learning the first principles manually. It's much easier for AI to provide you the direct answer instead of a breakdown of how it got to the answer and why. That requires enormous mental concentration (i.e. learning). As a result they will be further enslaved by the AI, and the few wealthy people will have access to humans that can truly teach them anything they want. An AI teacher for all is not the same as a human teacher.


Yeah it’s tough. Could get the people to curate the content. Bookmarks used to be a sacred thing. I’m sure if we band together, the best sites are already known to someone, somewhere.

But yeah, read a book. Practice poetry. Go analog. It’s very rewarding!


We need a more thoughtful internet


And yet here you are.


There's nothing wrong with using existing platforms to declare the problems of said platforms. To me the "and yet you are using the internet"-type replies sound rather defensive and shallow. The truth is, sometimes it is possible to dismantle the master's house using the master's tools. Do we not complain about our governments using the countries we live in or the tools provided by the government?

People who point out the deficiencies of technology sometimes must use said technology. Nothing wrong with that.


Mr Gotcha strikes again.

https://thenib.com/mister-gotcha/


Man, everyone is happy with these advancements, and they are impressive.

I’m here looking at users and wondering - the content pipelines are broader, but the exit points of attention and human brains are constant. How the heck are you supposed to know if your content is valid?

During a recent Apple event, someone on YT had an AI generated video of Tim Cook announcing a crypto collaboration; it had 100k users before it was taken down.

Right now, all the videos of rockets falling on Israel can be faked. Heck, the responses on the communities are already populated by swathes of bots.

It’s simply cheaper to create content and overwhelm society level filters we inherited from an era of more expensive content creation.

Before anyone throws the sink at me for being a Luddite or raining on the parade - I’m coming from the side where you deal with the humans who consume content, and then decide to target your user base.

Yes, the vast majority of this is going to be used to create lovely cat memes and other great stuff.

At the same time, it takes just 1 post to act as a lightning rod and blow up things.

Edit:

From where I sit, there are 3 levels of issues.

1) Day to day arguments - this is organic normal human stuff

2) Bad actors - this is spammers, hate groups, hackers.

3) REALLY Bad actors - this is nation states conducting information warfare. This is countries seeding African user bases with faked stories, then using that as a basis for global interventions.

This is fake videos of war crimes, which incense their base and overshadow the harder won evidence of actual war crimes.

This doesn’t seem real, but political forces are about perception, not science and evidence.


We are at a crossroads of technology, where we're still used to the idea that audio and video are decent proof that something happened, in a way in which we don't generally trust written descriptions of an event. Generative AI will be a significant problem for a while, but this assumption that audio/video is inherently trustable will relatively soon (in the grand scheme of things) go away, and we'll return to the historical medium.

We've basically been living in a privileged and brief time in human history for the last 100-200 years, where you could mostly trust your eyes and ears to learn about events that you didn't directly witness. This didn't exist before photography and phonograms: if you didn't witness an event personally, you could only rely on trust in other human beings that told you about it to know if it actually happened. The same will soon start to be true again, if it isn't already: a million videos from random anonymous strangers showing something happening will mean nothing, just like a million comments describing it mean nothing today.

This is not a brave new world of post-truth such as the world has never seen before. It is going back to basically the world we had before photo, video, and sound recordings.


That’s an interesting thought.

I think I would not like to live in a world in which democracy isn’t the predominant form of government. The ability of the typical person to understand and form their own opinions about the world is quite important to democracy, and journalism does help with that. But I guess the modern version of image and video heavy journalism wasn’t the only thing we had the whole time; even as recent as the 90’s (I’m pretty sure; I was just a kid), newspapers were a major source. And somehow America was invented before photojournalism, but of course that form of democracy would be hard for us to recognize nowadays…

It is only when we got these portable video screens that stuff like YouTube and TikTok became really important news sources (for better or worse; worse I would say). And anyway, people already manage to take misleading or out of context videos, so it isn’t like the situation is very good.

Maybe AI video will be a blessing in disguise. At some point we’ll have to give up on believing something just because we saw it. I guess we’ll have to rely on people attesting to information, that sort of thing. With modern cryptography I guess we could do that fairly well.

Edit: Another way of looking at it: basically no modern journalist or politician has a reputation better than an inanimate object, a photo or video. That’s a really bizarre situation! We’re used to consulting people on hard decisions, right? Not figuring out everything by direct observation.


I'd argue it's a step or two more manipulative. Not only do bad actors have the ability to generate moving images which many people believe by default, they also have the ability to measure the response over large populations, which lets them tune for the effect they want. One step more is building response models for target groups so that each can receive tailored distraction/outrage materials targeted to them. Further, the ability to replicate speech patterns and voice for each of your trusted humans with fabricated material is already commonplace.

True endstage adtech will require attention modeling of individuals so that you can predict target response before presenting optimized material.

It's not just a step back, it's a step into black. Each person has to maintain an encrypted web of trust and hope nobody in their trust ring is compromised. Once they are, it's not clear even in person conversations aren't contaminated.


> Further, the ability to replicate speech patterns and voice for each of your trusted humans with fabricated material is already commonplace.

Just like the ability to emulate the writing style of your trusted humans was (somewhat) commonplace in the time in which you'd only talk to distant friends over letters.

> Once they are, it's not clear even in person conversations aren't contaminated.

How exactly could any current or even somewhat close technology alter my perception of what someone I'm talking to in-person is saying?

Otherwise, the points about targeting are fair - PR/propaganda has already advanced considerably compared to even 50 years ago, and more personalized propaganda will be a considerable problem, regardless of medium.


The difference between artisanal work, vs mass production is enough to make it separate products.

The rate of production is incomparable, no matter what the parallels may seem to be.


I feel as though I am honor-bound to say that this isn't new and we haven't really been living in a place where we can trust in the way you claim. It's simply that every year it rapidly becomes more and more clear that there is no "original". You're not wrong, I just think it's important for people who care about such things to realize this is the result of a historical process which has been going on longer than we've all been alive. In fact, it likely started at the beginning of the 100-200 year period you're talking about, but its origins are much, much older than that.

read simulacra and simulation: https://0ducks.wordpress.com/wp-content/uploads/2014/12/simu...

or this essay from pre-war germany https://en.wikipedia.org/wiki/The_Work_of_Art_in_the_Age_of_...


Which was the era of insular beliefs, rank superstition and dramatically less use of human potential.

I feel that it’s not appreciated, that we are (were) part of an information ecosystem / market, and this looks like the dawn of industrial scale information pollution. Like firms just dumping fertilizer into the waterways with no care to the downstream impacts, just a concern for the bottom line.


It's not all the way back as long as solid encryption exists: Tim Cook could digitally sign his announcements, and assuming we can establish his signature (we had signatures and stamps 200 years ago) video proof still works.

So we're not going all the way back, but the era of believing strangers because they have photographic or video proof is drawing to a close.


Cryptography is nice here, but the base idea remains the same: you need to trust the person publishing the video to believe the video. Cryptography doesn't help for most interesting cases here, though it can help with another level, that of impersonation.

Sure, Tim Cook can sign a video so I know he is the one who published it - though watching it on https://apple.com does more or less the same thing. But if the video is showing some rockets hitting an air base, the cryptography doesn't do anything to tell you if these were real rockets or it's an AI-generated video. It's your trust in Tim Cook (or lack thereof) that determines if you believe the video or not.
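
To make the distinction concrete, here is a minimal sketch of what a signature does and does not check. It is only an illustration: the key, the byte strings, and the helper names `sign` and `verify` are all hypothetical, and an HMAC with a shared secret stands in for a real public-key signature purely so the example runs on the Python standard library.

```python
import hashlib
import hmac

# Hypothetical publisher key. HMAC with a shared secret stands in for a real
# public-key signature here, only so the sketch runs on the standard library.
PUBLISHER_KEY = b"hypothetical-publisher-key"


def sign(video_bytes: bytes, key: bytes = PUBLISHER_KEY) -> str:
    """Return a hex tag binding these exact bytes to the key holder."""
    return hmac.new(key, video_bytes, hashlib.sha256).hexdigest()


def verify(video_bytes: bytes, tag: str, key: bytes = PUBLISHER_KEY) -> bool:
    """True only if the bytes are unchanged since the key holder signed them."""
    return hmac.compare_digest(sign(video_bytes, key), tag)


# Placeholder byte strings standing in for two video files.
real_footage = b"camera footage of a product announcement"
generated_footage = b"AI-generated footage of rockets hitting an air base"

real_tag = sign(real_footage)
fake_tag = sign(generated_footage)  # the key holder can sign anything

print(verify(real_footage, real_tag))            # unchanged bytes verify
print(verify(real_footage + b"edit", real_tag))  # tampering is detected
print(verify(generated_footage, fake_tag))       # generated content verifies too
```

The last check is the point of the comment above: verification tells you who published the bytes and that they were not altered afterwards, not whether what they depict actually happened.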


All this talk of trust speaks to the larger issue here too - that we've lost so much trust in governments and other important institutions. I'm not saying it was undeserved, but it's still an issue we need to fix.


This is too much work for the human use case.

Practically speaking, no one is going to check provenance when scrolling through Reddit sitting on the pot.


Interesting thought. An alternative is a world where we can securely sign captured media.


That only really matters if it's hard to feed generated data into a camera/microphone that does this signing. It's not that hard already (you can just film a screen showing the generated video for a very basic version of this), and if there was significant interest, I'm sure it would become commoditized very quickly. Not to mention that any signing scheme is quickly captured by powerful states.


Before photography was invented, mass communications was all just words on paper, right?

How would you know that the British burned down the White House in 1814? Anyone could fake a paper document saying so. (Except many people were illiterate.)

As far as I can see you need institutions you can trust.


Everyone is focusing on photography.

1) It’s not the tech. It’s the rate of production. You had only 1 newspaper, no mass media, and boatloads of time in the 1800s.

2) Before photography was created we lived in a world steeped in superstition, inequality and ignorance. A tiny economy compared to what we have today.

3) Humanity changed with the advent of photography. It ushered in a new standard of proof that modern society depends on to this day.


If this was true, why haven’t we seen it with manipulated pictures?

Maybe I’m not well informed but there seem to be no examples of the issues you describe with photos.

I believe it’s actually worse than you think. People believe in narratives, in stories, in ideas. These spread.

It has been like this forever. Text, pictures, videos are merely ways to proliferate narratives. We dismiss even clear evidence if it doesn’t fit our beliefs and we actively look for proof for what we think is the truth.

If you want to "fight" back you need to start on the narrative level, not on the artifact level.


We have seen it with manipulation of pictures.

Hell - look at the fate of the rest of the world online. They’re basically thrown to the fucking wolves.

Minorities are lynched around the world after viral forwards. Autocrats are stronger than ever and authoritarian regimes have flourished like never before.

Trust and safety tools are vastly stronger for English than any other language. See the language resource gap (Lost in Translation, Nicholas and Bhatia).

In America, the political divide has reached levels unimaginable. People live in entirely different realities.

Images from the Democrats' side are dismissed as fakes, and lies take so long to discredit that the issue has passed by then, tiring fact checkers and the public.

The original fake news problem of Romanian advert farms focused entirely on conservative citizens.


Also, cost. How many do you have to generate to get something you want? Does it take 1 or 100 attempts to get something reasonable, and what does each attempt cost? Might not affect Hollywood, but someone has to pay for this to be profitable for Meta. How many 5-gigawatt power stations will be required (what OpenAI wants to build all over the country) if lots of people use this?


Hopefully this becomes the limiting factor, however generating more power isn’t that hard - and it doesn’t solve the rate of production issue.


The Information Bomb. There's a reason military types and spooks are joining the boards of OpenAI and friends.

https://www.goodreads.com/book/show/203092.The_Information_B...

> After the era of the atomic bomb, Virilio posits an era of genetic and information bombs which replace the apocalyptic bang of nuclear death with the whimper of a subliminally reinforced eugenics. We are entering the age of euthanasia.


There is some credence to the idea that the Third Reich was only possible due to mass media. Radio, television, and movie theatres broadcasting and rebroadcasting information onto a populace that did not have experience with media overload and therefore had no resistance to it.

Not attempting to justify their actions or the outcomes, just that media itself is and has long been known to be a powerful weapon, like the fabled story of a city besieged by a greater army, who opened their gates to the invaders knowing that the invaders were led by a brilliant strategist.

The invader strategist, seeing the gates open, deduced that there must be a giant army lying in wait and that the open gates were a trap, and so they turned and left.

Had they entered they would have won easily, but the medium of communication, an open gate before an advancing horde, was enough in and of itself to turn the tide of a pitched battle.

When we reach the point where we can never believe what we see or hear or think on our own, how will we ever fight?


It's just something to put ads next to. Selling ad spots is the business, and investors demand an increase even if they already have 3.5 billion pairs of eyeballs. https://www.404media.co/where-facebooks-ai-slop-comes-from/


This will just lead to people not taking videos as evidence anymore. Just like images of war crimes aren’t irrefutable evidence due to staging and photoshop, videos will lose their worth as evidence. Which is actually a good thing in some instances. If someone blackmails you with nudes/explicit videos, you can just ignore it and claim it’s fake.


The solution to that is to make models both open weight and open source. That will level the playing field.


How will that help? How will uncle Joe be able to tell fake videos better with an open source model?


Uncle Joe will just stop assuming that just because there’s a video it is real. That hasn’t been the case for decades. About time uncle Joe caught on.


So what’s the plan to level the playing field in that case? Give everybody an equal amount of compute and ask them what sort of propaganda they’d like to have theirs contribute to?


I only care about being able to express myself more easily

Maybe get a job where interviewers are biased against my actual look and pedigree

Just ignore everyone else’s use of the tool


> Just ignore everyone else’s use of the tool

That's precisely the hard part!


Yeah... African users... oh poor infantile, gullible, creatures... so incapable of discerning truth from falsehood are the ones to be fooled by generative AI...I get the gist


A state actor could have already done that manipulation using CGI or something. The answer is not to trust the people and persons who one sees as not to be trusted. As per your Israel example, I don’t personally trust them because I have low levels of trust in genocidal regimes, so even if IDF-asset Gal Gadot were to come to my door and tell me that I won a million dollars I would just slam the door shut in her face, never mind her and her ilk trying to convince me and people like me through videos posted on the internet of whatever it is they are trying to convince people of.

Again, plain common sense just works, most of the time.


Impressive.

It's only going to get better, faster, cheaper, easier.[a]

Sooner than anyone could have expected, we'll be able to ask the machines: "Turn this book into a two-hour movie with the likeness of [your favorite actor/actress] in the lead role."

Sooner than anyone could have expected, we'll be able to have immersive VR experiences that are crafted to each person.

Sooner than anyone could have expected, we won't be able to identify deepfakes anymore.

We sure live in interesting times!

---

[a] With apologies to Daft Punk: https://www.youtube.com/watch?v=gAjR4_CbPpQ


Are we only a few years away from movies made by one person or a small group, where the dialog, acting, location and special effects can be tweaked endlessly for a relatively low cost? If I were a studio exec I'd be worried.


I'm not sure. I see a common pattern with autonomous vehicles, text-to-image, and LLMs: the last 10% is hard to achieve.


I don’t know, for a car the last 10% has a direct relation with "people die" that is obvious to everyone. With a movie made in anyone's basement, the risks are far less likely to create such a vivid perception of a dramatic end result.

Not that cyber-bullying and usurpation schemes escalating to a whole new level are any less of a concern in the aftermath, to be clear.


Less about risk parallels and more about control parallels. The last 10% of fine-grained control over a system is hard. Like every time I’ve done text-to-image prompting and it gives me a great starting point, but I cannot get certain details I want, no matter how I ask.


The average person spends 9-11 hours per day consuming media depending on what source you look at. When people are playing games or browsing social media at the same time that they have the latest Netflix show on their TV, you can't tell me that this is really valuable time spent to deepen one's understanding of the human experience; it's a replacement for the human experience.

Most people will not notice if the soundtrack to a new TV show is made by a 5 word AI prompt of "exciting build-up suspense scene music" while they're pouring money into their mobile gacha game to get the "cute girl, anime, {color} {outfit}" prompt picture that is SSS rank.

You or I might not care for AI slop, but it's a lot cheaper to produce for Netflix or Zynga or Spotify or whatever, and if they go this route, they don't have to pay for writers, actors, illustrators, songwriters, or licensing for someone else's product. They'll just put their own AI content on autoplay after what you're currently watching, and hope most people don't care enough to stop it and choose something else.


If we judge from AI writing, we can extrapolate what an AI movie would be like. I cannot imagine reading an AI book. It would look and smell like a book but nothing of value or new insight would be inside. Michael Bay might be very interested.


You look back at old movies, and on a technical level they really aren’t as good as contemporary trash productions. But they knew how to weave the camera and a script into something amazing back then even if they didn’t have Resolve and After Effects to polish every shot. A good script writer, editor and cinematographer have a huge impact on the quality of a movie. But these roles are only a small portion of the operating budget of a movie. Filming every single scene is an exhausting undertaking and this constitutes the bulk of a movie production’s budget. If you can get good quality footage without leaving your garage then you can have a small team make a great movie. Maybe not to the extent where you simply click a button, but to the level that you would launch straight to a streaming service.


Yes, AI will probably fail miserably for a while at least, at making the sort of well written, artistic, clever movie that nobody watches. The only ones that need to be worried are the studios churning out massive blockbusters…


Michael Bay has said that he doesn't like AI.


I stand corrected. I should have remembered that organisms that occupy the same niche have the strongest competition.


If you look at the majority of their catalogue these days, they really aren't trying to squeeze that last 10% out of the movie quality these days anyways, so I doubt it will matter.


A 90% approximation of what somebody wants might be more interesting to that person, than a 99% approximation of what some studio exec wants.


Self driving cars are quite safe and ubiquitous if you're in the right cities


It’s true of everything


Yet VCs are sold that last 10% and an additional 10% on top. No idea why they keep throwing their money into the fire.


I'm grateful for this


Because VCs are compulsive gamblers, and they're convinced the payout if they "win" is enormous


to be fair, that's exactly what the asset class EXISTS for - betting on huge outcomes, no matter the odds. People misunderstand that due to how much of tech is "VC funded" when building stuff that would fare better as a bootstrapped company (or funded by other means)


I wouldn't be. How is any of this going to lead to meaningful art?

I don't think you get "The Green Mile" from something like this.


You don't get "The Green Mile" from this, because it's a tool. You get "The Green Mile" with artistic vision. The tool has to be told what to do. But now a director can shoot a film with actors who don't match the physical description of a character in a story, and then correct their race/gender/figure/whatever with AI. That probably means they save money on casting. A director can shoot a scene inside a blank set and turn it into a palace with AI. That saves money from shooting on location or saves money from having to pay for expensive sets.

So now a director with a limited budget but with a good vision and understanding of the tools available has a better chance to realize their vision. There will be tons of crap put out by this tool as well. But I think/hope that at least one person uses it well.

But because it will make shooting a movie more accessible to people with limited budgets, the movie studios, who literally gatekeep access to their sets and moviemaking equipment, are going to have a smaller moat. The distribution channels will still need to select good films to show in theaters, TV, and streaming, but the industry will probably be changing in a few years if this development keeps pace.


This is the best answer I've seen, but I think what was demonstrated is miles away from this. A lot would need to be able to be specified (and honored) from the prompts, far more than any examples have demonstrated.

I'm not against tools for directors, but the thing is, directors tell actors things and get results. Directors hire cinematographers and work with them to get the shots they want. Etc. How does that happen here?

Also, as someone else mentioned, there is the general problem that heavily CG movies tend to look... fake and uncompelling. The real world is somehow just realer than CG. So that also has to be factored into this.


I think it starts simple. Have you ever been watching a movie or tv show, and it shows the people walking up to the helicopter or Lamborghini and then cut to "they've arrived at their destination no transportation in sight"?

It will start out with more believable green screen backgrounds and b roll. Used judiciously, it will improve immersion and cost <$10 instead of thousands. The actors and normal shots will still be the focus, but the elements that make things more believable will be cheaper to add.

Have you ever noticed that explosions look good? Even in hobby films? At some point it became easy to add a surprisingly good looking explosion in post. The same thing will happen here, but for an increasing amount of stuff.


Interesting that you pick that example in particular. Due to the sheer depth of behind-the-scenes takes HBO has provided for Game of Thrones and House of the Dragon, it seems to be the consensus view among effects folks that CG fire and explosions are nearly impossible to get right and real fire is still the way to go.


"Why is it so hard to make fire look good in movies?" (New York magazine, October 2023)

https://www.vulture.com/article/movies-fire-computer-generat...

https://archive.is/u8Ugr


That I could believe, although... there is quite a bit of commentary from film buffs that lots of the stuff done in post doesn't quite look right, compared to older films.

Which doesn't mean it won't keep happening (economics), but it doesn't necessarily mean any improvement in movie quality.


It doesn't look right in a lot of older films either. Plenty of entertaining films were poor quality yet still made money and attracted audiences.


My guess is the art form will evolve. When YouTube started, some people thought it would not be able to compete with heavily produced video content. Instead, YouTube spawned a different type of "movie". It was short-form, filmed on phone cameras, lightly scripted, etc. The medium changed the content. I suspect we will see new genres of video content show up once this tech is widely available.


The first real movies made 100 years ago looked like something someone today could put together in their garage on a shoestring budget. AI-generated movies have existed for just two years, and are only going to get better. This is bleeding edge research, and I haven't seen any sign yet of AI models hitting a quality ceiling.


Shameless plug: I just created a short AI film (1) and tried to tell an actual story and trigger emotions. I spent countless hours crafting the script, choosing the shots, refining my prompts, generating images, animating images, generating music, sounds, and so on... For me AI tools are just that - tools. True, you have to yield them some "control", but at the end of the day you are still the one guiding and directing them.

Similarly, a film director "just" gives guidance to a bunch of people: actors, camera operator, etc. Do you consider the movie his creation, even if he didn't directly perform any action? A photographer just has to push a button and the camera captures an image. Is the output still considered his creation? Yes and yes, so I think we should consider the same with AI-assisted art forms. Maybe the real topic is the level of depth and sophistication in the art (just like the difference between your iPhone pictures and a professional photographer's) but in my opinion this is orthogonal to it being human or AI generated.

To be honest so far we have mostly seen AI video demos which were indeed quite uninteresting and shallow, but now filmmakers are busy learning how to harness these tools, so my prediction is that in no time you will see high quality and captivating AI generated films.

(1) https://artefact-ai-film-festival.com/golden-hours-66f869b36... Please consider liking it!


I have a newborn daughter, so watching this made me cry a little bit, alone at my desk.

If yall needed evidence of these tools giving everyday people the ability to make emotion-tugging creations, I'll send you a picture of the tears!

Now I'm thinking I can finally make the (IMO) dope music videos that come to me sometimes when I'm listening to a song I really love.


I got to the scene where there is a doctor visit (halfway through) and thought "NOPE, I'm not going through this[1] again" and closed it.

[1] The first minutes of UP


this is an excellent example that despite all the technical limitations (the ugly image artifacts, the lack of exact image consistency, etc), it's _already possible_ to create something that connects. The "format purists" currently dismiss AI tooling the same way they used to dismiss computer graphics animation back when Toy Story 1 came out.

Excellent work!


I'm in agreement with scudsworth here, but i have a little more nuance, i think. I know how long this took, and how many compromises were made. The only reason this works, at all, is because humans have a massive list of cultural memes and tropes that shorthand "experience" for us. it has the "AI can only generate 2-5 seconds of video before it goes completely off the rails" vibe; which allows it to fit in with the ADHD nature of most video production of the last 30 years - something a lot of people do not like. For an example where this was jarring in the video, when they're drawing or painting, you cut the scene slightly too late, you can see the AI was about to do some wild nonsense.

What happened to the mom? Why does the kid get older and younger looking? why does the city flicker in the beginning? which kid is his in the ballet performance? why do they randomly have "lazy eye"? i could keep going but i think we all get my point.

I can intuit the tropes used by the AI to convey meaning, and i'd be willing to list them all with relevant links for the paltry sum of $50. Be warned, it will be a very large list. Tropes and "memes" are doing 100% of the heavy lifting of this "art".

Sorry, human. As someone who stopped creating art on a daily basis due to market dilution (read: it's too hard to build a fanbase that i care about), i am very critical of most "art" produced anyhow.

this is dogshit.


Very well done! I’m not ashamed to admit that I cried.


I will take a look. Good to hear.


a shameless plug deserves an honest review: this is dog shit


That's not a review. It's an (probably honest) opinion stated like a fact.

I liked it.


sure it is. that's my critical evaluation of this work. if you liked it, i highly recommend the hallmark channel, lifetime, family channel originals, the netflix straight-to-vod swimlane, and a frontal lobotomy.


I, uh, gave some more nuance because i had some free time as a sibling comment to this. I hope we don't get downvoted because someone has to call a spade a spade.


good comment, haha. agree with those points and would add, since im thinking about it again now, that the entire work feels like a fairly (deeply) shallow riff on the opening sequence in pixar's "up". but of course with no stakes or emotional impact whatsoever.


> How is any of this going to lead to meaningful art?

Nearly all the movies that go to theaters aren't "meaningful art". Not only that but what's meaningful to you isn't necessarily what's meaningful to others.

If someone can get their own personal "Godzilla VS The Iron Giant" crossover made into a feature-length film it will be meaningful to them.


They are art compared to getting uncontrolled choices. Who decides what the actors look like? How they move? What emotions their faces are to convey? How the blocking works for a scene? What the color scheme is for the movie? How each shot is taken? Etc.

There is a vast difference between a formulaic Hollywood movie and some guy with a camera. If I say "Godzilla vs. The Iron Giant" what is the plot? Who is the good guy? Why does the conflict take place?

AI will come up with something. Will it be compelling even to the audience of one?

As a toy, maybe. As an artistic experience... not convinced.


> Who decides

You still aren't getting it. Movie directors aren't making these decisions either.

What they are doing, is listening to market focus groups and checking off boxes based on the data from that.

Market-focus-group-driven decisions for a movie are just as much, if not more so, of an "algorithm" than when a literal computer makes the decision.

That's not art. It's the same as if a human manually did an algorithm by hand and used that to make a movie.


This is a common perspective among people that don't realize how much goes into making a movie. That stuff informs which movies get approved and it certainly can inform broader script changes, casting changes, and in some cases editing decisions, but there's a UNIVERSE of other artistic decisions that need to be made. Implying that the people involved are mere technicians implementing a marketing strategy is exponentially more reductive than saying developers and designers aren't relevant to making software because marketing surveys dictate the feature development timeline. A developer's input is far more fungible than an artist working on a feature film.

I assure you, they don't do surveys on the punchiness and strategy used by foley artists; the slope and toe of the film stock chosen for cut scenes by the DP or that those cut scenes should be shot like cut scenes instead of dream sequences; the kind of cars they use; how energetic the explosions are; clothing selection and how the costumes change situationally or throughout the film; indescribably nuanced changes in the actor's delivery; what fonts go on the signs; which props they use in all of the sets and the strategies they use to weather things; what specific locations they shoot at within an area and which direction they point the camera, how the grading might change the mood and imply thematic connections, subtle symbolism used, the specifics of camera movements, focus, and depth of field, and then there's the deeeep world of lighting... All of those things and a million others are contributions from individual artists contributing their own art in one big collaborative project.


> Implying that the people involved are mere technicians implementing a marketing strategy

Well no. Instead, I am implying that they are as much of a "technician" as someone who is putting a huge amount of effort into making AI videos.

If you want to say that it is perfectly possible for someone to put a high amount of vision and make a large amount of creative decisions into AI videos, then I agree.

> All of those things and a million others are contributions

Yes I agree that there can be a million other contributions to making AI videos. Glad you agree too.


Your sarcastic, bad faith, I-know-you-are-but-what-am-I-level statements attack positions I don't hold, yet avoid addressing anything I said, all of which is based on my professional experience with using generative AI in media creation, and also film production pipelines and workflows. Your retort might have made you feel better about making baseless statements but it certainly didn't lend credibility to those statements or you in anybody else's eyes.


It's not sarcasm.

It's just that other people need to admit that yes it is possible for AI to include a lot of creative decisions.

You, being someone who claimed to care about that should be willing to admit that it is possible to do this.

I agree completely that creative input, such as the creative input that people can put into AI if they choose to do so, is important.

And just like in other forms of media, yes, sometimes people don't put in that creative input. But it's still possible if you put in the effort.

And since you didn't address this at all, I can only assume that you agree with me completely.


Never in my entire life have I said it's not possible for the AI image generation process to include a lot of creative decisions. In fact, I've repeatedly said the opposite. Like I said, you're attacking positions I don't have.

I use prompt-based generative AI in creative ways as part of my professional processes all the time. And you know what? They are nowhere close to being useful for generating anything that plays a significant on-screen role in high-end media creation. Anybody who says they are does not know how different the requirements for high-end use cases are.

You're using me as a proxy to toe the line of a ridiculous ideological polemic that I have nothing to do with. If you want to argue with someone that has the uniform, standard-issue set of anti-AI opinions you expect to encounter so you don't have to consider your arguments too much, there's like thousands of them right over on reddit. Easy.


Great, so when I said this:

"It's just that other people need to admit that yes it is possible for AI to include a lot of creative decisions."

Your response is actually, "yes, I agree with you. You are correct"

It sounds like you simply agree with me.


You seem to have forgotten what this conversation is about in your herculean effort to still feel right about saying something utterly ridiculous. You said this:

> You still aren't getting it. Movie directors aren't making these decisions either.

> What they are doing, is listening to market focus groups and checking off boxes based on the data from that.

> Market-focus-group-driven decisions for a movie are just as much, if not more so, of an "algorithm" than when a literal computer makes the decision.

> That's not art. It's the same as if a human manually did an algorithm by hand and used that to make a movie.

I responded as a professional in the field pointing out how ridiculous that take is, and then everything else that you said is putting words in my mouth based on opinions you assumed I had, but do not. I'm not arguing a "side" here-- I'm pointing out that something you said about practices in my field is entirely baseless. I have too much of a professional stake in this to pick a "side" because I actually have to deliver great work to spec, on time, and can't be bothered to field a whole bunch of people either with Dunning-Kruger confidence in their understanding of technology or Dunning-Kruger confidence in their understanding of art having the nerve to be condescending while making entirely baseless, glib comments about what I do for a living, and acting like the righteousness of their cause makes it ok to be full of shit. If you want to be able to argue a "side" where you're just vaguely responsible for your grand idea and it doesn't really matter if you're full of it because nobody else there knows what they're talking about, either, then reddit is a tiny little ASCII string down the street.


> Never in my entire life have I said it's not possible for the AI image generation process to include a lot of creative decisions. In fact, I've repeatedly said the opposite.

Hey well you said this. This is agreeing with what my point is, and I am glad we were able to clear up any possible misunderstandings, or I convinced you or similar.

As far as I am concerned you don't have any disagreements with my central point, which is good enough for me.

I am glad we cleared up the main miscommunication.


Cool. Nothing says intellectual confidence like refusing to acknowledge someone pointing out that you're spouting complete nonsense. Glad you were able to protect your ego by deciding I was talking about something that was more convenient for you. I sure hope you decide not to challenge your Dunning-Kruger confidence in backing up your ideas with "facts" and "information" based on the "content-aware fill" your brain uses in lieu of actual knowledge of movies as an artform and professional media production. I'm really happy that you think your "central point" means you can use naive assumptions lacking requisite information by orders of magnitude to condescendingly disparage real people's jobs and artistic practices. Surely, lacking knowledge of commercial art production doesn't negatively affect your ability to reason about the usage and effects of generative AI in commercial art production on both a practical and philosophical level. Surely. I hope you'll continue to pontificate about the finer points of this topic while refusing to consider dramatically more informed sources if they don't completely match your line of reasoning.


Some of it is done that way, but by no means all of it. You can easily see the differences, because, say, Wes Anderson movies are not the same as Martin Scorsese movies.

If it were really all just market decisions, directors would have no influence. This is not remotely the case. Nor are they paid as though that were true.


> This is not remotely the case.

Yes I agree that it is not remotely the case that AI videos involve zero creative input.

That's my point. If anything, there could be a lot more creative input into AI videos than a bog-standard Hollywood film.

> Some of it is done that way, but by no means all of it

Oh? So just like how that is the case some of the time for AI videos, but by no means for all of it? Yes, that's my point.


When you fabricate points and statements that work better with your counterpoint than what they actually said, it's a straw man argument. That's what you've got there. That man is definitely made of straw.


You'll never be able to talk to your friends about it. Culture wouldn't be a shared experience. We would all watch our own unique AI generated things.


More likely there will be cliches.


> Nearly all the movies that go to theaters aren't "meaningful art".

No but what they are is expensive, flashy, impressive productions which is the only reason people are comfortable paying upwards of $25 each to see them. And there's no way in the world that an AI movie is going to come anywhere close to the production quality of Godzilla vs Kong.

And like, yeah, their example videos at the posted link are impressive. How many attempts did those take? Are they going to be able to maintain continuity of a character's appearance from one shot to the next to form a coherent visual structure? How long can these shots be before the AI starts tripping over itself and forgetting how arms work?


My suspicion is that, if AI moviemaking actually becomes common, there will be a younger generation of folks who will grow up on it and become used to its peculiarities.

We will be the old ones going "back in my day, you had to actually shoot movies on a camera! And background objects had perfect continuity!" And they will roll their eyes at us and retort that nobody pays attention to background objects anyway.


My suspicion (and fear) is that poor members of the younger generation are going to grow up reading AI kids books and watching AI TV shows, and playing AI generated iPad games, and be less literate, less experienced, less rounded and interesting people as a result. This is already kind of a problem where under-served kids access less, experience less, and are able to do less and I see AI doing nothing but absolutely slamming the gas on that process and causing already under-served kids to be even more under-served. That human created art will be yet another luxury only afforded to the children of the advantaged classes.

And maybe they won't have a problem with it, like you say, maybe that'll just be their "normal" but that seems so fucking sad to me.


If poor kids of the future grow up reading AI book-slop instead of classic books that's going to be due to complicated factors of culture and habit rather than economic necessity. Most of the traditional Western canon of "great literature" is already in the public domain, available for free.

https://standardebooks.org/

For newer in-copyright works, public libraries commonly offer Libby:

https://company.overdrive.com/2023/01/25/public-libraries-le...

It gives anyone with a participating-system library card free electronic access to books and magazines. And it's unlikely that librarians themselves will be adding AI book-slop to the title selection.


> If poor kids of the future grow up reading AI book-slop instead of classic books that's going to be due to complicated factors of culture and habit rather than economic necessity.

To be clear, I'm not talking great literature. I'm talking Clifford the Big Red Dog type stuff.

That said I still have a number of problems with this assertion:

It will absolutely be down, in part, to economic necessity. Amazon's platform is already dealing with a glut of shitty AI books and the key way they get ahead in rankings is being cheaper than human-created alternatives, and they can be cheaper because having an AI slop something out is way less expensive and time consuming than someone writing/illustrating a kid's book.

Moreover, our economy runs on the notion that the easier something is to do, the more likely people are to do it at scale, and vetting your kids' media is hard and annoying as a parent at the best of times: if you come home from working your second job and are ready to collapse, are you going to prepare a nutritious meal for your child and set them up with insightful, interesting media? No, you're going to heat up chicken nuggets and put them in front of the iPad. That's not good, but like, what do you expect poor parents to do here? Invent more time in the day so they can better raise their child while they're in the societal fuckbarrel?

And yes, before it goes into that direction, yes this is all down to the choices of these parents, both to have children they don't really have the resources to raise (though recent changes to US law complicate that choice, but that's a whole other can of worms) and them not taking the time to do it and all the rest, yes, all of these parents could and arguably should be making better choices. But ALSO, I do not see how it is a positive for our society to let people be fucked over like this constantly. What do we GAIN from this? As far as I can tell, the only people who gain anything from the exhausted-lower-classes-industrial-complex are the same rich assholes who gain from everything else being terrible, and I dunno, maybe they could just take one for the team? Maybe we build a society focused on helping people instead of giving the rich yet another leg up they don't need?


> ...if you come home from working your second job and are ready to collapse, are you going to prepare a nutritious meal for your child and set them up with insightful, interesting media? No you're going to heat up chicken nuggets and put them in front of the iPad.

This is what I mean by "complicated factors of culture and habit." An iPad costs more than an assortment of paper books. Frozen chicken nuggets cost more than basic ingredients. But the iPad and nuggets are faster and more convenient. The kids-get-iPad-and-nuggets habit is popular with middle-income American families too, not just poor families where parents work two jobs. The economic explanation is too reductive.

I'm not trying to say that this is the "fault" of parents or of anyone in particular. When the iPad came out I doubt that Apple engineers or executives thought "now parents can spend less time engaging with children" or that parents thought of it as "a way to keep the kids quiet while I browse Pinterest" but here we are.


I was there (Apple) at the time. Absolutely did NOT expect that this thing, which Steve thought was a neat way to see the whole NYT front page at once, was going to be the defining MacGuffin of an entire generation of children.


I mean, that's the thing though. We now have had kids parked in front of iPads for a good amount of time, along with other technical innovations like social media, and we have documented scientific proof of the harms they do to children's self-esteem, focus and mental acuity. I don't think the designers of the iPad or even the engineers at Facebook set out to cause these issues, but. they. did. And now we have a fresh technology in the form of AI that whole swathes of "entrepreneurs" are ready to toss into more children's brains as these previous ones were.

Is it too much to ask for a hint of caution with regard to our most vulnerable populations' brains?


As a former iPad (OS) designer, and former Facebook feed engineer, of course we're upset about what happened. Most of us fought valiantly, with awareness, against what became the dark forces and antisocial antipatterns. But the promo-culture performance incentive system instituted by HR being based on growth metrics at all costs made all of us powerless to stop it. Do something good for the world, miss your promo or get fired.

Circa 2020 a huge number of fed-up good-intentioned engineers and designers quit. It had no effect, at all.


I'm genuinely sorry that happened to you. That had to be an absolute nightmare of an experience.

To be clear: I am not saying that engineers need to be better at preventing this stuff. I am saying regulators need to demand that companies be careful, and study how this stuff is going to affect people, not just yeet it into the culture and see what happens.


Shades of autotune.

But I have faith that people will notice the difference. The current generation may not care about autotune, but that doesn't mean another generation won't. People rediscover differences and decide what matters to them.

When superhero movies were new, almost everyone loved them. I was entranced. After being saturated with them... the audience dropped off. We started being dissatisfied with witty one-liners and meaningless action. Can you still sell a superhero movie? Sure. Like all action movies, they internationalize well. But the domestic audiences are declining. It makes me think of Westerns. At one time, they were a Hollywood staple. Now, not so much. Yes, they still make them, and a good one will do fine, but a mediocre one... maybe not.


> The current generation may not care about autotune

The previous generation's care about autotune was also flatly wrong. Autotune was used by a few prominent artists then and is more widely used now as an aesthetic choice, for the sound it creates which is distinctly not natural singing, as the effect was performed by running the autotune plugin at a much, much higher setting than was expected in regular use.

Tone correction occurs in basically every song production now, and you never hear it. Hell, newer tech can perform tone correction on the fly for live performances, and the actual singing being done on the stage can be swapped out on the fly with pre-recorded singing to let the performer rest, or even just lipsync the entire thing but still allow the performer to jump in when they want to and ad-lib or tweak delivery of certain parts of songs.

The autotune controversy was just wrong from end to end. When audio engineers don't want you to hear them correcting vocals, you don't hear it. I'd be willing to buy another engineer being able to hear tone correction in music, but if a layman says they do, sorry but I assume that person's full of shit.


There are a bunch of videos (e.g., Wings of Pegasus) on YouTube that cover pitch correction, and there are plenty of examples you can hear.


There's already conversation in AI art about how "Y'all will miss all these weird AI glitches when they're gone!" It will become the new tape hiss. Something people will nostalgically simulate in later media that doesn't have it naturally.


Looking forward to watching this post age like milk.


But what they're describing is a case where someone with the storytelling ability but not the money or technical skills could create something that looks solid.

You're imagining "pls write film" but the case of being able to film something and then adjust and tweak it, easily change backdrops etc could lead to much higher polish on creations from smaller producers.

Would The Green Mile be any less hard-hitting if the flickering lights were caused by an AI alteration to a scene? If the mouse was created purely by a machine?


I don't have a problem with adjusting small elements of the film, but that isn't going to make it a tool for youtubers (or home users off the grid) to tell their own stories.


You're unlikely to get an AI that wins accolades for the same reason that's unlikely with humans: they represent the absolute pinnacle of achievement.

The same AI can still raise the minimum bar for quality. Or replace YouTubers and similar while they're still learning how to be good in the first place.

No idea where we are in this whole process yet, but it's a continuum not a boolean.


What accolades? The Hollywood self-congratulatory conspicuous consumption festivals they use to show how good they are at producing "art" every year? The film festivals where billions of dollars are spent on clothing and jewelry to show off the "class" of everyone attending, which people like Weinstein used to pick victims, and everyone else uses as conspicuous consumption and "marketing" media?

Pinnacle is not the word I'd use. Race to the bottom, least possible effort, plausibly deniable quality, gross exploitation, capitalist bottom line - those are all things I'd use to describe current "art" awards like Grammy, Oscars, Cannes, etc.

The media industry is run by exploiting artists for licensing rights. The middle men and publishers add absolutely nothing to the mix. Google or Spotify or platforms arguably add value by surfacing, searching, categorizing, and so on, but not anywhere near the level of revenue capture they rationalize as their due.

When anyone and everyone can produce a film series or set of stories or song or artistic image that matches their inner artistic vision, and they're given the tools to do so without restriction or being beholden to anyone, then we're going to see high quality art and media that couldn't possibly be made in the grotesquely commercial environment we have now. These tools are as raw and rough and bad performing as they ever will be, and are only going to get better.

Shared universes of prompts and storylines and media styles and things that bring generative art and storytelling together to allow coherent social sharing and interactive media will be a thing. Kids in 10 years will be able to click and create their own cartoons and stories. Parents will be able to engage by establishing cultural parameters and maybe sneak in educational, ethical, and moral content designed around what they think is important. Artists are going to be able to produce every form of digital media and tune and tweak their vision using sophisticated tools and processes, and they're not going to be limited by budgets, politics, studio constraints, State Department limitations, wink/nod geopolitical agreements with nation states, and so on.

Art's going to get weird, and censorship will be nigh on impossible. People will create a lot of garbage, a lot of spam, low effort gifs and video memes, but more artists will be empowered than ever before, and I'm here for it.


> What accolades?

Any accolades, be that professional groups, people's awards, Rotten Tomatoes or IMDb ratings.

> Race to the bottom, least possible effort, plausibly deniable quality, gross exploitation, capitalist bottom line - those are all things I'd use to describe current "art" awards like Grammy, Oscars, Cannes, etc.

I find them ridiculous in many ways, but no, one thing they're definitely not is a race to the bottom.

If you want to see what a race to the bottom looks like, The Room has a reputation for being generally terrible, "bad movie nights" are a thing, and Mystery Science Theater 3000's schtick is to poke fun at bad movies.

> The media industry is run by exploiting artists for licensing rights.

Yes

> The middle men and publishers add absolutely nothing to the mix. Google or Spotify or platforms arguably add value by surfacing, searching, categorizing, and so on, but not anywhere near the level of revenue capture they rationalize as their due.

I disagree. I think that every tech since a medium became subject to mass reproduction (different for video and audio, as early films were famously silent) has pushed things from a position close to egalitarianism towards a winner-takes-all. This includes Google: already-popular things become more popular, because Google knows you're more likely to engage with the more popular thing than the less popular thing. This dynamic also means that while anyone will be free to make their own personal vision (although most of us will have all the artistic talent of an inexperienced Tommy Wiseau), almost everyone will still only watch a handful of them.

> Art's going to get weird, and censorship will be nigh on impossible.

Bad news there, I'm afraid. AI you can run on your personal device is quite capable of being used by the state to drive censorship at the level of your screen or your headphones.


Extremely-heavily-CG movies already mostly look like shit compared to ones where they build sets and have props and location shoot, even if somewhat assisted by computer compositing and such (everything is, nowadays). [edit: I don’t even mean that the graphics look bad, but the creative and artistic choices tend to be poor]

The limitations of reality seem to have a positive effect on the overall process of film making, for whatever reason. I expect generative AI film will be at least as bad. Gonna be hard to get an entire well-crafted film out of them.


I'd love to see what'd happen if someone dumped the entire text of the Silmarillion or the Hobbit into one of these models. Assuming context windows and output capacity become large enough.


Especially primed by all the Lord of the Rings movies, for example; I could see the studio taking all the archived footage, camera angles, all the extra data that was generated in the creation of the films and feeding that into something like this model to create all kinds of interesting additional material.


> I wouldn't be. How is any of this going to lead to meaningful art?

Local art, local actors, local animations telling stories about local culture. A Netflix for every city, even neighborhoods. That's going to be crazy fun.


There are plenty of great outsider storytellers and artists. YouTube is proof of this. People mostly do comedy on YouTube because that's what the medium supports on a low budget, but AI is going to change that.


I'm really not seeing how that would happen from these examples. It would seem like achieving an adequate, directorial level of control would require writing a novel -- or, anyway, more than a conventional screenplay -- to get the AI to make the movie you wanted.

There is so much that has to be conveyed in making a film, if you want it to say something particular.


> It would seem like achieving an adequate, directorial level of control would require writing a novel -- or, anyway, more than a conventional screenplay -- to get the AI to make the movie you wanted.

And? What's the problem with that? You seem to be locked in a "prompt to get a movie" mindset.


Those are the examples provided. When they deliver pro tools for generating movie clips, I will be more convinced, but that hasn't remotely happened yet.


From this and other comments, I get the impression that most on HN assume the tool will be used exclusively by people without any sort of artistic talent, either plagiarizing existing works and/or producing absolute dreck.

However, I see an interesting middle ground appear: a talented writer could utilize the AI tooling to produce a movie based upon their own works without having to involve Hollywood, both giving more writers a chance to put their works in front of an audience as well as ensuring what's produced more closely matches their material (for better or worse).


Looking back on history I think this will lead to meaningful art (and tons and tons of absolute garbage!).

The printing press led to publishing works being reachable by more people so we got tons of garbage but we also got those few individual geniuses that previously wouldn't have been able to get their works out.

I see similarities in indie video/PC games recently too. Once the tech got to the point that an individual or small group could create a game, we got tons of absolute garbage but also games like Cave Story and Stardew Valley (both single creators IIRC).

Anything that pushes the bar down on the money and effort needed to make something will result in way more of it being made. It also hopefully makes it possible for those rare geniuses to give us their output without the dilution of having to go through bigger groups first.

I'm also excited from the perspective that this decouples skills in the creative process. There have to be people out there with tremendous story telling and movie making skills who don't have the resources/connections to produce what they're capable of.


The printing press enabled the artistic visions of single individuals (the writers) to find a wide audience.

To do something similar, this has to allow the director (or whoever is prompting the AI) to control all meaningful choices so that they get more or less the movie they intend. That seems far away from what is demonstrated.


> How is any of this going to lead to meaningful art?

It's a powerful tool. A painting isn't better because the artist made their own paint. A movie made with IRL camera may not be better than one made with an AI camera.


No, what makes art are choices (and execution). But the examples given were too general and didn't exercise much control over the choices.


The examples given aren't trying to be artistic. They're demonstrating technical capabilities as simply as possible.


We already exist in a world where most of the revenue for film companies comes from formulaic productions. Studio execs certainly worry about how they are going to create profits in addition to any concerns about the qualitative cultural value of the films.


Even formulaic movies from Hollywood have directors and actors doing a million things the AI will do randomly unless you tell it otherwise.


Who says even a majority of the content you see online is meaningful art?

The algorithms and people making content for the algorithm were trends that have dominated for years already.

None of that is "real" art, when you are just making something optimized for an algorithm.


> I don't think you get "The Green Mile" from something like this.

But maybe you do get Deadpool & Wolverine 3

Guess where the money is?


If it becomes easy to make "Deadpool & Wolverine", it will no longer be where the money is. Everything that becomes a commodity attracts competition and ceases to be special. (You can see some of that in super-hero movies, which have started to be generic and lost some of their audience.)

But, in reality, even making that kind of film is miles away from these examples.


> If it becomes easy to make "Deadpool & Wolverine", it will no longer be where the money is. Everything that becomes a commodity attracts competition and ceases to be special. (You can see some of that in super-hero movies, which have started to be generic and lost some of their audience.)

Well, given the studios still hold the copyright, they can severely constrain supply to keep profits up.

My suspicion is that this kind of stuff gradually reduces some of the labor involved in making films and allows studios to continue padding their margins.


> I don't think you get "The Green Mile" from something like this.

...yet


If I was a human I'd be worried.


These are tools by and for humans


So are nukes.


I guess cats probably think we are tools for feeding them...


Or this is the top, and the only thing AI will be able to generate is boring and uninspiring clips.

Ever notice how they never show anyone moving quickly in these clips?


> If I was a studio exec I'd be worried

Counterpoint: home "studio" recording has been feasible for decades, but music execs are not ruffled. Sure, you get a Billie Eilish debut album once a generation, but the other 99.99% of charting music is from the old guard. The media/entertainment machine is so much bigger than just creating raw material.


Studio execs don't do any of that stuff anyway. It's the long list of people in the movie credits who should be worried.


I doubt it, and if we were, no one would be earning money anymore and no one would have the cash to pay for the cost of running these services.


Does anyone other than PMs thinking up user stories do shit like this or find this kinda stuff desirable? It just sounds like a business person who doesn't have a life other than selling their product trying to think up "real user" use cases every time.


Frankly, yes.

Many creative works these days require the effort and input of so many people, so much time, and so much money that they can't have a specific creative vision. Mediums like books, comics, indie movies, and very low budget indie games, where the end product was created by the smallest number of people, have the most potential to be interesting and creative. They can take risks. This doesn't mean they will be good, most aren't, but it means that the range of quality is much broader, with some having a chance to shine in ways which big budget projects just can't. The issue with small teams and small budgets is that they are inherently limited in what they can create. Better tools allow smaller groups of people to make things that previously would have required an entire studio but without diluting the creative vision.

Will this also result in a tidal wave of low effort garbage? Of course it will. But that can be ignored.


Nope. This is just like when cryptobros would regularly insist that cryptocurrencies would replace banks by the end of the decade. It's safe to assume that anyone who makes such wild predictions is a bagholder who stands to gain financially from said wild predictions coming true, even though they never will.


This sounds gimmicky and worth watching once or twice, then forgetting about. Worthwhile art will continue to be created from a specific person's/group's vision, not an algorithmically generated sum of personal preferences.


I rarely watch movies or read books twice anymore. There's too much content already. The challenge with purely human art at this point is that it will be silenced by the perpetual flood of half-assed generated work. There will be room in elite art circles for more, but at some point the generated stuff will be so ubiquitous (and even meaningful) that anyone without connections is going to have a tough time building an audience for their handcrafted work, unless it happens to be particularly controversial or 'difficult' to make. The demand for visual stimulus will be satisfied by hypertuned AI models. Generative AI is not there quite yet but there's no reason to think it won't be better than 90%+ of purely human content within a decade given the pace of development over the last few years.


I don't buy this narrative at all. People like people and increasingly follow artists because of their personality and overall "brand." No one cares about generated AI art or its creator(s), because it's not interesting. It's also not sharable with other humans; see, for example, the frenzy around going to a Taylor Swift concert. The mass appeal and shared interest is part of the draw.

At best, you'll get something like a generic sitcom. The idea that "all visual stimulus will be satisfied by hypertuned AI models" doesn't line up with how people experience the arts, at all.


That may be the case today but kids are starting to grow up with this stuff as part of their lives, and I don't think we can anticipate the reaction as both they and the models grow in tandem. I think human creativity is much deeper than LLMs, but that is from my human perspective and I can't fully rule out that the LLMs may become better at it at some point in the future. I actually think they're already smarter and more creative than most people (though not more than the potential of any given human if they practiced/trained thoroughly).


I fully agree here. I want to be part of an audience, and as part of that audience I always look at the human development of the things to share - artifacts in the case of fine art, or experiences in the case of performative art. The artist will always be more important than their work to me.

I don't want to carry mechanical solutions labelled culture - deterministic enough, despite hallucinations - into the next generation that follows my own. It's an impressive advancement for automation, sure, but just not a value worth sharing as human development.

That being said, I think GenAI could be a valuable addition in any blueprint-/prototype-/wireframing phase. But, ironically, it positions itself in stark contrast to what I would consider my standards for contemporary brainstorming, considering the current Zeitgeist:

  - truthful to history and research (GenAI is marketing and propaganda)
  - aware of resources (GenAI is wasteful computing)
  - materialistic beyond mere capitalistic gains (GenAI produces short-lived digital data output and isn't really worth anything)


Exactly, there will be a lower barrier to entry, but making content that stands out will require the same (or more) effort.


You’ll never be able to tell them apart.


It depends on how shallow your understanding of media is.

I'm sure this can be used to create entertaining movies that are fun and wacky. I don't think it can create impactful movies.


I think that’s an extremely short-sighted perspective. There isn’t much that separates a “fun and wacky” movie from something impactful from a cinematography perspective. With the right music, ambience and script you could absolutely do any genre of movie you wish to.


I disagree, I believe your perspective is short-sighted. If you really think what makes a movie "impactful" is the music, ambience, and script then I don't think you have much media literacy.

It's no more ridiculous than saying what makes a painting impactful is the brush strokes. But if I copy Picasso's work stroke for stroke, why am I not Picasso? After all, the dumbass paints like a child, admittedly! How could someone like him ever be considered a great painter?


You forget that there's a human behind the prompt, stitching frames and dialogue together.


If they are stitching then I would consider that a form of art.

However, merely describing something is not doing the thing. Otherwise, the business analysts at my company would be software engineers. No, I make the software, and they describe it.

The end-goal here is humanless automation, no? Then I'm not sure your assumption holds up. If there's no human, I question the value.


> If there's no human, I question the value.

You may question the value but if it’s anything like rugs you won’t be in the majority. People pay a significant premium for artisanal handmade rugs but that being said, more than 95% of the rugs people use are machine made because they’re essentially indistinguishable from a handmade one and are much, much cheaper and just as functional.


No I don't think it'll be anything like rugs


I don't agree. While "just" audio, I've made a few AI songs that have made people tear up and triggered strong emotions.

I think you can do this with video too, just more challenging right now.


I'm sure eventually you can, but I don't think triggering emotions is the correct "KPI" so to speak.

On social media platforms, typically the most popular content triggers the strongest emotions. It's rage-bait however, or sadness bait, or any other kind of emotional manipulation. It tricks the human mind and drives up engagement, but I don't think that is indicative of its value.

To be clear, I'm certain that's not what you're doing, and the music is good. But I think it's complicated enough that triggering emotions isn't enough data to ascertain value.

I don't know, exactly, what combination of measurements are needed to ascertain value. But I'm confident human-ness is part of the equation. I think if people are even aware of the fact a human didn't make something they lose interest. That makes the future of AI in entertainment dicey, and I think that's what fuels the constant dishonesty around AI we're seeing right now. Art is funny in how it works because, I think, intention does matter. And knowledge about the intention matters, too. It maybe doesn't make much sense, but that's how I see it.


Right now there is a ton of stigma around AI art. That stigma fuels a ton of poorly-informed rhetoric against it. There is also tons and tons of casual use of AI art being shitposted for funsies everywhere that reinforces that rhetoric that AI art means "Push button. Receive crap. Repeat."

Meanwhile, as someone who has been engaged with the AI art community for years, and spent years volunteering part-time as a content moderator for Midjourney, the process of creating art via AI with intentionality is deeply human.

As an MJ mod, I have seeeeeeen things.... It's like browsing through people's psyche. Even in public portfolios people bare their souls because they assume no one will bother to look. People use AI to process the world, their lives, their desires, their trauma. So much of it is straight-up self-directed art therapy. Pages and pages, thousands of images stretching over weeks, sometimes months, of digging into the depths of their selves.

Now go through that process to make something you intend to speak publicly from the depth of your own soul. You don't see much of that day to day because it is difficult. It's risky at a deeply personal level to expose yourself like that.

But, be honest: How much deeply personal art do you see day to day? You see tons of ads and memes. But, to find "real art" you have to explicitly dig for it. Shitposting AI images is as fun and easy as shitposting images from meme generators. So, no surprise you see floods of shitposts everywhere. But, when was the last time you explicitly searched out meaningful AI art?


> But, be honest: How much deeply personal art do you see day to day?

You bring up a good point - very little. But, to be fair, those people aren't necessarily trying to convince me it's art.

I think you're mostly right but I am a little caught up on the details. I think it's mostly a thing of where the process is so different, and involves no physical strokes or manipulation, that I doubt it. And maybe that's incorrect.

However, I also see a lot of people who don't know how to do art pretending like they've figured it all out. I also see the problem with that. It wouldn't be such a problem if people didn't take such an overly confident stance on their abilities. I mean, it's a little offensive for that guy mucking around for an hour to act like he's da Vinci. And maybe he's a minority, I wouldn't know, I don't have that kind of visibility into the space.

I think a lot of the friction comes from that. Shitposts are shitposts, but I mean... we call them shitposts, you know? They, the people that make them, call them shitposts. There's a level of humility there I haven't necessarily seen with "AI Bros".

I think, if you really love art, AI can be a means to create a product but it can also be a starting point to explore the space. Explore styles, explore technique, explore the history. And I think that might be missing in some cases.

For a personal example, I'm really into fashion and style. I love clothes and always have. But it's really been an inspiration to me to create clothes, to sew. I've done hand sewing, many machine stitches too. And I don't need to - I could explore this in a more "high-level" context, and just curate clothing. But I think there's value in learning the smaller actions, including the obsolete ones.


Check out

https://x.com/ClaireSilver12

https://www.clairesilver.com/collections

from the POV of fashion illustration. Her "corpo|real" collection took something like 9 months to create and was published nine months ago.


"Turn this book into a two-hour movie with the likeness of [your favorite actor/actress] in the lead role."

This sounds marginally above fanfiction, so I do think it'll be very easy to tell them apart. "Terminator, except with Adam Sandler and set on Mars" is a cute, gimmicky idea, not a competitor to serious work.


Well, yeah. If you explicitly try to come up with a cute, gimmicky idea, it's not going to be serious. Taste still matters regardless of paint, cameras or computers.


Maybe, but I guarantee you this is going to get banned in the US for "safety" or "misinformation" reasons eventually (with large backing from Hollywood).


You will if you go see someone pick on a guitar at an open mic!


Isn't a specific person's vision basically their personal preferences?


Not really. A vision implies a particular kind of project, presumably created by someone with expertise and some well-thought-through ideas about what it ought to be. Personal preferences just mean that someone likes X qualities.

To use a real-world example: if the Renaissance-era patrons had merely written down their preferences and had work made to match those preferences, it's highly unlikely that you'd have gotten the Mona Lisa or David.

Which is to say that, there will definitely be some interesting and compelling art made with AI tools. But it will be made by a specific person with an artistic vision in mind, and not merely an algorithm checking boxes.


You sound like two minute papers.


In enthusiasm perhaps, but when I play that in my own head with the voice of Dr. Károly Zsolnai-Fehér, he doesn't script his videos like that. I can't recall a single instance of him using triple repetition like that.


what a time to be alive.


But mostly just porn.


I think AI porn is overhyped. We've had the ability to create realistic photos and short vids for over a year now and onlyfans creators are still doing fine. AI porn is just a niche for stuff that humans don't want to perform.


I think AI porn is chock-full of fascinating moral quandaries. It kind of transcends all other types of GenAI for the amount of hard questions it asks society.

As they say, porn is always the leading spear of technology. It's something to keep an eye on (no pun intended) to understand how society will accept/integrate generated content.


It's definitely interesting from a moral and tech perspective.

However, commercially it seems like a niche within the existing structures of porn. Mostly competing with the market for animated stuff. At least that's where it is right now, and it's already at photorealistic parity with human content creators.


'still'

We have not had the ability to create interesting AI porn vids yet. How would we? Meta just showed Movie Gen.

But I'm pretty sure the short clips of a woman moving very subtly might have worked for one person or another. Just wait a little bit until you can really create what you are looking for.


You obviously have not spent any time on civit.ai if you're saying that.

What scares me most is that in my opinion, by far the best prompt writers are the ones who are deeply "motivated" and "experienced" with prompting. Often the best prompters have only one hand on the keyboard at a time.


I'm aware of civit.ai, but I'm talking about short clips. I rechecked and didn't find short AI clips that show more than a model moving a little bit.


HN doesn't feel like the right place to share links, but have a look at what's available on fanvue.

I'm not sure exactly what models the account owners use, but I think it's a mix of Stable Diffusion video touched up with Adobe tooling.


You have no idea how much butthurt there is, specifically from artists who draw, about the AI NSFW models that exist.

I can trivially fine-tune and create more art from certain artists in an hour than they have produced themselves in their whole careers. This makes a lot of people very upset.


> Sooner than anyone could have expected, we'll be able to ask the machines: "Turn this book into a two-hour movie with the likeness of [your favorite actor/actress] in the lead role."

I've been doing this with ChatGPT, except it's more of a "turn into a screenplay" then "create a graphic of each scene" and telling it how I want each character to look. It works pretty well but results in more of a graphic novel than a movie. I've definitely been waiting for the video version to be available!


this user is a prime example of "consoooooooooooooom!!!!!!!!!!"


We are super cooked! I love the future!!!!


It looks impressive, yet I'm not feeling very impressed. If only I could get as high as you do from watching those demo reels.

Out of curiosity, what is it that people do with these things? Do they put them on TikTok?


> It looks impressive, yet I'm not feeling very impressed.

The demos were made by nerds (said with love) with a limited time window. Wait until the creatives get a hold of the tool.


that sounds awful. we need to start asking ourselves just because we can, do we need to fulfill all of our prurient desires?


"we will be able to" -> "someone with a financial interest in my believing them says that we will be able to"


You know it’s going to happen:

“I want a funny road trip movie starring Jim Carrey and Chris Farley, based in Europe, in the fall, where they have to rescue their mom, played by Lucille Ball, from making the mistake of marrying a character played by an older Steve Martin.”

10 minutes later your movie is generated.

If you like it, you save it, share it, etc.

You have a queue of movies shared by your friends that they liked.

Content will be endless and generated.


I don't know about you, but my friends and family are boring af. I wouldn't want to watch their queue of noise.

I do hope that more talented people will have more leverage to create without the traditional gatekeeping, but I also doubt this will happen as the gatekeepers are all funding AI tooling as well.


The content bubble apocalypse, where no one is ever watching the same thing and we lose all cultural connections to each other. At least until someone figures out an algorithm/prompt to influence the content, yvan eht nioj style.


The opposite will occur. Very little will change from how people consume content today. There won't be endless amounts of quality content, there will still be very little high quality content. There will be brief bursts of large amounts of garbage that nobody pays attention to (as a small percentage of people flirt with generative media and quickly lose interest; and the vast majority never bother at all).

The extreme majority will all watch the same things just as they do today. High quality AI content will be difficult to produce and will be nearly as limited in the future as any type of high quality content is today. The masses will stick to the limited, high quality media and disregard the piles of garbage. Celebrity will also remain a pull for content, nothing about that will ever change (and celebrity will remain scarce, which will assist in limiting what the masses are interested in).

By and large people only want to go where other people are at. Nothing about AI will change that, it's a trait that is core to humanity. The way that applies to content is just the same as it does a restaurant: content is a mental (and sometimes physical) destination experience just as a restaurant or vacation trip is.


I think both things will be true. We will enjoy common media that everyone else enjoys, not because of its high quality, but because it's shared within our social circles. And we will also generate highly personalized media for our own enjoyment, because we will have full control over it. The quality of this media won't necessarily be "garbage", and will likely be on par with professional productions. It will just be much more personal than anything a professional team could create for us.

Though a reason we would gravitate towards common media more is if what someone brought up in the comments here comes to pass, and celebrities/actors license their likeness to studios only, and amateur tools are not licensed to use them. Though I think there will always be crafty/illegal ways around this. Also, likeness probably won't be worth much, if we can generate any type of character we like anyway. I, for one, couldn't be happier for celebrities and the cultural obsession around them to disappear.


Rich people will afford the subscription costs for curated and verifiable content.

Plebs will get the mass produced stuff, just like it has been for junk food.

In the information case, even if you wanted to sell good quality, verifiable content, how are you going to keep up with the verification costs, or pay people when someone can just dupe your content and automate its variations?

People who are poor don't have the luxury of time, and verification cannot be automated.

Most people don't work in infosec or Trust and Safety, so this discussion won't go anywhere, but please just know - we don't have the human bandwidth to handle these outcomes.

Bad actors are more prolific and effective than good ones, because they don't have to give a shit about your rules or assumptions.


It'll require tens to hundreds of hours to script the flow of the AI content, to edit, make adjustments, clean it up, make the scenes link together smoothly, fix small glitches. Even with far more advanced AI, it won't come together like a movie people would enjoy watching, without vast human labor involved.

Sub one percent of people are going to be willing to put in the hours to do it.

The bulk of the spammed content will be: the masses very briefly playing with the generative capabilities, producing low quality garbage that after five minutes nobody is interested in, and then the masses will move on to the next thing to occupy a moment of their time. See: generative image media today. So few people care about the crazy image creation abilities of Midjourney or Flux that you'd think they didn't exist at all (other than the occasional related headline about deepfakes and/or politics).


Much of those editing steps could be streamlined and/or straight up automated so that estimate will come way down over time


> Generation failed: I'm sorry, but I cannot generate videos of real people. Please try another prompt.


Also no violence, or alluding to conspiracies, or historical events.


My girlfriend and I already sometimes use Suno like this for music. Just generate a bunch of songs under a specific genre (our favorite right now is nordic folk, dubstep) and just listen through. If we want to learn about or remember something, we make a Suno song about it. The songs are almost all bangers too so it's not even a chore to listen through!


Rick & Morty introduced the concept of interdimensional cable in 2014. Ten years later, it's a reality. Crazy stuff.


Unless I'm missing something, this technology's harmful potential outweighs the good. What is the great outcome from it that makes society better? MORE content? TikTok already shows that you can out-influence Hollywood/governments in 10 seconds with your smartphone. Heck, you can cause riots through forwarding text messages on WhatsApp [1]. Not everything that can be done should be done, and I think this is just too harmful for people to work on. I wish we'd globally ban it.

[1] https://www.dw.com/en/whatsapp-in-india-scourge-of-violence-...


I agree with Neil deGrasse Tyson: AI will kill the Internet. https://www.youtube.com/watch?v=SAuDmBYwLq4

Though maybe there's hope if..

1. All deepfake image & video tech is required to add watermark labels, and all websites that publish the output are required to label it as fake too.

2. Crazy idea, but a govt-issued Internet ID (ID.me is closest to that now, with having to use it to file taxes with the IRS) where your personal reputation and credit score are affected by publishing fake/scam/spam crap on the Internet, effectively helping to destroy it. I want good actors on the web, not ones that are out for a buck and in turn destroying it.


1. will never happen. It’s way too interesting for people not to try to make an open source version where the watermark can easily be removed by users. Unless you actually criminalize it and put people in jail for multiple years for building anything close to deepfakes, you won’t be able to prevent that.


That's why website uploaders need to read the metadata and find out how the content was originally generated, then publish and label it as AI-generated if its first creation was such.


And how would you actually enforce that? What would happen if I, as a private person, AI-generated something on my computer and uploaded it without the metadata? Would I go to prison?


governments need to enforce that into all upload(ing) tech (web browser builders, apple's iphone sdk, androids, etc) and require all websites/apps to publish/label the metadata showing AI generated or not.


> TikTok already shows that you can out-influence Hollywood/governments in 10 seconds with your smartphone

I can accept the premise that TikTok is trying to do this. Do we have any objective measurement on how effective it has been?


There are literally non-stop articles and studies on misinformation of every category. The evidence is beyond abundant.

I’m not suggesting TikTok themselves is trying to do this, but it (and twitter, instagram, facebook, etc etc) is shaping people’s world views.


My sense is that there is abundant evidence of something, but I'm unable to judge the holistic effect size and direction.

My default perspective is that because humans are so adaptable, every technology shapes our world views. TikTok and Instagram impact us, but so does the plow and shovel. We have research that shows IG harming self-image in some segments of teen girls; what I have not seen evaluated much is how Youtube DIY videos bring self-esteem through teaching people skills on how to make things. These platforms also connect people - my wife had a very serious but rare complication in pregnancy, and her mental health was massively improved by being able to connect with a group of women who had been through/were going through something similar.

My overall point is that it's not very interesting to me to say that technology shapes our world views. Which views? In which way, to what extent? Is it universal, or a subpopulation? Are there prior indications, or does it incept these views? Which views? How much good or harm? How do we balance that?

But what we are left with is a very small view through the keyhole of a door into a massive room that is illuminated with a flickering flashlight. We then glom onto whatever evidence supports our biases and preconceptions, ignoring that which is unstated, unpopular, or violates our sense of the world.


It's already bad, but the amount of garbage that is going to flood YouTube now is going to make it unusable.


No, it won't. Youtube is kind-of like Reddit or HN. "Good" stuff floats to the top, "bad" stuff disappears into nothing.


That quality filter may come from highly-tuned personalization.

I remember seeing low-quality but viral content on YouTube, so I kept telling it "Don't Recommend This" for quite a while (month-ish). Now it's better, but the recommendation algorithm needs a lot of samples labeled negative.


I've never had to do this much, just a few cases. However I'm also super-cautious not to view slop on my main account, instead doing so in a private tab.

So not feeding the algorithm seems to work as well.


Yeah, if you decide to check out slop once, your feed will be drowned in spam. Similar to how I looked at a couple more pro-Russian videos to get a nuanced perspective and then my whole feed was filled with conspiracy shit and nazi stuff.



In general this is true, but I find that the pandering creators do to please the algorithm makes them produce worse videos, which is frustrating.


“Good” is not what “the algorithm” is selecting for.


nobody is forcing you to subscribe to ai video channels. the creators you currently enjoy won't just start making ai videos?


oh yeah, YouTube is a pristine environment right now


Yeah, but it's going to become 100x worse.

And for the average person, 1000x worse.


Depends on the algo, not the content. There's enough content even without people gaming the algo.


I’m already sick of all the bullshit thumbnails to be honest. It’s at the point where I feel like giving up my premium subscription.


You would maybe like this YouTube click-bait reducing browser plugin:

https://dearrow.ajay.app/


Thank you, but my issue is that YouTube should be doing this for me. I’m paying them for a service.


Well I wouldn't have called it, but I think Meta is in the lead. They beat Apple to AR and affordable VR. Their AI tooling has basically caught up to OpenAI and at this rate will pass them - is anyone else even playing? Maybe their work culture is just better suited to realizing these technologies than the others.

They're not really showing signs of slowing down either. Hey, Zuck, always thought you were kind of lame in the past. But maybe you weren't a one trick pony after all.


> is anyone else even playing?

Deepmind. Protein folding and solving math problems is just less sexy.


Additional Links: https://x.com/AIatMeta/status/1842188252541043075 https://ai.meta.com/static-resource/movie-gen-research-paper

From Twitter/X:

Today we’re premiering Meta Movie Gen: the most advanced media foundation models to-date.

Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in entirely new possibilities for casual creators and creative professionals alike.

More details and examples of what Movie Gen can do https://go.fb.me/kx1nqm

Movie Gen models and capabilities:

Movie Gen Video: 30B parameter transformer model that can generate high-quality and high-definition images and videos from a single text prompt.

Movie Gen Audio: A 13B parameter transformer model that can take a video input along with optional text prompts for controllability to generate high-fidelity audio synced to the video. It can generate ambient sound, instrumental background music and foley sound — delivering state-of-the-art results in audio quality, video-to-audio alignment and text-to-audio alignment.

Precise video editing: Using a generated or existing video and accompanying text instructions as an input it can perform localized edits such as adding, removing or replacing elements — or global changes like background or style changes.

Personalized videos: Using an image of a person and a text prompt, the model can generate a video with state-of-the-art results on character preservation and natural movement in video.

We’re continuing to work closely with creative professionals from across the field to integrate their feedback as we work towards a potential release. We look forward to sharing more on this work and the creative possibilities it will enable in the future.


If nothing else it will produce some amazing material for this account, once the content farms get their hands on it: https://x.com/FacebookAIslop


Crashes Firefox mobile. Looks pretty impressive on Chrome! Apparently hosted only


Hippos can't actually swim though.


I have watched some films recently, and they are full of weird mistakes. A bunch of balloons can't lift your house into the air. DeLoreans can't actually travel through time. Gamma rays don't give you superhuman strength. A 6502 CPU couldn't power an advanced AI for killer robots from the future. So unrealistic.


Was my first reaction too when seeing the video at the top. But then after thinking about it, it makes sense as an example, you want to showcase things that aren't real but look realistic. A hippo swimming looks real, but it isn't as they don't swim.


Haha, this is the first thing I thought of too. I knew adult hippos walk on the bottom, but from looking at existing videos it looks like small (baby/pygmy) hippos do too, they don't float at the surface like this.


[dead]


Scam, don't click this.


They all have dead eyes. It's creepy.


I strongly believe this technology is bad


AI destroying mankind has always been a theme, but it's probably not going to be in the form of a physical war. This picture generation will probably have a very negative impact on mankind. Imagination is one of the driving forces of mankind, and so is our desire to realize the things we desire. When these generators become a common thing, human imagination and intelligence will inevitably start degrading. Nobody will bother painting pictures, making music, making video games or movies, when anyone can just see and hear whatever they want instantly. And people will probably work all day long just to come home and pay to use these generators. This has the potential to destroy mankind by destroying the human spirit.


What about an endless supply of custom-tailored pr0n? Apart from the spirit, might simply wipe out actual physical reproduction.


People have been saying the same thing about Internet streaming of music for example.

It didn't destroy musicians, just changed how they make their money.

Musicians today generally publish music for free, and make money from concerts and merch sales instead (the publishing platforms generate money from ads around the publishing).

Nothing will change about that with the onset of AI generated music - most music is already free today, so you pay for personal experience (i.e. a concert) instead.


Music streaming didn't replace the artists themselves, only the way their art was delivered. Music streaming also didn't possess the ability to interfere with democracy.



