That already happened. Cameras, lights and a computer that can edit are hundreds of times cheaper than they were a few decades ago. Kids can make videos, but they aren't making movies anyone wants to watch.
You always take the contrarian opinion to my comments, CyberDildonics.
> That already happened. Cameras, lights and a computer that can edit are hundreds of time cheaper than they were a few decades ago.
We're not talking about the same thing. None of these trends towards fast and dirty filmmaking incorporated Gen AI. You couldn't get Hollywood-quality out of an Android phone and a boom microphone. That will change.
> Kids can make videos, but they aren't making movies anyone wants to watch.
They're certainly watching each other's content. YouTube is filled with lots of young creators with enormous audiences. That trend will continue to grow.
> You couldn't get Hollywood-quality out of an Android phone and a boom microphone.
You can shoot video that looks as good or better than 35mm film from the 90s on a video camera that is in the low hundreds of dollars, if that. People can go out and make movies, but that doesn't mean they are making things people want to watch.
I think you are underestimating how detailed and exacting filmmaking really is at "Hollywood" quality. You need extreme directability at every stage of the pipeline.
Systems that create something decent but make you accept whatever you get have existed for a long time, in the form of animations made with game-engine characters.
You can already direct the generation in Stable Diffusion with ControlNet and OpenPose (a pose-estimation library); it's not too far off to rig an animation of a certain movement and then have the AI generate the exact movements you want. If you look at something like ComfyUI, there is an entire node-based workflow akin to game-development software. It's not a "type prompt, get output" process at all; the AI can't read your mind (yet), so you must put in the effort to get out what you want. It's really not much different from making an animated or CGI film, only now the CGI is photorealistic.
You are making the mistake of equating incremental progress with the most extreme litmus test for quality. Posing is great, but again in a system with a lot of automation, once what you want isn't coming out automatically, you are sunk.
Already I could see concept art, animatics, previs and other areas benefiting a lot from these tools. Almost nothing so far is temporally coherent or actually final quality.
There is still a lot of use for that, but to replace actual final shots in films everything has to be perfect. The lighting of every element has to match, the perspective has to match, the model and deformations have to be perfect etc. That isn't going to happen with 2D techniques that hallucinate details from other photos.
> You are making the mistake of equating incremental progress with the most extreme litmus test for quality
Sure, but at the same time, don't assume incremental progress can't arrive at a superlative product. A translator app might improve incrementally, but once it reaches 99.9% accuracy, the way people use it transforms: from a mere tool to something relied upon everywhere they might need it. We see this with every technology today, from cars to smartphones.
We already have bad CGI that all but the most eagle-eyed moviegoers will not notice or care about. If AI generation tech improves to the point that you can trust the output 99.9% of the time, then AI generation will be what is predominantly used: it's simply good enough and much cheaper than manually rigging models or filming actors (never mind that Photoshop- and video-editing-style tools are already being built for AI generation that will get you the exact output you want, more often than not).
A translator needs to convert some text to another that has the same information.
I think you are misunderstanding what happens with the current image generation stuff out there. You give it text, it gives you something that might look plausible for what you described. It ends up full of artifacts and temporally extremely unstable.
> We already have bad CGI that most except the most eagle-eyed moviegoers will not notice or care about,
I don't know what you mean by this exactly. CGI in movies takes an extreme amount of labor, and every shot out of the hundreds of thousands that go into a film is very exacting, directed to a level of detail that people can't notice while watching a movie in real time.
Your comment basically boils down to "what if it gets good enough, then it will be good enough".
Remember that the original claim here was that "kids will make Hollywood blockbusters in their bedrooms".
Some people saw early cars and predicted they would fly too. You could also say that phone cameras have gotten much better and they will eventually replace "hollywood cameras".
Show me something where you can keep typing in text to change the image, move the lights, move the camera, and have it be temporally stable. Then you might have a tool that has a chance of being used for a final image, and even then, it's still a far cry from "kids making Hollywood blockbusters in their bedrooms".
> I think you are misunderstanding what happens with the current image generation stuff out there. You give it text, it gives you something that might look plausible for what you described. It ends up full of artifacts and temporally extremely unstable.
Have you used ComfyUI with inpainting? From what you're saying, I don't believe you have, or perhaps the last time you used image generation was with Midjourney or another "type in text and get an output" tool. In reality, the field has evolved to contain entire workflows, such that you can get something with no artifacts; it just takes time and effort, as with other sorts of art.
> CGI in movies takes an extreme amount of labor and every shot out of the hundreds of thousands that go in to a film are very exacting, being directed to a detail level that people can't notice while watching a movie in real time.
Oh they can notice alright, Corridor Digital on YouTube has videos of them reacting to bad CGI. Even in discussions online I see people complaining about bad CGI, especially in Marvel media. But my point is that those films still make billions of dollars, so at some point most people simply don't care and will watch it anyway.
> Your comment basically boils down to "what if it gets good enough, then it will be good enough". [...] Show me something where you can keep typing in text to change the image, move the lights, move the camera and have it be temporally stable.
Well, yes, that is my point: when it gets good enough, kids will be making blockbusters in their bedrooms. I'm not a seer of the future, so it's not like I can give you an exact timeline, but based on what I know of the field, it's trending towards that. You could say it's incremental progress, but again, at some point it'll be good enough that most people looking won't care about artifacts, just like the bad CGI above. If you want a fully temporally coherent image when moving a virtual camera around, wait a few years.
I have, have you? It is high level automation. Go ahead and show me an image where you can move the camera, move the lights and keep it temporally stable.
> Oh they can notice alright, Corridor Digital on YouTube
No they don't. I know you think watching some YouTubers kick and scream means you understand an entire industry, but it isn't true. You don't watch a video on brain surgery and then think you understand everything that's going on either, and the guys you are talking about are not exactly brain surgeons.
Picking a single bad shot out of hundreds is good for a YouTube video, but it has nothing to do with anything I was saying, and it has nothing to do with noticing all the detail work that goes into every shot. You didn't even understand what I was saying in the first place.
> But my point is that those films still make billions of dollars, so at some point most people simply don't care and will watch it anyway.
People don't care about what? Hundreds of millions spent on CGI? People do care, it is extremely difficult and you have no idea what you are talking about.
> kids will be making blockbusters in their bedrooms
If there were any truth to this, kids would be able to at least write the scripts to blockbusters in their bedrooms so show me that first.
> If you want a fully temporally coherent image when moving a virtual camera around, wait a few years.
Based on what? These 2D techniques that were just invented? The fact that people can paint colors and a badly Photoshopped-looking image comes out?
I have no doubt that people will continue to refine what they are working on, but the claim was "kids will make Hollywood blockbusters in their bedrooms". What do you think the actual visual effects people will do with these techniques?
You can actually go back 25 years and see techniques for extracting a foreground from a background called natural image matting. You could look at that and say "in the future no one will need green screens, no one will need outlines, compositing will be automatic" but a quarter of a century later, it is still being worked on and is just another tool.
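For what it's worth, the model those matting techniques try to invert fits in one equation: each pixel is I = alpha*F + (1-alpha)*B. A toy numpy sketch (synthetic values of my own, with F and B known, which a real matting algorithm never gets to assume) shows where the difficulty lives: with only I observed, alpha, F and B are all unknowns at every pixel.

```python
import numpy as np

# Compositing model behind natural image matting:
#     I = alpha * F + (1 - alpha) * B
# Real matting must estimate alpha (and F, B) from I alone, which is
# why it is still being worked on decades later. In this toy version
# F and B are known, so alpha falls out in closed form.

rng = np.random.default_rng(0)
F = 0.6 + 0.4 * rng.random((4, 4))   # foreground intensities in [0.6, 1.0)
B = 0.4 * rng.random((4, 4))         # background intensities in [0.0, 0.4)
alpha_true = rng.random((4, 4))      # ground-truth matte

I = alpha_true * F + (1 - alpha_true) * B   # the observed composite

# Closed-form recovery, only possible because F and B are known here.
alpha_est = (I - B) / (F - B)

print(np.allclose(alpha_est, alpha_true))   # True
```

Drop the assumption that F and B are known and the closed form vanishes; that single-image version is the part that has stayed "just another tool" for a quarter century.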
> Go ahead and show me an image where you can move the camera, move the lights and keep it temporally stable.
Again, I'm not a future seer, and I also never said that this technology exists right now. In fact, I explicitly said that it will take "a few years" until we get to that stage.
They show it being used for normal photos as well as for AI generated images, seems to work fine. It's not yet at the level of realtime movement of the camera and lights, but something like this also exists: https://v.redd.it/kewv3epujbra1.
You said:
> being directed to a detail level that people can't notice while watching a movie in real time.
I said that people do notice: maybe not most, but yes, some people do notice. If you would like to make a different claim, then do so. It's not about some YouTuber; it's the fact that at least some people do notice bad CGI, while you said "people can't notice" it.
> People do care, it is extremely difficult and you have no idea what you are talking about.
I think you're arguing my point for me; I never said it wasn't difficult. My point was that most people can't tell good CGI from bad. Some can, but most people will watch the movie anyway, and thus the movies rack up billions of dollars. Not sure what point you're making: that people do care about hundreds of millions of dollars of CGI? I mean, sure, like I said, some do care, but unlike those who actually work in the field, most don't.
I can similarly write the most beautiful software for a CRUD app, using all the best practices, but users will not care about that beyond some ancillary benefits like fast loading times.
> kids would be able to at least write the scripts to blockbusters in their bedrooms so show me that first.
People write scripts for fun, yes. I'm not sure what there is to show you, look on fanfic sites or /r/readmyscript or /r/writingprompts. Even further, we already have LLMs today so in the future I doubt that people will be writing scripts from scratch without the use of AI entirely.
> Based on what? These 2D techniques that were just invented? The fact that people can paint colors and a badly photo shopped looking image comes out?
Based on this sentence, I can tell you haven't used the state of the art recently, even if you say you used something like ComfyUI with inpainting.
> I have no doubt that people will continue to refine what they are working on, but the claim was "kids will make hollywood blockbusters in their bedrooms". What do you think the actual visual effects people will do with these techniques?
Did I claim that visual effects people will not continue to make even better blockbusters? It raises the floor up, it does not diminish the ceiling. I think you're arguing something I never even said.
> You can actually go back 25 years and see techniques for extracting a foreground from a background called natural image matting. You could look at that and say "in the future no one will need green screens, no one will need outlines, compositing will be automatic" but a quarter of a century later, it is still being worked on and is just another tool.
Things like Ultimatte with Unreal Engine's virtual sets do just that actually. But even if they didn't, I could say the same thing about any technology, to be honest. Also, I never claimed that AI is not "just another tool."
All in all, it seems like you're arguing against things I never claimed. Where did I say that kids would be automatically making blockbusters in their bedrooms with no input from them whatsoever? My point, again, is that people will be making current blockbuster level content in their houses, that they will use AI tools (yes, tools, since you think I'm arguing that AI is not a tool) to do so, and that they will still need to put in human effort to do so. It does not say anything about future blockbusters with the same tools used by professionals, which will undoubtedly be better.
Then why do you keep pretending you can predict the future?
> I said that people do notice, maybe not most, but yes, some people do notice. If you would like to make a different claim, then do so. It's not about some YouTuber, it's the fact that at least some people do notice bad CGI, while you said "people can't notice" it.
You still don't even understand what I'm saying. There is a huge amount of explicit detail demanded by the people making movies and it all needs to be exact.
> People write scripts for fun, yes. I'm not sure what there is to show you
There is nothing to show, because kids aren't writing "Hollywood blockbuster scripts" in their bedrooms even though they could just type them out on their computers.
It only seems that easy to someone who knows absolutely nothing about writing.
> Based on this sentence, I can tell you haven't used the state of the art recently, even if you say you used something like ComfyUI with inpainting.
Or maybe these results aren't as great as you think.
> Things like Ultimatte with Unreal Engine's virtual sets do just that actually
They absolutely do not. Ultimatte is not a natural image matting plugin. The virtual set stuff isn't even in the same ballpark as natural image matting, it is a live screen behind people.
> My point, again, is that people will be making current blockbuster level content in their houses
You don't even understand what that means. Lots of people read article headlines and watch 30-second YouTube clips, but it is a mistake to buy into so much hype without understanding that new tools are still just a piece of the puzzle.
Alright, it seems like you can't or won't articulate what you profess to claim, instead expecting me to read your mind and, when I can't, saying "yoU StilL doN't EvEN uNdeRStaNd." It's also clear you don't even use the tools you're dismissing, so I think this is an unproductive conversation for me. Goodbye.
Automation produces something plausible, high end movies need something exact.
Plausible is fine for animatics and previs, not for hundreds of millions of dollars.
The amount of work required is so vast you could automate away 90% of it and it is still out of reach for one person to make a "hollywood blockbuster in their bedroom" just as it is for one person to launch themselves to the moon and make it back.
If you don't believe me, try to make a single shot of CG pool balls on a live action pool table. Nothing but spheres, it should be easy, right? Automate some of it with 'AI' if you can.
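To be concrete about where the work in that shot actually is: the composite itself is one multiply-add per pixel. Here is a rough numpy sketch (the light direction, ball placement and plate value are all invented for illustration) of a Lambert-shaded sphere over'd onto a flat stand-in plate; everything that would make it convincing on a real plate, matching the plate's actual light direction, shadows, reflections and grain, lies outside these few lines.

```python
import numpy as np

# Minimal "CG ball over a plate" composite: a Lambert-shaded sphere
# alpha-over'd onto a flat background. The over operator is trivial;
# the hard part of the shot is everything this sketch fakes.

H = W = 64
y, x = np.mgrid[0:H, 0:W]
cx, cy, r = 32.0, 32.0, 20.0

# Sphere coverage mask and surface normal components.
d2 = (x - cx) ** 2 + (y - cy) ** 2
alpha = (d2 <= r * r).astype(float)             # hard matte (no edge AA)
nz = np.sqrt(np.clip(1 - d2 / (r * r), 0, 1))   # normal z on the sphere
nx = (x - cx) / r
ny = (y - cy) / r

# Lambert shading with a made-up light direction; if this direction
# does not match the live-action plate, the ball reads as fake.
light = np.array([0.4, -0.4, 0.82])
light /= np.linalg.norm(light)
shade = np.clip(nx * light[0] + ny * light[1] + nz * light[2], 0, 1)

plate = np.full((H, W), 0.35)                   # stand-in for live footage
comp = alpha * shade + (1 - alpha) * plate      # the "over" operator

print(comp.shape)   # (64, 64)
```

One line of compositing math, and everything left over is the part that takes the hundreds of millions of dollars.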