Video quality seems really good, but the limitations are quite restrictive: "Our model encounters challenges when processing extremely long videos (e.g. 200 frames or more)".
I'd say most videos in practice are longer than 200 frames, so a lot more research is still needed.
Sure, but that represents a lot of fast cuts balanced out by a selection of significantly longer cuts.
Also, it's less likely that you'd want to upscale a modern movie, which is more likely to be higher resolution already, as opposed to an older movie which was recorded on older media or encoded in a lower-resolution format.
Huh, I thought this couldn't be true, but it is. The first time I noticed annoyingly fast cuts was World War Z; for me it was unwatchable, with tons of shots around 1 second each.
The first time I noticed how bad the fast cuts we see in most movies are was when I watched Children of Men by Alfonso Cuarón, who often uses very long takes for action scenes:
So sad they didn’t keep to the idea of the book. If you haven’t read this book, you should; it bears no resemblance to the movie aside from the name.
It's off-topic, but this is very good advice. As near as I can tell, there aren't any real similarities between the book and the movie; they're two separate zombie stories with the same name, and honestly I would recommend them both for wildly different reasons.
And similarly, I, Robot, which is much more enjoyable when you realize it started as an independent murder-mystery screenplay that had Asimov’s works shoehorned in when both rights were bought in quick succession. I love both the movie and the collection of short stories, for vastly different reasons.
Its style is based on the oral history approach used by Studs Terkel to document aspects of WW2 - building a big picture by interleaving lots of individual interviews.
The Lost World is also a great book. It explores a lot of interesting stuff the film completely ignores, like the idea that the raptors are only rampaging monsters because they had no proper upbringing, having been born in the lab with no mama or papa raptor to teach them social skills.
Disagree, Jurassic Park was an amazing movie on multiple levels, the book was just differently good, and adapting it to film in the exact format would have been less interesting (though the ending was better in the book.)
I think, like the motorcycle chase that they borrowed from The Lost World in Jurassic World, they also have a scene with those tiny dinosaurs pecking someone to death.
The textures of objects need to maintain consistency across much larger time frames, especially at 4k where you can see the pores on someone's face in a closeup.
I'm sure if you really want to burn money on compute you can do some smart windowing in the processing and use it on overlapping chunks and do an OK job.
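For what it's worth, the windowing part is the easy bit. A minimal sketch of splitting a clip into overlapping chunks, assuming the 200-frame limit quoted from the paper and a 40-frame overlap (both just defaults here; `overlapping_windows` is a made-up helper name, not anything from the paper's code):

```python
def overlapping_windows(n_frames, window=200, overlap=40):
    """Return (start, end) frame ranges, end exclusive, that cover n_frames.

    Consecutive ranges share `overlap` frames so the upscaled chunks
    can be blended where they meet.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if n_frames <= 0:
        return []

    step = window - overlap  # how far each new window advances
    windows = []
    start = 0
    while True:
        end = min(start + window, n_frames)
        windows.append((start, end))
        if end >= n_frames:  # the last window reached the end of the clip
            break
        start += step
    return windows


print(overlapping_windows(500))
```

The cost is that every overlapped frame gets upscaled twice, which is where the extra compute goes.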
I believe the relevant data point when considering applicability is the median shot length to give an idea of the length of the majority of shots, not the average.
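To illustrate why the median matters here, a quick sketch with made-up shot lengths (the numbers are hypothetical, not measured from any film, and 24 fps is an assumption): a couple of long takes pull the mean far above what a typical shot looks like.

```python
from statistics import mean, median

# Hypothetical shot lengths in seconds: mostly fast cuts plus two long takes.
shot_lengths_s = [1.0, 1.5, 1.5, 2.0, 2.0, 2.5, 3.0, 4.0, 45.0, 90.0]

FPS = 24            # assumed frame rate
MAX_FRAMES = 200    # limit quoted from the paper


def fraction_within_limit(lengths_s, fps=FPS, max_frames=MAX_FRAMES):
    """Share of shots short enough to fit in one pass of the model."""
    limit_s = max_frames / fps
    return sum(1 for s in lengths_s if s <= limit_s) / len(lengths_s)


print("mean:  ", mean(shot_lengths_s))
print("median:", median(shot_lengths_s))
print("fit in 200 frames:", fraction_within_limit(shot_lengths_s))
```

With this made-up data the mean suggests a typical shot is far too long for the model, while the median and the fraction that fits show most shots are well under the limit.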
It reminds me of the story about the Air Force making cockpits to fit the elusive average pilot, which in reality fit none of their pilots...
Freal. To the degree that I compulsively count seconds on shots until a show/movie has a few shots over 9 seconds; then they "earn my trust" and I can let it go. I'm fine.
Easily solved, just overlap by ~40 frames and fade the upscaled last frames of chunk A into the start of chunk B before processing. Editors do tricks like this all the time.
It's not so much that it would be impractical (video streaming, like HLS or MPEG-DASH, requires chunking videos into pieces of roughly this size) but you'd lose the inter-frame consistency at segment boundaries, and I suspect the resulting video would flicker at the transitions.
It could work for TV or movies if done properly at the scene transition time though.
You could probably mitigate this by using overlapping clips and fading between them. Pretty crude but could be close to unnoticeable, depending on how unstable the technique actually is.
Tale as old as time, in graphics papers it's "our technique achieves realtime speeds" and then 8 pages down they clarify that they mean 30fps at 640x480 on an RTX 4090.
Break into chunks that overlap by, say, a second, upscale separately and then blend to reduce sudden transitions in the generated details to gradual morphing.
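A rough sketch of the blend step, assuming both chunks were already upscaled separately and that the last `overlap` frames of chunk A cover the same source frames as the first `overlap` frames of chunk B. Frames are plain lists of pixel values to keep it dependency-free (real code would use arrays), and `crossfade` is a hypothetical helper, not part of any upscaler:

```python
def crossfade(chunk_a, chunk_b, overlap):
    """Join two chunks, linearly fading from A to B across the shared frames."""
    if overlap <= 0:
        return chunk_a + chunk_b
    if overlap > min(len(chunk_a), len(chunk_b)):
        raise ValueError("overlap is longer than one of the chunks")

    head = chunk_a[:-overlap]  # frames only chunk A covers
    tail = chunk_b[overlap:]   # frames only chunk B covers

    blended = []
    for i in range(overlap):
        # Weight of chunk B rises linearly and stays strictly between 0 and 1.
        w = (i + 1) / (overlap + 1)
        frame_a = chunk_a[len(chunk_a) - overlap + i]
        frame_b = chunk_b[i]
        blended.append([(1 - w) * pa + w * pb for pa, pb in zip(frame_a, frame_b)])

    return head + blended + tail


# Toy example: one-pixel frames, chunk A all black, chunk B all white.
a = [[0.0] for _ in range(5)]
b = [[1.0] for _ in range(5)]
print(crossfade(a, b, overlap=3))
```

A per-pixel linear fade like this only hides the seam; if the two chunks hallucinate different details, the overlap will show them morphing into each other rather than popping.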
The details changing every ten seconds or so is actually a good thing; the viewer is reminded that what they are seeing is not real, yet still enjoying a high resolution video full of high frequency content that their eyes crave.
If you're using this for existing material you just cut into <=8 second chunks, no big deal. Could be an absolute boon for filmmakers, otoh a nightmare for privacy because this will be applied to surveillance footage.