Pixar's Render Farm (twitter.com/pixprin)
382 points by brundolf on Jan 2, 2021 | 310 comments



I have many fond memories of my first real job as sysadmin/render wrangler for a small animation company about fifteen years ago.

Cores and memory were indeed what mattered, and as long as frames could be fitted into memory then the machine was left to render the frame and then save it back out to shared disk. There was no virtualization, just raw compute hardware running at full capacity 24/7. The scene was already split into different layers and rendered as sequences of frames that were then composited into final products.

First job, first mini-datacenter in a new building, and there I am asking contractors to knock a hole in the building to fit air handlers due to the heat we were expecting to generate. Was great fun.

It taught me a lot about automated systems management.


Oh man, I wanted this to contain much more detail :(

What's the hardware? How much electric energy goes into rendering a frame or a whole movie? How do they provision it (as they keep the number of cores fixed)? They only talk about cores, do they even use GPUs? What's running on the machines? What have they optimized lately?

So many questions! Maybe someone from Pixar's systems department is reading this :)?


Not Pixar specifically, but modern VFX and animation studios usually have a bare-metal render farm. The nodes are usually pretty beefy -- think at least 24 cores / 128 GB of RAM per node.

In crunch time, if there aren't enough nodes in the render farm, they might rent nodes and connect them to their network for a period of time, use the cloud, or get budget to expand their render farm.

From what I've seen, the cloud is extremely expensive for beefy machines with GPUs, but some companies do use it, as you can see if you google [0] [1].

GPUs can be used for some workflows in modern studios, but I would bet the majority of it is CPUs. Those machines are usually running a Linux distro and the render processes (like vray / prman, etc.). Everything runs from a big NFS cluster.

[0] https://deadline.com/2020/09/weta-digital-pacts-with-amazon-...

[1] https://www.itnews.com.au/news/dreamworks-animation-steps-to...


Can confirm cloud GPU is way overpriced if you're doing 24/7 rendering. We run a bare metal cluster (not VFX but photogrammetry) and I pitched our board on the possibilities. I really did not want to run a bare metal cluster, but it just does not make sense for a low margin startup to use cloud processing.

Running 24/7 for three months, it's cheaper to buy consumer grade hardware with similar (probably better) performance. With "industrial" grade hardware (Xeon/Epyc + Quadro) it's under 12 months. We chose consumer grade bare metal.
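
A quick break-even sketch shows how payback periods like those fall out; all the prices below are hypothetical placeholders for illustration, not our actual numbers:

    # Hypothetical break-even sketch: owned node vs. renting a cloud GPU node 24/7.
    # Every price here is a made-up placeholder.
    cloud_cost_per_hour = 3.00           # assumed on-demand rate for a beefy GPU node
    consumer_node_price = 6500.00        # assumed consumer-grade build (CPU + GPU + RAM)
    workstation_node_price = 22000.00    # assumed Xeon/Epyc + Quadro build
    power_per_hour = 0.70 * 0.25         # ~700 W at an assumed 0.25/kWh electricity price

    def breakeven_days(hardware_price):
        # Days of 24/7 rendering after which buying beats renting.
        saving_per_hour = cloud_cost_per_hour - power_per_hour
        return hardware_price / (saving_per_hour * 24)

    print(f"consumer:    {breakeven_days(consumer_node_price):.0f} days")      # ~96 days (~3 months)
    print(f"workstation: {breakeven_days(workstation_node_price):.0f} days")   # ~324 days (~11 months)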

One thing that was half surprising, half calculated in our decision was how much less stressful running your own hardware is, despite the operational overhead. When we ran experimentally on the cloud, a misrender could cost us 900 euro, and sometimes we'd have to render three times or more for a single client, taking us from healthily profitable to losing money. The stress of having to get it right the first time sucked.


I've had renders cost $50,000! Our CTO was less than amused.


I hope you didn't have to bin them, as Vogue did with one of their photoshoots that cost a fortune :)


Sounds interesting: reference?


In the documentary: The September Issue


> Running 24/7 for three months, it's cheaper to buy consumer grade hardware

If you have a steady load cloud makes little sense. It only makes sense if you have a tight deadline (as is not that uncommon with video and VFX) and can't fit it within your deployed capacity.


How do you manage the bare metal cluster? (E.g. apt/yum updates but also networking and such)


I'm a bit out of date but if we are talking about rendering (not data retrieval workloads) I believe the best way is fundamentally the same as it was 25 years ago: network boot, mostly network storage, and applying local config overlays based on MAC address or equivalents. Exactly what push or pull techniques are in vogue I am not sure but definitely no running package managers on each node. You want as little as possible locally -- just a scratchpad disk that can be rebuilt automatically in minutes.
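
As a tiny illustration of the "local config overlay keyed on MAC address" idea (the MAC-to-overlay table, node roles and file paths here are hypothetical, not any particular farm's tooling):

    # Minimal sketch: pick a per-node config overlay based on the NIC's MAC address.
    # Table contents and paths are hypothetical.
    import uuid

    OVERLAYS = {
        "3c:ec:ef:01:aa:10": "overlays/render-highmem.conf",
        "3c:ec:ef:01:aa:11": "overlays/render-standard.conf",
    }
    DEFAULT_OVERLAY = "overlays/render-standard.conf"

    def local_mac():
        # uuid.getnode() returns the hardware address as a 48-bit integer.
        return ":".join(f"{(uuid.getnode() >> s) & 0xff:02x}" for s in range(40, -1, -8))

    def overlay_for_this_node():
        return OVERLAYS.get(local_mac(), DEFAULT_OVERLAY)

    print(overlay_for_this_node())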


When it was 3 nodes, and then 6 nodes, the answer was: very unprofessionally. I didn't get the budget for a system administrator, and I spent all my budget on developers who could build our application and automate our preprocessing, overlooking system administration skills. So besides being the DoE, managing 3 small teams, and being the lead developer, I am also the system administrator.

So no fancy answer, our 3D experts got TeamViewer access to the nodes running Windows Pro. Sometimes our renders fail on patch Tuesday because I forgot to reapply the no-reboot hack.

We're professionalizing now at 12 nodes. We got to the point where the 3D experts don't need to TeamViewer in, so we're swapping them over to headless Linux. No idea on the update management yet, but they're clean nodes running Ubuntu Server.


Network solutions depend highly on the physical infrastructure, but for setup and maintenance you often see SaltStack.


> How do they provision it

Ex-VFX sysadmin here. I'm not sure if they use their own scheduler or not. If they do, they use Tractor (might be Tractor 2 now), which looks after putting the processes in the right places. Think k8s, but actually easy to use, well documented, and reliable (just not distributed, but then it scales way higher and is nowhere near as chatty).

They would have a whole bunch of machines, some old, some new, some with extra memory for particle sims, some with extra cores for just plain rendering. Each machine will be separated into slots, which are made up of a fixed number of cores. Normally memory is guarded but CPU is not (ie, you only get 8 gigs of RAM, but as much CPU as you can consume; context switching the CPU is fast, memory not so much). I'm not sure how Pixar does it, but at a large facility like ILM/Framestore/DNeg the farm will be split into shows, with guaranteed minimum allocations of cores. This is controlled by the scheduler. Crucially it'll be oversubscribed, so jobs are ordered by priority.
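
A toy sketch of that "guaranteed minimum per show, oversubscribed, priority-ordered" policy (show names and numbers made up; real schedulers like Tractor are far more sophisticated than this):

    # Toy dispatcher illustrating the policy described above -- not Tractor internals.
    import heapq
    from collections import defaultdict

    guaranteed = {"showA": 400, "showB": 200}   # minimum slot allocation per show (made up)
    total_slots = 800                           # the farm is oversubscribed on purpose

    def dispatch(pending):
        """pending: list of (priority, show, job_id); lower number = higher priority."""
        heapq.heapify(pending)
        used, free = defaultdict(int), total_slots
        running, deferred = [], []

        # Pass 1: honour each show's guaranteed minimum, in priority order.
        while pending:
            prio, show, job = heapq.heappop(pending)
            if used[show] < guaranteed.get(show, 0) and free > 0:
                running.append((show, job)); used[show] += 1; free -= 1
            else:
                deferred.append((prio, show, job))

        # Pass 2: everything else competes purely on priority for the remaining slots.
        heapq.heapify(deferred)
        while deferred and free > 0:
            prio, show, job = heapq.heappop(deferred)
            running.append((show, job)); used[show] += 1; free -= 1
        return running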

As for actual hardware provisioning, that's quite cool. In my experience there will be a bringup script that talks to the iLO/iDRAC/other management system. When a machine is plugged in, it'll be seen by the bringup script, download the xml/config/other goop that tells the BIOS how to configure and boot from the network, connect to the imaging system, and install whatever version of Linux they have.

As for power per frame, each frame will be made up of different plates, so if you have a water sim, that'll be rendered separately, along with other assets. These can then be combined afterwards in nuke to tweak and make pretty without having to render everything again.

That being said, a crowd shot with lots of characters with hair, or a water/smoke/ice effect, can take 25+ hours per frame to render. So think a 100-core/thread machine redlining for 25 hours, plus a few hundred TB of spinny disk. (Then it'll be tweaked 20-ish times.)
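
To put that in very rough numbers (figures taken from the paragraph above, plus an assumed shot length):

    # Back-of-the-envelope core-hours for the heavy case described above.
    cores_per_node  = 100        # "100-core/thread machine"
    hours_per_frame = 25         # heavy sim/hair frame
    frames_per_shot = 24 * 8     # assume a ~8 second shot at 24 fps
    iterations      = 20         # "tweaked 20-ish times"

    core_hours = cores_per_node * hours_per_frame * frames_per_shot * iterations
    print(f"{core_hours:,} core-hours for one heavy shot")   # 9,600,000 core-hours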

Optimisation-wise, I suspect it's mostly getting software to play nice on the same machine, or beating TDs into making better use of assets, or adjusting the storage to make sure it's not being pillaged too much.


Rumour in the industry is that Pixar don't use Tractor themselves, and have a custom solution in Emeryville :)


Lol, that doesn't surprise me at all. Pixar were the last of the original companies to custom make everything....


I think it's more that Tractor's not really very highly thought of in the industry :)

It works, and roughly does what it says on the tin, but most of the bigger studios (other than MPC and DNeg who do use it) have better custom solutions.


Do you have a link to Tractor to let folks check it out?



Out of curiosity, did you move outside VFX, and if so, to what industry? Have you been enjoying it, and what motivated you?

Cheers


I spent tenish years in VFX. I moved away in 2014, because the hours and pay were abysmal. I still love the industry.

I moved to a large, profitable financial newspaper, which had cute scaling issues (ie they were all solved, so engineers tried to find new and interesting ways to unsolve them).

I then moved to a startup that made self-building, machine-readable maps, which allowed me to play with scale again, but on AWS (alas, no real hardware). We were then bought out by a FAAMG company, so now I'm getting bored but being paid loads to do so.

Once the golden handcuffs have been broken, I'd like to go back, but only if I can go home at 5 every day...


Interesting, thanks for sharing! I've been in VFX around ~6 years and was in the sw industry before (and the hw industry before it).

I find VFX really fun as far as jobs go! Sometimes I do think about leaving, mostly for pay reasons, but the pay has been decent enough recently (basically FAANG base pay without RSU/bonus...).

It is interesting how we have a lot of big-scale problems that go unrecognized; I find the problems really challenging. When I worked in the software industry we had a team 10x as big for a problem 100x simpler.

Outside of some big tech companies, biology, the oil industry, and finance, I cannot imagine many companies operating at such a scale in terms of cores/memory/disk.

Working in Pipeline I haven't found crazy hours yet; it has been mostly an 8h/day job that I can disconnect from once I'm done. Also, with Covid some people even switched to 4-day weeks, which is quite interesting.

Anyho, thanks for sharing your perspective!


Maybe we will work together soon!


Were you at The Foundry (I think I know who you are)? If so, I think we were both there at the same time!


If you worked on katana then I think I know who you are too!


Former Pixar Systems intern (2019) here. Though I was not part of the team involved in this area, I have some rough knowledge of some of the parts.

> What's the hardware?

It varies. They have several generations of equipment, but I can say it was all Intel based, and high core count. I don't know how different the render infra was to the workstation infra. I think the total core count (aggregate of render, workstation, and leased) was ~60K cores. And they effectively need to double that over the coming years (trying to remember one of the last meetings I was in) for the productions they have planned.

> How much electric energy goes into rendering a frame or a whole movie?

A lot. The render farm is pretty consistently running at high loads as they produce multiple shows (movies, shorts, episodics) simultaneously so that there really isn't idle times. I don't have numbers, though.

> How do they provision it

Not really sure how to answer this question. But in terms of rendering, to my knowledge shots are profiled by the TDs and optimized for their core counts. So different sequences will have different rendering requirements (memory, cores, hyperthreading etc). This is all handled by the render farm scheduler.
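
I don't know their actual submission format, but conceptually a profiled shot ends up as a resource request along these lines (field names and values are hypothetical, not Pixar's real scheduler schema):

    # Hypothetical render-job resource request -- purely illustrative.
    job = {
        "show": "filmX",
        "shot": "sq0420_sh0110",
        "frames": range(1001, 1121),
        "cores": 32,                # profiled core count for this sequence
        "memory_gb": 96,            # guarded: exceed it and the job gets requeued
        "hyperthreading": False,    # some shots profile better with HT off
        "priority": 50,
    }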

> What's running on the machines?

RHEL. And a lot of Pixar proprietary code (along with the commercial applications).

> They only talk about cores, do they even use GPUs?

For rendering, not particularly. The RenderMan denoiser is capable of being used on GPUs, but I can't remember if the render specific nodes have any in them. The workstation systems (which are also used for rendering) are all on-prem VDI.

That said, RenderMan 24, due out in Q1 2021, will include RenderMan XPU, which is a GPU (CUDA) based engine. Initially it'll be more of a workstation-facing product to allow artists to iterate more quickly (it'll also replace their internal CUDA engine used in their proprietary look-dev tool Flow, which was XPU's predecessor), but it will eventually be ready for final-frame rendering. There is still some catchup that needs to happen in the hardware space, though NVLink'ed RTX 8000s do a reasonable job.

A small quote on the hardware/engine:

>> In Pixar’s demo scenes, XPU renders were up to 10x faster than RIS on one of the studio’s standard artist workstations with a 24-core Intel Xeon Platinum 8268 CPU and Nvidia Quadro RTX 6000 GPU.

If I remember correctly that was the latest generation (codenamed Pegasus) initially given to the FX department. Hyperthreading is usually disabled and the workstation itself would be 23-cores as they reserve one for the hypervisor. Each workstation server is actually two+1, one workstation per CPU socket (with NUMA configs and GPU passthrough) plus a background render vm that takes over at night. The next-gen workstations they were negotiating with OEMs for before COVID happened put my jaw on the floor.


> They only talk about cores, do they even use GPUs?

They’ve been working on a GPU version of RenderMan for a couple of years.

https://renderman.pixar.com/news/renderman-xpu-development-u...


Also, renderfarms are usually referred to "in cores", because it's usually heterogeneous hardware networked together over the years. You may have some brand new 96 core 512 GB RAM machines mixed in with some several year old 8 core 32 GB machines. When a technical artist is submitting their work to be rendered on the farm, they often have an idea of how expensive their task will be. They will request a certain number of cores from the farm and a scheduler will go through and try to optimize everyone's requests across the available machines.
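
A first-fit sketch of that request-matching idea (machine specs and requests are made up, and real farm schedulers do much smarter packing than this):

    # First-fit placement of core/RAM requests onto a mixed farm -- illustration only.
    machines = [
        {"name": "new01", "cores": 96, "ram_gb": 512},
        {"name": "old07", "cores": 8,  "ram_gb": 32},
    ]
    requests = [("fx_sim", 48, 256), ("comp", 8, 16), ("light", 32, 128)]

    def place(requests, machines):
        placements = []
        for job, cores, ram in sorted(requests, key=lambda r: -r[1]):   # biggest first
            for m in machines:
                if m["cores"] >= cores and m["ram_gb"] >= ram:
                    m["cores"] -= cores
                    m["ram_gb"] -= ram
                    placements.append((job, m["name"]))
                    break
            else:
                placements.append((job, "queued"))    # wait for capacity to free up
        return placements

    print(place(requests, machines))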


> They only talk about cores, do they even use GPUs?

From my experience (animated movies), GPU is still very experimental because of how limited its scaling is, and it's definitely not used in the render farm.

And I'm not even talking about the cost.

GPU rendering demos focus on speed, but one of the biggest problems with full features is flexibility. The more complex your image is, the more problems/artifacts you will "create" on it. Your render time can be 100x faster, but if you need to spend two days fixing a problem on each shot, the quality-vs-speed ratio completely falls over.

Everything gets easier, slowly, so maybe one day we will have a 100% GPU farm on big-budget projects, but for now CPU is the most predictable way to manage large-scale rendering for both sides (sysadmins/artists).


And when leasing cores, who do they lease from and why?


I like the picture of the 100+ SPARCstation render farm for the first Toy Story

https://mobile.twitter.com/benedictevans/status/766822192197...


This reminds me of one of the first FreeBSD press releases. [0]

> FreeBSD Used to Generate Spectacular Special Effects

For the then upcoming movie The Matrix

> Manex Visual Effects used 32 Dell Precision 410 Dual P-II/450 Processor systems running FreeBSD as the core CG Render Farm.

[0]: https://www.freebsd.org/news/press-rel-1.html

edit: added that it was for The Matrix.


Wow. I assume the render farm is all Linux now?


I always liked the neon sign they have outside their current renderfarm

https://www.slashfilm.com/cool-stuff-a-look-at-pixar-and-luc...


Can anyone comment on why Pixar uses standard CPU for processing instead of custom hardware or GPU? I'm wondering why they haven't invested in FPGA or completely custom silicon that speeds up common operations by an order of magnitude. Is each show that different that no common operations are targets for hardware optimization?


Because the expense is not really worth it - even GPU rendering (while around 3-4x faster than CPU rendering) is memory constrained compared to CPU rendering, and as soon as you try and go out-of-core on the GPU, you're back at CPU speeds, so there's usually no point doing GPU rendering for entire scenes (which can take > 48 GB of RAM for all geometry, accel structures, textures, etc) given the often large memory requirements.

High end VFX/CG usually tessellates geometry down to micropolygon, so you roughly have 1 quad (or two triangles) per pixel in terms of geometry density, so you can often have > 150,000,000 polys in a scene, along with per vertex primvars to control shading, and many textures (which can be paged fairly well with shade on hit).

Using ray tracing pretty much means having all that in memory at once (paging of geo and accel structures generally sucks; it's been tried in the past) so that intersection / traversal is fast.

Doing lookdev on individual assets (i.e. turntables) is one place where GPU rendering can be used as the memory requirements are much smaller, but only if the look you get is identical to the one you get using CPU rendering, which isn't always the case (some of the algorithms are hard to get working correctly on GPUs, i.e. volumetrics).

Renderman (the renderer Pixar use, and create in Seattle) isn't really GPU ready yet (they're attempting to release XPU this year I think).


> Because the expense is not really worth it

I disagree with this takeaway. But full disclosure I’m biased: I work on OptiX. There is a reason Pixar and Arnold and Vray and most other major industry renderers are moving to the GPU, because the trends are clear and because it has recently become ‘worth it’. Many renderers are reporting factors of 2-10 for production scale scene rendering. (Here’s a good example: https://www.youtube.com/watch?v=ZlmRuR5MKmU) There definitely are tradeoffs, and you’ve accurately pointed out several of them - memory constraints, paging, micropolygons, etc. Yes, it does take a lot of engineering to make the best use of the GPU, but the scale of scenes in production with GPUs today is already firmly well past being limited to turntables, and the writing is on the wall - the trend is clearly moving toward GPU farms.


I write a production renderer for a living :)

So I'm well aware of the trade-offs. As I mentioned, for lookdev and small scenes, GPUs do make sense currently (if you're willing to pay the penalty of getting code to work on both CPU and GPU, and GPU dev is not exactly trivial in terms of debugging / building compared to CPU dev).

But until GPUs exist with > 64 GB RAM, for rendering large-scale scenes it's just not worth it given the extra burdens (increased development costs, heterogeneous sets of machines in the farm, extra debugging, support), so for high-end scale we're likely 3-4 years away yet.


I used to write a production renderer for a living, now I work with a lot of people who write production renderers for both CPU and GPU. I’m not sure what line you’re drawing exactly ... if you mean that it will take 3 or 4 years before the industry will be able to stop using CPUs for production rendering, then I totally agree with you. If you mean that it will take 3 or 4 years before industry can use GPUs for any production rendering, then that statement would be about 8 years too late. I’m pretty sure that’s not what you meant, so it’s somewhere in between there, meaning some scenes are doable on the GPU today and some aren’t. It’s worth it now in some cases, and not worth it in other cases.

The trend is pretty clear, though. The size of scenes that can be done on the GPU today is large and growing fast, both because of improving engineering and because of increasing GPU memory speed & size. It's just a fact that a lot of commercial work is already done on the GPU, and that most serious commercial renderers already support GPU rendering.

It’s fair to point out that the largest production scenes are still difficult and will remain so for a while. There are decent examples out there of what’s being done in production with GPUs already:

https://www.chaosgroup.com/vray-gpu#showcase

https://www.redshift3d.com/gallery

https://www.arnoldrenderer.com/gallery/


The line I'm drawing is high-end VFX / CG is still IMO years away from using GPUs for final frame (with loads of AOVs and Deep output) rendering.

Are GPUs starting to be used at earlier points in the pipeline? Yes, absolutely, but they always were to a degree in previs and modelling (via rasterisation). They are gradually becoming more useable at more steps in pipelines, but they're not there yet for high-end studios.

In some cases, if a studio's happy using an off-the-shelf renderer with the stock shaders (so no custom shaders at all - at least until OSL is doing batching and GPU stuff, or until MDL actually supports production renderer stuff), studios can use GPUs further down the pipeline, and currently that's smaller-scale stuff from what I gather talking to friends who are using Arnold GPU. Certainly the hero-level stuff at Weta / ILM / Framestore isn't being done with GPUs, as they require custom shaders, and they aren't going to be happy with just using the stock shaders (which are much better than stock shaders from 6-7 years ago, but still far from bleeding edge in terms of BSDFs and patterns).

Even from what I hear at Pixar with their lookdev Flow renderer things aren't completely rosy on the GPU front, although it is at least getting some use, and the expectation is XPU will take over there, but I don't think it's quite ready yet.

Until a studio feels GPU rendering can be used for a significant amount of the renders that they do (for smaller studios the fidelity will be less, so the threshold will be lower for them), I think it's going to be a chicken-and-egg problem of not wanting to invest in GPUs on the farms (or even local workstations).


I think you’re right about the current state (not quite there, especially in raw $$s), but the potential is finally good enough that folks are investing seriously on the software side.

The folks at Framestore and many other shops already don’t do more than XX GiB per frame for their rendering. So for me, this comes down to “can we finally implement a good enough texture cache in optix/the community” which I understand Mark Leone is working on :).

The shader thing seems easy enough. I’m not worried about an OSL compiled output running worse than the C-side. Divergence is a real issue, but so many studios are now using just a handful of BSDFs with lots of textures to drive, that as long as you don’t force the shading to be “per object group” but instead “per shader, varying inputs is fine”, you’ll still get high utilization.

The 80 GiB parts will make it so that some shops could go fully in-core. I expect we’ll see that sooner than you’d think, just because people will start doing interactive work, never want to give it up, and then say “make that but better” for the finals.


GPUs do exist with 64+ GB of RAM, virtually. A DGX-2 has distributed memory where you can see the entire 16x32 GB address space, backed by NVLink. And that technology is now 3 years old; it's even higher now.


Given current consumer GPUs are at 24 GB I think 3-4 years is likely overly pessimistic.


They've been at 24 GB for two years though - and they cost an arm and a leg compared to a CPU with a similar amount.

It's not just about them existing, they need to be cost effective.


Not anymore. The new Ampere-based Quadros and Teslas just launched with up to 48 GB of RAM. A special datacenter version with 80 GB has also already been announced: https://www.nvidia.com/en-us/data-center/a100/

They are really expensive though. But chassis and rackspace also aren't free. If one beefy node with a couple of GPUs can replace half a rack of CPU-only nodes, the GPUs are totally worth it.

I'm not too familiar with 3D rendering, but in other workloads the GPU speedup is so huge that if it's possible to offload to the GPU it makes sense to do it from an economic perspective.


Hashing and linear algebra kernels get much more speedup on a GPU than a vfx pipeline does. But I am glad to see reports here detailing that the optimization of vfx is progressing.


Desktop GPUs could have 64GB of GDDR right now but the memory bus width to drive those bits optimally (in primary use case of real-time game rendering, not offline) would up the power and heat dissipation requirements beyond what is currently engineered onto a [desktop] PCIE card.

If 8k gaming becomes a real thing you can expect work to be done towards a solution, but until then not so much.

Edit: added [desktop] preceding PCIE


There are already GPUs with >90 GB RAM? The DGX A100 has a version with 16 A100 GPUs, each with 90 GB... that's 1.4 TB of GPU memory on a single node.


I should also point out that ray traversal / intersection costs are generally only around 40% of the costs of extremely large scenes, and that's predominantly where GPUs are currently much faster than CPUs.

(I'm aware of the OSL batching/GPU work that's taking place, but it remains to be seen how well that's going to work).

From what I've heard from friends in the industry (at other companies) who are using GPU versions of Arnold, the numbers are nowhere near as good as the upper numbers you're claiming when rendering at final fidelity (i.e. with AOVs and Deep output), so again, the use cases - at least for high-end VFX with GPU - are still mostly for lookdev and lighting-blocking iterative workflows from what I understand. Which is still an advantage and provides clear benefits in terms of iteration time over CPU renderers, but it's not a complete win, and so far only the smaller studios have started dipping their toes in the water.

Also, the advent of AMD Epyc has finally thrown some competitiveness back to CPU rendering, so it's now possible to get a machine with 2x as many cores for close to half the price, which has given CPU rendering a further shot in the arm.


Dave, doesn’t that video show more like “50% faster”? Here’s the timecode (&t=360) [1] for the “production difficulty” result (which really doesn’t seem to be, but whatever).

Isn’t there a better Vray or Arnold comparison somewhere?

As in my summary comment, an A100 can now run real scenes, but will cost you ~$10k per card. For $10k, you get a lot more threads from AMD.

[1] https://m.youtube.com/watch?v=ZlmRuR5MKmU&t=360


What do you mean by a lot more threads? Are you comparing an epyc?


Yeah, that came off clumsily (I’d lost part of my comment while switching tabs on my phone).

An AMD Rome/Milan part will give you 256 decent threads on a 2S box with a ton of RAM for say $20-25k at list price (e.g., a Dell PowerEdge without any of their premium support or lots of flash). By comparison, the list price of just an A100 is $15k (and you still need a server to drive the thing).

So for shops shoving these into a data center they still need to do a cost/benefit tradeoff of “how much faster is this for our shows, can anyone else make use of it, how much power do these draw...”. If anything, the note about more and more software using CUDA is probably as important as “ray tracing is now sufficiently faster” since the lack of reuse has held them back (similar things for video encoding historically: if you’ve got a lot of cpus around, it was historically hard to beat for $/transcode).
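
Doing the crude division on those list prices (the host-server share below is an assumed figure, and this ignores power, density and software effort entirely):

    # Crude $/compute comparison using the list prices above.
    cpu_box_price, cpu_threads = 22500, 256   # 2S Rome/Milan box, midpoint of the $20-25k figure
    gpu_price, host_share      = 15000, 5000  # A100 list price plus an assumed share of its host server

    print(f"${cpu_box_price / cpu_threads:.0f} per CPU thread")            # ~$88 per thread
    ratio = (gpu_price + host_share) / cpu_box_price
    print(f"GPU needs ~{ratio:.2f}x the whole box's throughput to break even on price")  # ~0.89x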


The reason I asked is I did a performance trade-off with a v100 and dual epyc rome with 64 cores, and the v100 won handily for my tasks. That obviously won't always be the case, but in terms of threads you're now comparing 256 to 5000+, but obviously not apples to apples.


Is the high priced 256 thread part that interesting for rendering? You can get 4 of the 64 thread parts on separate boards and each one will have its own 8 channel ddr instead of having to share that bandwidth. Total performance will be higher for less or same money. Power budget will be higher but only a couple dollars a day, at most. But I haven't been involved in a cluster for some time, so not really sure what is done these days.


Yes, this example isn’t quite as high as the 2-10x range I claimed, but I still liked it as an example because the CPU is very beefy, and it’s newer and roughly the same list price as the GPU being compared. I like that they compare power consumption too, and ultimately the GPU comes out well ahead. There are lots of other comparisons that show huge x-factors, this one seemed less likely to get called out for cherry picking, and @berkut’s critique of texture memory consumption for large production scenes is fair... we’re not all the way there yet. But, 50% faster is still “worth it”. In the video, Sam mentions that if you compare lower end components on both sides, the x-factor will be higher.


> There is a reason Pixar and Arnold and Vray and most other major industry renderers are moving to the GPU

The reason is that those renderers need to be sold to many customers, and a big part of studios are doing advertising and series work. GPU rendering is perfect for them as they don't need / can't afford large-scale render farms.

About your example: that's not honest. It's full of instances and a perfect use case for a "wow" effect, but it's not a production shot. Doing a production shot requires complexity management over the long run, even for CPU rendering. On this side, GPU is more "constrained" than CPU, so management is even more complex.


Nice to have an industry insider perspective on here ;)

Can you speak to any competitive advantages a vfx-centric gpu cloud provider may have over commodity AWS? Even the RenderMan XPU looks to be OSL / Intel AVX-512 SIMD based. Thanks!

Supercharging Pixar's RenderMan XPU™ with Intel® AVX-512

https://www.youtube.com/watch?v=-WqrP50nvN4


One potential difference is that the input data required to render a single frame of a high end animated or VFX movie might be several hundred gigabytes (even terabytes for heavy water simulations or hair) - caches, textures, geometry, animation & simulation data, scene description. Often times a VFX centric cloud provider will have some robust system in place for uploading and caching out data across the many nodes that need it. (https://www.microsoft.com/en-us/avere)

And GPU rendering has been gaining momentum over the past few years, but the biggest bottleneck until recently was available VRAM. Big-budget VFX scenes can often take 40-120 GB of memory to keep everything accessible during the raytrace process, and unless a renderer supports out-of-core memory access, the speedup you may have gained from the GPU gets thrown out the window from swapping data.


As a specific example, Disney released the data for rendering a single shot from Moana a couple of years ago. You can download it here: https://www.disneyanimation.com/data-sets/?drawer=/resources...

Uncompressed, it’s 93 GB of render data, plus 130 GB of animation data if you want to render the entire shot instead of a single frame.

From what I’ve seen elsewhere, that’s not unusual at all for a modern high end animated scene.


To reinforce this, here is some discussion of average machine memory size at Disney and Weta from two years ago:

https://twitter.com/yiningkarlli/status/1014418038567796738


Oh, and also, security. After the Sony hack several years ago, many film studios have severe restrictions on what they'll allow off-site. For upcoming unreleased movies, many studios are overly protective of their IP and want to mitigate the chance of a leak as much as possible. Often times complying with those restrictions and auditing the entire process is enough to make on-site rendering more attractive.


Did you really just say that one frame can be in the TB range??

Didn't you guys get the memo from B. Gates that no one will ever need more than 640k?


Because GPUs in datacenters are expensive.

Not only that, they are massive and kick out a whole bunch of heat in new and interesting ways. Worse still, they depreciate like a mofo.

The tip-top renderbox of today is next year's comp box. A two-generation-old GPU is a pointless toaster.


> while around 3-4x faster than CPU rendering

My understanding is that for neural networks, the speedup is much more than 4x. Does anyone know why there's such a difference?


Sure. Training neural nets is somewhat analogous to starting on the top of a mountain looking for the lowest of the low points of the valley below. But instead of being in normal 3d space you might have 1000d determining your altitude, so you can't see where you're going, and you have to iterate and check. But ultimately you just calculate the same chain of the same type of functions over and over until you've reached a pretty low point in the hypothetical valley.
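
For the curious, that "iterate and check your way downhill" loop looks like this in miniature (a toy two-parameter example; real training does the same thing with millions of parameters and automatic gradients):

    # Toy gradient descent: the same loop shape as neural-net training, just in 2D.
    def loss(x, y):                        # the "altitude" at a point in the valley
        return (x - 3) ** 2 + (y + 1) ** 2

    def grad(x, y):                        # which way is uphill
        return 2 * (x - 3), 2 * (y + 1)

    x, y, lr = 0.0, 0.0, 0.1
    for step in range(200):                # iterate and check, over and over
        gx, gy = grad(x, y)
        x, y = x - lr * gx, y - lr * gy    # take a small step downhill

    print(x, y)                            # converges toward the minimum at (3, -1)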

OTOH, Vfx rendering involves a varying scene with moving light sources, cameras, objects, textures, and physics. Much more dynamic interactions. This is a gross simplification but I hope it helps.


Pixar renders their movies with their commercially available software, Renderman. In the past they have partnered with Intel [1] and Nvidia [2] on optimizations

I'd imagine another reason is that Pixar uses off-the-shelf Digital Content Creation apps (DCCs) like Houdini and Maya in addition to their proprietary software, so while they could optimize some portions of their pipeline, it's probably better to develop for more general computing tasks. They also mention the ability to "ramp up" and "ramp down" as compute use changes over the course of a show

[1] https://devmesh.intel.com/projects/supercharging-pixar-s-ren...

[2] https://nvidianews.nvidia.com/news/pixar-animation-studios-l...


FPGAs are really expensive at the scale of a modern studio render farm; we're talking around 40-100k cores per datacenter. Because 40-100k cores isn't Google scale either, it also doesn't seem to make sense to invest in custom silicon.

There's a huge I/O bottleneck as well, as you're reading huge textures (I've seen textures as big as 1 TB) and constantly writing the renderer's output to disk.

Other than that, most of the tooling that modern studios use is off the shelf, for example, Autodesk Maya for Modelling or Sidefx Houdini for Simulations. If you had a custom architecture then you would have to ensure that every piece of software you use is optimized / works with that.

There are studios using GPUs for some workflows but most of it is CPUs.


I'm assuming these 1 TiB textures are procedurally generated or composites? Where do textures this large come up?


1 terabyte sounds like an outlier, but typically texture maps are used as inputs to shading calculations. So it's not uncommon for hero assets in large-scale VFX movies to have more than 10 different sets of texture files that represent different portions of a shading model. For large assets, it may take more than fifty 4K-16K images to adequately cover the entire model such that if you were to render it from any angle, you wouldn't see the pixelation. And these textures are often stored as mipmapped 16 bit images so the renderer can choose the most optimal resolution at rendertime.

So that can easily end up being several hundred gigabytes of source image data. At rendertime, only the textures that are needed to render what's visible in the camera are loaded into memory and utilized, which typically ends up being a fraction of the source data.
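
As a rough illustration of how those "several hundred gigabytes" add up (the counts below are just the ones described above, with an assumed average resolution):

    # Rough size of one hero asset's texture set, using the counts described above.
    maps_per_channel = 50        # "fifty 4K-16K images" to cover the model
    channels         = 10        # "more than 10 different sets" per shading model
    bytes_per_texel  = 2 * 3     # 16-bit, 3 channels, before compression
    avg_resolution   = 8192      # assumed average, somewhere between 4K and 16K

    total_bytes = maps_per_channel * channels * avg_resolution**2 * bytes_per_texel
    total_bytes *= 4 / 3         # the mip chain adds roughly a third again
    print(f"~{total_bytes / 1e9:.0f} GB of source texture data")   # ~270 GB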

Large scale terrains and environments typically make more use of procedural textures, and they may be cached temporarily in memory while the rendering process happens to speed up calculations


I would take that with a huge grain of salt. Typically the only thing that would be a full terabyte is a full resolution water simulation for an entire shot. I'm unconvinced that is actually necessary, but it does happen.

An entire movie at 2k, uncompressed floating point rgb would be about 4 terabytes.


Can be either. You usually have digital artists creating them.

https://en.wikipedia.org/wiki/Texture_artist


Texture artists aren't painting 1 terabyte textures dude.


The largest texture sets are heading towards 1 TB in size, or at least they were when I was last involved in production support. I saw Mari projects north of 650 GB, and that was 5 years ago. Disclaimer: I wrote Mari, the VFX industry standard painting system.

Note though these are not single 1 TB textures; they’re multiple sets of textures, plus all of the layers that constitute them. Some large robots in particular had 65k 4K textures if you count the layers.


I think we both realize that it's a bit silly to have so much data in textures that you have 100x the pixel data of a 5-second shot at 4K with 32-bit float RGB. 650 GB of textures would mean that even with 10Gb Ethernet (which I'm not sure is common yet) you would wait at least 12 minutes just for the textures to get to the computer before rendering could start, and rendering 100 frames at a time would mean 100 GB/s from a file server for a single shot. Even a single copy of the textures to freeze an iteration would be thousands in expensive disk space.
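
For reference, the transfer-time arithmetic (idealised line rate; real NFS throughput is lower, which is where the 12-minute figure comes from):

    # Time to pull 650 GB over a 10 Gb/s link, ignoring protocol overhead.
    texture_bytes = 650e9
    link_bits_per_s = 10e9
    seconds = texture_bytes * 8 / link_bits_per_s
    print(f"{seconds / 60:.1f} minutes at line rate")   # ~8.7 min; ~12 min at realistic throughput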

I know it doesn't makes sense to tell your clients that what they are doing is nonsense, but if I saw something like that going on, the first thing I would do is chase down why it happened. Massive waste like that is extremely problematic while needing to make a sharper texture for some tiny piece that gets close to the camera is not a big deal.


Texture caching in modern renderers tends to be on demand and paged, so it is very unlikely the full texture set is ever pulled from the filers.

Over texturing like this can be a good decision depending on the production. Asset creation often starts a long time before shots or cameras are locked down.

If you don’t know how an asset is to be used it makes sense to texture all of it upfront as if it will be full screen, 4K.

Taking an asset off final to ‘Upres’ it for a shot can be a pain in the ass and more costly than just detailing it up in the first place.

In isolation it’s an insane amount of detail, and given perfect production planning it is normally not needed, but until directors lock down the scripts and shots it can be the simplest option.


> Texture caching in modern renderers tends to be on demand and paged, so it is very unlikely the full texture set is ever pulled from the filers.

This was easier to rely on in the days before ray tracing, when texture filtering was consistent because everything was from the camera. Ray differentials from incoherent rays aren't quite as forgiving.

> If you don’t know how an asset is to be used it makes sense to texture all of it upfront as if it will be full screen, 4K.

4k textures for large parts of the asset in the UV layout can be an acceptable amount of overkill. That's not the same as putting 65,000 4k textures on something because each little part is given its own 4k texture. I know that you know this, but I'm not sure why you would conflate those two things.

> Taking an asset off final to ‘Upres’ it for a shot can be a pain in the ass and more costly than just detailing it up in the first place

It is very rare that specific textures need to be redone like that and it is not a big deal.

650GB of textures for one asset drags everything from iterations to final renders to disk usage to disk activity to network usage down for every shot in a completely unnecessary way. There isn't a fine line between these things, there is a giant gap between that much excessive texture resolution and needing to upres some piece because it gets close to the camera.

> Asset creation often starts a long time before shots or cameras are locked down.

This is actually fairly rare.

> In isolation it’s an insane amount of detail, and given perfect production planning it is normally not needed, but until directors lock down the scripts and shots it can be the simplest option.

That's rarely how the time line fits together. It's irrelevant though, because there is no world where 65,000 4k textures on a single asset makes sense. It's multiple orders of magnitude out of bounds of reality.

I am glad that you have that insane amount of scalability as a focus since you are making tools that people rely on heavily, and I wish way more people on the tools end thought like this. Still, it is about 1000x what would set off red flags in my mind.

I apologize on behalf of whoever told you that was necessary, because they need to learn how to work within reasonable resources (which is not difficult given modern computers), no matter what project or organization they are attached to.


Mari was designed in production at Weta, based off the lessons learned from, well, everything that Weta does.

Take for example, a large hero asset like King Kong.

Kong look development started many months before a script was locked down. Kong is 60ft tall, our leading lady is 5’2”.

We think we need shots where she’ll be standing in Kong’s hands, feet, be lifted up to his face, nose etc.

So we need fingerprints that will stand up at 4K renders, tear ducts, pores on the inside of the nose, etc etc, but we don’t know. All of which will have to match shot plates in detail.

We could address each of these as the shots turn up and tell the director (who owns the company) he needs to wait a few days for his new shot, or you can break Kong into 500 patches and create a texture for each of the diffuse, 3 spec, 3 subsurface, 4 bump, dirt, blood, dust, scratch, fur, flow etc etc inputs to our shaders.

Let’s say we have 500 UDIM patches for Kong so we can sit our leading lady on the fingertips, and 20 channels to drive our shaders and effects systems.

When working the artist uses 6 paint layers for each channel ( 6 is a massive underestimate for most interesting texture work).

So we have 500 patches * 20 channels * 6 layers, which gives us 60k images. Not all of these will need to be at 4K, however.
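
Spelled out (numbers exactly as given above):

    # The Kong texture count, as described above.
    patches  = 500    # UDIM patches so our leading lady can sit on a fingertip
    channels = 20     # diffuse, 3 spec, 3 subsurface, 4 bump, dirt, blood, ...
    layers   = 6      # working paint layers per channel (an underestimate)

    print(patches * channels * layers)   # 60,000 images, not all of them 4K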

For Kong, substitute any hero asset where shots will be placed more “in and on” the asset rather than “at” it. Heli carriers, oil rigs, elven great halls, space ships, giant robots.... The line between asset and environment is blurred at that point, and maybe think “set” rather than “asset”.


500 separate 4K texture patches for a character covered in fur is excessively wasteful. Things like 3 4K subsurface maps on 500 patches on a black-skinned creature that is mostly covered by fur are grossly unnecessary, no matter who tells you they're needed.

We both know that stuff isn't showing up on film and that the excess becomes a CYA case of the emperor's new clothes where no one wants to be the one to say it's ridiculous.

> When working the artist uses 6 paint layers for each channel ( 6 is a massive underestimate for most interesting texture work).

This is intermediary and not what is being talked about.


Your opinion on something doesn’t mean much when confronted with real world experiences from the biggest studios.


Maybe some day I'll know what I'm talking about. Which part specifically do think is wrong?


Focusing on the technical steps and what might be technically feasible or not, versus the existing world and artists' workflows. Also, speaking as an authority that knows best and patronizing someone who actually works in the industry.


> Focusing on the technical steps and what might be technically feasible or not, versus the existing world and artists' workflows.

I would say it's the opposite. There is nothing necessary about 10,000 4k maps and definitely nothing typical. Workflows trade a certain amount of optimization for consistency, but not like this.

> patronizing someone who actually works in the industry.

I don't think I was patronizing. This person is valuable in that they are trying to make completely excessive situations work. Telling people (or demonstrating to them) they are being ridiculous is not his responsibility and is a tight rope to walk in his position.

> Also, speaking as an authority that knows best

If I said that 2 + 2 = 4 would you ask about a math degree? This is an exercise in appeal to authority. This person and myself aren't even contradicting each other very much.

He is saying the extremes that he has seen, I'm saying that 10,000 pixels of texture data for each pixel in a frame is enormous excess.

The only contradiction is that he seems to think that because someone did it, it must be a necessity.

Instead of confronting what I'm actually saying, you are trying to rationalize why you don't need to.


> This person is valuable in that they are trying to make completely excessive situations work. Telling people (or demonstrating to them) they are being ridiculous is not his responsibility and is a tight rope to walk in his position.

Usually the way VFX works is that technology (R&D) is very removed from production. The artist's job is getting the shot done regardless of technology, and they have very short deadlines. They usually push the limits.

Digital artists are not very tech savvy in a lot of disciplines, it is not feasible to have a TD in the delivery deadlines of the shots for a show.

The person at Weta also told you how Weta actually worked in Kong, which is very typical: you don't know upfront what you need. And you dismissed it as something unnecessary; still, it is how every big VFX studio works. Do you feel that you know better and/or that everyone is doing something wrong and hasn't really thought about it? If that is the case, you might have a business opportunity for a more efficient VFX studio!


Your post is an actual example of being patronizing. Before I was just trying to explain what the person I replied to probably already knew intuitively.

> how Weta actually worked in Kong which is very typical

It is not typical to have 10,000 4k maps on a creature. What has been typical when rendering at 2k is a set of 2k maps for the face, torso, arms, and legs. Maybe a single arm and leg can be painted and the UVs mirrored, though mostly texture painters will lay out the UVs separately and duplicate the texture themselves to leave room for variations.

> it is not feasible to have a TD in the delivery deadlines of the shots for a show.

Actually most of the people working on shots are considered TDs. Specific asset work for some sequence with a hero asset is actually very common, which makes sense if you think about it from a story point of view of needing a visual change to communicate a change of circumstances.

4k rendering (was the 2017 king kong rendered in 4k?) and all the closeups of king kong mean that higher resolution maps and more granular sections are understandable, but it doesn't add up to going from 16 2k maps to 10,000 4k maps. Maps like diffuse, specular and subsurface albedo are also just multiplicative, so there is no reason to have multiple maps unless they need to be rebalanced against each other per shot (such as for variations).

You still never actually explained a problem or inconsistency with anything I've said.


An interesting exercise might be working out a texture budget for this asset.

https://youtu.be/PBhCE97ZN98

This was created with the requirement that the director be able to use it at will. Closeups. Set replacements, destruction, the works.

You don’t have a shot breakdown or camera list.

You’ve got 6 months of pre production to support 1000 shots. Once in production you will be the only texture artist supporting 30 TDS.

How do you spend your 6 months to make sure production runs smoothly?

I’m kinda interested in your experience of this stuff as the numbers you’re quoting for 2k work are, in my experience, waaaaaay off and are closer to how a high end games asset would currently be textured.

I don’t disagree with you that the numbers involved are crazy when taken in isolation but it is (or at least was 5 years ago) a very common workflow at ILM, Weta, Dneg, DD, R&H Framestore etc etc. The quoted high numbers are the very upper end but many thousands of assets on hundreds of productions have been textured at what I believe you would consider “insane” detail levels.


If you’ve worked in high end production I would love to work at your facility.

You clearly understand many of the issues involved but downplay the complexity in running high end assets in less than perfect production.

Unless the industry has changed dramatically in 5 years shot changes, per shot fixes, variants (clean, dirty, destroyed), shader tweaks, happen on every single show I’ve ever been part of.

Render time and storage are one factor, as is individual artist iteration, but the real productivity killer is inter-discipline iteration.

Going from a “blurry texture” note in comp to a TD fix to a texture “upres” is potentially a 5 person, 4 day turn around. I would trade a whole bunch of cpu and storage to avoid that.

Computers are cheap, people are expensive, people coordinating even more so.


I do not think you said anything wrong; it's much less about what you're saying and more about how you're saying it (as if it were a simple thing to get right and people are dumb for not doing it in an optimal way).

> Actually most of the people working on shots are considered TDs.

That's not true in the studios I've been at. TD is usually reserved for folks closer to pipeline who aren't doing shot work (as in, delivering shots). They're supporting the folks doing so.

For the record, I haven't downvoted you at all.


> There are studios using GPUs for some workflows but most of it is CPUs.

This is probably true today, but leaves the wrong impression IMHO. The clear trend is moving toward GPUs, and surprisingly quickly. Maya & Houdini have released GPU simulators and renderers. RenderMan is releasing a GPU renderer this year. Most other third-party renderers have already gone or are moving to the GPU for path tracing - Arnold, Vray, Redshift, Clarisse, etc., etc.


Not an ILMer, but I was at LucasArts over a decade ago. Back then, us silly gamedevs would argue with ILM that they needed to transition from CPU to GPU based rendering. They always pushed back that their bottleneck was I/O for the massive texture sets their scenes throw around. At the time RenderMan was still mostly rasterization based. Transitioning that multi-decade code and hardware tradition over to the GPU would be a huge project that I think they just wanted to put off as long as possible.

But, very soon after I left Lucas, ILM started pushing ray tracing a lot harder. Getting good quality results per ray is very difficult. Much easier to throw hardware at the problem and just cast a whole lot more rays. So, they moved over to being heavily GPU-based around that time. I do not know the specifics.


AFAIU the issue with GPU rendering is generally you have to design the assets to be GPU friendly. So while you can get a huge speed up at rendering time you get a huge slowdown creating the assets in the first place because you have new issues. Using normal maps and displacement maps instead of millions of polygons. Keeping textures to the minimal size that will get the job done, etc...

Is any of that true?


Not for the ILM use-case. I expect they would stick to finely-tessellated geometry. The challenge would be moving all of that data across the PCI bus in and out of the relatively limited DRAM on the GPU. It would require a very intelligent streaming solution. Similar to the one they already have to stream resources from storage to the CPU RAM of various systems.


> Can anyone comment on why Pixar uses standard CPU for processing instead of custom hardware or GPU?

A GPU enabled version of RenderMan is just coming out now. I imagine their farm usage after this could change.

https://gfxspeak.com/2020/09/11/animation-studios-renderman/

I’m purely speculating, but I think the main reason they haven’t been using GPUs until now is that RenderMan is very full featured, extremely scalable on CPUs, has a lot of legacy features, and it takes a metric ton of engineering to port and re-architect well established CPU based software over to the GPU.


Aren't CPUs sturdier too? GPUs running 24/7 are rumored to not work very well after a few years.


Amusingly, Pixar did build the "Pixar Image Computer" [1] in the 80s and they keep one inside their renderfarm room in Emeryville (as a reminder).

Basically though, Pixar doesn't have the scale to make custom chips (the entire Pixar and even "Disney all up" scale is pretty small compared to say a single Google or Amazon cluster).

Until recently GPUs also didn't have enough memory to handle production film rendering, particularly the amount of textures used per frame (which even on CPUs are handled out-of-core with a texture cache, rather than "read it all in up front somehow"). I think the recent HBM-based GPUs will make this a more likely scenario, especially when/if OptiX/RTX gains a serious texture cache for this kind of usage. Even still, however, those GPUs are extremely expensive. For folks that can squeeze into the 16 GiB per card of the NVIDIA T4, it's just about right.

tl;dr: The economics don't work out. You'll probably start seeing more and more studios using GPUs (particularly with RTX) for shot work, especially in VFX or shorts or simpler films, but until the memory per card (here now!) and $/GPU (nope) is competitive it'll be a tradeoff.

[1] https://en.wikipedia.org/wiki/Pixar_Image_Computer


That wikipedia article could be its own story!


There's a brief footnote about their REVES volume rasterizer used in Soul World crowd characters. They simply state their render farm is CPU based and thus no GPU optimizations were required. At the highest, most practical level of abstraction, it's all software. De-coupling the artistic pipeline from underlying dependence on proprietary hardware or graphics APIs is probably the only way to do it.

https://graphics.pixar.com/library/SoulRasterizingVolumes/pa...

On a personal note, I had a pretty visceral "anti-" reaction to the movie Soul. I just felt it too trite in its handling of themes that humankind has wrestled with since the dawn of time. And jazz is probably the most cinematic of musical tastes. Think of the intros to Woody Allen's Manhattan or Midnight in Paris. But it felt generic here.

That said the physically based rendering is state of the art! If you've ever taken the LIE toward the Queensborough Bridge as the sun sets across the skyscraper canyons of the city you know it is one of the most surreal tableaus in modern life. It's just incredible to see a pixel perfect globally illuminated rendering of it in an animated film, if only for the briefest of seconds ;)


Relative to the price of a standard node, FPGAs aren't magic: you have to find the parallelism in order to exploit it. As for custom silicon, anything close to a modern process costs millions in NRE alone.

From a different perspective, think about supercomputers - many supercomputers do indeed do relatively specific things (and I would assume some do run custom hardware), but the magic is in the interconnects - getting the data around effectively is where the black magic is.

Also, if you aren't particularly time bound, why bother? FPGAs require completely different types of engineers, and are generally a bit of a pain to program for even ignoring how horrific some vendor tools are - your GPU code won't fail timing, for example.


And probably needing a complete rewrite of all the tooling they use.


Honestly? Because they have a big legacy in CPU code, and because of mostly political reasons they haven't invested in making their GPU (Realtime preview) renderer production ready till very recently. There are some serious technical challenges to solve, and not having GPUs with tons of ram among them, but the investment to solve them hasn't really been there yet.


I'm "anyone" since I know very little about the subject but I'd speculate that they've done a cost-benefit analysis and figured that would be overkill and tie them to proprietary hardware, so that they couldn't easily adapt and take advantage of advances in commodity hardware.


GPUs are commodity hardware, but you don't have to speculate, this was answered well here:

https://news.ycombinator.com/item?id=25616527


Probably because CPU times fall within acceptable windows. That would be my guess. You can go faster with FPGAs or custom silicon, but it also has a very high cost, on the order of 10 to 100 times as expensive. You can get a lot of hardware for that.


In addition to what others have said, I remember reading somewhere that CPUs give more reliably accurate results, and that that's part of why they're still preferred for pre-rendered content


> I remember reading somewhere that CPUs give more reliably accurate results

This is no longer true, and hasn’t been for around a decade. This is a left-over memory of when GPUs weren’t using IEEE 754 compatible floating point. That changed a long time ago, and today all GPUs are absolutely up to par with the IEEE standards. GPUs even took the lead for a while with the FMA instruction that was more accurate than what CPUs had, and Intel and other have since added FMA instructions to their CPUs.


I believe this to be historically true as GPUs often “cheated” with floating point math to optimize hardware pipelines for game rasterization where only looks matter. This is probably not true as GPGPU took hold over the last decade.


Ah, that makes sense


One of the things they mentioned briefly in a little documentary on the making of Soul is that all of the animators work on fairly dumb terminals connected to a back end instance.

I can appreciate that working well when people are in the office, but I'm amazed it worked out for them when people moved to working from home. I have trouble getting some of my engineers to have a connection stable enough for VS Code's remote mode. I can't imagine trying to use a modern GUI over these connections.


The entire studio is VDI based (except for the Mac stations, unsure about Windows), utilizing the Teradici PCoIP protocol, 10Zig zero-clients, and (at the time, not sure if they've started testing the graphical agent), Teradici host cards for the workstations.

I was an intern in Pixar systems in 2019 (at Blue Sky now), and we're also using a mix of PCoIP and NoMachine for home users. We finally figured out a quirk with the VPN terminals we sent home with people that was throttling connections, and the experience after that fix is actually really good. There are a few things that can cause lag (such as moving apps like Chrome/Firefox), but for the most part, unless your ISP is introducing problems, it's pretty stable. And everyone with a terminal setup has two monitors, either 2*1920x1200 or 1920x1200+2560x1440.

I have a 300Mbps/35Mbps plan (turns into a ~250/35 on VPN) and it's great. We see bandwidth usage ranging from 1Mbps to ~80 on average. The vast majority being sub-20. There are some outliers that end up in mid-100s, but we still need to investigate those.

We did some cross country tests with our sister studio ILM over the summer and was hitting ~70-90ms latency which although not fantastic, was still plenty workable.


Hi. I used to work at Teradici. It was always interesting that Pixar went with VDI because it meant the CPUs that were being used as desktops during the day could be used for rendering at night. Roughly speaking. The economics made a lot of sense. A guy from Pixar came to Teradici and gave a talk all about it. Amazing stuff.

Interesting contrast with other companies that switched to VDI where it made very little sense. VMware + server racks + zero clients compared to desktops never made economic sense, at the time. But often there is some other factor that tips things in VDI's favour.


Yep, all of their workstations were dual socket servers, where each socket was a workstation VM with PCIe passthrough, and each getting their own hostcard+GPU. Each VM had dedicated memory, but no ownership of the cores they were pinned to, so overnight if the 'workstations' were idle, another VM (also with dedicated memory) would spin up (the other VMs would be backgrounded) and consume the available cores and add itself to the render farm. An artist could then log in and suspend the job to get their performance back (I believe this was one of the reasons behind the checkpointing feature in RenderMan).
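
Purely to illustrate the scheduling pattern described above (this is not Pixar's tooling), a night-shift loop over libvirt-managed VMs could look roughly like the sketch below; the domain name, the idle test, and the threshold are all invented.

    import subprocess
    import time

    RENDER_DOM = "render-node"      # hypothetical libvirt domain for the farm VM
    IDLE_LOAD_THRESHOLD = 0.5       # arbitrary cut-off for "the artist has gone home"

    def workstation_idle() -> bool:
        # Stand-in idle check: a real setup would query the VDI session state or
        # the guest agent, not the host's load average.
        with open("/proc/loadavg") as f:
            return float(f.read().split()[0]) < IDLE_LOAD_THRESHOLD

    def set_render_vm(running: bool) -> None:
        # virsh resume/suspend unpause or pause an existing domain.
        action = "resume" if running else "suspend"
        subprocess.run(["virsh", action, RENDER_DOM], check=False)

    while True:
        set_render_vm(running=workstation_idle())
        time.sleep(300)   # re-evaluate every five minutes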

The Teradici stuff was great, and from an admin perspective having everything located in the DC made maintenance SO much better. Switching over to VDI is a long term goal for us at Blue Sky as well, but it'll take a lot more time and planning.


That's one reason for the checkpoint feature, yes, but there are others. A few years back (Dory-era), I participated in a talk at SIGGRAPH '15 about some of them:

https://dl.acm.org/doi/abs/10.1145/2775280.2792573

http://eastfarthing.com/publications/checkpoint.pdf


Few years ago I spoke to some ILM people about their VDI setup, which at the time was cobbled together out of mesos and a bunch of xorg hacks to get VDI server scheduling working on a pool of remote machines with GPUs (I think they might even have used AWS intially but not sure - this is going back a fair few years now). I was doing a lot of work with mesos at the time, and we chatted a bit about this as our work overlapped a fair bit.

Are you still using a similar sort of setup to orchestrate the backend of this, and if so have you published anything about it? I've had a few people ask me about this sort of problem lately and there aren't too many great resources out there I can point people new to this sort of tech towards.


I worked at WDAS and remember talking to the ILM team testing out the mesos VDI stuff. AFAIK it never left the POC stage but it was a really neat demo.

My team at WDAS mirrored pretty closely what Pixar did with VDI although we didn't fully switch to it for different reasons (power and heat constraints in the datacenter and price). IIRC the VDI hosts had static VMs and the teradici connection manager did all the smarts of routing user requests to a VM. There was no dynamic orchestration for us because we only had 60ish users using full VDI VMs, but even our plans of hundreds of users was still to use teradici and standard VMs on each host.

We rendered different than Pixar which also made our system a bit more static. We didn't have a separate render VM and instead rendered directly on the workstation VM when users were idle or disconnected.


I wish I could answer this, but I really can't. Not because of any NDA, just that I don't know. I wasn't involved with the workstation team at Pixar (or ILM at all); I was part of the Network and Server Admin [NSA] team, specifically focused on OpenShift. There are a lot of tools that Pixar use that I don't have the full picture of how they work together.

Here at Blue Sky we are in our infancy for thin client based work. Remote terminals aren't too new as they were used for contract workers and artists who needed to WFH on the prior show, but we don't have VDI as we still use deskside workstations. For COVID, the workstations have been retrofitted with Teradici remote workstation hostcards and we send the artists home with a VPN client and zero client, utilizing direct connect. It was enough to get us going, but we have a long road ahead in optimizing this stack and eventually (if our datacenters can handle it) switching over to VDI.


That is correct. It's pretty common for a technical artist to have a 24-32 core machine, with 128 GB of RAM, and a modern GPU. Not to mention that the entirety of the movie is stored on NFS and can approach many hundreds of terabytes. When you're talking about that amount of power and data, it makes more sense to connect into the on-site datacenter.


I’m guessing Pixar is using a distributed file system as opposed to traditional NFS? Do you have any idea what storage systems render farms tend to use?

At my workplace we have a smallish HPC center and ended up moving off of NFS at about 2PB of storage since we were starting to hit the limits of NFS (think 1TB of RAM and 88 cores on a single NFS server).


Everywhere I've worked has been traditional NFS, and I've seen more than 3 times the figure you quoted working well. Usually you have different mountpoints/VFSs on different servers for different kinds of files.


Interesting, maybe the scientific computations we are doing are more I/O intense than render applications? How do studios manage disaster recovery? What happens when a multi petabyte NFS server keels over? Are there tape drive backups? It seems risky to have a such a critical system serviced by only a single node.


At Weta we divided up the NFS servers into "src" and "dat" - "src" was everything made by artists, and "dat" was the output from the renderwall. We backed up "src" every night, but "dat" was never backed up. Every once in a while there would be some mass deletion event but it was always faster to re-render the lost data than to restore from backups.

Also none of the high end commercial filers are single node - they're all clusters of varying sizes.


They're serviced by multiple nodes and have very strong backup policies.

At Rhythm and Hues, you could request footage all the way back from the founding of the studio, for example.

CG work is fairly IO intensive for tasks like rendering, where you're reading hundreds or even thousands of geometry caches per frame. But for other things, your IO isn't as frequent, since it's not about constant r/w; there are long stretches of computation or artist time between saves and reads.


Same here. I had some passing information from Pixar, WDAS, and ILM and they were pretty much all NFS. Lots of NFS caching (avere) and high performance NFS appliances in use.


I work as a compositor at a visual effects studio that had to adapt, and can say that I'm impressed too!

The studio internally uses PCoIP boxes, which I don't like due to the added tiny delay (I'm a bit like those developers who complain about milliseconds of latency in their text editors...). Anyway, for the work-from-home setup, we are using NoMachine, which doesn't feel any different from the PCoIP boxes - unless you're using the macOS client, which is much laggier than the Windows or Linux versions.

Actually, I went ahead and tried installing NoMachine on Google Cloud and Amazon AWS CPU-only instances, and got the same responsiveness as my studio setup. No fancy setups or GPU encoding/decoding.

So if you have a Nuke license, you can do some pretty heavy 2D VFX for about 1 USD/hour on a 96-vCPU machine (performance similar to an AMD 32-core) with 196GB of RAM, even without any GPU acceleration.


I've tried a few remote desktop systems. The last one I tried was Parsec, which works well, but always made me feel queasy since it requires you to trust their connection service. (To be clear, I know of no security issues there, I just don't like relying on a third party for my security)

NoMachine looks like a good answer for people like me. Thanks for the pointer, I'll check it out.


A lot of studios use thin client / PCoIP boxes from Teradici, etc.

They're pretty great overall and the bandwidth requirements aren't crazy high, but it does max out your data usage pretty quickly if you're capped. The faster your connection, the better the experience.

Some studios like Imageworks don't even have the backend data center in the same location; the thin clients connect to a center in Washington state while the studios are in LA and Vancouver.


I think most connections could be massively improved with a VPN that supports Forward Error Correction, but there doesn't seem to be any that do.

Seems very strange to me.


Where can I watch the documentary?


It's a little bit on Disney+, one of the extras called "Soul, Improvised". It's very much not technical, more focused on the emotional impact of WFH.


Thanks! I just checked it out. So interesting they use Linux for non-developer staff!


Most of the big animation and visual effects studios are Linux based.

We even have a reference platform spec for some kind of industry wide baseline: https://vfxplatform.com/


I worked at Disney Animation on the Linux engineering team for a few years. The flexibility of Linux was a key enabler for us being able to produce movies the way we did. Artists overall seemed to love the power of the Linux desktop setup we provided.


My understanding (I am not an authority) is that for a long time, it has taken Pixar roughly an equal amount of time to render one frame of film. Something on the order of 24 hours. I don’t know what the real units are though (core-hours? machine-hours? simple wall clock?)

I am not surprised that they “make the film fit the box”, because managing compute expenditures is such a big deal!

(Edit: When I say "simple wall clock", I'm talking about the elapsed time from start to finish for rendering one frame, disregarding how many other frames might be rendering at the same time. Throughput != 1/latency, and all that.)


I don’t believe the average is 24 hours of wall clock. I do think average render times have increased a bit over time, but FWIW, I think the average render time needs to be “overnight”. The shot just needs to rendered before dailies in the morning. If it takes longer than maybe 6-8 hours, it risks not being done by the next day, and that means each iteration with the director takes two days instead of one. There is significant pressure to avoid that, so when shots don’t finish overnight, people generally start optimizing.

When I was doing CG production shot work ~15 years ago, there were occasionally shots that ran 24 hours, but the average was more like 3 or 4 hours. The shots that took 24 hours or more usually caused people to investigate whether something was wrong.

I worked on one such shot that was taking more than 24 hours. A scene in the film Madagascar where the boat blows a horn and all the trees on the island blow over. The trees and plants were modeled for close-ups, including flowers with stamens and pistils, but the shot was a view of the whole island. One of my co-workers wrote a pre-render filter with only a few lines of code, to check if pieces of the geometry were smaller than a pixel, and if so just discard them. IIRC, render times immediately dropped from 24 hours to 8 hours.
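
A filter like that is conceptually just a screen-space size test applied before the geometry ever reaches the renderer. A rough sketch of the idea (not the actual code; it assumes a simple pinhole camera and that each primitive carries a camera-space depth and bounding radius):

    def projected_pixel_size(depth, bound_radius, focal_length, film_width, image_width):
        """Approximate on-screen diameter, in pixels, of a bounding sphere at camera depth."""
        if depth <= 0.0:
            return float("inf")   # at or behind the camera: never cull
        size_on_film = (2.0 * bound_radius) * focal_length / depth   # pinhole projection
        return size_on_film * image_width / film_width

    def cull_subpixel_geometry(prims, focal_length, film_width, image_width, threshold_px=1.0):
        """Drop primitives whose projected bound is smaller than ~1 pixel."""
        return [p for p in prims
                if projected_pixel_size(p.depth, p.bound_radius,
                                        focal_length, film_width, image_width) >= threshold_px]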


In computer graphics it's known as Blinn's Law: it states that no matter how fast your hardware gets, artists put more detail into shots, and therefore render times remain roughly the same. It's been roughly true for 30-ish years.


And it hurts. But man are the images gorgeous!


Wait, what? 24 hours per frame?!

At the standard 24fps it takes you 24 days per film second which works out to 473 years for the average 2 hour long film which can't be right.


In high-end VFX, 12-36 hours (wall clock) per frame is a roughly accurate time frame for a final 2k frame at final quality.

36 is at the high end of things, and the histogram is skewed more towards the lower end than towards 30+ hours, but it's relatively common.

Frames can be parallelised, so multiple frames in a shot/sequence are rendered at once, on different machines.


Hi Berkut, I'd love to get in touch with you, unfortunately I couldn't find any contact info in your profile. You can find my email in my profile. Cheers!


Not saying it's true, but I assume this is all parallelizable, so 24 cores would complete that 1 second in 1 day, and 3600*24 cores would complete the first hour of the film in a day, etc. And each frame might have parallelizable processes to get it under 1 day of wall time, but it would still cost 1 "day" of core-hours.


yup, you've also got to remember that a final frame will have been rendered many times.

Each and every asset, animation, lighting, texturing, sim and final comp will go through a number of revisions before being accepted.

So in all actuality that final frame could have been rendered 20+ times.

VFX farms are huge. In 2014 I worked on one that was 36K CPUs and about 15PB of storage. It's probably now in the 200K CPU range.


24 hours scaled to a normal computer, not 24 hours for the entire farm per frame.


It's definitely not 24 hours per frame outside of gargantuan shots, at least by wall time. If you're going by core time, then it assumes you're serial which is never the case.

That also doesn't include rendering multiple shots at once. It's all about parallelism.

Finally, those frame counts for a film only assume the final render. There's a whole slew of work-in-progress renders too, so a given shot may be rendered 10-20 times. Often they'll render every other frame to spot check, and render at lower resolutions to get it back quickly.


Maybe they mean 24 hours per frame per core


Again, I'm not sure whether this is core-hours, machine-hours, or wall clock. And to be clear, when I say "wall clock", what I'm talking about is latency between when someone clicks "render" and when they see the final result.

My experience running massive pipelines is that there's a limited amount of parallelization you can do. It's not like you can just slice the frame into rectangles and farm them out.


> It's not like you can just slice the frame into rectangles and farm them out.

Funny thing, you sure can! Distributed rendering of single frames has been a thing for a long time already.


What about GI? You can't just slice GI into pieces.


How I've seen it work in the past, it'll totally work with GI (and more generally, raytracing). If the frame to be rendered is CPU bound rather than I/O bound (because of heavy scenes), the whole project would be farmed out to the workers, so they have a full copy of what's to be rendered, and each is then assigned which part of the frame to render. Normally this happens locally: if you have 8 CPU cores, each one of them gets responsible for a small slice of the frame. Now if you're doing distributed rendering, replace CPU core with a full machine, and you have the same principle.

Obviously this doesn't work for every frame/scene/project, only when the main time is spent on actual rendering on the CPU/GPU. Most of the time when doing distributed rendering, the CPU isn't actually the bottleneck, but rather transferring the necessary stuff for the rendering (project/scene data structures that each worker needs).
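
A toy version of that slicing, with worker processes standing in for worker machines and render_tile() standing in for the actual tracer:

    from concurrent.futures import ProcessPoolExecutor
    from itertools import product

    WIDTH, HEIGHT, TILE = 1920, 1080, 64

    def render_tile(origin):
        # Placeholder: a real worker loads the full scene once, then traces
        # only the pixels inside its tile.
        x0, y0 = origin
        return origin, f"{min(TILE, WIDTH - x0)}x{min(TILE, HEIGHT - y0)} pixels"

    if __name__ == "__main__":
        tiles = list(product(range(0, WIDTH, TILE), range(0, HEIGHT, TILE)))
        with ProcessPoolExecutor() as pool:          # swap for whole machines on a farm
            for origin, result in pool.map(render_tile, tiles):
                pass  # composite each finished tile back into the full frame here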


Why are you thinking GI wouldn’t work? Slicing the image plane pretty much works for parallelizing GI just as well as it does for raster. It does help to use small-ish tiles, that way you get some degree of automatic load balancing.


This has been possible even for CGI tinkerers like me with C4D for more than ten years.


Not every place talks about frame rendering times the same. Some talk about the time it takes to render one frame of every pass sequentially, some talk about more about the time of the hero render or the longest dependency chain, since that is the latency to turn around a single frame. Core hours is usually separate because most of the time you want to know if something will be done overnight or if broken frames can be rendered during the day.

24 hours of wall clock time is excessive and the reality is that anything over 2 hours starts to get painful. If you can't render reliably over night, your iterations slow down to molasses and the more iterations you can do the better something will look. These times are usually inflated in articles. I would never accept 24 hours to turn around a typical frame as being necessary. If I saw people working with that, my top priority would be to figure out what is going on, because with zero doubt there would be a huge amount of nonsense under the hood.


Maybe it’s society or maybe it’s intrinsic human nature, but there seems to be an overriding “only use resources to make it faster to a point, otherwise just make it better [more impressive?]”.

Video games, desktop apps, web apps, etc. And now confirmed that it happens to movies at Pixar.


You can come at this from multiple directions.

On the one hand, it’s wise to only expend effort making something faster up to a point. At some point, unless a human has to wait for the result, there is no reason to make something faster [1].

On the other hand, once something takes more than a minute or two, and the person who started it goes and does something else, it doesn’t matter how long it takes, as long as it’s done before you get back. Film shots usually render overnight, so as long as they’re done in the morning and as long as they don’t prevent something else from being rendered by the morning, it doesn’t necessarily need to go faster. Somewhere out there is a blog post I remember about writing renderers and how artists behave; it posits perhaps there’s a couple of thresholds. If something takes longer than ten seconds to render, they’re going to leave to get coffee. If something takes longer than ten minutes to render, they’re going to start it at night and check on it in the morning.

[1] I always like the way Michael Abrash framed it:

“Understanding High Performance: Before we can create high-performance code, we must understand what high performance is. The objective (not always attained) in creating high-performance software is to make the software able to carry out its appointed tasks so rapidly that it responds instantaneously, as far as the user is concerned. In other words, high-performance code should ideally run so fast that any further improvement in the code would be pointless.

“Notice that the above definition most emphatically does not say anything about making the software as fast as possible. It also does not say anything about using assembly language, or an optimizing compiler, or, for that matter, a compiler at all. It also doesn’t say anything about how the code was designed and written. What it does say is that high-performance code shouldn’t get in the user’s way—and that’s all.” (From the “Graphics Programming Black Book”)


Excellent points, but I have two counters:

- Feedback loops

- Cumulative end-to-end latency

The second is especially challenging as only a handful of coders saying “that’s good enough” can add up to perceptibly massive latency for the end user.


Oh I’d agree, the blog post about coffee was somewhat tongue-in-cheek. One shouldn’t presume it’s fast enough, one should always measure. And your points echo Abrash... if a human is waiting for the computer, then the computer could be made faster. That includes any and all human-computer interactions and workflows.

Recalling a bit more now, the actual point of the blog post I was thinking of, and not summarizing super accurately, was to try to make things faster to prevent the artists from getting out of their seat, precisely because the tool it was referring to was primarily a feedback loop interaction. The tool in question was the PDI lighting tool “Light”, which received an Academy award a few years back. https://www.oscars.org/sci-tech/ceremonies/2013


Well it can't just be one frame total every 24 hours, because an hour-long film would take 200+ years to render ;)


They almost certainly render two frames at a time. Thus bringing the render time down to only 100+ years per film.


I’m going to guess they have more than one computer rendering frames at the same time.


Yeah, I was just (semi-facetiously) pointing out the obvious that it can't be simple wall-clock time


Why can’t it be simple wall-clock time? Each frame takes 24 hours of real wall-clock time to render start to finish. But they render multiple frames at the same time. Doing so does not change the wall-clock time of each frame.


In my (hobbyist) experience, path-tracing and rendering in general are enormously parallelizable. So if you can render X frames in parallel such that they all finish in 24 hours, that's roughly equivalent to saying you can render one of those frames in 24h/X.

Of course I'm sure things like I/O and art-team-workflow hugely complicate the story at this scale, but I still doubt there's a meaningful concept of "wall-clock time for one frame" that doesn't change with the number of available cores.


Ray tracing is embarrassingly parallel, but it requires having most if not all of the scene in memory. If you have X,000 machines and X,000 frames to render in a day, it almost certainly makes sense to pin each render to a single machine to avoid having to do a ton of moving data around the network and in and out of memory on a bunch of machines. In which case the actual wall-clock time to render a frame on a single machine that is devoted to the render becomes the number to care about and to talk about.


Exactly - move the compute to the data, not the data to the compute.


I suspect hobbyist experience isn't relevant here. My experience running workloads at large scale (similar to Pixar's scale) is that as you increase scale, thinking of it as "enormously parallelizable" starts to fall apart.


Wall-clock usually refers to time actually taken, in practice, with the particular configuration they use, not time could be taken if they used the configuration to minimise start-to-finish time.


It could still be wallclock per-frame, but you can render each frame independently.


True. With render farms, when they say X minutes or hours per frame, they mean the time it takes 1 render node to render 1 frame. Of course, they will have lots of render nodes working on a shot at once.


you solve that problem with massively parallel batch processing. Look at schedulers like Platform LSF or HTCondor.
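
For a per-frame batch, that mostly amounts to queueing one job per frame. A hedged sketch of driving an HTCondor submission from Python (the wrapper script name, frame count, and resource numbers are placeholders; the submit-description keywords are standard HTCondor syntax):

    import subprocess
    from pathlib import Path

    # One queued job per frame; $(Process) becomes the frame number 0..239.
    submit = """\
    executable     = render_frame.sh
    arguments      = $(Process)
    request_cpus   = 24
    request_memory = 128 GB
    output         = logs/frame_$(Process).out
    error          = logs/frame_$(Process).err
    log            = logs/shot.log
    queue 240
    """

    Path("shot.sub").write_text(submit)
    subprocess.run(["condor_submit", "shot.sub"], check=True)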


Haven’t heard those two in a while, played around with those while I was in uni 15 years ago :-O


I was a render wrangler and sys admin at a vfx shop for a few years, I then moved to a digital intermediate shop. Reading this thread has me remembering fun times.

I wrote about my experiences in more detail a couple of months ago:

https://blog.markjgsmith.com/2020/11/24/what-its-like-workin...

There’s also some render farm setup stuff in my portfolio on the blog if you’re interested.

The hardware side of things was incredible. So much expensive kit. There’s a lot of great hardware related comments in the thread, so I won’t go over that.

The thing that I found really cool was how all the software systems were set up for so many artists to collaborate. Though we didn't talk about agile methodologies (it was back in 2003/2004), the way we worked had a lot of similarity to a classic agile/scrum setup, with daily standups, though they were called 'dailies', and there was no notion of sprints (we were always sprinting!) or sprint planning, but similar planning was done by producers and department heads, and instead of user stories/features, people worked on 'shots'. At the digital intermediate place the unit of work tended to be reels, since we were scanning and printing entire reels, combining all the shots that the vfx houses completed.

Everyone used version control, though it was Subversion; I don't think git was that popular back then. Artists worked on their shots, checking in their project files rather than rendered files; we could always re-render a shot if necessary. There were also CLI tools to submit finished rendered files, which were automatically organised into a standard folder structure on shared storage. The artists' Linux shells would load environment variables from a DB, and their applications (Shake/Maya/Houdini etc.) would load these transparently, so they never had to worry about where things were stored. That was all automatic as long as they knew which shot they were working on.
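
As a stripped-down illustration of that kind of shot-aware launcher (the database location, table and variable names here are invented; the real systems were much richer):

    import os
    import sqlite3
    import subprocess
    import sys

    def launch_in_shot_env(shot, app_argv):
        """Look up a shot's pipeline variables in a DB and launch the app with them exported."""
        db = sqlite3.connect("/studio/pipeline/shots.db")   # hypothetical path
        rows = db.execute("SELECT key, value FROM shot_env WHERE shot = ?", (shot,))
        env = dict(os.environ)
        env.update(dict(rows))   # e.g. SHOT_ROOT, TEX_PATH, OUTPUT_DIR, ...
        subprocess.run(app_argv, env=env)

    if __name__ == "__main__":
        launch_in_shot_env(sys.argv[1], sys.argv[2:])   # e.g. shotenv.py sq100_010 maya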

It was a great place to learn about technology and collaborating on digital production at scale.

I’ve always thought there were a lot of setups and tools that could be applied to software development at scale. I’d love to work on that sort of project. Hint: I am available for hire :)


When I was a little younger, I was looking at 3D graphics as a career path, and I knew from the very beginning that if I were to do it, I would work towards Pixar. I've always admired everything they've done, both from an artistic and technical standpoint, and how well they've meshed those two worlds in an incredible and beautiful manner.


Same here. I'm older now and the dream is lost a bit, but in the beginning it really drove me towards some good and knowledgeable people. I worked in film (and animation), but once I saw the working conditions (low pay, long hours, difficult to find work) I moved to SWE. I still occasionally do some work on the side for VFX/Animation, and I think that's how I keep the "fun" in all of it.


I would love to know about some curious questions, for example:

If there's a generally static scene with just characters walking through it, does the render take advantage of rendering the static parts for the whole scene once, and then overlay and recompute the small differences caused by the moving things in each individual sub frame?

Or, alternatively what "class" of optimizations does something like that fall into?

Is rendering of video games more similar to rendering for movies, or for VFX?

What are some of physics "cheats" that look good enough but massively reduce compute intensity?

What are some interesting scaling laws about compute intensity / time versus parameters that the film director may have to choose between? "Director X, you can have <x> but that means to fit in the budget, we can't do <y>"

Can anyone point to a nice introduction to some of the basic compute-relevant techniques that rendering uses? Thanks!


If you're interested in production rendering for films, there's a great deep dive into all the major studio renderers https://dl.acm.org/toc/tog/2018/37/3

As for your questions:

> Is rendering of video games more similar to rendering for movies, or VFX?

This question is possibly based on an incorrect assumption that feature (animated) films are rendered differently than VFX. They're identical in terms of most tech stacks including rendering and the process is largely similar overall.

Games aren't really similar to either, since they're raster based rather than pathtraced. The new RTX setups are bringing those worlds closer. However, older rendering architectures like REYES, which Pixar used up until Finding Dory, are more similar to games' raster pipelines, though that's trivializing the differences.

A good intro to rendering is reading Raytracing in a Weekend (https://raytracing.github.io/books/RayTracingInOneWeekend.ht...), and Matt Pharr's PBRT book (http://www.pbr-book.org/)


> This question is possibly based on an incorrect assumption that feature (animated) films are rendered differently than VFX. They're identical in terms of most tech stacks including rendering and the process is largely similar overall.

Showcased by the yearly highlights reel that the Renderman team puts out.

https://vimeo.com/388365999


For a 2020 showreel, there sure were a lot of 2019 and earlier movies in there.

I'm pretty sure one of those shots was from Alien:Covenant (2017)


A showreel is more like a resume / sales sheet than a summary of a particular year. The 2020 date would only mean it included stuff up to and including 2020.


Thanks!

(I was also reading the OP which says "...Our world works quite a bit differently than VFX in two ways..." hence my curiosity)


One way that animated feature films are different than VFX is schedules. Typically, an animated feature from Disney or Pixar will take 4-5 years from start to finish, and everything you see in the movie will need to be created and rendered from scratch.

VFX schedules are usually significantly more compressed, typically 6-12 months, so often times it is cheaper and faster to throw more compute power at a problem rather than paying a group of highly knowledgeable rendering engineers and technical artists to optimize it (although, VFX houses will still employ rendering engineers and technical artists that know about optimization). Pixar has a dedicated group of people called Lightspeed technical artists whose sole job is to optimize scenes so that they can be rendered and re-rendered faster.

Historically, Pixar is also notorious for not doing a lot of "post-work" to their rendered images (although they are slowly starting to embrace it on their most recent films). In other words, what you see on film is very close to what was produced by the renderer. In VFX, to save time, you often render different layers of the image separately and then composite them later in a software package like Nuke. Doing compositing later allows you to fix mistakes, or make adjustments in a faster way than completely re-rendering the entire frame.


I suspect they mean more in approaches to renderfarm utilization and core stealing.

A lot of VFX studios use off the shelf farm management solutions that package up a job as a whole to a node.

I don't believe core stealing like they describe is unique to Pixar, but is also not common outside Pixar either, which is what they allude to afaik. It's less an animation vs VFX comparison, as just studio vs studio infrastructure comparison.


Here's a kind of silly but accurate view of path tracing for animated features https://www.youtube.com/watch?v=frLwRLS_ZR0

Typically, most pathtracers use a technique called Monte Carlo Estimation, which means that they continuously loop over every pixel in an image, and average together the incoming light from randomly traced light paths. To calculate motion blur, they typically sample the scene at least twice (once at camera shutter open, and again at shutter close). Adaptive sampling rendering techniques will typically converge faster when there is less motion blur.
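
Stripped of everything that makes it hard, that Monte Carlo loop really is just an average of random path samples per pixel; trace_random_path() below is a stand-in for the expensive part:

    import random

    def trace_random_path(x, y):
        # Stand-in for real path tracing: return the radiance carried by one
        # randomly sampled light path through image position (x, y).
        return random.random()

    def render_pixel(px, py, samples_per_pixel=256):
        # Monte Carlo estimate: the average of many random samples converges to the
        # true pixel value; the error shrinks roughly as 1/sqrt(samples_per_pixel).
        total = 0.0
        for _ in range(samples_per_pixel):
            total += trace_random_path(px + random.random(), py + random.random())
        return total / samples_per_pixel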

One of the biggest time-saving techniques lately, is machine learning powered image denoising [1]. This allows the renderer to compute significantly fewer samples, but then have a 2D post-process run over the image and guess what the image might look like if it had been rendered with higher samples.

Animated movies and VFX render each frame in terms of minutes and hours, while games need to render in milliseconds. Many of the techniques used in game rendering are approximations of physically based light transport, that look "good enough". But modern animated films and VFX are much closer to simulating reality with true bounced lighting and reflections.

[1] https://developer.nvidia.com/optix-denoiser


> If there's a generally static scene with just characters walking through it, does the render take advantage of rendering the static parts for the whole scene once, and then overlay and recompute the small differences caused by the moving things in each individual sub frame?

Not any more. It used to be that frames were rendered in bits then composited to make the final image. However, you then need lots of tricks to reflect what would have happened to the background as a result of the foreground ... shadows, for instance. So now the entire scene is given to the renderer and the renderer is told to get on with it.

Regarding physics cheats, it depends on the renderer but basically none. AI despeckling is making a huge difference to render times, however.

Directors don't get involved in scaling laws and stuff like that. Basically a studio has a "look" that they'll quote around.

Compute relevant techniques? A renderer basically solves the rendering equation. https://en.wikipedia.org/wiki/Rendering_equation
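
For reference, the equation behind that link says that outgoing radiance is emitted radiance plus a hemisphere integral of the BRDF times incoming radiance times a cosine term; in LaTeX:

    L_o(x, \omega_o) = L_e(x, \omega_o)
      + \int_{\Omega} f_r(x, \omega_i, \omega_o) \, L_i(x, \omega_i) \, (\omega_i \cdot n) \, \mathrm{d}\omega_i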

And have a look at Mitsuba! http://rgl.epfl.ch/publications/NimierDavidVicini2019Mitsuba...


> If there's a generally static scene with just characters walking through it, does the render take advantage of rendering the static parts for the whole scene once

From the detail of rendering they do, I'd say there's no such thing.

As in: characters walking will have radiosity and shadows and reflections so there's no such thing as "the background is the same, only the characters are moving" because it isn't.


Illumination is global so each frame needs to be rendered separately AFAIK.


That doesn't make any sense. Each frame is rendered separately because you need the granularity of individual static images. It has nothing to do with global illumination.


It is always a pleasure to watch/read about something that works very well it it’s domain. Nice that they put so much heart in optimising the rendering process.


Indeed. I read this and instantly wanted to spend like 6 months learning the system, decisions/reasons into making whatever trade offs they make, etc.

I think a key is keeping the amount of time to render a constant.


I'm surprised they hit only 80-90% CPU utilization. Sure, I don't know their bottlenecks, but I understood this to be way more parallelizable than that.

I ray trace quake demos for fun at a much much lower scale[0], and have professionally organized much bigger installs (I feel confident in saying even though I don't know Pixar's exact scale).

But I don't know state of the art rendering. I'm sure Pixar knows their workload much better than I do. I would be interested in hearing why, though.

[0] Youtube butchers the quality in compression, but https://youtu.be/0xR1ZoGhfhc . Live system at https://qpov.retrofitta.se/, code at https://github.com/ThomasHabets/qpov.

Edit: I see people are following the links. What a day to overflow Go's 64bit counter for time durations on the stats page. https://qpov.retrofitta.se/stats

I'll fix it later.


Rendering may be highly parallelizable, but the custom bird flock simulation they wrote may be memory constrained. This is why having a solid systems team who can do care and feeding of a job scheduler is worth more than expanding a cluster.


My guess would be that the core-redistribution described in the OP only really works for cores on the same machine. If there's a spare core being used by none of the processes on that machine, a process on another machine might have trouble utilizing it because memory isn't shared. The cost of loading (and maybe also pre-processing) all of the required assets may outweigh the brief window of compute availability you're trying to utilize.


Yeah. With my POV-Ray workload the least efficient part is povray loading the complex scene, and that's not multithreaded. The solution that works for me is to start as many frames concurrently as there are cores, or just two but staggered a bit (so there's always one frame doing parallelizable work that can use all cores, even if the other isn't).

But at that point it may get into RAM constraint, or some as yet unmentioned inter-frame dependency/caching.
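
A crude version of that staggering with POV-Ray (the +I/+O/+W/+H flags are standard POV-Ray options; the two-minute delay is just a guess you would tune to your scene-parse time):

    import subprocess
    import time

    STAGGER_SECONDS = 120   # guess: roughly the length of the single-threaded parse stage

    procs = []
    for frame in ("frame_0001", "frame_0002"):
        # While one frame is stuck in its serial parsing stage, the other should be
        # deep in the parallel trace and free to use every core.
        procs.append(subprocess.Popen(
            ["povray", f"+I{frame}.pov", f"+O{frame}.png", "+W1920", "+H1080"]))
        time.sleep(STAGGER_SECONDS)

    for p in procs:
        p.wait()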


I suspect they mean core count utilization, not per core utilization.

Ie there's some headroom left for rush jobs and a safety net, because full occupancy isn't great either.


Must be RAM then, because CPU is easy to prioritize.


Maxing out a CPU is easy; keeping it fed with data, and being able to save that data out, is hard.


Yes, but the work units (frames) are large enough that I'm still surprised.

Maybe they're not as parallelizable as I'd expect. E.g. if there's serial work to be done by reusing scene layout algorithms between frames.


A scene will have many thousands of assets (trees, cars, people, etc); each one will have its geo, which could be millions of polygons (although they use sub-ds).

each "polygon" could have a 16k texture on it. You're pulling TBs of textures and other assets in each frame.


Hmm, yes I see. TBs? Interesting. I'd like to hear a talk about these things.

Naively I would expect (as is the case for my MUCH smaller scale system) that I could compensate for network/disk-bound and non-multithreaded stages by merely running two concurrent frames.

On a larger scale I would expect to be able to estimate RAM-cheap frames, and always have one of them running per machine, but at SCHED_IDLE priority, so that they only get CPU when the "main" frame is blocked on disk or network, or a non-parallelizable stage. By starving one frame of CPU, it's much more likely that it'll need CPU the short intervals when it's allowed to get it.
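
That priority trick is directly available on Linux; a sketch, using prman on a made-up RIB file as the RAM-cheap filler job:

    import os
    import subprocess

    def drop_to_idle():
        # SCHED_IDLE: this process only gets CPU time when nothing else runnable wants it.
        os.sched_setscheduler(0, os.SCHED_IDLE, os.sched_param(0))

    # The "main" frame renders at normal priority elsewhere on the box; this filler
    # frame only soaks up whatever cycles the main frame leaves on the table.
    filler = subprocess.Popen(["prman", "cheap_frame.rib"], preexec_fn=drop_to_idle)
    filler.wait()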


Twitter really should just add support for long ass tweets instead of this janky mess of trying to pick out all the crumbs of people replying to themselves.

How the hell can we have so much tech and leave so many common use cases so fucking broken for so long

It’s a perfect example of how you don’t really decide how others use your products; people do.


Around 2002 I worked on Sun Microsystem’s Grid Engine and their “go to” demo for new customers was always a big Pixar render farm job. There was something about seeing the final rendered clip that convinced people that massively parallel work was the way to go for their use case.


From what I understand they still seem to render at 1080p and then upsample to 4k. Judging by Soul.


That seems extremely unlikely. The Renderman software they use has no issues rendering at 4k.


It's not really that unlikely. Most films render at 2k DCI. It's not so much that the software and hardware can't render higher; it's just diminishing returns for the increased render time.

Until very recently, most 4k films actually had their digital elements at 2k DCI-ish resolutions and were upscaled. I can't speak to whether Soul was rendered at 2k or 4k, but it wouldn't be surprising if it was 2k upscaled.


That's more accurate, Soul was probably rendered at 2k, as most of their movies.

If a shot doesn't look good enough when upscaled, they re-render in 4k.

Actually, they used to upscale most of the movies using Nuke, but recently they started using deep learning for that. At SIGGRAPH 2020, they gave a talk on this topic: https://s2020.siggraph.org/presentation/?id=gensub_443&sess=...


The 4k streaming copy of the film has stairsteps though. Like it's been upsampled. I'm sure their software can render at 4k but they choose not to for whatever reason.


are you sure that's not just an artifact of it being streamed?


Probably. Watching streaming 4k is just sad, as it's just wasting money (on devices and bandwidth) for marketing, with abysmal quality at 10-20 Mbit/s. Given that a FHD Blu-ray weighs in at 30-50 Mbit/s, that's nothing to wonder about, though, I guess.


It's almost certainly rendered in cinema 4k (4,096 x 2,160 pixels), if not more. Moreover, it'll be in a 16/32-bit log colourspace as well.


Nobody renders higher than 4k DCI unless the film is screening on large format (imax etc).

The color space will be acesCG most likely these days.


ahh smashing, I've been away for a while. I had heard rumours that they'd made a new colour standard, but I'd not actually looked into it.


Have you examined it or are you guessing? Because I'd have assumed they did too but I see jagged stairsteps on the 4k stream when zooming on sharp edges.


There is no way on earth that they will render at 1080. If it was 2007 it might have been 2k.

The master will have been rendered into a DCP.

What gets delivered to the streaming service is another thing.

what then ends up at your screen is even more of a guess.

A 4K DCP is a JPEG2000 stream[1], with 16bit colour. Something like 60-120 megabytes a second. Obviously this isn't practical to stream to consumers. 4K over a stream is always a balancing act; more often than not it drops to 1080.

[1] well normally it is


But you shouldn't be judging a stream, it's the worst version.


Very interesting discussion here. I'm founder of a cloud startup and would love to hear your view on our solution:

We're currently building an API for rendering with the major render engines like V-Ray, Corona, etc. We automatically scale jobs and deliver the results to our customers' favorite storage, like a NAS, an S3 bucket, or Google Drive.

Another factor is that we can reduce the cost of cloud rendering by a factor of 2-3x compared to the public clouds by tapping into spare capacity of secure data centers around the world (connected to our platform).


It's good to know they care about optimization. I had the assumption that all CGI is a rather wasteful practice where you just throw more hardware at the problem.


In the end it's still more profitable to hire performance engineers than to buy hardware. For the last decade I've heard the "toss more HW at it" argument. It hasn't held, because the amount of compute and storage needed goes up too.


In reality it’s never as simple as a single soundbite. If you are a startup with $1k/mo AWS bills, throwing more hardware at the problem can be orders of magnitude cheaper. If you are running resource-intensive workloads then at some point efficiency work becomes ROI-positive.

The reason the rule of thumb is to throw more hardware at the problem is that most (good) engineers bias towards wanting to make things performant, in my experience often beyond the point where it’s ROI positive. But of course you should not take that rule of thumb as a universal law, rather it’s another reminder of a cognitive bias to keep an eye on.


Yes - context matters. This article is about Pixar where it pays to have someone think about performance. My data points are only companies of 50 people and higher - in those cases cloud consumption was the number 2 cost line item behind people. It matters there. In most cases the people who know cost performance tend to be good at fighting latency - you need the same skills.

This may not apply on smaller side projects or places where technology is secondary.


That's very much an exaggeration. Pixar/Google etc. can't run on a single desktop CPU and spend a lot of money on hardware. The best estimate I have seen is that it's scale dependent. At small budgets you're generally spending most of that on people, but as the budget increases the ratio tends to shift to ever more hardware.


It is absolutely about scale... an employee costs $x regardless of how many servers they are managing, and might improve performance by y%... that only becomes worth it if y% of your hardware costs is greater than the $x for the employee.


The issue is that extra employees run out of low-hanging fruit to optimize, so that y% isn't a constant. Extra hardware benefits from all the existing optimized code written by your team, whereas extra manpower has to improve code your team has already optimized.


Eventually you have to buy more hardware. The Pixars and Google have the most to gain from added expertise.


No, artists can throw more problem at the hardware faster than you can throw hardware against the problem. There are enough quality sliders in the problem to make each render infinitely expensive if you feel like it.


CGI is heavily about optimization. I recommend checking out SIGGRAPH ACM papers , and there's a great collection by them on production renderers.

Every second spent rendering or processing, is time an artist is not working on a shot. Any savings in optimizations add up to incredible cost savings.


About this: one SIGGRAPH course was focused on multithreading for VFX. The result was so valuable that they made a book out of it.

I reviewed this book (2014): https://www.fevrierdorian.com/blog/post/2014/08/24/Multithre...


> I had the assumption that all CGI is a rather wasteful practice where you just throw more hardware at the problem.

Hardware can be overloaded quickly if you don't care about it. You still need some engineering to keep everything under control.

I suspect this assumption comes from the fact that CGI has a lot of different things to render, so you try to get hardware that supports, by and large, the problems you will have: you can't focus optimization on every single problem, but you can optimize the 90% use case, so that the remaining 10% of cases, even if they end up eating 90% of the artist time, still leave deadlines safe.


Most of these studios are tech-first, since they wouldn't have gotten where they are now without prioritizing tech.


Does their parallelism extend to rendering the movie in real time? One display rendering the sole output of an entire data centre.


This really makes me want to work for Pixar.


Is there a similar scheduler to K8s?


https://rmanwiki.pixar.com/display/TRA/Tractor+2

Much simpler, scales way higher, has much better documentation.

However, it's not designed to do the same thing.

Tractor is about cramming as much stuff onto a machine as possible, as fast as possible. It's also designed to handle processes that fail.

It doesn't force you to use stupid network patterns, and has very rich dependency management. But it's not decentralised. However, it's pretty stable.


Awesome !!!!! Thank you so much


Ah I was hoping for pictures.


For those that can't stand Twitter's UI: https://threadreaderapp.com/thread/1345146328058269696.html


It's very painful to follow a conversation on Twitter. I'm not sure why they think the way they've done things makes sense.


I think they A/B'd some UI and found that the confusing UI gets more clicks. Users are clicking around randomly trying to find context for the conversation, and that's "engagement".

I'm pretty sure the context isn't there in many cases, although I haven't figured out the rules exactly.


It was never supposed to support conversation in the first place.

People were supposed to shoot short, simple, single messages and other people maybe react to this with their own short, single messages.


Yeah, I've never seen any value in Twitter. They should've called it "Public IM" or limited IRC and made it function well like IM. Even the poor implementation of threads in Slack is way better than Twitter's perverted version.


> I've never seen any value in Twitter.

Except for the information that doesn't show up on other platforms, which is why it is a frequently linked site here?


An ever-increasing number of people uses https://threadreaderapp.com/ though.


I am not arguing that Twitter is the perfect interface for its content, or even that its content is particularly good. I am just saying that it can have value regardless of its (many) faults.


That value is in the content, not twitter itself


Apart from being stuck to a single level in depth (which IMHO kind of makes sense in the context of an IM platform), what's that bad about Slack's threads?


Well, literally, you need to move aside - they took the skeuomorphism of people stepping aside to discuss a topic without bothering others too far.


Not sure what you would have preferred? I can only think of inline, but I'm not sure how well that would have worked in an IM application, where you'd typically expect things to be in chronological order. You basically described the point of having threads in the first place, and I also cannot think of many other existing implementations, so I'm curious.


We are way way past that point.

It’s time for Twitter to evolve in so many ways.


Evolve or die. Twitter dies is a good option.


I can't even remember the last user-facing feature change Twitter made.


Fleets, I suppose?


Proper support in the UI for RTs with comments.

But honestly I'd prefer if they spent some fucking time upstaffing their user service and abuse teams. And if they could finally ban Trump and his conspiracy nutcase followers/network.


If the very light gray color of your comment is any indication, folks here didn't like what you said.

I certainly don't think that HN should ban you because of it, though.


[flagged]


You shouldn't be so quick to jump to conclusions! To be clear, I'm about as far from "right wing" as one can get. I wasn't aware that standing up for the First Amendment rights of others was a partisan issue, though.

Now, I realize that -- in today's divided society -- I'm supposed to be completely on board with silencing those who hold different opinions or disagree with "our" beliefs ("canceling", I think it's called?) but, well, it just don't work like that.

See, I'm from the rural midwest (a self-proclaimed "country boy"), I drive a big 4WD truck, I ride a loud ass Harley-Davidson motorcycle, and there's three or four times as many guns in this house as there are people. My hometown, my family, my friends, and my acquaintances are all overwhelmingly Republicans -- including some of the people that I love and care for the most in this world -- yet, somehow, I've been a Democrat for my entire adult life.

Contrary to what some folks in "my" party seem to think, however, EVERYONE (still) has the right to their own beliefs and opinions -- and to express them -- no matter how ignorant, ill-informed, asinine, outright stupid, or batshit crazy they may be!

We have a saying around here in "my neck of the woods": I may not agree with what you say but I will fight for and defend to my death your absolute right to say it.

For the record, the only time I've even come close to attempting to "silence" someone or prevent them from exercising their rights has been at funerals -- as a (proud) member of the Patriot Guard Riders.


Reinforcing the walls of your bubble doesn't make it less of a bubble. No matter left or right.


I think they can't as their system is built with many of these limitations and preconceptions and it's hard for it to evolve easily.


It may be difficult to evolve, but it should be possible.


If there's a will, there's a way. Remember how long they tried to convince themselves that 140 characters was more than enough. Nowadays, even phones display conversations better and combine or split text messages, hiding the 160-character limit from the end user, and allow texts longer than 280 characters. Twitter has some nice ideas, but it's mostly a huge missed opportunity.


The character limit was central to making Twitter what it is. Without that limit you'd just have ended up with walls of text and nobody would have been interested. This is a great example of limitations forcing creativity.

The character limit is the defining feature of Twitter.


Maybe for you. I'm not saying have no limit, but a word-count limit is better than a character count. They already started to walk back their _defining feature_ by not counting characters in URLs, etc.


"You are using our product wrong" is never the hallmark of a successful product manager.


Even though I know I must be violating like half the conversational bylaws here on HN I gotta throw in...

"It's not a bug, it's a feature" I've heard since the late 80's and still works magic if you ask me.


You are describing a chat


That sounds a lot like a conversation to me.


I used to feel the same way until I decided to actually use my twitter account on a regular basis, which probably means it's on some level objectively terrible. But now I've learnt the logic of how to navigate twitter I find it very intuitive.

Twitter have tried to have their cake and eat it too by allowing a reddit- or HN-style tree (which is good for discussions) while having their UI present it as a linear feed (which is good for engagement). This has led to their current solution where the branches are presented in a choose-your-own-adventure way. Now because the linked tweet has three replies, it has to show all three of them and leave it to you to select which branch you want to go down by clicking on it or the show replies prompt (though it does give primacy to pixprin's replies to self).


As a reading medium seeing a bunch of tweets strung together is not fantastic as implemented today.

As an authoring medium though, the character constraints force you to write succinct points that keep the reader engaged. You can focus your writing just on one point at a time, committing to them when you tweet, and you can stop anytime. If you're struggling with writing longer form pieces a tweet thread is a great on-ramp to get the outline together, which you can later expand into a post.

As a conversation medium, it's also nice to be able to focus conversation specifically on a particular point, rather than get jumbled together with a bunch of unrelated comments in the comments section at the end of a post.


Really? I don't like that Twitter makes me reword and abbreviate words just because I'm 2 characters over! Any medium (pun intended) has the concept of paragraphs, but doesn't typically limit how long a paragraph is. It could guide you not to go overboard, but forcing you to a certain made-up character limit is not really acceptable in 2021.


You need to realize that the character limit was the defining feature of Twitter. There are many platforms where grumpy uncle could dump a wall of text with his views, but nobody was attracted to another one of those. In today's attention economy, limiting expressions of thought was exactly the right thing.


Maybe you do, but I never want to dump anything! I want to write a paragraph. I can't in many cases, and I need to break English or ruin the wording. So, it's okay for the texting culture, I guess, but not for people who care about what they say and how they say it. I know many people bombard you on IM or Slack with tens of messages instead of writing a single paragraph. You think that's okay? It's a pity if you think so! No wonder people today are suffering from all kinds of attention disorders!


It makes people misunderstand each other, which makes them angry, and increases engagement/replies.


Shouting works better than conversing to keep users "engaged" and target them with ads.


Thank you. All I saw was a post with zero context, followed by a reply, followed by another reply using a different reply delineator (a horizontal break instead of a vertical line??), followed by nothing. It just ends. It's hard to believe this is real and intended.


Twitter, and social media in general, is not intended to give you space for nuance. Long conversations are discouraged. Hot takes are encouraged. They want to smash you with things that trigger an emotional response over and over. In my personal business I don't care all that much what you 'want,' I care what you'll pay for. Twitter doesn't care what you want, they care what drives engagement/ad sales.


It is real, probably not quite intended. Or at least not specifically designed.


It's amazing to me that people find twitter difficult to read... I mean it's not perfect but it's not an ovaltine decoder ring, either.

Just ... Scroll ... Down ... Click where it says "read more" or "show more replies"

You're human; THE most adaptable creature known. Adapt!

I'm not saying that twitter UX is perfect, or even good. I AM saying that it is usable.


Twitter was designed around a maximum of 280 characters per message. That means that for this kind of long-form text, the signal-to-noise ratio of a large number of "tweets" is pretty low.

The amount of stuff your brain has to filter out, in the form of the user name, the tweet handle, additional tagged handles, UI menus, and UI buttons for replying, retweeting, liking, etc. on every single tweet, makes you work far harder than you should to read a page of text.

Just imagine if I had written this exact text as three separate HackerNews comments, prepending each with "1/", "2/", "3/" on top of all the message UI; it would have been harder to read than a simple piece of text.


Fully agree with your takeaway. Adding context on the character limit:

> Twitter was designed to have 280 characters max per message.

Twitter was designed around 140 characters, plus room for 20 characters of username, to fit in the 160-character budget of an SMS message.

Upping it to 280 later was capitulating to the fact that nobody actually wants to send SMS-sized messages over the internet.


Does Twitter even allow you to tweet via SMS anymore?


You all are perfect delicate flowers that need things to be just right in order to use them, then? Is that what you're saying?

Because that's what I'm getting from you.


I mean, yes? :-)

If I'm going to use something, it should be intuitive and usable. It should be fit for its purpose, especially with a myriad of single- and multi-purpose tools available to everyone. This doesn't feel like something I should have to justify too hard :-)

Twitter is not a necessity of life. I don't have to use it. They want me to use it and if so they can/should make it usable.

Its paradigm and user interface don't work for me personally (particularly when people try to fit an article into something explicitly designed for a single sentence; it feels like a misuse of a tool, like hammering a screw), so I don't use it. And that's ok!

I don't feel they are morally obligated to make it usable by me. It's a private platform and they can do as they please.

But my wife is a store manager, and she taught me that "feedback is a gift": if a customer is going to leave the store and never come back, she'd rather know why than remain ignorant.

She may or may not choose to address it, but being aware and informed is better than being ignorant of the reasons.

So at the end of it, rather than downvote, let me ask what the actual crux of your argument is. That people shouldn't be discriminating? That they should use optional things they dislike? That they shouldn't share their preferences and feedback? That Twitter is a great tool for long-format essays? Or is it something we're all missing?


I for one am saying Twitter has many, many people working on UX and the web interface is still terrible. Hardly any interface made for such a broad spectrum of people gets it just right for all its users, but Twitter is doing an exceptionally bad job at it.


I don't think you (or anyone else on this website) understand the scale at which twitter operates. Including all replies to a thread by default on a site with that amount of traffic will increase the load of ... everything. Dramatically.

Yes, their UI sucks. No, they haven't fixed it. I would not be surprised at all if that were due to necessity rather than incompetence. Even tiny changes at that scale make large differences.

And, honestly I would not be surprised if it were incompetence, either. The skill of the silicon valley developer (as demonstrated by the commenters here) does not impress me even a little.


I don't think it's incompetence.

I think it's a UI that was designed for one very clear, specific use-case and has been stretched beyond belief by people cramming it to fit radically different use-cases. The developers, I'm sure, are simply stuck: do they optimize for the original use-case and its legions of users, or for the other use-cases and their legions of users? I genuinely believe they cannot make everybody happy.

But as a user:

1. As an author, you have a choice not to try to cram a square kitchen through a round sink (long-form articles onto twitter)

2. As a reader, you have a choice not to consume content from a platform that wasn't designed for, and doesn't support well, that kind of content.

There's a million lifetimes worth of fascinating, useful, interesting, readable content out there.

Puzzling through the maze that is Twitter long-form article threads... is not how I choose to spend my time :).

As I asked in another post, I'm not sure what your underlying point is - that the UI is good and fit for purpose? Or that we should put up with it? Or that we should be more understanding of it, and that implies not criticizing it?


> I don't think you (or anyone else on this website) understand the scale at which twitter operates. Including all replies to a thread by default on a site with that amount of traffic will increase the load of ... everything. Dramatically.

I understand that fetching all the replies all the time is next to impossible; that's not what I was criticizing. I get the impression that Twitter is doing a good job performance-wise anyway.

> And, honestly I would not be surprised if it were incompetence, either. The skill of the silicon valley developer (as demonstrated by the commenters here) does not impress me even a little.

I think incompetence is part of it, but it's not the only reason. Twitter probably wants its mobile users to interact through the native iOS/Android apps. It's not as blatant as what Reddit is doing, but making the website flaky and a bit unpleasant to use is certainly a way to get more people onto the app.

I imagine the app looks and feels better. I never used it and I don't intend to, for unrelated reasons. But then I guess I don't really have a right to complain.


> I don't think you (or anyone else on this website) understand the scale at which twitter operates.

This website is full of employees (and ex employees) of Google, Facebook, Amazon, Netflix, Twitter, and any other company that has done stuff at global scale.

Scaling provides constraints, we all understand that. The interface being completely miserable to use is not because of that constraint.


Well... by way of analogy, I'm sure you'd be pissed if you went to a steak restaurant and your prime rib came with a spoon and a fork.

Sure, you /can/ cut your steak with a spoon and a fork, but... it's just "painful", because those tools weren't made for that. Would we think you're a delicate flower for asking for a knife? (Or, to make the analogy better, let's say you're at your friend's BBQ and he's giving you the steak for free.)

I like Twitter, and I use it for certain specific things for which a 200-character text is quite good.


no - I'm a delicate flower that refuses to use that sad excuse of a 'service'...

Plenty of much better, more readable content on the internet without submitting myself to that low quality shit show with a poor ui.


It's very unintuitive that when you open a Thread, the "back" arrow means "back to something else" and you have to scroll up above "line 0" to see the context of the thing being replied to. I forget this every single time I open a tweet thread and try to figure out the context.

Once you scroll up, it sort of makes sense: each tweet has a line connecting the user icons. But then suddenly the expanded thread shows the main tweet in a larger font, then the "retweet/like" controls below it, THEN another run of smaller-font tweets that make up the thread. Then you get some limited number and have to click "more" for more.

The monochrome of it all reminds me of when GMail got rid of the very helpful colors-for-threads and went to grey on grey on grey.

It's not visually apparent at all.


I am not saying it is intuitive.

I'm saying it's usable.

I'm saying that complaining about it makes people look like they think they're royalty who need everything just so or their whole day is ruined... And now they can't tell the butler to take Poopsie for a walk because they're so shaken by the experience.


Is "usable" all we can expect from one of the world's most popular websites?


> Just ... Scroll ... Down ... Click where it says "read more" or "show more replies"

That doesn't work. Neither "read more" nor "show more replies" appears on the page [1]. Nor does "show replies", which turns out to be what you actually need to click once you get to a place where it appears. In fact, there's no indication that there even are more replies, or that the "replies" are actually the main content.

To see the content, it turns out you need to click on the original "1/" tweet, which takes you to what looks like a new page (but doesn't change the URL, so you can't link to it).

It is not usable in any real sense. I only spent the time to solve the puzzle as I was trying to make sense of your comment.

[1] https://imgur.com/a/BL9M74m (the centre column is shown in full, and there's nothing down further)


[flagged]


If you had not written your comment claiming it was usable, I would not have been motivated to try to figure out the interface and see how you could claim that in good faith. Once I figured out how to get at the content (by randomly clicking on things that didn't look like interface elements), I disagreed. I think a "usable" interface means you can tell how to use it (requiring you to read a manual would still count as "usable" for a complex task, but Twitter doesn't have one, and it's a simple product).

I think you are using a more extreme definition of "usable" than most people.

Originally, I clicked on the link, saw there was almost nothing there and no visible way of getting more, and went looking for a usable mirror in HN comments.

I'm 41. I'm not a paraplegic, but I am physically disabled; I've spent most of my adult life unable to use keyboards, for example, and there are many ordinary things that most people take for granted that I simply can't do. I think I have a fair and reasonable definition of "usable".


So because there are people in wheelchairs nobody can complain about a suboptimal user experience on a popular website with hundreds of engineers making copious amounts of money?


> So because there are people in wheelchairs nobody can complain about a suboptimal user experience on a popular website with hundreds of engineers making copious amounts of money?

That's not what I'm saying. Not even close. However, if you knew even a small number of the usability issues that certain people are forced to put up with every day of their lives, you'd cease viewing twitter UX as a problem worth even acknowledging.


I think building a more usable alternative (or in this case using an existing one) is a better idea than adapting to Twitter's horrendous "UX".


Dude, I use Twitter regularly, and this thread was super annoying: it showed me only two of the tweets that are supposed to be interesting here, even after I clicked on "more".


Dude, I use Twitter regularly, and I navigated the thread easily.


> JavaScript is not available. We've detected that JavaScript is disabled in this browser. Please enable JavaScript or switch to a supported browser to continue using twitter.com. You can see a list of supported browsers in our Help Center.

So much for Pixar's render farm.


They changed this about a fortnight ago. Twitter previously worked without JavaScript.


A website needs to run arbitrary code on your machine and use a "supported" (i.e. approved and controlled by the megacorps) client, just to read short snippets of text.

That just about sums up the insanity of the "modern web".


Longish-form writing on a platform that specifically restricts the length of each message is painful, to say the least.


Thanks. Out of curiosity, did the Twitter OP have to post these 8 items in reverse order (to have them appear as intended)? Or is this a bunch of "replies" to the first post, which stay in the right order as they're added?


I wish there was a service that restructures twitter like a reddit thread.
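
Such a restructuring isn't hard in principle. As a purely illustrative sketch (made-up sample data, no real Twitter API calls), given a flat list of (id, parent id, text) tuples you can rebuild the tree and print it reddit-style with nested indentation:

  from collections import defaultdict

  # Made-up sample data: (tweet id, parent id, text).
  tweets = [
      (1, None, "pixprin: 1/ ..."),
      (2, 1,    "pixprin: 2/ ..."),
      (3, 1,    "someone: question about 1/"),
      (4, 3,    "pixprin: answer"),
  ]

  # Group tweets by their parent id.
  children = defaultdict(list)
  for tid, parent, text in tweets:
      children[parent].append((tid, text))

  def print_tree(parent=None, depth=0):
      # Indent each reply under its parent, reddit-style.
      for tid, text in children[parent]:
          print("  " * depth + text)
          print_tree(tid, depth + 1)

  print_tree()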


