There are very few scenes that need bi-directional or VCM (in VFX, anyway). In theory it converges faster for indirect illumination, but because both methods need to run in incremental mode (1 sample per pixel over the entire image per pass), you lose a lot of cache coherency for texture access (even for camera rays): you're constantly thrashing the texture cache, so renders end up a lot slower. There are much better ways of reducing this indirect noise in production (portals, blockers).
On top of this, it's also very difficult to get the ray differentials correct when merging paths, so you end up point-sampling the textures, meaning huge amounts of texture IO.
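To make the differentials point concrete, here's a minimal sketch of the standard footprint-to-mip-level mapping (the function name and level count are made up for illustration, not any renderer's API). A valid ray differential gives you a footprint width in texels, which picks a cheap coarse mip tile; a merged path with no differentials forces the conservative fallback, a point sample at full resolution:

```python
import math

def mip_level(footprint_texels: float, num_levels: int) -> int:
    """Pick a mipmap level from a ray-differential footprint.

    footprint_texels: width of the ray's footprint in texels at level 0.
    A footprint of 1 texel maps to level 0 (full res); each doubling of
    the footprint moves one level coarser.
    """
    if footprint_texels <= 1.0:
        return 0
    return min(num_levels - 1, int(math.log2(footprint_texels)))

# With valid differentials, a wide secondary-bounce footprint reads a
# small, cheap mip tile:
assert mip_level(64.0, 10) == 6

# When paths are merged and the differentials are lost, the only safe
# fallback is a point sample, i.e. level 0 of the full-res texture,
# which is exactly the "huge amounts of texture IO" case:
assert mip_level(1.0, 10) == 0
```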
So there are two different things there: bidirectional tracing with and without VCM. VCM takes longer to trace, but it takes care of outlier samples (fireflies) that can't be oversampled away in practice.
When it comes to any sort of bounce, forward raytracing is painful; anything that helps is good.
Most renderers don't take texture cache coherency into account much at all, which makes me think you work for Disney?
Per iteration, bi-directional is extra work too. Obviously these integration methods are much better at finding certain hard-to-find light paths/sources, but my point is that in VFX, it's generally good enough to fake stuff by just turning shadow/transmission (depending on renderer) visibility off for certain objects to allow light through.
It's rare that we actually have glass/metal objects with lights in/around them such that bi-directional / VCM actually makes sense. Even for stuff like eye caustics we've found that uni-directional does a pretty good job. And in other situations, like car headlights behind glass with metal reflectors behind the light, you just turn transmission visibility off for the glass part. Yes, it's not fully accurate (in terms of refraction and light-leak lines), but we're generally using IES profiles for lights like this, so we get accurate light spread patterns anyway.
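The faking trick above amounts to tagging objects with per-ray-type visibility flags and skipping them when tracing shadow/transmission rays. A toy sketch of the idea (all names here are invented for illustration, not any particular renderer's API):

```python
from dataclasses import dataclass, field

@dataclass
class SceneObject:
    name: str
    # Per-ray-type visibility flags; most production renderers expose
    # something equivalent (these flag names are made up).
    visible_to: set = field(
        default_factory=lambda: {"camera", "shadow", "transmission"})

def occluders(scene, ray_type):
    """Objects considered when tracing a ray of the given type."""
    return [o for o in scene if ray_type in o.visible_to]

glass_cover = SceneObject("headlight_glass")
glass_cover.visible_to.discard("transmission")  # let light pass straight through
reflector = SceneObject("metal_reflector")
scene = [glass_cover, reflector]

# Transmission rays from the bulb now ignore the glass entirely, so
# direct light escapes without needing caustic-finding integrators:
assert [o.name for o in occluders(scene, "transmission")] == ["metal_reflector"]
# Camera rays still see the glass, so it renders normally:
assert "headlight_glass" in [o.name for o in occluders(scene, "camera")]
```

The cost is the inaccuracy mentioned above (no refraction of the escaping light), which the IES profile then compensates for.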
Well, they do, in that camera rays (and light rays in bi-directional/VCM) generally end up using the highest-resolution mipmap levels of textures, so you're reading a lot more data for those samples, hence pushing stuff out of cache much more. We've seen this with PRMan 19/20 in RIS: using incremental can be a 3x slowdown in some cases compared to non-incremental, as the camera rays are much more coherent per-bucket in non-incremental, so the high-resolution mipmap tiles are kept in cache much more. With incremental, you're only sending the bucket-size number of samples and the equivalent texture reads for the camera rays, with secondary bounces generally using much smaller, lower-resolution mipmap tiles for their texture requests (and you can get away with box-filtering those in 95% of cases), then moving on to the next bucket, which will probably need completely different high-resolution mipmap tiles for its camera rays. With texture IO often being the bottleneck in VFX rendering, this is a huge issue.
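A crude way to see the effect is to simulate an LRU tile cache under the two sampling orders. Non-incremental rendering re-touches a bucket's camera-ray tiles back-to-back while they're still resident; incremental sweeps the whole image each pass, so by the time it returns to a bucket its tiles have been evicted. All the numbers below are invented for illustration:

```python
from collections import OrderedDict

def hit_rate(tile_accesses, cache_size):
    """Hit rate of an LRU cache over a sequence of texture-tile ids."""
    cache, hits = OrderedDict(), 0
    for t in tile_accesses:
        if t in cache:
            hits += 1
            cache.move_to_end(t)  # mark most-recently-used
        else:
            cache[t] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least-recently-used
    return hits / len(tile_accesses)

buckets = 16          # screen-space buckets
tiles_per_bucket = 4  # distinct high-res tiles each bucket's camera rays touch
samples = 8           # samples per pixel

# Non-incremental: finish all samples of one bucket before moving on.
bucketed = [b * tiles_per_bucket + t
            for b in range(buckets)
            for _ in range(samples)
            for t in range(tiles_per_bucket)]

# Incremental: 1 sample per pixel across every bucket, then repeat.
incremental = [b * tiles_per_bucket + t
               for _ in range(samples)
               for b in range(buckets)
               for t in range(tiles_per_bucket)]

cache_size = 8  # far smaller than the 64 distinct tiles in play
print(hit_rate(bucketed, cache_size))     # high: tiles reused while resident
print(hit_rate(incremental, cache_size))  # low: every pass re-faults its tiles
```

Same total accesses, same tiles; only the ordering changes, and the incremental order's hit rate collapses, which is the IO blow-up described above.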
Nope, still in London for a bit, then off to NZ...