It has to be noted that this is quite a strange approach all round: this framework reads all of the data into memory. So if you have a 100GB genome, it will read all 100GB into memory. Presumably it sits there uncompressed, so we are talking hundreds of GB to process even a single whole-genome sample.
This may indeed have some performance benefits, but it's a very impractical approach from a hardware point of view. Few places processing genomic data will have many compute nodes with > 256GB of memory, yet that would barely cover one sample with this framework. God forbid you have a family of samples or tumor/normal pairs to analyse and need several genomes in memory together.
Genome analyses are for the most part massively parallelisable, and nearly every other toolkit I have seen puts that first and foremost in its design. Ensuring that tools process data in a streaming manner and can pipe into each other is a basic expectation of most genomic data tools.
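To make the streaming point concrete, here is a minimal sketch (Python, not taken from any real tool) of the kind of filter that model assumes: it reads SAM records from stdin one line at a time, keeps only reads above a mapping-quality threshold, and writes the survivors to stdout, so memory use stays flat no matter how large the input is. The MIN_MAPQ value and the script itself are made up for illustration; only the SAM column layout is standard.

    #!/usr/bin/env python3
    # Hypothetical streaming filter: SAM records in on stdin, filtered
    # SAM records out on stdout. Nothing is buffered beyond one line.
    import sys

    MIN_MAPQ = 30  # arbitrary example threshold, not from any real pipeline

    def main() -> None:
        for line in sys.stdin:
            # Header lines (@HD, @SQ, ...) pass through untouched.
            if line.startswith("@"):
                sys.stdout.write(line)
                continue
            fields = line.split("\t")
            # SAM column 5 (0-based index 4) is the mapping quality.
            if int(fields[4]) >= MIN_MAPQ:
                sys.stdout.write(line)

    if __name__ == "__main__":
        main()

A filter like this slots into a shell pipeline, e.g. something like samtools view -h sample.bam | python3 mapq_filter.py | samtools view -b -o filtered.bam -, with each stage streaming records to the next rather than materialising the whole genome in memory.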
Which is all to say ... this is a very strange beast and I'm not sure a lot of conclusions can be drawn from it that generalise to other activities or approaches.
Untrue. Large-memory nodes are par for the course for genomics workloads. Only a few stages of the analysis pipeline can stream their input/output effectively. Even then, writing a result to disk only for it to be read straight back is going to bottleneck your pipeline.
We once asked our cluster department to give us bare-metal access to their smallest machine. They gave us a dedicated server with 256GB of RAM. The real cluster nodes are bigger than that because they handle multiple jobs at the same time, and there are hundreds of them.
This isn't some Electron app where it would be inexcusable to use that much RAM. The hardware available to scientists is more than capable of handling these workloads.