Yup, 2x40.

It also depends on what you plan to do with the GPU. For example, models that do most of their work on the GPU and rarely ingest data from the host, such as large, slow models, will run just fine. On the other hand, trying to parallelize training across GPUs and nodes is a chore...

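A minimal sketch of the trade-off being described, assuming PyTorch and an available CUDA device (none of the names below come from the comment itself): it times a host-to-device copy against a large on-GPU matmul, which is roughly the comparison that decides whether the host link matters for a given workload.

```python
# Rough sketch: compare host->device transfer time with on-GPU compute time.
# Assumes PyTorch with CUDA support; sizes are illustrative only.
import time
import torch

assert torch.cuda.is_available(), "needs a CUDA GPU"
device = torch.device("cuda")

n = 8192
x_cpu = torch.randn(n, n)                 # ~256 MiB of fp32 data on the host
w = torch.randn(n, n, device=device)      # weights already resident on the GPU

# Time the host -> device copy (this is what crosses PCIe / the host link)
torch.cuda.synchronize()
t0 = time.perf_counter()
x_gpu = x_cpu.to(device)
torch.cuda.synchronize()
copy_s = time.perf_counter() - t0

# Time a single large matmul that stays entirely on the GPU
torch.cuda.synchronize()
t0 = time.perf_counter()
y = x_gpu @ w
torch.cuda.synchronize()
compute_s = time.perf_counter() - t0

print(f"host->device copy: {copy_s * 1000:.1f} ms")
print(f"on-GPU matmul:     {compute_s * 1000:.1f} ms")
# If the compute time dwarfs the copy time, the model tolerates a slow host
# link; if copies dominate (e.g. streaming fresh batches every step, or
# syncing gradients across GPUs/nodes), the interconnect becomes the bottleneck.
```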