More

flx42_ · on July 8, 2020

Thanks, I have reported it internally and it is now fixed.

flx42_ · on May 2, 2018

nvidia-docker[1] maintainer here.

Curious to know, are you using docker today? If yes, is there anything missing to satisfy your security requirements?

[1] https://github.com/NVIDIA/nvidia-docker

bradneuberg · on May 2, 2018

We aren't using Docker today; we have jailing infrastructure using linux cgroups, namespaces, seclist, etc.

flx42_ · on Aug 23, 2017

Just wanted to chime in on TensorRT, it's a well supported product and it's different than gpu-rest-engine. This GitHub repo is simply an example of how to use TensorRT in a specific situation.

flx42_ · on Aug 18, 2017

We document how this on our wiki: https://github.com/NVIDIA/nvidia-docker/wiki/Internals

> The added benefit of this is that you can use different versions of the drivers side-by-side (in my understanding).

No, you can only have one driver version, the one that correspond to the loaded kernel modules. Installing the driver inside a Docker image makes it non-portable.

ThePhysicist · on Aug 18, 2017

Ah thanks for the clarification, I was not aware of this!

flx42_ · on Aug 18, 2017

It allows you to run GPU-accelerated applications (like machine learning, HPC, video/image processing...) inside a Docker container.

flx42_ · on Sept 9, 2016

If using Docker is an option, the official Dockerfile works well, you just need to modify the FROM line to "nvidia/cuda:8.0-cudnn5-devel-ubuntu16.04". Or "nvidia/cuda:8.0-cudnn5-devel-ubuntu14.04", depending on which version of Ubuntu you want.

https://github.com/tensorflow/tensorflow/blob/master/tensorf...

mnbbrown · on Sept 9, 2016

What sort of performance impact would Docker have in this situation? Any at all?

Edit: spelling

flx42_ · on Sept 9, 2016

No performance impact as long as your I/O is done in volumes, to avoid going through AUFS.

flx42_ · on Sept 9, 2016

One of your section is named "Install Nvidia Toolkit 7.5", this is probably what confused parent @hughperkins.

wagonhelm · on Sept 9, 2016

Just found that one and fixed it, also someone confirmed working with a GTX 1080. I am so happy to finally ditch 14.04

flx42_ · on Aug 7, 2016

At NVIDIA we maintain this utility: https://github.com/NVIDIA/nvidia-docker

It automatically discovers the devices and the right driver files on the host.

The main goal is compute (CUDA), but we also demonstrated how to run TF2 on Steam OS during our DockerCon 16 OpenForum presentation.

Nice job! :)

voltagex_ · on Aug 8, 2016

I'm really not up on how this all works, but isn't https://github.com/NVIDIA/nvidia-docker/blob/master/ubuntu-1... hardcoding driver versions in a different way?

flx42_ · on Aug 8, 2016

No, this is the CUDA toolkit, it doesn't depend on the driver version. You can compile CUDA code without having a GPU (which is the case during a "docker build").

Edit: in other words, your Docker image doesn't depend on a specific driver version and can be ran on any machine with sufficient drivers. Driver files are mounted as a volume when starting the container.

flx42_ · on July 16, 2016

Well yes, you do need to have the driver installed on the host OS :)

You can run multiple containers on the same GPU with nvidia-docker, it's exactly the same as running multiple processes (without Docker) on the same GPU.

flx42_ · on July 16, 2016

Author of nvidia-docker here. You can definitely have multiple containers on each GPU if you want. If you find a bug or if you think the documentation was not great, please file a bug!

SequoiaHope · on July 16, 2016

Awesome. Thanks for the reply and I apologize for suggesting something incorrect.

It does strike me as tricky needing to match driver versions between the host and the container. Do you know if there is any effort to eliminate that requirement?

Also while we're chatting, is there any hope of NVIDIA open sourcing their linux drivers? How would such a move affect nvidia-docker?

flx42_ · on July 17, 2016

You don't need to match the driver version between the host and the container. Actually, you shouldn't include any driver file inside the container.

All the user-level driver-files required for execution are mounted when the container is started using a volume. This way you can deploy the same container on any machine with NVIDIA drivers installed.

We have more details on our wiki: https://github.com/NVIDIA/nvidia-docker/wiki/Internals

Concerning your last question: I don't have any information on this topic, but anyway it would not really impact nvidia-docker.

dharma1 · on July 16, 2016

Thanks for your superb work. Is it possible to use nvidia-docker on several AWS instances, to use multiple GPUs? (To spread training on multiple GPUs for more speed and ram. Tensorflow and Caffe support distributed training but not sure if it's viable on dockerized envs on AWS?)

flx42_ · on July 16, 2016

One container can use multiple GPUs on the same machine without problems.

For distributed training (which Caffe doesn't actually support, not the official version), you would have to run one container per instance, but this is more a configuration problem at the framework level, than a Docker or nvidia-docker problem.