No, just LTO. Right now only Ubuntu, Fedora, and SUSE Tumbleweed turn it on by default.
I’ve rebased a few of my containers on SUSE and noticed some improved load times on my web services as well. I don’t run anything demanding either; I was just bored. It’s like a half-second improvement lol.
Thanks!
As I understand it, it bind-mounts the /dev/nvidia devices and the CUDA toolkit binaries inside the container, giving it direct access as if it were running on the host. It’s not virtualized, just running under a different namespace, so the VRAM is still managed by the host driver. I’d expect the same restrictions you’d hit running CUDA applications normally on the host to apply inside containers too. Personally I’ve had up to 4 containers running GPU processes at the same time on 1 card.
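If you want to poke at it yourself, here’s a rough sketch using the Docker SDK for Python (just one way to do it; the image tag is only an example, swap in whatever matches your driver):

```python
# Roughly equivalent to:
#   docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
# Needs the NVIDIA Container Toolkit on the host and `pip install docker`.
import docker

client = docker.from_env()

out = client.containers.run(
    "nvidia/cuda:12.4.0-base-ubuntu22.04",  # example tag only
    "nvidia-smi",
    remove=True,
    # The DeviceRequest is what triggers the NVIDIA runtime hook that
    # bind-mounts the /dev/nvidia* devices and driver libs into the container.
    device_requests=[
        docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])
    ],
)
print(out.decode())
```

Kick that off from a few containers at once and nvidia-smi on the host will list all of them as separate processes on the same card.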
And yes, Nvidia hosts its own GPU-accelerated container images for PyTorch, TensorFlow, and a bunch of others on NGC. They also have images with the full CUDA SDK on their Docker Hub.
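Once you’re inside one of the NGC PyTorch images (nvcr.io/nvidia/pytorch:&lt;tag&gt;, pick whichever tag you want), a quick sanity check that the host driver made it through is just:

```python
# Quick check from inside an NGC PyTorch container.
import torch

print(torch.cuda.is_available())      # True if the passthrough worked
print(torch.cuda.get_device_name(0))  # whatever card the host exposed
print(torch.version.cuda)             # CUDA version the build targets
```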