Tensorflow – Low NVIDIA GPU Usage with Keras and Tensorflow

gpukerastensorflow

I'm running a CNN with keras-gpu and tensorflow-gpu with a NVIDIA GeForce RTX 2080 Ti on Windows 10. My computer has a Intel Xeon e5-2683 v4 CPU (2.1 GHz). I'm running my code through Jupyter (most recent Anaconda distribution). The output in the command terminal shows that the GPU is being utilized, however the script I'm running takes longer than I expect to train/test on the data and when I open the task manager it looks like the GPU utilization is very low. Here's an image:

Note that the CPU isn't being utilized and nothing else on the task manager suggests anything is being fully utilized. I don't have an ethernet connection and am connected to Wifi (don't think this effects anything but I'm not sure with Jupyter since it runs through the web broswers). I'm training on a lot of data (~128GB) which is all loaded into the RAM (512GB). The model I'm running is a fully convolutional neural network (basically a U-Net architecture) with 566,290 trainable parameters. Things I tried so far:
1. Increasing batch size from 20 to 10,000 (increases GPU usage from ~3-4% to ~6-7%, greatly decreases training time as expected).
2. Setting use_multiprocessing to True and increasing number of workers in model.fit (no effect).

I followed the installation steps on this website: https://www.pugetsystems.com/labs/hpc/The-Best-Way-to-Install-TensorFlow-with-GPU-Support-on-Windows-10-Without-Installing-CUDA-1187/#look-at-the-job-run-with-tensorboard

Note that this installation specifically DOESN'T install CuDNN or CUDA. I've had trouble in the past with getting tensorflow-gpu running with CUDA (although I haven't tried in over 2 years so maybe it's easier with the latest versions) which is why I used this installation method.

Is this most likely the reason why the GPU isn't being fully utilized (no CuDNN/CUDA)? Does it have something to do with the dedicated GPU memory usage being a bottleneck? Or maybe something to do with the network architecture I'm using (number of parameters, etc.)?

Please let me know if you need any more information about my system or the code/data I'm running on to help diagnose. Thanks in advance!

EDIT: I noticed something interesting in the task manager. An epoch with batch size of 10,000 takes around 200s. For the last ~5s of each epoch, the GPU usage increases to ~15-17% (up from ~6-7% for the first 195s of each epoch). Not sure if this helps or indicates there's a bottleneck somewhere besides the GPU.

Best Answer

I would first start by running one of the short "tests" to ensure Tensorflow is utilizing the GPU. For example, I prefer @Salvador Dali's answer in that linked question

import tensorflow as tf
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

with tf.Session() as sess:
    print (sess.run(c))

If Tensorflow is indeed using your GPU you should see the result of the matrix multplication printed. Otherwise a fairly long stack trace stating that "gpu:0" cannot be found.

If this all works well that I would recommend utilizing Nvidia's smi.exe utility. It is available on both Windows and Linux and AFAIK installs with the Nvidia driver. On a windows system it is located at

C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe

Open a windows command prompt and navigate to that directory. Then run

nvidia-smi.exe -l 3

This will show you a screen like so, that updates every three seconds.

Here we can see various information about the state of the GPUs and what they are doing. Of specific interest in this case is the "Pwr: Usage/Cap" and "Volatile GPU-Util" columns. If your model is indeed using the/a GPU these columns should increase "instantaneously" once you start training the model.

You most likely will see an increase in fan speed and temperature unless you have a very nice cooling solution. In the bottom of the printout you should also see a Process with a name akin to "python" or "Jupityr" running.

If this fails to provide an answers as to the slow training times than I would surmise the issue lies with the model and code itself. And I think its is actually the case here. Specifically viewing the Windows Task Managers listing for "Dedicated GPU Memory Usage" pinged at basically maximum.

Related Solutions

Tensorflow – Low GPU usage by Keras / Tensorflow

Could be due to several reasons but most likely you're having a bottleneck when reading the training data. As your GPU has processed a batch it requires more data. Depending on your implementation this can cause the GPU to wait for the CPU to load more data resulting in a lower GPU usage and also a longer training time.

Try loading all data into memory if it fits or use a QueueRunner which will make an input pipeline reading data in the background. This will reduce the time that your GPU is waiting for more data.

The Reading Data Guide on the TensorFlow website contains more information.

Python – run Keras model on gpu

Yes you can run keras models on GPU. Few things you will have to check first.

your system has GPU (Nvidia. As AMD doesn't work yet)
You have installed the GPU version of tensorflow
You have installed CUDA installation instructions
Verify that tensorflow is running with GPU check if GPU is working

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

for TF > v2.0

sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(log_device_placement=True))

(Thanks @nbro and @Ferro for pointing this out in the comments)

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

output will be something like this:

[
  name: "/cpu:0"device_type: "CPU",
  name: "/gpu:0"device_type: "GPU"
]

Once all this is done your model will run on GPU:

To Check if keras(>=2.1.1) is using GPU:

from keras import backend as K
K.tensorflow_backend._get_available_gpus()

All the best.

Best Answer

Related Solutions

Tensorflow – Low GPU usage by Keras / Tensorflow

Python – run Keras model on gpu

Related Topic