Low GPU utilization with CUDA?

Is it normal for GPU utilization to sit at only about 20% with CUDA? I have also set torch.backends.cudnn.benchmark = True, with no performance increase.

My model is defined as follows:

import torch

class LSTM(torch.nn.Module):

    def __init__(self, N, D_in, H, H_LSTM, D_out):
        super(LSTM, self).__init__()
        self.H = H
        self.in2lstm = torch.nn.Linear(D_in, H)
        # dropout only applies between stacked LSTM layers, so it is a
        # no-op with a single layer (recent PyTorch versions warn about this)
        self.lstm = torch.nn.LSTM(H, H_LSTM, dropout=0.5)
        self.lstm2out = torch.nn.Linear(H_LSTM, D_out)
        self.elu = torch.nn.ELU()  # build the activation once, reuse in forward

    def forward(self, x, hidden, t, seq_len):
        # project the input window onto the LSTM's input size
        y_pred = self.in2lstm(x[t:t+seq_len].view(seq_len, -1))
        y_pred = self.elu(y_pred)
        # shape (seq_len, batch=1, H): the LSTM only ever sees a batch of one
        y_pred, hidden = self.lstm(y_pred.view(seq_len, 1, self.H), hidden)
        y_pred = self.lstm2out(y_pred.view(seq_len, -1))
        return y_pred, hidden

model = LSTM(N, D_in, H, H_LSTM, D_out).cuda()
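One likely culprit in the code above: the LSTM is called with a batch size of 1 (the .view(seq_len, 1, self.H)), so each cuDNN call does very little work and the GPU idles between kernel launches. A minimal sketch of what a batched forward pass could look like; all sizes below are invented for illustration, not taken from the original code:

import torch

# hypothetical sizes, just for illustration
seq_len, B, D_in, H, H_LSTM = 50, 64, 32, 128, 128

in2lstm = torch.nn.Linear(D_in, H).cuda()
lstm = torch.nn.LSTM(H, H_LSTM).cuda()

x = torch.randn(seq_len, B, D_in).cuda()   # B sequences at once instead of 1
h0 = torch.zeros(1, B, H_LSTM).cuda()      # (num_layers, batch, hidden)
c0 = torch.zeros(1, B, H_LSTM).cuda()

y = torch.nn.functional.elu(in2lstm(x))    # Linear applies over the last dim
out, (hn, cn) = lstm(y, (h0, c0))          # one large cuDNN call per step, not B tiny ones

Feeding B sequences per call gives each kernel enough parallel work to actually raise utilization.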

If your model is small, then yes. A small model doesn't launch enough parallel work to keep all of the GPU's cores busy, so utilization stays low.
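To make "small" concrete, one rough way to see this is to time a tiny matmul against a large one; the sizes here are arbitrary:

import time
import torch

def bench(n, iters=100):
    x = torch.randn(n, n).cuda()
    torch.cuda.synchronize()   # wait for the copy before timing
    t0 = time.time()
    for _ in range(iters):
        y = x @ x
    torch.cuda.synchronize()   # wait for all kernels to finish
    return (time.time() - t0) / iters

print(bench(64))     # tiny op: time dominated by launch overhead, GPU mostly idle
print(bench(4096))   # large op: enough work per kernel to keep the GPU busy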

Thanks for your reply!
Does "small model" refer to all kinds of models, including LSTMs, CNNs, and others?
I've run into cases where a 5-layer CNN reaches very high GPU utilization while a resnet18 cannot. Is putting the Variable on the GPU and using DataParallel() enough to ensure full utilization of the GPUs, or is there something else needed to get consistently high utilization?
Thanks again.
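For reference on the DataParallel() question: wrapping a module is one line, but it only helps if each per-GPU chunk of the batch is still large enough to saturate a device. A minimal sketch with a toy module; the sizes are placeholders:

import torch

model = torch.nn.Linear(512, 512)            # any nn.Module works here
model = torch.nn.DataParallel(model).cuda()  # replicates the module, splits dim 0 of the input

x = torch.randn(256, 512).cuda()             # batch of 256, split across the visible GPUs
y = model(x)                                 # each replica processes its own chunk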

@smth GPU utilization shows 0 even though I'm moving both the variables and the network to CUDA. I'm using a p2.xlarge instance on AWS. Can you help me with this?
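A quick sanity check for the zero-utilization case (a generic sketch, not specific to the p2.xlarge setup): confirm PyTorch can see the device at all, then keep it busy while watching nvidia-smi in a second terminal.

import torch

print(torch.cuda.is_available())    # False means PyTorch cannot see the GPU at all
x = torch.randn(4096, 4096).cuda()
print(x.device)                     # should print cuda:0
for _ in range(1000):
    y = x @ x                       # sustained work; utilization should jump in nvidia-smi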