RuntimeError: cublas runtime error : resource allocation failed

I tried to run this simple text classification model, which uses randomly initialized word embeddings.

import torch.nn as nn
import torch.nn.functional as F

class DSC_RE(nn.Module):
    def __init__(self, input_dim, embedding_dim, output_dim):
        super().__init__()

        self.embedding = nn.Embedding(input_dim, embedding_dim)

        self.fc = nn.Linear(embedding_dim, output_dim)

    def forward(self, text):

        #text = [batch size, sent len]

        text = text.permute(1, 0)

        #text = [sent len, batch size]

        embed = self.embedding(text)

        #embed = [sent len, batch size, embedding dim]

        embed = embed.permute(1,2,0)

        #embed = [batch size, embedding dim, sent len]

        output = F.max_pool1d(embed, embed.size(2))

        #output = [batch size, embedding dim,1]

        output = output.squeeze(2)

        #output = [batch size, embedding dim]

        return self.fc(output)

However, I got this error when I ran the code:

RuntimeError                              Traceback (most recent call last)
<ipython-input-11-d82622e3ea00> in <module>()
      1 n_epochs=20
----> 2 run_model(model,train_dataloader, test_dataloader,optimizer,criterion,n_epochs)

6 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/ in linear(input, weight, bias)
   1368     if input.dim() == 2 and bias is not None:
   1369         # fused op is marginally faster
-> 1370         ret = torch.addmm(bias, input, weight.t())
   1371     else:
   1372         output = input.matmul(weight.t())

RuntimeError: cublas runtime error : resource allocation failed at /pytorch/aten/src/THC/THCGeneral.cpp:216

Any help is really appreciated…

I ran this code in Google Colab on a Tesla K80 GPU with PyTorch 1.x.


Could you post the arguments of your model, i.e. input_dim, embedding_dim, and output_dim?
Also, could you post the exact PyTorch and CUDA versions (print(torch.__version__), print(torch.version.cuda))?


Torch version - 1.3.1
Cuda version - 10.1.243

Input_dim = 1747 (Vocabulary size)
embedding_dim = 100
output_dim = 1 (binary classification)

Facing the same issue now. Were you able to find the solution? Any ideas?

I cannot reproduce @Kalyan_Katikapalli’s error using this code with the nightly binaries and CUDA 10.1:

import torch

model = DSC_RE(1747, 100, 1).cuda()
x = torch.randint(0, 1747, (100, 1)).cuda()
out = model(x)

Could you post an executable code snippet to reproduce this error?

@ptrblck Thanks for trying to help. I solved my problem. The issue was that my input tensor contained an index larger than Input_dim, which is out of range for the nn.Embedding layer.
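
For anyone else who lands here with the same error: nn.Embedding only accepts indices from 0 to num_embeddings - 1. On the GPU, an out-of-range index can surface later as an unrelated-looking failure such as the cublas error above, because CUDA kernels run asynchronously. Running the same batch on the CPU, or setting CUDA_LAUNCH_BLOCKING=1, usually gives a clearer message. Below is a minimal sketch of a sanity check you could run on each batch before the forward pass. The helper name check_embedding_indices and the batch shapes are my own assumptions for illustration, not part of the original code.

```python
import torch
import torch.nn as nn


def check_embedding_indices(indices: torch.Tensor, num_embeddings: int) -> None:
    """Hypothetical helper: raise ValueError if any index is outside
    the valid range [0, num_embeddings - 1] of an nn.Embedding layer."""
    lo, hi = int(indices.min()), int(indices.max())
    if lo < 0 or hi >= num_embeddings:
        raise ValueError(
            f"index range [{lo}, {hi}] is outside the valid range "
            f"[0, {num_embeddings - 1}] for nn.Embedding({num_embeddings}, ...)"
        )


vocab_size = 1747  # Input_dim from this thread
embedding = nn.Embedding(vocab_size, 100)

# A valid batch (shape chosen arbitrarily): every index is < vocab_size.
good = torch.randint(0, vocab_size, (4, 7))
check_embedding_indices(good, vocab_size)  # passes silently
embedded = embedding(good)                 # shape [4, 7, 100]

# An invalid batch: one index equals vocab_size, one past the last valid id.
bad = good.clone()
bad[0, 0] = vocab_size
try:
    check_embedding_indices(bad, vocab_size)
except ValueError as err:
    message = str(err)
    print(message)
```

If the check fails, the usual culprits are a vocabulary built on a different split than the one being fed to the model, or special tokens (padding, unknown) that were not counted in Input_dim.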