Why did to() fail?

(WenBin Yan) #1

I defined a model and used model.to(DEVICE) to move it to the GPU, but I always get the error below:

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

This is my code:

import torch
import torch.nn as nn

class BranchNet(nn.Module):
    def __init__(self, dropout=DROPOUT, num_classes=NUM_CLASSES):
        super(BranchNet, self).__init__()

DEVICE = torch.device("cuda:2")

model = BranchNet(dropout=dropout, num_classes=NUM_CLASSES)
model = model.to(DEVICE)
image = image.to(DEVICE)
label = label.to(DEVICE)

(Alban D) #2

Hi,

The error is most likely not with the .to() op. It’s just that CUDA is asynchronous, so errors can point to the wrong line. Run with CUDA_LAUNCH_BLOCKING=1 to make sure the error points to the right line.
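
For reference, a minimal sketch of one way to set the variable from inside a script. It assumes the variable needs to be in place before CUDA is initialized, which is why it is set before torch is imported; treat that ordering as a precaution rather than a documented requirement.

```python
import os

# Assumption: the variable should be in place before CUDA is initialized,
# so it is set here before torch is imported.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # noqa: E402  (imported after the variable is set on purpose)

# Read the value back to confirm what the process will see.
launch_blocking = os.environ.get("CUDA_LAUNCH_BLOCKING")
print("CUDA_LAUNCH_BLOCKING =", launch_blocking)
```

The same effect can usually be had from the shell by prefixing the command, e.g. CUDA_LAUNCH_BLOCKING=1 python train.py, where train.py is a hypothetical script name.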

(balamurali) #3

As far as I can see, this has nothing to do with the to() method. In general, PyTorch expects the model and the input data to have the same data type, i.e. float in our case. And if you are using a GPU, both the model weights and the input must be moved to the GPU; otherwise the error you mention will occur. As a first step, check the architecture class.
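
A minimal sketch of that check, using a single nn.Conv2d as a stand-in for the real model and a random tensor as a stand-in for the image batch (both are assumptions, not the code from the question). It falls back to the CPU so it also runs without a GPU.

```python
import torch
import torch.nn as nn

# Fall back to the CPU so the sketch also runs on a machine without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-ins for the real model and input batch (hypothetical shapes).
model = nn.Conv2d(3, 8, kernel_size=3).to(device)
image = torch.randn(1, 3, 16, 16).to(device)

# Compare the first weight tensor against the input.
weight = next(model.parameters())
same_device = weight.device == image.device
same_dtype = weight.dtype == image.dtype
print("same device:", same_device, "| same dtype:", same_dtype)

# With matching device and dtype the forward pass goes through.
output = model(image)
print(tuple(output.shape))
```

Note that next(model.parameters()) only sees parameters that the module has registered, so this check can pass while an unregistered layer is still on the CPU.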

(WenBin Yan) #4

Thank you!

Is CUDA_LAUNCH_BLOCKING an environment variable?

(WenBin Yan) #5

I have already moved the sample and the model to the same device.

(Simon Wang) #6

You likely did something wrong in your module code, e.g., not properly registering parts as submodules.
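
A sketch of what that can look like. BrokenNet and FixedNet are hypothetical classes, not the BranchNet from the question: the first keeps its layers in a plain Python list, the second in nn.ModuleList. The example converts the dtype instead of the device so that it runs without a GPU; to() walks the same registered parameters in both cases.

```python
import torch
import torch.nn as nn


class BrokenNet(nn.Module):
    """Hypothetical module: layers in a plain list are not registered."""

    def __init__(self):
        super().__init__()
        self.branches = [nn.Linear(4, 2), nn.Linear(4, 2)]  # plain Python list

    def forward(self, x):
        return sum(branch(x) for branch in self.branches)


class FixedNet(nn.Module):
    """Same layers, but held in nn.ModuleList so they are registered."""

    def __init__(self):
        super().__init__()
        self.branches = nn.ModuleList([nn.Linear(4, 2), nn.Linear(4, 2)])

    def forward(self, x):
        return sum(branch(x) for branch in self.branches)


broken = BrokenNet()
fixed = FixedNet()

# Only registered submodules contribute to parameters().
broken_param_count = len(list(broken.parameters()))
fixed_param_count = len(list(fixed.parameters()))

# to() only touches registered parameters and buffers.
broken.to(torch.float64)
fixed.to(torch.float64)
broken_dtype = broken.branches[0].weight.dtype  # left unchanged
fixed_dtype = fixed.branches[0].weight.dtype    # converted

print(broken_param_count, fixed_param_count)
print(broken_dtype, fixed_dtype)
```

With model.to(DEVICE) the effect is the same: the layers in the plain list would stay on the CPU while the input sits on the GPU, which matches the error in the first post.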

@albanD CUDA is async, but out-of-kernel checks are not, since they are done without looking at the data contained. So CUDA_LAUNCH_BLOCKING won’t change things.

(Alban D) #7

Oh right, I read it too fast :confused:
Which line exactly causes the issue? It should help you find out which module is to blame, as Simon said.
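
If the traceback alone does not narrow it down, a rough diagnostic along these lines may help. find_unregistered_modules is a hypothetical helper, not a PyTorch API; it only looks for nn.Module instances kept in plain lists, tuples, or dicts, so it will miss other ways of losing a layer (for example a tensor created inside forward).

```python
import torch.nn as nn


def find_unregistered_modules(model):
    """Return attribute paths that hold nn.Module objects in plain containers."""
    found = []
    for mod_name, module in model.named_modules():
        for attr_name, value in vars(module).items():
            # Skip PyTorch's internal bookkeeping (_modules, _parameters, ...).
            if attr_name.startswith("_"):
                continue
            if isinstance(value, (list, tuple)):
                items = value
            elif isinstance(value, dict):
                items = value.values()
            else:
                continue
            if any(isinstance(item, nn.Module) for item in items):
                found.append(f"{mod_name or '<root>'}.{attr_name}")
    return found


class Leaky(nn.Module):
    """Hypothetical module with one registered and one unregistered part."""

    def __init__(self):
        super().__init__()
        self.stem = nn.Linear(2, 2)     # registered
        self.heads = [nn.Linear(2, 2)]  # plain list, not registered


leaky_report = find_unregistered_modules(Leaky())
clean_report = find_unregistered_modules(nn.Sequential(nn.Linear(2, 2)))
print(leaky_report)
print(clean_report)
```

Any attribute reported here is a candidate for nn.ModuleList or nn.ModuleDict.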