I am working on a project where I am using a VGG to process images. I am passing two inputs at once in a batch to the VGG, and I am observing that the output for each input in the batch differs from the output when that same input is processed individually.
After checking the output of each layer, I found that the outputs start to differ at a Conv2d layer, features[7].
I have checked the implementation of the convolutional layer, and I am confident that its forward pass is correct.
I suspect I am overlooking something else, and I would appreciate any insights or suggestions on how to resolve this.
Here is my test code:
import torch
import torchvision
model = torchvision.models.vgg16(weights=torchvision.models.VGG16_Weights.DEFAULT).eval().cuda()
x = torch.rand(2, 3, 224, 224).cuda()
x1 = x[None, 1]  # equivalent to x[1:2]: the second sample, kept as a batch of 1
# for each prefix of features: second sample's output, batched vs. alone
diffs = [(model.features[:i + 1](x)[1] - model.features[:i + 1](x1)).abs().sum().item() for i in range(30)]
print(diffs)
It shows:
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.0730438232421875, 1.6718090772628784, 0.8785703182220459, 41.91109085083008, 16.41341781616211, 152.1153106689453, 42.8715705871582, 290.87322998046875, 30.523223876953125, 15.055337905883789, 134.59072875976562, 29.589153289794922, 140.07200622558594, 28.335479736328125, 140.5168914794922, 8.338151931762695, 3.884805917739868, 20.979028701782227, 4.820128440856934, 20.844335556030273, 3.523458480834961, 17.630252838134766, 0.9822019934654236]
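The nonzero entries start exactly at index 7, the Conv2d in question. The layer can also be checked in isolation with a standalone Conv2d (a minimal sketch of mine: the 128-to-128, 3x3 shape matches VGG16's features[7], and random weights suffice because only the batched-vs-single comparison matters):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Standalone conv with the same shape as VGG16's features[7]
# (Conv2d(128, 128, kernel_size=3, padding=1)); random weights are
# fine, since we only compare batched vs. single-sample execution.
conv = nn.Conv2d(128, 128, kernel_size=3, padding=1).eval()
x = torch.rand(2, 128, 56, 56)

with torch.no_grad():
    batched = conv(x)[1]      # second sample, computed in a batch of 2
    single = conv(x[1:2])[0]  # the same sample, computed alone

max_abs_diff = (batched - single).abs().max().item()
print(max_abs_diff)  # on CPU this is 0.0 or within float32 rounding
```

On CPU this comparison comes out (essentially) exact, which suggests the layer logic itself is fine.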
I found that this only happens on CUDA; if the model is on the CPU, there is no difference at all. Why?
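One more data point: the absolute sums above are taken over millions of elements, so a per-element relative check is more telling. The sketch below (using a small random conv stack in place of the pretrained VGG to avoid the weight download, and falling back to CPU when CUDA is unavailable) shows whether the discrepancy sits at float32 rounding level, which would point to cuDNN selecting different convolution kernels for a batch of 2 than for a batch of 1, rather than a bug:

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
torch.manual_seed(0)

# Small random conv stack standing in for model.features (an assumption,
# to keep the example self-contained); the effect depends on input shapes
# and kernel selection, not on the trained weights.
net = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
).eval().to(device)

x = torch.rand(2, 3, 64, 64, device=device)

with torch.no_grad():
    batched = net(x)[1]      # second sample from the batch of 2
    single = net(x[1:2])[0]  # same sample run alone

max_abs = (batched - single).abs().max().item()
scale = batched.abs().max().item()
# relative error at float32 rounding level on CUDA (often exactly 0 on CPU)
print(max_abs, max_abs / scale)
```

Note that setting torch.backends.cudnn.deterministic = True makes repeated runs reproducible, but it does not force the batch-of-2 and batch-of-1 cases to use the same kernel, so small differences between them can remain.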