Torchvision FasterRCNN Bounding box error

I am using a Faster R-CNN model. The model is set up like this:

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False,num_classes=4)
model.to(device)

The problem is that the bounding boxes I pass in are altered after every single forward pass. I have attached the code snippet below:

print("In train before:")
print(y_train[0]["boxes"])
output = model(x_train,y_train)
print("In train after")
print(y_train[0]["boxes"])

I get different results in line 2 and line 5. Why is that? I am also adding the helper function I use to convert from numpy to torch:

def from_numpy_to_tensor(images,labels_list):

    images = torch.from_numpy(images).cuda()
    for label in labels_list:
        label["boxes"] = torch.from_numpy(label["boxes"]).cuda()
        label["labels"] = torch.from_numpy(label["labels"]).cuda()

    return images,labels_list

The targets will be resized in these lines of code (inside the model's internal transform) and reassigned to the list, so this is expected.
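Here is a minimal plain-Python sketch of the effect (no torchvision involved; resize_boxes_in_targets is just a stand-in for the model's internal transform): since the model receives references to the very same dict objects you hold in your target list, reassigning a key inside them is visible in your code as well.

# Stand-in for the internal transform: it receives references to the caller's dicts,
# so reassigning a key inside them changes the caller's targets too.
def resize_boxes_in_targets(targets, scale=0.5):
    for t in targets:
        t["boxes"] = [v * scale for v in t["boxes"]]
    return targets

y = [{"boxes": [100.0, 200.0, 300.0, 400.0]}]
resize_boxes_in_targets(y)
print(y[0]["boxes"])  # [50.0, 100.0, 150.0, 200.0] -- the original targets were altered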

Thanks @ptrblck. But won't that be a problem? I have noticed that with every iteration the target values get reduced further and further.

Please forgive me if my question is a bit naive. I am new to this.

That’s a good point.
Are you seeing the same issue using a DataLoader (with multiple workers)?
I’m just asking, as this effect might disappear if you are not working on the target list directly.
If it doesn't disappear, then you might have found a bug and we'll look into it.


I have not used a DataLoader. I will give it a try and update here. Thanks again.

@ptrblck As per your suggestion, I tried with a DataLoader, but the result was the same.
The code snippet below creates a custom Dataset which takes in images and targets that are already CUDA float tensors.

class CustomDataset(torch.utils.data.Dataset):

    def __init__(self,xtr,ytr):

        self.xtr = xtr
        self.ytr = ytr

    def __getitem__(self,idx):

        img = self.xtr[idx]
        tar = self.ytr[idx]

        return img,tar

    def __len__(self):

        return len(self.xtr)
    

def collate_fn(batch):
    return list(zip(*batch))
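If I read the collate_fn correctly, it only transposes the batch, so (at least with the default num_workers=0) the target dicts coming out of the loader are still the very same objects stored in the dataset. A small sketch with dummy data:

# With batch_size=1, the collate turns [(img, tar)] into [(img,), (tar,)],
# i.e. the loader hands back references to the dataset's own target dicts.
dummy_img = [0.0]
dummy_tar = {"boxes": [1.0, 2.0, 3.0, 4.0]}
images, targets = collate_fn([(dummy_img, dummy_tar)])
print(targets[0] is dummy_tar)  # True -- the target dict is not copied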

The dataset is pretty small; I was trying with a single image. I faced some problems with num_workers as it was throwing a CUDA runtime error, so I decided to skip it. The rest of the code is given below.

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False,num_classes=4)
model.to(device)
optimizer = optim.Adam(model.parameters(),lr=0.000001)

## trying with one image only
x_train = images[0:1]   
y_train = labels[0:1]

## Converts numpy to TorchCudaFloat 
x_train,y_train = from_numpy_to_tensor(x_train,y_train)

## DataLoader object 
dataset = CustomDataset(x_train,y_train)
dataloader = DataLoader(dataset,batch_size=1,collate_fn=collate_fn)


## Iterations
for i in range(20):
    print("Iter No:",i)
    for xtr,ytr in dataloader:
        #print("Iter No:",i)
        #optimizer.zero_grad()
        ytr = list(ytr)
        #print("In train before:")
        print(ytr)
        print(xtr)
        output = model(xtr,ytr)

At the start of the first iteration my targets look like this:

[{'boxes': tensor([[ 311.9933, 1013.7640,  719.6339, 1142.7417],
        [ 308.1646,  928.4176,  739.4443,  961.6580],
        [ 308.7562,  830.7968,  740.0359,  864.0373],
        [ 305.8657,  680.8315,  763.0424,  708.1243],
        [ 300.3259,  439.0691,  790.4506,  523.2395],
        [ 306.6932,  248.2031,  741.7596,  458.5648]], device='cuda:0'), 'labels': tensor([4, 3, 3, 3, 2, 1], device='cuda:0')}]

And by the end of the 20th iteration the targets have changed to:

[{'boxes': tensor([[117.7318, 383.4572, 271.5563, 432.2433],
        [116.2870, 351.1749, 279.0319, 363.7481],
        [116.5102, 314.2498, 279.2552, 326.8230],
        [115.4195, 257.5252, 287.9367, 267.8488],
        [113.3290, 166.0784, 298.2794, 197.9159],
        [115.7318,  93.8831, 279.9056, 173.4527]], device='cuda:0'), 'labels': tensor([4, 3, 3, 3, 2, 1], device='cuda:0')}]

As you can see, the target values are altered by a huge margin in just 20 iterations, with nothing more than a forward pass.
@ptrblck Is this a bug? Should I open an issue in that case?


After some digging I have found a workaround that does not change the targets. The code snippet is given below:

## Iterations
for i in range(20):
    print("Iter No:",i)
    for xtr,ytr in dataloader:
        #print("Iter No:",i)
        #optimizer.zero_grad()
        y_tr = [{k:v for k,v in t.items()} for t in ytr]
        #print("In train before:")
        output = model(xtr,y_tr)
        #print("In train after")
        print(ytr)

Instead of directly passing the targets ytr, I am creating another variable y_tr that holds copies of the same dicts and passing that to the model. This leaves the original targets unchanged.
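If I understand it correctly, this works because the model reassigns keys such as "boxes" inside the target dicts it receives, so it only ever touches the fresh copies. Note that the comprehension is a shallow copy; a self-contained sketch of what it does:

# The comprehension creates new dict objects, but the values are still shared (shallow copy).
originals = [{"boxes": [1.0, 2.0, 3.0, 4.0]}]
copies = [{k: v for k, v in t.items()} for t in originals]
print(copies[0] is originals[0])                      # False -- new dict objects
print(copies[0]["boxes"] is originals[0]["boxes"])    # True  -- same underlying values

So the workaround seems to rely on the model reassigning target["boxes"] to a new tensor rather than modifying the old one in place, which appears to be the case here.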

Thanks for the debugging and this looks indeed like unwanted behavior!
I haven’t reproduced it yet, but would you mind creating an issue here so that we could track and fix it?

Ok sure. Will do that