Training network with multiple labels per image

I am trying to train a network that has more than one label per image. For example, an image may contain 2 pedestrians, or 1 pedestrian and 1 cyclist, etc. The dataset comes with JPGs and XML files for the annotations. I have created the following dataset class for this task:

class dataLoader(Dataset):
    def __init__(self, path, root, img_trfm=None, resize=None):
        self.filename = get_file_name(path)
        self.root = root
        self.resize = resize
        self.img_resize = transforms.Resize((resize, resize))
        self.img_trfm = img_trfm
        self.data_len = len(self.filename)

    def __getitem__(self, index):
        filename = self.filename[index]
        objects, num_objs = get_targets(filename + '.xml')
        img_name = objects[-1]
        img = Image.open(self.root + img_name + '.jpg')
        img_trfm = self.img_trfm(img)
        # maps each class name to its corresponding integer id
        for i in range(num_objs - 1):
            objects[i]['bndbox'] = torch.Tensor(objects[i]['bndbox'])
            objects[i]['id'] = get_class_id(objects[i]['label'])
        return img_trfm, objects[:-1]
    def __len__(self):
        return self.data_len

And this is the output from the dataloader with a batch size of 1:

dataiter = iter(trainloader)
img, objects = next(dataiter)
print(objects)
[{'label': ['person'], 'bndbox': tensor([[444., 220.,  27.,  65.]]), 'id': tensor([1])}, 
{'label': ['person'], 'bndbox': tensor([[468., 220.,  26.,  66.]]), 'id': tensor([1])},
{'label': ['person?'], 'bndbox': tensor([[415., 224.,  20.,  33.]]), 'id': tensor([4])}]

Please let me know if there are any issues with this approach.
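One thing I am unsure about is batching: since each image has a different number of objects, I believe the default collate function will fail for batch sizes above 1. This is a minimal sketch of a custom collate_fn I am considering (assuming the `(image, objects)` return format of `__getitem__` above; `detection_collate` is just a name I made up):

```python
import torch

def detection_collate(batch):
    # batch is a list of (image_tensor, objects) pairs; stack the
    # images into one tensor but keep the per-image object lists
    # as plain lists, since each image can contain a different
    # number of annotations
    images = torch.stack([item[0] for item in batch], dim=0)
    objects = [item[1] for item in batch]
    return images, objects

# usage sketch:
# trainloader = DataLoader(dataset, batch_size=4,
#                          collate_fn=detection_collate)
```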

I have been using a basic training function to test if this approach works. However, after a number of tries, I keep getting this error message:

RuntimeError: bool value of Tensor with more than one value is ambiguous

This is my training loop:

for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for data in trainloader:
        images, objects = data
        for i in objects:
            label = i['id']
            optimizer.zero_grad()  # optimizer defined elsewhere
            outputs = net(images)
            loss = criterion(outputs, label)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()

Any advice would be greatly appreciated.

Thank you in advance.

I just realised that I forgot to add the brackets at the end of nn.CrossEntropyLoss for my criterion. Although the error:

RuntimeError: bool value of Tensor with more than one value is ambiguous

seems to be gone, I am now getting this error:

ValueError: Expected input batch_size (785) to match target batch_size (1).
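For reference, the difference the brackets make is assigning the class itself versus an instance of it (the shapes below are only illustrative):

```python
import torch
import torch.nn as nn

# Wrong: this assigns the class itself, so criterion(outputs, label)
# tries to *construct* a loss module with tensors as arguments and
# fails with "bool value of Tensor ... is ambiguous"
# criterion = nn.CrossEntropyLoss

# Right: instantiate the module, then call it with (input, target)
criterion = nn.CrossEntropyLoss()

outputs = torch.randn(1, 5)   # logits for a batch of 1 image, 5 classes
label = torch.tensor([1])     # target class index, batch size 1
loss = criterion(outputs, label)
```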

I managed to fix this error by adjusting my network, and I started to get loss values:

tensor(1.45, grad_fn=<NllLossBackward>)
tensor(1.42, grad_fn=<NllLossBackward>)
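In case it helps anyone hitting the same ValueError: I believe the mismatch came from how my network flattened its features. A reshape that folds everything into the first dimension turns one image into many "samples". A sketch of the difference (the 1×5×157 shape is just an assumption I picked because 5 × 157 = 785, matching the error message):

```python
import torch

# hypothetical feature map for a batch of 1 image
x = torch.randn(1, 5, 157)

# Wrong: folding all 785 elements into the first dimension makes
# the loss see a batch of 785 rows instead of 1 image
wrong = x.view(-1, 1)            # shape (785, 1)

# Right: keep the batch dimension and flatten only the rest
right = x.view(x.size(0), -1)    # shape (1, 785)
```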

But after getting some loss values, I got the following error:

ParseError: no element found: line 1, column 0
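Following up on my own question: this ParseError is what xml.etree.ElementTree raises when it is handed an empty (zero-byte) or otherwise invalid file, so I suspect one of the .xml annotation files in my dataset is empty. A minimal guard I could drop into get_targets (`safe_parse` is a hypothetical helper name):

```python
import xml.etree.ElementTree as ET

def safe_parse(xml_path):
    # "no element found: line 1, column 0" is what ElementTree raises
    # for an empty file; catch it so one bad annotation file does not
    # kill the whole training run
    try:
        return ET.parse(xml_path).getroot()
    except ET.ParseError:
        print(f"Skipping empty or malformed annotation: {xml_path}")
        return None
```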