The ‘aux’ layer is used only during training. At inference time, you get just the output of the final layer.
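A minimal sketch showing this behavior with torchvision’s Inception v3 (the 299×299 input size and the tuple output in train mode reflect torchvision’s behavior; details may vary by version):

    import torch
    from torchvision import models

    model = models.inception_v3(aux_logits=True)
    x = torch.randn(2, 3, 299, 299)

    model.train()
    out = model(x)                 # train mode: (logits, aux_logits)
    print(isinstance(out, tuple))  # True

    model.eval()
    with torch.no_grad():
        out = model(x)             # eval mode: only the final logits
    print(out.shape)               # torch.Size([2, 1000])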
Please correct me if I’m wrong: there’s no need to do the mandatory normalization (“The images have to be loaded in to a range of [0, 1] and then normalized using mean=[0.485, 0.456, 0.406] and std=[0.229, 0.224, 0.225]”), since it’s already included in the model, as long as transform_input is set to True: https://github.com/pytorch/vision/blob/c1746a252372dc62a73766a5772466690fc1b8a6/torchvision/models/inception.py#L72-L76
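For reference, the linked transform_input block does roughly the following (a paraphrase from memory, not a verbatim copy; check the linked commit for the exact code). It remaps each channel using the ImageNet mean/std constants, which is what the claim above relies on:

    import torch

    def transform_input(x):
        # Per-channel remap using the ImageNet mean/std constants,
        # paraphrased from the linked inception.py lines
        x_ch0 = torch.unsqueeze(x[:, 0], 1) * (0.229 / 0.5) + (0.485 - 0.5) / 0.5
        x_ch1 = torch.unsqueeze(x[:, 1], 1) * (0.224 / 0.5) + (0.456 - 0.5) / 0.5
        x_ch2 = torch.unsqueeze(x[:, 2], 1) * (0.225 / 0.5) + (0.406 - 0.5) / 0.5
        return torch.cat((x_ch0, x_ch1, x_ch2), 1)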
For those who are still stuck on this issue, the following (pieced together from other answers in this thread) handles both the tuple and the single-output case:
    if isinstance(outputs, tuple):
        # train mode with aux_logits=True: sum the loss over the
        # main and auxiliary outputs
        loss = sum(criterion(o, labels) for o in outputs)
    else:
        # eval mode (or aux_logits=False): a single output tensor
        loss = criterion(outputs, labels)
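For context, here is a minimal training-step sketch around that snippet (the model, optimizer, and `loader` are illustrative placeholders, not from the original posts):

    import torch
    import torch.nn as nn
    from torchvision import models

    model = models.inception_v3(aux_logits=True)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

    model.train()
    for images, labels in loader:  # `loader` is your DataLoader
        optimizer.zero_grad()
        outputs = model(images)
        if isinstance(outputs, tuple):
            loss = sum(criterion(o, labels) for o in outputs)
        else:
            loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()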
If this is true, then the master documentation needs to be changed. It states: “All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224. The images have to be loaded in to a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225]”.
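In transforms code, the documented preprocessing corresponds to something like this (the 299 crop size is Inception v3’s expected input; the docs quoted above only say “at least 224”):

    from torchvision import transforms

    preprocess = transforms.Compose([
        transforms.Resize(299),
        transforms.CenterCrop(299),
        transforms.ToTensor(),  # scales pixels into [0, 1]
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])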
Your answer saved my life.