I performed a simple **classification** inference on a sample image using ResNet-50, with both the `IMAGENET1K_V1` and `IMAGENET1K_V2` versions of the model weights. I found that there was around a **70%** jump in the final confidence score (even after applying the appropriate transforms) in the **V1** version as opposed to the **V2**.

This is odd since the **V2** version was expected to give a better confidence score!

I have attached a Colab Notebook as a reference.

The confidence scores of the top class on the same image across the `IMAGENET1K_V1` and `IMAGENET1K_V2` weights (even with the appropriate transforms) are **99.771%** and **58.404%** respectively. Something seems quite off!

- The custom transforms for the `IMAGENET1K_V1` weights were:

```
import torchvision.transforms as T  # assumed import

T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])
```

- The custom transforms for the `IMAGENET1K_V2` weights were:

```
T.Compose([
    T.Resize(232),  # only the resize size differs from V1
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])
```

Even the built-in `ResNet50_Weights.IMAGENET1K_V1.transforms()` and `ResNet50_Weights.DEFAULT.transforms()` presets were used, with literally no difference in the results!

The **Torchvision** documentation for **ResNet50** with **IMAGENET1K_V2** weights states:

> The inference transforms are available at `ResNet50_Weights.IMAGENET1K_V2.transforms` and perform the following preprocessing operations: Accepts `PIL.Image`, batched `(B, C, H, W)` and single `(C, H, W)` image `torch.Tensor` objects. The images are resized to `resize_size=[232]` using `interpolation=InterpolationMode.BILINEAR`, followed by a central crop of `crop_size=[224]`. Finally the values are first rescaled to `[0.0, 1.0]` and then normalized using `mean=[0.485, 0.456, 0.406]` and `std=[0.229, 0.224, 0.225]`.

This is precisely what I have performed above.

Could someone possibly point out what the issue is here?

**PS:** I have also raised an issue for the same.