Similarity of output values for random input on untrained ResNet

Hi Ivo!

  1. I can reproduce what you see.

  2. I don’t see anything wrong with this result (although I don’t
    have an expectation one way or the other as to whether it should
    work out this way).

  3. My intuition (which could be wrong) is that pumping a random
    input through a randomly-initialized ResNet causes the details
    of the input to get overpowered (and averaged away). ResNet has
    a lot of layers between its input and output, so it seems
    reasonable to me that the collective influence of all of those
    random layer weights swamps the influence of the specific random
    input. (The depth-probe sketch after the script’s output below
    lets you check this stage by stage.)

  4. The output, interpreted as raw-score logits (even though it is
    similar for different inputs), doesn’t really correspond to any
    substantive prediction that favors any of the 1000 classes. If
    you use softmax() to convert the output logits into probabilities,
    you will find that no class is predicted with any significant
    probability. (For reference, a completely uninformative, uniform
    prediction would assign a probability of 1 / 1000 = 0.001 to
    every class.)

Here is a script that illustrates some of this:

import torch
import torchvision

import numpy as np

print (torch.__version__)
print (torchvision.__version__)

_ = torch.manual_seed (2021)

resnetA = torchvision.models.resnet18 (pretrained = False)   # untrained, randomly initialized
_ = resnetA.eval()
resnetB = torchvision.models.resnet18 (pretrained = False)   # a second, independently-initialized copy
_ = resnetB.eval()

input1 = torch.rand ([1, 3, 224, 224])   # two different random "images"
input2 = torch.rand ([1, 3, 224, 224])

outputA1 = resnetA (input1)
outputA2 = resnetA (input2)
outputB1 = resnetB (input1)
outputB2 = resnetB (input2)

print ('correlation (A1, A2) =\n', np.corrcoef (outputA1[0].detach().numpy(), outputA2[0].detach().numpy()))
print ('correlation (B1, B2) =\n', np.corrcoef (outputB1[0].detach().numpy(), outputB2[0].detach().numpy()))
print ('correlation (A1, B1) =\n', np.corrcoef (outputA1[0].detach().numpy(), outputB1[0].detach().numpy()))
print ('correlation (A1, B2) =\n', np.corrcoef (outputA1[0].detach().numpy(), outputB2[0].detach().numpy()))

print ('max "prediction" probability (A1) =', torch.softmax (outputA1, dim = 1).max())
print ('min "prediction" probability (A1) =', torch.softmax (outputA1, dim = 1).min())
print ('max "prediction" probability (B2) =', torch.softmax (outputB2, dim = 1).max())
print ('min "prediction" probability (B2) =', torch.softmax (outputB2, dim = 1).min())

And here is its output:

1.9.0
0.10.0
<path_to_pytorch>\torch\nn\functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  ..\c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
correlation (A1, A2) =
 [[1.        0.9992763]
 [0.9992763 1.       ]]
correlation (B1, B2) =
 [[1.         0.99932129]
 [0.99932129 1.        ]]
correlation (A1, B1) =
 [[1.         0.00547452]
 [0.00547452 1.        ]]
correlation (A1, B2) =
 [[1.         0.00402595]
 [0.00402595 1.        ]]
max "prediction" probability (A1) = tensor(0.0036, grad_fn=<MaxBackward1>)
min "prediction" probability (A1) = tensor(0.0002, grad_fn=<MinBackward1>)
max "prediction" probability (B2) = tensor(0.0069, grad_fn=<MaxBackward1>)
min "prediction" probability (B2) = tensor(8.3634e-05, grad_fn=<MinBackward1>)

Best.

K. Frank
