Similarity of output values for random input on untrained ResNet

Hi Ivo!

  1. I can reproduce what you see.

  2. I don’t see anything wrong with this result (although I don’t
    have an expectation one way or the other as to whether it should
    work out this way).

  3. My intuition (which could be wrong) is that pumping a random
    input through a randomly-initialized ResNet causes the details
    of the input to get overpowered (and averaged away). ResNet has
    a lot of layers between its input and output, so it seems
    reasonable to me that the collective influence of all of those
    random layer weights swamps the influence of the specific random
    input. (The depth-probe sketch after the script’s output below
    lets you check this stage by stage.)

  4. The output, interpreted as raw-score logits (even though it is
    similar for different inputs), doesn’t really correspond to any
    substantive prediction that favors any of the 1000 classes. If
    you use softmax() to convert the output logits into probabilities,
    you will find that no class is predicted with any significant
    probability. (For reference, a completely uninformative, uniform
    prediction would assign a probability of 1 / 1000 = 0.001 to
    every class.)

Here is a script that illustrates some of this:

import torch
import torchvision

import numpy as np

print (torch.__version__)
print (torchvision.__version__)

_ = torch.manual_seed (2021)

resnetA = torchvision.models.resnet18 (pretrained = False)   # untrained, randomly initialized
_ = resnetA.eval()
resnetB = torchvision.models.resnet18 (pretrained = False)   # a second, independently-initialized copy
_ = resnetB.eval()

input1 = torch.rand ([1, 3, 224, 224])   # two different random "images"
input2 = torch.rand ([1, 3, 224, 224])

outputA1 = resnetA (input1)
outputA2 = resnetA (input2)
outputB1 = resnetB (input1)
outputB2 = resnetB (input2)

print ('correlation (A1, A2) =\n', np.corrcoef (outputA1[0].detach().numpy(), outputA2[0].detach().numpy()))
print ('correlation (B1, B2) =\n', np.corrcoef (outputB1[0].detach().numpy(), outputB2[0].detach().numpy()))
print ('correlation (A1, B1) =\n', np.corrcoef (outputA1[0].detach().numpy(), outputB1[0].detach().numpy()))
print ('correlation (A1, B2) =\n', np.corrcoef (outputA1[0].detach().numpy(), outputB2[0].detach().numpy()))

print ('max "prediction" probability (A1) =', torch.softmax (outputA1, dim = 1).max())
print ('min "prediction" probability (A1) =', torch.softmax (outputA1, dim = 1).min())
print ('max "prediction" probability (B2) =', torch.softmax (outputB2, dim = 1).max())
print ('min "prediction" probability (B2) =', torch.softmax (outputB2, dim = 1).min())

And here is its output:

1.9.0
0.10.0
<path_to_pytorch>\torch\nn\functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  ..\c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
correlation (A1, A2) =
 [[1.        0.9992763]
 [0.9992763 1.       ]]
correlation (B1, B2) =
 [[1.         0.99932129]
 [0.99932129 1.        ]]
correlation (A1, B1) =
 [[1.         0.00547452]
 [0.00547452 1.        ]]
correlation (A1, B2) =
 [[1.         0.00402595]
 [0.00402595 1.        ]]
max "prediction" probability (A1) = tensor(0.0036, grad_fn=<MaxBackward1>)
min "prediction" probability (A1) = tensor(0.0002, grad_fn=<MinBackward1>)
max "prediction" probability (B2) = tensor(0.0069, grad_fn=<MaxBackward1>)
min "prediction" probability (B2) = tensor(8.3634e-05, grad_fn=<MinBackward1>)

Best.

K. Frank
