Unexpected behaviour in parameter initialization

jmaronas · May 21, 2019, 9:51am

Hi.

I have been surprised by the behavior of PyTorch. I initialize the bias of the output layer of a neural network using some data statistics. Consider, for instance:

import numpy
import torch
from torch import nn

mean = data.mean(0)  # numpy array
std = data.std(0)  # numpy array

# Inside the model there is a sequential for the output layer.
# The model receives mean and std as arguments.
f = ...  # here is the sequential
f[0].bias = nn.Parameter(torch.from_numpy(numpy.log(std**2)))  # initialize

I was surprised because, if I write it as above rather than using std.copy(), the underlying std array changes when the optimizer updates the bias. I thought that in these cases a new instance of the tensor is created (and the same for NumPy). As far as I know:

a = numpy.random.randn(1000, 1)
b = torch.from_numpy(a)
a[0] = 10  # this also changes b

c = b**2  # a new instance in memory is created

Any suggestions? Thanks.

MariosOreo · May 22, 2019, 5:38am

Hi,

Did you get a proper explanation for it?

InnovArul · May 22, 2019, 5:51am

NumPy arrays and PyTorch tensors interoperate by sharing memory.
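
A minimal sketch of what that sharing looks like, assuming the tensor is created with torch.from_numpy (the variable names here are only illustrative). For contrast, torch.tensor copies the data instead of sharing it:

```python
import numpy as np
import torch

a = np.zeros(3)
shared = torch.from_numpy(a)  # uses the same buffer as a
copied = torch.tensor(a)      # makes an independent copy of the data

a[0] = 10.0                   # write through the NumPy array
print(shared[0].item())       # 10.0, the tensor sees the change
print(copied[0].item())       # 0.0, the copy is unaffected

shared[1] = 5.0               # write through the tensor
print(a[1])                   # 5.0, the array sees the change too
```

So if an independent tensor is wanted, copying explicitly (torch.tensor(a), or torch.from_numpy(a).clone()) is one way to get it.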

jmaronas · May 22, 2019, 6:46am

Yes, but any operation that assigns its result to a new variable should allocate new memory. If you do this on a NumPy array:

a = numpy.random.randn(100, 1)
b = a**2

b and a use different memory.
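
A small sketch to check this, assuming numpy.shares_memory is an acceptable test. The std array below is a made-up stand-in for the real statistics, and the in-place add_ only imitates an optimizer step:

```python
import numpy as np
import torch

a = np.random.randn(100, 1)
b = a ** 2
print(np.shares_memory(a, b))  # False: b lives in its own buffer

# Hypothetical stand-in for the data statistics from the first post.
std = np.abs(np.random.randn(5)) + 1.0
std_before = std.copy()

# np.log(std ** 2) is a new array, so the tensor shares memory with
# that temporary result rather than with std itself.
bias = torch.from_numpy(np.log(std ** 2))
bias.add_(1.0)  # in-place update, similar in spirit to an optimizer step

print(np.array_equal(std, std_before))  # True: std is unchanged
```

By the same reasoning, a tensor built from numpy.log(std**2) would be expected to leave std untouched in this isolated case.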