Hello,
I am trying a new algorithm that I have implemented outside of PyTorch. Is this a legitimate way to copy PyTorch model parameters out and back in, or am I causing problems under the hood?
My code assumes that named_children() and parameters() always yield their modules and parameters in the same order from call to call.
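For what it's worth, that assumption holds: submodules and parameters are stored in ordered dicts, so named_children() and parameters() iterate in registration order, consistently across calls. A quick check (toy model for illustration):

```python
import torch.nn as nn

# Toy model; nn.Sequential names its children '0', '1', '2' in order.
net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))

# Iteration order is registration order and is stable across calls.
first = [name for name, _ in net.named_children()]
second = [name for name, _ in net.named_children()]
print(first == second, first)  # True ['0', '1', '2']
```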
Pack into a NumPy vector, doing something like:

# Collect the parameters of the selected child modules, in order,
# then flatten them into one 1-D NumPy vector.
Qparams = []
Qmods = [mod for name, mod in self.named_children() if name in self.Qlayers]
for mod in Qmods:
    Qparams += [parm for parm in mod.parameters()]
Ws = np.hstack([np.ravel(netp.data.numpy()) for netp in Qparams])
return Ws
Do stuff to Ws
Unpack from the NumPy vector, doing something like:

# Walk the same modules in the same order, copying slices of Ws
# back into each parameter tensor.
startIdx = 0
for qmod in [mod for name, mod in self.named_children() if name in self.Qlayers]:
    for netp in qmod.parameters():
        stopIdx = startIdx + np.prod(netp.data.size())
        netp.data.copy_(torch.from_numpy(np.reshape(Ws[startIdx:stopIdx], netp.data.size())))
        startIdx = stopIdx
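Put together, the round trip can be sketched as a minimal self-contained example (the QNet module, its layer names, and the pack/unpack method names below are made up for illustration, not your actual model):

```python
import numpy as np
import torch
import torch.nn as nn

class QNet(nn.Module):
    # Hypothetical module; self.Qlayers names the children whose
    # parameters get packed into / unpacked from the flat vector.
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 3)
        self.fc2 = nn.Linear(3, 2)
        self.Qlayers = {"fc1", "fc2"}

    def pack(self):
        # Flatten all selected parameters into one 1-D NumPy vector.
        mods = [m for name, m in self.named_children() if name in self.Qlayers]
        params = [p for m in mods for p in m.parameters()]
        return np.hstack([p.data.numpy().ravel() for p in params])

    def unpack(self, Ws):
        # Copy slices of Ws back into the parameters, in the same order.
        start = 0
        for name, m in self.named_children():
            if name not in self.Qlayers:
                continue
            for p in m.parameters():
                stop = start + p.data.numel()
                p.data.copy_(torch.from_numpy(
                    Ws[start:stop].reshape(tuple(p.data.size()))))
                start = stop

net = QNet()
Ws = net.pack()
Ws2 = Ws * 2.0          # "do stuff to Ws" outside PyTorch
net.unpack(Ws2)
print(np.allclose(net.pack(), Ws2))  # True: the round trip preserves values
```

Because unpack() revisits the children and parameters in the same deterministic order pack() used, the slices line up with the right tensors.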
I also do something similar to grab gradients from the network; those gradients are used to update Ws outside of PyTorch. I realize this may be inefficient. I just want to make sure I am not breaking PyTorch in some way by reading and overwriting network weights and gradients like this.
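Grabbing gradients the same way might be sketched like this (assuming backward() has already run; the toy model is illustrative, and the None guard covers parameters that received no gradient):

```python
import numpy as np
import torch
import torch.nn as nn

# Tiny stand-in model for illustration.
net = nn.Sequential(nn.Linear(4, 3), nn.Linear(3, 2))
out = net(torch.randn(5, 4)).sum()
out.backward()

# Flatten gradients in the same fixed parameter order as the weights,
# substituting zeros where a parameter has no gradient.
grads = np.hstack([
    (p.grad if p.grad is not None else torch.zeros_like(p.data))
    .numpy().ravel()
    for p in net.parameters()
])
print(grads.shape)  # one entry per weight: (4*3 + 3) + (3*2 + 2) = (23,)
```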
Thank you, very much, for your guidance.