I am trying a new algorithm that I have implemented outside of PyTorch. Is this a legitimate way to copy PyTorch model parameters out and back in, or am I causing problems under the hood?
My code assumes that the named_children() and parameters() methods always return modules/parameters in the same order.
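(To be safe, I could guard that assumption with a shape check before unpacking; a rough sketch of what I mean, not my actual code:)

```python
# Sketch of a guard on the ordering assumption: record each parameter's shape
# at pack time...
expected_shapes = [tuple(p.size())
                   for name, mod in self.named_children() if name in self.Qlayers
                   for p in mod.parameters()]
# ...and verify the shapes still line up just before unpacking Ws back in.
current_shapes = [tuple(p.size())
                  for name, mod in self.named_children() if name in self.Qlayers
                  for p in mod.parameters()]
assert current_shapes == expected_shapes, "parameter order/shape changed"
```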
Pack into a NumPy vector, doing something like:
```python
import numpy as np

# Gather the parameters of the layers named in self.Qlayers, in order.
Qparams = []
Qmods = [mod for name, mod in self.named_children() if name in self.Qlayers]
for mod in Qmods:
    Qparams += [parm for parm in mod.parameters()]
# Flatten each parameter tensor and concatenate into one 1-D vector.
Ws = np.hstack([np.ravel(netp.data.numpy()) for netp in Qparams])
return Ws
```
Do stuff to Ws
Unpack from the NumPy vector, doing something like:
```python
from torch import Tensor

# Copy the (possibly modified) flat vector back in, relying on the same
# module/parameter iteration order as the packing step.
startIdx = 0
for qmod in [mod for name, mod in self.named_children() if name in self.Qlayers]:
    for netp in qmod.parameters():
        stopIdx = startIdx + np.prod(netp.data.size())
        netp.data.copy_(Tensor(np.reshape(Ws[startIdx:stopIdx], netp.data.size())))
        startIdx = stopIdx
```
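(For comparison, my understanding is that the same copy can be done without touching .data by mutating the parameters under torch.no_grad(); a minimal sketch of what I mean, not what I currently run:)

```python
import torch

# Same unpack as above, but done under torch.no_grad() so autograd is
# explicitly told not to track the in-place copy.
with torch.no_grad():
    startIdx = 0
    for qmod in [mod for name, mod in self.named_children() if name in self.Qlayers]:
        for netp in qmod.parameters():
            stopIdx = startIdx + netp.numel()
            chunk = torch.from_numpy(Ws[startIdx:stopIdx]).reshape(netp.size())
            netp.copy_(chunk)  # copy_ casts dtype if Ws is float64
            startIdx = stopIdx
```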
I also do something similar to grab gradients from the network, which are used to update Ws outside of PyTorch. I realize this may be inefficient; I just want to make sure I am not breaking PyTorch in some way by accessing/overwriting network weights and gradients like this.
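(Concretely, the gradient grab mirrors the packing code above; roughly this, assuming backward() has already run so every .grad is populated:)

```python
# Same iteration as the packing step, but reading .grad instead of .data.
grads = np.hstack([
    np.ravel(netp.grad.numpy())
    for name, mod in self.named_children() if name in self.Qlayers
    for netp in mod.parameters()
])
```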
Thank you very much for your guidance.