I would like to add a tensor to a model so that
model.to(device) also moves the tensor to the device.
register_buffer seems to do this; however, I don't want the tensor to appear in the model's
state_dict. Is there a way to do this?
My specific application is a model that uses a transformer encoder. For convenience, its constructor creates a causal mask and precomputes a positional-encoding matrix, both of which are used in forward. I want the mask and positional encoding to move to whatever device the model is moved to, but when I save the model to disk with
torch.save(model.state_dict()) I don't want those tensors wasting space.
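For concreteness, here is a minimal sketch of the setup I described (the model name, dimensions, and helper are made up for illustration). Registering the mask and positional encoding with register_buffer makes model.to(device) move them, but they then show up in the state_dict:

```python
import math
import torch
import torch.nn as nn

class EncoderModel(nn.Module):  # hypothetical example model
    def __init__(self, d_model=16, nhead=4, num_layers=2, max_len=32):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, nhead)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

        # Causal mask: upper-triangular -inf so position i cannot attend to j > i.
        mask = torch.triu(torch.full((max_len, max_len), float("-inf")), diagonal=1)
        self.register_buffer("causal_mask", mask)

        # Precomputed sinusoidal positional-encoding matrix.
        pos = torch.arange(max_len).unsqueeze(1)
        div = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        self.register_buffer("pos_encoding", pe)

    def forward(self, x):  # x: (seq_len, batch, d_model)
        seq_len = x.size(0)
        x = x + self.pos_encoding[:seq_len].unsqueeze(1)
        return self.encoder(x, mask=self.causal_mask[:seq_len, :seq_len])

model = EncoderModel()
# Both buffers follow model.to(device), but they are also saved — which is
# exactly what I want to avoid:
print([k for k in model.state_dict() if "mask" in k or "pos" in k])
```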