How to tie a tensor to a model's device without being in state_dict

I would like to add a tensor to a model so that model.to(device) also moves the tensor to the device. register_buffer seems to do this; however, I don’t want the tensor to be inside the model’s state_dict. Is there a method to do this?

My specific application is that I have a model that uses a transformer encoder. For convenience, it creates a causal mask and precomputes a positional encoding matrix in its constructor, which are then used in forward. I want the mask and positional encoding to move to whatever device the model is moved to, but when I save the model to disk with torch.save(model.state_dict()) I don't want those tensors wasting space.
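
For reference, the setup looks roughly like this (a simplified sketch; `EncoderModel`, the hyperparameters, and the exact PE math are illustrative, not my actual code):

```python
import math
import torch
import torch.nn as nn

class EncoderModel(nn.Module):
    def __init__(self, d_model=512, nhead=8, num_layers=2, max_len=1024):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

        # Causal mask: -inf strictly above the diagonal, 0 elsewhere
        # (an additive attention mask).
        mask = torch.triu(
            torch.full((max_len, max_len), float("-inf")), diagonal=1
        )

        # Standard sinusoidal positional encoding, precomputed once.
        pos = torch.arange(max_len).unsqueeze(1)
        div = torch.exp(
            torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model)
        )
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)

        # register_buffer makes both tensors follow model.to(device),
        # but by default they also show up in state_dict().
        self.register_buffer("causal_mask", mask)
        self.register_buffer("pos_encoding", pe)

    def forward(self, x):  # x: (seq_len, batch, d_model)
        seq_len = x.size(0)
        x = x + self.pos_encoding[:seq_len].unsqueeze(1)
        return self.encoder(x, mask=self.causal_mask[:seq_len, :seq_len])
```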

You could use self.register_buffer and set the persistent argument to False.
From the docs:

persistent (bool) – whether the buffer is part of this module’s state_dict.
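
A minimal, self-contained demonstration (`TinyModel` is just a placeholder name):

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    def __init__(self, max_len=8):
        super().__init__()
        self.linear = nn.Linear(4, 4)
        mask = torch.triu(
            torch.full((max_len, max_len), float("-inf")), diagonal=1
        )
        # persistent=False: still moves with .to(device),
        # but excluded from state_dict().
        self.register_buffer("causal_mask", mask, persistent=False)

model = TinyModel()
print("causal_mask" in model.state_dict())           # False
print("causal_mask" in dict(model.named_buffers()))  # True: still a buffer
model.to("cpu")  # works the same with "cuda" if available
print(model.causal_mask.device)   # stays in sync with the model's device
torch.save(model.state_dict(), "model.pt")  # mask is not written to disk
```

Note that non-persistent buffers are also skipped when loading a state_dict, so the precomputed tensors from the constructor are simply kept as-is after load_state_dict, which is exactly what you want here.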


Not sure how I missed that. Thanks!