Vectorizing random matrix creation using `vmap` with tensor of seeds

RS-Coop · July 25, 2024, 4:36pm

I have a situation where I want to apply a batch of random matrices to a batch of vectors, and I have a given batch of seeds that are used to construct each matrix. I do this by using torch.manual_seed() before generating the matrix. This seems ripe for torch.vmap, as it lets me vectorize over the batch dimension. However, the problem is that torch.manual_seed() converts the given seed to an integer, which implicitly calls .item() on a singleton tensor resulting in an error.

Here is some sample code implementing that idea.

def _sketch(self, f, s):
    n = f.shape[0]
    r = round(0.01*n)

    torch.manual_seed(s)  # fails under vmap: int(s) calls .item() on the batched seed
    sketch = torch.randn(n, r, device='cpu')

    return torch.einsum('nc,nr->rc', f, sketch)

def sketch(self, features, seeds):
    return torch.vmap(self._sketch, randomness="different")(features, seeds)

It doesn’t seem to me like there should be any reason this can’t work. Any help here would be much appreciated!

Soumya_Kundu · July 26, 2024, 10:39am

Assuming this is the error you got:

> RuntimeError: vmap: It looks like you're calling .item() on a Tensor. We don't support vmap over calling .item() on a Tensor, please try to rewrite what you're doing with other operations. If error is occurring somewhere inside PyTorch internals, please file a bug report.

I think torch.manual_seed() internally calls .item() to turn the seed tensor into a Python scalar, and vmap can't handle that step, I guess.

Replacing torch.manual_seed(s) with something of your own may work?

AlphaBetaGamma96 · July 26, 2024, 11:31am

The .item() method converts a single-element torch Tensor to a plain Python number (an int, float, or bool, depending on the dtype). Because torch.func.vmap vectorizes over PyTorch primitives, if you leave the torch namespace (e.g. by using .item()), torch.func.vmap can no longer vectorize your function.

If you want different random samples for each batch element, the randomness="different" kwarg of torch.func.vmap will do that. I assume you’re manually selecting the seed for reproducibility?
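As a minimal sketch of that failure mode (seeded_draw is a made-up name for illustration, and this assumes a PyTorch version that provides torch.vmap), converting the batched seed with int() is enough to trigger the error, before the seed is ever set:

```python
import torch

def seeded_draw(seed):
    # torch.manual_seed needs a Python int; int(seed) calls .item() on the
    # batched tensor, which vmap does not support.
    torch.manual_seed(int(seed))
    return torch.randn(3)

seeds = torch.tensor([0, 1, 2])

try:
    torch.vmap(seeded_draw, randomness="different")(seeds)
    vmap_failed = False
except RuntimeError as err:
    vmap_failed = True
    print(err)
```

Calling seeded_draw(seeds[0]) outside of vmap works, since .item() on an ordinary tensor is fine.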

The docs state you can get around this in two ways:

1. Generate all your random data outside of vmap, and just vectorize over the data
2. Define your own generator directly
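A rough sketch of the first option, assuming CPU tensors and integer seeds. The names make_sketches and apply_sketch are hypothetical, and the Python loop over seeds is still there; only the einsum is vectorized:

```python
import torch

def make_sketches(seeds, n, r):
    # One private generator per seed, so the global RNG state is left alone.
    mats = []
    for s in seeds.tolist():
        gen = torch.Generator(device="cpu").manual_seed(s)
        mats.append(torch.randn(n, r, generator=gen))
    return torch.stack(mats)  # shape: (batch, n, r)

def apply_sketch(f, sketch):
    # Same contraction as in the original post.
    return torch.einsum("nc,nr->rc", f, sketch)

features = torch.randn(4, 200, 3)        # (batch, n, c)
seeds = torch.tensor([11, 7, 11, 42])    # repeated seeds give repeated matrices
sketches = make_sketches(seeds, n=200, r=2)

# vmap only sees tensors here, so no .item() call happens inside it.
out = torch.vmap(apply_sketch)(features, sketches)
```

Because every matrix comes from its own seeded generator, the order in which the seeds arrive does not matter.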

The docs on randomness are here: UX Limitations — PyTorch 2.4 documentation

RS-Coop · July 26, 2024, 4:02pm

Yes, that is the error. You are correct that .item() is called internally when converting the singleton integer tensor to a Python integer, which is causing said error.

Is there an obvious replacement for manual_seed() that still allows generating reproducible random tensors?

RS-Coop · July 26, 2024, 4:08pm

Yes, the seed was used previously, elsewhere in my program, to generate a random matrix, and now I want to use the same seed to reproduce it. So the options of randomness="different" or randomness="same" don’t work since I want specific randomness.

I would like to avoid generating all the data outside of vmap, and that only solves half the problem since it still requires looping over the seeds to generate the batch of matrices.

I could define a generator object, but I still have to manually seed it if I want reproducibility. The seeds are also batched themselves from a larger set, so there is no distinct order, eliminating the possibility of seeding once outside of vmap and then relying on the reproducible sequence.

Soumya_Kundu · July 28, 2024, 11:27am

Not that I know of. A quick search turned up nothing, not even here: Reproducibility — PyTorch 2.4 documentation
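One thing that might be worth trying is to skip the stateful RNG entirely and derive the matrix entries from the seed with plain tensor arithmetic, which vmap can batch. The following is only a sketch under loud assumptions: seeded_randn and _hash32 are made-up helpers, not PyTorch APIs; the integer hash plus Box-Muller transform is an illustrative choice whose statistical quality has not been vetted; and it will NOT reproduce the values that torch.manual_seed() followed by torch.randn() gives, so the same function would have to be used everywhere the matrix is built.

```python
import math
import torch

MASK32 = 0xFFFFFFFF

def _hash32(x):
    # Integer mixing hash on int64 tensors; masking keeps values in 32 bits.
    x = (((x >> 16) ^ x) * 0x45D9F3B) & MASK32
    x = (((x >> 16) ^ x) * 0x45D9F3B) & MASK32
    return (x >> 16) ^ x

def seeded_randn(seed, n, r):
    # Hypothetical stateless replacement for manual_seed + randn (assumption).
    idx = torch.arange(n * r, dtype=torch.int64)
    key = _hash32((seed.to(torch.int64) * 0x9E3779B1) & MASK32)
    # Two hashed counters per entry, both keyed by the seed.
    c1 = _hash32(_hash32(2 * idx) ^ key)
    c2 = _hash32(_hash32(2 * idx + 1) ^ key)
    # Map the counters to uniforms strictly inside (0, 1).
    u1 = (c1.to(torch.float64) + 0.5) / 2.0**32
    u2 = (c2.to(torch.float64) + 0.5) / 2.0**32
    # Box-Muller transform: two uniforms -> one standard normal sample.
    z = torch.sqrt(-2.0 * torch.log(u1)) * torch.cos(2.0 * math.pi * u2)
    return z.to(torch.float32).reshape(n, r)

def _sketch(f, s):
    n = f.shape[0]
    r = max(1, round(0.01 * n))
    sketch = seeded_randn(s, n, r)
    return torch.einsum("nc,nr->rc", f, sketch)

features = torch.randn(3, 200, 4)
seeds = torch.tensor([5, 9, 5])

# No torch random ops run inside _sketch, so no randomness flag is needed.
batched = torch.vmap(_sketch)(features, seeds)
looped = torch.stack([_sketch(f, s) for f, s in zip(features, seeds)])
```

Since the output depends only on the seed value and the entry index, the seeds can arrive in any order or batch and still give back the same matrices.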