Performing normal or uniform initializations on float tensors results in only zeros

import torch
pytorchGPUDirectCreateWEmpty = torch.empty(size=(20000000, 128), dtype=torch.float, device='cuda', requires_grad=False, pin_memory=False).uniform_(-1, 1)
pytorchGPUDirectCreateWEmpty

results in

tensor([[0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        ...,
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.]], device='cuda:0')

and

import torch
torch.set_default_tensor_type('torch.cuda.FloatTensor')
u_embeddings = torch.nn.Embedding(20000000, 128, padding_idx=None, max_norm=None, norm_type=2.0, scale_grad_by_freq=False, sparse=False, _weight=None)
u_embeddings.weight.data.uniform_(-1, 1)
u_embeddings.weight.data

results in

tensor([[0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        ...,
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.]])

If I initialize with double instead of float, the initialization works fine. I could convert to float afterwards, but I am working with limited memory and cannot afford to allocate a double tensor first and then convert it.
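
Roughly what I mean, as a minimal sketch (w_double / w_float are just illustrative names): initializing as double and converting works, but the double tensor alone needs 8 bytes per element before the float copy even exists, which is exactly the memory I don't have.

import torch

# Works, but allocates the full double tensor (8 bytes per element) up front
w_double = torch.empty(size=(20000000, 128), dtype=torch.double, device='cuda').uniform_(-1, 1)
w_float = w_double.float()  # copies into a new float32 tensor
del w_double                # only then can the double buffer be freed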

Why is the initialization not working for float tensors?

This code seems to work for PyTorch 1.2.0 and CUDA 10.0 (installed via conda binaries):

pytorchGPUDirectCreateWEmpty = torch.empty(size=(20000000, 128), dtype=torch.float, device='cuda', requires_grad=False, pin_memory=False).uniform_(-1, 1)
print(pytorchGPUDirectCreateWEmpty.min())
> tensor(-1., device='cuda:0')
print(pytorchGPUDirectCreateWEmpty.max())
> tensor(1.0000, device='cuda:0')
print(pytorchGPUDirectCreateWEmpty.mean())
> tensor(-1.4027e-05, device='cuda:0')

on a Titan V (driver 418.56).

Which setup are you using?
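
Something along these lines (just the standard torch queries) would show the versions and the device:

import torch

print(torch.__version__)                    # PyTorch version
print(torch.version.cuda)                   # CUDA version the binaries were built with
print(torch.cuda.get_device_name(0))        # GPU model
print(torch.cuda.get_device_capability(0))  # compute capability of the GPU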

I am using Google Colab, so the GPU is a Tesla K80, with PyTorch 1.1.0 and CUDA 10.0.130.

I tried the embedding one this morning and somehow it works now. But torch.empty still doesn't work at the shape I specified. I tried a smaller shape and it worked, so it may be an issue with memory.
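
For reference, the float tensor alone is 20,000,000 × 128 × 4 bytes ≈ 9.5 GiB, which is already close to the K80's ~12 GB, so memory does seem plausible. A quick check along these lines (standard torch.cuda queries, not part of the notebook) could confirm how much room is actually left:

import torch

numel = 20000000 * 128
print(numel * 4 / 1024**3)                                          # ~9.5 GiB needed for float32
print(torch.cuda.memory_allocated(0) / 1024**3)                     # GiB currently held by tensors
print(torch.cuda.get_device_properties(0).total_memory / 1024**3)   # total GiB on the device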

For convenience, here's a notebook with the code I ran this morning:

https://colab.research.google.com/drive/1wwHWF92TzRsdCqpG3V40N-CyDlpuaEIU