Can a torch.Tensor have a different precision per element?

FruitVinegar · September 5, 2019, 1:54am

Hello, PyTorch users.
As the title says, can a torch.Tensor have a different precision for each element?
For example:

my_tensor = torch.Tensor([1, 1.0, 1.000])
# desired element types, respectively: torch.int8, torch.half, torch.float, etc.

If this is impossible (I suspect PyTorch doesn’t support these semantics), is there another way to implement something like it? I’d like to reduce memory usage as much as I can…

Any answer would be helpful. Thanks.

ptrblck · September 5, 2019, 11:24am

I don’t think this is possible using a tensor, since e.g. the strides would be hard (impossible) to handle.
Python containers (e.g. list) are quite flexible regarding different types of elements, but you won’t be able to apply e.g. a matrix multiplication on this “array”.

If you are running out of memory, you could have a look at torch.utils.checkpoint to trade compute for memory or e.g. apex/amp for mixed precision training.
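As a minimal sketch of the torch.utils.checkpoint suggestion: the small `block` module and the input shape below are made up for illustration, and `use_reentrant=False` assumes a reasonably recent PyTorch release.

```python
import torch
from torch.utils.checkpoint import checkpoint

# Hypothetical small block; in practice this would be an expensive part of the model.
block = torch.nn.Sequential(
    torch.nn.Linear(16, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 16),
)

x = torch.randn(4, 16, requires_grad=True)

# Activations inside `block` are not stored during the forward pass;
# they are recomputed during backward, trading compute for memory.
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
```

The output and gradients should match a plain `block(x)` call; only the peak activation memory and the runtime differ.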

FruitVinegar · September 18, 2019, 12:41pm

Thanks for the answer.
I’m back to share how my idea for this problem has progressed.

To recap, the problem is ‘element-wise (pixel-wise) different precision in a Tensor (image)’.

For example, look at this picture:

Although the background of the image is out of focus, a human can still recognize that the object in the image is a lamp.
On a similar principle, can object-detection networks recognize objects in an image whose precision (resolution) varies from pixel to pixel? Moreover, would this reduce memory usage?

This is why I raised the question.

Here is where my idea stands now.
I wrote a helper function based on one from this forum:

# returns actual allocated memory size of given tensor
def get_actual_memsize(tensor):
    return tensor.element_size() * tensor.nelement()

To test my idea, I made a sample tensor:

import torch

B, C, H, W = 1, 10, 224, 224

tensor = torch.randn(B, C, H, W)

print(tensor.type())
print(get_actual_memsize(tensor))

This prints:

torch.FloatTensor
2007040

I decomposed the tensor based on each element’s value, like this:

threshold = 0.2

important_region = torch.ge(tensor, threshold)
unimportant_region = ~important_region #just reverse

unimportant_values = torch.masked_select(tensor, unimportant_region).to(torch.float16) #Precision Reduction
important_values = torch.masked_select(tensor, important_region)

print(unimportant_values.type())
print(important_values.type())

print(get_actual_memsize(unimportant_values) + get_actual_memsize(important_values) + get_actual_memsize(unimportant_region))

and the memory is reduced from 2007040 to 1927504 bytes (this total includes the boolean mask, which costs one byte per element):

torch.HalfTensor
torch.FloatTensor
1927504

The sample tensor is small (its size is [1, 10, 224, 224]), so I think the absolute saving will be larger for bigger tensors.
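To make the accounting explicit, here is a rough sketch of the byte count for this split. The function name `split_memsize` is made up for illustration, and it assumes a float32 input with the low-precision part stored as float16. Per element the cost is 4 bytes (important) or 2 bytes (unimportant), plus 1 byte for the bool mask, so the split only saves memory when fewer than half of the elements are kept at full precision.

```python
import torch

def split_memsize(tensor, threshold):
    """Bytes needed for the mask-based split of a float32 tensor (illustrative helper)."""
    mask = tensor >= threshold
    n_important = int(mask.sum())
    n_unimportant = tensor.numel() - n_important
    return {
        "important": n_important * tensor.element_size(),  # kept as float32
        "unimportant": n_unimportant * 2,                  # stored as float16
        "mask": mask.numel() * mask.element_size(),        # bool mask, 1 byte each
    }

# Exactly half of the elements are >= 5, which is the break-even point.
sample = torch.arange(10, dtype=torch.float32)
sizes = split_memsize(sample, threshold=5)
total = sum(sizes.values())
original = sample.numel() * sample.element_size()
```

With the threshold at 0.2 on random normal data, somewhat fewer than half of the elements are important, which is why the saving above is only a few percent.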

Now I’m trying to restore the tensor from these decomposed components, and it’s quite confusing. I’m looking for a suitable PyTorch API (something like masked_fill), but nothing I’ve found fits so far…

That’s my current idea. If you have a similar idea or know a PyTorch API for this, please tell me how to restore the tensor.

ptrblck · September 18, 2019, 10:00pm

That’s the pitfall I tried to mention in my last post.
I’m not sure there is a clean way to reconstruct an array (tensor) with mixed-precision data types, as the strides would change based on the current element type.

Let’s think about a plain tensor:

x = torch.tensor([[0., 1., 2., 3.],
                  [4., 5., 6., 7.]])
print(x.size())
> torch.Size([2, 4])
print(x.stride())
> (4, 1)
print(x.nelement())
> 8
print(x.element_size())
> 4

The stride shows how many elements (each element_size() bytes wide) you would have to skip in the contiguous memory to get to the next index along the corresponding dimension.
E.g. to index the 5 in x, you would have to use x[1, 1], which corresponds to an offset of 4+1 elements from the beginning of the data.
Also, as you can see, element_size() is defined per tensor, which is what makes indexing (and shape changes) via strides possible.
If you used different data types within one tensor, you would need to know the size of every preceding element just to locate any single value.
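The offset arithmetic can be checked directly. This is only a sketch; `flat_offset` is a helper name made up for illustration.

```python
import torch

x = torch.tensor([[0., 1., 2., 3.],
                  [4., 5., 6., 7.]])

def flat_offset(index, stride):
    """Element offset into the underlying storage for a given index."""
    return sum(i * s for i, s in zip(index, stride))

# x.stride() is (4, 1), so x[1, 1] sits at 1*4 + 1*1 elements from the start.
offset = flat_offset((1, 1), x.stride())
value = x.flatten()[offset]

# The byte offset only works out because every element has the same size.
byte_offset = offset * x.element_size()
```

With per-element data types there would be no single element_size() to multiply by, so the byte offset could not be computed from the index and strides alone.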

That being said, mixing element types won’t work with this strided layout on contiguous memory chunks. There might be some other data layout I’m not aware of, so please let me know if you find something.
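For the mask-based decomposition from the earlier post, one possible way to put the pieces back together is boolean-mask assignment. This is a sketch only: `decompose` and `restore` are names made up here, and the restored tensor is a regular float32 tensor again, so the saving applies only while the data is kept in its decomposed form.

```python
import torch

def decompose(tensor, threshold):
    """Split into a bool mask, full-precision values, and float16 values."""
    mask = tensor >= threshold
    high = tensor[mask]                     # important values, original dtype
    low = tensor[~mask].to(torch.float16)   # unimportant values, reduced precision
    return mask, high, low

def restore(mask, high, low):
    """Rebuild a dense tensor; masked assignment fills in the same order masked_select reads."""
    out = torch.empty(mask.shape, dtype=high.dtype)
    out[mask] = high
    out[~mask] = low.to(high.dtype)
    return out

torch.manual_seed(0)
t = torch.randn(1, 10, 224, 224)
mask, high, low = decompose(t, threshold=0.2)
restored = restore(mask, high, low)
```

The important values come back exactly, while the unimportant ones carry the float16 rounding error.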