Thanks for the answer.
I’m back to share how my idea for this problem has progressed.
To recap, the problem is ‘element-wise (pixel-wise) different precision within a tensor (image)’.
For example, look at this picture:
Although the background of the image is out of focus, a human can still recognize that the object in the image is a lamp.
On a similar principle, can object-detection networks recognize objects in an image whose precision (resolution) varies pixel by pixel? Moreover, can this reduce memory usage?
This is why I raised this problem.
Here is where my idea stands now.
I made a helper function based on one from this forum:
def get_actual_memsize(tensor):
    # returns the actual memory size of the given tensor's data, in bytes
    return tensor.element_size() * tensor.nelement()
To test my idea, I made a sample tensor:
import torch
B, C, H, W = 1, 10, 224, 224
tensor = torch.randn(B, C, H, W)
I decomposed the tensor based on each element’s value, like this:
threshold = 0.2
important_region = torch.ge(tensor, threshold)  # bool mask: True where value >= threshold
unimportant_region = ~important_region  # just the inverse mask
unimportant_values = torch.masked_select(tensor, unimportant_region).to(torch.float16)  # precision reduction
important_values = torch.masked_select(tensor, important_region)  # kept in float32
# one mask has to be stored as well so the tensor can be restored later
total = (get_actual_memsize(unimportant_values)
         + get_actual_memsize(important_values)
         + get_actual_memsize(unimportant_region))
print(total)
The memory was reduced: 2007040 -> 1927504 bytes.
Since the sample tensor is small (its size is [1, 10, 224, 224]), I think the absolute saving will be much larger for a bigger tensor.
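As a rough check of where the bytes go in the numbers above, here is a small estimate in plain Python. It assumes the values are standard normal (as with torch.randn), that float32, float16 and bool elements take 4, 2 and 1 bytes, and that one bool mask is stored alongside the values. The name expected_bytes is made up for this illustration.

```python
import math

def expected_bytes(num_elements, threshold, full_bytes=4, half_bytes=2, mask_bytes=1):
    """Estimate total bytes for the float32 / float16 / bool-mask split.

    Hypothetical helper: assumes the elements are drawn from a standard
    normal distribution, so the kept fraction is P(Z >= threshold).
    """
    # Fraction of elements at or above the threshold (stored as float32).
    p_important = 0.5 * (1.0 - math.erf(threshold / math.sqrt(2.0)))
    # Average cost per element: full precision, reduced precision, plus the mask.
    per_element = (p_important * full_bytes
                   + (1.0 - p_important) * half_bytes
                   + mask_bytes)
    return num_elements * per_element

n = 1 * 10 * 224 * 224          # elements in the sample tensor
baseline = n * 4                # all float32: 2007040 bytes
estimate = expected_bytes(n, threshold=0.2)
print(baseline, round(estimate), round(estimate / baseline, 3))
```

The estimate lands close to the measured 1927504 bytes, a saving of roughly 4%. In this breakdown the bool mask costs 1 byte for every element, which offsets a good part of the 2 bytes saved on each float16 value.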
Now I’m trying to restore the tensor from these decomposed components, which is quite confusing. I’m looking for a suitable PyTorch API (something like masked_fill), but nothing I have found so far fits…
That is my current idea. If you have a similar idea or know a suitable PyTorch API, please tell me how to restore the tensor.
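One direction that may work for the restore step is plain boolean-mask index assignment, which writes values back in the same row-major order that torch.masked_select reads them. A minimal sketch, assuming the mask and both value tensors come from the decomposition above; restore is a hypothetical helper name, not a PyTorch API, and the float16 part is only approximately recovered.

```python
import torch

def restore(important_values, unimportant_values, important_region):
    """Rebuild a dense tensor from the two value lists and the bool mask.

    Hypothetical helper for illustration only.
    """
    restored = torch.empty(important_region.shape, dtype=important_values.dtype)
    # Boolean-mask assignment fills the True positions in row-major order,
    # matching the order produced by torch.masked_select.
    restored[important_region] = important_values
    # Upcast the float16 values before writing them back.
    restored[~important_region] = unimportant_values.to(important_values.dtype)
    return restored

torch.manual_seed(0)
tensor = torch.randn(1, 10, 224, 224)
threshold = 0.2
important_region = torch.ge(tensor, threshold)
important_values = torch.masked_select(tensor, important_region)
unimportant_values = torch.masked_select(tensor, ~important_region).to(torch.float16)

restored = restore(important_values, unimportant_values, important_region)
max_err = (restored - tensor).abs().max().item()
print(restored.shape, max_err)
```

The important region should come back exactly, while the unimportant region carries the float16 rounding error. Tensor.masked_scatter_ looks like another candidate for the same job, with the source cast to the destination dtype first.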