What does torch.bucketize do, and what is it used for?

knoriy · March 3, 2022, 1:38pm

Hi,

Could someone please explain what torch.bucketize does? I’ve looked at the documentation, but I can’t work out what the operation is or how it could be used in practice.

Thank you in advance.

suraj.pt · March 3, 2022, 3:30pm

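Hey, torch.bucketize takes continuous input values and discretizes them into integer bucket indices, using a sorted 1-D tensor of boundaries. For each value in the input tensor, the returned tensor contains the index of the boundary on its right, which with the default right=False is the first boundary greater than or equal to the value. In this example, boundaries[2] < x[0][0] <= boundaries[3], so out[0][0] == 3.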

In [15]: boundaries = torch.tensor([1,3,6,7,9])

In [16]: x
Out[16]:
tensor([[6.7165, 1.9131],
        [7.0514, 8.5162]])

In [17]: out = torch.bucketize(x, boundaries)

In [18]: out
Out[18]:
tensor([[3, 1],
        [4, 4]])
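
To check the rule by hand, here is a rough pure-Python sketch. It is not how PyTorch implements it; it only mirrors the documented behaviour, where the default (right=False) corresponds to bisect_left and right=True to bisect_right. The helper name bucket_index is made up for this illustration.

```python
from bisect import bisect_left, bisect_right

boundaries = [1, 3, 6, 7, 9]  # must be sorted in increasing order


def bucket_index(value, right=False):
    """Hypothetical helper mirroring the torch.bucketize rule for one value."""
    if right:
        # boundaries[i-1] <= value < boundaries[i]
        return bisect_right(boundaries, value)
    # default: boundaries[i-1] < value <= boundaries[i]
    return bisect_left(boundaries, value)


# The four values from the example above, flattened row by row.
thread_values = [6.7165, 1.9131, 7.0514, 8.5162]
thread_buckets = [bucket_index(v) for v in thread_values]

below = bucket_index(0.5)    # smaller than every boundary
above = bucket_index(10.0)   # larger than every boundary
on_edge_default = bucket_index(3)            # value equal to a boundary
on_edge_right = bucket_index(3, right=True)  # same value, right=True

print(thread_buckets, below, above, on_edge_default, on_edge_right)
```

Values below the first boundary land in bucket 0 and values above the last one land in bucket len(boundaries), so there are len(boundaries) + 1 possible outputs. The right flag only matters for values that sit exactly on a boundary.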

Some applications need discrete values, and this function is an easy way to bin your input according to the boundaries you provide. Hope this helps!
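
One possible use, as an illustration rather than anything specific to your case: the bucket indices can be fed to an embedding layer, so each bin of a continuous feature gets its own learned vector. A minimal sketch, assuming the same boundaries and x as above; the embedding size of 4 is arbitrary.

```python
import torch

boundaries = torch.tensor([1, 3, 6, 7, 9])
x = torch.tensor([[6.7165, 1.9131],
                  [7.0514, 8.5162]])

# Integer bucket index for every element of x (int64 by default).
bucket_ids = torch.bucketize(x, boundaries)

# One embedding row per possible bucket: len(boundaries) + 1 of them,
# because values can fall below the first or above the last boundary.
embedding = torch.nn.Embedding(num_embeddings=len(boundaries) + 1,
                               embedding_dim=4)  # 4 is an arbitrary choice

# Each scalar in x is replaced by the vector of its bucket.
vectors = embedding(bucket_ids)
print(bucket_ids)
print(vectors.shape)
```

The output keeps the shape of x and adds a trailing dimension of size embedding_dim.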

knoriy · March 3, 2022, 3:46pm

Thank you, that was super helpful. It makes sense now!!