PyTorch Implementation of TensorFlow Code

Hi, I am trying to port TF code to PyTorch, but I am unable to replicate one piece of functionality in Torch. Here are the two TensorFlow lines:

neighbourRagged = tf.RaggedTensor.from_row_splits(x, row_splits=rowSplits)  # group rows of x by rowSplits
neighbourFeatures = tf.math.reduce_max(neighbourRagged, axis=1)             # element-wise max within each group

where x is a large float array of shape (257997, 64) and rowSplits is an int32 array of shape (32769,).

I am trying to port these two lines from TF to Torch. I am aware that Torch does not support RaggedTensor yet, but is there any workaround to specifically implement these two lines?
I have uploaded the exact .npy files (zipped) to Drive, which can be accessed from here.

The x and rowSplits arrays can be accessed from the “files” folder after extracting the zip file.

import numpy as np

x = np.load("x.npy")
rowSplits = np.load("numNeighboursArray.npy")

If you are able to replicate the functionality, please do let me know!
Thank you in advance!

Tagging @ptrblck

I’m not familiar with the TF code, but would these two operations correspond to a scatter_max op?
If so, you could take a look at this post.
Or is the reduction axis in the max operation different from the split axis, so that you would still get “ragged tensors” with scalar values in one dimension?


@ptrblck sir,

The first line splits the x array (n, 64) along the n axis at the indices given in rowSplits and converts it into a RaggedTensor of shape (32768, None, 64), where None indicates a variable-length dimension,
e.g. raggedTensor[0].shape = [20, 64], raggedTensor[1].shape = [19, 64], etc.

The second line takes the max along axis 1 (the variable-length axis) to get the output. In the example above, op[0] is the element-wise maximum across the 20 rows of the first group, op[1] across the 19 rows of the second group, etc.
I hope I have made it clearer.
Can I replicate this behaviour using scatter_max or any other operation?
@albanD, can you please suggest something on this?
Thank you!
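
For concreteness, here is a minimal loop-based PyTorch sketch of these semantics (toy shapes; just a reference, and it assumes every group is non-empty, since max over an empty slice raises an error):

import torch

# Toy shapes for illustration (the real x is (257997, 64) with 32768 groups).
x = torch.randn(10, 64)
rowSplits = torch.tensor([0, 2, 5, 7, 10])  # groups: x[0:2], x[2:5], x[5:7], x[7:10]

# Element-wise max within each variable-length group.
neighbourFeatures = torch.stack([
    x[rowSplits[i]:rowSplits[i + 1]].max(dim=0).values
    for i in range(len(rowSplits) - 1)
])  # shape: (num_groups, 64)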

Is it not possible to write this code in Torch?
Not having RaggedTensor is one of the few drawbacks I am facing with Torch. If there is some workaround, please help me through it.

Have you tried just padding your matrices with zeros to make them all the same size? I don’t see the rationale/benefit of making a matrix variable along one axis for ML.


@J_Johnson that can be done, but here is the caveat. E.g.:

x = torch.randn(10, 3)
rowSplits = [0, 2, 5, 7, 9]

This means the splits are x[0:2], x[2:5], x[5:7], x[7:9].
Now, to make each of them the same size, I would have to pad zeros in between these rows. Say I write the new x into x_ like:

x_[0:2] = x[0:2]
x_[2] = 0         # inserted padding
x_[3:6] = x[2:5]  # no padding needed here

etc. Now, how can I do this efficiently without running a for loop (performance hit)?
Another point to note: I also have to take the max within each such partition. For example, in the original x array I need the element-wise max from x[0] to x[1], i.e.
(From first slice)
op[0,0]=max{x[0,0],x[1,0]}
op[0,1]=max{x[0,1],x[1,1]}
op[0,2]=max{x[0,2],x[1,2]}

(From second slice)
op[1,0]=max{x[2,0],x[3,0],x[4,0]}
op[1,1]=max{x[2,1],x[3,1],x[4,1]}
op[1,2]=max{x[2,2],x[3,2],x[4,2]}
And so on

Can this be done efficiently?
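
For reference, the padding-plus-max computation described above can be sketched loop-free with torch.split and pad_sequence (a hedged sketch, not the thread’s final solution; padding with -inf keeps the max correct even for negative values):

import torch
from torch.nn.utils.rnn import pad_sequence

x = torch.randn(10, 3)
rowSplits = [0, 2, 5, 7, 9]

# Size of each partition, then split x into variable-length groups.
sizes = [rowSplits[i + 1] - rowSplits[i] for i in range(len(rowSplits) - 1)]
groups = torch.split(x[:rowSplits[-1]], sizes)  # tuple of (n_i, 3) tensors

# Pad with -inf so padding can never win the max, then reduce along the padded axis.
padded = pad_sequence(list(groups), batch_first=True, padding_value=float('-inf'))
op = padded.max(dim=1).values  # (num_partitions, 3)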

Hi, I found a workaround based on @J_Johnson’s suggestion!
Below is a demo NumPy implementation.

import numpy as np

x = np.random.randint(low=10, size=(10, 3))
# array([[6, 0, 3],
#        [0, 7, 0],
#        [4, 6, 8],
#        [1, 6, 7],
#        [2, 6, 1],
#        [8, 5, 5],
#        [9, 8, 3],
#        [4, 9, 1],
#        [5, 8, 6],
#        [6, 6, 3]])

splits = np.array([0, 2, 3, 6, 7, 10])
splits_ = np.subtract(splits[1:], splits[:-1])
# splits_ = [2, 1, 3, 1, 3]
l = len(splits_)  # l = 5
m = max(splits_)  # m = 3
neighbourRagged = np.zeros([l, m, 3])
# this will be my output array where I can copy
# the input with zero padding

I set the output's second-axis length to the maximum split size (in this case 3); this is the length of the second axis after zero-padding.

Now, for each of the l blocks of size (m, 3), I fill the first i rows (where i = j - prev is the split size) with the corresponding rows from the input array; the remaining rows stay at zero.

prev = 0
for idx, i, j in zip(range(l), splits_, splits[1:]):
    neighbourRagged[idx, :i, :] = x[prev:j, :]
    prev = j

# neighbourRagged
# array([[[6., 0., 3.],
#         [0., 7., 0.],
#         [0., 0., 0.]],
#
#        [[4., 6., 8.],
#         [0., 0., 0.],
#         [0., 0., 0.]],
#
#        [[1., 6., 7.],
#         [2., 6., 1.],
#         [8., 5., 5.]],
#
#        [[9., 8., 3.],
#         [0., 0., 0.],
#         [0., 0., 0.]],
#
#        [[4., 9., 1.],
#         [5., 8., 6.],
#         [6., 6., 3.]]])

Now, to obtain the max output, I take the maximum along axis=1:

neighbourFeatures = neighbourRagged.max(axis=1)
# neighbourFeatures
# array([[6., 7., 3.],
#        [4., 6., 8.],
#        [8., 6., 7.],
#        [9., 8., 3.],
#        [6., 9., 6.]])

This is the output I want.
I was wondering whether we can implement this in PyTorch using some efficient function (without loops, etc.).

@ptrblck
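
For reference, one possible loop-free approach in plain PyTorch is a scatter-style reduction (a hedged sketch; scatter_reduce_ with reduce='amax' requires a reasonably recent PyTorch release):

import torch

x = torch.randn(10, 3)
splits = torch.tensor([0, 2, 3, 6, 7, 10])

sizes = splits[1:] - splits[:-1]                                  # rows per partition
group = torch.repeat_interleave(torch.arange(len(sizes)), sizes)  # group id of every row

# Reduce every row into its group's slot with an element-wise max.
# (An empty partition would keep its initial value of 0.)
out = torch.zeros(len(sizes), x.size(1))
out.scatter_reduce_(0, group.unsqueeze(1).expand_as(x), x,
                    reduce='amax', include_self=False)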

Maybe you can try:

import torch

x = torch.randint(0, 10, (10, 3))
splits = torch.tensor([0, 2, 3, 6, 7, 10])
x_split = [x[splits[i]:splits[i+1]] for i in range(len(splits)-1)]

Or something to that effect, depending on the specifics.
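
Continuing from the x_split list above, the per-group max could then be sketched as (a hedged sketch; it assumes every split is non-empty, so no slice is empty):

# Reduce each variable-length slice and stack the results.
x_maxes = torch.stack([g.max(dim=0).values for g in x_split])  # (num_splits, 3)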


I haven’t checked it, but have you had a chance to try e.g. PyTorch Scatter from the previously linked topic?
If so, did a scatter operation work for you or were you hitting any issues with it?


Thank you so much @ptrblck!
The segment_csr function from torch_scatter is exactly what I wanted.

import torch
from torch_scatter import segment_csr
x = torch.tensor([[6, 0, 3],
       [0, 7, 0],
       [4, 6, 8],
       [1, 6, 7],
       [2, 6, 1],
       [8, 5, 5],
       [9, 8, 3],
       [4, 9, 1],
       [5, 8, 6],
       [6, 6, 3]])
splits = torch.tensor([0, 2, 3, 6, 7, 10])
segment_csr(x, splits, reduce='max')
# tensor([[6, 7, 3],
#         [4, 6, 8],
#         [8, 6, 7],
#         [9, 8, 3],
#         [6, 9, 6]])
# Exact answer
Thank you once again @ptrblck, and thank you to @J_Johnson as well!