Good evening,

I have been implementing some raycasting operations in NumPy and thought about moving them to PyTorch to take advantage of GPU parallelization.

However, I am struggling with some functions like the cross product applied over batches of tensors.

For example, I want to compute the cross product between each of 10 vectors and each of 6000 other vectors.

With NumPy I would broadcast them, along these lines:

pvec = np.cross(directions[None, :, :], v0v2[:, None, :])
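To make the shapes concrete (the sizes here are just the 10 and 6000 from my example, and the variable names match my snippet above):

```python
import numpy as np

directions = np.random.rand(10, 3)   # 10 ray direction vectors (illustrative data)
v0v2 = np.random.rand(6000, 3)       # 6000 triangle edge vectors (illustrative data)

# Broadcasting: (1, 10, 3) x (6000, 1, 3) -> (6000, 10, 3),
# i.e. one cross product for every (direction, edge) pair.
pvec = np.cross(directions[None, :, :], v0v2[:, None, :])
print(pvec.shape)  # (6000, 10, 3)
```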

With PyTorch this seems to be a problem, as torch.cross requires tensors of the same size, and broadcasting apparently is not available for this method.
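One workaround I have been considering is to expand both tensors to a common shape before calling torch.cross (as I understand it, expand creates views without copying data, so this should not waste memory). A minimal sketch, with the same illustrative shapes as above:

```python
import torch

directions = torch.rand(10, 3)   # 10 ray direction vectors (illustrative data)
v0v2 = torch.rand(6000, 3)       # 6000 triangle edge vectors (illustrative data)

# expand() creates broadcasted views, no data is copied
a = directions[None, :, :].expand(6000, 10, 3)
b = v0v2[:, None, :].expand(6000, 10, 3)

# Now both inputs have the same size, so torch.cross accepts them
pvec = torch.cross(a, b, dim=-1)  # shape (6000, 10, 3)
```

But I am not sure whether this is the idiomatic way, or whether it costs performance compared to a natively broadcasting op.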

Any idea on how to do something similar efficiently?

Also, what would be a good way to run such computations in parallel?

For example, applying the same 10 operations to a large number of vectors.

In CUDA I can see how the kernel would work, executing the same code once per element, but how to do this directly in PyTorch is not so clear to me. Is it even possible?
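To illustrate what I mean, here is a toy example (normalizing vectors, not my actual raycasting code): the loop version is how I would picture a CUDA kernel, one "thread" per vector, and the batched version is what I assume I am supposed to write in PyTorch instead:

```python
import torch

v = torch.rand(6000, 3)  # illustrative data

# Loop version: conceptually one "thread" per vector, like a CUDA kernel
out_loop = torch.stack([w / w.norm() for w in v])

# Batched version: one vectorized expression over all 6000 vectors,
# which I assume PyTorch dispatches as a single parallel GPU kernel
out_batched = v / v.norm(dim=-1, keepdim=True)
```

Is rewriting everything in this batched style the intended approach, or is there a more direct kernel-like mechanism?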

Yours,
Justin