Writing a custom loss using .data.numpy() in the middle

Hello. I want to implement a custom loss function. However, I need to use the output of a network as an “index” into a different tensor to compute the loss. For that, I use output.data.numpy()[0] to get the index value as an integer (since my original code is complicated, I wrote a simplified version below).

I get an error “in-place operations can be only used on variables that don’t share storage with any other variables, but detected that there are 2 objects sharing it”

Is there any other way to use the output of a network as an index into another tensor?

ground_truth = Variable(TORCH_TENSOR)
final_output_matrix = Variable(TORCH_TENSOR)
input = Variable(TORCH_TENSOR)

output = network(input)
index = output[i][j].data.numpy()[0]  # i, j: some int values
loss = abs(ground_truth[p] - final_output_matrix[index])  # p: some int value
loss.backward()

I’m not sure if this will help, but can’t you just do

output[i][j][0]

instead of creating a numpy array out of it?

Didn’t work, unfortunately. I think using the output of a network as an “index” into another matrix can’t be implemented, since it is not differentiable…

Yes, if your final output is an integer (that will be an index), then of course it’s not differentiable (functions that output integers are not).

Why do you need to do something like this?

If you’re training a classifier, you can just output a softmax and train it to match the one-hot encoding of your true label.
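For reference, a minimal sketch of that setup (the shapes, the Linear stand-in, and the labels are made up for illustration); the cross-entropy loss applies the softmax internally:

import torch
import torch.nn as nn
from torch.autograd import Variable

network = nn.Linear(100, 10)       # stand-in for your real model, 10 classes
criterion = nn.CrossEntropyLoss()  # log-softmax + NLL loss in one op

input = Variable(torch.randn(4, 100))              # a batch of 4 examples
target = Variable(torch.LongTensor([1, 0, 3, 7]))  # integer class labels

loss = criterion(network(input), target)  # differentiable end to end
loss.backward()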

I am currently doing a tricky thing. It’s a regression problem over the xyz coordinates of an object. After inferring the xyz coordinates, I want to convert them into a voxel representation. For that, I want to use the regressed values as indices into a matrix. To sum up, I want a differentiable way to convert coordinates into voxels.

It depends on what you do afterwards; maybe you can do it in a smooth way, but as stated it doesn’t seem to be differentiable.

For example, if the output of your neural net is (0.5, 1.2, 1.6) and you convert this to (0, 1, 2), then this function is clearly not differentiable.
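You can check this directly: rounding is flat almost everywhere, so its gradient is zero wherever it is defined, and no learning signal flows back. A quick illustration (written against a recent PyTorch):

import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([0.5, 1.2, 1.6]), requires_grad=True)
y = x.round().sum()  # the coordinate -> integer conversion from above
y.backward()
print(x.grad)        # all zeros: nothing useful reaches the network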

Everything depends on what you do with the matrix object afterwards.
If you want to check that M(i, j, k) has the correct value, for example, and you know that your matrix is smooth / almost continuous (the values are very similar within a local area), you can approximate a gradient by writing a new autograd function that takes your three neural-net outputs (x, y, z) and returns the matrix value (the forward pass).
You then have to write the backward function as well, which would basically return the local change of value of M in each direction (x, y, z).

Here’s the doc to write an autograd function:
http://pytorch.org/docs/master/notes/extending.html#extending-torch-autograd
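To make the idea concrete, here is a rough sketch of such a function using the Function API from those docs (written against a recent PyTorch). The name MatrixLookup, the central-difference backward, and the boundary clamping are all illustrative choices, not the only way to do it:

import torch
from torch.autograd import Function

class MatrixLookup(Function):
    # Forward: look M up at the rounded (x, y, z) coordinates.
    # Backward: approximate dM/dx, dM/dy, dM/dz with central finite
    # differences, which only makes sense if M is smooth / almost
    # continuous, as described above.

    @staticmethod
    def forward(ctx, coords, M):
        i, j, k = (int(round(float(c))) for c in coords)
        # keep one cell of margin so the +/-1 neighbours in backward exist
        i = max(1, min(i, M.size(0) - 2))
        j = max(1, min(j, M.size(1) - 2))
        k = max(1, min(k, M.size(2) - 2))
        ctx.idx = (i, j, k)
        ctx.save_for_backward(M)
        return M[i, j, k].view(1)

    @staticmethod
    def backward(ctx, grad_output):
        M, = ctx.saved_tensors
        i, j, k = ctx.idx
        # local change of M in each direction, used as a gradient estimate
        grad_coords = torch.stack([
            (M[i + 1, j, k] - M[i - 1, j, k]) / 2,
            (M[i, j + 1, k] - M[i, j - 1, k]) / 2,
            (M[i, j, k + 1] - M[i, j, k - 1]) / 2,
        ])
        return grad_output * grad_coords, None  # no gradient w.r.t. M

# Usage sketch (network, input, M, ground_truth are placeholders):
# value = MatrixLookup.apply(network(input), M)
# loss = (ground_truth - value).abs().sum()
# loss.backward()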
