Currently, the checkpoint function only allows a single Variable as output. But some architectures have more than one output, such as the encoder of a standard FCN, which might return several outputs from the main stream and the skip connections, or architectures that have both text and label outputs.
For now, my workaround is to concatenate the tensors inside the function and split them again outside checkpoint, which is not very convenient.
So is there a neater way to do that? Or will list outputs be supported in the future?
Thank you so much, I really appreciate your work!
I’m not sure I understand. I thought checkpoint takes an arbitrary number of input Tensors and returns an arbitrary number of output Tensors. Doesn’t that work? Are you returning a list containing Tensors, or multiple Tensors, from the Python function?
It actually didn’t work. It can certainly take many inputs via *args, but a list output causes an error. For example, my return statement looked like this:
return [tensor1, tensor2, tensor3, tensor4, tensor4]
and the error was:
TypeError: CheckpointFunctionBackward.forward: expected Variable (got list) for return value 0
I hope I am using it right.
Yes, I don’t think it supports a list as output. Can you change your code to the following (without the list)?
return tensor1, tensor2, tensor3, tensor4, tensor4
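To make the suggestion concrete, here is a minimal sketch of a checkpointed function that returns several tensors as a tuple rather than a list. The `encoder` function is a hypothetical stand-in for a real encoder with a main output and a skip output, and the `use_reentrant=False` keyword assumes a recent PyTorch release (the error message quoted above comes from an older version that did not have this argument).

```python
import torch
from torch.utils.checkpoint import checkpoint


def encoder(x):
    # Hypothetical stand-in for an encoder with two outputs.
    main = torch.relu(x * 2)
    skip = x + 1
    # Return the tensors as a tuple (no square brackets), not a list.
    return main, skip


x = torch.randn(4, 3, requires_grad=True)

# Assumption: a recent PyTorch where checkpoint accepts use_reentrant.
main, skip = checkpoint(encoder, x, use_reentrant=False)

# Gradients flow back through both outputs.
(main.sum() + skip.sum()).backward()
```

Unpacking on the caller's side replaces the concatenate-and-split workaround, since each output keeps its own shape.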
What can one do if a custom layer (implemented by deriving from autograd.Function) has a forward function that has to return a set of tensors, where the number of tensors depends on an input parameter? A list of tensors would help, but it looks like a list as output is not supported. I get the following error: “TypeError: ripsLayerBackward.forward: expected Variable (got list) for return value 0”, where ripsLayer is the name of my layer.
You can return as many Tensors as you want (the number can even differ from one forward call to the next). You just need to make sure that your backward handles this variable number of grad outputs.
Thank you for your reply. Since I need to return a variable number of tensors, could I create a tuple and then return that tuple?
The output of your function should be a bunch of Tensors. Tuples are a special construct in Python in the sense that when you write "return a, b" you actually return a tuple containing a and b.
So I guess it should work, yes.
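As a rough illustration of the tuple approach, the sketch below defines a custom Function whose number of outputs depends on an input parameter. `SplitScale` and its scaling rule are made up for this example and are not the ripsLayer from the question. The backward takes `*grad_outputs` so it can handle however many tensors the forward returned.

```python
import torch


class SplitScale(torch.autograd.Function):
    """Hypothetical Function returning n tensors, where n is an argument."""

    @staticmethod
    def forward(ctx, x, n):
        # Output k is x * (k + 1). Build a tuple, not a list.
        return tuple(x * (k + 1) for k in range(n))

    @staticmethod
    def backward(ctx, *grad_outputs):
        # One incoming gradient per forward output, in the same order.
        grad_x = sum(g * (k + 1) for k, g in enumerate(grad_outputs))
        # The non-tensor argument n gets no gradient.
        return grad_x, None


x = torch.ones(3, requires_grad=True)
outs = SplitScale.apply(x, 4)  # a tuple of 4 tensors
total = sum(o.sum() for o in outs)
total.backward()
```

The backward returns one value per forward argument, which is why `None` is returned for `n`.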
Thanks for your clarification. In case there is a more elegant way to handle a variable number of tensors, please do share it.
I don’t think there is, I’m afraid.
No problem. It is understandable.