I have a weird issue when appending tensors to a list.
I have code similar to this:
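tensor_list = []
for i in range(24):
    current_tensor = function(...)
    # print(...)  # the print inside the loop that I comment/uncomment
    tensor_list.append(current_tensor)
print(tensor_list)  # the final print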
When I uncomment the print inside the loop, the values in the final print are correct. But when I comment it out, I get wrong values in the final print. Any idea what could cause this?
Could you post the definition of function so that we can have a look?
Are you seeing this issue using the JIT or plain PyTorch code?
It’s plain PyTorch.
Here is the class whose forward pass is called in the loop. The output of each iteration is the input to the next call:
def __init__(self, config):
    self.output_attentions = config.output_attentions
    self.output_hidden_states = config.output_hidden_states
    #self.layer = nn.ModuleList([BertLayer(config) for _ in range(config.num_hidden_layers)])
    self.layer = nn.ModuleList([BertLayer(config) for _ in range(1)])

def forward(self, hidden_states, attention_mask, head_mask=None, is_last_layer=False,
    if is_last_layer == True:
        self.output_hidden_states = True
    all_hidden_states = ()
    all_attentions = ()
    for i, layer_module in enumerate(self.layer):
        all_hidden_states = all_hidden_states + (hidden_states,)
        layer_outputs = layer_module(hidden_states, attention_mask, head_mask[i],
        hidden_states = layer_outputs
Could you post the missing definitions of the code or link to the repository?
This is the original script for the code:
The change I made is in the “BertModel” class, where I added a loop around the encoder call (line 731).
The issue is that when I print something in the loop, it changes the list values.
I think I solved that. I was using cuda_stream for a tensor transfer before the function call. When I removed it, the values are correct. Maybe a bug in cuda_stream?
I’m glad you solved the issue.
If you are manually using CUDA streams, you have to take care of the synchronization yourself, as described here. If you haven’t done so, that might explain this issue.
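For illustration, here is a minimal sketch of what that synchronization could look like, assuming the transfer is a host-to-device copy issued on a side stream. The names transfer_then_compute and copy_stream, and the "* 2" stand-in for the real computation, are hypothetical and not taken from the original script. The idea is that the stream doing the computation must wait for the copy stream before it reads the tensor.

```python
import torch


def transfer_then_compute(cpu_batch: torch.Tensor) -> torch.Tensor:
    """Copy a batch to the GPU on a side stream, then use it on the current stream."""
    if not torch.cuda.is_available():
        # No GPU: there are no streams to synchronize, so just compute on CPU.
        return cpu_batch * 2

    copy_stream = torch.cuda.Stream()  # hypothetical side stream for the transfer
    # Pinned host memory is what allows the copy to run asynchronously.
    pinned = cpu_batch.pin_memory()
    with torch.cuda.stream(copy_stream):
        gpu_batch = pinned.to("cuda", non_blocking=True)

    compute_stream = torch.cuda.current_stream()
    # Make the compute stream wait until the copy has finished.
    compute_stream.wait_stream(copy_stream)
    # Tell the caching allocator that gpu_batch is also used on compute_stream,
    # so its memory is not handed out again too early.
    gpu_batch.record_stream(compute_stream)

    return gpu_batch * 2  # stand-in for the real function call


outputs = []
for i in range(3):
    batch = torch.full((4,), float(i))
    outputs.append(transfer_then_compute(batch).cpu())
```

Without the wait_stream call, the multiplication could start before the copy is complete and read stale values. Printing a CUDA tensor copies it to the host and blocks until the work queued on the current stream is done, which changes the timing and may be why the print appeared to fix the values.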