I’m trying to run a Hugging Face model on multiple GPUs. When I process several inputs that depend on each other through the same module (shared weights), I get a RuntimeError: Expected to mark a variable ready only once. If I call the module only once, to process a single input, I don’t get this error.
To make it clearer, here is the structure:
    import torch.nn as nn

    class Model(nn.Module):
        def __init__(self, ...):
            super().__init__()
            self.encoder = ...

        def forward(self, input_ids, ...):
            # first pass over the full input
            encoder_outputs = self.encoder(input_ids, ...)
            # filter encoder_outputs and construct another tensor called 'input_ids_selected'
            # second pass through the same encoder (shared weights)
            encoder_outputs = self.encoder(input_ids_selected, ...)
            return encoder_outputs
If I remove the line encoder_outputs = self.encoder(input_ids_selected, ...), I don’t run into this error. I should add that, to filter encoder_outputs from the first encoder pass, I use other modules (linear layers) to find the important input_ids and keep those in input_ids_selected. You can think of this as a two-step summarizer.
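To make the structure concrete, here is a minimal, self-contained toy version of it that runs on CPU without any multi-GPU wrapper. It is only a sketch: the class name TwoPassModel, the tiny embedding-plus-linear encoder, the scorer layer, the top-k selection, and all sizes are hypothetical stand-ins, not the real Hugging Face model or my actual filtering code.

```python
import torch
import torch.nn as nn


class TwoPassModel(nn.Module):
    """Toy two-step structure; every name and size here is hypothetical."""

    def __init__(self, vocab_size=100, hidden=16, keep=4):
        super().__init__()
        # Stand-in for the real encoder: embedding + linear + tanh.
        self.encoder = nn.Sequential(
            nn.Embedding(vocab_size, hidden),
            nn.Linear(hidden, hidden),
            nn.Tanh(),
        )
        # Stand-in for the filtering modules: one linear layer scoring each token.
        self.scorer = nn.Linear(hidden, 1)
        self.keep = keep

    def forward(self, input_ids):
        # First pass over the full input: (batch, seq, hidden).
        first = self.encoder(input_ids)
        # One importance score per token: (batch, seq).
        scores = self.scorer(first).squeeze(-1)
        # Positions of the top-scoring tokens, restored to original order.
        top = scores.topk(self.keep, dim=1).indices.sort(dim=1).values
        # Keep only the selected token ids: (batch, keep).
        input_ids_selected = input_ids.gather(1, top)
        # Second pass through the same encoder, i.e. the same weights.
        return self.encoder(input_ids_selected)


torch.manual_seed(0)
model = TwoPassModel()
input_ids = torch.randint(0, 100, (2, 10))  # batch of 2, sequence length 10
out = model(input_ids)
out.sum().backward()
```

In this sketch the hard top-k selection is not differentiable, so only the second encoder pass contributes gradients and the scorer receives none; the real model may behave differently depending on how the selection is done.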