Here is an abstract of the problem:
Assume we have two models, Encoder and Decoder, where Encoder looks like:
```python
class Encoder(nn.Module):
    def __init__(self, n, dim):
        super().__init__()
        self.embedding_A = nn.Embedding(n, dim)  # create an embedding matrix with shape (n, dim)

    def forward(self):
        out_embeddings = ...  # do something with self.embedding_A, producing a new matrix of the same shape
        return out_embeddings
```
while Decoder has the following form:
```python
class Decoder(nn.Module):
    def __init__(self, n, dim):
        super().__init__()
        self.embedding_B = nn.Embedding(n, dim)  # same shape as Encoder's output

    def forward(self):
        ...  # do something with self.embedding_B and return a loss for backward
```
Now I want Decoder's self.embedding_B to take the output of Encoder (i.e., out_embeddings), so that optimization runs end-to-end from Decoder's loss back to Encoder's self.embedding_A.
I know an easy way would be to change the code and pass Encoder's result directly into Decoder.forward(). However, my Decoder model is too complex for such a change to be practical.
Is there a way to load Encoder's output (i.e., out_embeddings) into Decoder's self.embedding_B and chain the two models together? The backward pass should start at Decoder's output, reach self.embedding_B, and then continue back to optimize Encoder's self.embedding_A.
I have tried code like the following:
```python
# *encoder* is an instance of class *Encoder*
# *decoder* is an instance of class *Decoder*
out_embeddings = encoder(...)
decoder.embedding_B.weight.data = out_embeddings.data  # my 1st try
decoder.embedding_B = nn.Parameter(out_embeddings)     # my 2nd try
```
but neither lets the backward pass reach encoder.embedding_A. Is there any way to do this with such direct assignment?
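For concreteness, here is a minimal, self-contained repro of what I mean (the forward logic in both classes is just a placeholder for my real models). It shows that after the `.data` copy, the gradient stops at `embedding_B` and never reaches `embedding_A`:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, n, dim):
        super().__init__()
        self.embedding_A = nn.Embedding(n, dim)

    def forward(self):
        # placeholder transform; the real model does something more complex
        return self.embedding_A.weight * 2.0

class Decoder(nn.Module):
    def __init__(self, n, dim):
        super().__init__()
        self.embedding_B = nn.Embedding(n, dim)

    def forward(self):
        # placeholder loss computed from the embedding table
        return self.embedding_B.weight.sum()

encoder, decoder = Encoder(4, 3), Decoder(4, 3)
out_embeddings = encoder()

# my 1st try: copying raw data bypasses autograd entirely
decoder.embedding_B.weight.data = out_embeddings.data

loss = decoder()
loss.backward()

print(encoder.embedding_A.weight.grad)   # None: the graph was cut at the copy
print(decoder.embedding_B.weight.grad)   # populated: gradient stops here
```

The second try (`nn.Parameter(out_embeddings)`) behaves the same way, since constructing a new Parameter also detaches the tensor from the graph that produced it.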