Hi there. I am using a combined network made of the pre-trained ESM transformer by FAIR and my own classifier. I should mention that I replaced the first layer of the transformer with an identity layer, because I want to compute the embeddings with the transformer's embedding layer separately, so that I can manipulate them later.
I want to compute the gradient with respect to the input; however, when I use torch autograd I get a wrong one. I checked it by manually computing the gradient for one input position with the central difference (f(x+e) - f(x-e)) / (2e), where f is my combined network, x is a specific position of the input, and e is a small increment (a sketch of this check is included after the code below). I am not sure what exactly is wrong with my automatic differentiation.
Here is my code:
import torch
import torch.nn as nn

# transformer (the pre-trained ESM model), my_classifier and input_tokens
# are already defined at this point

# ======== split the transformer model in two, getting access to the embedding layer ========
splitted_model = []
for name, module in transformer.named_children():
    splitted_model.append(module)

# take the first module, which is the embedding layer
embedding_layer = splitted_model[0]

# replace the embedding layer with an Identity layer
transformer.embed_tokens = nn.Identity()
# set token dropout to False
transformer.args.token_dropout = False

# this will be the input to my combined network
input_embedding = embedding_layer(input_tokens)

# ======== combine the transformer and my classifier ========
REPR_LAYER = 33  # representation layer to extract (33 is the final layer of esm1b_t33_650M)

class FullModel(nn.Module):
    def __init__(self, transformer, classifier_nn):
        super().__init__()
        self.transformer = transformer
        self.classifier_nn = classifier_nn

    def forward(self, x):
        # representation matrix of shape (L, E); L - length of the input sequence,
        # E - length of the feature vector. Token 0 is a start-of-sequence token,
        # so the first symbol of the input is token 1
        out = self.transformer(x, repr_layers=[REPR_LAYER])
        x1 = out["representations"][REPR_LAYER][0, 1 : len(x) + 1]
        # average over L to get a representation vector of length E
        x2 = torch.mean(x1, dim=0)
        # use it for the classification
        x3 = self.classifier_nn(x2)
        return x3

final_model = FullModel(transformer, my_classifier)

# ======== compute the gradient ========
# define for which output class I want the gradient
external = torch.tensor([1.0, 0.0, 0.0])  # 1st out of 3 possible classes
x = input_embedding.detach().clone().requires_grad_(True)
pred = final_model(x)
pred.backward(gradient=external)  # gradient of dot(external, pred), i.e. of pred[0]
input_gradient = x.grad
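For completeness, this is roughly how I ran the manual check mentioned above (a sketch only: the indices i and j, the value of eps, and the assumption that input_embedding has shape (L, E) are illustrative; with external = [1, 0, 0], x.grad should hold the gradient of pred[0], which is exactly what the central difference below estimates):

# sketch of the central-difference check; i, j and eps are illustrative values,
# and input_embedding is assumed here to have shape (L, E)
i, j = 5, 10
eps = 1e-3

with torch.no_grad():
    x_plus = input_embedding.clone()
    x_minus = input_embedding.clone()
    x_plus[i, j] += eps
    x_minus[i, j] -= eps
    # compare the same output component that external = [1, 0, 0] selects
    numeric_grad = (final_model(x_plus)[0] - final_model(x_minus)[0]) / (2 * eps)

print(numeric_grad.item(), input_gradient[i, j].item())  # these two do not match for me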