Hi there. I am using a combined network made of the pre-trained ESM transformer by FAIR and my own classifier. I should mention that I replaced the first layer of the transformer with an identity layer, because I want to compute the embeddings with the transformer's embedding layer separately, so that I can manipulate them later.
I want to compute the gradient with respect to the input; however, when I use torch autograd I get a wrong one. I checked it by manually computing the gradient for one input position with the central difference (f(x+e) - f(x-e)) / (2e), where f is my combined network, x is a specific position of the input, and e is a small increment (a sketch of this check is included after the code below). I am not sure what exactly is wrong with my automatic differentiation.
Here is my code:
import torch
import torch.nn as nn

# transformer (the pre-trained ESM model), my_classifier and input_tokens
# are already defined at this point

# ======== split the transformer model in two, getting access to the embedding layer ========
splitted_model = []
for name, module in transformer.named_children():
    splitted_model.append(module)

# take the first module, which is the embedding layer
embedding_layer = splitted_model[0]

# replace the embedding layer with an Identity layer
transformer.embed_tokens = nn.Identity()
# set token dropout to False
transformer.args.token_dropout = False

# this will be the input to my combined network
input_embedding = embedding_layer(input_tokens)

# ======== combine the transformer and my classifier ========
REPR_LAYER = 33  # representation layer to extract (33 is the final layer of esm1b_t33_650M)

class FullModel(nn.Module):
    def __init__(self, transformer, classifier_nn):
        super().__init__()
        self.transformer = transformer
        self.classifier_nn = classifier_nn

    def forward(self, x):
        # representation matrix of shape (L, E); L - length of the input sequence,
        # E - length of the feature vector. Token 0 is a start-of-sequence token,
        # so the first symbol of the input is token 1
        out = self.transformer(x, repr_layers=[REPR_LAYER])
        x1 = out["representations"][REPR_LAYER][0, 1 : len(x) + 1]
        # average over L to get a representation vector of length E
        x2 = torch.mean(x1, dim=0)
        # use it for the classification
        x3 = self.classifier_nn(x2)
        return x3

final_model = FullModel(transformer, my_classifier)

# ======== compute the gradient ========
# define for which output class I want the gradient
external = torch.tensor([1.0, 0.0, 0.0])  # 1st out of 3 possible classes
x = input_embedding.detach().clone().requires_grad_(True)
pred = final_model(x)
pred.backward(gradient=external)  # gradient of dot(external, pred), i.e. of pred[0]
input_gradient = x.grad
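For completeness, this is roughly how I ran the manual check mentioned above (a sketch only: the indices i and j, the value of eps, and the assumption that input_embedding has shape (L, E) are illustrative; with external = [1, 0, 0], x.grad should hold the gradient of pred[0], which is exactly what the central difference below estimates):

# sketch of the central-difference check; i, j and eps are illustrative values,
# and input_embedding is assumed here to have shape (L, E)
i, j = 5, 10
eps = 1e-3

with torch.no_grad():
    x_plus = input_embedding.clone()
    x_minus = input_embedding.clone()
    x_plus[i, j] += eps
    x_minus[i, j] -= eps
    # compare the same output component that external = [1, 0, 0] selects
    numeric_grad = (final_model(x_plus)[0] - final_model(x_minus)[0]) / (2 * eps)

print(numeric_grad.item(), input_gradient[i, j].item())  # these two do not match for me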