I am new to PyTorch and I'm attempting to train a transformer model on biological sequences for binary classification. My training loop is functional in the sense that it iterates without errors, but the loss never decreases and validation accuracy stays constant at 50%. Inspecting the tensor output from a single forward pass shows that it is a tensor of constant values. I suspect something is wrong with my forward method (perhaps my max pooling?).
My input is of shape (n, 100, 4), where n is the number of samples, 100 is the sequence length, and 4 is the one-hot encoding of DNA.
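For context, this is roughly how I build inputs of that shape (the `encode` function and the base-to-index mapping here are an illustrative sketch, not my exact preprocessing pipeline):

```python
import torch

# Illustrative one-hot encoding: A/C/G/T -> channel indices 0..3
BASES = {"A": 0, "C": 1, "G": 2, "T": 3}

def encode(seq: str) -> torch.Tensor:
    """One-hot encode a DNA string into a (len, 4) float tensor."""
    idx = torch.tensor([BASES[b] for b in seq])
    return torch.nn.functional.one_hot(idx, num_classes=4).float()

# Two length-100 sequences stacked into a batch
batch = torch.stack([encode("ACGT" * 25), encode("TTAA" * 25)])
print(batch.shape)  # torch.Size([2, 100, 4])
```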
Here is my model definition:
```python
class NeuralNetwork(nn.Module):
    '''Build a neural network transformer for one-hot encoded DNA sequences'''

    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.transformer1 = nn.TransformerEncoderLayer(
            d_model=4, nhead=2, batch_first=True, dim_feedforward=1024
        )
        self.transformer2 = nn.TransformerEncoderLayer(
            d_model=4, nhead=2, batch_first=True, dim_feedforward=1024
        )
        self.transformer3 = nn.TransformerEncoderLayer(
            d_model=4, nhead=2, batch_first=True, dim_feedforward=1024
        )
        self.linear = nn.Linear(in_features=4, out_features=1)

    def forward(self, x):
        x = self.transformer1(x)
        x = self.transformer2(x)
        x = self.transformer3(x)
        x = torch.max(x, dim=1)
        logits = self.linear(x)
        return torch.flatten(logits)
```
I am using `BCEWithLogitsLoss` and the Adam optimizer. If I print the value of `output = model(input)`, I get a constant tensor.
Let me know if I should include more details; I'm trying not to clutter the post. I posted this in NLP since I assumed you are the resident experts on transformers. Please correct me if I should move it.