How to implement a single-feature Encoder with Adaptive Pooling Layer for flexible finetuning?

Hi there,

I am new to PyTorch and have very little experience with Python classes. Everything has worked out so far, but I don't understand the following problem:

In order to allow my feed-forward network to process data with a variable number of input features, I was trying to implement an encoder that

  1. processes each feature of a tensor of shape [730, 10] separately, i.e. encodes each [730, 1] slice with hidden size 64 to [730, 64],
  2. concatenates all of them to [730, 64, 10],
  3. applies the activation, and
  4. applies adaptive pooling with an arbitrary output size to this tensor, because I want the number of input features to be variable (rough shape sketch below).
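
Roughly, this is the shape flow I have in mind (just an illustration with made-up dimensions; the actual class is below):

import torch
import torch.nn as nn

x = torch.randn(730, 10)          # [samples, input_features]; the feature count may vary
encoder = nn.Linear(1, 64)        # shared per-feature encoder with hidden size 64
one_feature = encoder(x[:, 0:1])  # [730, 1] -> [730, 64]
# after encoding all 10 features and concatenating along a new last dim: [730, 64, 10]
# nn.AdaptiveAvgPool1d(64) then maps [730, 64, 10] -> [730, 64, 64], independent of the feature count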

While the network seems to work at first glance, the loss it outputs is:

tensor(nan, grad_fn=)

Am I missing something in the class implementation that prevents proper backpropagation?
I would really appreciate any help on this!

The class I created looks like this:

import torch
import torch.nn as nn


class MLPmod(nn.Module):

    def __init__(self, hidden_features, dimensions, activation):
        super(MLPmod, self).__init__()
        self.hidden_features = hidden_features
        self.activation = activation()

        # shared per-feature encoder: maps each [batch, 1] feature slice to [batch, hidden_features]
        self.encoder = nn.Linear(1, hidden_features)
        self.avgpool = nn.AdaptiveAvgPool1d(hidden_features)
        self.classifier = self.mlp(dimensions, activation)

    def forward(self, x):
        x = self.encode(x)
        x = self.avgpool(x).view(x.shape[0], -1)
        x = self.classifier(x)
        return x

    def mlp(self, dimensions, activation):
        # classifier head: input is the flattened [hidden_features, hidden_features] pooled encoding
        network = nn.Sequential()
        network.add_module("hidden0", nn.Linear(self.hidden_features * self.hidden_features, dimensions[0]))
        network.add_module("activation0", activation())

        for i in range(len(dimensions) - 1):
            network.add_module(f"hidden{i+1}", nn.Linear(dimensions[i], dimensions[i+1]))
            if i < len(dimensions) - 2:
                network.add_module(f"activation{i+1}", activation())

        return network

    def encode(self, x):
        # [batch, num_features] -> [batch, 1, num_features] so each feature can be sliced as [batch, 1]
        x = x.unsqueeze(1)

        latent = torch.empty(x.shape[0], self.hidden_features, 1)

        # encode each feature separately and concatenate along the last dimension
        for feature in range(x.shape[-1]):
            latent = torch.cat((latent, self.encoder(x[:,:,feature]).unsqueeze(2)), dim=2)

        latent = self.activation(latent)
        return latent
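
For reference, I instantiate and call the model roughly like this (the exact dimensions and activation are just an example):

model = MLPmod(hidden_features=64, dimensions=[128, 1], activation=nn.ReLU)
output = model(torch.randn(730, 10))   # -> [730, 1]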

In these lines of code:

latent = torch.empty(x.shape[0], self.hidden_features, 1)
    
for feature in range(x.shape[-1]):
    latent = torch.cat((latent, self.encoder(x[:,:,feature]).unsqueeze(2)),dim=2)

you are concatenating the encoder output with an empty and thus uninitialized tensor. Since torch.empty does not initialize its values, it may contain invalid values such as Infs/NaNs, which then propagate through the network and the loss.
I would recommend appending the encoder outputs to a list and using torch.stack afterwards.
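
Untested, but your encode method could then look something like this:

def encode(self, x):
    x = x.unsqueeze(1)  # [batch, 1, num_features]

    # collect the per-feature encodings in a list; no uninitialized tensor enters the graph
    outputs = [self.encoder(x[:,:,feature]) for feature in range(x.shape[-1])]  # each [batch, hidden_features]

    latent = torch.stack(outputs, dim=2)  # [batch, hidden_features, num_features]
    latent = self.activation(latent)
    return latent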

Thanks a lot! That fixed the issue.