An autoencoder with multiple inputs

javad_t · December 11, 2021, 4:34pm

Hi everybody,
I’m new to PyTorch and trying to implement a multimodal deep autoencoder (i.e., an autoencoder with multiple inputs).
First, all inputs are encoded with the same encoder architecture. After that, the encoder outputs are concatenated and passed through further encoding and decoding layers:
[Image: SharedScreenshot]

At the end, last decoder layer must reconstruct the inputs as multiple outputs.

Now I have between one and 9 inputs depending on the user’s choice and each input is a 1215x1519 matrix.

I’m really stuck on the first and last layers of this autoencoder.

Can anyone help me in this case?
Thanks.

@ptrblck

ptrblck · December 13, 2021, 4:40am

You could implement the posted model architecture using nn.ModuleLists as seen here:

import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # one small encoder per input
        self.encoders = nn.ModuleList()
        for _ in range(9):
            self.encoders.append(nn.Linear(4, 3))

        # shared encoder/decoder working on the concatenated features
        self.encoder = nn.Sequential(
            nn.Linear(9*3, 4),
            nn.ReLU(),
            nn.Linear(4, 3)
        )
        self.decoder = nn.Sequential(
            nn.Linear(3, 4),
            nn.ReLU(),
            nn.Linear(4, 9*3)
        )
        # one small decoder per output
        self.decoders = nn.ModuleList()
        for _ in range(9):
            self.decoders.append(nn.Linear(3, 4))

    def forward(self, inputs):
        # encode each input with its own encoder
        out = []
        for idx, enc in enumerate(self.encoders):
            out.append(enc(inputs[idx]))
        # concatenate along the feature dimension: [batch_size, 9*3]
        out = torch.cat(out, dim=1)
        z = self.encoder(out)
        out = self.decoder(z)
        # split back into 9 chunks of size 3 along the feature dimension
        out = torch.split(out, 3, dim=1)
        outs = []
        for idx, dec in enumerate(self.decoders):
            outs.append(dec(out[idx]))
        return outs

model = MyModel()
inputs = [torch.randn(1, 4) for _ in range(9)]
outs = model(inputs)

I don’t know which activation functions should be used etc. so you could use this code snippet as a base implementation and adapt it to your use case.
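
Since the question mentions between one and 9 inputs, the number of branches could also be made a constructor argument. The following is only a minimal sketch under that assumption: `MultiInputAE`, `num_inputs`, `in_features`, and `hidden` are hypothetical names, the layer sizes are placeholders, and each input is assumed to be already flattened to a feature vector.

```python
import torch
import torch.nn as nn


class MultiInputAE(nn.Module):
    # num_inputs, in_features and hidden are illustrative names, not from the thread
    def __init__(self, num_inputs, in_features=4, hidden=3):
        super().__init__()
        self.hidden = hidden
        self.encoders = nn.ModuleList(
            [nn.Linear(in_features, hidden) for _ in range(num_inputs)]
        )
        # the width of the shared layers depends on the number of inputs
        self.encoder = nn.Sequential(
            nn.Linear(num_inputs * hidden, 4), nn.ReLU(), nn.Linear(4, 3)
        )
        self.decoder = nn.Sequential(
            nn.Linear(3, 4), nn.ReLU(), nn.Linear(4, num_inputs * hidden)
        )
        self.decoders = nn.ModuleList(
            [nn.Linear(hidden, in_features) for _ in range(num_inputs)]
        )

    def forward(self, inputs):
        feats = [enc(x) for enc, x in zip(self.encoders, inputs)]
        z = self.encoder(torch.cat(feats, dim=1))
        chunks = torch.split(self.decoder(z), self.hidden, dim=1)
        return [dec(c) for dec, c in zip(self.decoders, chunks)]


model = MultiInputAE(num_inputs=5)
inputs = [torch.randn(2, 4) for _ in range(5)]
outs = model(inputs)
print(len(outs), outs[0].shape)
```

Because the shared layers are sized from `num_inputs`, a model built this way is tied to one input count; a different count needs a new model instance.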

anantguptadbl · December 13, 2021, 5:40am

Do you mean that you want to use the same model whether there are 5 inputs (each input being 1215x1519), 1 input, or 9 inputs?

javad_t · December 13, 2021, 12:30pm

Thank you for your help, it was very helpful

I have another problem:
Each input is a computed embedding of a graph and, as I said before, each input is a 1215x1519 matrix.
Since we do not have any labels for our data, the original 9x1215x1519 data (9 is the number of inputs) is used as the label, and a noisy version of the original data with the same shape is used as the model input. In this way we’re trying to reconstruct the input according to the labels.

In another implementation of this case with TensorFlow and Keras, the developer fits the model with the Keras fit() function:

# For fitting the model

history = model.fit(X_train_noisy, X_train, epochs=epochs, batch_size=batch_size, shuffle=True,
                    validation_data=(X_test_noisy, X_test),
                    callbacks=[EarlyStopping(monitor='val_loss', min_delta=0.0001, patience=5)])

Both X_train_noisy and X_train are 9x1215x1519.

Now I’m really confused about how to do this with pytorch!
Thank you again

@anantguptadbl

ptrblck · December 13, 2021, 11:19pm

I don’t know what exactly the Keras model is doing, as the fit method doesn’t show any information about the loss function etc.
You could thus check its internal implementation and use the same approach in PyTorch. E.g. if it’s some form of mse loss, use nn.MSELoss in PyTorch to calculate the loss.
I also don’t know what the shape represents in Keras (Is dim0 the batch size? If so, the input shape looks wrong, but I’m also not deeply familiar with Keras) so you should check how each dimension is used inside the model.
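
If the Keras model was compiled with a mean-squared-error loss (an assumption, since the fit call does not show it), a denoising training loop in PyTorch could look roughly like the sketch below. The tiny stand-in model, the tensor shapes, the noise level, and the optimizer settings are all placeholders chosen so the snippet runs quickly; early stopping and validation are left out.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder data: 32 samples with 8 features each, plus a noisy copy.
# The clean data is the target, the noisy data is the model input.
x_clean = torch.randn(32, 8)
x_noisy = x_clean + 0.1 * torch.randn_like(x_clean)

# Stand-in autoencoder; the multi-input model could be used instead.
model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 8))
criterion = nn.MSELoss()  # assumed loss, the Keras snippet does not name one
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

losses = []
for epoch in range(20):
    optimizer.zero_grad()
    recon = model(x_noisy)            # reconstruct from the noisy input
    loss = criterion(recon, x_clean)  # compare against the clean target
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```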

javad_t · December 18, 2021, 5:25pm

Thank you again for your help.

Padmaksha_Roy · July 1, 2023, 5:49pm

Hi @ptrblck, in my case I have a single encoder (for all domains/classes) and multiple decoders. Is there no way to implement one encoder for multiple inputs, or do I have to do it as in this case: one encoder for each domain, concatenate/stack the outputs, and pass them through the decoder, then separate the decoder outputs and minimize the reconstruction error? My forward function will look something like this:

Kindly give some insights here. Thanks!

Padmaksha_Roy · July 1, 2023, 5:53pm

[Image: Multi_AE]

def forward(self, inputs):
    out = []
    for idx, enc in enumerate(self.encoders):
        out.append(enc(inputs[idx]))

    z = torch.cat(out, dim=1)
    #z = self.encoder(out)
    #out = self.decoder(z)
    z = torch.split(z, 3, dim=1)
    outs = []
    for idx, dec in enumerate(self.decoders):
        outs.append(dec(out[idx]))
    return outs

ptrblck · July 1, 2023, 8:05pm

Your approach looks alright and I’m not sure why it wouldn’t be possible.
Do you see any errors or are you stuck at one point?

Padmaksha_Roy · July 1, 2023, 9:08pm

hi @ptrblck , Thank you for the reply.

One more question: does what I described above implement the same model as the one I pasted above? My only doubt is that in that model there is a single encoder and multiple domain-specific decoders, whereas the model in this post has multiple encoders to collect inputs.

ptrblck · July 1, 2023, 11:04pm

I honestly don’t know which approach would work the best for your use case (i.e. using multiple or a single shared encoder). Technically, i.e. from the point of view of your PyTorch code, the code looks correct and should work.

Padmaksha_Roy · July 1, 2023, 11:10pm

Can you please tell me how to change the code to use a single shared encoder? Exactly which lines do I need to change? Thank you!

ptrblck · July 1, 2023, 11:15pm

Your code already indicates the usage of a single encoder.
In the __init__ method you would initialize a single encoder via:

self.encoder = Encoder() # or whatever module you are using

and in the forward you would then iterate the outputs:

out = []
for x_ in x: # assuming x is a list of tensors
    out.append(self.encoder(x_))

and could use out afterwards.
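
Put together, a single shared encoder with one decoder per domain might look like the following sketch. The class name `SharedEncoderAE`, the `nn.Linear` layers, and all sizes are hypothetical stand-ins, not the actual modules from the posted model.

```python
import torch
import torch.nn as nn


class SharedEncoderAE(nn.Module):
    # Hypothetical sizes; one encoder is reused for every input,
    # while each input gets its own decoder.
    def __init__(self, num_domains=3, in_features=4, latent=3):
        super().__init__()
        self.encoder = nn.Linear(in_features, latent)
        self.decoders = nn.ModuleList(
            [nn.Linear(latent, in_features) for _ in range(num_domains)]
        )

    def forward(self, x):
        # x is assumed to be a list of tensors, one per domain
        codes = [self.encoder(x_) for x_ in x]
        return [dec(z) for dec, z in zip(self.decoders, codes)]


model = SharedEncoderAE()
x = [torch.randn(2, 4) for _ in range(3)]
outs = model(x)
print(len(outs), outs[0].shape)
```

Since the same `self.encoder` is called for every input, its weights receive gradients from all domains, while each decoder is only updated through its own output.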

Padmaksha_Roy · July 2, 2023, 1:10pm

Hi @ptrblck, there is a small issue here. The code that you provided returns a list of tensors with grads. Now, when I try to calculate the MSE loss, it gives the following error. What should the return type be and how do I convert it? The to_tensor method doesn’t work here.

criterion = nn.MSELoss()
loss = criterion(outs,inputs)

AttributeError: 'list' object has no attribute 'size'

Kindly suggest!

ptrblck · July 2, 2023, 3:07pm

You could torch.cat or torch.stack the list to a single tensor. Make sure it has the same shape as the target.
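
As a small illustration, the lists could be stacked before calling the criterion. The random tensors below are stand-ins for the real model outputs and targets, with shapes borrowed from the earlier example (9 tensors of shape [1, 4]).

```python
import torch
import torch.nn as nn

# Stand-ins for the model outputs and the targets: 9 tensors of shape [1, 4]
outs = [torch.randn(1, 4, requires_grad=True) for _ in range(9)]
inputs = [torch.randn(1, 4) for _ in range(9)]

criterion = nn.MSELoss()
# torch.stack adds a new leading dim: 9 x [1, 4] -> [9, 1, 4]
stacked_outs = torch.stack(outs)
stacked_inputs = torch.stack(inputs)
loss = criterion(stacked_outs, stacked_inputs)
loss.backward()  # gradients flow back through the stacked tensor
print(stacked_outs.shape, loss.item())
```

torch.cat(outs, dim=0) would give a [9, 4] tensor instead; either works as long as the target is built the same way.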

Padmaksha_Roy · July 2, 2023, 5:01pm

Thank you! Solved this problem. I have another issue with the loss and I will try to find a post related to it. Thanks again!

Padmaksha_Roy · July 3, 2023, 1:54pm

hi @ptrblck , can you please clarify one thing in the sample code that you provided - does the “9” represent the number of classes? The input that you provided is 9 * 1 * 4, so is it like you are considering 1 sample from each of the 9 classes and doing SGD? Also, why did you split the decoder out matrix into 3 parts?

ptrblck · July 4, 2023, 6:44pm

No, the 9 represents the number of encoders as explained in the original question:

Now I have between one and 9 inputs depending on the user’s choice…

I didn’t split it into 3 parts: out = torch.split(out, 3, dim=1) creates 9 outputs, each with a size of 3 in dim1.
You can just copy/paste my code and add print statements to the forward method to check the shape and content of each tensor if in doubt.
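
A quick shape check along these lines, using a random tensor as a stand-in for the decoder output:

```python
import torch

out = torch.randn(1, 27)             # stand-in decoder output: batch of 1, 9*3 features
chunks = torch.split(out, 3, dim=1)  # the int is the chunk size, not the number of chunks
print(len(chunks), chunks[0].shape)
```

When given an int, torch.split treats it as the size of each chunk along the chosen dimension, so 27 features yield 9 chunks of 3.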

Padmaksha_Roy · July 4, 2023, 7:15pm

Hi @ptrblck, thank you so much! I did the prints and all but was unclear about the numbers. Now it is much clearer to me. Each domain is encoded with one encoder and 3 can be a batch size as inputs.