RuntimeError: Given groups=1, weight of size [6, 1, 5, 5], expected input[128, 3, 218, 178] to have 1 channels, but got 3 channels instead

Yes, I have looked at and tried solutions from other related questions. Let me explain.

Description: I am doing adversarial training and I've used code from this GitHub repo. It uses PyTorch to train a model on the MNIST dataset.
What I want is to use the CelebA dataset instead of MNIST. When I run this exact code on CelebA, it gives me the above error.
What I've tried: I googled the error and found two or three related issues, where the problem was resolved when people changed the input size of their first conv layer. When I changed self.conv1 = torch.nn.Conv2d(1, 6, 5, padding=2) to self.conv1 = torch.nn.Conv2d(3, 6, 5, padding=2), it resulted in another error on the line x = x.view(-1, 16*5*5), saying: “RuntimeError: shape '[-1, 400]' is invalid for input of size 4472832”.
I even tried changing the -1 value to 3, but that produced yet another error, and to be honest I didn't know what I was doing, so I stopped messing around.
Here is the part of the code that I think is related to this error. You can see the whole code here:

# Creating a simple network
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable
from torchvision import datasets, transforms

class LeNet5(torch.nn.Module):

    def __init__(self):
        super(LeNet5, self).__init__()
        self.conv1 = torch.nn.Conv2d(1, 6, 5, padding=2)  # 1 input channel: sized for grayscale MNIST
        self.conv2 = torch.nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16*5*5, 120)  # 400 in_features: assumes 28x28 MNIST inputs
        self.fc2 = nn.Linear(120, 84)       
        self.fc3 = nn.Linear(84, 10)    
        
    def forward(self, x):
        x = F.relu(self.conv1(x))  
        x = F.max_pool2d(x, 2) 
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)
        x = x.view(-1, 16*5*5)  # hard-coded flatten: only valid for MNIST-sized activations
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        
        return F.log_softmax(x,dim=-1)

FLAGS = flags.FLAGS  # flags comes from the linked repo's imports
NB_EPOCHS = 2
BATCH_SIZE = 128
LEARNING_RATE = .001

#Training the Network
def trainTorch(torch_model, train_loader, test_loader,
        nb_epochs=NB_EPOCHS, batch_size=BATCH_SIZE, train_end=-1, test_end=-1, learning_rate=LEARNING_RATE, optimizer=None):

    train_loss = []
    total = 0
    correct = 0
    step = 0
    for _epoch in range(nb_epochs):
      for xs, ys in train_loader:
        xs, ys = Variable(xs), Variable(ys)  # Variable is a legacy no-op wrapper in modern PyTorch
        if torch.cuda.is_available():
          xs, ys = xs.cuda(), ys.cuda()
        optimizer.zero_grad()
        preds = torch_model(xs)
        loss = F.nll_loss(preds, ys)
        loss.backward()  # compute gradients
        train_loss.append(loss.data.item())
        optimizer.step()  # update the parameters

        preds_np = preds.cpu().detach().numpy()
        correct += (np.argmax(preds_np, axis=1) == ys.cpu().detach().numpy()).sum()
        total += train_loader.batch_size
        step += 1
        if total % 1000 == 0:  # with batch_size=128, first true at total == 16000 (every 125 steps)
          acc = float(correct) / total
          print('[%s] Training accuracy: %.2f%%' % (step, acc * 100))
          total = 0
          correct = 0



model1 = LeNet5()
if torch.cuda.is_available():
  model1 = model1.cuda()
nb_epochs = 4
batch_size = 128
learning_rate = 0.001
train_end = -1
test_end = -1
report = AccuracyReport()  # AccuracyReport comes from the linked repo
train_loader = torch.utils.data.DataLoader(
    datasets.CelebA('data', split='train', transform=transforms.ToTensor(), download=True),
    batch_size=batch_size, shuffle=True)

test_loader = torch.utils.data.DataLoader(
    datasets.CelebA('data', split='test', transform=transforms.ToTensor()),
    batch_size=batch_size)


#Training the model
print("Training Model")
optimizer = optim.Adam(model1.parameters(), lr=learning_rate)
trainTorch(model1, train_loader, test_loader, nb_epochs, batch_size, train_end, test_end, learning_rate, optimizer = optimizer)
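
As an aside, the numbers in both errors fall directly out of the layer shape arithmetic (a worked derivation from the layer definitions above): 16*5*5 = 400 only holds for MNIST-sized inputs:

  MNIST input:         1 x 28 x 28
  conv1 (5x5, pad=2):  6 x 28 x 28
  maxpool 2x2:         6 x 14 x 14
  conv2 (5x5):        16 x 10 x 10
  maxpool 2x2:        16 x  5 x  5   ->  16*5*5 = 400 features per sample

For CelebA's 3 x 218 x 178 images the same stack yields 16 x 52 x 42 = 34944 features per sample, and a batch of 128 gives 128 * 34944 = 4472832, exactly the size reported by the view() error.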

What I want to achieve:
1- Understand and resolve this error, and get this code working on the CelebA dataset.
2- Even once this is resolved, would I need any other changes to do adversarial training on CelebA instead of MNIST?

3- My ultimate goal is to do adversarial training on a facial recognition dataset (ideally with PyTorch and the FGSM attack). If my current approach is not right, can you suggest other sources where this is done, or provide information on how to do it?
Thanks!

@ptrblck, need your feedback please.

Replace self.conv1 with a new nn.Conv2d layer accepting 3 input channels, then replace:

x = x.view(-1, 16*5*5)

with

x = x.view(x.size(0), -1)

and fix the in_features of self.fc1 in case you are running into a shape mismatch afterwards.
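
Put together, the two replacements look like this (a minimal sketch; the matching in_features for self.fc1 is worked out below):

# In __init__: accept 3-channel RGB inputs instead of 1-channel grayscale
self.conv1 = torch.nn.Conv2d(3, 6, 5, padding=2)

# In forward: flatten each sample while keeping the batch dimension intact
x = x.view(x.size(0), -1)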

First of all, thanks for the prompt reply, man.
I applied the change, but now it throws the following error at the line preds = torch_model(xs):

File ~/anaconda3/envs/pytorch/lib/python3.9/site-packages/torch/nn/functional.py:1690 in linear
ret = torch.addmm(bias, input, weight.t())

RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x34944 and 400x120)

Yes, that was expected and matches my second point:

Change self.fc1 to:

self.fc1 = nn.Linear(34944, 120)

I also want to know whether what I am trying to do is right or not.
Should code that originally ran on MNIST work on the CelebA dataset with only minor changes?

No, it will not run directly, as MNIST uses grayscale images (thus the in_channels is set to 1 in the first conv layer) while CelebA uses RGB images, which is why you need to replace the first conv layer with a new one accepting 3 input channels.
Also, since the spatial size changes (MNIST uses 28x28, CelebA 218x178) you need to adapt the in_features of the first linear layer to account for this increase in features.

Forgive my inexperience, but how would I “adapt the in_features of the first linear layer to account for this increase in features”?
Also, what would the class size be for the CelebA dataset? In self.fc3 = nn.Linear(84, 10) they used 10 (I think because MNIST has 10 digit classes), but what should it be in my case?
I really appreciate your guidance. Thanks!

Check my previous post where I explain how to change the in_features based on the reported activation shape in the error message.
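
For reference, spelling that arithmetic out for CelebA's 218x178 inputs (conv output size = in + 2*pad - kernel + 1; each 2x2 max pool halves the size, rounding down):

  input:               3 x 218 x 178
  conv1 (5x5, pad=2):  6 x 218 x 178
  maxpool 2x2:         6 x 109 x  89
  conv2 (5x5):        16 x 105 x  85
  maxpool 2x2:        16 x  52 x  42   ->  in_features = 16 * 52 * 42 = 34944

Alternatively, you can let PyTorch do the counting by pushing a dummy input through the feature extractor, a minimal sketch (the Sequential below mirrors the conv/pool stack of the LeNet5 above):

import torch
import torch.nn as nn

# Run one dummy sample through the conv/pool stack and count the
# flattened features instead of deriving them by hand
features = nn.Sequential(
    nn.Conv2d(3, 6, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(6, 16, 5), nn.ReLU(), nn.MaxPool2d(2),
)
with torch.no_grad():
    n_features = features(torch.zeros(1, 3, 218, 178)).flatten(1).size(1)
print(n_features)  # 34944 for 218x178 CelebA images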

It depends on your use case; CelebA provides:

  • 10,177 identities,
  • 202,599 face images, and
  • 5 landmark locations plus 40 binary attribute annotations per image.
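
Which of these you train against is selected via the dataset's target_type argument ('attr' is the default with the 40 binary attributes; 'identity', 'bbox', and 'landmarks' are the alternatives), e.g.:

dataset = datasets.CelebA('data', split='train', target_type='identity',
                          transform=transforms.ToTensor())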

Yeah, sorry about that. I think I missed that message somehow. I have updated that, but there are still errors…
Now I'm seeing RuntimeError: 1D target tensor expected, multi-target not supported on the line:
loss = F.nll_loss(preds, ys)
I googled the error, and every solution says the label tensor should be 1-D instead of a multi-dimensional tensor. In my case I've printed ys and it is multi-dimensional. To fix that I tried ys = ys.squeeze_(), but it's still the same.

My use case is that first I want to run simple facial recognition on the dataset and see the results. Then the code attacks the model with adversarial examples (created from the same dataset) and retrains the model on the original data mixed with the adversarial images. So basically it is a multi-class classification problem.
For that reason I have set my class size to 10177, like:
self.fc3 = nn.Linear(84, 10177)

For a multi-class classification use case with nn.NLLLoss the model outputs should have the shape [batch_size, nb_classes] and the targets should have the shape [batch_size] containing class indices in the range [0, nb_classes-1].

Hey man, I understand to some extent what you are saying, but I don't know what exact changes are needed… What exactly should I do with loss = F.nll_loss(preds, ys) now?

Here is a small but complete code snippet using a single layer as a model showing the expected shapes:

import torch
import torch.nn as nn
import torch.nn.functional as F

batch_size = 2
nb_classes = 10
in_features = 20

model = nn.Linear(in_features, nb_classes)
x = torch.randn(batch_size, in_features)
target = torch.randint(0, nb_classes, (batch_size,))

criterion = nn.NLLLoss()

output = model(x)
output = F.log_softmax(output, dim=1)
loss = criterion(output, target)
loss.backward()
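
As a side note, F.cross_entropy fuses the log_softmax and nll_loss steps, so an equivalent formulation of the last three lines is:

# Equivalent: cross_entropy applies log_softmax + nll_loss in one call
loss = F.cross_entropy(model(x), target)
loss.backward()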

Sorry, man. I really appreciate your help, but I'm still clueless about how to relate this to my code…

In your code you are using the default target_type of the CelebA dataset, which is attr, i.e. a multi-label classification as described in the docs.
Based on your description you want to use the identity target, so this code should work:

dataset = datasets.CelebA('./data', split='train', transform=transforms.ToTensor(), download=True, target_type='identity')
batch_size = 8
train_loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True)

model = models.resnet18()
model.fc = nn.Linear(512, 10177)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for xs, ys in train_loader:
    optimizer.zero_grad()
    output = model(xs)
    output = F.log_softmax(output, dim=1)
    loss = F.nll_loss(output, ys)
    loss.backward()
    optimizer.step()
    print(loss.item())
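
(For reference: torchvision's resnet18 ends in nn.Linear(512, 1000) by default, so replacing model.fc keeps the 512 incoming features and resizes the output to the number of identities.)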

I'm trying this snippet separately from my existing messed-up code. One thing, though: I'm getting the following error on the models.resnet18() line: Undefined name 'models'

You are most likely missing:

import torchvision.models as models

Hey ptr. I tried your code and there are two things that seem wrong. First, the training accuracy is 0% for all of the iterations; second, it says IndexError: Target 10177 is out of bounds. I get the same error with 10176 as well.
My new code snippet:

def trainTorch(torch_model, train_loader, test_loader,
        nb_epochs=NB_EPOCHS, batch_size=BATCH_SIZE, train_end=-1, test_end=-1, learning_rate=LEARNING_RATE, optimizer=None):

    train_loss = []
    total = 0
    correct = 0
    step = 0
    for _epoch in range(nb_epochs):
      for xs, ys in train_loader:
        optimizer.zero_grad()
        preds = torch_model(xs)
        output = F.log_softmax(preds, dim=1)
        loss = F.nll_loss(output, ys)
        loss.backward()

        train_loss.append(loss.data.item())
        optimizer.step()  # update the parameters

        preds_np = preds.cpu().detach().numpy()
        # argmax over the raw logits equals argmax over the log-probabilities
        correct += (np.argmax(preds_np, axis=1) == ys.cpu().detach().numpy()).sum()
        total += train_loader.batch_size
        step += 1
        if total % 1000 == 0:
          acc = float(correct) / total
          print('[%s] Training accuracy: %.2f%%' % (step, acc * 100))
          total = 0
          correct = 0

Output:

Training Model
[125] Training accuracy: 0.00%
[250] Training accuracy: 0.00%
[375] Training accuracy: 0.00%
[500] Training accuracy: 0.00%
[625] Training accuracy: 0.00%
[750] Training accuracy: 0.00%
[875] Training accuracy: 0.10%
[1000] Training accuracy: 0.00%
[1125] Training accuracy: 0.00%
[1250] Training accuracy: 0.00%
[1375] Training accuracy: 0.00%

Please, I need your suggestions. Thanks.

That's expected: the out_features have to be 10178 to cover target values up to 10177.

The same error for 10176 is not expected, though; my code works for class indices up to 10176.

As for the 0% accuracy, that might be the case, as I didn't provide a code snippet containing a model with already-tuned hyperparameters for this use case; I provided a code snippet showing how to fix your previous errors, as you were stuck even after my explanation of what was causing the issues.
You can thus fix the class index issue by using nn.Linear(512, 10178) as the final classifier in your model, and can then reuse parts of my code to fix the multi-target errors in your model and training code.
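
(Aside: the underlying reason is that CelebA's identity labels are 1-indexed, running from 1 to 10177, so the raw targets exceed the valid range [0, 10176] of a 10177-way classifier; that is why 10178 outputs work. An alternative sketch is to shift the labels instead of widening the layer:)

# Alternative sketch: keep a 10177-way classifier and shift CelebA's
# 1-indexed identity labels (1..10177) down to class indices 0..10176
model.fc = nn.Linear(512, 10177)
# then, inside the training loop:
loss = F.nll_loss(output, ys - 1)

Also worth noting: with 10,177 classes, chance-level accuracy is roughly 0.01%, so near-zero accuracy early in training is expected rather than necessarily a bug.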

Hey man. I think we are really close here. Sorry to ask you again and again. I have reused parts of your code with mine, and now the training accuracy is not 0, but it's still as low as 2% or 5%. I did change the batch_size to 128 instead of 8, which increased the accuracy and speed, but it's still not right. I googled how to adjust these hyperparameters but did not find relevant help. Can you please advise if I've missed something in the self.conv or self.fc layers?
*Note: I changed the second param of self.fc2 to 512, because when I used your nn.Linear(512, 10178) for self.fc3 it gave a “mat1 and mat2 shapes cannot be multiplied” error.
Thank you for all the help!
This is what the code looks like now:

class LeNet5(torch.nn.Module):          
     
    def __init__(self):     
        super(LeNet5, self).__init__()
        self.conv1 = torch.nn.Conv2d(3, 6, 5, padding=2)  # 3 input channels for RGB CelebA
        self.conv2 = torch.nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(34944, 120)  # 16*52*42 features for 218x178 inputs
        self.fc2 = nn.Linear(120, 512)
        self.fc3 = nn.Linear(512, 10178)  # 10178 outputs to cover CelebA identity labels up to 10177
        
    def forward(self, x):
        x = F.relu(self.conv1(x))  
        x = F.max_pool2d(x, 2) 
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)
        #x = x.view(-1, 16*5*5)  # old MNIST-specific flatten
        x = x.view(x.size(0), -1)  # flatten per sample, keeping the batch dimension
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        
        return F.log_softmax(x,dim=-1)


model1 = LeNet5()

nb_epochs = 4
batch_size = 128
learning_rate = 0.001
train_end = -1
test_end = -1
report = AccuracyReport()
train_loader = torch.utils.data.DataLoader(
    datasets.CelebA('data', split='train', target_type='identity', transform=transforms.ToTensor(), download=True),
    batch_size=batch_size, shuffle=True)