RuntimeError: Given groups=1, weight of size [64, 3, 7, 7], expected input[3, 1, 224, 224] to have 3 channels, but got 1 channels instead

Hello, I am also facing a related challenge. Here is the error I get when I try to feed an image into a CNN:

RuntimeError: Given groups=1, weight of size [16, 3, 3, 3], expected input[2, 2, 2, 2] to have 3 channels, but got 2 channels instead

Below is my code
class Network(nn.Module):
    def __init__(self, in_channels=3):
        print("\ninitializing Network")
        super(Network, self).__init__()
        # create 4 input conv. branches
        self.Conv1 = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(16, 16, 3, stride=1, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3), nn.ReLU(),
            nn.Conv2d(16, 16, 3))

        self.Conv2 = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(16, 8, 3, stride=1, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3))

        self.Conv3 = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(16, 4, 3, stride=1, padding=1), nn.ReLU(),
            nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3))

        self.Conv4 = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(16, 2, 3, stride=1, padding=1), nn.ReLU(),
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3))

        self.Upsample1 = nn.Sequential(
            nn.Upsample(size=(16, 16), scale_factor=None, align_corners=True, mode='bilinear'),
            nn.Conv2d(16, 64, 3, stride=1, padding=1), nn.ReLU(),
            nn.MaxPool2d(3, 2),
            nn.Conv2d(64, 32, 3, stride=1), nn.ReLU(),

            nn.Upsample(size=(32, 32), scale_factor=None, align_corners=True, mode='bilinear'),
            nn.Conv2d(32, 16, 3, stride=1, padding=1), nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(16, 8, 3, stride=1), nn.ReLU(),

            nn.Upsample(size=(8, 8), scale_factor=None, align_corners=True, mode='bilinear'),
            nn.Conv2d(8, 4, 3, stride=1, padding=1), nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(4, 2, 3), nn.ReLU())

        self.Upsample2 = nn.Sequential(
            nn.Upsample(size=(16, 16), scale_factor=None, align_corners=True, mode='bilinear'),
            nn.Conv2d(16, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 16, 3, stride=2), nn.ReLU(),

            nn.Upsample(size=(16, 16), scale_factor=None, align_corners=True, mode='bilinear'),
            nn.Conv2d(16, 4, 3, stride=2, padding=1), nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(4, 4, 3), nn.ReLU())

        self.Upsample3 = nn.Sequential(
            nn.Upsample(size=(16, 16), scale_factor=None, align_corners=True, mode='bilinear'),
            nn.Conv2d(16, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 8, 3), nn.ReLU())

        self.Conv1a = nn.Conv2d(3, 16, 3, padding=1)
        self.Conv2a = nn.Conv2d(4, 16, 3, padding=1)
        self.Conv3a = nn.Conv2d(8, 16, 3, padding=1)
        self.Conv4a = nn.Conv2d(16, 16, 3)
        self.Conv5 = nn.Conv2d(64, 3, 3)

    def forward(self, image):
        w = self.Conv1(image)
        w = self.Upsample1(w)
        w = F.relu(self.Conv1a(w))

        k = self.Conv2(image)
        k = self.Upsample2(k)
        k = F.relu(self.Conv2a(k))

        y = self.Conv3(image)
        y = self.Upsample3(y)
        y = F.relu(self.Conv3a(y))

        z = self.Conv4(image)
        z = F.relu(self.Conv4a(z))

        p = torch.cat([w, k, y, z], dim=1)
        p = F.relu(self.fc1(p))
        p = self.fc2(p)

        q = self.Conv5(p)
        q = self.out(q)
        return q

I guess the first conv layer is using in_channels=3 and is raising this error, since you are passing an input with two channels in the shape [2, 2, 2, 2].
Note that even if you use 3 channels ([2, 3, 2, 2]), the spatial size would still be too small for the model, since the first conv layer uses a kernel size of 3, which is bigger than the input.
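For reference, the mismatch can be reproduced in isolation like this (a minimal sketch):

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3)  # first layer of the model, expects 3 input channels
x = torch.randn(2, 2, 2, 2)             # batch of 2, but only 2 channels and a 2x2 spatial size
out = conv(x)  # RuntimeError: expected input[2, 2, 2, 2] to have 3 channels, but got 2 channels instead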

Thank you for the observation @ptrblck. It's true that the first conv layer is raising the error. How should I go about it?


If you need to use the input as [batch_size=2, channels=2, height=2, width=2], you won’t be able to use this architecture even after changing the in_channels to 2 in the first conv layer, since your spatial input shape is too small.
You could try to use a kernel_size of 2 and remove all pooling layers, as you would otherwise downsample the activation to a single pixel of 1x1.
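For example, something like this minimal sketch (assuming you really do need the [2, 2, 2, 2] input):

import torch
import torch.nn as nn

tiny = nn.Sequential(
    nn.Conv2d(2, 16, kernel_size=2),   # 2x2 -> 1x1, so no pooling layers afterwards
    nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=1),  # 1x1 convs keep working on the single remaining pixel
    nn.ReLU(),
)

x = torch.randn(2, 2, 2, 2)  # [batch_size=2, channels=2, height=2, width=2]
print(tiny(x).shape)         # torch.Size([2, 16, 1, 1])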

Could you increase the spatial size of your input and could you explain why the shape is so small?


Thanks for your response. I am facing this issue in a different context. If I don't want to change my original images and also don't have the option to change the architecture (since I am using resnet18 from models), how can I do slicing? Is there an example to guide me?

The new error setting is as such:

RuntimeError: Given groups=1, weight of size [64, 1, 7, 7], expected input[51, 3, 224, 224] to have 1 channels, but got 3 channels instead

Here is a test image above. I am wondering: if I don't want to convert my images to black and white or grayscale, and want to use all three RGB channels, how could I still do so for PyTorch training, given that I cannot change the loaded ResNet18?

You could add an extra conv layer before passing the image to the pretrained resnet.
This would still be similar to changing the model architecture, but instead of replacing the first pretrained conv layer you would just add another one.
Here is a small code snippet:

first_conv = nn.Conv2d(3, 1, kernel_size, stride, padding)  # you could use e.g. a 1x1 kernel
model = pretrained_model()

x = ...  # load data, should have the shape [batch_size, 3, height, width]
out = first_conv(x)
out = model(out)
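A slightly more complete sketch of that idea (assuming the pretrained model's first conv expects a single input channel; the 1x1 kernel and resnet18 are just example choices):

import torch
import torch.nn as nn
import torchvision.models as models

first_conv = nn.Conv2d(3, 1, kernel_size=1)  # maps the 3 RGB channels down to 1

model = models.resnet18()
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)  # single-channel first conv

x = torch.randn(2, 3, 224, 224)  # [batch_size, 3, height, width]
out = model(first_conv(x))
print(out.shape)  # torch.Size([2, 1000])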

PS: OpenCV loads images in BGR order by default, which is why your current image looks "blue-ish".
torchvision models are pretrained on RGB images, so you would usually have to convert the channel order to RGB before passing the image to the pretrained model. However, your use case doesn't seem to use a pretrained torchvision model, since the first conv layer expects a single input channel, so you can probably skip this information.
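For example, with OpenCV the conversion would look like this (a minimal sketch; the file name is made up):

import cv2
import torch

img = cv2.imread('test_image.jpg')          # OpenCV loads in BGR order
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # convert to RGB for torchvision models
x = torch.from_numpy(img).permute(2, 0, 1).float() / 255.  # HWC uint8 -> CHW float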


Thanks a lot for the response. I'm not sure I was able to follow you completely.

Here is what I wrote for the network:

num_classes = 68 * 2 #68 coordinates X and Y flattened

class Network(nn.Module):
    def __init__(self,num_classes=136):
        super().__init__()
        self.first_conv = nn.Conv2d(3, 1, (1,1), 1, 1) # you could use e.g. a 1x1 kernel
        self.model_name = 'resnet18'
        self.model = models.resnet18()
        self.model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
        self.model.fc = nn.Linear(self.model.fc.in_features, num_classes)
        
    def forward(self, x):
        out = self.first_conv(x)
        out = self.model(out)
        return out

and here’s the error I get when training:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-89-11ebb3420695> in <module>
     28         landmarks = landmarks.view(landmarks.size(0),-1).cuda()
     29 
---> 30         predictions = network(images)
     31 
     32         # clear all the gradients before calculating them

~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

<ipython-input-85-d970c9b6abd5> in forward(self, x)
     11 
     12     def forward(self, x):
---> 13         out = self.first_conv(x)
     14         out = self.model(out)
     15         return out

~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    720             result = self._slow_forward(*input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(
    724                 _global_forward_hooks.values(),

~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/conv.py in forward(self, input)
    417 
    418     def forward(self, input: Tensor) -> Tensor:
--> 419         return self._conv_forward(input, self.weight)
    420 
    421 class Conv3d(_ConvNd):

~/anaconda3/lib/python3.7/site-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight)
    414                             _pair(0), self.dilation, self.groups)
    415         return F.conv2d(input, weight, self.bias, self.stride,
--> 416                         self.padding, self.dilation, self.groups)
    417 
    418     def forward(self, input: Tensor) -> Tensor:

RuntimeError: Input type (torch.cuda.DoubleTensor) and weight type (torch.cuda.FloatTensor) should be the same

Thanks for the code.
I'm not sure if you would need this workaround, or why you are replacing self.model.conv1 with a conv layer accepting a single input channel.
The default resnet18 model already accepts RGB images, so you could just remove the self.model.conv1 line of code as well as self.first_conv and use the self.model directly.

Also, note that you are not using the pretrained model, since you are not passing pretrained=True to resnet18.

The second error is raised since your input tensor is a DoubleTensor, while the parameters are FloatTensors. Transform the input via x = x.float() to get rid of this error.
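For example (a minimal sketch, assuming the images come in as float64, e.g. from a numpy array):

# in the training loop, before the forward pass
images = images.float()           # DoubleTensor -> FloatTensor
predictions = network(images)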


I see, thanks for explanation. I guess you mean something like the following, right?

num_classes = 68 * 2 #68 coordinates X and Y flattened

class Network(nn.Module):
    def __init__(self,num_classes=136):
        super().__init__()
        ##self.first_conv = nn.Conv2d(3, 1, (1,1), 1, 1) # you could use e.g. a 1x1 kernel
        self.model_name = 'resnet18'
        self.model = models.resnet18()
        ##self.model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
        self.model.fc = nn.Linear(self.model.fc.in_features, num_classes)
        
    def forward(self, x):
        ##out = self.first_conv(x)
        x = x.float()
        out = self.model(x)
        return out

However, I am not sure if the error below is exactly related to this code block or how to fix:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-100-11ebb3420695> in <module>
     37 
     38         # calculate the gradients
---> 39         loss_train_step.backward()
     40 
     41         # update the parameters

~/anaconda3/lib/python3.7/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    183                 products. Defaults to ``False``.
    184         """
--> 185         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    186 
    187     def register_hook(self, hook):

~/anaconda3/lib/python3.7/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
    125     Variable._execution_engine.run_backward(
    126         tensors, grad_tensors, retain_graph, create_graph,
--> 127         allow_unreachable=True)  # allow_unreachable flag
    128 
    129 

RuntimeError: Found dtype Double but expected Float
Exception raised from compute_types at /opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/ATen/native/TensorIterator.cpp:183 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x4d (0x7f9c153f177d in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: at::TensorIterator::compute_types(at::TensorIteratorConfig const&) + 0x259 (0x7f9c48448ca9 in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #2: at::TensorIterator::build(at::TensorIteratorConfig&) + 0x6b (0x7f9c4844c44b in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #3: at::TensorIterator::TensorIterator(at::TensorIteratorConfig&) + 0xdd (0x7f9c4844cabd in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #4: at::native::mse_loss_backward_out(at::Tensor&, at::Tensor const&, at::Tensor const&, at::Tensor const&, long) + 0x18a (0x7f9c482b171a in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #5: <unknown function> + 0xd1d610 (0x7f9c16574610 in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #6: at::native::mse_loss_backward(at::Tensor const&, at::Tensor const&, at::Tensor const&, long) + 0x90 (0x7f9c482ae140 in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #7: <unknown function> + 0xd1d6b0 (0x7f9c165746b0 in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #8: <unknown function> + 0xd3f936 (0x7f9c16596936 in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #9: at::mse_loss_backward(at::Tensor const&, at::Tensor const&, at::Tensor const&, long) + 0x119 (0x7f9c48770da9 in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #10: <unknown function> + 0x2b5e8c9 (0x7f9c4a3c98c9 in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #11: <unknown function> + 0x7f60d6 (0x7f9c480610d6 in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #12: at::mse_loss_backward(at::Tensor const&, at::Tensor const&, at::Tensor const&, long) + 0x119 (0x7f9c48770da9 in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #13: torch::autograd::generated::MseLossBackward::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&) + 0x1af (0x7f9c4a30552f in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #14: <unknown function> + 0x30d1017 (0x7f9c4a93c017 in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #15: torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&, std::shared_ptr<torch::autograd::ReadyQueue> const&) + 0x1400 (0x7f9c4a937860 in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #16: torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&) + 0x451 (0x7f9c4a938401 in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #17: torch::autograd::Engine::thread_init(int, std::shared_ptr<torch::autograd::ReadyQueue> const&, bool) + 0x89 (0x7f9c4a930579 in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #18: torch::autograd::python::PythonEngine::thread_init(int, std::shared_ptr<torch::autograd::ReadyQueue> const&, bool) + 0x4a (0x7f9c4ec5f99a in /home/mona/anaconda3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #19: <unknown function> + 0xc9067 (0x7f9c859d2067 in /home/mona/anaconda3/lib/python3.7/site-packages/zmq/backend/cython/../../../../.././libstdc++.so.6)
frame #20: <unknown function> + 0x9609 (0x7f9c88ae9609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #21: clone + 0x43 (0x7f9c88a10103 in /lib/x86_64-linux-gnu/libc.so.6)

Is this related to https://github.com/pytorch/pytorch/issues/42588?

Solved here: RuntimeError: Found dtype Double but expected Float (Exception raised from compute_types at /opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/ATen/native/TensorIterator.cpp:183)
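The same dtype rule applies to the target tensor used in the loss, e.g. (a minimal sketch using the variable names from the training loop above; criterion is assumed to be something like nn.MSELoss()):

# cast the regression target to float32 as well before computing the loss
landmarks = landmarks.view(landmarks.size(0), -1).float().cuda()
loss_train_step = criterion(predictions, landmarks)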

Hi guys,
I'm using YOLOv3.
I have two inputs totalling 4 channels (3 channels for the RGB image and 1 channel for the thermal image).
I want to concatenate the two inputs.
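In isolation, the concatenation I have in mind looks like this (a minimal sketch with made-up shapes):

import torch

in_rgb = torch.randn(1, 3, 64, 64)       # RGB image
in_ther = torch.randn(1, 1, 64, 64)      # thermal image
x = torch.cat([in_rgb, in_ther], dim=1)  # -> [1, 4, 64, 64], matching ch=4 in the model
print(x.shape)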
This is the code of my yolo.py file:
----------------------------yolo.py----------------

import argparse

import logging

import sys

from copy import deepcopy

from pathlib import Path

sys.path.append('./')  # to run '$ python *.py' files in subdirectories

logger = logging.getLogger(__name__)

from models.common import *

from models.experimental import MixConv2d, CrossConv

from utils.autoanchor import check_anchor_order

from utils.general import make_divisible, check_file, set_logging

from utils.torch_utils import time_synchronized, fuse_conv_and_bn, model_info, scale_img, initialize_weights, \

    select_device, copy_attr

try:

    import thop  # for FLOPS computation

except ImportError:

    thop = None

class Detect(nn.Module):

    stride = None  # strides computed during build

    export = False  # onnx export

    def __init__(self, nc=4, anchors=(), ch=()):  # detection layer

        super(Detect, self).__init__()

        self.nc = nc  # number of classes

        self.no = nc + 5  # number of outputs per anchor

        self.nl = len(anchors)  # number of detection layers

        self.na = len(anchors[0]) // 2  # number of anchors

        self.grid = [torch.zeros(1)] * self.nl  # init grid

        a = torch.tensor(anchors).float().view(self.nl, -1, 2)

        self.register_buffer('anchors', a)  # shape(nl,na,2)

        self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2))  # shape(nl,1,na,1,1,2)

       # self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch)  # output conv

        self.m = nn.ModuleList(nn.Conv2d(4, self.no * self.na, 1) for out in ch) 

        #self.m2 = nn.ModuleList(nn.Concat(in_ther, self.no * self.na, 1) for in_ther in ch) 

#class TwoInputs(nn.Module):

 #   def __init__(self):

  #    super(TwoInputs, self).__init__()

      #self.conv = nn.Conv2d( ... )  # set up your layer here

     # self.fc1 = nn.Linear( ... )  # set up first FC layer

     # self.fc2 = nn.Linear( ... )  

    def forward(self,in_rgb,in_ther):

        combined = torch.cat((in_rgb.view(in_rgb.size(0), -1), in_ther.view(in_ther.size(0), -1)), dim=1)

        out = self.Concat(combined)

        return out   

    def forward(self, x):

        # x = x.copy()  # for profiling

        z = []  # inference output

        self.training |= self.export

        for i in range(self.nl):

            x[i] = self.m[i](x[i])  # conv

            bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)

            x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

            if not self.training:  # inference

                if self.grid[i].shape[2:4] != x[i].shape[2:4]:

                    self.grid[i] = self._make_grid(nx, ny).to(x[i].device)

                y = x[i].sigmoid()

                y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i]  # xy

                y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh

                z.append(y.view(bs, -1, self.no))

        return x if self.training else (torch.cat(z, 1), x)

    @staticmethod

    def _make_grid(nx=20, ny=20):

        yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])

        return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()

class Model(nn.Module):

    def __init__(self, cfg='yolov3.yaml', ch=4, nc=None):  # model, input channels, number of classes

        super(Model, self).__init__()

        if isinstance(cfg, dict):

            self.yaml = cfg  # model dict

        else:  # is *.yaml

            import yaml  # for torch hub

            self.yaml_file = Path(cfg).name

            with open(cfg) as f:

                self.yaml = yaml.load(f, Loader=yaml.FullLoader)  # model dict

        # Define model

        ch = self.yaml['ch'] = self.yaml.get('ch', ch)  # input channels

        if nc and nc != self.yaml['nc']:

            logger.info('Overriding model.yaml nc=%g with nc=%g' % (self.yaml['nc'], nc))

            self.yaml['nc'] = nc  # override yaml value

        self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch])  # model, savelist

        self.names = [str(i) for i in range(self.yaml['nc'])]  # default names

        print([x.shape for x in self.forward(torch.zeros(1, ch, 64, 64))])

       # print([out.shape for out in self.forward(torch.zeros(1, ch, 64, 64))])

        # Build strides, anchors

        m = self.model[-1]  # Detect()

        if isinstance(m, Detect):

            s = 256  # 2x min stride

            m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))])  # forward

            m.anchors /= m.stride.view(-1, 1, 1)

            check_anchor_order(m)

            self.stride = m.stride

            self._initialize_biases()  # only run once

            #print('Strides: %s' % m.stride.tolist())

        # Init weights, biases

        initialize_weights(self)

        self.info()

        logger.info('')

    def forward(self, x, augment=False, profile=False):

        if augment:

            img_size = out.shape[-2:]  # height, width

            s = [1, 0.83, 0.67]  # scales

            f = [None, 3, None]  # flips (2-ud, 3-lr)

            y = []  # outputs

            for si, fi in zip(s, f):

                xi = scale_img(x.flip(fi) if fi else x, si, gs=int(self.stride.max()))

                yi = self.forward_once(xi)[0]  # forward

                # cv2.imwrite('img%g.jpg' % s, 255 * xi[0].numpy().transpose((1, 2, 0))[:, :, ::-1])  # save

                yi[..., :4] /= si  # de-scale

                if fi == 2:

                    yi[..., 1] = img_size[0] - yi[..., 1]  # de-flip ud

                elif fi == 3:

                    yi[..., 0] = img_size[1] - yi[..., 0]  # de-flip lr

                y.append(yi)

            return torch.cat(y, 1), None  # augmented inference, train

        else:

            return self.forward_once(x, profile)  # single-scale inference, train

    def forward_once(self, x, profile=False):

        y, dt = [], []  # outputs

        for m in self.model:

            if m.f != -1:  # if not from previous layer

                x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]  # from earlier layers

            if profile:

                o = thop.profile(m, inputs=(x,out), verbose=False)[0] / 1E9 * 2 if thop else 0  # FLOPS

                t = time_synchronized()

                for _ in range(10):

                    _ = m(x)

                dt.append((time_synchronized() - t) * 100)

                print('%10.1f%10.0f%10.1fms %-40s' % (o, m.np, dt[-1], m.type))

            x = m(x)  # run

            y.append(x if m.i in self.save else None)  # save output

        if profile:

            print('%.1fms total' % sum(dt))

        return x

    def _initialize_biases(self, cf=None):  # initialize biases into Detect(), cf is class frequency

        # https://arxiv.org/abs/1708.02002 section 3.3

        # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.

        m = self.model[-1]  # Detect() module

        for mi, s in zip(m.m, m.stride):  # from

            b = mi.bias.view(m.na, -1)  # conv.bias(255) to (3,85)

            b.data[:, 4] += math.log(8 / (640 / s) ** 2)  # obj (8 objects per 640 image)

            b.data[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum())  # cls

            mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)

    def _print_biases(self):

        m = self.model[-1]  # Detect() module

        for mi in m.m:  # from

            b = mi.bias.detach().view(m.na, -1).T  # conv.bias(255) to (3,85)

            print(('%6g Conv2d.bias:' + '%10.3g' * 6) % (mi.weight.shape[1], *b[:5].mean(1).tolist(), b[5:].mean()))

    # def _print_weights(self):

    #     for m in self.model.modules():

    #         if type(m) is Bottleneck:

    #             print('%10.3g' % (m.w.detach().sigmoid() * 2))  # shortcut weights

    def fuse(self):  # fuse model Conv2d() + BatchNorm2d() layers

        print('Fusing layers... ')

        for m in self.model.modules():

            if type(m) is Conv and hasattr(m, 'bn'):

                m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv

                delattr(m, 'bn')  # remove batchnorm

                m.forward = m.fuseforward  # update forward

        self.info()

        return self

    def nms(self, mode=True):  # add or remove NMS module

        present = type(self.model[-1]) is NMS  # last layer is NMS

        if mode and not present:

            print('Adding NMS... ')

            m = NMS()  # module

            m.f = -1  # from

            m.i = self.model[-1].i + 1  # index

            self.model.add_module(name='%s' % m.i, module=m)  # add

            self.eval()

        elif not mode and present:

            print('Removing NMS... ')

            self.model = self.model[:-1]  # remove

        return self

    def autoshape(self):  # add autoShape module

        print('Adding autoShape... ')

        m = autoShape(self)  # wrap model

        copy_attr(m, self, include=('yaml', 'nc', 'hyp', 'names', 'stride'), exclude=())  # copy attributes

        return m

    def info(self, verbose=False, img_size=640):  # print model information

        model_info(self, verbose, img_size)

def parse_model(d, ch):  # model_dict, input_channels(3)

    logger.info('\n%3s%18s%3s%10s  %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))

    anchors, nc, gd, gw = d['anchors'], d['nc'], d['depth_multiple'], d['width_multiple']

    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors  # number of anchors

    no = na * (nc + 5)  # number of outputs = anchors * (classes + 5)

    layers, save, c2 = [], [], ch[-1]  # layers, savelist, ch out

    for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']):  # from, number, module, args

        m = eval(m) if isinstance(m, str) else m  # eval strings

        for j, a in enumerate(args):

            try:

                args[j] = eval(a) if isinstance(a, str) else a  # eval strings

            except:

                pass

        n = max(round(n * gd), 1) if n > 1 else n  # depth gain

        if m in [Conv, Bottleneck, SPP, DWConv, MixConv2d, Focus, CrossConv, BottleneckCSP, C3]:

            c1, c2 = ch[f], args[0]

            # Normal

            # if i > 0 and args[0] != no:  # channel expansion factor

            #     ex = 1.75  # exponential (default 2.0)

            #     e = math.log(c2 / ch[1]) / math.log(2)

            #     c2 = int(ch[1] * ex ** e)

            # if m != Focus:

            c2 = make_divisible(c2 * gw, 8) if c2 != no else c2

            # Experimental

            # if i > 0 and args[0] != no:  # channel expansion factor

            #     ex = 1 + gw  # exponential (default 2.0)

            #     ch1 = 32  # ch[1]

            #     e = math.log(c2 / ch1) / math.log(2)  # level 1-n

            #     c2 = int(ch1 * ex ** e)

            # if m != Focus:

            #     c2 = make_divisible(c2, 8) if c2 != no else c2

            args = [c1, c2, *args[1:]]

            if m in [BottleneckCSP, C3]:

                args.insert(2, n)

                n = 1

        elif m is nn.BatchNorm2d:

            args = [ch[f]]

        elif m is Concat:

            c2 = sum([ch[x if x < 0 else x + 1] for x in f])

        elif m is Detect:

            args.append([ch[x + 1] for x in f])

            if isinstance(args[1], int):  # number of anchors

                args[1] = [list(range(args[1] * 2))] * len(f)

        elif m is Contract:

            c2 = ch[f if f < 0 else f + 1] * args[0] ** 2

        elif m is Expand:

            c2 = ch[f if f < 0 else f + 1] // args[0] ** 2

        else:

            c2 = ch[f if f < 0 else f + 1]

        m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args)  # module

        t = str(m)[8:-2].replace('__main__.', '')  # module type

        np = sum([x.numel() for x in m_.parameters()])  # number params

        m_.i, m_.f, m_.type, m_.np = i, f, t, np  # attach index, 'from' index, type, number params

        logger.info('%3s%18s%3s%10.0f  %-40s%-30s' % (i, f, n, np, t, args))  # print

        save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)  # append to savelist

        layers.append(m_)

        ch.append(c2)

    return nn.Sequential(*layers), sorted(save)

if __name__ == '__main__':

    parser = argparse.ArgumentParser()

    parser.add_argument('--cfg', type=str, default='yolov3.yaml', help='model.yaml')

    parser.add_argument('--device', default='0', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')

    opt = parser.parse_args()

    opt.cfg = check_file(opt.cfg)  # check file

    set_logging()

    device = select_device(opt.device)

    # Create model

    model = Model(opt.cfg).to(device)

    model.train()

    # Profile

    # img = torch.rand(8 if torch.cuda.is_available() else 1, 3, 640, 640).to(device)

    # y = model(img, profile=True)

    # Tensorboard

    # from torch.utils.tensorboard import SummaryWriter

    # tb_writer = SummaryWriter()

    # print("Run 'tensorboard --logdir=models/runs' to view tensorboard at http://localhost:6006/")

    # tb_writer.add_graph(model.model, img)  # add model to tensorboard

    # tb_writer.add_image('test', img[0], dataformats='CWH')  # add model to tensorboard

But I get this error:

  File "train.py", line 538, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 79, in train
    model = Model(opt.cfg or ckpt['model'].yaml, ch=4, nc=nc).to(device)  # create (ch=3 change to 4)
  File "/content/drive/My Drive/yolov3/models/yolo.py", line 99, in __init__
    print([x.shape for x in self.forward(torch.zeros(1, ch, 64, 64))])
  File "/content/drive/My Drive/yolov3/models/yolo.py", line 138, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "/content/drive/My Drive/yolov3/models/yolo.py", line 154, in forward_once
    x = m(x)  # run
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/yolov3/models/yolo.py", line 60, in forward
    x[i] = self.m[i](x[i])  # conv
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py", line 399, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py", line 396, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [27, 4, 1, 1], expected input[1, 256, 8, 8] to have 4 channels, but got 256 channels instead

Can you please help me?

Based on the error message, a conv layer expects 4 input channels (I guess the first conv layer), while you are feeding a tensor with 256 channels into it.
I assume a view or reshape operation might be wrong in your code. I'm not exactly sure which line of code is causing this issue, but I would start by debugging all the view operations first, such as:

x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
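One quick way to narrow it down would be to print the shapes right before the conv and view calls in Detect.forward, e.g. (hypothetical debug prints):

for i in range(self.nl):
    # compare the incoming feature map with what the conv layer expects
    print(i, x[i].shape, self.m[i].weight.shape)
    x[i] = self.m[i](x[i])  # conv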

I still have a problem.

import argparse

import logging

import sys

from copy import deepcopy

from pathlib import Path

sys.path.append('./')  # to run '$ python *.py' files in subdirectories

logger = logging.getLogger(__name__)

from models.common import *

from models.experimental import MixConv2d, CrossConv

from utils.autoanchor import check_anchor_order

from utils.general import make_divisible, check_file, set_logging

from utils.torch_utils import time_synchronized, fuse_conv_and_bn, model_info, scale_img, initialize_weights, \

    select_device, copy_attr

try:

    import thop  # for FLOPS computation

except ImportError:

    thop = None

class Detect(nn.Module):

    stride = None  # strides computed during build

    export = False  # onnx export

    def __init__(self, nc=4, anchors=(), ch=()):  # detection layer

        super(Detect, self).__init__()

        self.nc = nc  # number of classes

        self.no = nc + 5  # number of outputs per anchor

        self.nl = len(anchors)  # number of detection layers

        self.na = len(anchors[0]) // 2  # number of anchors

        self.grid = [torch.zeros(1)] * self.nl  # init grid

        a = torch.tensor(anchors).float().view(self.nl, -1, 2)

        self.register_buffer('anchors', a)  # shape(nl,na,2)

        self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2))  # shape(nl,1,na,1,1,2)

       # self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch)  # output conv

        self.m = nn.ModuleList(nn.Conv2d(out, self.no * self.na, 1) for out in ch) 

        #self.m2 = nn.ModuleList(nn.Concat(in_ther, self.no * self.na, 1) for in_ther in ch) 

#class TwoInputs(nn.Module):

 #   def __init__(self):

  #    super(TwoInputs, self).__init__()

      #self.conv = nn.Conv2d( ... )  # set up your layer here

     # self.fc1 = nn.Linear( ... )  # set up first FC layer

     # self.fc2 = nn.Linear( ... )  

    def forward(self,in_rgb,in_ther):

        combined = torch.cat((in_rgb.view(in_rgb.size(0), -1), in_ther.view(in_ther.size(0), -1)), dim=1)

        out = self.Concat(combined)

        return out   

    def forward(self, in_rgb,in_ther):

        # x = x.copy()  # for profiling

        z = []  # inference output

        self.training |= self.export

        for i in range(self.nl):

            in_rgb[i] = self.m[i](in_rgb[i])

            in_ther[i] = self.m[i](in_ther[i])   # conv

            bs, _, ny, nx = in_rgb[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)

            bs, _, ny, nx = in_ther[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)

            in_rgb[i] = in_rgb[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

            in_ther[i] = in_ther[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

            if not self.training:  # inference

                if self.grid[i].shape[2:4] != x[i].shape[2:4]:

                    self.grid[i] = self._make_grid(nx, ny).to(x[i].device)

                y = x[i].sigmoid()

                y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(x[i].device)) * self.stride[i]  # xy

                y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh

                z.append(y.view(bs, -1, self.no))

        return in_rgb,in_ther if self.training else (torch.cat(z, 1), in_rgb,in_ther)

    @staticmethod

    def _make_grid(nx=20, ny=20):

        yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])

        return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()

class Model(nn.Module):

    def __init__(self, cfg='yolov3.yaml', ch=4, nc=None):  # model, input channels, number of classes

        super(Model, self).__init__()

        if isinstance(cfg, dict):

            self.yaml = cfg  # model dict

        else:  # is *.yaml

            import yaml  # for torch hub

            self.yaml_file = Path(cfg).name

            with open(cfg) as f:

                self.yaml = yaml.load(f, Loader=yaml.FullLoader)  # model dict

        # Define model

        ch = self.yaml['ch'] = self.yaml.get('ch', ch)  # input channels

        if nc and nc != self.yaml['nc']:

            logger.info('Overriding model.yaml nc=%g with nc=%g' % (self.yaml['nc'], nc))

            self.yaml['nc'] = nc  # override yaml value

        self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch])  # model, savelist

        self.names = [str(i) for i in range(self.yaml['nc'])]  # default names

        #print([x.shape for x in self.forward(torch.zeros(1, ch, 64, 64))])

       # print([out.shape for out in self.forward(torch.zeros(1, ch, 64, 64))])

        # Build strides, anchors

        m = self.model[-1]  # Detect()

        if isinstance(m, Detect):

            s = 256  # 2x min stride

            m.stride = torch.tensor([s / in_rgb.shape[-2] for in_rgb in self.forward(torch.zeros(1, ch, s, s))])  # forward

            m.stride = torch.tensor([s / in_ther.shape[-2] for in_ther in self.forward(torch.zeros(1, ch, s, s))])  # forward

            m.anchors /= m.stride.view(-1, 1, 1)

            check_anchor_order(m)

            self.stride = m.stride

            self._initialize_biases()  # only run once

            #print('Strides: %s' % m.stride.tolist())

        # Init weights, biases

        initialize_weights(self)

        self.info()

        logger.info('')

    def forward(self,in_rgb, in_ther, augment=False, profile=False):

        if augment:

            img_size = out.shape[-2:]  # height, width

            s = [1, 0.83, 0.67]  # scales

            f = [None, 3, None]  # flips (2-ud, 3-lr)

            y = []  # outputs

            for si, fi in zip(s, f):

                xi = scale_img(in_rgb.flip(fi) if fi else in_rgb, si, gs=int(self.stride.max()))

                xi = scale_img(in_ther.flip(fi) if fi else in_ther, si, gs=int(self.stride.max()))

                yi = self.forward_once(xi)[0]  # forward

                # cv2.imwrite('img%g.jpg' % s, 255 * xi[0].numpy().transpose((1, 2, 0))[:, :, ::-1])  # save

                yi[..., :4] /= si  # de-scale

                if fi == 2:

                    yi[..., 1] = img_size[0] - yi[..., 1]  # de-flip ud

                elif fi == 3:

                    yi[..., 0] = img_size[1] - yi[..., 0]  # de-flip lr

                y.append(yi)

            return torch.cat(y, 1), None  # augmented inference, train

        else:

            return self.forward_once(in_ther, profile)  # single-scale inference, train

    def forward_once(self,profile=False):

        y, dt = [], []  # outputs

        for m in self.model:

            if m.f != -1:  # if not from previous layer

                x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]  # from earlier layers

            if profile:

                o = thop.profile(m, inputs=(x,), verbose=False)[0] / 1E9 * 2 if thop else 0  # FLOPS

                t = time_synchronized()

                for _ in range(10):

                    _ = m(x)

                dt.append((time_synchronized() - t) * 100)

                print('%10.1f%10.0f%10.1fms %-40s' % (o, m.np, dt[-1], m.type))

            x = m(x)  # run

            y.append(x if m.i in self.save else None)  # save output

        if profile:

            print('%.1fms total' % sum(dt))

        return x

    def _initialize_biases(self, cf=None):  # initialize biases into Detect(), cf is class frequency

        # https://arxiv.org/abs/1708.02002 section 3.3

        # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.

        m = self.model[-1]  # Detect() module

        for mi, s in zip(m.m, m.stride):  # from

            b = mi.bias.view(m.na, -1)  # conv.bias(255) to (3,85)

            b.data[:, 4] += math.log(8 / (640 / s) ** 2)  # obj (8 objects per 640 image)

            b.data[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum())  # cls

            mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)

    def _print_biases(self):

        m = self.model[-1]  # Detect() module

        for mi in m.m:  # from

            b = mi.bias.detach().view(m.na, -1).T  # conv.bias(255) to (3,85)

            print(('%6g Conv2d.bias:' + '%10.3g' * 6) % (mi.weight.shape[1], *b[:5].mean(1).tolist(), b[5:].mean()))

    # def _print_weights(self):

    #     for m in self.model.modules():

    #         if type(m) is Bottleneck:

    #             print('%10.3g' % (m.w.detach().sigmoid() * 2))  # shortcut weights

    def fuse(self):  # fuse model Conv2d() + BatchNorm2d() layers

        print('Fusing layers... ')

        for m in self.model.modules():

            if type(m) is Conv and hasattr(m, 'bn'):

                m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv

                delattr(m, 'bn')  # remove batchnorm

                m.forward = m.fuseforward  # update forward

        self.info()

        return self

    def nms(self, mode=True):  # add or remove NMS module

        present = type(self.model[-1]) is NMS  # last layer is NMS

        if mode and not present:

            print('Adding NMS... ')

            m = NMS()  # module

            m.f = -1  # from

            m.i = self.model[-1].i + 1  # index

            self.model.add_module(name='%s' % m.i, module=m)  # add

            self.eval()

        elif not mode and present:

            print('Removing NMS... ')

            self.model = self.model[:-1]  # remove

        return self

    def autoshape(self):  # add autoShape module

        print('Adding autoShape... ')

        m = autoShape(self)  # wrap model

        copy_attr(m, self, include=('yaml', 'nc', 'hyp', 'names', 'stride'), exclude=())  # copy attributes

        return m

    def info(self, verbose=False, img_size=640):  # print model information

        model_info(self, verbose, img_size)

def parse_model(d, ch):  # model_dict, input_channels(3)

    logger.info('\n%3s%18s%3s%10s  %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))

    anchors, nc, gd, gw = d['anchors'], d['nc'], d['depth_multiple'], d['width_multiple']

    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors  # number of anchors

    no = na * (nc + 5)  # number of outputs = anchors * (classes + 5)

    layers, save, c2 = [], [], ch[-1]  # layers, savelist, ch out

    for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']):  # from, number, module, args

        m = eval(m) if isinstance(m, str) else m  # eval strings

        for j, a in enumerate(args):

            try:

                args[j] = eval(a) if isinstance(a, str) else a  # eval strings

            except:

                pass

        n = max(round(n * gd), 1) if n > 1 else n  # depth gain

        if m in [Conv, Bottleneck, SPP, DWConv, MixConv2d, Focus, CrossConv, BottleneckCSP, C3]:

            c1, c2 = ch[f], args[0]

            # Normal

            # if i > 0 and args[0] != no:  # channel expansion factor

            #     ex = 1.75  # exponential (default 2.0)

            #     e = math.log(c2 / ch[1]) / math.log(2)

            #     c2 = int(ch[1] * ex ** e)

            # if m != Focus:

            c2 = make_divisible(c2 * gw, 8) if c2 != no else c2

            # Experimental

            # if i > 0 and args[0] != no:  # channel expansion factor

            #     ex = 1 + gw  # exponential (default 2.0)

            #     ch1 = 32  # ch[1]

            #     e = math.log(c2 / ch1) / math.log(2)  # level 1-n

            #     c2 = int(ch1 * ex ** e)

            # if m != Focus:

            #     c2 = make_divisible(c2, 8) if c2 != no else c2

            args = [c1, c2, *args[1:]]

            if m in [BottleneckCSP, C3]:

                args.insert(2, n)

                n = 1

        elif m is nn.BatchNorm2d:

            args = [ch[f]]

        elif m is Concat:

            c2 = sum([ch[x if x < 0 else x + 1] for x in f])

        elif m is Detect:

            args.append([ch[x + 1] for x in f])

            if isinstance(args[1], int):  # number of anchors

                args[1] = [list(range(args[1] * 2))] * len(f)

        elif m is Contract:

            c2 = ch[f if f < 0 else f + 1] * args[0] ** 2

        elif m is Expand:

            c2 = ch[f if f < 0 else f + 1] // args[0] ** 2

        else:

            c2 = ch[f if f < 0 else f + 1]

        m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args)  # module

        t = str(m)[8:-2].replace('__main__.', '')  # module type

        np = sum([x.numel() for x in m_.parameters()])  # number params

        m_.i, m_.f, m_.type, m_.np = i, f, t, np  # attach index, 'from' index, type, number params

        logger.info('%3s%18s%3s%10.0f  %-40s%-30s' % (i, f, n, np, t, args))  # print

        save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)  # append to savelist

        layers.append(m_)

        ch.append(c2)

    return nn.Sequential(*layers), sorted(save)

if __name__ == '__main__':

    parser = argparse.ArgumentParser()

    parser.add_argument('--cfg', type=str, default='yolov3.yaml', help='model.yaml')

    parser.add_argument('--device', default='0', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')

    opt = parser.parse_args()

    opt.cfg = check_file(opt.cfg)  # check file

    set_logging()

    device = select_device(opt.device)

    # Create model

    model = Model(opt.cfg).to(device)

    model.train()

    # Profile

    # img = torch.rand(8 if torch.cuda.is_available() else 1, 3, 640, 640).to(device)

    # y = model(img, profile=True)

    # Tensorboard

    # from torch.utils.tensorboard import SummaryWriter

    # tb_writer = SummaryWriter()

    # print("Run 'tensorboard --logdir=models/runs' to view tensorboard at http://localhost:6006/")

    # tb_writer.add_graph(model.model, img)  # add model to tensorboard

    # tb_writer.add_image('test', img[0], dataformats='CWH')  # add model to tensorboard
Traceback (most recent call last):
  File "train.py", line 538, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 79, in train
    model = Model(opt.cfg or ckpt['model'].yaml, ch=4, nc=nc).to(device)  # create (ch=3 change to 4)
  File "/content/drive/My Drive/yolov3/models/yolo.py", line 112, in __init__
    m.stride = torch.tensor([s / in_rgb.shape[-2] for in_rgb in self.forward(torch.zeros(1, ch, s, s))])  # forward
TypeError: forward() missing 1 required positional argument: 'in_ther'

Based on the error message, you are not passing the required in_ther argument to the forward method of your model.
In particular, this line of code is failing:

print([x.shape for x in self.forward(torch.zeros(1, ch, 64, 64))])

since a single tensor is passed to self.forward.

I changed it to:

   print([out.shape for out in self.forward(torch.zeros(1, 3, 64, 64),torch.zeros(1, 1, 64, 64))])

and I'm sure that I set the input channels to 4 in my code (3 for the RGB image and 1 for the thermal image), but the model only reads 3 channels:

Traceback (most recent call last):
  File "train.py", line 641, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 79, in train
    model = Model(opt.cfg or ckpt['model'].yaml, ch=4, nc=nc).to(device)  # create (ch=3 change to 4)
  File "/content/drive/My Drive/yolov3/models/yolo.py", line 111, in __init__
    print([out.shape for out in self.forward(torch.zeros(1, 3, 64, 64),torch.zeros(1, 1, 64, 64))])
  File "/content/drive/My Drive/yolov3/models/yolo.py", line 156, in forward
    return self.forward_once(in_rgb,in_ther,profile)  # single-scale inference, train
  File "/content/drive/My Drive/yolov3/models/yolo.py", line 177, in forward_once
    in_rgb = m(in_rgb)  # run
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/drive/My Drive/yolov3/models/common.py", line 37, in forward
    return self.act(self.bn(self.conv(x)))
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py", line 399, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py", line 396, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [32, 4, 3, 3], expected input[1, 3, 64, 64] to have 4 channels, but got 3 channels instead

yolo.py

import argparse

import logging

import sys

from copy import deepcopy

from pathlib import Path

sys.path.append('./')  # to run '$ python *.py' files in subdirectories

logger = logging.getLogger(__name__)

from models.common import *

from models.experimental import MixConv2d, CrossConv

from utils.autoanchor import check_anchor_order

from utils.general import make_divisible, check_file, set_logging

from utils.torch_utils import time_synchronized, fuse_conv_and_bn, model_info, scale_img, initialize_weights, \

    select_device, copy_attr

try:

    import thop  # for FLOPS computation

except ImportError:

    thop = None

class Detect(nn.Module):

    stride = None  # strides computed during build

    export = False  # onnx export

    def __init__(self, nc=4, anchors=(), ch=()):  # detection layer

        super(Detect, self).__init__()

        self.nc = nc  # number of classes

        self.no = nc + 5  # number of outputs per anchor

        self.nl = len(anchors)  # number of detection layers

        self.na = len(anchors[0]) // 2  # number of anchors

        self.grid = [torch.zeros(1)] * self.nl  # init grid

        a = torch.tensor(anchors).float().view(self.nl, -1, 2)

        self.register_buffer('anchors', a)  # shape(nl,na,2)

        self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2))  # shape(nl,1,na,1,1,2)

       # self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch)  # output conv

        #self.m = nn.Conv2d(4, 64, kernel_size=5, stride=2, padding=3, bias=False)

        self.m = nn.ModuleList(nn.Conv2d(out, self.no * self.na, 1) for out in ch) 

        #self.m2 = nn.ModuleList(nn.Concat(in_ther, self.no * self.na, 1) for in_ther in ch) 

#class TwoInputs(nn.Module):

 #   def __init__(self):

  #    super(TwoInputs, self).__init__()

      #self.conv = nn.Conv2d( ... )  # set up your layer here

     # self.fc1 = nn.Linear( ... )  # set up first FC layer

     # self.fc2 = nn.Linear( ... )  

    def forward(self, in_rgb, in_ther):

        combined = torch.cat((in_rgb.view(in_rgb.size(0), -1), in_ther.view(in_ther.size(0), -1)), dim=1)

        out = self.Concat(combined)

        return out   

    def forward(self, in_rgb,in_ther):

        # x = x.copy()  # for profiling

        z = []  # inference output

        self.training |= self.export

        for i in range(self.nl):

            in_rgb[i] = self.m[i](in_rgb[i])

            in_ther[i] = self.m[i](in_ther[i])   # conv

            bs, _, ny, nx = in_rgb[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)

            bs, _, ny, nx = in_ther[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)

            in_rgb[i] = in_rgb[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

            in_ther[i] = in_ther[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

            if not self.training:  # inference

                if self.grid[i].shape[2:4] != in_rgb[i].shape[2:4]:

                    self.grid[i] = self._make_grid(nx, ny).to(in_rgb[i].device)

                if self.grid[i].shape[2:4] != in_ther[i].shape[2:4]:

                    self.grid[i] = self._make_grid(nx, ny).to(in_ther[i].device)

                y = in_rgb[i].sigmoid()

                y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(in_rgb[i].device)) * self.stride[i]  # xy

                y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh

                y = in_ther[i].sigmoid()

                y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i].to(in_ther[i].device)) * self.stride[i]  # xy

                y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh

                z.append(y.view(bs, -1, self.no))

        return in_rgb,in_ther if self.training else (torch.cat(z, 1), in_rgb,in_ther)

    @staticmethod

    def _make_grid(nx=20, ny=20):

        yv, xv = torch.meshgrid([torch.arange(ny), torch.arange(nx)])

        return torch.stack((xv, yv), 2).view((1, 1, ny, nx, 2)).float()

class Model(nn.Module):

    def __init__(self, cfg='yolov3.yaml', ch=4, nc=None):  # model, input channels, number of classes

        super(Model, self).__init__()

        if isinstance(cfg, dict):

            self.yaml = cfg  # model dict

        else:  # is *.yaml

            import yaml  # for torch hub

            self.yaml_file = Path(cfg).name

            with open(cfg) as f:

                self.yaml = yaml.load(f, Loader=yaml.FullLoader)  # model dict

        # Define model

        ch = self.yaml['ch'] = self.yaml.get('ch', ch)  # input channels

        if nc and nc != self.yaml['nc']:

            logger.info('Overriding model.yaml nc=%g with nc=%g' % (self.yaml['nc'], nc))

            self.yaml['nc'] = nc  # override yaml value

        self.model, self.save = parse_model(deepcopy(self.yaml), ch=[ch])  # model, savelist

        self.names = [str(i) for i in range(self.yaml['nc'])]  # default names

        print([out.shape for out in self.forward(torch.zeros(1, 3, 64, 64),torch.zeros(1, 1, 64, 64))])

      #  print([in_ther.shape for in_ther in self.forward(torch.zeros(1, 1, 64, 64),torch.zeros(1, 3, 64, 64))])

       # print([out.shape for out in self.forward(torch.zeros(1, ch, 64, 64))])

        # Build strides, anchors

        m = self.model[-1]  # Detect()

        if isinstance(m, Detect):

            s = 256  # 2x min stride

            m.stride = torch.tensor([s / in_rgb.shape[-2] for in_rgb in self.forward(torch.zeros(1, 3, s, s))])  # forward

            m.stride = torch.tensor([s / in_ther.shape[-2] for in_ther in self.forward(torch.zeros(1, 1, s, s))])  # forward

            m.anchors /= m.stride.view(-1, 1, 1)

            check_anchor_order(m)

            self.stride = m.stride

            self._initialize_biases()  # only run once

            #print('Strides: %s' % m.stride.tolist())

        # Init weights, biases

        initialize_weights(self)

        self.info()

        logger.info('')

    def forward(self,in_rgb, in_ther, augment=False, profile=False):

        if augment:

            img_size = out.shape[-2:]  # height, width

            s = [1, 0.83, 0.67]  # scales

            f = [None, 3, None]  # flips (2-ud, 3-lr)

            y = []  # outputs

            for si, fi in zip(s, f):

                xi = scale_img(in_rgb.flip(fi) if fi else in_rgb, si, gs=int(self.stride.max()))

                xi = scale_img(in_ther.flip(fi) if fi else in_ther, si, gs=int(self.stride.max()))

                yi = self.forward_once(xi)[0]  # forward

                # cv2.imwrite('img%g.jpg' % s, 255 * xi[0].numpy().transpose((1, 2, 0))[:, :, ::-1])  # save

                yi[..., :4] /= si  # de-scale

                if fi == 2:

                    yi[..., 1] = img_size[0] - yi[..., 1]  # de-flip ud

                elif fi == 3:

                    yi[..., 0] = img_size[1] - yi[..., 0]  # de-flip lr

                y.append(yi)

            return torch.cat(y, 1), None  # augmented inference, train

        else:

            return self.forward_once(in_rgb,in_ther,profile)  # single-scale inference, train

    def forward_once(self, in_rgb, in_ther, profile=False):

        y, dt = [], []  # outputs

        for m in self.model:

            if m.f != -1:  # if not from previous layer

                in_rgb = y[m.f] if isinstance(m.f, int) else [in_rgb if j == -1 else y[j] for j in m.f]  # from earlier layers

                in_ther = y[m.f] if isinstance(m.f, int) else [in_ther if j == -1 else y[j] for j in m.f]  # from earlier layers

            if profile:

                o = thop.profile(m, inputs=(in_rgb,), verbose=False)[0] / 1E9 * 2 if thop else 0  # FLOPS

                o = thop.profile(m, inputs=(in_ther,), verbose=False)[0] / 1E9 * 2 if thop else 0  # FLOPS

                t = time_synchronized()

                for _ in range(10):

                    _ = m(in_rgb)

                    _ = m(in_ther)

                dt.append((time_synchronized() - t) * 100)

                print('%10.1f%10.0f%10.1fms %-40s' % (o, m.np, dt[-1], m.type))

            in_rgb = m(in_rgb)  # run

            in_ther = m(in_ther)  # run

            y.append(in_rgb if m.i in self.save else None)  # save output

            y.append(in_ther if m.i in self.save else None)  # save output

        if profile:

            print('%.1fms total' % sum(dt))

        return in_rgb,in_ther

    def _initialize_biases(self, cf=None):  # initialize biases into Detect(), cf is class frequency

        # https://arxiv.org/abs/1708.02002 section 3.3

        # cf = torch.bincount(torch.tensor(np.concatenate(dataset.labels, 0)[:, 0]).long(), minlength=nc) + 1.

        m = self.model[-1]  # Detect() module

        for mi, s in zip(m.m, m.stride):  # from

            b = mi.bias.view(m.na, -1)  # conv.bias(255) to (3,85)

            b.data[:, 4] += math.log(8 / (640 / s) ** 2)  # obj (8 objects per 640 image)

            b.data[:, 5:] += math.log(0.6 / (m.nc - 0.99)) if cf is None else torch.log(cf / cf.sum())  # cls

            mi.bias = torch.nn.Parameter(b.view(-1), requires_grad=True)

    def _print_biases(self):

        m = self.model[-1]  # Detect() module

        for mi in m.m:  # from

            b = mi.bias.detach().view(m.na, -1).T  # conv.bias(255) to (3,85)

            print(('%6g Conv2d.bias:' + '%10.3g' * 6) % (mi.weight.shape[1], *b[:5].mean(1).tolist(), b[5:].mean()))

    # def _print_weights(self):

    #     for m in self.model.modules():

    #         if type(m) is Bottleneck:

    #             print('%10.3g' % (m.w.detach().sigmoid() * 2))  # shortcut weights

    def fuse(self):  # fuse model Conv2d() + BatchNorm2d() layers

        print('Fusing layers... ')

        for m in self.model.modules():

            if type(m) is Conv and hasattr(m, 'bn'):

                m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv

                delattr(m, 'bn')  # remove batchnorm

                m.forward = m.fuseforward  # update forward

        self.info()

        return self

    def nms(self, mode=True):  # add or remove NMS module

        present = type(self.model[-1]) is NMS  # last layer is NMS

        if mode and not present:

            print('Adding NMS... ')

            m = NMS()  # module

            m.f = -1  # from

            m.i = self.model[-1].i + 1  # index

            self.model.add_module(name='%s' % m.i, module=m)  # add

            self.eval()

        elif not mode and present:

            print('Removing NMS... ')

            self.model = self.model[:-1]  # remove

        return self

    def autoshape(self):  # add autoShape module

        print('Adding autoShape... ')

        m = autoShape(self)  # wrap model

        copy_attr(m, self, include=('yaml', 'nc', 'hyp', 'names', 'stride'), exclude=())  # copy attributes

        return m

    def info(self, verbose=False, img_size=640):  # print model information

        model_info(self, verbose, img_size)

def parse_model(d, ch):  # model_dict, input_channels(3)

    logger.info('\n%3s%18s%3s%10s  %-40s%-30s' % ('', 'from', 'n', 'params', 'module', 'arguments'))

    anchors, nc, gd, gw = d['anchors'], d['nc'], d['depth_multiple'], d['width_multiple']

    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors  # number of anchors

    no = na * (nc + 5)  # number of outputs = anchors * (classes + 5)

    layers, save, c2 = [], [], ch[-1]  # layers, savelist, ch out

    for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']):  # from, number, module, args

        m = eval(m) if isinstance(m, str) else m  # eval strings

        for j, a in enumerate(args):

            try:

                args[j] = eval(a) if isinstance(a, str) else a  # eval strings

            except:

                pass

        n = max(round(n * gd), 1) if n > 1 else n  # depth gain

        if m in [Conv, Bottleneck, SPP, DWConv, MixConv2d, Focus, CrossConv, BottleneckCSP, C3]:

            c1, c2 = ch[f], args[0]

            # Normal

            # if i > 0 and args[0] != no:  # channel expansion factor

            #     ex = 1.75  # exponential (default 2.0)

            #     e = math.log(c2 / ch[1]) / math.log(2)

            #     c2 = int(ch[1] * ex ** e)

            # if m != Focus:

            c2 = make_divisible(c2 * gw, 8) if c2 != no else c2

            # Experimental

            # if i > 0 and args[0] != no:  # channel expansion factor

            #     ex = 1 + gw  # exponential (default 2.0)

            #     ch1 = 32  # ch[1]

            #     e = math.log(c2 / ch1) / math.log(2)  # level 1-n

            #     c2 = int(ch1 * ex ** e)

            # if m != Focus:

            #     c2 = make_divisible(c2, 8) if c2 != no else c2

            args = [c1, c2, *args[1:]]

            if m in [BottleneckCSP, C3]:

                args.insert(2, n)

                n = 1

        elif m is nn.BatchNorm2d:

            args = [ch[f]]

        elif m is Concat:

            c2 = sum([ch[x if x < 0 else x + 1] for x in f])

        elif m is Detect:

            args.append([ch[x + 1] for x in f])

            if isinstance(args[1], int):  # number of anchors

                args[1] = [list(range(args[1] * 2))] * len(f)

        elif m is Contract:

            c2 = ch[f if f < 0 else f + 1] * args[0] ** 2

        elif m is Expand:

            c2 = ch[f if f < 0 else f + 1] // args[0] ** 2

        else:

            c2 = ch[f if f < 0 else f + 1]

        m_ = nn.Sequential(*[m(*args) for _ in range(n)]) if n > 1 else m(*args)  # module

        t = str(m)[8:-2].replace('__main__.', '')  # module type

        np = sum([x.numel() for x in m_.parameters()])  # number params

        m_.i, m_.f, m_.type, m_.np = i, f, t, np  # attach index, 'from' index, type, number params

        logger.info('%3s%18s%3s%10.0f  %-40s%-30s' % (i, f, n, np, t, args))  # print

        save.extend(x % i for x in ([f] if isinstance(f, int) else f) if x != -1)  # append to savelist

        layers.append(m_)

        ch.append(c2)

    return nn.Sequential(*layers), sorted(save)

if __name__ == '__main__':

    parser = argparse.ArgumentParser()

    parser.add_argument('--cfg', type=str, default='yolov3.yaml', help='model.yaml')

    parser.add_argument('--device', default='0', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')

    opt = parser.parse_args()

    opt.cfg = check_file(opt.cfg)  # check file

    set_logging()

    device = select_device(opt.device)

    # Create model

    model = Model(opt.cfg).to(device)

    model.train()

train.py

# Multi-scale

            if opt.multi_scale:

                sz = random.randrange(imgsz * 0.5, imgsz * 1.5 + gs) // gs * gs  # size

                sf = sz / max(imgs.shape[2:])  # scale factor

                if sf != 1:

                    ns = [math.ceil(x * sf / gs) * gs for x in imgs.shape[2:]]  # new shape (stretched to gs-multiple)

                    imgs = F.interpolate(imgs, size=ns, mode='bilinear', align_corners=False)
#forward
            with amp.autocast(enabled=cuda):

                pred = model(imgs)  # forward

                loss, loss_items = compute_loss(pred, targets.to(device), model)  # loss scaled by batch_size

                if rank != -1:

                    loss *= opt.world_size  # gradient averaged between devices in DDP mode

                if opt.quad:

                    loss *= 4.

My goal is to process 2 inputs separately and then combine them in a middle layer, like this:

input_rgb------------>

                --Concat

input_thermal-------->

I think you might be using 4 channels in your model input, but the posted line of code initializes a new tensor with 3 channels via torch.zeros(1, 3, 64, 64), which raises the error.
I don't know what this line of code is doing (it seems to check the output shapes), so you might want to change the zero tensor to have 4 channels.
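
If the forward really takes the two inputs separately (forward(in_rgb, in_ther)), the dummy shape check would also have to pass two tensors whose channel counts match the first conv of each branch. Here is a minimal standalone sketch (not your exact model, just the rgb/thermal-then-concat pattern; all layer sizes are made up for illustration):

```python
import torch
import torch.nn as nn

# Toy two-stream model: RGB and thermal are processed separately and
# concatenated along the channel dimension in a middle layer.
class TwoStreamToy(nn.Module):
    def __init__(self):
        super().__init__()
        self.rgb_branch = nn.Conv2d(3, 8, 3, padding=1)   # expects a 3-channel input
        self.ther_branch = nn.Conv2d(1, 8, 3, padding=1)  # expects a 1-channel input
        self.head = nn.Conv2d(16, 4, 3, padding=1)        # runs on the concatenated features

    def forward(self, in_rgb, in_ther):
        w = torch.relu(self.rgb_branch(in_rgb))
        k = torch.relu(self.ther_branch(in_ther))
        x = torch.cat([w, k], dim=1)                      # 8 + 8 = 16 channels
        return self.head(x)

model = TwoStreamToy()
# The dummy tensors used for the shape check must match the per-branch channel counts:
out = model(torch.zeros(1, 3, 64, 64), torch.zeros(1, 1, 64, 64))
print(out.shape)  # torch.Size([1, 4, 64, 64])
```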

hi sir, i am unable to rectify this error can you please tell the solution.

class CovidModel(pl.LightningModule):
    def __init__(self, weight=1):
        super().__init__()
        
        self.model = torchvision.models.resnet18()
        # change conv1 from 3 to 1 input channels
        self.model.conv1 = torch.nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
        # change out_feature of the last fully connected layer (called fc in resnet18) from 1000 to 1
        self.model.fc = torch.nn.Linear(in_features=512, out_features=1)
        
        self.optimizer = torch.optim.Adam(self.model.parameters(), lr=1e-4)
        self.loss_fn = torch.nn.BCEWithLogitsLoss(pos_weight=torch.tensor([weight]))
        
        # simple accuracy computation
        self.train_acc = torchmetrics.Accuracy()
        self.val_acc = torchmetrics.Accuracy()

    def forward(self, data):
        pred = self.model(data)
        return pred
    
    def training_step(self, batch, batch_idx):
        x_ray, label = batch
        label = label.float()  # Convert label to float (just needed for loss computation)
        pred = self(x_ray)[:,0]  # Prediction: Make sure prediction and label have same shape
        loss = self.loss_fn(pred, label)  # Compute the loss
        
        # Log loss and batch accuracy
        self.log("Train Loss", loss)
        self.log("Step Train Acc", self.train_acc(torch.sigmoid(pred), label.int()))
        return loss
    
    
    def training_epoch_end(self, outs):
        # After one epoch compute the whole train_data accuracy
        self.log("Train Acc", self.train_acc.compute())
        
        
    def validation_step(self, batch, batch_idx):
        # Same steps as in the training_step
        x_ray, label = batch
       
        label = label.float()
        pred = self(x_ray)[:,0]  # make sure prediction and label have same shape

        loss = self.loss_fn(pred, label)
        
        # Log validation metrics
        self.log("Val Loss", loss)
        self.log("Step Val Acc", self.val_acc(torch.sigmoid(pred), label.int()))
        return loss
    
    def validation_epoch_end(self, outs):
        self.log("Val Acc", self.val_acc.compute())
    
    def configure_optimizers(self):
        #Caution! You always need to return a list here (just pack your optimizer into one :))
        return [self.optimizer]



RuntimeError                              Traceback (most recent call last)
<ipython-input-...> in <module>()
----> 1 trainer.fit(model, train_loader, val_loader, ckpt_path="/content/drive/MyDrive/covid/Xray/Data/weights_3.ckpt")

22 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
    442                             _pair(0), self.dilation, self.groups)
    443         return F.conv2d(input, weight, bias, self.stride,
--> 444                         self.padding, self.dilation, self.groups)
    445
    446     def forward(self, input: Tensor) -> Tensor:

RuntimeError: Given groups=1, weight of size [64, 1, 7, 7], expected input[64, 3, 224, 224] to have 1 channels, but got 3 channels instead

In your model you are replacing the first conv layer with a new one expecting an input with a single channel.
However, the error message shows that your input has 3 channels, so I'm unsure what the use case is and why you've replaced the original conv layer.
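
If it helps, here is a rough sketch of the two usual directions (both are assumptions about your data, not drop-in fixes for your training pipeline):

```python
import torch
import torchvision

# Option A (assumption: the x-rays should be single-channel): keep the
# 1-channel conv1 from your model and make the dataloader actually deliver
# 1-channel tensors, e.g. via a Grayscale transform.
transform = torchvision.transforms.Compose([
    torchvision.transforms.Grayscale(num_output_channels=1),
    torchvision.transforms.ToTensor(),
])

# Option B (assumption: the images really are RGB): leave conv1 with its
# default 3 input channels and only replace the classification head.
model = torchvision.models.resnet18()
model.fc = torch.nn.Linear(in_features=512, out_features=1)

x = torch.zeros(64, 3, 224, 224)  # the input shape from the error message
print(model(x).shape)             # torch.Size([64, 1])
```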

I am trying to find out a solution for my problem " RuntimeError: Given groups=1, weight of size [64, 2, 11], expected input[1, 209, 8] to have 2 channels, but got 209 channels instead", after seeing so many posts in this forum, but I am unable to figure out the issue.

Here’s my code:

from __future__ import division
import argparse
import torch
from torch.utils import model_zoo
from torch.autograd import Variable
from torch.autograd import Variable
from torch.nn import Linear, ReLU, CrossEntropyLoss, Sequential, Conv2d, MaxPool2d, Module, Softmax, BatchNorm2d, Dropout
from torch.optim import Adam, SGD
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

import models
import utils
import os
import pickle
import pandas as pd

from Lenet import *

from Utils import *

import scipy.io
import numpy as np
import matplotlib.pyplot as plt

from data_loader import get_train_test_loader, get_office31_dataloader
from sklearn.utils import resample

import warnings
warnings.filterwarnings("ignore")

import logging
handler=logging.basicConfig(level=logging.INFO)
lgr = logging.getLogger(__name__)

from sklearn.metrics import roc_auc_score, log_loss, roc_auc_score, roc_curve, auc,accuracy_score
from utils import accuracy, Tracker

from torchmetrics.classification import BinaryAccuracy

########################################################################

fnameand = 'vectors_Qv_vlen1_updated_location_variance_android.csv'
fnameios = 'vectors_Qv_vlen1_updated_location_variance_ios.csv'

#figure, ax = plt.subplots()
dfand = pd.read_csv(fnameand, sep=',')
dfios = pd.read_csv(fnameios, sep=',')

# upsampling

dfandupsample = resample(dfand,replace=True,n_samples=len(dfios),random_state=42)

Xs = dfios[["location_variance0","time_spent_moving0","total_distance0","AMS0","unique_locations0","entropy0","normalized_entropy0","time_home0"]]
ys = dfios[['finallabel']]

# changing labels to 1 or 0
ys.loc[ys["finallabel"] == "improved", "finallabel"] = 0
ys.loc[ys["finallabel"] == "nonImproved", "finallabel"] = 1

ys = np.array(ys).astype("float32")

#dfand = pd.read_csv(fnameand, sep=',')
Xt = dfandupsample[["location_variance0","time_spent_moving0","total_distance0","AMS0","unique_locations0","entropy0","normalized_entropy0","time_home0"]]
yt = dfandupsample[['finallabel']]

# changing labels to 1 or 0
yt.loc[yt["finallabel"] == "improved", "finallabel"] = 0
yt.loc[yt["finallabel"] == "nonImproved", "finallabel"] = 1

yt = np.array(yt).astype("float32")

trainX, trainY = Xs, ys
targetX,targetY=Xt,yt

trainX = np.array(trainX).astype("float32").reshape(1,209,8)

targetX = np.array(targetX).astype("float32").reshape(1,209,8)

print (trainX.shape,trainY.shape,targetX.shape,targetY.shape)

########################################################################################

# Convert the np arrays into the correct dimension and type

# Note that BCEloss requires Float in X as well as in y

def XnumpyToTensor(x_data_np):
    x_data_np = np.array(x_data_np.values, dtype=np.float32)
    print(x_data_np.shape)
    print(type(x_data_np))

    x_data_np.reshape(1,209,8)

    lgr.info("Using the CPU")
    X_tensor = Variable(torch.from_numpy(x_data_np))  # Note the conversion for pytorch

    print(type(X_tensor.data))  # should be 'torch.cuda.FloatTensor'
    print((X_tensor.data.shape))  # torch.Size([108405, 29])
    return X_tensor

# Convert the np arrays into the correct dimension and type

# Note that BCEloss requires Float in X as well as in y

def YnumpyToTensor(y_data_np):
    y_data_np = y_data_np.reshape((y_data_np.shape[0], 1))  # Must be reshaped for PyTorch!
    print(y_data_np.shape)
    print(type(y_data_np))

    lgr.info("Using the CPU")
    #     Y = Variable(torch.squeeze(torch.from_numpy(y_data_np).type(torch.LongTensor)))
    Y_tensor = Variable(torch.from_numpy(y_data_np)).type(torch.FloatTensor)  # BCEloss requires Float

    print(type(Y_tensor.data))  # should be 'torch.cuda.FloatTensor'
    print(y_data_np.shape)
    print(type(y_data_np))
    return Y_tensor

#######################################################################################
use_cuda=False
X_tensor_train= XnumpyToTensor(trainX) # default order is NBC for a 3d tensor, but we have a 2d tensor
X_shape=X_tensor_train.data.size()

# Dimensions

N_FEATURES = trainX.shape[1]  # Number of features for the input layer
NUM_ROWS_TRAINNING = trainX.shape[0]  # Number of rows
N_MULT_FACTOR = 10  # min should be 4; this number has no special meaning except being divisible by 2; increasing it increases the accuracy of the model
N_HIDDEN = N_FEATURES * N_MULT_FACTOR  # Size of first linear layer
N_CNN_KERNEL = 3  # CNN kernel size
MAX_POOL_KERNEL = 4

DEBUG_ON=False

def debug(x):
    if DEBUG_ON:
        print('(x.size():' + str(x.size()))

##########################################################################################

#-----------------------------------------------------------------------------------------

class Net2(nn.Module):
    def __init__(self, num_classes: int = 2, dropout: float = 0.5) -> None:
        super().__init__()
        #_log_api_usage_once(self)
        self.features = nn.Sequential(
            nn.Conv1d(2, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool1d(kernel_size=3, stride=2),
            nn.Conv1d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool1d(kernel_size=3, stride=2),
            nn.Conv1d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv1d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv1d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.avgpool = nn.AdaptiveAvgPool2d((6, 6))
        self.classifier = nn.Sequential(
            nn.Dropout(p=dropout),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=dropout),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

def forward(self, x: torch.Tensor) -> torch.Tensor:
    x = self.features(x)
    x = self.avgpool(x)
    x = torch.flatten(x, 1)
    x = self.classifier(x)
    return x

#--------------------------------------------------------------------------------------

net = Net2()

print("model description---")
print(net)

################################################################################

# global parameters

print (trainX.shape,trainY.shape,targetX.shape,targetY.shape)

trainX = np.array(trainX).astype("float32").reshape(1,209,8)
targetX = np.array(targetX).astype("float32").reshape(1,209,8)

print("after reshape===", trainX.shape, trainY.shape, targetX.shape, targetY.shape)

##################################################################################
import time
start_time = time.time()
epochs=10
all_losses = []

X_tensor_train= torch.from_numpy(trainX)
Y_tensor_train= torch.from_numpy(trainY)

X_tensor_target=torch.from_numpy (targetX)
Y_tensor_target=torch.from_numpy(targetY)

#################################################################

#def train(model, epoch, _lambda):
def train(model, epoch, param):
    discriminative_loss_param = param[0]
    domain_loss_param = param[1]
    adver_loss_param = param[2]

    result = []
    source_out, target_out = net(X_tensor_train), net(X_tensor_target)

#----------------------------------------------------------------

print(X_tensor_train,Y_tensor_train,X_tensor_target)

if __name__ == '__main__':
    discriminative_loss_param = 0.01  ## 0.03 for InstanceBased method, 0.01 for CenterBased method
    domain_loss_param = 8
    adver_loss_param = 0
    param = [discriminative_loss_param, domain_loss_param, adver_loss_param]

    training_statistic = []
    testing_s_statistic = []
    testing_t_statistic = []

    final_res = []
    tracker = Tracker()
    tuf = []
    accuracies_source = []
    accuracies_target = []

    for e in range(0, epochs):
        print("epoch===", e)

        res = train(net, e, param=param)

    ####################################################################

For the model I am inputting my data with 209 rows and 8 columns at a time.

How to resolve this issue?

Your first layer is defined as:

nn.Conv1d(2, 64, kernel_size=11, stride=4, padding=2),

and thus expects an input with 2 channels while you are reshaping your input to:

trainX = np.array(trainX).astype("float32").reshape(1,209,8)

creating a single sample with 209 channels (and a sequence length of 8).
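
To make the layout concrete, here is a small sketch of what nn.Conv1d expects, i.e. (batch, channels, length); the last part, which treats each of your rows as a 1-channel sequence of 8 features, is just one possible interpretation of the data, not necessarily the right one:

```python
import numpy as np
import torch
import torch.nn as nn

conv = nn.Conv1d(2, 64, kernel_size=11, stride=4, padding=2)

x_ok = torch.zeros(1, 2, 209)    # 1 sample, 2 channels, length 209
print(conv(x_ok).shape)          # torch.Size([1, 64, 51])

x_bad = torch.zeros(1, 209, 8)   # dim 1 (= 209) is read as the channel dimension
# conv(x_bad)  # -> RuntimeError: expected input to have 2 channels, but got 209 channels

# One possible (assumed) alternative: 209 samples, each a 1-channel
# sequence of 8 features, with in_channels changed accordingly.
data = np.zeros((209, 8), dtype="float32")
x_alt = torch.from_numpy(data).unsqueeze(1)           # shape (209, 1, 8)
conv_alt = nn.Conv1d(1, 64, kernel_size=3, padding=1)
print(conv_alt(x_alt).shape)                          # torch.Size([209, 64, 8])
```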

PS: you can post code snippets by wrapping them into three backticks ```, which makes debugging easier.