Hey there,
I am trying to create a simple ResNet, possibly with 2-3 residual blocks, but I am having trouble implementing it. Can anyone help me here? Thank you
hey, just a comment: if you post your code and point out where you are having trouble, you might find it easier to get help on this.
Sure, I will do that. Here is what I have so far.
import torch.nn as nn
from torch.nn import Conv2d, BatchNorm2d, ReLU, MaxPool2d

class toy_resnet(nn.Module):
    def __init__(self):
        super(toy_resnet, self).__init__()
        self.conv1 = nn.Sequential(
            Conv2d(3, 8, kernel_size=3, stride=1, padding='same'),
            BatchNorm2d(8),
            ReLU(inplace=True),
            Conv2d(8, 8, kernel_size=3, stride=1, padding='same'),
            BatchNorm2d(8),
            ReLU(inplace=True)
        )
        self.conv2 = nn.Sequential(
            Conv2d(8, 8, kernel_size=3, stride=1, padding='same'),
            BatchNorm2d(8),
            ReLU(inplace=True),
            Conv2d(8, 8, kernel_size=3, stride=1, padding='same'),
            BatchNorm2d(8),
            ReLU(inplace=True)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(24200, 37 * 10),
            nn.Dropout(0.5)
        )
        self.max_pool = MaxPool2d(kernel_size=3, stride=2)

    def forward(self, input):
        x = self.conv1(input.float())
        x = self.max_pool(x)
        res = x
        x = self.conv2(x)    # first residual block
        out = x + res        # skip connection
        x = self.conv2(out)  # conv2 is reused here, so both blocks share weights
        res = x + out
        out = self.max_pool(res)
        x = self.classifier(out)
        return x
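In case it helps to sanity-check the classifier's input size: the Linear layer's 24200 input features correspond to 8 channels of 55x55 feature maps, which works out if the inputs are 3x224x224 (that input size is my assumption, not stated above). A quick standalone shape check of just the conv/pool stack:

```python
import torch
import torch.nn as nn

# Standalone sketch of the spatial path: conv layers with padding='same'
# keep H and W, and each MaxPool2d(kernel_size=3, stride=2) shrinks them.
# Assumes 3x224x224 inputs, which is a guess consistent with 24200.
stack = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, stride=1, padding='same'),
    nn.MaxPool2d(kernel_size=3, stride=2),  # 224 -> 111
    nn.Conv2d(8, 8, kernel_size=3, stride=1, padding='same'),
    nn.MaxPool2d(kernel_size=3, stride=2),  # 111 -> 55
)

with torch.no_grad():
    out = stack(torch.zeros(1, 3, 224, 224))

print(out.shape)             # batch x channels x height x width
print(out.flatten(1).shape)  # flattened size fed to the Linear layer
```

If your inputs have a different resolution, the 24200 in nn.Linear would need to change accordingly.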
thanks, so your model is supposed to have 370 output classes with a dropout layer as the last layer (before softmax, presumably). where are you facing issues here?
Well, I just want to know if I am doing it correctly, and my second concern is that it's not doing as well as a simple network with just a few conv layers and filters.
i guess from the looks of it the implementation seems fine to me; maybe someone else can comment. the network "not doing so well" is too sparse a description to comment on, so i'll make a general comment: networks with more parameters are prone to overfitting, where the training error goes down but the validation error does not keep up compared to a network with fewer parameters. if this is the case, you can search for regularization as a keyword.
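for example, one common first step is L2 regularization via the optimizer's weight_decay parameter. a minimal sketch (the tiny model and the 1e-4 value here are just illustrative assumptions, not taken from your code):

```python
import torch
import torch.nn as nn

# Illustrative model; substitute your own network here.
model = nn.Linear(10, 2)

# weight_decay adds an L2 penalty on the weights at each update,
# which discourages large weights and can reduce overfitting.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
```

other options to search for include increasing dropout, data augmentation, and early stopping.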
good luck
Thank you for your help.