In advance, I apologize for the rookieness of my question; I am a bit new to PyTorch.
I have implemented a simple U-Net model like this:
import torch
import torch.nn as nn

class Unet(nn.Module):
    def __init__(self):
        super(Unet, self).__init__()
        # Down hill 1
        self.conv1 = nn.Conv3d(1, 2, kernel_size=3, stride=1)
        self.conv2 = nn.Conv3d(2, 2, kernel_size=3, stride=1)
        # Down hill 2
        self.conv3 = nn.Conv3d(2, 4, kernel_size=3, stride=1)
        self.conv4 = nn.Conv3d(4, 4, kernel_size=3, stride=1)
        # Up hill 1
        self.upConv1 = nn.Conv3d(4, 2, kernel_size=3, stride=1)
        self.upConv2 = nn.Conv3d(2, 2, kernel_size=3, stride=1)
        # Up hill 2
        self.upConv3 = nn.Conv3d(2, 1, kernel_size=3, stride=1)
        self.upConv4 = nn.Conv3d(1, 1, kernel_size=3, stride=1)
        self.mp = nn.MaxPool3d(kernel_size=3, stride=2, padding=1)
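The forward method (in the same class) just chains these layers; roughly the following sketch (note the "up hill" path here uses plain convolutions, without the upsampling and skip connections a full U-Net would have):

    def forward(self, x):
        # Down hill 1: conv -> conv -> pool
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        x = self.mp(x)
        # Down hill 2: conv -> conv -> pool
        x = torch.relu(self.conv3(x))
        x = torch.relu(self.conv4(x))
        x = self.mp(x)
        # Up hill 1 (no true upsampling or skip connections in this sketch)
        x = torch.relu(self.upConv1(x))
        x = torch.relu(self.upConv2(x))
        # Up hill 2: squash to a single output channel
        x = torch.relu(self.upConv3(x))
        return torch.sigmoid(self.upConv4(x))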
From this I get the impression that I have 8 sets of weights, one for each conv layer:

conv1.weight.data
conv2.weight.data
...
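A quick way to verify that is to enumerate the parameters PyTorch has registered (note that each conv also carries a bias tensor alongside its weight):

# Prints every learnable tensor the optimizer will see:
# conv1.weight, conv1.bias, conv2.weight, conv2.bias, ...
unet = Unet()
for name, param in unet.named_parameters():
    print(name, tuple(param.shape))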
The documentation tells me I can update these weights through optim:
import torch.optim as optim

optimizer = optim.Adam(unet.parameters(), lr=0.1)

optimizer.zero_grad()   # clear any previously accumulated gradients
pred = unet(x)          # idiomatic for unet.forward(x)
loss = MyLossFunction(pred, y)
loss.backward()
optimizer.step()
For some reason I find it hard to believe that this will update all 8 sets of weights correctly.

Theoretically, each weight needs to be optimized using the derivative of the loss function with respect to that weight. I can't seem to find any connection between the loss function and the optimizer in that code, so how on earth can the optimizer figure out the derivatives? The optimizer only seems to have knowledge of the model parameters, not of the loss function.
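To make the question concrete, here is a minimal runnable version of that loop (a dummy 64³ input, and pred.mean() standing in for MyLossFunction):

import torch
import torch.optim as optim

unet = Unet()
optimizer = optim.Adam(unet.parameters(), lr=0.1)

# Dummy 5D input: (batch, channels, depth, height, width); 64 is large
# enough to survive the unpadded convolutions and the two pooling steps.
x = torch.randn(1, 1, 64, 64, 64)

print(unet.conv1.weight.grad)         # None: nothing computed yet
pred = unet(x)
loss = pred.mean()                    # stand-in for MyLossFunction(pred, y)
loss.backward()
print(unet.conv1.weight.grad.shape)   # now populated, for every parameter
optimizer.step()

The gradients clearly appear on the weight tensors after backward(), but I still don't see how step() knows to use them.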