This doesn’t directly answer your question, but neither of the two approaches you outlined would work, I guess, because it would require that forward also accepts a second array, targets.
The “typical” use of the nn modules is that you return the predictions (e.g., logits, softmax outputs, etc.) from forward and compute the cost outside of forward, in the training loop. E.g., something like

import torch
import torch.nn.functional as F

# num_features, num_hidden_1, num_hidden_2, num_classes, learning_rate,
# num_epochs, device, and train_loader are assumed to be defined elsewhere.

class MultilayerPerceptron(torch.nn.Module):

    def __init__(self, num_features, num_classes):
        super(MultilayerPerceptron, self).__init__()
        self.linear_1 = torch.nn.Linear(num_features, num_hidden_1)
        self.linear_2 = torch.nn.Linear(num_hidden_1, num_hidden_2)
        self.linear_out = torch.nn.Linear(num_hidden_2, num_classes)

    def forward(self, x):
        out = self.linear_1(x)
        out = F.relu(out)
        out = self.linear_2(out)
        out = F.relu(out)
        logits = self.linear_out(out)
        probas = F.softmax(logits, dim=1)
        # forward returns only predictions; the cost is computed by the caller
        return logits, probas

model = MultilayerPerceptron(num_features=num_features,
                             num_classes=num_classes)
model = model.to(device)  # keep the model on the same device as the batches

cost_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
    model.train()
    for batch_idx, (features, targets) in enumerate(train_loader):

        features = features.view(-1, 28*28).to(device)
        targets = targets.to(device)

        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = cost_fn(logits, targets)  # expects raw logits, not probas
        optimizer.zero_grad()
        cost.backward()

        ### UPDATE MODEL PARAMETERS
        optimizer.step()

> I guess, because it would require that forward also accepts a second array, targets.
My code was just meant to be simple (and thus non-functional), in order to illustrate my question. In my case, the target variable does not change, and it is set up once during a setup phase as self.target.
For my code example, I am working with VGG/sequential models, and the forward function for my loss module would be inside a class (which I add to the sequential model).
> The “typical” use of the nn modules is that you return the predictions (e.g., logits, softmax outputs, etc.) from forward and compute the cost outside of forward [...]
I am collecting the self.loss value from each custom nn module, for use in computations. But I don’t need to change the output of a layer in my hierarchical model.
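To make the setup described above concrete, here is a minimal sketch of a pass-through loss layer inside a sequential model. The class name PassThroughLoss, the use of an MSE loss, and the two small linear layers standing in for VGG are all assumptions for illustration, not the actual code from this thread. The layer stores its loss as self.loss and returns its input unchanged, so the following layers still receive it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PassThroughLoss(nn.Module):
    """Hypothetical loss layer: records a loss against a fixed target
    and returns its input unchanged."""

    def __init__(self, target):
        super().__init__()
        # The target is fixed at setup time; detach it so no gradient flows into it.
        self.register_buffer("target", target.detach())
        self.loss = None

    def forward(self, x):
        # Store the loss as an attribute instead of returning it (MSE is an assumption).
        self.loss = F.mse_loss(x, self.target)
        # Return the input untouched so the next layer in the Sequential still gets it.
        return x


torch.manual_seed(0)
x = torch.randn(4, 8)

# Two small linear layers stand in for the VGG feature layers.
first = nn.Linear(8, 6)
last = nn.Linear(6, 3)

# Setup phase: run a reference input through the first layer to get the fixed target.
with torch.no_grad():
    target = first(torch.randn(4, 8))

loss_layer = PassThroughLoss(target)
model = nn.Sequential(first, loss_layer, last)

out = model(x)

# Collect the stored losses from every custom loss layer and backpropagate their sum.
total_loss = sum(m.loss for m in model if isinstance(m, PassThroughLoss))
total_loss.backward()
```

In this sketch the loss layer sits in the middle of the Sequential, which is why it returns x. Only the layers in front of it receive gradients from total_loss, since the later layers do not contribute to it.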
> My code was just meant to be simple (and thus non-functional),
I see! Back to the original Q: No, you wouldn’t need to return an output variable inside the forward method. E.g., you could have something like

def forward(self, x):
    out = self.linear_1(x)
    out = F.relu(out)
    out = self.linear_2(out)
    out = F.relu(out)
    cost_fn = torch.nn.CrossEntropyLoss()
    # forward only receives x, so the targets are read from self.target,
    # which has to be set beforehand (during setup or by the training loop)
    self.cost = cost_fn(self.linear_out(out), self.target)
    return None

and then the following training loop:

for epoch in range(num_epochs):
    for batch_idx, (features, targets) in enumerate(train_loader):

        features = features.to(device)
        targets = targets.to(device)
        model.target = targets  # make the current targets visible to forward

        ### FORWARD AND BACK PROP
        model(features)
        optimizer.zero_grad()
        model.cost.backward()

        ### UPDATE MODEL PARAMETERS
        optimizer.step()

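For reference, here is a small self-contained version of this pattern that can be run as is. It is only a sketch under assumptions: the class name CostInsideModel, the single linear layer, the random data, and the fixed self.target are made up for illustration. It checks that forward can return None while the stored cost still drives a parameter update.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CostInsideModel(nn.Module):
    """Hypothetical model that stores its cost instead of returning predictions."""

    def __init__(self, num_features, num_classes, target):
        super().__init__()
        self.linear_out = nn.Linear(num_features, num_classes)
        self.target = target  # fixed class labels, set once at setup
        self.cost = None

    def forward(self, x):
        # The cost depends on the parameters through linear_out, so gradients can flow.
        self.cost = F.cross_entropy(self.linear_out(x), self.target)
        return None


torch.manual_seed(0)
features = torch.randn(16, 5)          # random stand-in data
target = torch.randint(0, 3, (16,))    # random stand-in labels

model = CostInsideModel(num_features=5, num_classes=3, target=target)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

weight_before = model.linear_out.weight.detach().clone()

result = model(features)   # forward returns None
optimizer.zero_grad()
model.cost.backward()      # backprop from the stored attribute
optimizer.step()

weight_after = model.linear_out.weight.detach().clone()
```

After the step, weight_after differs from weight_before, i.e., the update works even though nothing was returned from forward.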
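You could even do something like this and it would work (i.e., the code will run), although the cost then doesn’t depend on any of the model’s parameters, so there is nothing for optimizer.step() to update: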

def forward(self, x):
    self.cost = torch.tensor(99., requires_grad=True)
    return None

...

        ### FORWARD AND BACK PROP
        model(features)
        optimizer.zero_grad()
        model.cost.backward()
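
To see what "the code will run" means in practice for the constant-cost case, here is a small sketch (the class name ConstantCostModel and the layer sizes are made up for illustration). The backward call succeeds, but since the cost is a fresh leaf tensor that is not connected to the parameters, no parameter receives a gradient and the weights stay the same.

```python
import torch
import torch.nn as nn


class ConstantCostModel(nn.Module):
    """Hypothetical model whose stored cost ignores both the input and the parameters."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)
        self.cost = None

    def forward(self, x):
        # A fresh leaf tensor: it requires grad, but is not connected to self.linear.
        self.cost = torch.tensor(99., requires_grad=True)
        return None


model = ConstantCostModel()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
weight_before = model.linear.weight.detach().clone()

model(torch.randn(8, 4))
optimizer.zero_grad()
model.cost.backward()   # runs without error
optimizer.step()        # parameters without a gradient are skipped

grads = [p.grad for p in model.parameters()]
weight_after = model.linear.weight.detach().clone()
```

Here every entry of grads is None and weight_after equals weight_before; the only gradient that gets populated is model.cost.grad itself.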