I have images I want to predict on, along with some array data (metadata) for each image. The Net I am using looks like this:
    import torch
    import torch.nn as nn

    class Net(nn.Module):
        def __init__(self, arch, n_meta_features: int):
            super().__init__()
            self.arch = arch
            # Replace the CNN's classifier head so it emits 500 features
            self.arch._fc = nn.Linear(in_features=1280, out_features=500, bias=True)  # B1
            # Small MLP for the metadata
            self.meta = nn.Sequential(nn.Linear(n_meta_features, 500),
                                      nn.BatchNorm1d(500),
                                      nn.ReLU(),
                                      nn.Dropout(p=0.2),
                                      nn.Linear(500, 250),  # FC layer output will have 250 features
                                      nn.BatchNorm1d(250),
                                      nn.ReLU(),
                                      nn.Dropout(p=0.2))
            # 500 CNN features + 250 metadata features -> 1 logit
            self.final = nn.Linear(500 + 250, 1)

        def forward(self, inputs):
            x, meta = inputs
            cnn_features = self.arch(x)
            meta_features = self.meta(meta)
            features = torch.cat((cnn_features, meta_features), dim=1)
            output = self.final(features)
            return output
So you can see I send the images through my CNN (EfficientNet) and my metadata through a stack of linear layers. The two feature vectors are concatenated with torch.cat() and passed through a final linear layer. The output is binary. There is no sigmoid in the model because I am using BCEWithLogitsLoss; I apply a sigmoid when I get predictions back from the model. Here is the basic training sequence:
    correct, epoch_loss = 0, 0.0  # reset at the start of each epoch
    for x, y in train_loader:
        # x is an (images, metadata) pair. Assuming the DataLoader already yields
        # tensors, move/cast them with .to() rather than re-wrapping in torch.tensor()
        images = x[0].to(device, dtype=torch.float32)
        meta = x[1].to(device, dtype=torch.float32)
        y = y.to(device, dtype=torch.float32).unsqueeze(1)  # shape (batch, 1), same as the logits
        optim.zero_grad()
        z = model((images, meta))  # raw logits, no sigmoid applied
        loss = criterion(z, y)
        loss.backward()
        optim.step()
        pred = torch.round(torch.sigmoid(z))  # round off sigmoid to obtain predictions
        correct += (pred == y).sum().item()  # tracking number of correctly predicted samples
        epoch_loss += loss.item()
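To make the logits / sigmoid split concrete, here is a minimal pure-Python sketch (not PyTorch code) of what the setup above relies on: the loss is computed on raw logits, and the sigmoid is only applied afterwards to get predictions. The function names and the sample `logits` and `labels` are made up for illustration; the stable formula shown is the standard way to write binary cross-entropy on logits, and I am assuming it matches what BCEWithLogitsLoss computes per sample before averaging.

```python
import math

def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def bce_with_logits(z: float, y: float) -> float:
    """Binary cross-entropy computed directly on a logit z.

    Stable form: max(z, 0) - z*y + log(1 + exp(-|z|)).
    """
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

def bce_on_probability(p: float, y: float) -> float:
    """Plain binary cross-entropy on a probability p."""
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

# Hypothetical logits (model outputs) and binary labels
logits = [-3.0, -0.5, 0.25, 2.0]
labels = [0.0, 1.0, 1.0, 1.0]

# The loss on logits equals sigmoid followed by plain BCE
for z, y in zip(logits, labels):
    assert math.isclose(bce_with_logits(z, y), bce_on_probability(sigmoid(z), y))

# Rounding the sigmoid is the same as thresholding the logit at 0
preds = [round(sigmoid(z)) for z in logits]
correct = sum(int(p == y) for p, y in zip(preds, labels))
```

Since rounding the sigmoid is equivalent to checking whether the logit is positive, probabilities that are all skewed toward 0 mean the final layer is producing mostly negative logits.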
I am looking for feedback on whether this looks sound. Are there better ways to handle these two types of data? I definitely have some tweaking to do on the models I am using; right now all my predicted probabilities are heavily skewed toward 0.