I have images I want to make predictions on, along with some array (metadata) features for each image. The Net I am using looks like this:
```python
import torch
import torch.nn as nn


class Net(nn.Module):
    def __init__(self, arch, n_meta_features: int):
        super(Net, self).__init__()
        self.arch = arch
        # Replace the CNN classifier head; 1280 is the feature width of EfficientNet-B1
        self.arch._fc = nn.Linear(in_features=1280, out_features=500, bias=True)
        # Metadata branch: two fully connected blocks
        self.meta = nn.Sequential(
            nn.Linear(n_meta_features, 500),
            nn.BatchNorm1d(500),
            nn.ReLU(),
            nn.Dropout(p=0.2),
            nn.Linear(500, 250),  # FC layer output will have 250 features
            nn.BatchNorm1d(250),
            nn.ReLU(),
            nn.Dropout(p=0.2),
        )
        self.final = nn.Linear(500 + 250, 1)

    def forward(self, inputs):
        x, meta = inputs
        cnn_features = self.arch(x)
        meta_features = self.meta(meta)
        # Concatenate image and metadata features, then classify
        features = torch.cat((cnn_features, meta_features), dim=1)
        output = self.final(features)
        return output
```
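For context, here is a minimal sketch of how the model, loss, and optimizer get set up, continuing from the class above. The `_fc` attribute suggests the `efficientnet_pytorch` package, and the B1 variant matches the 1280 input features; `n_meta_features=10` and the learning rate are placeholder values, not my exact configuration:

```python
from efficientnet_pytorch import EfficientNet  # assumed package; its models expose `_fc`

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# EfficientNet-B1 feeds 1280 features into `_fc`, matching the Linear layer above
arch = EfficientNet.from_pretrained("efficientnet-b1")
model = Net(arch, n_meta_features=10).to(device)  # 10 is a placeholder feature count

criterion = nn.BCEWithLogitsLoss()  # expects raw logits; applies sigmoid internally
optim = torch.optim.Adam(model.parameters(), lr=1e-4)  # placeholder optimizer/LR
```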
So the images go through my CNN (EfficientNet), and the metadata goes through a series of linear layers. The two feature vectors are concatenated with torch.cat() and passed through a final linear layer. The task is binary classification. There is no sigmoid on the output because I am using BCEWithLogitsLoss, which applies the sigmoid internally; I apply a sigmoid myself when I read predictions back from the model. Here is the basic training loop:
```python
for (x, meta), y in train_loader:  # loader yields ((images, metadata), labels)
    # .to() moves existing tensors to the device; wrapping them in torch.tensor()
    # again triggers a warning and an unnecessary copy
    x = x.to(device, dtype=torch.float32)
    meta = meta.to(device, dtype=torch.float32)
    y = y.to(device, dtype=torch.float32)

    optim.zero_grad()
    z = model((x, meta))  # forward() unpacks the (images, metadata) tuple
    loss = criterion(z, y.unsqueeze(1))
    loss.backward()
    optim.step()

    pred = torch.round(torch.sigmoid(z))  # round off sigmoid to obtain predictions
    correct += (pred.cpu() == y.cpu().unsqueeze(1)).sum().item()  # tracking number of correctly predicted samples
    epoch_loss += loss.item()
```
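And the matching evaluation pass, as a minimal sketch; `val_loader` is a hypothetical validation loader with the same ((images, metadata), labels) structure:

```python
model.eval()  # switch Dropout off and use running BatchNorm statistics
with torch.no_grad():
    for (x, meta), y in val_loader:  # assumed loader, same structure as train_loader
        x = x.to(device, dtype=torch.float32)
        meta = meta.to(device, dtype=torch.float32)
        probs = torch.sigmoid(model((x, meta)))  # convert logits to probabilities
        preds = (probs > 0.5).float()  # threshold at 0.5 for hard 0/1 labels
```

Calling model.eval() matters here because the metadata branch uses BatchNorm1d and Dropout, both of which behave differently at inference time.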
I am looking for feedback on whether this approach is sound, or whether there are better ways to combine these two types of data. I definitely have some tweaking to do on the models I am using; right now all of my predicted probabilities are heavily skewed toward 0.
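One thing I am wondering about is class imbalance: if the positive class is rare, that alone would push the probabilities toward 0, and BCEWithLogitsLoss supports a pos_weight argument to upweight it. A minimal sketch of what I could try, where `train_df["target"]` is a hypothetical name for my label column:

```python
# Hypothetical label column; pos_weight = (#negatives / #positives) upweights the rare class
labels = torch.tensor(train_df["target"].values, dtype=torch.float32)
n_pos = labels.sum()
n_neg = len(labels) - n_pos
criterion = nn.BCEWithLogitsLoss(pos_weight=n_neg / n_pos)
```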