I am trying to train on samples (roughly 4000 per batch) that all relate to a single image for that batch.
Can I do this without repeating the image for each sample? Since the same data is shared by every training sample in the batch, copying it becomes a huge memory drain.
To try and be a little more clear…
assume I have a batch of approximately 4000 samples with 100 features each, where I will process each sample with linear layers, something like:
linear_block = torch.nn.Sequential(
    torch.nn.Linear(100, 10),
    torch.nn.ReLU(inplace=True)
)
I would also like to take the entire (4000, 100) batch, pass it through Conv2d layer(s), and ultimately feed those features into a final merge block, so that each sample is provided information from the entire batch…
batch_block = torch.nn.Sequential(
    torch.nn.Conv2d(1, 6, kernel_size=(3, 9), stride=(1, 2), padding=(1, 4)),
    torch.nn.ReLU(inplace=True),
    torch.nn.Conv2d(6, 6, kernel_size=(3, 9), stride=(1, 2), padding=(1, 4)),
    torch.nn.ReLU(inplace=True)
)
merge_block = torch.nn.Sequential(
    torch.nn.Linear(combined_output_size_of_flat_batch_and_linear_block, 5),
    torch.nn.Softmax(dim=1)
)
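To pin down combined_output_size_of_flat_batch_and_linear_block, here is a quick sanity check of the spatial sizes coming out of batch_block, using the standard conv formula out = floor((in + 2*padding - kernel) / stride) + 1 with the batch shape from my example:

```python
# Output sizes of batch_block for a (4000, 100) batch, computed by hand.
def conv_out(size, kernel, stride, padding):
    return (size + 2 * padding - kernel) // stride + 1

h, w = 4000, 100
for _ in range(2):                                   # two identical Conv2d layers
    h = conv_out(h, kernel=3, stride=1, padding=1)   # height is preserved
    w = conv_out(w, kernel=9, stride=2, padding=4)   # width is roughly halved

flat = 6 * h * w                                     # 6 output channels, flattened
print(h, w, flat)                                    # 4000 25 600000
print(flat + 10)                                     # 600010 = flattened conv features + 10 linear features
```

So each sample's merge input would be 600010 features under these shapes.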
def forward(self, X):
    X_full_batch_conv_features = self.batch_block(X.view(1, 1, 4000, 100))
    X = self.linear_block(X)
    batch_flatten = X_full_batch_conv_features.flatten()
    # batch_flatten is 1D here; currently I .repeat() it to (4000, -1)
    # so the concatenation lines up per sample
    merge = torch.cat([X, batch_flatten.unsqueeze(0).repeat(X.size(0), 1)], dim=1)
    return self.merge_block(merge)
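For reference, here is a small runnable version of the whole architecture with toy sizes standing in for my real ones (8 samples / 10 features instead of 4000 / 100, and smaller layer widths), still using the .repeat() step that my question is about:

```python
import torch

N, F = 8, 10  # toy stand-ins for 4000 samples, 100 features

class BatchAwareNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear_block = torch.nn.Sequential(
            torch.nn.Linear(F, 4),
            torch.nn.ReLU(inplace=True))
        self.batch_block = torch.nn.Sequential(
            torch.nn.Conv2d(1, 2, kernel_size=3, padding=1),
            torch.nn.ReLU(inplace=True))
        conv_feat = 2 * N * F  # flattened batch_block output (padding keeps H, W)
        self.merge_block = torch.nn.Sequential(
            torch.nn.Linear(conv_feat + 4, 5),
            torch.nn.Softmax(dim=1))

    def forward(self, X):
        conv = self.batch_block(X.view(1, 1, N, F)).flatten()
        X = self.linear_block(X)
        conv = conv.unsqueeze(0).repeat(X.size(0), 1)  # one physical copy per sample
        return self.merge_block(torch.cat([X, conv], dim=1))

out = BatchAwareNet()(torch.randn(N, F))
print(out.shape)  # torch.Size([8, 5])
```

At my real sizes the repeated tensor alone would be 4000 × 600000 floats, which is where the memory problem comes from.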
I’ve been able to make this work using .repeat() on the batch_block output, copying the shared features to every sample, but that seems like a huge waste of memory when every sample is looking back at the same set of numbers.
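To make the cost concrete, here is a toy sketch of what .repeat() copies, next to torch.Tensor.expand(), which (as far as I understand) returns a broadcasted stride-0 view over the same storage instead of copying. Toy sizes stand in for the real (4000, 600000) case:

```python
import torch

n_samples = 4                      # stand-in for 4000
batch_flatten = torch.arange(6.0)  # stand-in for the 600000 shared conv features
X = torch.randn(n_samples, 3)      # per-sample linear features

# .repeat() materializes n_samples physical copies of the shared row...
repeated = batch_flatten.unsqueeze(0).repeat(n_samples, 1)   # (4, 6), real copies

# ...while .expand() keeps a single copy and broadcasts a view over it
expanded = batch_flatten.unsqueeze(0).expand(n_samples, -1)  # (4, 6), no copy

# cat itself materializes the merged result either way
merged = torch.cat([X, expanded], dim=1)                     # (4, 9)
print(expanded.data_ptr() == batch_flatten.data_ptr())       # True: same storage
```

Is something along these lines the right way to avoid the copies, or does cat/autograd end up materializing everything anyway?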