I have a set of data from which I generate an image, which I then pass through convolution layers and finally flatten into a single feature vector of size [1, 11552].
I now want this layer to be included in training for each sample in a batch of, say, [4000, 310]. I know that if I copied the layer into a [4000, 11552] tensor I could then use torch.cat() to combine the features, but that seems needlessly wasteful (though I may be missing something important).
Is there a way to connect that layer to each sample from the larger batch without having to duplicate the data? So that during training I would effectively be training a batch of size [4000, 11862]?
Ideally I would not want to duplicate that common layer n_batch times, since it's always the same. Currently I create a tensor of ones the size of my batch of samples and the width of the common layer, multiply it by the common layer, and use torch.cat() to join the result onto the batch, but this just feels wrong.
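For reference, a minimal sketch of the ones-multiplication approach described above, next to a broadcasting alternative using `expand()`, which returns a stride-0 view instead of allocating copies (tensor names and the scaled-down shapes are assumptions for illustration; the real sizes are 4000, 310, and 11552):

```python
import torch

n_batch, n_feat, n_common = 8, 310, 11552 // 16  # small stand-ins for 4000 / 310 / 11552
batch = torch.randn(n_batch, n_feat)             # per-sample features
common = torch.randn(1, n_common)                # shared flattened CNN features

# Approach from the post: materialize a full copy of the common layer
# for every sample via a ones-multiplication, then concatenate.
tiled = torch.ones(n_batch, n_common) * common   # allocates n_batch x n_common
combined = torch.cat((batch, tiled), dim=1)      # [n_batch, n_feat + n_common]

# Broadcasting alternative: expand() returns a view with stride 0 along
# the batch axis, so the "copies" cost no extra memory and no multiply.
view = common.expand(n_batch, -1)                # still backed by n_common floats
combined2 = torch.cat((batch, view), dim=1)      # same values as combined
```

Note that `torch.cat` itself still has to materialize its output, so the concatenated result is a full-size tensor either way; only the intermediate tiling is saved.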
Oh, interesting. Thank you so much. Looks like it will certainly save a multiplication step. I'm not sure what implications that will have for autograd, but hopefully there will be some speed improvement, even though it looks like it will still consume memory where it ought not to.
I believe the one slight modification to your suggestion is that I need to make sure to repeat only along the batch axis, and not also repeat the single layer of 10 nodes 10 times. So my final code looks like:
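The snippet itself didn't survive in the post above; based on the description (repeat along the batch axis only), it presumably looks something like the following — `n_batch`, `common_layer`, and `samples` are assumed names, and the widths are the toy sizes mentioned in the post:

```python
import torch

n_batch = 10
common_layer = torch.randn(1, 10)           # the shared layer of 10 nodes
samples = torch.randn(n_batch, 5)           # per-sample features (width assumed)

# repeat(n_batch, 1) tiles only along dim 0 (the batch axis);
# repeat(n_batch, n_batch) would wrongly tile the 10 nodes n_batch times too.
repeated = common_layer.repeat(n_batch, 1)  # [n_batch, 10]
combined = torch.cat((samples, repeated), dim=1)
```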