I have a neural network model mapped to a GPU device (hence the "CUDA network" in the topic title). During training I would like to put the entire training set through the network, for exploratory purposes rather than for the training itself.
If I were in evaluation mode with a test set, I would map the saved (trained) model to a CPU device and pass the test set in as one giant batch.
However, if I try to do this during training with the CUDA network, I run out of GPU memory. Is there a good way to temporarily stop the model relying on GPU memory in the middle of training, just to put the training set through once (without recording gradients)?
My current ideas are:
- find a way to temporarily map the network to the CPU without saving and reloading the model (a rough sketch of what I have in mind is below the list)
- put the training data through one example at a time (this is very slow and defeats the point of doing it in the first place)
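For the first idea, here is a purely illustrative sketch of the kind of thing I mean (the names `model`, `train_inputs`, and `device` are placeholders for whatever the real training script uses):

```python
import torch

def full_pass_on_cpu(model, train_inputs, device=torch.device("cuda")):
    """Temporarily move the model to the CPU, push the whole training set
    through it without building the autograd graph, then move it back."""
    was_training = model.training
    model.eval()                      # freeze dropout / batch-norm updates
    model.cpu()                       # parameters now live in host RAM, not GPU memory
    with torch.no_grad():             # no gradients recorded, so no graph is kept around
        outputs = model(train_inputs) # train_inputs assumed to already be a CPU tensor
    model.to(device)                  # move the parameters back onto the GPU
    if was_training:
        model.train()                 # restore training mode before resuming the loop
    return outputs
```

I'm not sure whether shuffling the parameters back and forth like this interacts badly with the optimizer state, which is partly why I'm asking.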
Any help on this would be really appreciated! Apologies for not posting my actual training code; I feel like this is a problem that probably has a general solution anyway.