I’ve encountered a problem when sharing a model between processes, and it is critical to me (because of memory constraints).
I’ve been sharing a model between several processes (on Linux, Ubuntu). The model is used only for forward passes, since it performs a sort of pre-processing on the samples (before they are fed to a different network). I’ve done everything I can to ensure that: the model is in eval mode, every parameter has requires_grad set to False, and the forward pass runs under ‘with torch.no_grad():’.
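For reference, the freezing steps above (eval mode, requires_grad off, no_grad) look roughly like this on a toy model; nn.Linear here is just a stand-in for my actual network:

```python
import torch
import torch.nn as nn

# Toy placeholder for the real model (illustration only).
model = nn.Linear(4, 2)
model.eval()                         # inference mode (dropout/batch-norm frozen)
for p in model.parameters():
    p.requires_grad_(False)          # no gradient tracking on the weights

with torch.no_grad():                # no autograd graph for the forward pass
    y = model(torch.randn(1, 4))

print(y.grad_fn is None)  # True: no computational graph was built
```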
The problem is that after the new process is spawned, for some reason it allocates new memory on the GPU. At first I thought this memory held intermediate values of the computational graph, but then I noticed that each process allocates new GPU memory even when sleep is invoked (i.e. before any data is even run through the model). Furthermore, it is a lot of memory relative to the model: the model is about 4GB (let’s say 2GB of weights and 2GB of optimizer state), and the newly allocated memory is 1GB (!), which may also indicate that the network is not completely replicated, only a part of it.
Here is some example code; I think it contains the most critical parts of what I’m doing:
def inferrerFunc(neuralNetwork):
    # If we use sleep here, memory is still allocated on the GPU
    # time.sleep(1000)
    # Imagine there's a dataset here...
    with torch.no_grad():
        for x in dataset:
            y_t = neuralNetwork(x)


class mainProc():
    def __init__(self):
        self.neuralNetwork = neuralNetwork()
        torch.multiprocessing.set_start_method('spawn', force=True)
        self.neuralNetwork.share_memory()
        self.neuralNetwork.eval()

    def startInferrer(self):
        self.inferrer = torch.multiprocessing.Process(target=inferrerFunc,
                                                      args=(self.neuralNetwork,))
        self.inferrer.start()
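As a side note, one diagnostic I've been considering: torch.cuda.memory_allocated() reports only the tensor memory held by PyTorch's caching allocator, while nvidia-smi also counts each process's own CUDA context, which is created on the first CUDA call in that process and is not tracked by PyTorch. Comparing the two numbers inside each process might show whether the 1GB is tensors at all. A rough sketch (with a CPU fallback, since this needs a GPU to be meaningful):

```python
import torch

# memory_allocated() counts only PyTorch tensor allocations in this process;
# the per-process CUDA context visible in nvidia-smi is extra, untracked overhead.
if torch.cuda.is_available():
    tensor_bytes = torch.cuda.memory_allocated()
    print(f"tensor memory in this process: {tensor_bytes / 2**20:.1f} MiB")
else:
    print("no GPU available; run this inside each spawned process "
          "and compare against the per-process usage in nvidia-smi")
```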