While accessing individual layers using ._modules in the ResNet50 forward() function, the terminal quits on its own with the output: '^C'

I was trying out the code implementation of the research paper Attention Aware Polarity Sensitive Embedding for Affective Image Retrieval. I found this code on GitHub:

Coding Implementation

 print("check point ---2")
    print(len(self.base._modules))
    for name, module in self.base._modules.items():
      print(name)
      print(x.size())
      if name == 'avgpool':
        break

      if name == 'layer3':
        l2 = x
      x = module(x)
      l4 = x
    print("check point ---x") 

The terminal on Google Colab quits after printing this:

Output with cuda

It never reaches 'check point ---x', and the program just ends.

I tried removing the .cuda() calls from __init__():

output without cuda

I observed that it was able to loop over one more iteration, so I believe the RAM on Google Colab is filling up because the model is very large. I want to know if there is a workaround for this piece of code, i.e. whether I can obtain l2 and l4 some other way.

Could you run the code on a local machine in a terminal to get a proper error message?
I’m not sure if you can observe the RAM utilization in Colab, but if so, you could check whether you are really running out of memory.

Thank you for the reply, sir. I tried the model on my CPU (8 GB RAM, Windows):

It states that it doesn’t have enough memory to allocate (it asks me to buy more RAM :sweat_smile: :joy:). I apologize for the blurred-out image. The error starts at line 137 of new_resnet24.py, c2 = self.inconv2(l2). I suspect the same thing might be happening on Google Colab (12 GB RAM). I have tried all kinds of methods to obtain the layer2 and layer4 activations:

  1. Forward hooks
  2. Using a list of features of the model and then looping over the children of resnet50
  3. Reducing the size of the dataset to reduce computational cost

None of the above methods have worked; each time the program ended with ‘^C’. If you can suggest some more methods to obtain the activations of the intermediate layers, I would be grateful.

PS: The code snippet in the original question is a part of the forward() method of ResNet in new_resnet24.py in this repository
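For completeness (aimed at later readers): newer torchvision releases (0.11+) ship a feature-extraction utility that returns intermediate activations directly, without looping over the children or registering hooks manually. A minimal sketch, assuming a plain torchvision ResNet50 as the base model; the return-node names 'layer2'/'layer4' and the keys 'l2'/'l4' are chosen here for illustration:

    import torch
    from torchvision.models import resnet50
    from torchvision.models.feature_extraction import create_feature_extractor

    base = resnet50()
    # Map the internal node names to the keys we want in the output dict.
    extractor = create_feature_extractor(base, return_nodes={'layer2': 'l2', 'layer4': 'l4'})

    with torch.no_grad():                      # avoids building the autograd graph
        feats = extractor(torch.randn(1, 3, 224, 224))
    l2, l4 = feats['l2'], feats['l4']

Note that this utility alone does not reduce memory usage if gradients are still being tracked; the memory discussion below about detaching still applies.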

Forward hooks would work fine, and I don’t think the hooks themselves create the increased memory usage.
Based on the code snippet you’ve provided, it seems you are executing the forward method sequentially using the child modules of the base model, which could yield the OOM error.
If the model is working fine without your current approach of storing the intermediate tensors, could you check whether you are storing these intermediates without detaching them?
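For illustration, a minimal sketch of what detaching the stored intermediates could look like in the posted loop, assuming l2 and l4 are only needed as feature tensors and gradients do not have to flow back through the stored copies:

    # Same loop as in the question, but the stored copies are detached,
    # so they no longer keep the autograd graph of the earlier layers alive.
    for name, module in self.base._modules.items():
        if name == 'avgpool':
            break
        if name == 'layer3':
            l2 = x.detach()    # value of layer2's output, cut from the graph
        x = module(x)
        l4 = x.detach()        # value of the latest output (layer4 at the end)

Whether this is appropriate depends on whether the later branches (e.g. c2 = self.inconv2(l2)) need gradients to flow back into the backbone; .detach() shares the underlying storage but stops gradient tracking.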

I am not sure what detaching l2 and l4 actually means. They are sequentially assigned to the new layer output x. I believe they are being assigned x in the same space allocated to them initially, so I don’t think any extra space is being used during reassignment.

In the case of forward hooks as well, I need the previous outputs, which induces the need to compute the outputs sequentially up to the required layer. I might not have understood the concept of forward hooks properly (I have only read the questions about forward hooks on this discussion forum). Can you give a short example based on the given snippet of code?
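A minimal sketch of what forward hooks could look like for this snippet; the names save_l2/save_l4 and the feats dict are just for illustration. The hooks fire automatically inside a single call to self.base(x), so there is no manual loop; note that this runs the full base forward, including the avgpool/fc head that the original loop skips. The outputs are detached here in line with the memory discussion above; drop .detach() if gradients through these features are needed:

    feats = {}

    def save_l2(module, inp, out):
        # called automatically when self.base.layer2 finishes its forward
        feats['l2'] = out.detach()

    def save_l4(module, inp, out):
        feats['l4'] = out.detach()

    h2 = self.base.layer2.register_forward_hook(save_l2)
    h4 = self.base.layer4.register_forward_hook(save_l4)

    out = self.base(x)            # one ordinary forward pass, no manual loop
    l2, l4 = feats['l2'], feats['l4']

    h2.remove()                   # remove the hooks once they are no longer needed
    h4.remove()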

Thank you, the problem was solved. You were right, I needed to detach the intermediate tensors. I used Variable wrapping with l4 and x. Thanks for the help!
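For later readers: Variable has been merged into Tensor in recent PyTorch releases, so the old-style Variable wrapping mentioned above (presumably something along these lines) has .detach() as its modern equivalent:

    from torch.autograd import Variable  # legacy API, kept only for comparison

    l4 = Variable(x.data)   # old-style wrapping: new tensor with no graph history
    l4 = x.detach()         # modern equivalent, preferred in current PyTorch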
