Intermediate tensors in NeMo

jhh3 · June 24, 2020, 12:22pm

Hi,
I’m looking at QuartzNet in NeMo and trying to probe some of the internal tensors. I see that I can use evaluated_tensors = neural_factory.infer(tensors=[a,b,c]) to run inference and return the evaluations of a, b, and c, but I can’t figure out how to get a list of the intermediate activation tensors, or a pointer to any one of them. I’m looking for a method on either the neural_factory or the individual models (like jasper_encoder) that would return a list of tensors I can pass to infer(). Any ideas?
Thanks
Edit: I know infer() and the neural_factory object belong to NeMo, so this may be out of scope for this board, but the underlying model is based on a PyTorch module, so I’m hoping a general PyTorch method for getting internal activations will be useful here. Here’s the inheritance tree for that encoder model:

[nemo.collections.asr.jasper.JasperEncoder,
 nemo.backends.pytorch.nm.TrainableNM,
 nemo.core.neural_modules.NeuralModule,
 abc.ABC,
 torch.nn.modules.module.Module,
 object]

ptrblck · June 25, 2020, 5:03am

For a general PyTorch model you could use forward hooks to get the intermediate activations as described here. Since NeMo seems to be using a PyTorch model internally, you would have to access its layers to register the hooks.
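
A minimal sketch of the forward-hook approach on a plain PyTorch model follows. The toy nn.Sequential model, the activations dictionary, and the key "conv0" are illustrative assumptions and stand in for the NeMo encoder; only register_forward_hook and the hook signature are standard PyTorch.

```python
import torch
import torch.nn as nn

# Toy stand-in model (hypothetical; not the NeMo/QuartzNet encoder).
model = nn.Sequential(
    nn.Conv1d(4, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv1d(8, 2, kernel_size=3, padding=1),
)

# Captured outputs, keyed by a name of your choosing.
activations = {}

def get_activation(name):
    # Returns a hook that stores the module's output under `name`.
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Register the hook on the first convolution and keep the handle
# so the hook can be removed afterwards.
handle = model[0].register_forward_hook(get_activation("conv0"))

x = torch.randn(1, 4, 16)  # (batch, channels, time)
with torch.no_grad():
    out = model(x)

handle.remove()
print(activations["conv0"].shape)
```

Keeping the handle returned by register_forward_hook and calling remove() on it afterwards prevents the hook from firing on later forward passes.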

jhh3 · June 25, 2020, 8:00pm

Thanks ptrblck! That worked. It takes a little poking around to figure out the class structure, but not too much. For example, adapting your get_activation example, I was able to use this line:
encoder.encoder[1].mconv[0].conv.register_forward_hook(get_activation('B1(a).mconv.1D'))
to capture the output of the 1D convolution in the first block of a QuartzNet encoder model.
Again, thanks for the help.
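
For anyone else doing the same poking around, one way to discover the submodule paths is to iterate over named_modules(). This is only a sketch: ToyEncoder and Block are hypothetical classes whose attribute names (encoder, mconv, mout) loosely mimic the paths in the line above, and they are not the actual NeMo classes.

```python
import torch.nn as nn

class Block(nn.Module):
    # Hypothetical block; attribute names only imitate the paths quoted above.
    def __init__(self):
        super().__init__()
        self.mconv = nn.ModuleList(
            [nn.Conv1d(4, 4, kernel_size=3, padding=1), nn.BatchNorm1d(4)]
        )
        self.mout = nn.Sequential(nn.ReLU())

class ToyEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(Block(), Block())

enc = ToyEncoder()

# Print the dotted path and class of every submodule.
for name, module in enc.named_modules():
    print(name or "<root>", type(module).__name__)

# Look up one submodule by its dotted path.
modules_by_name = dict(enc.named_modules())
conv = modules_by_name["encoder.1.mconv.0"]
```

The dotted path "encoder.1.mconv.0" corresponds to the attribute chain enc.encoder[1].mconv[0], which is the object to call register_forward_hook on.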

jhh3 · July 1, 2020, 11:10am

After messing with this for a while, I wanted to add one caveat that I ran into. Some layer objects, like ReLU, are reused throughout the network. I guess this applies to any layer without parameters, but I’m not sure. The result is that if you put a hook on a ReLU layer, for example with encoder.encoder[1].mout[0].register_forward_hook(get_activation('B1.mout.relu')) in QuartzNet, the hook gets called for every ReLU in the whole model, not just the one you wanted. Each call overwrites the previous dictionary entry, so the value left in the dictionary is the ReLU output from the last call in the forward pass, not the ReLU output at the position where you registered the hook.
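
The behavior can be reproduced with a small sketch, assuming the cause is a single module instance being called at several points in forward(). SharedReLUNet is a hypothetical toy model, not the QuartzNet code. Appending each call's output to a list, instead of overwriting one dictionary key, keeps every activation so the one you want can be picked out by call order.

```python
import torch
import torch.nn as nn

class SharedReLUNet(nn.Module):
    # Hypothetical model in which one ReLU instance is called twice.
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 4)
        self.fc2 = nn.Linear(4, 4)
        self.act = nn.ReLU()  # single shared instance

    def forward(self, x):
        x = self.act(self.fc1(x))     # first call
        return self.act(self.fc2(x))  # second call

net = SharedReLUNet()
last = {}   # overwritten on every call, as in the get_activation pattern
calls = []  # one entry per call, in forward-pass order

def hook(module, inputs, output):
    last["relu"] = output.detach()
    calls.append(output.detach())

handle = net.act.register_forward_hook(hook)
with torch.no_grad():
    out = net(torch.randn(2, 4))
handle.remove()

# The hook fired once per call of the shared module.
print(len(calls))
# The dictionary entry holds the output of the final call only.
print(torch.equal(last["relu"], out))
```

Here calls[0] is the activation after fc1 and calls[1] is the one after fc2. This indexing depends on the order in which forward() invokes the shared module.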