Check out hooks. If you want to inspect a gradient, you can register a backward hook and print the values or log them to TensorBoard.
For example, in the code below I attach a forward hook to monitor the values passing through a softmax function (later I compute the entropy and push it into TensorBoard).
def monitorAttention(self, input, output):
    # forward hooks receive (module, input, output); "self" here is the softmax module itself
    if writer.global_step % 10 == 0:  # only log every 10th step to keep overhead down
        monitors.monitorSoftmax(self, input, output, ' input ', writer, dim=1)

self.softmax.register_forward_hook(monitorAttention)
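For gradients specifically, the hook goes on the backward pass instead. Here is a minimal standalone sketch (the layer and what gets logged are just illustrative) using register_full_backward_hook, which fires once the module's gradients have been computed:

import torch
import torch.nn as nn

def log_grad_norms(module, grad_input, grad_output):
    # grad_output holds the gradients arriving from the layers above
    print(f"{module.__class__.__name__} grad norm: {grad_output[0].norm().item():.4f}")

layer = nn.Linear(8, 4)  # illustrative layer; attach to whatever module you want to watch
handle = layer.register_full_backward_hook(log_grad_norms)

x = torch.randn(2, 8)
layer(x).sum().backward()  # the hook fires during this backward pass

handle.remove()  # detach the hook once you're done monitoring

You could swap the print for writer.add_scalar(...) to get the values into TensorBoard instead.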