TensorBoard issue with self-defined forward function

@Unity05 I tried your modification, the runtime error on combined_feature_map is now gone.

However, the following raises my suspicion that only TensorBoard requires such a manual reset. Normal training without TensorBoard should not require such a reset, should it?

  File "/home/phung/PycharmProjects/beginner_tutorial/gdas.py", line 431, in forward
    self.cells[c].nodes[n].connections[cc].edge_weights[e].grad_fn)
RuntimeError: Cannot insert a Tensor that requires grad as a constant. Consider making it a parameter or input, or detaching the gradient
Tensor:
 0
 0
 0
 0
[ torch.FloatTensor{4} ]

Yeah, as I’ve said, combined_feature_map is not the only thing with a problem here; you’ll also get the same issue with outputs.
As for your question: the reason you don’t face this issue without TensorBoard is that, in that case, you only go through the first epoch once.
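To illustrate the difference, here is a minimal sketch (a toy module of my own, not code from the gist) showing that an eager forward/backward pass is fine, while torch.jit.trace, which writer.add_graph() runs under the hood, rejects a non-parameter tensor that requires grad:

```python
import torch
import torch.nn as nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain tensor attribute (NOT an nn.Parameter) that requires grad,
        # loosely analogous to the edge_weights in this thread.
        self.edge_weights = torch.zeros(4, requires_grad=True)

    def forward(self, x):
        return x * self.edge_weights.sum()

model = Toy()
x = torch.randn(4)

# Eager mode (a normal training step) works: autograd just tracks the tensor.
out = model(x)
out.sum().backward()

# Tracing -- which writer.add_graph() does internally via torch.jit.trace --
# fails, because the tracer would have to bake self.edge_weights into the
# graph as a constant, and constants are not allowed to require grad:
torch.jit.trace(model, x)
# RuntimeError: Cannot insert a Tensor that requires grad as a constant.
# Consider making it a parameter or input, or detaching the gradient
```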

@Unity05 I think you missed the essence of my previous reply.

self.cells[c].nodes[n].connections[cc].edge_weights[e].grad_fn points to the gradient function, which should not be assigned the attribute requires_grad=False.

Yeah, but your edge_weights still have requires_grad=True.
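A hedged sketch of two ways around that (the class and variable names are illustrative, not the gist’s actual structure): either register the weights as an nn.Parameter, so the tracer treats them as part of the module rather than a constant, or use a detached copy inside forward.

```python
import torch
import torch.nn as nn

class Connection(nn.Module):
    def __init__(self, num_edges=4):
        super().__init__()
        # Option 1: register edge_weights as a Parameter; torch.jit.trace then
        # sees it as a module parameter instead of a grad-requiring constant.
        self.edge_weights = nn.Parameter(torch.zeros(num_edges))

    def forward(self, x):
        # Option 2 (if the weights must stay a plain tensor for some reason):
        # use a detached copy inside forward, e.g.
        #   w = self.edge_weights.detach()
        # so the value the tracer captures no longer requires grad.
        w = torch.softmax(self.edge_weights, dim=0)
        return x * w.sum()
```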


@Unity05 How would I apply the same suggestion to y = self.cells[c].nodes[n].connections[cc].edges[e].forward_f(x) ?

What’s the problem there?

@Unity05 forward_f(x) points to NN operations such as nn.Conv2d(), which are not supposed to have the attribute requires_grad=False.

Yes, that’s right, but what’s the problem with y = self.cells[c].nodes[n].connections[cc].edges[e].forward_f(x)?

@Unity05

Error occurs, No graph saved
Traceback (most recent call last):
  File "/home/phung/PycharmProjects/beginner_tutorial/gdas.py", line 817, in <module>
    ltrain = train_NN(forward_pass_only=0)
  File "/home/phung/PycharmProjects/beginner_tutorial/gdas.py", line 574, in train_NN
    writer.add_graph(graph, NN_input)
  File "/home/phung/PycharmProjects/venv/py39/lib/python3.9/site-packages/torch/utils/tensorboard/writer.py", line 736, in add_graph
    self._get_file_writer().add_graph(graph(model, input_to_model, verbose, use_strict_trace))
  File "/home/phung/PycharmProjects/venv/py39/lib/python3.9/site-packages/torch/utils/tensorboard/_pytorch_graph.py", line 295, in graph
    raise e
  File "/home/phung/PycharmProjects/venv/py39/lib/python3.9/site-packages/torch/utils/tensorboard/_pytorch_graph.py", line 289, in graph
    trace = torch.jit.trace(model, args, strict=use_strict_trace)
  File "/home/phung/PycharmProjects/venv/py39/lib/python3.9/site-packages/torch/jit/_trace.py", line 741, in trace
    return trace_module(
  File "/home/phung/PycharmProjects/venv/py39/lib/python3.9/site-packages/torch/jit/_trace.py", line 958, in trace_module
    module._c._create_method_from_trace(
  File "/home/phung/PycharmProjects/venv/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/phung/PycharmProjects/venv/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1090, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/phung/PycharmProjects/beginner_tutorial/gdas.py", line 353, in forward
    y = self.cells[c].nodes[n].connections[cc].edges[e].forward_f(x)
  File "/home/phung/PycharmProjects/beginner_tutorial/gdas.py", line 112, in forward_f
    return self.f(x)
  File "/home/phung/PycharmProjects/venv/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/phung/PycharmProjects/venv/py39/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1090, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/phung/PycharmProjects/venv/py39/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 446, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/phung/PycharmProjects/venv/py39/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 442, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Cannot insert a Tensor that requires grad as a constant. Consider making it a parameter or input, or detaching the gradient

I’ll try to reproduce this issue later. Is your gist up to date?

@Unity05 Yes, it is up to date

Seems to be a combined_feature_map issue again.

@Unity05 But self.cells[c].nodes[n].connections[cc].edges[e].forward_f(x) only helps to build up self.cells[c].nodes[n].connections[cc].combined_feature_map.

Besides, please note that forward_f(x) uses nn.Conv2d(), to which I should not apply any of your suggestions about setting requires_grad=False.

Maybe you forgot to set requires_grad=False in line 191. In case you just want the graph: graph.
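For illustration, an alternative to toggling requires_grad on the stored tensor is to rebuild the accumulator inside forward, so that nothing requiring grad outlives a single forward pass. A sketch with made-up names, assuming combined_feature_map simply sums the edge outputs:

```python
import torch
import torch.nn as nn

class Connection(nn.Module):
    def __init__(self, channels, num_edges=4):
        super().__init__()
        self.edges = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1) for _ in range(num_edges)
        )

    def forward(self, x):
        # Recreate the accumulator on every call instead of storing a tensor
        # with requires_grad=True in __init__; it now depends only on the
        # trace inputs and registered parameters, so there is nothing left
        # for torch.jit.trace to insert as a grad-requiring constant.
        combined_feature_map = torch.zeros_like(x)
        for edge in self.edges:
            combined_feature_map = combined_feature_map + edge(x)
        return combined_feature_map
```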


See GitHub - buttercutter/gdas: A simple implementation for the GDAS paper “Searching for A Robust Neural Architecture in Four GPU Hours”. TensorBoard is producing a graph now, but how should I interpret this TensorBoard graph in order to debug problematic inter-node and inter-cell connections within the graph?

From what I can observe so far, the TensorBoard graph does not really indicate node numbers and cell numbers, which makes it difficult to track down the bug.

@ptrblck See Adding node and cell number for tensorboard graph · Issue #5505 · tensorflow/tensorboard · GitHub. Is there a way to do this using some internal torch.utils.tensorboard settings?
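In the meantime, one thing that does influence the names shown in the TensorBoard graph is the module hierarchy itself: the trace-based graph groups ops under the attribute names of registered submodules, so keeping cells, nodes, and connections in nn.ModuleLists gives scopes that include the list indices. A minimal sketch of my own (not the gist’s actual classes), assuming a much-simplified structure:

```python
import torch
import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter

class Node(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.op = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        return self.op(x)

class Cell(nn.Module):
    def __init__(self, channels, num_nodes=2):
        super().__init__()
        # Registered submodules: their attribute names and ModuleList indices
        # become part of the scope names in the traced graph.
        self.nodes = nn.ModuleList(Node(channels) for _ in range(num_nodes))

    def forward(self, x):
        for node in self.nodes:
            x = node(x)
        return x

class Net(nn.Module):
    def __init__(self, channels=3, num_cells=2):
        super().__init__()
        self.cells = nn.ModuleList(Cell(channels) for _ in range(num_cells))

    def forward(self, x):
        for cell in self.cells:
            x = cell(x)
        return x

writer = SummaryWriter()
writer.add_graph(Net(), torch.randn(1, 3, 8, 8))
writer.close()
# The graph view then shows nested scopes containing the indices
# (something like cells.0 / nodes.1 / op), which at least identifies
# the cell and node each conv belongs to.
```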