I am reading the (pre-print) paper on autograd [Paszke et al.]. In that paper, both of these claims are made: (1) PyTorch creates a dynamic computation graph and (2) the graph is immutable.
I cannot fully reconcile how these two statements can both be true. Could someone please elaborate and help me understand?
I have a basic understanding of what mutability means – at least in the context of basic Python. I also have a basic understanding that PyTorch’s graph is define-by-run, as opposed to the static, define-then-run graphs of (classic) TensorFlow.
“dynamic” refers to creation (the graph is built at runtime, as operations execute), while “immutable” refers to what happens after creation: nodes, once recorded, are not modified. And this graph is almost invisible to user code, by the way.
OK, I think this makes sense. What I conclude is that: (1) each sequence of forward steps (say at different epochs) dynamically builds up a computation graph, and (2) the nodes of such a graph are immutable. In other words, multiple graphs may be built up throughout program execution, but once a node is entered in such a graph, that node is treated as an immutable object.
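To illustrate point (1), here is a minimal sketch (my own example, not from the paper) showing that each forward pass records a fresh graph: running the same computation twice produces distinct `grad_fn` node objects.

```python
import torch

# A parameter we want gradients for.
w = torch.tensor(2.0, requires_grad=True)

def forward(x):
    # Graph nodes (grad_fn objects) are created as these ops execute.
    return (w * x) ** 2

# Two runs of the identical computation...
y1 = forward(torch.tensor(3.0))
y2 = forward(torch.tensor(3.0))

# ...record two separate graphs: the recorded nodes are distinct objects.
print(y1.grad_fn is y2.grad_fn)  # False

# Backprop through the first graph: d/dw (w*x)^2 = 2*w*x^2 = 2*2*9
y1.backward()
print(w.grad)  # tensor(36.)
```

So "dynamic" here means the graph is a byproduct of executing ordinary Python code, rebuilt from scratch on every forward pass.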
This also fits with their comment that in-place operations are handled by “rebasing” pointers within the computation graph (which is different from editing a node itself).
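The rebasing behaviour can be observed directly (again my own sketch, not the paper's): an in-place op does not edit the tensor's existing `grad_fn` node, it swaps the tensor's pointer over to a newly recorded node.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = x * 2            # records a multiplication node
before = y.grad_fn   # hold a reference to that node

y.add_(1)            # in-place add
after = y.grad_fn    # y now points at a freshly recorded node

# The original node object was not mutated; y was rebased onto a new one.
print(before is after)                    # False
print(type(before).__name__, type(after).__name__)
```

The old node still hangs off the new one as an input, so the history stays intact and immutable even though the tensor's contents changed in place.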