Computation graph optimization during training

Hi, is it possible or necessary to optimize the dynamic computation graph generated during training for higher throughput? If so, what is the recommended way to achieve that? Thanks in advance.

Hi,

This is not necessary in general.
If you really want to squeeze out the best performance, you can use a TorchScript model with C++ inference to strip away the Python interpreter.
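
For context, here is a minimal sketch of the Python side of that workflow. The toy model and file name are placeholders; torch.jit.trace is an alternative to torch.jit.script if the model has no data-dependent control flow.

```python
import torch
import torch.nn as nn

# Hypothetical model -- replace with your own network.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).eval()

scripted = torch.jit.script(model)   # compile the module to TorchScript
scripted.save("model_scripted.pt")   # serialize so it can be loaded without Python

# In C++ (libtorch), the archive can then be loaded with:
#   torch::jit::script::Module m = torch::jit::load("model_scripted.pt");
```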

Thank you for the reply. But my use case is to improve the training throughput. If I understand correctly, TorchScript can only improve performance for network inference rather than training (forward & backward). Do you have any advice on how to improve PyTorch forward & backward efficiency?

You can actually perform training with TorchScript.
You can script your Python model and keep running your usual training loop on the scripted module.
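
For example, here is a minimal sketch of training through a scripted module; the toy model, sizes, and hyperparameters are placeholders.

```python
import torch
import torch.nn as nn

# Hypothetical toy model -- substitute your own architecture.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(128, 256)
        self.fc2 = nn.Linear(256, 10)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = TinyNet()
scripted = torch.jit.script(model)                 # compile the module with TorchScript
optimizer = torch.optim.SGD(scripted.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Both the forward and the backward pass run through the scripted graph.
for _ in range(3):
    x = torch.randn(32, 128)
    y = torch.randint(0, 10, (32,))
    optimizer.zero_grad()
    loss = loss_fn(scripted(x), y)
    loss.backward()
    optimizer.step()
```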

That being said, if your network is a regular architecture, we try to make sure that performance for these is as good as possible out of the box.

Hi Dale, have you solved this problem? Does TorchScript work?

Can TorchScript actually optimize the computation graph during training?

Yes, TorchScript does optimize the graph at train time. See:
https://pytorch.org/blog/optimizing-cuda-rnn-with-torchscript/#writing-custom-rnns.
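
As a rough illustration of the idea in that post, a hand-written RNN cell can be scripted and still trained through autograd. The cell below is a simplified placeholder, not the fused LSTM from the blog; the benefit comes from TorchScript fusing the pointwise operations inside forward.

```python
import torch
import torch.nn as nn
from torch import Tensor

class ScriptedRNNCell(nn.Module):
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.weight_ih = nn.Parameter(torch.randn(hidden_size, input_size) * 0.1)
        self.weight_hh = nn.Parameter(torch.randn(hidden_size, hidden_size) * 0.1)
        self.bias = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x: Tensor, h: Tensor) -> Tensor:
        # Elementwise ops here are candidates for fusion by the TorchScript JIT.
        return torch.tanh(x @ self.weight_ih.t() + h @ self.weight_hh.t() + self.bias)

cell = torch.jit.script(ScriptedRNNCell(64, 128))
x, h = torch.randn(8, 64), torch.zeros(8, 128)
h = cell(x, h)          # the optimized graph is used in the forward pass...
h.sum().backward()      # ...and autograd still produces gradients for training
```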