What is the "persistent algorithm" in GRU and LSTM?

The LSTM and GRU docs include the following note:

“If the following conditions are satisfied: 1) cudnn is enabled, 2) input data is on the GPU 3) input data has dtype torch.float16 4) V100 GPU is used, 5) input data is not in PackedSequence format persistent algorithm can be selected to improve performance.”

What is the persistent algorithm, and how can I select it?

You cannot select it manually; it is used automatically when all of the specified conditions are met.
This GTC talk gives some background on persistent kernels, which essentially try to avoid memory “movement” by reusing values once they have been loaded.
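
Since the algorithm cannot be requested directly, the most you can do is confirm that your setup matches the conditions quoted from the docs. Below is a minimal sketch of such a checklist. The helper names `unmet_conditions` and `describe_input` are hypothetical (they are not part of PyTorch), and matching the GPU by looking for "V100" in the device name is an assumption on my part. An empty result only means the documented conditions hold; it does not prove that the persistent algorithm was actually chosen.

```python
def unmet_conditions(cudnn_enabled, on_gpu, dtype_name, gpu_name, is_packed):
    """Return the documented conditions that this setup does NOT meet.

    Hypothetical helper: it only mirrors the five conditions from the docs note.
    """
    checks = [
        ("cudnn is enabled", cudnn_enabled),
        ("input data is on the GPU", on_gpu),
        ("input data has dtype torch.float16", dtype_name == "torch.float16"),
        # Assumption: a V100 reports a device name containing "V100".
        ("V100 GPU is used", "V100" in gpu_name),
        ("input data is not in PackedSequence format", not is_packed),
    ]
    return [name for name, ok in checks if not ok]


def describe_input(x):
    """Collect the values for unmet_conditions from a tensor or PackedSequence.

    Hypothetical helper; torch is imported here so the checklist above
    stays usable without it.
    """
    import torch
    from torch.nn.utils.rnn import PackedSequence

    is_packed = isinstance(x, PackedSequence)
    data = x.data if is_packed else x  # PackedSequence keeps its tensor in .data
    return {
        "cudnn_enabled": torch.backends.cudnn.enabled,
        "on_gpu": data.is_cuda,
        "dtype_name": str(data.dtype),
        "gpu_name": torch.cuda.get_device_name(data.device) if data.is_cuda else "",
        "is_packed": is_packed,
    }


# Example usage (needs PyTorch; the shapes are arbitrary):
#   x = torch.randn(35, 16, 128, device="cuda", dtype=torch.float16)
#   print(unmet_conditions(**describe_input(x)))
```

If the list is not empty, the usual fixes are moving the input and the module to the GPU, converting both to half precision (for example with `.half()`), and passing a padded tensor instead of a PackedSequence.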