import torch
from torch import nn
import time
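dropout = 0.01  # set to 0 for the first measurement below
# Chain 10000 dropout layers (modules default to training mode).
layers = []
for i in range(10000):
    layers.append(nn.Dropout(dropout))
x = torch.rand(10000, requires_grad=True)
for l in layers:
    x = l(x)
# Time only the backward pass.
back_start = time.time()
x.sum().backward()
print('back time', time.time() - back_start)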
Result for dropout = 0 (seconds):
back time 0.02752685546875
Result for dropout = 0.01 (seconds):
back time 0.11877131462097168
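To make the two measurements easier to compare in a single run, the script can be wrapped in a small helper. This is only a sketch: the name `time_backward` and its parameters `n_layers` and `size` are hypothetical, and it uses `time.perf_counter()` instead of `time.time()`, so absolute numbers may differ slightly from the ones reported above.

```python
import time

import torch
from torch import nn


def time_backward(p, n_layers=10000, size=10000):
    """Return the seconds spent in backward() through n_layers Dropout modules.

    Hypothetical helper; the defaults mirror the sizes used in the script above.
    """
    layers = [nn.Dropout(p) for _ in range(n_layers)]
    x = torch.rand(size, requires_grad=True)
    for layer in layers:
        x = layer(x)
    # Time only the backward pass, as in the original script.
    start = time.perf_counter()
    x.sum().backward()
    return time.perf_counter() - start


if __name__ == "__main__":
    for p in (0.0, 0.01):
        print(f"dropout={p}: back time {time_backward(p):.4f}s")
```

Building a fresh input and a fresh layer stack inside each call keeps one measurement from reusing the other's autograd graph.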
Additional evidence:
Setting inplace=True fails if dropout is not 0, but works with dropout = 0.
In addition, with dropout = 0 the backward time is the same whether inplace is True or False.
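The inplace observation can be checked in isolation with something like the following. It is a minimal sketch: `inplace_dropout_works` is a hypothetical helper, and it assumes the failure surfaces as a `RuntimeError`, which is how PyTorch reports in-place modification of a leaf tensor that requires grad. It only records whether the error occurs and does not diagnose the cause.

```python
import torch
from torch import nn


def inplace_dropout_works(p, size=8):
    """Return True if Dropout(p, inplace=True) runs forward and backward
    on a leaf tensor that requires grad, False if a RuntimeError is raised.

    Hypothetical helper for illustration only.
    """
    x = torch.rand(size, requires_grad=True)
    try:
        nn.Dropout(p, inplace=True)(x).sum().backward()
        return True
    except RuntimeError:
        return False


if __name__ == "__main__":
    for p in (0.0, 0.01):
        print(f"dropout={p}, inplace=True works: {inplace_dropout_works(p)}")
```

If the behaviour described above holds, the call with p = 0 succeeds and the call with a non-zero p does not.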