Sparse dropout can be implemented efficiently in TensorFlow with tf.sparse_retain, but PyTorch does not seem to provide an equivalent sparse-retain function. How could we implement sparse dropout in PyTorch?
Hi, did you figure out how to do it? I am also trying to implement sparse dropout in PyTorch.
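Not an official API, but here is a minimal sketch of one way to get the same effect as tf.sparse_retain plus inverted dropout: draw a Bernoulli keep-mask over the nonzero values of a coalesced sparse COO tensor, drop the masked entries, and rescale the survivors by 1/(1-p). The helper name `sparse_dropout` is just illustrative.

```python
import torch

def sparse_dropout(x, p=0.5, training=True):
    """Dropout for a torch sparse COO tensor.

    Randomly drops each nonzero entry with probability p and rescales
    the kept entries by 1/(1-p), so the expected value is unchanged
    (inverted dropout). At eval time (training=False) the input is
    returned as-is.
    """
    if not training or p == 0.0:
        return x
    x = x.coalesce()  # .indices()/.values() require a coalesced tensor
    values = x.values()
    # Keep each nonzero value with probability 1 - p
    keep = torch.rand_like(values) >= p
    # Retain only the kept entries (like tf.sparse_retain) and rescale
    new_values = values[keep] / (1.0 - p)
    new_indices = x.indices()[:, keep]
    return torch.sparse_coo_tensor(new_indices, new_values, x.size()).coalesce()
```

Usage: `y = sparse_dropout(x, p=0.5, training=model.training)`. Dropping the masked indices (rather than storing explicit zeros) keeps the result genuinely sparse, which matters if downstream ops iterate over nnz entries.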