Can torch.nn.AdaptiveLogSoftmaxWithLoss specify a target value that is ignored and does not contribute to the input gradient, like torch.nn.CrossEntropyLoss?

Can torch.nn.AdaptiveLogSoftmaxWithLoss specify a target value that is ignored and does not contribute to the input gradient, the way torch.nn.CrossEntropyLoss does with its ignore_index argument? In many cases we need to pad text so that all sequences in a batch have the same length, so being able to mark a target value as ignored (contributing nothing to the input gradient) seems necessary.
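To illustrate what I mean, here is a minimal sketch (the padding value PAD_IDX, the tensor shapes, and the cutoffs are just made-up examples). With CrossEntropyLoss I can pass ignore_index, but AdaptiveLogSoftmaxWithLoss does not appear to take such an argument, so the best I can think of is masking out the padded positions by hand before computing the loss:

```python
import torch
import torch.nn as nn

PAD_IDX = 0  # hypothetical padding target value

# CrossEntropyLoss can simply be told to skip padded targets:
ce = nn.CrossEntropyLoss(ignore_index=PAD_IDX)

# AdaptiveLogSoftmaxWithLoss has no ignore_index, so I mask manually:
in_features, n_classes = 64, 1000
asoft = nn.AdaptiveLogSoftmaxWithLoss(in_features, n_classes, cutoffs=[100, 500])

hidden = torch.randn(8, 20, in_features)         # (batch, seq_len, in_features)
targets = torch.randint(1, n_classes, (8, 20))   # (batch, seq_len)
targets[:, 15:] = PAD_IDX                        # pretend the tail is padding

flat_hidden = hidden.reshape(-1, in_features)
flat_targets = targets.reshape(-1)
keep = flat_targets != PAD_IDX                   # drop padded positions
out = asoft(flat_hidden[keep], flat_targets[keep])
loss = out.loss                                  # padded positions contribute nothing
```

Is manual masking like this the intended way, or is there a built-in option I am missing?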