I am following this GitHub repo for a WGAN implementation with gradient penalty, and I am trying to understand the following method, which unit-tests the gradient-penalty calculation.
def test_gradient_penalty(image_shape):
    bad_gradient = torch.zeros(*image_shape)
    bad_gradient_penalty = gradient_penalty(bad_gradient)
    assert torch.isclose(bad_gradient_penalty, torch.tensor(1.))

    image_size = torch.prod(torch.Tensor(image_shape[1:]))
    good_gradient = torch.ones(*image_shape) / torch.sqrt(image_size)
    good_gradient_penalty = gradient_penalty(good_gradient)
    assert torch.isclose(good_gradient_penalty, torch.tensor(0.))

    random_gradient = test_get_gradient(image_shape)
    random_gradient_penalty = gradient_penalty(random_gradient)
    assert torch.abs(random_gradient_penalty - 1) < 0.1

# Now pass a tuple argument for the image dimensions:
# (batch_size, channel, height, width)
test_gradient_penalty((256, 1, 28, 28))
I don’t understand the following line:

good_gradient = torch.ones(*image_shape) / torch.sqrt(image_size)

Here, torch.ones(*image_shape) just fills a 4-D tensor with ones, and torch.sqrt(image_size) evaluates to tensor(28.), since image_size = 1 * 28 * 28 = 784.

What I am trying to understand is why I need to divide that 4-D tensor by tensor(28.) to get the good_gradient.
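For concreteness, here is how the numbers in that line work out (a quick sketch using the (256, 1, 28, 28) shape from the test call):

```python
import torch

image_shape = (256, 1, 28, 28)

# Elements per image: 1 * 28 * 28 = 784, so sqrt(image_size) = 28
image_size = torch.prod(torch.Tensor(image_shape[1:]))
print(torch.sqrt(image_size))  # tensor(28.)

good_gradient = torch.ones(*image_shape) / torch.sqrt(image_size)
print(good_gradient[0, 0, 0, 0])  # tensor(0.0357), i.e. 1/28

# Dividing by 28 makes each flattened per-sample gradient a unit vector:
# its L2 norm is sqrt(784 * (1/28)^2) = 1
per_sample_norm = good_gradient.view(image_shape[0], -1).norm(2, dim=1)
print(per_sample_norm[0])  # approximately 1.0
```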
If I print bad_gradient, the output is a 4-D tensor like the one below:
tensor([[[[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]]],
If I print good_gradient, the output is:
tensor([[[[0.0357, 0.0357, 0.0357, ..., 0.0357, 0.0357, 0.0357],
[0.0357, 0.0357, 0.0357, ..., 0.0357, 0.0357, 0.0357],
[0.0357, 0.0357, 0.0357, ..., 0.0357, 0.0357, 0.0357],
...,
[0.0357, 0.0357, 0.0357, ..., 0.0357, 0.0357, 0.0357],
[0.0357, 0.0357, 0.0357, ..., 0.0357, 0.0357, 0.0357],
[0.0357, 0.0357, 0.0357, ..., 0.0357, 0.0357, 0.0357]]],
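For context, here is a minimal sketch of what a gradient_penalty like the one being tested typically looks like in the standard WGAN-GP formulation (the repo's actual implementation may differ): flatten each sample's gradient, take its L2 norm, and penalize the mean squared distance of that norm from 1.

```python
import torch

def gradient_penalty(gradient):
    # Flatten each sample's gradient: (batch, channels * height * width)
    gradient = gradient.view(len(gradient), -1)
    # Per-sample L2 norm of the gradient
    gradient_norm = gradient.norm(2, dim=1)
    # Penalize the squared distance of each norm from 1, averaged over the batch
    return torch.mean((gradient_norm - 1) ** 2)

# A zero gradient has norm 0, so its penalty is (0 - 1)^2 = 1
print(gradient_penalty(torch.zeros(256, 1, 28, 28)))  # tensor(1.)

# ones / 28 has per-sample norm 1, so its penalty is approximately 0
print(gradient_penalty(torch.ones(256, 1, 28, 28) / 28.))
```

This would satisfy all three assertions in the test above.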