Is it normal behavior for WGAN-GP to produce cost values like these?

I’m training WGAN-GP on some large images that are not normalized. In the research papers I have read, I never saw cost values like the ones I’m encountering right now. I’m using this implementation: WGAN-GP.
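As an aside on the normalization point: unnormalized pixel values (e.g. raw uint8 in [0, 255]) can inflate the critic's outputs and the gradient-penalty term, which may explain the huge early losses. Most WGAN-GP reference implementations scale images to [-1, 1] first. A minimal sketch of that scaling (NumPy; the function name and image shape here are just for illustration):

```python
import numpy as np

def normalize_images(batch: np.ndarray) -> np.ndarray:
    """Scale uint8 images from [0, 255] to [-1, 1], the range
    most WGAN-GP reference implementations expect (to match a
    tanh generator output)."""
    return batch.astype(np.float32) / 127.5 - 1.0

# Example: a dummy batch of 4 RGB images of size 64x64
batch = np.random.randint(0, 256, size=(4, 64, 64, 3), dtype=np.uint8)
scaled = normalize_images(batch)
print(scaled.min(), scaled.max())  # values lie within [-1, 1]
```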

I’ll post the raw numbers rather than a graph, since the values differ by several orders of magnitude.

0: real: -2045681.5 G: -241852.31
1: real: -5690064.5 G: -6379684.5
2: real: 39.45565 G: -72.539696
3: real: 13.095197 G: -36.50335
4: real: -2.5783277 G: -13.31345
5: real: 5.0107975 G: -15.501221
6: real: 19.652765 G: -22.62031
8: real: -4.4978914 G: 1.0
.
29: real: -38.364815 G: -1.1187422
30: real: -55.154697 G: -9.308983
31: real: -72.53244 G: -19.775333
32: real: -52.585705 G: -19.072691
33: real: -21.87315 G: 1.0
34: real: -27.245234 G: 1.0
.
52: real: -183.57954 G: 1.0
53: real: -437.08746 G: 1.0
54: real: -957.3964 G: 1.0
55: real: -987.6546 G: 1.0
56: real: -709.7966 G: 1.0
.
65: real: -4385.247 G: 1.0
66: real: -5945.86 G: 1.0
67: real: -8121.7046 G: 1.0
68: real: -11794.778 G: 1.0
69: real: -11969.041 G: 1.0
.
76: real: -18436.766 G: 1.0
77: real: -17285.088 G: 1.0
78: real: -17295.377 G: 1.0
79: real: -17688.537 G: 1.0
80: real: -18128.531 G: 1.0
I removed some lines for readability. The counter is the epoch number, "real" is the discriminator cost (fake and real combined), and "G" is the generator's (negative) cost.

I’m also inspecting the gradients on some layers: after 2 or 3 epochs all of them become exactly zero, even though the network's output keeps changing over time. Something here doesn't seem reasonable to me.
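For reference, this is roughly how I check the per-layer gradients. A minimal sketch, assuming a PyTorch setup (the tiny `critic` model here is a hypothetical stand-in for the actual network; if the implementation is TensorFlow the equivalent would use `tf.GradientTape`):

```python
import torch
import torch.nn as nn

# Hypothetical small critic, standing in for whichever network is inspected.
critic = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))

x = torch.randn(4, 8)
loss = critic(x).mean()
loss.backward()

# Mean absolute gradient per parameter tensor; values that are exactly
# zero after a few epochs would confirm the dead-gradient symptom.
for name, p in critic.named_parameters():
    print(name, p.grad.abs().mean().item())
```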