I’ve been training a CycleWGAN-GP model on 1-D data for some time, but it overfits badly. To reduce overfitting, I have tried several things:
- Data augmentation (which doubled the size of the original dataset).
- Adding dropout in different layers, but then the model did not learn anything at all. I could have stopped that run early, since even the training results were poor, but I let it run long enough to see whether it would improve; it didn't. I am genuinely confused about where exactly dropout should go in my network.
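To make the question concrete, here is my current understanding of dropout placement, sketched in plain NumPy since my framework code is long. The placement (after hidden activations, never on the input or the critic's scalar output) and all names here are my assumptions, not something confirmed to work:

```python
import numpy as np

def dropout(x, p, rng, training=True):
    # Inverted dropout: zero each unit with probability p during training,
    # rescale survivors by 1/(1-p) so the expected activation is unchanged.
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

def critic_forward(x, weights, p, rng, training=True):
    # Hypothetical MLP critic: dropout after each hidden activation,
    # but NOT on the input and NOT on the final scalar output.
    h = x
    for W in weights[:-1]:
        h = np.maximum(W @ h, 0.0)        # hidden layer + ReLU
        h = dropout(h, p, rng, training)  # dropout after the activation
    return weights[-1] @ h                # linear output head, no dropout
```

Is this the right placement for a WGAN-GP critic, or should dropout go only in the generator's residual blocks?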
- Adding decaying white noise to both domains. This helped a little: it slightly improved the test results, but not by much.
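The decaying noise I added looks roughly like this (a simplified NumPy sketch; the linear schedule and the `initial_std` / `decay_steps` values are illustrative, not my exact settings):

```python
import numpy as np

def noise_std(step, initial_std=0.1, decay_steps=10000):
    # Linearly decay the injected-noise std from initial_std to zero
    # over decay_steps training steps, then keep it at zero.
    return initial_std * max(0.0, 1.0 - step / decay_steps)

def add_instance_noise(x, step, rng, initial_std=0.1, decay_steps=10000):
    # Add zero-mean Gaussian noise to a sample from either domain.
    std = noise_std(step, initial_std, decay_steps)
    if std == 0.0:
        return x
    return x + rng.normal(0.0, std, size=x.shape)
```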
- Simplifying the model. Originally the generator and critic together had 34 layers, including residual blocks; I cut that down to 16 layers in total, and the training results were very poor. I then gradually added layers back until I reached 34 again. As I added layers, the training results improved, and so did the test results, but the test results are still very poor.
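For reference, the depth knob I was tuning is essentially the number of stacked residual blocks. A toy sketch (plain NumPy, no normalization layers, made-up names) of what "adding layers back" means here:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class ResidualBlock:
    # Simplified 1-D residual block: y = x + W2 @ relu(W1 @ x)
    def __init__(self, dim, rng):
        self.W1 = rng.normal(0.0, 0.02, (dim, dim))
        self.W2 = rng.normal(0.0, 0.02, (dim, dim))

    def __call__(self, x):
        return x + self.W2 @ relu(self.W1 @ x)

def make_generator(dim, n_blocks, rng):
    # Depth is a single knob: stack n_blocks residual blocks.
    return [ResidualBlock(dim, rng) for _ in range(n_blocks)]

def forward(blocks, x):
    for block in blocks:
        x = block(x)
    return x
```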
At this point I really don’t know what else to try. I especially need help on where to put the dropout layers, since in practice they really do help prevent overfitting. Any help would be appreciated.