RuntimeError: Given groups=1, weight of size [6, 1, 5, 5], expected input[128, 3, 218, 178] to have 1 channels, but got 3 channels instead

I don’t see any obvious issues. I would recommend scaling down the use case and trying to overfit a small dataset first (e.g. just 10 samples) by playing around with some hyperparameters. Once your model is able to predict these samples perfectly, you can scale the use case up again.
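
A minimal sketch of that sanity check could look like the following. Everything in it is an assumption for illustration: the random tensors `x` and `y` stand in for 10 samples of your real data, the small `nn.Sequential` model is a placeholder for your architecture, and the learning rate and step count are just starting points to tune.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in data: 10 samples, 20 features, binary labels.
# Replace with the first 10 samples of your real dataset.
x = torch.randn(10, 20)
y = torch.randint(0, 2, (10,))

# Hypothetical placeholder model; swap in your own architecture.
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# Train repeatedly on the same tiny batch; the loss should approach zero.
for step in range(500):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

# Evaluate on the same 10 samples the model was trained on.
with torch.no_grad():
    accuracy = (model(x).argmax(dim=1) == y).float().mean().item()

print(f"final loss: {loss.item():.4f}, train accuracy: {accuracy:.2f}")
```

If the training accuracy on these few samples does not reach 100%, that usually points to a problem in the model, the loss, or the data pipeline rather than to a lack of data, and it is worth fixing before scaling up again.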