I am currently training a denoising diffusion probabilistic model (DDPM), but the samples it generates do not resemble the training data. Each training image is divided into four quadrants, and each quadrant depicts an independent image with a white frame. In other words, I arranged four images in a 2x2 grid.
The neural network has 250 million trainable parameters and I have 40,000 images for training (I can obtain more if I need to, but this is a time-consuming task).
- How many images do I need to train a DDPM with 250 million trainable parameters? (I read somewhere about a 10:1, or at least 1:1, ratio of training data points to trainable parameters.)
- How should I picture the influence of the amount of training data on the accuracy of the model? Does each training data point increase the accuracy by the same amount (let's say one data point increases accuracy by 0.1%, just to name a random number), or is the relationship between training data and accuracy nonlinear (e.g. an S-curve if I plot accuracy against the number of training data points)?
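
To make the two questions concrete, here is a minimal sketch of the arithmetic behind the ratio rule of thumb and of the two curve shapes I have in mind. The parameter and image counts come from above; the function names, the `midpoint` and `steepness` parameters, and the logistic form of the S-curve are assumptions I made up for illustration, not measurements or an established model of how a DDPM behaves.

```python
import math

PARAMS = 250_000_000  # trainable parameters (from the question)
IMAGES = 40_000       # training images currently available (from the question)


def images_needed(params: int, ratio: float) -> int:
    """Images implied by a 'ratio : 1' data-to-parameter rule of thumb."""
    return math.ceil(params * ratio)


def linear_accuracy(n: int, gain_per_sample: float = 0.001) -> float:
    """Hypothesis A: every data point adds the same gain (0.1% here), capped at 100%."""
    return min(1.0, n * gain_per_sample)


def s_curve_accuracy(n: int, midpoint: float, steepness: float) -> float:
    """Hypothesis B: a logistic S-curve; midpoint and steepness are made-up knobs."""
    return 1.0 / (1.0 + math.exp(-steepness * (n - midpoint)))


if __name__ == "__main__":
    # What the two ratios from the first bullet would imply for this network.
    for ratio in (10, 1):
        needed = images_needed(PARAMS, ratio)
        print(f"{ratio}:1 -> {needed:,} images ({needed / IMAGES:,.0f}x the current set)")

    # The two hypothetical shapes from the second bullet, at a few dataset sizes.
    for n in (100, 500, 1_000):
        a = linear_accuracy(n)
        b = s_curve_accuracy(n, midpoint=500, steepness=0.01)
        print(f"n={n}: linear={a:.3f}, s-curve={b:.3f}")
```

Taken literally, the 10:1 ratio would mean 2.5 billion images and the 1:1 ratio 250 million, both far beyond my 40,000, which is part of why I doubt the rule applies here.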