If there is no suitable loss function in PyTorch at present, what’s the name of a suitable loss function from other papers/sources/packages that I could explore and try to code into my own custom loss function in PyTorch?
I don’t know for sure that there is anything wrong with using L1 or L2 loss, but my feeling is that these functions will not capture the essential quality of a blockmodel, where one looks for, say, ‘complete’ blocks (i.e. all 1’s) or ‘null’ blocks (all 0’s).
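To make that intuition concrete, here is one hypothetical way such a loss could be sketched in PyTorch: score each block by its distance to the nearest “ideal” block (all 1’s or all 0’s). The function name and the choice of squared distance are my own assumptions, not from any paper, just a differentiable illustration of the idea.

```python
import torch

def block_purity_loss(block: torch.Tensor) -> torch.Tensor:
    """Hypothetical loss for one block of a blockmodel: the (squared)
    distance to whichever ideal block is nearer, complete or null.
    Built from differentiable torch ops, so autograd works through it."""
    dist_to_complete = ((block - 1.0) ** 2).mean()  # distance to all 1's
    dist_to_null = (block ** 2).mean()              # distance to all 0's
    return torch.minimum(dist_to_complete, dist_to_null)
```

A perfectly complete or perfectly null block scores 0; a mixed block scores higher, which L1/L2 against a single fixed target would not capture.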
If you have an intuition about which results are better than others, you could come up with a set of examples, order them by goodness, and then try to design a loss function that matches that ordering. However, my experience is that this kind of methodology doesn’t always yield good results. For instance, in autoregressive models of audio, the best loss seems to be KL-divergence, even though it doesn’t take into account how far off a prediction is. I have experimented with losses that do account for that, such as Wasserstein and Cramér distance, but found them to yield worse results than simple KL. So I guess the best advice is just to experiment, and that seems to be what you are already planning to do.
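For reference, KL-divergence is already available in PyTorch as `nn.KLDivLoss`. One detail that often trips people up: it expects the input to be log-probabilities, while the target is (by default) plain probabilities. A minimal sketch with made-up shapes:

```python
import torch
import torch.nn as nn

kl = nn.KLDivLoss(reduction="batchmean")  # recommended reduction for proper KL

# input must be log-probabilities, target plain probabilities (by default)
pred_log_probs = torch.log_softmax(torch.randn(4, 10), dim=1)
target_probs = torch.softmax(torch.randn(4, 10), dim=1)

loss = kl(pred_log_probs, target_probs)
```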
Thanks for your excellent advice, which I will heed!
I have found a paper with a (potentially) good criterion function for blockmodeling. I will try to create a custom loss function from it in PyTorch. However, I won’t forget to test some of the standard functions already available too.
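Since the paper’s criterion isn’t specified here, the usual starting point is just a skeleton: subclass `nn.Module` and implement the criterion in `forward()` using differentiable torch operations, so gradients flow automatically. The class name and the MSE placeholder below are assumptions; the paper’s criterion would replace the placeholder.

```python
import torch
import torch.nn as nn

class BlockmodelCriterion(nn.Module):
    """Skeleton for a custom loss. The actual criterion from the paper
    goes in forward(); any composition of torch ops will backprop."""

    def forward(self, predicted: torch.Tensor,
                target: torch.Tensor) -> torch.Tensor:
        # placeholder: plain MSE until the paper's criterion is coded up
        return ((predicted - target) ** 2).mean()

# usage: behaves like any built-in PyTorch loss
criterion = BlockmodelCriterion()
pred = torch.rand(4, 4, requires_grad=True)
target = torch.ones(4, 4)
loss = criterion(pred, target)
loss.backward()
```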