Hi everyone, here is a bug I need your help with.
I am training my own model and using tensorboardX to record the loss. At the beginning everything goes well, and TensorBoard shows a normal loss curve. However, when training reaches epoch 191 (last time it was around 110), the program hangs. It does NOT throw any error, but it never proceeds:
Setting a breakpoint in the main file does not help. When I pause the program, after a fairly long wait I land in connection.py at:
Stepping out repeatedly takes me to event_file_writer.py:
Has anyone encountered this before? Or is this a bug in tensorboardX caused by mismatched versions?
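
In case it helps with diagnosing the hang, here is a minimal sketch of how the stacks of all threads could be dumped automatically once the program stops making progress. It uses only Python's standard-library `faulthandler` module. The helper names `arm_hang_watchdog`, `disarm_hang_watchdog`, and `demo_dump` are hypothetical, and the timeout values are assumptions to adjust to the real epoch duration. This does not fix anything; it only shows where each thread is blocked.

```python
import faulthandler
import sys
import tempfile
import time


def arm_hang_watchdog(timeout_s, log_file=sys.stderr):
    """Hypothetical helper: dump the stacks of all threads to log_file
    if the watchdog is not re-armed or cancelled within timeout_s seconds.
    Calling it again replaces the previous timer."""
    faulthandler.dump_traceback_later(
        timeout_s, repeat=False, file=log_file, exit=False
    )


def disarm_hang_watchdog():
    """Cancel the pending dump, e.g. after training finishes normally."""
    faulthandler.cancel_dump_traceback_later()


def demo_dump():
    """Simulate a stall that outlasts the timeout and return the dump text."""
    # faulthandler writes to the file descriptor directly, so the file
    # must be a real file and must stay open until the watchdog is cancelled.
    with tempfile.TemporaryFile(mode="w+") as log_file:
        arm_hang_watchdog(0.2, log_file)
        time.sleep(0.6)  # stands in for the stuck epoch
        disarm_hang_watchdog()
        log_file.seek(0)
        return log_file.read()


if __name__ == "__main__":
    print(demo_dump())
```

In a training loop, the idea would be to call `arm_hang_watchdog` at the start of every epoch with a timeout comfortably longer than one epoch, and `disarm_hang_watchdog` once training ends. If the run stalls, the dump shows what the main thread and any writer thread are waiting on.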