Debug FSDP program

I am trying to debug an FSDP program, but setting breakpoints with pdb does not work.
Can anyone share their experience debugging FSDP programs?

The issue you're running into is most likely that pdb is incompatible with multiprocessing. There are some workarounds you can try. One is to reconfigure stdin and then use pdb. See this tutorial: Debugging Resources — pyATS Documentation

Another option is to use rpdb, which requires opening a port and connecting to it with telnet for debugging.
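If more than one rank might ever reach the breakpoint, each rpdb session needs its own port. Here is a rough sketch of how I would pick one. It assumes the launcher exports RANK (torchrun does); debug_port_for_rank is a made-up helper name, and 4444 is rpdb's default port.

```python
import os


def debug_port_for_rank(base_port: int = 4444) -> int:
    """Return a distinct debugger port for this process (hypothetical helper)."""
    # Assumption: RANK is set by the launcher; fall back to 0 when it is not.
    rank = int(os.environ.get("RANK", "0"))
    return base_port + rank


# Usage at the point of interest (rpdb is a third-party package):
# import rpdb
# rpdb.Rpdb(port=debug_port_for_rank()).set_trace()
```

Then connect from another terminal on the same machine with telnet 127.0.0.1 <port>, using the port computed for that rank.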

In either case, be careful to set the breakpoint on only one particular rank rather than on all ranks.
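A minimal sketch of gating the breakpoint on a single rank. It assumes the launcher exports RANK (torchrun does); is_debug_rank is a hypothetical helper name, not part of any library.

```python
import os


def is_debug_rank(target_rank: int = 0) -> bool:
    """Return True only in the process whose RANK equals target_rank."""
    # Assumption: RANK is set by the launcher. Default to 0 so that a plain
    # single-process run still reaches the breakpoint.
    return int(os.environ.get("RANK", "0")) == target_rank


# Usage at the point of interest:
# if is_debug_rank(0):
#     import pdb; pdb.set_trace()  # or one of the workarounds above
```

While the chosen rank sits in the debugger, the other ranks will typically wait at their next collective call, so a long session may run into a communication timeout.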


Thanks very much. The method below works for me.

import sys
import pdb

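class ForkedPdb(pdb.Pdb):
    """A Pdb subclass that may be used from a forked multiprocessing child."""

    def interaction(self, *args, **kwargs):
        # multiprocessing replaces sys.stdin in child processes, so reopen
        # the real stdin for the duration of the pdb session.
        _stdin = sys.stdin
        try:
            sys.stdin = open('/dev/stdin')
            pdb.Pdb.interaction(self, *args, **kwargs)
        finally:
            # Restore the original stdin once the session ends.
            sys.stdin = _stdin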


...
ForkedPdb().set_trace()  # set_trace is an instance method, so instantiate first
...