PiPPy: I can't see the backward pass

Hi!

I'm using the PiPPy package and experimented with the following code.

https://github.com/pytorch/tau/blob/main/examples/resnet/pippy_resnet.py

I only changed the dataset from CIFAR-10 to ImageNet, and the world size from 5 to 4 (and the annotate_split_points block accordingly), because I only have 4 GPUs.
But I can't see the backward pass.
This is the result of print(pipe):

def forward(self, input, target):
    input_1 = input
    submod_0 = self.submod_0(input_1);  input_1 = None
    submod_1 = self.submod_1(submod_0);  submod_0 = None
    submod_2 = self.submod_2(submod_1, target);  submod_1 = target = None
    getitem = submod_2[0]
    getitem_1 = submod_2[1];  submod_2 = None
    sync_barrier = pippy_backward_sync_barrier((getitem, getitem_1), [], None);  getitem = getitem_1 = None
    return sync_barrier
    

There's no stage_backward.
And there is only a forward-pass stage in TensorBoard, as you can see…

How can I get the backward stage?
Thank you.

cc: @kwen2501 on PiPPy example question.

Edited: this reply was my mistake.

I changed the LossWrapper block to:


    class OutputLossWrapper(LossWrapper):
        def forward(self, x, target):
            return self.loss_fn(self.module(x), target) 

    wrapper = OutputLossWrapper(model, cross_entropy)

The original version from the GitHub example I mentioned is:

    class OutputLossWrapper(LossWrapper):
        def __init__(self, module, loss_fn):
            super().__init__(module, loss_fn)

        def forward(self, input, target):
            output = self.module(input)
            return output, self.loss_fn(output, target)

    wrapper = OutputLossWrapper(model, cross_entropy)

Now I get stage_backward like this:

def forward(self, x, target):
    submod_0 = self.submod_0(x)
    submod_1 = self.submod_1(submod_0)
    submod_2 = self.submod_2(submod_1, target)
    stage_backward = pippy_backward_stage_backward(stage_output = (submod_2,), output_grads = (None,), input_values = [submod_1, target], outputs_with_grads_idxs = [0], stage_info = 'stage_backward for stage %submod_2 : [#users=2] = call_module[target=submod_2](args = (%submod_1, %target), kwargs = {})');  target = None
    getitem = stage_backward[0]
    getitem_1 = stage_backward[1];  stage_backward = None
    getitem_2 = getitem[0]
    getitem_3 = getitem[1];  getitem = None
    stage_backward_1 = pippy_backward_stage_backward(stage_output = (submod_1,), output_grads = (getitem_2,), input_values = [submod_0], outputs_with_grads_idxs = [0], stage_info = 'stage_backward_1 for stage %submod_1 : [#users=3] = call_module[target=submod_1](args = (%submod_0,), kwargs = {})');  submod_1 = getitem_2 = None
    getitem_4 = stage_backward_1[0]
    getitem_5 = stage_backward_1[1];  stage_backward_1 = None
    getitem_6 = getitem_4[0];  getitem_4 = None
    stage_backward_2 = pippy_backward_stage_backward(stage_output = (submod_0,), output_grads = (getitem_6,), input_values = [x], outputs_with_grads_idxs = [0], stage_info = 'stage_backward_2 for stage %submod_0 : [#users=3] = call_module[target=submod_0](args = (%x,), kwargs = {})');  submod_0 = getitem_6 = x = None
    getitem_7 = stage_backward_2[0]
    getitem_8 = stage_backward_2[1];  stage_backward_2 = None
    getitem_9 = getitem_7[0]
    sync_barrier = pippy_backward_sync_barrier(submod_2, [getitem_1, getitem_5, getitem_8], getitem_7);  submod_2 = getitem_1 = getitem_5 = getitem_8 = getitem_7 = None
    return sync_barrier
 

But it doesn't seem to train well.
I printed the loss value (= pipe_driver(x, target)), and it is greater than 1.
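
For reference, my training step looks roughly like this (a simplified sketch; the pipe and driver are built as in the ResNet example, and optimizer covers the stage parameters):

    # Simplified sketch of my training step; pipe_driver is the pipeline
    # driver built from the traced model as in the ResNet example, and
    # optimizer is the optimizer attached to the stage parameters.
    optimizer.zero_grad()
    loss = pipe_driver(x, target)  # should run forward + backward on every stage
    optimizer.step()
    print(loss)  # stays greater than 1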

Why doesn't the original OutputLossWrapper produce a backward stage?

Hi @nomaue, thanks for reporting this issue, and glad you found a workaround.
We will investigate why returning two outputs stops PiPPy from generating the backward pass and report back here.
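
In the meantime, with the original two-output wrapper, one thing you could try is telling PiPPy explicitly which output is the loss when tracing. Roughly like the sketch below (the output_loss_value_spec keyword is from memory, so please double-check it against the PiPPy version you have installed):

    # Untested sketch: mark the loss position in the (output, loss) tuple so
    # PiPPy knows which output to drive the backward pass from.
    # The keyword name is from memory; verify against Pipe.from_tracing in
    # your installed PiPPy version.
    from pippy.IR import Pipe

    wrapper = OutputLossWrapper(model, cross_entropy)
    pipe = Pipe.from_tracing(
        wrapper,
        output_loss_value_spec=(False, True),  # loss is the second element
    )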


Thank you @kwen2501!
I have one more question.
I also experimented with the following code, but it doesn't seem to work for pipelining.

When I set:

DIMS = [28 * 28, 300, 100, 10]
DP_LAYERS = 2
PP_LAYERS = 1
#nnode=1, nproc_per_node=2

it works well for DataParallel.
But when I set it as below,

DIMS = [28 * 28, 300, 100, 10]
DP_LAYERS = 1
PP_LAYERS = 2
#nnode=1, nproc_per_node=2

or

DIMS = [28 * 28, 500, 250, 100, 50, 25, 10]
DP_LAYERS = 2
PP_LAYERS = 4
#nnode=2, nproc_per_node=4

it doesn't work at all. Could this problem be related to the issue I raised at first, or did I set something incorrectly? If I need to open a new topic for this, I'll do that.
Thank you.

Hi @nomaue, I just came back from PTO.
Regarding the issue of PiPPy failing to generate the backward pass, I submitted a fix here:

Thanks for reporting this issue!

The issue in the ddp2pipe example is a separate one.
The model in that example contains two parts: a traditional DDP part followed by a pipeline, and a special connection is built to join those two parts.
It is not very stable for now, as it only works for the parameters specified in the example. It may also only work on CPU, due to a hang in GPU mode.

Many thanks for your reply, @kwen2501. I'll try running the code.

Hi @kwen2501. So, is "PP + DP" unavailable so far on both single-node and multi-node, for any CNN model as well as transformers?

Hi,
PP + DP is available today through PiPPy’s init_data_parallel API.

Documentation:

Example:
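
At a high level, you build the pipeline driver as usual and then call init_data_parallel on it so that equivalent stages across replicas are grouped for gradient synchronization. A rough sketch is below (the dp_group_size argument is illustrative; please follow the linked documentation and example for the exact signature):

    # Illustrative sketch only; the dp_group_size argument is an assumption.
    # See the linked documentation/example for the exact init_data_parallel
    # signature in your PiPPy version.
    pipe_driver = ...  # build the pipeline driver for this rank as usual
    pipe_driver.init_data_parallel(dp_group_size=2)  # wire up DP across pipeline replicas

    loss = pipe_driver(x, target)  # gradients are all-reduced within each DP group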