I’m trying to test my model with torchdata and DDP (2 GPUs, sometimes 6), but the final number of samples is always less than the actual number.
My data directory has 17 packed tar files: files 0–15 have 5120 entries each, while the last one has only 2730. If I add a sharding_filter right after the FileLister, all of the data in the last tar file seems to be ignored by the dataloader. I also tried adding the sharding_filter after the map function, but several samples are still missing.
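To make the numbers concrete, here is a plain-Python sketch of what I believe happens with my shard sizes under round-robin sharding (this is only my mental model of sharding_filter, not its actual implementation):

```python
# Plain-Python sketch (no torchdata) of round-robin sharding over my shards.
sizes = [5120] * 16 + [2730]  # my 17 tar files

# Sharding at the file level (filter right after FileLister), 2 ranks:
per_rank = [sum(sizes[r::2]) for r in range(2)]
print(per_rank)  # [43690, 40960] -> uneven
# If iteration is kept in lockstep with the shortest rank, each rank
# yields only 40960 samples: 2 * 40960 = 81920, i.e. exactly the last
# tar's 2730 samples go missing.

# Sharding at the sample level (filter after .map):
total = sum(sizes)        # 84650
print(total % 2)          # 0 with 2 plain ranks,
print(total % (2 * 4))    # but with e.g. 4 workers per rank the
                          # effective shard count is 8, and 84650 % 8 = 2
                          # trailing samples don't divide evenly
```

The file-level split leaves one rank with exactly 2730 more samples than the other, which matches the whole last tar being dropped; the sample-level split only leaves a small remainder, which matches the "several samples missing" case.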
Is there any way I can make sure I get the correct number of samples? Do I have to repack my dataset so that all tar files have the same number of entries?
Many thanks in advance!
```python
import torchdata
from torchdata.datapipes.iter import FileLister, FileOpener
from torch.utils.data import DataLoader

rank, world_size = get_dist_info()  # my own helper returning DDP rank info
rootdir = "/data/test/"

dataset = FileLister(rootdir, "*.tar")
# First attempt: shard at the file level -> the last tar is ignored
# if dist:
#     dataset = dataset.sharding_filter()
dataset = FileOpener(dataset, mode="rb")
dataset = dataset.load_from_tar(length=length)
dataset = dataset.webdataset().map(postprocess_func)
# Second attempt: shard at the sample level -> still several samples missing
if dist:
    dataset = dataset.sharding_filter()

data_loader = DataLoader(
    dataset,
    batch_size=batch_size,
    num_workers=num_workers,
    pin_memory=pin_memory,
    shuffle=shuffle,
    drop_last=False,
    **kwargs)

cnts = 0
for ind, x in enumerate(data_loader):
    cnts += len(x)
# cnts doesn't match the real sample count here
```
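For what it's worth, this is the diagnostic I've been using to count samples per rank without going through the DataLoader (a rough sketch; I'm assuming `torch.utils.data.graph_settings.apply_sharding(pipe, num_instances, instance_id)` is the right way to force the sharding_filter to apply, which may differ between versions):

```python
from torch.utils.data.graph_settings import apply_sharding

# Rough per-rank counting sketch, bypassing the DataLoader entirely.
pipe = FileLister(rootdir, "*.tar")
pipe = FileOpener(pipe, mode="rb")
pipe = pipe.load_from_tar().webdataset().sharding_filter()
apply_sharding(pipe, world_size, rank)  # assumed signature, see above
print(f"rank {rank}: {sum(1 for _ in pipe)} samples")
```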