Hi, I have been going through multiple tutorials, but I still have two questions about Dataset and DataLoader. Any clarification would help.
- Let’s say that I am trying to process some text, and the first step is to turn my input sentences into tensors. Can I directly create CUDA tensors when initializing the Dataset? Or should I create them on the CPU and then move them to CUDA when the DataLoader iterates through them?
- How do I pick the number of workers for the DataLoader? Is it tied to the number of GPUs available (i.e., if I have 4 GPUs, should I also use 4 workers to parallelize things)? What exactly are the workers doing? The docs describe them as “subprocesses to use for data loading.” Are they just reading from memory, or are they also applying the transform specified in the Dataset?
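
To make the questions concrete, here is a minimal sketch of the second variant from my first question (build CPU tensors in the Dataset, move each batch to the GPU in the loop). The class `SentenceDataset`, the helper `run_epoch`, the toy vocabulary, and the example sentences are all made up for illustration; I am not claiming this is the recommended way to do it.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SentenceDataset(Dataset):
    """Hypothetical dataset: maps whitespace-split tokens to integer ids."""

    def __init__(self, sentences):
        self.sentences = sentences
        # Toy vocabulary built from the data itself (an assumption for this sketch).
        words = sorted({w for s in sentences for w in s.split()})
        self.vocab = {w: i for i, w in enumerate(words)}

    def __len__(self):
        return len(self.sentences)

    def __getitem__(self, idx):
        # Tokenize and convert to ids: this is the "transform" I am asking about.
        ids = [self.vocab[w] for w in self.sentences[idx].split()]
        # Created on the CPU; not moved to the GPU here.
        return torch.tensor(ids, dtype=torch.long)


def run_epoch(num_workers=0):
    # All sentences have the same length so the default collate can stack them.
    sentences = ["a b c", "b c d", "c d e", "d e f"]
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    loader = DataLoader(
        SentenceDataset(sentences),
        batch_size=2,
        num_workers=num_workers,  # the setting from my second question
    )
    shapes = []
    for batch in loader:
        # Move the whole batch to the GPU (if there is one) inside the loop.
        batch = batch.to(device)
        shapes.append(tuple(batch.shape))
    return shapes
```

Calling `run_epoch(0)` loads everything in the main process; what I am unsure about is what changes when `num_workers` is larger than 0, and whether the tensors in `__getitem__` could be created on the GPU instead.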