This may seem trivial to some developers, but it is the first time I am dealing with around 50 GB of training data. Until now, I was only implementing homework assignments and could store the data locally.
Now I want to debug the baseline code and visualize the data and the training & validation curves with TensorBoard (if you use a different tool, I am open to suggestions), but the code runs on a remote GPU server, and the dataset lives there too. Since I am on a Mac, there is no way for me to run the code locally or store the dataset on my machine.
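For the TensorBoard part, the approach I have seen suggested is to run TensorBoard on the remote server and forward its port over SSH so the dashboard opens in my local browser. A rough sketch of what I mean (the log directory path, port, and `user@remote-host` are placeholders, not my actual setup):

```shell
# On the remote server: point TensorBoard at the run's log directory
# (assumes tensorboard is installed in the remote environment; the path is a placeholder)
tensorboard --logdir /path/to/runs --port 6006

# On my Mac: forward remote port 6006 to localhost:6006
# ("user@remote-host" is a placeholder for the actual server)
ssh -N -L 6006:localhost:6006 user@remote-host

# Then open http://localhost:6006 in the local browser
```

Is this the standard way people do it, or is there a better workflow?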
I am wondering how you handle these kinds of situations in bigger projects at scale:
- How do you visualize the data?
- How do you handle implementation (do you write the code locally, and if so, how do you test it?)
I am looking for general advice. Could you please help me?
Thanks a lot