Visual monitoring when training/evaluating, or a TensorBoard equivalent?

pengsun · January 23, 2017, 6:39am

Hi,

Since leveraging the Python ecosystem seems to be one of the main reasons for switching from Torch 7 to PyTorch, could you provide guidance on visually monitoring training/evaluation with third-party Python libraries, which is hard to do in pure Torch 7? For example:

Something like TensorBoard that can be accessed remotely (say, I have to run training on a remote machine with the GUI desktop disabled, so I want to monitor the progress from my local machine)
A cool panel like the one shown at: https://deepmind.com/blog/reinforcement-learning-unsupervised-auxiliary-tasks/

Sorry if this topic is outside the scope of this forum.

smth · January 24, 2017, 3:40am

@pengsun there are some scripts provided in this thread: https://www.reddit.com/r/MachineLearning/comments/5pbdnj/d_visualizing_training_with_pytorch/

Particularly, look at:
https://github.com/TeamHG-Memex/tensorboard_logger

Atcold · January 24, 2017, 7:56pm

@pengsun, TensorBoard bindings for PyTorch and Torch will be released roughly next week.

Atcold · January 31, 2017, 3:17pm

@pengsun, the TensorBoard extension is now available here.

pengsun · February 3, 2017, 5:56am

@smth, @Atcold, thanks for your links!

edgarriba · February 15, 2017, 12:11pm

@smth is there any plan to release an official logger?

AjayTalati · February 23, 2017, 12:00pm

Hi, noob question here; it might be especially relevant for people moving over from TensorFlow.

Has anyone got Crayon working with multiple threads? I can't figure it out. I keep getting this error:

Traceback (most recent call last):
  File "/home/ajay/anaconda3/envs/pyphi/lib/python3.6/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/home/ajay/anaconda3/envs/pyphi/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ajay/PythonProjects/PyT_Neural_Arch_Search_v1_2/train_v1.py", line 201, in train
    foo.get_scalar_values("Mean Reward")
  File "/home/ajay/anaconda3/envs/pyphi/lib/python3.6/site-packages/pycrayon/crayon.py", line 167, in get_scalar_values
    return json.loads(r.text)
  File "/home/ajay/anaconda3/envs/pyphi/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/home/ajay/anaconda3/envs/pyphi/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/ajay/anaconda3/envs/pyphi/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

It works fine, though, when I run the following from the interpreter:

from pycrayon import CrayonClient
cc = CrayonClient(hostname="http://127.0.1.1", port=8889)
foo = cc.create_experiment("train_" + str(rank))

foo.add_scalar_value("Mean Reward", mean_reward, step=episode_count)
foo.add_scalar_value("Max Reward", max_reward, step=episode_count)

foo.get_scalar_values("Mean Reward")
foo.get_scalar_values("Max Reward")

Thanks for your help
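
For readers hitting the same traceback: it ends in `json.loads(r.text)`, so the server's reply was presumably empty or not JSON at all (this is a guess from the traceback, not something confirmed in the thread). A minimal sketch that reproduces the same message using only the standard library; `parse_reply` is a hypothetical helper for illustration, not part of pycrayon:

```python
import json


def parse_reply(text):
    """Decode a reply body, returning None when it is not valid JSON.

    `parse_reply` is a hypothetical helper, not a pycrayon function.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # An empty body fails at the very first character, which gives
        # "Expecting value: line 1 column 1 (char 0)".
        print("reply was not JSON:", err)
        return None


empty = parse_reply("")           # empty body, as a failed request might return
values = parse_reply("[1, 2, 3]")  # a well-formed JSON body decodes normally
```

Wrapping the decode this way only makes the failure visible; it does not explain why the server sent a non-JSON reply in the first place.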

AjayTalati · February 23, 2017, 1:35pm

Never mind, it works without the extra lines:

foo.get_scalar_values("Mean Reward")
foo.get_scalar_values("Max Reward")

Also, I needed to give each thread a distinct date-time stamp:

from datetime import datetime
from pycrayon import CrayonClient

cc = CrayonClient(hostname="http://127.0.1.1", port=8889)
exp_name = "train_" + str(rank) + "_" + datetime.now().strftime('train_%m-%d_%H-%M')  # prevents server errors ??
foo = cc.create_experiment(exp_name)
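
The naming scheme above can be sketched as a small standalone function. This is only an illustration: `experiment_name` is a hypothetical helper, and the assumption is that the server errors came from reusing an experiment name, presumably one left over from an earlier run. Adding seconds to the timestamp, which the original snippet does not do, makes collisions between quick restarts less likely:

```python
from datetime import datetime


def experiment_name(rank, now=None):
    """Build a per-worker name such as 'train_3_02-23_13-35-07'.

    `experiment_name` is a hypothetical helper; `rank` is the worker index
    and `now` can be fixed for reproducible names (defaults to the current time).
    """
    now = now if now is not None else datetime.now()
    return "train_{}_{}".format(rank, now.strftime("%m-%d_%H-%M-%S"))


# Fixed timestamp so the example is deterministic.
fixed = datetime(2017, 2, 23, 13, 35, 7)
names = [experiment_name(rank, fixed) for rank in range(4)]
print(names[0])  # train_0_02-23_13-35-07
```

The resulting string would then be passed to `cc.create_experiment(...)` in each worker, as in the snippet above.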

ibadami · July 10, 2017, 9:10am

@Atcold, are the TensorBoard bindings for PyTorch already available?