There is no PyTorch here. I just wanted to share the fact that 150 lines of NumPy code, with a simple linear policy and a basic SGD, can reach such performance in MuJoCo environments:
So it’s possible to explore ARS performance with other kinds of policies than linear ones, using PyTorch tools.
But I did this before version 0.4, so it’s probably a bit old-fashioned now.
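To give a rough idea of the core update, here is a minimal sketch of the basic random search step with a linear policy in plain NumPy. It is not the 150-line code mentioned above: the names `ars_train` and `toy_reward`, the hyperparameter values, and the toy reward (which stands in for a MuJoCo rollout) are all assumptions made for illustration, and state normalization is left out.

```python
import numpy as np

def ars_train(reward_fn, shape, n_iters=300, n_dirs=8, top_b=4,
              step_size=0.02, noise=0.03, seed=0):
    """Basic random search over the weights of a linear policy (sketch)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(shape)  # linear policy weights, action = theta @ state
    for _ in range(n_iters):
        # Sample random perturbation directions, one per row.
        deltas = rng.standard_normal((n_dirs,) + shape)
        # Evaluate the reward on both sides of each direction.
        r_pos = np.array([reward_fn(theta + noise * d) for d in deltas])
        r_neg = np.array([reward_fn(theta - noise * d) for d in deltas])
        # Keep the top_b directions, ranked by their best reward.
        best = np.argsort(-np.maximum(r_pos, r_neg))[:top_b]
        # Scale the step by the std of the rewards that are actually used.
        sigma_r = np.concatenate([r_pos[best], r_neg[best]]).std() + 1e-8
        step = np.tensordot(r_pos[best] - r_neg[best], deltas[best], axes=1)
        theta = theta + step_size / (top_b * sigma_r) * step
    return theta

# Hypothetical stand-in for an environment rollout: the reward is highest
# when the linear policy reproduces the actions of a hidden target policy.
_rng = np.random.default_rng(42)
target = _rng.standard_normal((2, 3))   # hidden "expert" weights (made up)
states = _rng.standard_normal((64, 3))  # fixed batch of toy states

def toy_reward(theta):
    actions = states @ theta.T
    expert_actions = states @ target.T
    return -np.mean(np.sum((actions - expert_actions) ** 2, axis=1))

theta = ars_train(toy_reward, shape=(2, 3))
print("reward before:", toy_reward(np.zeros((2, 3))))
print("reward after: ", toy_reward(theta))
```

Since `reward_fn` only needs to map a weight array to a scalar, swapping in a real environment rollout, or a PyTorch model whose parameters are set from the perturbed weights, should not require changing the update itself.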
Thank you very much (also for your fast response), this is great. However, it would be even better if it could also include a simple classification example (rather than RL), where I guess augmented random search can still be used (at least that is the case in the above example). It would be much easier for me to grasp it in such a simple classification setting than in RL, where I am a newbie, and then give it a shot on my existing problems. Cheers.