End-to-end A3C implementation with openai gym

denizs · July 11, 2017, 4:08pm

Hey all,
I’m currently working on an implementation of A3C integrated with openai gym.
As some of the environments (e.g. CartPole-v0) don’t return an rgb array as an observation, I leverage env.render(mode='rgb_array') to obtain the pixel representation of the current game state.
However, this causes my worker processes to fail silently. The call simply doesn’t execute and the program exits the training loop without any error message.

I was wondering whether anyone has already encountered this problem and found a fix for it. My current machine is a late 2013 MacBook Pro.
Tbh, I’m not that familiar with multiprocessing, but from what I have found on Stack Overflow, this might be related to the fact that you can’t call UI actions on a child thread.
Best,
Deniz
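
A minimal sketch for narrowing down a silent worker failure like the one described above. It assumes the workers are plain `multiprocessing.Process` objects (`torch.multiprocessing` wraps the same API). The helper names `describe_exit`, `launch_workers`, and `report` are made up for illustration. The only library fact it relies on is that `Process.exitcode` is negative when the child was killed by a signal, which is what a crash inside native windowing code would typically look like.

```python
import multiprocessing as mp
import signal


def describe_exit(exitcode):
    """Turn Process.exitcode into a readable message."""
    if exitcode is None:
        return "still running"
    if exitcode == 0:
        return "exited cleanly"
    if exitcode < 0:
        # multiprocessing reports death-by-signal as the negated signal number
        try:
            name = signal.Signals(-exitcode).name
        except ValueError:
            name = "signal %d" % -exitcode
        return "killed by " + name
    return "exited with status %d" % exitcode


def launch_workers(target, n_workers):
    """Start n_workers processes that each run target(rank)."""
    procs = [mp.Process(target=target, args=(rank,)) for rank in range(n_workers)]
    for p in procs:
        p.start()
    return procs


def report(procs):
    """Wait for every worker and print how it ended."""
    for p in procs:
        p.join()
        print(p.name, describe_exit(p.exitcode))


# Usage, inside `if __name__ == "__main__":` with your own train(rank) function:
#     report(launch_workers(train, n_workers=4))
```

If a worker shows up as killed by a signal rather than exiting with a Python traceback, the crash most likely happened below the Python level, which would fit the rendering theory.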

dgriff · July 11, 2017, 4:21pm

The PyTorch tutorials show how to do it here: http://pytorch.org/tutorials/intermediate/reinforcement_q_learning.html

Look under the input extraction section.

denizs · July 11, 2017, 6:25pm

Thanks for your input @dgriff!
However, this tutorial addresses the DQN algorithm, as opposed to A3C, which is what I am trying to implement. The issue I ran into is that I can’t call render(mode='rgb_array') in different processes when using torch.multiprocessing to spawn multiple worker agents that asynchronously update the global model, as described by Mnih et al. (https://arxiv.org/pdf/1602.01783.pdf).

dgriff · July 11, 2017, 6:49pm

Ah OK, didn’t see you were doing A3C. I’ll try it on my A3C real quick and see if I have the same issue.

dgriff · July 11, 2017, 8:50pm

Yeah, I couldn’t get it to work either. I think it’s as you say: you can’t call the UI actions in the child threads. A3C is kinda overkill for CartPole anyway, so try another environment.

You can try my A3C repo out here if you’re interested: https://github.com/dgriff777/rl_a3c_pytorch
It has the top scores in the OpenAI Gym Atari games.
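
If the problem really is that UI calls only work on the main thread, one possible workaround is to keep all rendering on the main thread and let the workers ask for frames through a queue. The sketch below only illustrates that pattern with the standard `threading` and `queue` modules. `fake_render` is a hypothetical stand-in for `env.render(mode='rgb_array')`, and the `(rank, step)` tuple is a placeholder for real environment state. With process-based workers the reply queues would need to be created up front and handed to each worker at start-up, since a `multiprocessing.Queue` cannot be sent through another queue.

```python
import queue
import threading


def fake_render(state):
    """Hypothetical stand-in for env.render(mode='rgb_array')."""
    return [state] * 3


def worker(rank, requests, results, n_steps):
    """Runs off the main thread; never renders, only asks for frames."""
    reply = queue.Queue(maxsize=1)    # private channel for this worker's frames
    frames = []
    for step in range(n_steps):
        state = (rank, step)          # placeholder for the real env state
        requests.put((reply, state))
        frames.append(reply.get())    # block until the main thread has rendered
    results[rank] = frames
    requests.put((None, None))        # sentinel: this worker is finished


def serve_renders(requests, n_workers):
    """Runs on the main thread: render for the workers until all have finished."""
    finished = 0
    while finished < n_workers:
        reply, state = requests.get()
        if reply is None:
            finished += 1
        else:
            reply.put(fake_render(state))


def run(n_workers=3, n_steps=4):
    requests, results = queue.Queue(), {}
    threads = [
        threading.Thread(target=worker, args=(rank, requests, results, n_steps))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()
    serve_renders(requests, n_workers)  # the only place rendering happens
    for t in threads:
        t.join()
    return results
```

The cost of this design is that rendering becomes a serial bottleneck, so it probably only makes sense when the render call is cheap compared to the rest of the training step.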

denizs · July 12, 2017, 10:34am

Too bad! Thanks for the effort though.
The overkill part is true, but I’d like to take the cart pole and transform it into a cart pole swingup env with some additional parameters for my master’s thesis, as swingup is already part of my thesis’ name.
I guess I will dig into how e.g. PongDeterministic-v3 works seamlessly with multiple threads (I guess it’s because it has a C++ backend which is called on the main thread) and try to apply that to the Python-only env. Maybe I can find a way to write directly into an image buffer without needing to perform the actual UI updates, or offload the rendering onto the main thread. Too bad I’m not that familiar with either multiprocessing in Python or graphics.
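
A rough sketch of the "write directly into an image buffer" idea, using only the standard library so that no window or UI toolkit is involved. This is not gym’s renderer: the frame size, colors, cart and pole dimensions, the track half-width `X_RANGE`, and the sign convention for `theta` are all assumptions picked for illustration.

```python
import math

WIDTH, HEIGHT = 120, 80           # illustrative frame size in pixels
X_RANGE = 2.4                     # assumed half-width of the track in world units
WHITE, BLACK, RED = (255, 255, 255), (0, 0, 0), (200, 50, 50)


def put_pixel(frame, col, row, color):
    """Write one RGB pixel, silently clipping anything outside the frame."""
    if 0 <= col < WIDTH and 0 <= row < HEIGHT:
        i = (row * WIDTH + col) * 3
        frame[i:i + 3] = bytes(color)


def get_pixel(frame, col, row):
    i = (row * WIDTH + col) * 3
    return tuple(frame[i:i + 3])


def fill_rect(frame, left, top, width, height, color):
    for row in range(top, top + height):
        for col in range(left, left + width):
            put_pixel(frame, col, row, color)


def draw_line(frame, x0, y0, x1, y1, color):
    """Sample points along the segment; crude but dependency-free."""
    steps = int(max(abs(x1 - x0), abs(y1 - y0))) + 1
    for s in range(steps + 1):
        t = s / steps
        put_pixel(frame, round(x0 + t * (x1 - x0)), round(y0 + t * (y1 - y0)), color)


def render_cartpole(x, theta):
    """Return a flat RGB bytearray (row-major) showing the cart and pole."""
    frame = bytearray([255]) * (WIDTH * HEIGHT * 3)   # white background
    cart_w, cart_h, pole_len = 20, 10, 40
    track_row = HEIGHT - 15
    # map world x in [-X_RANGE, X_RANGE] to a pixel column
    cart_col = round((x + X_RANGE) / (2 * X_RANGE) * (WIDTH - 1))
    cart_top = track_row - cart_h
    fill_rect(frame, cart_col - cart_w // 2, cart_top, cart_w, cart_h, BLACK)
    # assumed convention: theta = 0 is upright, positive theta tilts right
    tip_col = cart_col + pole_len * math.sin(theta)
    tip_row = cart_top - pole_len * math.cos(theta)
    draw_line(frame, cart_col, cart_top, tip_col, tip_row, RED)
    return frame
```

Since nothing here touches a display, it can be called from any worker process. If NumPy is available, `numpy.frombuffer(frame, dtype=numpy.uint8).reshape(HEIGHT, WIDTH, 3)` turns the result into the usual height x width x 3 array.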

dgriff · July 12, 2017, 1:47pm

Pong works because the gym env is set up so you don’t have to call env.render to get RGB values; the states are already set up to return the raw values. Whereas for CartPole we need to grab the image and extract the raw values from that. “At least that’s how it looks to me in the gym env setup at a quick glance lol”

denizs · July 12, 2017, 2:02pm

That’s one way, but you can also call render(mode='rgb_array') in your run() method. You can check out this gist to reproduce the issue.

dgriff · July 12, 2017, 2:16pm

Yeah, I was saying that I think the problem comes from having to use render(), which in turn uses rendering code that gets the RGB values from a created image. Whereas the RGB values in Atari come from just converting the raw pixel data.

github.com

openai/gym/blob/master/gym/envs/classic_control/rendering.py

"""
2D rendering framework
"""
from __future__ import division
import os
import six
import sys

if "Apple" in sys.version:
    if 'DYLD_FALLBACK_LIBRARY_PATH' in os.environ:
        os.environ['DYLD_FALLBACK_LIBRARY_PATH'] += ':/usr/lib'
        # (JDS 2016/04/15): avoid bug on Anaconda 2.3.0 / Yosemite

from gym.utils import reraise
from gym import error

try:
    import pyglet
except ImportError as e:
    reraise(suffix="HINT: you can install pyglet directly via 'pip install pyglet'. But if you really just want to install all Gym dependencies and not have to think about it, 'pip install -e .[all]' or 'pip install gym[all]' will do it.")

(This file has been truncated.)