[RL Learning Series][#2] A simple grid_mdp test program

Below is a simple program for testing grid_mdp.py. When run, it steps through the environment using random actions.

    import gym
    import random
    from gym import wrappers

    # GridWorld-v0 is the custom environment defined in grid_mdp.py
    env = gym.make('GridWorld-v0')

    # Record run statistics under ./outputs
    env = wrappers.Monitor(env, './outputs/grid_mdp-experiment-', force=True)

    for episode in range(100):
        env.reset()
        for i in range(100):
            env.render()
            # take a random action (random.choice needs env.action_space to be a sequence)
            next_state, reward, done, _ = env.step(random.choice(env.action_space))

            if done:
                break

        print('episode: ', episode)

    env.close()

 ========================Corrections====================

2018.05.18:  Running the program above sometimes produces the following error message

  Traceback (most recent call last):
  File "/home/lsa-dla/PycharmProjects/grid_mdp/lsa_test1.py", line 11, in <module>
  env.reset()
  File "/home/lsa-dla/gym/gym/wrappers/monitor.py", line 37, in reset
  self._before_reset()
  File "/home/lsa-dla/gym/gym/wrappers/monitor.py", line 185, in _before_reset
  self.stats_recorder.before_reset()
  File "/home/lsa-dla/gym/gym/wrappers/monitoring/stats_recorder.py", line 68, in before_reset
  raise error.Error("Tried to reset environment which is not done. While the monitor is active for {}, you cannot call reset() unless the episode is over.".format(self.env_id))
  gym.error.Error: Tried to reset environment which is not done. While the monitor is active for GridWorld-v0, you cannot call reset() unless the episode is over.
  Exception ignored in: <bound method Viewer.__del__ of <gym.envs.classic_control.rendering.Viewer object at 0x7f1ed7e63208>>
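
The message says that reset() was called while the current episode had not yet reported done. The guard behind it can be sketched in plain Python as follows. EpisodeGuard and its method names are hypothetical stand-ins chosen for illustration, not gym's actual code, and RuntimeError is used here in place of gym.error.Error.

```python
class EpisodeGuard:
    """Hypothetical stand-in for the monitor's reset check (not gym's code)."""

    def __init__(self):
        self.done = None  # None until the first reset
        self.steps = 0

    def before_reset(self):
        # Reject a reset while an episode has started but not finished.
        if self.done is not None and not self.done and self.steps > 0:
            raise RuntimeError("Tried to reset environment which is not done.")
        self.done = False
        self.steps = 0

    def after_step(self, done):
        # Remember whether the latest step ended the episode.
        self.steps += 1
        self.done = done


guard = EpisodeGuard()
guard.before_reset()          # first reset: allowed
guard.after_step(done=False)  # episode still running
try:
    guard.before_reset()      # reset before done: rejected
    rejected = False
except RuntimeError:
    rejected = True
print(rejected)  # True
```

With random actions, an episode only occasionally fails to finish within the step limit, which would explain why the error shows up only sometimes.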

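The error occurs when the inner loop uses up its 100 steps before the episode has ended: the next env.reset() is then called on an unfinished episode, which the Monitor rejects. To fix it, change the inner loop

      for i in range(100):

to

      while True:

so that each episode keeps stepping until done is True. The program then runs normally.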
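
To see the difference between a step-capped loop and a loop that runs until done, here is a small self-contained sketch. CorridorEnv is a hypothetical stand-in environment (a one-dimensional corridor with the goal at the right end), not the GridWorld-v0 environment from grid_mdp.py. It only mimics the (state, reward, done, info) return shape of step().

```python
import random


class CorridorEnv:
    """Hypothetical stand-in: cells 0..size-1, goal at the right end."""

    def __init__(self, size=8):
        self.size = size
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the walls clamp the position
        self.state = min(max(self.state + action, 0), self.size - 1)
        done = self.state == self.size - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done, {}


def run_capped(env, rng, max_steps):
    """Capped pattern: stop after max_steps even if the episode is unfinished."""
    env.reset()
    done = False
    for _ in range(max_steps):
        _, _, done, _ = env.step(rng.choice([-1, 1]))
        if done:
            break
    return done


def run_until_done(env, rng):
    """Uncapped pattern: keep stepping until the environment reports done."""
    env.reset()
    steps = 0
    while True:
        _, _, done, _ = env.step(rng.choice([-1, 1]))
        steps += 1
        if done:
            break
    return done, steps


rng = random.Random(0)
env = CorridorEnv(size=8)
print(run_capped(env, rng, max_steps=3))  # False: the goal is 7 cells away
finished, steps = run_until_done(env, rng)
print(finished)  # True
```

When the capped loop returns False, the next reset() lands on an unfinished episode, which is the situation the Monitor complains about. The trade-off of while True is that an environment that never reaches a terminal state would loop forever.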

---------------------------------------------------------------------------

Original post: https://www.cnblogs.com/lishyhan/p/9052161.html