RL Problems

1.Delayed, sparse reward(feedback), Long-term planning

Hierarchical Deep Reinforcement Learning, Sub-goal, SAMDP, optoins, Thompson sampling, Boltzman exploration, Improving Exploration

2.Partial observability, Imperfect-Information

Memory, Nash equilibria, MCTS, self-play, LSTM, active perception, curiosity

3.Large state space, Large action space

Hardware, Distributon, Deeper Neural Network.

原文地址:https://www.cnblogs.com/huangshiyu13/p/7353706.html