Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

2019-07-15 22:23:02

 

Paper: https://arxiv.org/pdf/1801.01290.pdf or Updated Version: https://arxiv.org/pdf/1812.05905.pdf

Project: https://sites.google.com/view/soft-actor-critic or https://sites.google.com/view/sac-and-applications/

TensorFlow: https://github.com/haarnoja/sac

PyTorch: https://github.com/vitchyr/rlkit

Demo video: https://www.youtube.com/channel/UCxXt8Br3-wyluz9Q08-fsaA

 

Good Related Blog: https://zhuanlan.zhihu.com/p/70360272

 

==== Related Video Tutorials (A2C, A3C):

A brief review of Actor-Critic Algorithms:   https://www.youtube.com/watch?v=aODdNpihRwM 

CS885 Lecture 7b: Actor Critic:        https://www.youtube.com/watch?v=5Ke-d1Itk3k 

DRL Lecture 6: Actor-Critic:          https://www.youtube.com/watch?v=j82QLgfhFiY&t=27s

Build an A2C agent that learns to play Sonic with Tensorflow (tutorial):   https://www.youtube.com/watch?v=GCfUdkCL7FQ

Reinforcement Learning 6: Policy Gradients and Actor Critics (DeepMind):    https://www.youtube.com/watch?v=bRfUxQs6xIM&t=27s

Actor Critic (A3C) Tutorial:         https://www.youtube.com/watch?v=O5BlozCJBSE 

Actor Critic Algorithms:            https://www.youtube.com/watch?v=w_3mmm0P0j8&t=2s 

 

 

==

 

Original article: https://www.cnblogs.com/wangxiaocvpr/p/11191272.html