Neuromodulated Spike-Timing-Dependent Plasticity, and Theory of Three-Factor Learning Rules

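Disclaimer: the original paper is the one named in the title. If this post infringes any rights, please contact the author and it will be taken down.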

FRONTIERS IN NEURAL CIRCUITS (2016): 85-85

Abstract

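Classical Hebbian learning emphasizes joint pre- and postsynaptic activity but neglects the potential role of neuromodulators. Because neuromodulators convey information about novelty or reward, their influence on synaptic plasticity is useful not only for learning in classical conditioning but also for deciding whether new memories are formed in response to sensory stimuli. In this review, we focus on the timing requirements for pre- and postsynaptic activity in conjunction with one or several phasic neuromodulatory signals. While the emphasis is on abstract conceptual models and mathematical theories, we also discuss experimental evidence for neuromodulation of spike-timing-dependent plasticity (STDP). We highlight the importance of synaptic mechanisms in bridging the temporal gap between sensory stimulation and neuromodulatory signals, and we develop a framework for neo-Hebbian three-factor learning rules that depend on presynaptic activity, postsynaptic variables, and neuromodulators.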

Keywords: STDP, plasticity, neuromodulation, reward learning, novelty, spiking neuron networks, synaptic plasticity (LTP/LTD)

1. INTRODUCTION

2. BASIC CONCEPTS: HEBBIAN AND MODULATED HEBBIAN PLASTICITY

2.1. Conceptual Example: Reward-Based Learning

2.2. Conceptual Example: Novelty-Based Learning

2.3. Conceptual Role of Neuromodulators in Plasticity

3. EXPERIMENTAL EVIDENCE FOR NEUROMODULATION OF STDP

3.1. Questions Regarding Modulated STDP

3.2. STDP Protocols in Conjunction with Neuromodulators

3.3. Traditional Plasticity Protocols in Conjunction with Neuromodulators

4. THEORIES OF MODULATED STDP

4.1. Formalization of Modulated Hebbian Plasticity

4.2. Policy Gradient Models: R-max

4.3. Phenomenological Models: R-STDP

4.4. Temporal-Difference Learning with STDP

4.5. Beyond Rewards: Other Models of Three-Factor Learning Rules

5. DISCUSSION

5.1. A General Framework for Reward-Modulated STDP

5.2. Subtraction of the Expected Reward

5.3. Eligibility Traces and Synaptic Tagging

5.4. Role of the Post-before-pre Part of the STDP Window

5.5. Implications for the Search of Experimental Evidence

5.5.1. Outlook