Monthly
288 pp. per issue, 6 x 9,
illustrated
Founded: 1989
ISSN 0899-7667
E-ISSN 1530-888X
2008 ISI Impact Factor: 2.378

Neural Computation

February 2005, Vol. 17, No. 2, Pages 245-319
Posted Online March 13, 2006.
(doi:10.1162/0899766053011555)
© 2005 Massachusetts Institute of Technology
Temporal Sequence Learning, Prediction, and Control: A Review of Different Models and Their Relation to Biological Mechanisms

Florentin Wörgötter

Department of Psychology, University of Stirling, Stirling FK9 4LA, Scotland

Bernd Porr

Department of Psychology, University of Stirling, Stirling FK9 4LA, Scotland


In this review, we compare methods for temporal sequence learning (TSL) across four areas: machine control, classical conditioning, neuronal models of TSL, and spike-timing-dependent plasticity (STDP). The review introduces the most influential models and focuses on two questions: To what degree are reward-based (e.g., TD learning) and correlation-based (Hebbian) learning related? And how do the different models correspond to possibly underlying biological mechanisms of synaptic plasticity? We first compare the different models in an open-loop condition, where behavioral feedback does not alter the learning. Here we observe that reward-based and correlation-based learning are indeed very similar. Machine control is then used to introduce the problem of closed-loop control (e.g., actor-critic architectures). Here we discuss the problem of evaluative (rewards) versus nonevaluative (correlations) feedback from the environment, showing that the two learning approaches are fundamentally different in the closed-loop condition. In trying to answer the second question, we compare neuronal versions of the different learning architectures to the anatomy of the involved brain structures (basal ganglia, thalamus, and cortex) and the molecular biophysics of glutamatergic and dopaminergic synapses. Finally, we discuss the different algorithms used to model STDP and compare them to reward-based learning rules. Certain similarities are found in spite of the strongly different timescales. Here we focus on the biophysics of the different calcium-release mechanisms known to be involved in STDP.
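
To make the comparison between reward-based and correlation-based learning more concrete, the following minimal sketch places one update step of each kind side by side. It is an illustration under stated assumptions, not the authors' formulation: the abstract gives no equations, so the TD(0) rule with a linear value estimate and the differential Hebbian rule (input multiplied by the temporal change of the output) are textbook forms chosen here, and all function names, parameter names, and default values are hypothetical.

```python
# Illustrative sketch only. The update rules are standard textbook forms,
# assumed here for illustration; they are not quoted from the review.

def dot(w, x):
    """Weighted sum of inputs, used as a linear output or value estimate."""
    return sum(wi * xi for wi, xi in zip(w, x))


def td0_update(w, x_now, x_next, reward, alpha=0.1, gamma=0.9):
    """Reward-based step: TD(0) with a linear value estimate V(x) = w . x.

    The weight change is driven by the prediction error
    delta = reward + gamma * V(x_next) - V(x_now).
    """
    delta = reward + gamma * dot(w, x_next) - dot(w, x_now)
    new_w = [wi + alpha * delta * xi for wi, xi in zip(w, x_now)]
    return new_w, delta


def differential_hebbian_update(w, x_now, v_prev, v_now, mu=0.1):
    """Correlation-based step: input times the temporal change of the output.

    No reward term appears; the change is driven only by the correlation
    between the input and the discrete-time derivative of the output.
    """
    dv = v_now - v_prev
    return [wi + mu * xi * dv for wi, xi in zip(w, x_now)]
```

In both functions only the weights of currently active inputs change, and the change scales with a difference between successive time steps. The TD rule takes that difference from a reward-corrected value prediction, whereas the differential Hebbian rule takes it from the output signal alone.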

Cited by

Wiebke Potjans, Abigail Morrison, Markus Diesmann. (2009) A Spiking Neural Network Model of an Actor-Critic Learning Agent. Neural Computation 21:2, 301-339
Online publication date: 1-Feb-2009.
André Grüning. (2007) Elman Backpropagation as Reinforcement for Simple Recurrent Networks. Neural Computation 19:11, 3108-3131
Online publication date: 1-Nov-2007.
Bernd Porr, Florentin Wörgötter. (2007) Learning with “Relevance”: Using a Third Factor to Stabilize Hebbian Learning. Neural Computation 19:10, 2694-2719
Online publication date: 1-Oct-2007.
Bernd Porr, Florentin Wörgötter. (2006) Strongly Improved Stability and Faster Convergence of Temporal Sequence Learning by Using Input Correlations Only. Neural Computation 18:6, 1380-1412
Online publication date: 1-Jun-2006.
