Meta-reinforcement learning in a thalamo-orbitofrontal circuit

Published: April 30, 2020, 10 a.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.04.28.066878v1?rss=1

Authors: Namboodiri, V., Hobbs, T., Trujillo-Pisanty, I., Simon, R., Stuber, G.

Abstract: Learning to predict rewards is essential for the survival of animals. Contemporary views suggest that such learning is driven by a reward prediction error, the difference between received and predicted rewards. Here we show using two-photon calcium imaging and optogenetics in mice that a different class of reward learning signals exists within the orbitofrontal cortex (OFC). Specifically, the reward responses of many OFC neurons exhibit plasticity consistent with filtering out rewards that are less salient for learning (such as predicted rewards, or unpredicted rewards available in a context containing highly salient aversive stimuli). We show using quasi-simultaneous imaging and optogenetics that this reward response plasticity is sculpted by medial thalamic inputs to OFC. These results provide a biological substrate for emerging theoretical views of meta-reinforcement learning in prefrontal cortex.

Copyright belongs to the original authors. Visit the link for more info.