Understanding decision criterion learning

Understanding decision criterion learning: From signal detection theory to neural implementation

Project goals:

This project conducts behavioral and neurophysiological experiments to achieve a deeper understanding of adaptive choice algorithms and their neural implementation. We have previously developed an extension of the Kac-Dorfman-Biderman (KDB) model (see below) that is purely income-based (i.e., it learns only from rewards and not from errors) and that explains animal data very well at the algorithmic level. However, this income-based model has so far been tested only against optimization, not against an error-based learning model. The KDB model and related criterion learning models were originally developed to describe human behavior in psychophysical experiments. The original work by Dorfman & Biderman found that humans learn from errors rather than from income, and we want to understand this difference between human and animal behavior. Furthermore, we need to test our model in a wider set of stimulus situations, e.g., whether criterion learning is stable and invariant when stimulus set size or stimulus probabilities change. Beyond the algorithmic level, our long-term goal is to connect the behavioral and algorithmic descriptions to the implementational, i.e., neural, level. To do so, we will connect our criterion learning model with standard reinforcement learning models and neural network models that have previously been linked to concrete neural representations.

Background:

The goal of cognitive neuroscience is to understand how the brain controls intelligent behavior. Cognitive neuroscientists advocate a top-down research strategy that starts with a careful description of behavior and an understanding of the problem an animal’s behavior is solving, e.g., maximizing reinforcement or minimizing punishment. David Marr called this starting point for understanding the brain the computational level. Following a computational-level explanation, the next step is to find algorithms that solve the animal’s problem in a way that is compatible with the observed behavior. On this algorithmic level, computer models or analytical models of hypothesized internal representations and processes are used to simulate an animal’s behavior. Such simulations should explain how the animal’s problem is solved, and they should also explain errors, reaction times, and learning curves. Finally, the internal representations and processes that were used to simulate an animal’s behavior guide a search for neural mechanisms at the implementational level.

One of the simplest domains in cognitive neuroscience that seems ripe to bridge all of Marr’s levels is perceptual decision-making: an animal is tasked with giving a differential response that depends on a noisy sensory signal, e.g., pressing one lever if a specific stimulus is present and another lever if it is not. A correct response is reinforced; an incorrect response may be punished. Importantly, such tasks allow researchers to derive the optimal behavior that maximizes reinforcement, hence giving us a computational-level explanation. This is commonly done within the framework of signal detection theory (SDT). While SDT allows one to compute the decision criterion that optimizes expected value over rewards and punishments, it says nothing about how exactly an observer chooses a particular criterion or adopts a different criterion when feedback conditions change. Some concrete algorithmic-level adaptations of SDT have been put forward that can explain the dynamics of criterion learning. One of the first and most influential was the Kac-Dorfman-Biderman (KDB) model, which suggests that the decision criterion is shifted by a constant value after feedback, with the direction being contingent on errors and reinforcement. While this model can be shown to lead to suboptimal limit behavior under specific conditions, similar deviations from optimality have been observed in human performance (e.g., in probability matching). Importantly, these suboptimal deviations from SDT are markers for the underlying mechanisms and could be identified through a careful theoretical analysis of the model, leading to very specific behavioral predictions.
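To make the idea of constant-step criterion learning concrete, the following is a minimal sketch and not the project's actual model. It assumes an equal-variance Gaussian SDT setting (noise drawn from N(0, 1), signal from N(d', 1)), symmetric payoffs, and a purely error-driven update in which the criterion moves up by a fixed step after a false alarm and down after a miss. All function names, parameter names, and the starting criterion are hypothetical choices made for illustration.

```python
import math
import random


def optimal_criterion(d_prime, p_signal):
    """Accuracy-maximizing criterion for equal-variance Gaussian SDT
    (assumes symmetric payoffs for the two response types)."""
    # Respond "signal" when the likelihood ratio exceeds
    # beta = P(noise) / P(signal), which gives x > d'/2 + ln(beta)/d'.
    beta = (1.0 - p_signal) / p_signal
    return d_prime / 2.0 + math.log(beta) / d_prime


def simulate_error_based(d_prime, p_signal, delta, n_trials, seed=0):
    """Simulate a constant-step, error-driven criterion update.

    This is an illustrative KDB-style rule under the assumptions stated
    above, not a reproduction of any published parameterization.
    """
    rng = random.Random(seed)
    criterion = 0.0  # hypothetical starting value
    history = []
    for _ in range(n_trials):
        is_signal = rng.random() < p_signal
        # Noise ~ N(0, 1), signal ~ N(d_prime, 1)
        x = rng.gauss(d_prime if is_signal else 0.0, 1.0)
        says_signal = x > criterion
        if says_signal and not is_signal:
            criterion += delta  # false alarm: become more conservative
        elif not says_signal and is_signal:
            criterion -= delta  # miss: become more liberal
        # Correct responses leave the criterion unchanged in this variant.
        history.append(criterion)
    return history


def mean_late_criterion(history):
    """Average criterion over the second half of the simulated trials."""
    late = history[len(history) // 2:]
    return sum(late) / len(late)


if __name__ == "__main__":
    for p_signal in (0.5, 0.8):
        history = simulate_error_based(
            d_prime=1.0, p_signal=p_signal, delta=0.05, n_trials=20000
        )
        print(
            f"P(signal)={p_signal}: "
            f"late mean criterion={mean_late_criterion(history):.3f}, "
            f"optimal={optimal_criterion(1.0, p_signal):.3f}"
        )
```

Comparing the late-trial average criterion with the SDT optimum for equal and unequal stimulus probabilities gives a quick way to see where a fixed-step rule of this kind does or does not settle near the optimal criterion.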

Contact: Frank Jäkel

Project Details

Project:	Über das Erlernen von Entscheidungskriterien: Von der Signalentdeckungstheorie zur neuronalen Implementation (On the learning of decision criteria: From signal detection theory to neural implementation)
Project partners:	Technical University of Darmstadt, Prof. Frank Jäkel; Johannes Gutenberg University Mainz, Prof. Maik Stüttgen
Project duration:	2019-2022 (36 months)
Project funding:	EUR 175,000
Funded by:	DFG – Deutsche Forschungsgemeinschaft (German Research Foundation)
Project no.:	424828846