To investigate how humans learn correlations between outcomes, we scanned 16 subjects using fMRI while they performed a “resource management” game. The task invoked a scenario in which a power company generates fluctuating amounts of electricity from two renewable energy sources, a solar plant and a wind park. We instructed subjects to create an energy portfolio with the goal of keeping the total energy output as constant as possible (Figure 1A). Subjects accomplished this by adjusting weights that determined how the two resources were linearly combined. Normatively optimal performance is achievable by finding a solution that exploits knowledge of the covariance structure of these resources (Figure 1B), a task design that approximates a simple portfolio problem in finance. Importantly, the outcomes of the two resources covaried, and the correlation between them changed probabilistically over time, requiring subjects to continuously update their estimate of the current correlation structure.

This task is well suited for assessing subjects’ estimate of the correlation strength because good performance is achievable only if subjects learn both the distribution of returns for each resource and the correlation between them. We rewarded participants according to how stable they kept the total output of their mixed energy portfolio relative to the variance resulting from an optimal strategy (specified by MPT-calculated optimal weights). We speculated that subjects might solve the task by learning the correlation strength between the resources via a correlation prediction error, calculated from the cross-product of the individual resources’ outcome prediction errors (Figure 1C). This account assumes that subjects represent a continuous measure of outcome correlation and update it on a trial-by-trial basis.

To rule out alternative strategies, we examined other computational models that could be used to guide choice in our task, and fitted the free parameters of each model to obtain the model-predicted portfolio weights that most closely resembled each subject's actual responses. One such alternative model-based strategy is to exploit trial-by-trial evidence to update a representation of the portfolio weights directly, instead of first estimating the correlation coefficient. Like correlation learning, this model makes assumptions about the structure of the task and uses individual resource outcomes as a basis for learning. The main difference between the covariance-based model and this model is that in the former, subjects update an estimate of the correlation via a prediction error and then translate this correlation strength into task-specific weights on every trial, whereas in the latter the estimates of task-dependent weights (i.e.
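
As an illustration of the correlation-learning account described above, the following is a minimal Python sketch, not the fitted model itself. It assumes a simple delta rule with a single hypothetical learning rate `alpha`, standard deviations that are known and fixed, and a correlation prediction error formed by normalizing the cross-product of the two outcome prediction errors by those standard deviations. The weight formula is the standard two-asset minimum-variance solution, used here as a stand-in for the MPT-calculated optimal weights. All class, function, and parameter names are illustrative.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class CorrelationLearner:
    """Delta-rule learner for outcome means and their correlation (illustrative)."""
    alpha: float = 0.05   # hypothetical learning rate, shared by all updates
    sd1: float = 1.0      # assumed known, fixed standard deviation of resource 1
    sd2: float = 1.0      # assumed known, fixed standard deviation of resource 2
    mean1: float = 0.0    # running estimate of resource 1's mean outcome
    mean2: float = 0.0    # running estimate of resource 2's mean outcome
    rho: float = 0.0      # running estimate of the correlation coefficient

    def update(self, x1: float, x2: float) -> float:
        """Process one trial and return the correlation prediction error."""
        # Outcome prediction errors for the individual resources.
        d1 = x1 - self.mean1
        d2 = x2 - self.mean2
        # Correlation prediction error: normalized cross-product minus estimate.
        corr_pe = (d1 * d2) / (self.sd1 * self.sd2) - self.rho
        # Update the correlation estimate and keep it in the valid range.
        self.rho = max(-1.0, min(1.0, self.rho + self.alpha * corr_pe))
        # Update the mean estimates after computing the prediction errors.
        self.mean1 += self.alpha * d1
        self.mean2 += self.alpha * d2
        return corr_pe


def min_variance_weights(sd1: float, sd2: float, rho: float) -> tuple[float, float]:
    """Two-asset minimum-variance weights (w1, w2) with w1 + w2 = 1."""
    cov = rho * sd1 * sd2
    denom = sd1 ** 2 + sd2 ** 2 - 2.0 * cov
    if denom <= 1e-12:
        # Degenerate case (equal variances, rho = 1): any mix is equivalent.
        return 0.5, 0.5
    w1 = (sd2 ** 2 - cov) / denom
    return w1, 1.0 - w1


def simulate(true_rho: float, n_trials: int = 2000, seed: int = 0) -> CorrelationLearner:
    """Feed correlated Gaussian outcomes to a learner and return it."""
    rng = random.Random(seed)
    learner = CorrelationLearner()
    for _ in range(n_trials):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = learner.sd1 * z1
        x2 = learner.sd2 * (true_rho * z1 + math.sqrt(1.0 - true_rho ** 2) * z2)
        learner.update(x1, x2)
    return learner


if __name__ == "__main__":
    learner = simulate(true_rho=-0.8)
    print("estimated rho:", learner.rho)
    print("weights:", min_variance_weights(learner.sd1, learner.sd2, learner.rho))
```

In this sketch the correlation estimate is learned first and converted into weights afterwards, which mirrors the two-step structure of the covariance-based model. A model that learns the weights directly would skip the intermediate `rho` estimate.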
