Many things in life are simple to describe, yet difficult to understand. One such obvious fact is that if one consumes more calories than one expends, this will eventually lead to weight gain and, ultimately, overweight and obesity. But what causes the excess in consumption that has propelled the surge in obesity in the past decades?
Intuitively, it is tempting to assume that overeating is driven by greater pleasure derived from eating. To examine the cause of overindulgence, we often present food cues and track how the brain evaluates them. Such cues range from pictures of palatable food to simple geometric shapes predicting the delivery of a chocolate milkshake in the scanner. In line with the “reward surfeit” idea, many studies have observed an increased response to food cues in the brain’s reward circuit in overweight and obese individuals. This is then interpreted as an increased desire elicited by the prospect of food.
What’s in a milkshake? It depends on what you expect
But here’s the kicker. Many studies have also looked at the brain response to drops of milkshake in the same regions of the brain, and most of them have pointed to a reduced signal in response to the delivery of the milkshake in overweight and obese compared to healthy-weight individuals. Accordingly, the reduced reward signal has been interpreted as an indication of reduced pleasure derived from eating. But why should we run the risk of overeating if eating leads to less pleasure? The “reward deficit” theory posits that we eat until we feel gratified. Now, if 100 kcal worth of food gives you less pleasure, you might eat until you hit the same intended level of gratification, thereby exceeding the energy you expend.
At first glance, these competing ideas seem to be contradictory. However, food cues and eating are by no means independent. For example, expectations play a key role in shaping the evaluation of a reward. Imagine that you bring home a delicious cookie for your significant other. If the cookie was unexpected, this will lead to a positive surprise—a reward prediction error. Similarly, if the cookie is much tastier than expected, this will lead to a prediction error. These prediction errors are then used to adjust our future expectations. Now, imagine that you bring home the same kind of awesome cookie every day: at some point, your significant other will expect to get the cookie. As surprise will cease over time, there will be no prediction error anymore. Also, if you bring home only the expected cookie for your anniversary, you may fail to meet the expectations altogether. This will then lead to a negative surprise and a corresponding prediction error, but I’ll spare you the details of how this may pan out.
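The cookie scenarios can be put into numbers. The snippet below is a minimal sketch, assuming that a prediction error is simply the received reward minus the expected reward. The function name `prediction_error` and the values on the "tastiness" scale are illustrative assumptions, not quantities from any study.

```python
def prediction_error(received: float, expected: float) -> float:
    """Reward prediction error: what you got minus what you expected."""
    return received - expected


# Hypothetical values on an arbitrary "tastiness" scale (assumed for illustration).
scenarios = {
    "unexpected cookie": prediction_error(received=1.0, expected=0.0),
    "tastier than expected": prediction_error(received=2.0, expected=1.0),
    "same cookie every day": prediction_error(received=1.0, expected=1.0),
    "only a cookie on the anniversary": prediction_error(received=1.0, expected=3.0),
}

for name, delta in scenarios.items():
    # Positive = pleasant surprise, zero = no surprise, negative = disappointment.
    print(f"{name}: {delta:+.1f}")
```

Note that the cookie itself has the same value in three of the four cases. Only the expectation differs, and with it the sign and size of the surprise.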
Reward signals mix value and surprise
So how could expectations explain the divergent results on food reward in obesity? While most studies on food cues do not instantly deliver food inside the scanner, the majority of studies that do deliver food use cues to indicate that the reward will be coming soon. Thus, when participants see the cue, they’ll have an idea of what to expect. This expectancy shapes the evaluation of the food that is later being consumed. Intriguingly, basically the same regions in the reward circuit that represent value are also involved in computing violations of expected value. Recall the cookie example: expectations can push the evaluation of the cookie in a positive or a negative direction. Still, it seems silly to conclude that this might push the pleasure you’d derive from eating the same delicious cookie straight to a dislike because you expected more.
Hence, the link between expectations and prediction errors muddles the interpretation of reward signals elicited by eating. To reconcile these conflicting findings in obesity, we have used a computational model. Published recently in Physiology & Behavior [1], the model explicates how we learn to predict rewards using prediction errors. This framework is called reinforcement learning and has many powerful applications. For example, a virtual agent can learn via reinforcement learning to master Go or classic Atari video games and even exceed expert human levels of performance over time.
Recapitulating the typical food study using simulated volunteers
While the math of the underlying algorithms can be complicated, the idea is very simple. Whenever we get an unexpected reward, we update our expectations for preceding cues. Such a simple trial-and-error mechanism can identify useful cues whenever there is a consistent link between the cue and the reward. This link is strong if we think about the smell of freshly baked bread and its likely presence somewhere close by.
However, in the experiments studying the response of the reward system to tasty food, these links were not so consistent. Typically, the cues only predicted the food reward correctly in about half of the cases. This uncertainty is often used to tweak the design in neuroimaging. But it comes at the cost of altering how a reward is perceived once it is delivered. As a result, participants inside the scanner might try to learn from the receipt of the reward what, on average, to expect based on the cue.
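To see what such an unreliable cue does to learning, here is a minimal sketch. It assumes a basic trial-and-error update (the expectation moves a fraction of the way toward each outcome) and a cue that is followed by reward on half of the trials. The function name, the learning rate of 0.05, and the number of trials are assumptions chosen for illustration, not values from the experiments.

```python
import random


def simulate_partial_reinforcement(n_trials=2000, p_reward=0.5,
                                   learning_rate=0.05, seed=0):
    """Learn the value of a cue that is followed by reward only sometimes."""
    rng = random.Random(seed)
    expectation = 0.0        # what the cue predicts before any learning
    errors_on_reward = []    # prediction errors on rewarded trials only
    for _ in range(n_trials):
        reward = 1.0 if rng.random() < p_reward else 0.0
        delta = reward - expectation          # prediction error at the outcome
        if reward:
            errors_on_reward.append(delta)
        expectation += learning_rate * delta  # nudge the expectation toward the outcome
    return expectation, errors_on_reward


expectation, errors = simulate_partial_reinforcement()
late_errors = errors[-100:]
print(f"learned expectation for the cue: {expectation:.2f}")
print(f"mean surprise when reward arrives (late trials): "
      f"{sum(late_errors) / len(late_errors):.2f}")
```

In this sketch the expectation hovers around the average payoff of the cue rather than the full reward, so a delivered reward keeps producing a positive surprise no matter how long learning continues.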
Learning what to expect from a food cue
Critically, we recreated the scenario using a simple reinforcement learning scheme. We reasoned that two parameters suffice to account for learning here. The reward sensitivity parameter reflects the value of the reward at stake. It closely corresponds to the concept of reward surfeit (high sensitivity) versus deficit (low sensitivity) that seems so difficult to reconcile in obesity. In addition, the learning rate controls how quickly reward expectations are updated.
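The roles of the two parameters can be illustrated with a minimal sketch. This is not the published model code. It assumes a basic update rule in which reward sensitivity scales the reward and the learning rate sets the step size, and, for simplicity, a cue that is always followed by reward. All names and values are hypothetical.

```python
def learn_cue_value(reward_sensitivity, learning_rate, n_trials=50, reward=1.0):
    """Track the expectation for a cue that is always followed by reward."""
    expectation = 0.0
    history = []
    for _ in range(n_trials):
        # Reward sensitivity scales how much the reward is "worth".
        subjective_value = reward_sensitivity * reward
        # Prediction error: experienced value minus expected value.
        delta = subjective_value - expectation
        # The learning rate sets how far the expectation moves per trial.
        expectation += learning_rate * delta
        history.append(expectation)
    return history


for sensitivity in (0.5, 2.0):
    for rate in (0.05, 0.5):
        h = learn_cue_value(sensitivity, rate)
        print(f"sensitivity={sensitivity}, learning rate={rate}: "
              f"after 5 trials {h[4]:.2f}, after 50 trials {h[-1]:.2f}")
```

Under these assumptions, reward sensitivity determines where the expectation ends up, whereas the learning rate determines how many trials it takes to get there.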
To simulate how the opposing processing of rewards may arise in obesity, we systematically varied these two parameters. This simulation led to two key insights. First, if reward sensitivity is low, as posited by the reward deficit theory, it is not possible to recreate the reported pattern of an increased reward response to cues but a decreased response to food intake. Second, if reward sensitivity is high, as posited by the reward surfeit theory, it is possible to recreate the reported pattern, provided that the learning rate is low. Intriguingly, altered reward learning thus emerges from this body of research as a novel contributor to overeating. This finding is also supported by recent studies pointing to marked deficits in reward learning rather than deficits in the response to simply devouring food.
Fuel, not fun: the motivational function of prediction errors
Collectively, our simulation results corroborate the reward surfeit theory of obesity, but they further point to learning as a crucial contributor. But what is driving these differences? We suspect that changes in energy metabolism are linked to changes in dopamine transmission in the brain. Dopamine is known to be critically involved in the control of action and reward learning, and changes in reward signaling may, again, reflect a simple metabolic fact.
The higher the (lean) body mass of a person, the more energy he or she will expend. To meet this additional expenditure of energy, one must procure more food, while the value of a calorie is reduced because it literally doesn’t get you as far [2]. Prediction errors have been demonstrated to be sensitive to such differential worth of the reward (marginal utility). And there is one more consequence that may come with it. When the magnitude of error signals is reduced, they become less powerful in adjusting behavior according to the goals of the person. Put simply, it might be harder to use rewards to stay on the right track, for example when it comes to exercising. However, more integrative research is clearly needed to better understand this intricate balance, and we are relentlessly working on filling in the remaining pieces of the big puzzle.
To conclude, there is always more in a milkshake than plain pleasure. Therefore, we need to be careful in designing studies and analyzing results before we jump straight to conclusions about the causes of obesity. One way of fact-checking the literature is to use established models of reward and motivation. We are convinced that such models can be vital assets for future research on eating and obesity.
1. Kroemer, N.B. & Small, D.M. Fuel not fun: Reinterpreting attenuated brain responses to reward in obesity. Physiol Behav 162, 37-45 (2016).
2. Kroemer, N.B., Burrasch, C. & Hellrung, L. To work or not to work: Neural representation of cost and benefit of instrumental action. Prog Brain Res 229, 125-157 (2016).