Rhythmic activity in the medial and orbital frontal cortices tracks reward value and the vigor of consummatory behavior

This study examined how the medial frontal (MFC) and orbital frontal (OFC) cortices process reward information to guide behavior. We simultaneously recorded local field potentials in the two areas as rats consumed liquid sucrose rewards and examined how the areas collectively process reward information. Both areas exhibited a 4-8 Hz “theta” rhythm that was phase locked to the lick cycle. The rhythm similarly tracked shifts in sucrose concentrations and fluid volumes, suggesting that it is sensitive to general differences in reward magnitude. Differences between the MFC and OFC were noted, specifically that the rhythm varied with response vigor and absolute reward value in the MFC, but not the OFC. Our findings suggest that the MFC and OFC concurrently process reward information but have distinct roles in the control of consummatory behavior.


INTRODUCTION
The medial and orbital frontal cortices (MFC and OFC) are two of the most studied parts of the cerebral cortex for their role in value-guided decision making, a process that ultimately results in animals consuming rewarding foods or fluids. There are extensive anatomical connections between the various parts of the MFC and OFC in rodents (Gabbott et al., 2003;Gabbott et al., 2005;Barreiros et al., 2020), and the regions are part of the medial frontal network (Öngür and Price, 2000). The MFC and OFC are thought to have specific roles in the control of behavior and specific homologies with medial and orbital regions of the primate frontal cortex (MFC: Laubach et al., 2018;OFC: Izquierdo, 2017). The extensive interconnections between MFC and OFC suggest that the two regions work together to control value-guided decisions. Unfortunately, few, if any, studies have examined concurrent neural processing in these regions of the rodent brain as animals perform behavioral tasks that depend on the two cortical regions.
In standard laboratory tasks, the action selection and outcome evaluation phases of value-guided decisions are commonly conceived as separate processes (Rangel et al., 2008).
MFC and OFC may contribute independently to these processes or interact concurrently across them. Though there is some variation across published studies, most argue for MFC having a role in action-outcome processing (Alexander and Brown, 2011;Simon et al., 2015) and OFC having a role in stimulus-outcome (stimulus-reward) processing (Gallagher et al., 1999;Schoenbaum and Roesch, 2005;Simon et al., 2015). The present study directly compared neural activity in the MFC and OFC of rats as they performed a simple consummatory task, called the Shifting Values Licking Task, or SVLT (Parent et al., 2015a). Importantly, the task depends on the ability of animals to guide their consummatory behavior based on the value of available rewards, and performance of these kinds of tasks depends on both the MFC (Parent et al., 2015a,b) and OFC (Kesner and Gilbert, 2007). The goal of the study was to use the SVLT to determine if the MFC and OFC have distinct roles in processing reward information, e.g. varying with action (licking) in MFC and the sensory properties of the rewards in OFC.
Most published studies on reward processing used operant designs with distinct actions preceding different outcomes. For example, a rat might respond in one of two choice ports to produce a highly valued reward, delivered from a separate reward port. To collect the reward, the rat has to travel across an operant chamber and then collect a food pellet or initiate licking on a spout. In such tasks (Pratt and Mizumori, 2001;van Duuren et al., 2009;van Wingerden et al., 2010;Riceberg and Shapiro, 2017;Jarovi et al., 2018;Siniscalchi et al., 2019), neural activity during the period of consumption might reflect the properties of the reward, how the animal consumes it, and/or the behaviors that precede reward collection (e.g. locomotion). As such, it is difficult to isolate reward-specific activity using such operant designs.
Several published studies have used simpler consummatory and Pavlovian designs, and found that neural activity in the MFC is selectively modulated during active consumption (Petykó et al., 2009;Horst and Laubach, 2013;Petykó et al., 2015). None of these tasks used fluids with different reward values. Amarante et al. (2017) was the first study to examine if similar neural activity was associated with animals consuming different magnitudes of reward. The study used the SVLT and presented rats with rewards that differed in terms of the concentration of sucrose contained in the rewarding fluids. The study found that neural activity in the MFC is entrained to the animals' lick cycle and that the strength of entrainment varies with the value of the rewarding fluid, i.e. stronger entrainment with higher value reward. The study also used reversible inactivation methods to demonstrate that the licking entrainment depends on neural activity in the MFC.
In the present study, we used the SVLT, and several variations on the basic task design, to study consumption related activity in MFC and OFC. A custom designed syringe pump was used to deliver different volumes of fluid over a common time period (Amarante et al., 2019).
Using the custom device, we were able to directly compare neural activity associated with differences in sucrose concentration and fluid volume. We further manipulated the predictability of changes in reward magnitude to assess how predictable and unpredictable rewards are processed and used a third, intermediate level of reward to assess if reward magnitudes are encoded in a relative or absolute manner. Our findings reveal several similarities, and key differences, between the two cortical regions across all behavioral tasks that may point to specific roles for MFC and OFC in the control of consummatory behavior.

METHODS
All procedures carried out in this set of experiments were approved by the Animal Care and Use Committee at American University (Washington, DC). All procedures conformed to the standards of the National Institutes of Health Guide for the Care and Use of Laboratory Animals.
All efforts were taken to minimize the number of animals used and to reduce pain and suffering.

Animals
Male Long Evans and Sprague Dawley rats weighing between 300 and 325 g were used in these studies (Charles River, Envigo). Rats were given one week to acclimate with daily handling prior to behavioral training or surgery and were then kept with regulated access to food to maintain 90% of their free-feeding body weight. They were given ~18 g of standard rat chow each day in the evenings following experiments. Rats were single-housed in their home cages in a 12h light/dark cycle colony room, with experiments occurring during the light cycle. A total of 12 rats had a 2x8 microwire array implanted into either the MFC (N=6), the OFC (N=2) or one array in each area contralaterally (N=4). Arrays consisted of 16 blunt-cut 50-µm tungsten (Tucker-Davis Technologies) or stainless steel (Microprobes) wires, separated by 250 µm within each row and 500 µm between rows. In vitro impedances for the microwires were ~150 kΩ.

Surgeries
Animals had full access to food and water in the days prior to surgery. Stereotaxic surgery was performed using standard methods. Briefly, animals were lightly anesthetized with isoflurane (2.5% for ~2 minutes), and were then injected intraperitoneally with ketamine (100 mg/kg) and dexdomitor (0.25 mg/kg) to maintain a surgical plane of anesthesia. The skull was exposed, and craniotomies were made above the implant locations. Microwire arrays were lowered into MFC (coordinates from bregma: AP: +3.2 mm; ML: + 1.0 mm; DV: -1.2 mm from the surface of the brain, at a 12° posterior angle; Paxinos and Watson, 2013) or into OFC (AP: +3.2 mm, ML: + 4.0 mm, DV: -4.0 mm; Paxinos and Watson, 2013). The part of the MFC studied here is also called "medial prefrontal cortex" in many rodent studies and the region is thought to be homologous to the rostral ACC of primates (Laubach et al., 2018). Four skull screws were placed along the edges of the skull and a ground wire was secured in the intracranial space above the posterior cerebral cortex. Electrode arrays were connected to a headstage cable and modified Plexon preamplifier during surgery, and recordings were made to assess neural activity during array placement. Craniotomies were sealed using cyanoacrylate (Slo-Zap) and an accelerator (Zip Kicker), and methyl methacrylate dental cement (AM Systems) was applied and affixed to the skull via the skull screws. Animals were given a reversal agent for dexdomitor (Antisedan, s.c. 0.25 mg/ml), and Carprofen (5 mg/kg, s.c.) was administered for postoperative analgesia. Animals recovered from surgery in their home cages for at least one week with full food and water, and were weighed and monitored daily for one week after surgery.

Behavioral Apparatus
Rats were trained in operant chambers housed within a sound-attenuating external chamber (Med Associates; St. Albans, VT). Operant chambers contained a custom-made glass drinking spout that was connected to multiple fluid lines allowing for multiple fluids to be consumed at the same location. The spout was centered on one side of the operant chamber wall at a height of 6.5 cm from the chamber floor. Tygon tubing connected to the back of the drinking spout administered the fluid from a 60 cc syringe hooked up to either a PHM-100 pump (Med Associates) for standard experiments, or to a customized open source syringe pump controller (Amarante et al., 2019) that is programmed by a Teensy microcontroller to deliver different volumes of fluid with the same delivery time from one central syringe pump. A "lightpipe" lickometer (Med Associates) detected licks via an LED photobeam, and each lick triggered the pump to deliver roughly 30 μL per 0.5 second. Behavioral protocols were run through Med-PC version IV (Med Associates), and behavioral data were sent via TTL pulses from the Med-PC software to the Plexon recording system.

Shifting Values Licking Task
The operant licking task used here is similar to those previously described (Parent et al., 2015a,b;Amarante et al., 2017). Briefly, rats were placed in the operant chamber for thirty minutes, where they were solely required to lick at the drinking spout to obtain a liquid sucrose reward. Licks to the light-pipe lickometer would trigger the syringe pump to deliver liquid sucrose over 0.5 sec. Every 30 sec, the reward alternated between high (16% weight per volume) and low (4% wt./vol.) concentrations of liquid sucrose, delivered in a volume of 30 μL. In volume manipulation sessions, the reward alternated between a large (27.85 μL) and small volume (9.28 μL) of 16% liquid sucrose. Rewards were delivered over a period of 0.5 sec for all levels of concentration and volume using a custom made syringe pump (Amarante et al., 2019). The animal's licking behavior was constantly recorded throughout the test sessions.

Blocked versus Randomly Interleaved Licking Task
The Shifting Values Licking Task was altered to allow for comparison of blocked versus interleaved presentations of reward values. The first three minutes of the task consisted of the standard Shifting Values Licking Task, with 30 second blocks of either the high or low concentration sucrose rewards delivered exclusively during the block. After three minutes, the rewards were presented in a pseudo-random order (e.g., high, high, low, high, low, low, high) for the rest of the test session. With rewards interleaved, rats could not predict which reward would be delivered next. Behavioral and neural data were only analyzed from the first six minutes of each test session. We focused on manipulating sucrose concentration, and not fluid volume, in this task variation, as concentration differences produced the largest effects of reward value on licking behavior (see Figure 1D below).

Three Reward Licking Task
The Shifting Values Licking Task was modified, using a third intermediate concentration of sucrose (8% wt./vol) to assess if reward value influenced behavior and neuronal activity in a relative or absolute manner. In the first three minutes of each session, rats received either the intermediate (8%) or low (4%) concentration of sucrose, with the two rewards delivered over alternating 30 second periods as in the SVLT. After three minutes, the rewards switched to the high (16%) and intermediate (8%) concentrations, and alternated between those concentrations for the rest of the session. Behavioral and neural data were only analyzed from the first six minutes of each test session.
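As a rough illustration of the block structure described above, the following Python sketch maps elapsed session time to the sucrose concentration on offer. The function name, its parameters, and the choice of which reward of each pair comes first are assumptions made for illustration; the text does not state the starting reward.

```python
def sucrose_percent(t_sec, block_s=30.0, switch_s=180.0, higher_first=True):
    """Return the sucrose concentration (% wt./vol.) available at time t_sec.

    Hypothetical helper: the first `switch_s` seconds alternate between the
    8% and 4% rewards, and the remainder alternates between 16% and 8%.
    `higher_first` is an assumption; the starting reward is not specified.
    """
    # (higher, lower) value within the current context
    higher, lower = (8, 4) if t_sec < switch_s else (16, 8)
    block_index = int(t_sec // block_s)          # index of the 30 s block
    in_even_block = block_index % 2 == 0
    return higher if in_even_block == higher_first else lower
```

Under these assumptions the 8% reward is the higher option of its pair in the first context and the lower option in the second, which is the contrast the task is designed to probe.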

Electrophysiological Recordings
Electrophysiological recordings were made using a Plexon Multichannel Acquisition Processor (MAP; Plexon; Dallas, TX). Local field potentials were sampled on all electrodes and recorded continuously throughout the behavioral testing sessions using the Plexon system via National Instruments A/D card (PCI-DIO-32HS). The sampling rate was 1 kHz. The head-stage filters (Plexon) were at 0.5 Hz and 5.9 kHz. Electrodes with unstable signals or prominent peaks at 60 Hz in plots of power spectral density were excluded from quantitative analysis.

Histology
After all experiments were completed, rats were deeply anesthetized via an intraperitoneal injection of Euthasol (100mg/kg) and then transcardially perfused using 4% paraformaldehyde in phosphate-buffered saline. Brains were cryoprotected with a 20% sucrose and 10% glycerol mixture and then sectioned horizontally on a freezing microtome. The slices were mounted on gelatin-subbed slides and stained for Nissl substance with thionin.
Statistical testing was performed in R. Paired t-tests were used throughout the study and one or two-way ANOVA (with the error term due to subject) were used to compare data for both behavior and electrophysiological measures (maximum power and maximum inter-trial phase coherence) for high and low value licks, blocked versus interleaved licks, and high-intermediate-low licks. For significant ANOVAs, the error term was removed and Tukey's post-hoc tests were performed on significant interaction terms for multiple comparisons. Descriptive statistics are reported as mean ± SEM, unless noted otherwise.
… Licking Task with differences in concentration (16% and 4% wt./vol.). Rats have been shown to acquire incentive contrast effects in the SVLT after this duration of training (Parent et al., 2015a). For the Blocked-Interleaved and Three Reward tasks, rats were tested after extensive experience in the SVLT and after two "training" sessions with the Blocked-Interleaved and Three Reward designs. The electrophysiological recordings reported here were from the animals' third session in each task.
Behavioral measures included total licks across the session, the duration and number of licking bouts, and the median inter-lick intervals (inverse of licking frequency). Bouts of licks were defined as having at least 3 licks within 300 ms and with an inter-bout interval of 0.5 sec or longer. Bouts were not analyzed in the Blocked-Interleaved Task because the constantly changing reward in the interleaved phase of the task shortened all bouts by default. While bouts of licks were reported in most tasks, electrophysiological correlates around bouts were not analyzed because there were often too few bouts (specifically for the low-lick conditions) in each session to deduce any electrophysiological effects of reward value on bout-related activity.
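The bout criterion can be read in more than one way, so the following Python sketch shows one possible reading rather than the analysis code used in the study. It assumes that a gap of 0.5 sec or more ends a bout, and that a run of licks counts as a bout only if it contains at least 3 licks and some 3 consecutive licks fall within 300 ms. The function names and parameters are hypothetical.

```python
def _close_run(run, bouts, min_licks, max_span):
    """Append (start, end, n_licks) to `bouts` if the run qualifies as a bout."""
    if len(run) < min_licks:
        return
    k = min_licks - 1
    # Require some `min_licks` consecutive licks to fit within `max_span` seconds.
    if any(run[i + k] - run[i] <= max_span for i in range(len(run) - k)):
        bouts.append((run[0], run[-1], len(run)))


def find_bouts(lick_times, min_licks=3, max_span=0.300, min_gap=0.5):
    """Group sorted lick timestamps (in seconds) into bouts.

    Assumed reading of the definition: an inter-lick gap of `min_gap`
    seconds or more separates bouts.
    """
    bouts, run = [], []
    for t in lick_times:
        if run and t - run[-1] >= min_gap:
            _close_run(run, bouts, min_licks, max_span)
            run = []
        run.append(t)
    _close_run(run, bouts, min_licks, max_span)
    return bouts
```

Bout duration then follows as `end - start` for each returned tuple, and the number of bouts is the length of the list.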
For analyzing lick rate, inter-lick intervals during the different types of rewards were obtained, and the inverse of the median inter-lick interval provided the average lick rate in Hertz. Any inter-lick interval greater than 1 sec or less than 0.09 sec was excluded from the analysis. For licks during the randomly interleaved portion of the Blocked-Interleaved Task, more than two licks in a row were needed to calculate lick rate. To analyze behavioral variability of licks, we used the coefficient of variation (ratio of the standard deviation to the mean) on high and low value inter-lick intervals that occurred within bouts.
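The lick-rate and variability measures can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's code: the function names are hypothetical, and the sample standard deviation is assumed because the text does not say which estimator was used.

```python
from statistics import mean, median, stdev


def valid_ilis(lick_times, min_ili=0.09, max_ili=1.0):
    """Inter-lick intervals (seconds), excluding those below 0.09 s or above 1 s."""
    ilis = [b - a for a, b in zip(lick_times, lick_times[1:])]
    return [x for x in ilis if min_ili <= x <= max_ili]


def lick_rate_hz(ilis):
    """Lick rate in Hz, taken as the inverse of the median inter-lick interval."""
    return 1.0 / median(ilis)


def coefficient_of_variation(ilis):
    """Ratio of the standard deviation to the mean (sample SD is an assumption)."""
    return stdev(ilis) / mean(ilis)
```

For example, a rat with a median inter-lick interval of 0.15 sec would have a lick rate of about 6.7 Hz. In the study, the coefficient of variation was computed only on intervals that occurred within bouts, so the input would need to be restricted accordingly.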

Data Analysis: Local Field Potentials
Electrophysiological data were first analyzed in NeuroExplorer (http://www.neuroexplorer.com/), to check for artifacts and spectral integrity. Subsequent processing was done using signal processing routines in GNU Octave. Analysis of Local Field Potentials (LFP) used functions from the EEGLab toolbox (Delorme and Makeig, 2004) (Event-Related Spectral Power and Inter-Trial Phase Coherence) and the signal processing toolbox in GNU Octave (the peak2peak function was used to measure event-related amplitude). Circular statistics were calculated using the circular library for R. Graphical plots of data were made using the matplotlib and seaborn library for Python. Analyses were typically conducted in Jupyter notebooks, and interactions between Python, R, and Octave were implemented using the rpy2 and oct2py libraries for Python.
To measure the amplitude and phase of LFP in the frequency range of licking, LFPs were bandpass-filtered using eeglab's eegfilt function, with a fir1 filter (Widmann and Schröger, 2012), centered at the rat's licking frequency (licking frequency ± inter-quartile range; typically around 4 to 9 Hz), and were subsequently z-scored. Analyses were performed with a pre/post window of 2 seconds, and the Hilbert transform was used to obtain LFP amplitude and phase.
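The filtering and Hilbert steps were run with eeglab in GNU Octave. The Python sketch below is a rough analogue using SciPy, not a port of eegfilt: the function name, the number of filter taps, and the use of zero-phase filtering via `filtfilt` are assumptions chosen for illustration.

```python
import numpy as np
from scipy.signal import filtfilt, firwin, hilbert


def lick_band_amplitude_phase(lfp, fs, low_hz, high_hz, numtaps=751):
    """Band-pass an LFP trace, z-score it, and return Hilbert amplitude and phase.

    `numtaps` is an assumed filter length; the trace must be longer than
    three times this value for filtfilt's default edge padding.
    """
    lfp = np.asarray(lfp, dtype=float)
    # Windowed-sinc FIR band-pass design (comparable in spirit to fir1).
    taps = firwin(numtaps, [low_hz, high_hz], pass_zero=False, fs=fs)
    filtered = filtfilt(taps, [1.0], lfp)        # forward-backward, zero phase
    z = (filtered - filtered.mean()) / filtered.std()
    analytic = hilbert(z)                        # analytic signal
    return np.abs(analytic), np.angle(analytic)
```

With a 1 kHz sampling rate and a band around the licking frequency (for example 4 to 9 Hz), the returned phase can be read out at each lick time to assess phase locking, and the amplitude can be averaged in a peri-lick window.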
For inter-trial phase coherence (ITC) and event-related spectral power (ERSP), LFP data was preprocessed using eeglab's eegfilt function with a fir1 filter and was bandpass filtered from 0 to 100 Hz. For group summaries, ITC and ERSP matrices were z-scored for that given rat after bandpass filtering the data. Peri-lick matrices were then formed by using a pre/post window of 2 seconds on each side, and the newtimef function from the eeglab toolbox was used to generate the time-frequency matrices for ITC and ERSP up to 30 Hz.
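For readers unfamiliar with ITC, the quantity can be illustrated with its standard definition: the length of the mean unit phase vector across trials (here, licks) at each time point. The Python sketch below shows only this definition for phases at a single frequency; it does not reproduce the time-frequency decomposition performed by newtimef, and the function name is hypothetical.

```python
import numpy as np


def inter_trial_coherence(phases):
    """ITC per time point from phases of shape (n_trials, n_times), in radians.

    Values near 1 indicate consistent phase across trials; values near 0
    indicate phases spread uniformly around the circle.
    """
    phases = np.asarray(phases, dtype=float)
    unit_vectors = np.exp(1j * phases)           # one unit phasor per trial and time
    return np.abs(unit_vectors.mean(axis=0))     # length of the mean phasor
```

Because the expected ITC of random phases shrinks as the number of trials grows, comparisons between conditions are easiest to interpret when both conditions contribute the same number of licks.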
Since most of the lick counts from the Shifting Values Licking Task are generally imbalanced (with a greater number of licks for high versus low value rewards), we used permutation testing to perform analyses on amplitude and phase-locking in these studies. Licks were typically down-sampled to match the lower number of licks. 80% of the number of lower value licks were randomly chosen from each session. For example, if a rat emitted 400 licks for the high concentration sucrose and 200 licks for the low concentration sucrose, then 160 licks would be randomly chosen from each data type to compare the same number of licks for each lick type. This permutation of taking 80% of the licks was re-sampled 25 times and spectral values were recalculated for each permutation. The maximum ITC value was obtained through calculating the absolute value of ITC values between 2 and 12 Hz within a ~150 ms window (±1 inter-lick interval) around each lick. The maximum ERSP value was also taken around the same frequency and time window. Then, the average maximum ITC or ERSP value (of the 25x resampled values) for each LFP channel for each rat was saved in a data frame, and each electrode's maximum ITC and ERSP value for each type of lick (high-value or low-value lick) were used in the ANOVAs for group summaries. Group summary for the peak-to-peak Event-Related Potential (ERP) size recorded the average difference between the maximum and minimum ERP amplitude across all frequencies, using a ±1 inter-lick interval window around each lick. The mean ERP size for each electrode for each rat was used in the ANOVAs for group summaries. These analyses were performed for all behavioral variations.
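The size-matched resampling procedure can be sketched as follows. This Python example is an illustration under assumptions, not the study's code: the function name, the `stat` callback, and the fixed random seed are hypothetical, and `stat` stands in for whatever spectral measure is recomputed on each subset (for example, a maximum ITC value).

```python
import random


def resampled_mean(high, low, stat, fraction=0.8, n_resamples=25, seed=0):
    """Average `stat` over repeated, size-matched random subsets of licks.

    `high` and `low` are lists of per-lick items (e.g., peri-lick LFP
    segments). Each resample draws the same number of items from both
    lists: `fraction` of the smaller list's length.
    """
    rng = random.Random(seed)                    # seeded for reproducibility (assumption)
    n = int(fraction * min(len(high), len(low)))
    totals = {"high": 0.0, "low": 0.0}
    for _ in range(n_resamples):
        totals["high"] += stat(rng.sample(high, n))   # sampling without replacement
        totals["low"] += stat(rng.sample(low, n))
    means = {key: value / n_resamples for key, value in totals.items()}
    return means, n
```

With 400 high-value licks and 200 low-value licks, each resample draws 160 licks per condition, matching the worked example in the text.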

Shifting Values Licking Task: Effects of reward magnitude on consummatory behavior
The Shifting Values Licking Task (Amarante et al., 2017; Figure 1A) was used to assess reward encoding across the MFC and OFC as 12 rats experienced shifts in reward value defined by differences in sucrose concentration or fluid volume. Shifts in concentration were between 16% and 4% sucrose in a volume of 30 μL. Shifts in volume were between 30 μL and 10 μL containing 16% sucrose. Concentrations and volumes alternated over periods of 30 sec. Several measures of licking behavior varied with sucrose concentration or fluid volume: lick counts, inter-lick intervals, lick rate, and bout duration (Figure 1C). All rats licked more for the high concentration reward compared to the low concentration reward (paired t-test; t(11)=10.76, p<0.001) (Figure 1D). Rats also licked at a faster rate for the high concentration reward compared to the low concentration reward (paired t-test; t(11)=6.347, p<0.001) (Figure 1E). Additionally, rats had increased bout durations when licking for the high concentration reward compared to the low concentration reward (paired t-test: t(11) … Figure 1F). There was no difference in variability of high or low concentration licks: the coefficient of variation for inter-lick intervals was the same (paired t-test: t(9)=0.864, p=0.41).
Rats behaved similarly when consuming the high concentration and large volume rewards. In volume manipulation sessions, rats emitted more licks for the large reward than the small reward (paired t-test; t(11)=4.99, p<0.001). However, this difference in lick counts was less robust than the difference in high and low concentration rewards during concentration manipulation sessions (Figure 1D). Rats licked at a faster rate for large rewards compared to …
To assess differences in power, we used a peak-to-peak analysis of ERPs during licks for the high-value and low-value rewards. The measure calculates the difference between the maximum and minimum ERP amplitude using a window centered around each lick. The size of the window was twice each rat's median inter-lick interval. LFPs in MFC showed increased amplitudes for high concentration rewards, as opposed to low concentration rewards (one-way ANOVA: F(1,278)=34.19, p<0.001). Figure 2B shows MFC ERPs for high and low concentration rewards of an example rat. This effect was not significant in OFC ERPs, as seen in Figure 2F (F(1,177)=0.557, p=0.456). We also measured ERSP, and found a decrease in MFC power from licks for the high to low concentration rewards specifically in the 4-8 Hz range (… Figure 2D,H).
These findings suggest that LFP activity in both MFC and OFC similarly encodes aspects of preferred, versus less preferred, reward options. 4-8 Hz phase-locking was strongest for both the high concentration and large volume rewards, which may be evidence that the animal is acting within a preferred state with the goal of obtaining their most "valued" reward.
These findings provided further evidence suggesting that the entrainment of neural activity in MFC and OFC to the lick cycle tracks reward magnitude.

Blocked-Interleaved Task: Engagement in and the vigor of licking vary with reward expectation
The same group of 12 rats was subsequently tested in an adjusted version of the Shifting Values Licking Task, which will be referred to as the Blocked-Interleaved Task (Figure 3A). In the first three minutes of the task, i.e. the "blocked" phase, rats behaviorally showed their typical differentiation of high versus low concentration rewards by emitting more licks for the high concentration reward (Figure 3B, left) and licking at a faster rate (Figure 3C, left). However, this pattern changed when the rewards were randomly presented in the "interleaved" part of the task. With a randomly interleaved reward presentation, rats licked nearly equally for high and low concentration rewards (Figure 3B, right). We performed a two-way ANOVA on the number of licks by each lick type (high or low concentration) and portion of the task (blocked or interleaved). There was a significant interaction between concentration of reward and the … Additionally, there was a significant difference in lick rate by each lick type and portion of the task (F(1,33)=23.13, p<0.001; two-way ANOVA) (Figure 3C). Post hoc analyses revealed that rats licked significantly faster for high versus low concentration rewards during the blocked portion (p<0.005). Lick rates for high versus low concentration licks during the interleaved part of the task were not significantly different (p=0.99). Notably, lick rate during access to either high concentration (p=0.005) or low concentration (p=0.002) rewards during the interleaved portion was significantly increased from lick rate during access to the low concentration reward in the blocked portion of the task.

Blocked-Interleaved Task: Dissociation of MFC and OFC with regards to response vigor
Having established that the Blocked-Interleaved Task can reveal effects of reward expectation on task engagement and response vigor, we next examined how neural activity in the MFC and OFC varies with these behavioral measures. We assessed changes in lick- …
In MFC, a two-way ANOVA revealed a significant interaction of lick type by portion of the task with ERP peak-to-peak size (Figure 4A) as the dependent variable (F(1,564)=6.232, p=0.013). However, there were no differences in the ERP measures between high and low concentration licks during the blocked portion of the task (p=0.887) or between high and low concentration licks during the interleaved portion of the task (p=0.938). The same was true with ERSP measures for MFC LFPs: there was a significant interaction between lick type and portion of the task (F(1,564)=30.17, p<0.001; two-way ANOVA), but no significant difference in ERSP values between high and low concentration licks in the blocked (p=0.213) or interleaved (p=0.743) portions of the task. In OFC (Figure 4D), there was no significant interaction of lick type and portion of the task by the amplitude size of the lick's ERPs …
We directly compared ITC values in both regions with lick rate and total lick counts (Figure 5). Post-hoc analyses displayed in Figure 5C revealed that in MFC there was a significant difference between ITC values for the high versus low concentration licks (as also documented at the top of Figure 4C … Figure 5C) did not match the pattern for total licks (Figure 5A).
In OFC, ITC values (Figure 5D) did not match either the total-lick (Figure 5A) or lick-rate (Figure 5B) comparisons, despite the qualitative similarity with the total number of licks (compare Figure 5D with Figure 5A). The only significant difference in ITC values in OFC was between the high and low concentration licks in the blocked portion of the task (as also documented at the top of Figure 4F). All other comparisons were non-significant. This pattern of post-hoc comparisons did not match either total licks (compare Figure 5A with 5D) or lick rate (compare Figure 5B with 5D).
Together with the results summarized in Figure 4, these findings from post-hoc testing in Figure 5 provide evidence that MFC and OFC encode different aspects of licking and reward value. There was a clear match between the pattern of lick entrainment in the MFC, but not the OFC, with the animals' licking rates. The correspondence between lick entrainment in MFC and the animals' lick rates provides support for the idea that MFC plays a role in response vigor. By contrast, OFC might be involved in more general aspects of motivation, e.g. to lick or not (reward evaluation) based on reward magnitude or the predictability of the environment.

Three Reward Task: Behavioral evidence for effects of relative reward value
The previous experiments compared two levels of reward (either high/low concentration or large/small volume) in the Shifting Values Licking Task. After seeing clear behavioral and electrophysiological differences between two rewards, we aimed to investigate how animals process reward in contexts involving three different rewards. In this experiment, we assessed if rats process rewards in a relative manner or in an absolute manner by implementing a third intermediate (8% wt./vol. sucrose concentration) reward.
In the Three Reward Task (Figure 6A … Figure 6B). Post-hoc analyses revealed that rats emitted significantly more licks for the intermediate value 8% reward as opposed to the low value 4% reward in block 1 (p<0.001). In block 2, rats also emitted significantly fewer licks for the intermediate value 8% reward when it was paired with the high value 16% reward (p<0.001).
Rats also licked significantly less for the intermediate 8% reward in block 2 than they did in block 1 (p<0.001).
There was a more subtle effect for differences in bout duration across the different rewards (F(3,33)=5.333, p=0.004; two-way ANOVA) (Figure 6C). Post-hoc analyses revealed no significant difference in bout duration for the 4% versus 8% in block one (p=0.098), yet there was a significant decrease in bout durations during access to the 8% versus 16% in block two (p=0.023). Bout durations during access to the intermediate 8% reward in block 1 versus block 2 were not different (p=0.20). While there was a significant effect of lick type on lick rate (F(3,33)=10.59, p<0.001; two-way ANOVA), post-hoc analyses revealed no major differences in lick rate of the licks for rewards in block 1 (p=0.17) or block 2 (p=0.31) (Figure 6D), nor for the lick rate for 8% licks in block 1 versus block 2 (p=0.76).

Three Reward Task: Neural evidence for effects of absolute, not relative, reward value
The behavioral measures summarized above established that the Three Reward Task can reveal effects of relative value comparisons. We next analyzed electrophysiological signals from MFC and OFC (Figure 7) to determine if they tracked the animals' behavior in the task, and might encode relative differences in value, or some other aspect of value, such as the absolute differences between the three rewards. We found a significant difference between ITC values for the three different rewards in both MFC and OFC (MFC: F(3,627)=154.4, p<0.001; OFC: F(3,363)=13.29, p<0.001; two-way ANOVAs). Tukey post-hoc analyses revealed a difference in …
The ITC findings, at least in MFC, support the idea that the "higher value" and "lower value" rewards in each context are being encoded differently across contexts. They indicate that MFC encodes absolute reward value. Qualitatively, the ITC values in MFC seem to have the same pattern as the lick rate (Figure …
The encoding of value was less clear based on ITC measures from the OFC. These values did not directly match the licking behavior (in either rate, total licks, or bout duration) (compare Figure 8A,B with 8D), and did not show clear evidence for either absolute or relative encoding of reward. Instead, the results from Figure 8D indicate that OFC encodes reward value in a mixed absolute/relative manner (as in Supplementary Figure 3C).

DISCUSSION
We investigated the role of MFC and OFC in processing reward information as rats participated in various consummatory licking tasks. Rats process and express changes in reward size in roughly the same manner as with reward concentration, both behaviorally and electrophysiologically. LFP activity in both MFC and OFC is sensitive to changes in reward type (both volume and concentration). Our results reveal context-dependent value signals in both regions through randomly presented rewards and by introducing a third reward in the task.
Behaviorally, rats show evidence for a relative expression of rewards, while neural activity in MFC, but not OFC, shows an absolute encoding of reward value. Together, our findings suggest that rats sample rewards and commit to consuming a given reward when they are able to predict its value, and this behavior is coupled to neural activity in MFC and OFC that encodes both the value of the reward and the animal's consummatory strategy. The subtle differences between the two regions are consistent with the hypothesis that MFC represents action-outcome relationships and OFC represents stimulus-outcome relationships. MFC activity may provide the "value of the action" information to OFC, while OFC may evaluate the reward and provide feedback to MFC.

Rhythmic Activity and Reward Processing
Similar to our previous studies (Horst and Laubach, 2013; Amarante et al., 2017), neural activity was entrained to the lick cycle across all tasks in both MFC and OFC. Entrainment was strongest for the high-value reward (in either size or sweetness) and varied with the animals' consummatory strategy (persistently lick a highly preferred option, or sample fluid and wait for a better option). Previous studies have viewed this rhythmic activity as being driven by the act of licking, as rats naturally lick at 6-7 Hz (Travers et al., 1997; Weijnen, 1998; Horst and Laubach, 2013). However, the activity cannot be explained solely by licking, as there are instances where phase-locking and behavior do not show the same pattern (e.g. the Blocked-Interleaved experiment), and the variety of studies reported here and in Amarante et al. (2017) suggest a higher order role for the neural activity in the control of consummatory behavior.
Exactly how the rhythmic activity contributes to the control of behavior might be best understood by considering how this general frequency range has been interpreted in other types of behavioral tasks. Rhythms between 4 and 8 Hz are commonly referred to as "theta activity" and those found in the frontal cortex have been referred to as "frontal theta" (Cavanagh and Frank, 2014). There have been several proposals for the role of frontal theta in information processing. One idea is that the rhythm acts to break up sensory information into temporal chunks (Uchida and Mainen, 2003), which is related to the notion of a global oscillatory signal that synchronizes neural activity across multiple brain structures throughout the taste-reward circuit (Gutierrez and Simon, 2013). Another idea is that frontal theta acts as an action monitoring signal (Cavanagh et al., 2012; Narayanan et al., 2013; Laubach et al., 2015), which can be generated through simple recurrent spiking network models (Bekolay et al., 2014). Finally, instead of representing a specific function, frontal theta may act as a convenient "language" for distant brain regions to exchange information with each other (Womelsdorf et al., 2010). Our general findings contribute to this literature by suggesting that frontal theta acts as a value signal to guide consummatory behavior, which is the ultimate consequence of many goal-directed actions in natural environments.

A Common Code for Reward Magnitude
A major finding in the present study (Figures 1 and 2) was that similar electrophysiological signals in MFC and OFC are associated with the consumption of high and low concentration liquid sucrose rewards and of large and small volume rewards. Although other studies have found either decreases (Kaplan et al., 2001) or increases in behavior with increases in the concentration and volume of rewards in the same study (Hulse et al., 1960; Collier and Myers, 1961; Collier and Wills, 1961), these studies did not investigate the electrophysiological correlates of consuming rewards. To our knowledge, our study is the first to show a generalized "value signal" in the frontal cortex that scales with increased size and increased concentration of liquid sucrose. These signals might underlie the computation of a common currency (Montague and Berns, 2005; Levy and Glimcher, 2011; Levy and Glimcher, 2012; Strait et al., 2014) for the amount of nutrient available in a given food item and contribute to value-guided control of consumption.

Evidence for the Contextual Control of Consumption
In the Blocked-Interleaved Task (Figure 3A), rats that licked more, longer, and faster for the high concentration reward when rewards were blocked did not continue to do so during the interleaved portion of the task (Figure 3B-C). Instead, they licked nearly equally for the high and low concentration solutions, a result that is suggestive of the loss of positive contrast effects for the higher value fluid that is commonly found in the blocked design (Parent et al., 2015a).
Despite these differences in behavior, the rats' LFPs in MFC and OFC showed high levels of lick-entrained activity, essentially equal to that found during consumption of the higher value fluid in the blocked part of the session. This finding is hard to reconcile with the idea that enhanced lick entrainment reflects reward contrast effects. If positive contrast engenders entrainment, then LFPs should have shown reduced phase locking to the lick cycle in the interleaved portion of the task. Instead, the results might suggest that LFPs in MFC and OFC are entrained to licking when rats engage in persistent licking, as was found in the periods with high concentration access in the blocked part of the sessions and across the entire interleaved part of the session, and that entrainment is reduced when rats switch to sampling the fluid during periods with low value access in the blocked part of the session. By this view, LFP entrainment to the lick cycle could serve as a contextual marker for reward state and the behavioral strategy deployed by the rat to
Contextual coding of reward value was also apparent in the Three Reward Task (Figures 6-8), where lick entrainment was stronger when the higher value option was available (Figure 7). In this case, the strength of engagement, for MFC but not OFC, tracked reward value in an absolute manner, with entrainment being higher for the 16% sucrose solution compared to the 8% solution when both were the "best" option (Figure 8C,D). These electrophysiological results were notably distinct from behavioral measures such as total licking output and lick rate (Figure 8A,B), which provided evidence for relative value comparisons. Our electrophysiological results support theories of absolute reward value (Hull, 1943; Spence, 1956; Flaherty, 1982), as opposed to theories of relative reward value (Crespi, 1942; Black, 1968; Webber et al., 2015).
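The logic that separates absolute from relative coding in the Three Reward Task can be made concrete with a toy example. The sketch below is purely illustrative and uses no data from this study: the two model functions and the signal units are hypothetical, the 8% and 16% "best" options follow the text, and the 4% value for the third reward is an assumption. An absolute code assigns a signal by concentration alone, whereas a relative code assigns it by rank within the current context.

```python
# Hypothetical contexts: (lower, higher) sucrose concentration in percent.
# The 8% and 16% "best" options follow the text; the 4% value is an assumption.
CONTEXTS = {"low": (4, 8), "high": (8, 16)}


def absolute_signal(concentration, context):
    """Absolute coding: the signal depends only on the concentration itself."""
    return float(concentration)


def relative_signal(concentration, context):
    """Relative coding: the signal depends only on rank within the current context."""
    lower, higher = CONTEXTS[context]
    return 1.0 if concentration == higher else 0.0


for name, model in (("absolute", absolute_signal), ("relative", relative_signal)):
    best_low = model(8, "low")      # 8% is the best option in the low context
    best_high = model(16, "high")   # 16% is the best option in the high context
    print(name, "coding, best options differ:", best_high != best_low)
```

Under the absolute model the two "best" options produce different signals, as reported here for entrainment in MFC; under the relative model they produce the same signal, which is closer to the pattern seen in the behavioral measures.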
It is not clear from our studies if the reduction in entrainment when low value rewards are available is an active or passive process. For example, it is possible that some active input to the MFC and OFC denotes the temporal context (e.g. dopamine, hippocampus) and enhances entrainment when the higher value option is available. Alternatively, signals from the sensorimotor regions of the frontal cortex that sit between the MFC and OFC, namely the oral sensory and motor cortices (Yoshida et al., 2009), might be reduced during periods with less intense licking, leading to a passive reduction in overall frontal lick entrainment. Future studies are needed to address these neural mechanisms of licking-related synchrony in the rodent frontal cortex.

Subtle Differences between MFC and OFC
MFC and OFC are ideal locations for representing several aspects of value-based decision making, since these cortical regions receive sensory input, project to motor planning areas, and are connected with dopamine-rich areas either directly or through the striatum (Sugrue et al., 2005). Both OFC and MFC play a role in processing and evaluating rewards, and show activity modulated around the receipt and consumption of reward as well as the execution of rewarding behaviors. These areas may contribute to value-based decision making in a goal-directed system (Balleine and Dickinson, 1998; Rangel et al., 2008; Rangel, 2013), where the value of a given reward is computed, and information about previous outcomes can be used to update values of predicted future outcomes (i.e., predicted rewards). This idea agrees with our findings of increased theta phase-locking in MFC and OFC LFPs during licks for the high value rewards, whether that reward is sweeter or larger in volume, which can be viewed as subjective value.