
Neurohydrodynamics as a heuristic mechanism for cognitive processes in decision-making

Leon C. Hardy
University of South Florida St. Petersburg (USFSP), St. Petersburg, U.S.A.
[email protected]

Daniel S. Levine
University of Texas Arlington (UTA), Arlington, U.S.A.
[email protected]

Dahai Liu
Embry-Riddle Aeronautical University (ERAU), Daytona, U.S.A.
[email protected]

Abstract—We propose to model the learning of optimal decision rules using a mathematically rigorous descriptive model given by a modified set of Cohen-Grossberg neural network equations with reaction-diffusion processes in an environment of uncertainty. Our theory, which we call Neurohydrodynamics, arises naturally within the framework of neural networks while utilizing the foundations of Decision Field Theory (DFT) for describing the cognitive processes of the mammalian brain in decision-making. Human cognition and intelligence require more than an algorithmic description by a formal set of rules; they must possess what Alan Turing called an uncomputable human “intuition” (oracle) as a guide for decision-making processes. We draw an analogy with an idea from Quantum Hydrodynamics, namely, that a “pilot wave” guides quantum mechanical particles along a deterministic path by stochastic “forces” naturally arising from Schrödinger's wave equation. This type of equation was also investigated by Turing, and the reaction-diffusion processes of real neurons have been shown to aid in pattern formation while exhibiting self-organization. Because searching over all courses of action is costly, in both resources and time, we seek to include a mechanism that shortcuts the decision-making processes as described by DFT. Some empirical research has determined that diffusion does occur in the cognitive processing of real human brains. We propose a model for high-level decision processes by combining diffusion with other mechanisms (e.g., adaptive resonance and neuromodulation) for the interactions between different brain regions. For dynamic decision-making tasks, the diffusion processes within certain parts of the frontal lobes and basal ganglia are assumed to interact in hierarchical networks that integrate emotion and cognition, incorporating both heuristic and deliberative decision rules.
Keywords - Decision Field Theory; Quantum Hydrodynamics; Oracles; Adaptive Resonance Theory; Morphogenesis; Reaction-Diffusion Equation; Turing Machines

I. INTRODUCTION

Understanding human cognition and intelligence requires more than an algorithmic description by a formal set of rules. It must also encompass what Alan Turing called an uncomputable human “intuition” [1]. As a mechanism for human intuition, Turing proposed that a decision-maker use an “oracle” to instantaneously specify which choices will computationally lead to a solution in a timely manner. Such guidance by an “oracle” is an important but not well understood aspect of human cognition and intelligence: it indicates when a decision has occurred and halts any further unnecessary computation. In this sense, Turing anticipated the connectionist approach to artificial intelligence (AI), in which a “program” is implicitly written using a set of rules within the realm of computability. Turing's influence does not end here. While perhaps not directly connected to Turing's oracle, De Broglie's “pilot wave” interpretation of quantum mechanics and the subsequent development of Bohm's Quantum Hydrodynamics (QHD) proposed that quantum mechanical particles are guided along a deterministic path by stochastic “forces” that arise naturally from Schrödinger's wave equation [2,3]. Remarkably, reaction-diffusion equations of this type were also investigated by Turing, whose theory of morphogenesis [4] accounted for pattern formation and self-organization in chemical and biological processes. It should come as no surprise that the reaction-diffusion processes of real neurons have been shown to aid in pattern formation while exhibiting self-organization important to memory, learning, decision-making, emotion and cognition [5,6,7,8,9,10,11,12,13].

In particular, decision-making has been the subject of numerous cognitive models, some of them incorporating facts about brain regions but none of them definitive. Since the mid-1950s, the emphasis of decision-making research has shifted from a normative theory based on maximizing expected utility to a descriptive theory based on heuristics that humans actually employ in real time [14,15]. Yet most decision-making models still deal with isolated decisions that do not impact the circumstances surrounding subsequent decisions [16,17]. Recently there have been some decision-making models that address the evolution of decisions over time by means of reinforcement learning in a stochastic environment [18,19]. Descriptive decision-making theories of this type have begun to be incorporated into neural networks composed of modules with plausible analogies to specific brain regions [20,21,22]. As more data became available from brain imaging as well as single-neuron studies, these networks gradually integrated their design principles with emerging cognitive and neuroscience data. This convergence of modeling and experimental results has enabled the development of models of multiple interacting brain regions involved in complex cognitive processes with emotional influences, such as reward learning and revaluation [23,24]; risky decision-making on a gambling task [20,25]; and Bayesian optimization [26,27]. These developments represent significant progress in understanding the underlying dynamics responsible for the behaviors observed in psychological experiments. However, even though it is well known that the underlying dynamics of the brain are nonlinear, theories such as Decision Field Theory (DFT) are linear. The DFT approach is still useful, since it leads to tractable solutions that approximate the nonlinear case. DFT has incorporated probabilistic and dynamical properties that make it possible to predict choice probabilities, the choice probability distribution, buying and selling prices, cash equivalents, and approach-avoidance behavior [19]. DFT is an abstract representation of the deliberation time between stimulus and response as executed by some yet-to-be-determined neural system, though some steps toward a neural grounding have been taken [28].
Having established our motivation, we propose both a descriptive and a normative model given by a modified set of Cohen-Grossberg equations [29] with the addition of the natural reaction-diffusion processes occurring in real brains, which we call Neurohydrodynamics (NHD). It is well known that the reaction term captures the competition among stored representations, resulting in the selection of only one representation [11], often called lateral inhibition. In fact, associative learning, lateral inhibition, interlevel resonant feedback, and opponent processing are the four important organizational principles of a neural network [30,31]. An additional principle, which we call longitudinal excitation, is given by the diffusion term that evaluates the similarity between sensory and stored representations with a comparison executed locally on each representation. We do not mean to imply that cooperation does not occur in the Cohen-Grossberg-type equations; rather, we make it explicit with the addition of the diffusion term. We choose the Cohen-Grossberg equations because they are well founded on psychological principles that describe the competitive-cooperative interactions between populations of neurons at one level of the brain, or of cognitive processing. Grossberg's Adaptive Resonance Theory (ART) is important since certain brain regions seem to have hierarchical structure [32,33,34]. Most importantly, Turing's “oracle” naturally arises within the proposed framework of neural

networks while providing a psychological framework for describing the cognitive processes and emotional influences of the mammalian brain in decision-making.

In this paper, we begin by considering certain aspects of QHD important to possible neural mechanisms for human decision-making. Next, we review DFT as a psychological basis from which to include the notion of an “oracle,” or guidance mechanism, for the cognitive and emotional processes of the mammalian brain. Furthermore, we consider the brain regions responsible for these processes during deliberation. Finally, we extend DFT to NHD as a neural representation of these decision-making processes in the mammalian brain.

II. A BRIEF SURVEY OF QUANTUM HYDRODYNAMICS

At the famous Solvay Congress of 1927, De Broglie proposed a pilot wave interpretation of quantum mechanics but quickly abandoned his approach under severe criticism. About 30 years later, Bohm proposed Quantum Hydrodynamics (QHD), a complete theory of quantum mechanics that exploited De Broglie's ideas. Since QHD and neurodynamics are both diffusion processes, they share the same form for their respective equations of motion, and we are encouraged to propose a “pilot wave” interpretation for the neurodynamics of a neural network with a diffusion process. Furthermore, both of these ideas were also expressed by Turing, in his theory of morphogenesis and in his notion of an oracle to aid computation. Let us now review the aspects of Bohm's QHD relevant to this paper. For a quantum mechanical particle of mass m, QHD is founded on the time-dependent Schrödinger equation, which is given by

\[
\left[ \frac{-\hbar^2}{2m}\nabla^2 + V(\mathbf{r},t) \right] \psi(\mathbf{r},t) = i\hbar\,\frac{\partial \psi(\mathbf{r},t)}{\partial t},
\]

where ħ is the reduced Planck constant. The square modulus of the wavefunction ψ(r,t) over a small volume represents the probability of finding the particle of mass m in that region of space. Bohm assumes that the wavefunction takes the polar form $\psi(\mathbf{r},t) = R(\mathbf{r},t)\,e^{iS(\mathbf{r},t)/\hbar}$. In making this substitution, we have

\[
\frac{\partial R}{\partial t} = \frac{-1}{2m}\left[ 2\left( \frac{\partial R}{\partial x}\frac{\partial S}{\partial x} + \frac{\partial R}{\partial y}\frac{\partial S}{\partial y} + \frac{\partial R}{\partial z}\frac{\partial S}{\partial z} \right) + R\,\nabla^2 S \right]
\]

and

\[
\frac{-\partial S}{\partial t} = \frac{1}{2m}\left[ \left(\frac{\partial S}{\partial x}\right)^2 + \left(\frac{\partial S}{\partial y}\right)^2 + \left(\frac{\partial S}{\partial z}\right)^2 \right] + V(\mathbf{r},t) - \frac{\hbar^2}{2m}\frac{\nabla^2 R}{R},
\]

where R(r,t) and S(r,t) are the amplitude and phase of the wavefunction. One then identifies the last term in the second equation as the quantum potential, which is given by

Q r ,t =

−ℏ 2  2 Rr , t 2m Rr , t

As was recognized by Bohm, the quantum potential is the source of a stochastic force that acts on and guides a particle deterministically along a trajectory. The motion is given by Newton's laws through the auxiliary equation F = −∇V(r,t), where the potential V(r,t) includes the quantum potential and other conservative potentials [35]. Some important properties of the quantum potential are:

• The quantum potential depends on the curvature of the amplitude of the wavefunction, i.e., its second derivative.
• The quantum potential can become quite large as R(r,t) becomes small.
• The kinetic energy of the wavefunction has two components: the flow and the shape kinetic energy. The shape kinetic energy is related to the quantum potential, and so measures the “internal” stresses of the wavefunction.
• The quantum potential is nonlocal; that is, the trajectories lose their independence and are organized by the quantum potential.
• The quantum potential introduces contextuality; that is, a trajectory depends on its initial wavefunction.
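These properties are easy to check numerically. The sketch below (ours, not from the paper) discretizes Q = −(ħ²/2m)∇²R/R with central differences for a Gaussian amplitude, in units where ħ = m = 1, and verifies that Q is unchanged when R is rescaled: it depends only on the shape of the amplitude.

```python
import numpy as np

def quantum_potential(R, dx, hbar=1.0, m=1.0):
    """Q = -(hbar^2 / 2m) * R'' / R, discretized with central differences."""
    d2R = (np.roll(R, -1) - 2 * R + np.roll(R, 1)) / dx**2
    return -(hbar**2 / (2 * m)) * d2R / R

x = np.linspace(-5.0, 5.0, 1001)
dx = x[1] - x[0]
R = np.exp(-x**2 / 2)                        # Gaussian amplitude

Q = quantum_potential(R, dx)
Q_rescaled = quantum_potential(3.0 * R, dx)  # same shape, three times the amplitude

# Q depends only on the curvature (shape) of R, not on its overall scale.
assert np.allclose(Q[1:-1], Q_rescaled[1:-1], atol=1e-8)
assert abs(Q[500] - 0.5) < 1e-3              # Q(0) = 1/2 for this Gaussian
```

For this Gaussian, Q(x) = (1 − x²)/2 exactly, so the potential grows in magnitude precisely where the amplitude R becomes small, as noted above.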

Most importantly, the quantum potential provides a description for the global organization of the macroscopic motion of particles that depends on the initial conditions. These aspects of QHD will become important since the notion of self-organization is pronounced in Turing's morphogenesis and in the idea of an oracle in decision-making processes.

III. REVIEW OF DECISION FIELD THEORY

The foundations of Decision Field Theory (DFT) are drawn from early motivational theories of approach-avoidance conflict, extensions of those theories to decision-making, and information-processing theories of choice response time [19]. The utility of this theory has been demonstrated in the prediction of choice probabilities, the distribution of choice response times, buying and selling prices, cash equivalents, and approach-avoidance movement behavior. The probabilistic and dynamical properties included in DFT are among the fundamental characteristics of human decision-making. This characterization is important since DFT formalizes the deliberation process – the time between stimulus and action. Thus, DFT and its extensions [36] are an abstract representation of the deliberation process as executed by some underlying and yet-to-be-determined neural system.

In [19], the deliberation process is initiated by a confrontation in which the decision-maker tries to anticipate and evaluate all of the possible consequences produced by each course of action. The decision-maker must undergo a time-consuming process of retrieving, comparing and evaluating these consequences in real time. Actions are not taken until a preference for one action becomes strong enough for the decision-maker to act. The probability of selecting one action over another is obtained from the distribution of preferences (or preference states) over time.

Figure 1. Choice problem. (Diagram omitted: actions AL and AR lead, via uncertain events S1 and S2, to payoffs ±P1 and ±P2.)

In Figure 1, if a right (left) action AR (AL) is taken, then the payoff P1 or P2 is received given an uncertain event S1 or S2. Negative payoffs are associated with the avoidance subsystem of the valence system as shown in Figure 2; positive payoffs are associated with the approach subsystem. The decision system produces a preference state as prescribed by DFT. Then the motor system takes an action (AL or AR) based on the preference state of the decision system.

Figure 2. The basic framework of DFT: a valence system with avoidance/approach (losses/gains) subsystems weighted by w(S1) and w(S2), a decision system (preference state P) and a motor system (action A). P1 and P2 are payoffs. (Diagram omitted.)

The main components of DFT are a valence system with approach-avoidance subsystems, a decision system and a motor system. In the approach-avoidance subsystems, losses and gains are weighted inputs that produce a valence input to the decision system. The preference state of the decision system provides input to the motor system, which results in an action. The attention weight represents an association strength between an action and a consequence.

Let P(t) denote the preference state at time t, h the time unit, c the goal gradient parameter, s the growth-decay rate, δ the mean valence input, VR(t) the valence of a right action at time t, and VL(t) the valence of a left action at time t. Then the following stochastic difference equation for DFT describes the preference state:

\[
P(t) = (1 - sh)\,P(t-h) + \big[ V_R(t) - V_L(t) \big] = \big[ 1 - (s - c)h \big]\,P(t-h) + \big[ \delta h + \epsilon(t) \big],
\]

where the residual input ε(t) is normally distributed noise with vanishing mean and variance σ² = Var[V_R − V_L]. The first term is called the drift, whereas the second term is called the diffusion term. The residual input is the diffusion term, and represents the trial-by-trial fluctuation about the mean. Thus, the variance expresses the difference in choosing VR over VL, or vice versa. The resulting choice probability is an increasing S-shaped function as given by the cumulative effects of the preference state distribution.

Figure 3. The perception-action cycle of a Turing machine: input/output tape, program and environment. (Diagram omitted.)

In comparison with AI, computation is performed with an explicit program running on a Turing machine (see Figure 3), which can be viewed as a decision-maker interacting with the environment using a reinforcement learning algorithm [37,38]. The decision-maker's perceptions about the environment help determine subsequent actions or behaviors, in what is often called a perception-action cycle. While perceptions and actions are important in understanding the resulting behavior of a decision-maker, especially in terms of psychological experiments, a description of the deliberative processes between perception and action is required in order to capture the nature of human cognition and intelligence. If the program is implicit, then neural networks provide a natural setting for a theory of deliberation since they can describe the real nonlinear dynamics of the mammalian brain, including reaction-diffusion processes and noise. Indeed, Grossberg's mathematical theory of learning bridges the implicit computational algorithms of neural networks to the psychological implications of cognition, learning and behavior [39], as does DFT [19].

IV. NEUROHYDRODYNAMICS

In this section, we introduce computational properties to DFT by proposing a neural network representation of the underlying neural processes in decision-making. During the deliberation time, processes occur that seem to have no psychological effect; rather, they have a physiological effect on the underlying neural system. These physical changes in the underlying neural processing of decision-making have computational properties. Human decision-making sometimes uses heuristics in taking an action, that is, shortcuts to more complex processes. The evolutionary advantage of such a shortcut in decision-making is that it prevents deleterious results, such as injury or death. In his thesis, Turing called the mechanism for these shortcuts in computation an “oracle,” which decides when to halt computations and, thus, when to take an action. We propose such a mechanism for the underlying neural processes in DFT as represented by a neural network.

Before we begin our description of NHD, we wish to write the solution of the preference state of DFT in polar form as is done in QHD. So, we let $P(t) = R(t)e^{-iS(t)}$ and $\epsilon(t) = \chi(t)e^{i\phi(t)}$. Then the stochastic equation of motion for DFT becomes

\[
R(t)\cos S(t) = \big[ 1 - (s - c)h \big]\,R(t-h)\cos S(t-h) + \big[ \delta h + \chi(t)\cos\phi(t) \big]
\]

and

\[
R(t)\sin S(t) = \big[ 1 - (s - c)h \big]\,R(t-h)\sin S(t-h) + \chi(t)\sin\phi(t),
\]

where these are the real and imaginary parts of DFT. These equations and the original equations represent temporal diffusion. So, the preference state P(t) can be represented as the activation of a neuron under a temporal diffusion process with the attention weights given by the diffusion term of DFT. However, in relation to the underlying neural processes, the deliberation process is about the “right time” and the “right place” for neural computation. From a psychological perspective, this characterization is about the context in which decisions are made, but it lacks the shortcuts in computation. Since searching over all courses of action is costly, in both time and resources, we seek to include a mechanism that shortcuts the decision-making processes in DFT.

Grossberg provides a rigorous mathematical description for a neural network with psychological foundations that are retained in DFT. Let N be the number of neurons. In [29], for the j-th neuron, the nonlinear differential equation for the activation u_j is given by

\[
\frac{d u_j}{dt} = a_j(u_j)\left[ b_j(u_j) - \sum_{k=1}^{N} c_{jk}\, d_k(u_k) \right],
\]

which does not include a diffusion term. When electrons travel in asymmetric electromagnetic fields, or when there is diffusion across the synaptic cleft, as in the case of real neurons, the activation of neurons becomes spatially dependent [40,41,42,43,44] and the diffusion effect can no longer be ignored. Under these circumstances, we can modify DFT by requiring that the preference state represent the activation of a spatially dependent neuron, a function of both time and space. So, we introduce a spatial diffusion process of a neural network as given by a

Cohen-Grossberg-type set of stochastic partial differential equations of the form

\[
\frac{\partial u_j}{\partial t} = a_j(u_j)\left[ b_j(u_j) - \sum_{k=1}^{N} c_{jk}\, d_k(u_k) \right] + \mu_j\, \nabla \cdot \big[ e_j(x,t)\, \nabla u_j(x,t) \big] + \epsilon_E(x,t),
\]

where μ_j is the diffusion coefficient, ε_E(x,t) is white noise in space and time from the external environment, and u_j(x,t) denotes a continuous neuronal activation wave; when e_j is uniform the diffusion term reduces to μ_j e_j ∇²u_j, with ∇² the Laplacian. The diffusion term represents the flux in maximal changes of activation as weighted by e_j(x,t) of the j-th neuron. Diffusion effects are now included since electrons travel in asymmetric electromagnetic fields, as in the case of real neurons, where the activation of a neuron becomes spatially dependent [40,41].

Additionally, we add an auxiliary equation of motion for the attention weights using reinforcement learning. We assume that the time rate of change in the weight between the j-th and k-th neurons is proportional to the time rate of change in the neuropotential and to the amount of “work” of the neuropotential over a time interval, with recent contributions to the “work” more influential. Then the auxiliary equation for the attention weights becomes

\[
\frac{d c_{jk}(x,t)}{dt} = -\eta_{jk}\, \frac{d N_k(x,t)}{dt} \int_{t_i}^{t_f} e^{-\lambda_k (t_f - \tau)}\, \nabla_{c} N_k(c_{jk}, \tau)\, d\tau + \epsilon_I(x,t),
\]

where λ_k denotes the “forgetting” rate of the k-th neuron, η_jk denotes the learning rate, and the internal residual ε_I(x,t) is normally distributed noise that arises from the noisy process of diffusion; the exponential factor weights recent contributions to the “work” more heavily. Analogous to the quantum potential in QHD, we define the neuropotential as

\[
N_j(x,t) = -\mu_j\, \frac{\nabla^2 R_j(x,t)}{R_j(x,t)},
\]

which provides guidance of the diffusion process during the evolution of the neurodynamics of the neural network. While attention weight modifications of a neural network can be given by several descriptions, we will use the temporal difference (TD) methods of reinforcement learning. In TD methods, the change in the synaptic weight is multiplicatively proportional to the time change in the neuropotential and an integrated gradient of the neuropotential of NHD [38]. The neuropotential is viewed as a measure of the “internal stress” of the activation of a neuron. Since the neuropotential is independent of the activation amplitude, it depends only on the activation's shape. The initial conditions are carried by the neuropotential as it reaches equilibrium, counterbalancing other potentials that may be present. The most significant feature of the neuropotential is that each neuron depends on other neurons, with all subject to organization by the whole.
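As a numerical illustration (ours, with the reaction term reduced to simple decay toward a localized input, a constant diffusion weight, and the noise term omitted for reproducibility), the sketch below integrates a one-dimensional activation wave under the diffusion term and evaluates the neuropotential N = −μ∇²R/R from the resulting amplitude:

```python
import numpy as np

MU, DT, DX = 0.5, 0.001, 0.1     # diffusion coefficient, time step, grid spacing

def step(u, I):
    """Euler step of du/dt = -u + I(x) + MU * d2u/dx2 with no-flux boundaries."""
    up = np.pad(u, 1, mode="edge")
    lap = (up[2:] - 2 * u + up[:-2]) / DX**2
    return u + DT * (-u + I + MU * lap)

def neuropotential(R):
    """N(x) = -MU * R''(x) / R(x), the diffusion analogue of Bohm's Q."""
    Rp = np.pad(R, 1, mode="edge")
    return -MU * ((Rp[2:] - 2 * R + Rp[:-2]) / DX**2) / R

x = np.arange(0.0, 10.0, DX)
I = np.exp(-((x - 5.0) ** 2))    # localized input current
u = np.zeros_like(x)
for _ in range(20000):           # integrate to t = 20, near steady state
    u = step(u, I)

# Diffusion broadens the activation beyond the input profile, and the
# neuropotential is positive at the activation peak (concave amplitude).
assert np.sum(u > u.max() / 2) > np.sum(I > 0.5)
assert neuropotential(u)[int(np.argmax(u))] > 0
```

At steady state the no-flux boundaries conserve the total activation injected by the input, while the neuropotential tracks only the shape of the wave, as described above.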

While activations or preference states arise during a temporal diffusion process that produces estimates of the expected utility in DFT, one should include a spatially dependent component. Thus, one should consider brain regions that combine probabilities through the appropriate mechanisms, such as lateral inhibition, a hierarchy, or associative learning. Such mechanisms are included in the Cohen-Grossberg equations, as they capture the nonlinear dynamics of these brain regions. These mechanisms are not given in DFT, as it is a linear model of the decision-making process. We use Grossberg's approach since these mechanisms are included in this general nonlinear theory for a neural network representation of human decision-making.

V. CONNECTIONS TO NEUROSCIENCE

Recall that the preference state in DFT corresponds to the activation of a neuron in a neural network. However, as we have seen, there is no mechanism for adjusting the weights in DFT, since only temporal diffusion is present in that theory. Real brains have spatial diffusion (as well as temporal diffusion) that can depend on the transmission of electrical signals or neurotransmitters. Heuristics in the decision-making process require some mechanism to shortcut computations during the deliberation time, that is, a way to circumvent deliberation rules. With the addition of a spatial diffusion term in the Cohen-Grossberg equations, we propose an ART-type model that includes heuristics, and thus shortcuts the computations of more deliberative processing. We then associate our model with the proposed brain regions responsible for carrying out the decision-making processes using either deliberation or heuristics.

In the mammalian brain, the striatum is well known to have extensive arborization of dendrites and axons, creating a network of distance-dependent laterally inhibited neurons [43,43] mediated by dopamine using reinforcement learning. From an information-processing perspective, the striatum has three basic operations: preparation for preprogrammed action, exclusive selection of a particular response, and learning from knowledge of results. Thalamic disinhibition of sensory information occurs when striatal neurons inhibit the globus pallidus internal segment (GPi), permitting the cortex to engage in an action selected by the striatal neurons. Hence, the cortical-striatal loop through the thalamus clearly plays a role in motor-sensory decision-making processes utilizing reinforcement learning [28,44]. In the acquisition of a motor sequence [45], simultaneous activity is observed in the cortico-striatal and cortico-cerebellar systems. Both are active during the early stages of learning, but the activity of the cortico-cerebellar system decreases with practice.
Once performance is achieved, the cortico-striatal system remains active suggesting that this region is critical for long-term retention of practiced movements. Another important cortico-strio-pallidal system involves the OFC and vmPFC and the nucleus accumbens, which are closely connected with the amygdala and thereby responsible for the emotional influences on higher-order decision-making processes. Neuromodulators such as dopamine and serotonin play a strong role in regulating those brain structures

responsible for fear and anxiety [46,47]. Dopaminergic transmission in the nucleus accumbens has been shown to be important for maintaining responses in conditions of intermittent reward. Even the basolateral amygdala seems to have the ability to pair stimuli with reward when the primary reward has occurred in the remote past [48]. The association cortices are implicated in executive control and are reciprocally connected to the hippocampus, the site of working memory. The hippocampus encodes the current context or “episode” by which actions or behaviors are judged to be relevant or irrelevant. The hippocampus is known to be strongly engaged in early training but has little effect in the latter stages of learning. On the other hand, the medial prefrontal cortex (mPFC) has been shown to have a profound effect on the latter stages of learning but little effect on the earlier stages. The ACC is a detector of complexity or controversiality for cognitive processes and emotional influences. The DLPFC manipulates relationships in working memory and is required for choosing a new rule if an old one is found not to be rewarding. The OFC makes effective decisions based on rewards and emotional or social contexts. But the basal ganglia are the final “gate” at which actions are permitted or prevented. Hence, these structures are complementary in function by virtue of their intervention in the learning process, albeit at different times. Different aspects of reward evaluation in the mammalian brain are processed by the anterior cingulate and orbital prefrontal cortices, and the ventral striatum. The anterior cingulate cortex (ACC) and the orbital prefrontal cortex (OFC) mediate the different aspects of reward-based behaviors, error prediction, and the choice between short- and long-term gains [49,50,51]. The dorsolateral prefrontal cortex (DLPFC) and the dorsal striatum are involved in cognitive function critical to decision-making.
The DLPFC is engaged when working memory is required for monitoring behavior. The reward signal, critical to the striatal role in learning, is mediated by the midbrain dopamine cells. Recent human imaging studies have divided the prefrontal cortex into the dorsal anterior cingulate cortex (dACC), the ventral medial prefrontal cortex (vmPFC), the OFC, and the DLPFC based on specific roles in mediating different aspects of error prediction and decision-making [52,53]. While the various neural systems involved in emotion, cognition, and goal direction operate somewhat autonomously, there is also evidence for a hierarchical arrangement among regions implicated in executive function. For example, the amygdala seems to cumulatively learn the reward or punishment value of particular stimuli, and so does the OFC. While the two areas are reciprocally interconnected, the OFC representations are more flexible and carry more information about the previous timing of rewards and punishments [54]. This leads to the notion that both regions are essential for processing of emotional significance, with the OFC processing at a higher level of abstraction than the amygdala. The DLPFC seems to process at a still higher level of abstraction than the OFC, as indicated by the fact that monkeys with OFC lesions cannot learn changes in reward value of specific stimuli, whereas monkeys with DLPFC lesions cannot learn changes in the salience of categories of stimuli [55]. These results led Levine [50] to suggest that these three regions of (basolateral)

amygdala, OFC, and DLPFC are arranged in some sort of hierarchical network with learnable bottom-up and top-down influences analogous to those between levels of sensory cortex (see Figure 4). The ACC would then play the role of a “reset” that modulates these top-down and bottom-up connections based on the success or failure of goal-directed actions. This enables the network to simultaneously encode competing rules of different levels of complexity, such as unconsciously applied heuristics [14] and deliberative rules that require thought [56]. The network also includes executive control that can decide among competing rules in a particular situation based on both external and internal contexts.

Figure 4. Basic neural framework for encoding multiple decision rules at varying levels by means of an adaptive resonance network. Arrows denote excitation, filled circles inhibition, semicircles learning [21]. (Diagram omitted; its elements are F1 (amygdala), F2 (OFC), F3 (DLPFC), an ART module for heuristics, an ART module for complex rules, and an error signal (ACC).)

In [54,57], Bechara and his colleagues developed the Iowa Gambling Task, involving a series of trials (100 trials in the earliest versions), in each of which the participant sees the same four virtual card decks on a computer screen and must choose one of them. After each deck choice, the participant receives feedback on the amount of (virtual) money won or lost. The choice of each deck leads to winning money on every trial, and the amount of money won is larger on two of the decks than on the other two. However, the choices are also associated with periodic losses (either on every second or every tenth trial). The amounts of the losses are such that choice of either deck with a larger positive payoff leads to negative expected earnings, whereas choice of either deck with a smaller positive payoff leads to positive expected earnings. Most normal adults, but not patients with damage to either the amygdala or the OFC, choose the “bad” decks in early trials but gradually learn to switch to the “good” decks after receiving feedback over a series of trials.
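The deck contingencies can be summarized with a small expected-value calculation; the dollar figures below follow the commonly cited schedule of the original task and should be treated as assumptions, since the text above does not give the exact amounts:

```python
# Expected value per card for Iowa-Gambling-Task-style decks.
# Payoff figures are the commonly cited schedule (an assumption here):
# "bad" decks win $100 per card but lose about $1250 per 10 cards;
# "good" decks win $50 per card but lose about $250 per 10 cards.
decks = {
    "A (bad)":  {"win": 100, "loss_per_10": 1250},
    "B (bad)":  {"win": 100, "loss_per_10": 1250},
    "C (good)": {"win": 50,  "loss_per_10": 250},
    "D (good)": {"win": 50,  "loss_per_10": 250},
}

def expected_value(deck):
    """Average net earnings per card: win minus the per-card share of losses."""
    return deck["win"] - deck["loss_per_10"] / 10.0

evs = {name: expected_value(d) for name, d in decks.items()}
# Decks with the larger per-trial win have a negative expectation;
# decks with the smaller per-trial win have a positive expectation.
```

This is exactly the structure described above: the immediately attractive decks are the ones a purely myopic strategy favors, yet they lose money in expectation.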

generate oxygen for future human inhabitants of that planet. The payoff contingencies were set up so that there was a “long-term” and a “short-term” robot. On any given trial, choosing the short-term robot would yield more oxygen than choosing the long-term robot. Yet the oxygen payoffs from both the short- and long-term robots depended on previous choices, and both were increased by more choices of the long-term robot in the last ten trials. Hence, on each trial the short-term robot seemed the better choice, but the actual optimal strategy was to always choose the long-term robot.
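The structure of this trade-off can be sketched with an illustrative payoff function (our assumption; the published schedule differs in detail): each choice pays a base amount plus a bonus per long-term choice in the last ten trials, with the short-term base set higher.

```python
from collections import deque

BASE = {"short": 30, "long": 20}   # immediate payoff (illustrative numbers)
BONUS = 5                          # added payoff per recent long-term choice

def run(policy, trials=500, window=10):
    """Total payoff under an illustrative 'Farming on Mars'-style contingency:
    reward = BASE[choice] + BONUS * (# long-term choices in the last `window`).
    This payoff function is our assumption, not the published schedule.
    """
    history = deque(maxlen=window)
    total = 0
    for _ in range(trials):
        choice = policy()
        total += BASE[choice] + BONUS * sum(1 for c in history if c == "long")
        history.append(choice)
    return total

always_short = run(lambda: "short")   # greedy: best on any single trial
always_long = run(lambda: "long")     # optimal over the whole session
```

Because the bonus applies to both robots equally, the short-term robot is better on every individual trial given any fixed history, yet the all-long policy earns more overall: exactly the conflict between heuristic and deliberative rules discussed here.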

VI. CONCLUSIONS

Certainly, these executive neural systems are a substrate for human decision-making that can be either short-term or long-term in its orientation, depending on a combination of task requirements, available time, and personality factors. A variety of decision tasks used in behavioral experiments involve situations in which the strategies that maximize short-term and long-term payoffs are clearly different. Decision-making involving deliberate effort requires more computational time and resources to arrive at an action. Either deliberation rules are used exclusively, or a combination of deliberation rules with heuristics is implemented. In the latter case, the computational time is shortened and a decision produces an action, good or bad. If no deliberation rule is used, then decision-making is completely determined by heuristics. We have proposed a neural framework for decision-making that implements Turing's uncomputable human "intuition" to shortcut deliberation rules using adaptive resonance theory. This extension of DFT applies the principle of the "right time" and the "right place" for a decision-making process by including spatial diffusion. Neurons are well known to have distance-dependent (diffusion) properties, due either to electromagnetic inhomogeneity or to the presence of neurotransmitters. With these considerations, NHD includes Alan Turing's "oracle" for making shortcuts to decision-making processes that would otherwise rely on time-consuming and resource-hungry deliberation rules.

REFERENCES

[1] Turing, A. M. (1938). Systems of logic defined by ordinals. Proc. Lond. Math. Soc., Ser. 2, 45, 161-228. This was Turing's Ph.D. thesis, Princeton University.
[2] Bohm, D. (1952). A suggested interpretation of the quantum theory in terms of hidden variables. I. Physical Review, 85, 166-179.
[3] Bohm, D. (1952). A suggested interpretation of the quantum theory in terms of hidden variables. II. Physical Review, 85, 180-193.
[4] Turing, A. M. (1952). The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 237, 37-72.
[5] Dabrowski, L. (1993). A diffusion model of a neuron and neural nets. Biological Cybernetics, 68, 451-454.
[6] Hamdache, K., and Labadie, M. (2009). On a reaction-diffusion model for calcium dynamics in dendritic spines. Nonlinear Analysis: Real World Applications, 10, 2478-2492.
[7] Kargupta, H., and Ray, S. R. (1994). Temporal sequence processing based on the biological reaction-diffusion processes. IEEE World Congress on Computational Intelligence, 4, 2315-2320.
[8] Price, C. B., Wambacq, P., and Oosterlinck, A. (1992). The reaction-diffusion cellular neural network. Proceedings of the Second International Workshop on Cellular Neural Networks and their Applications, 14-16 Oct. 1992, 264-269.
[9] Ratcliff, R., Hasegawa, Y. T., Hasegawa, R. P., Smith, P. L., and Segraves, M. A. (2007). Dual diffusion model for single-cell recording data from the superior colliculus in a brightness discrimination task. Journal of Neurophysiology, 97, 1756-1774.
[10] Roxin, A., and Ledberg, A. (2008). Neurobiological models of two-choice decision making can be reduced to a one-dimensional nonlinear diffusion equation. Computational Biology, 4, 1-13.
[11] Yuasa, H., Ito, S., and Ito, K. M. (1997). Associative memory with the reaction-diffusion equation. Biological Cybernetics, 76, 129-137.
[12] Zhao, X. (2004). Stability in the steady state of the second-order Hopfield neural networks with reaction-diffusion terms. Dynamics of Continuous, Discrete and Impulsive Systems, Series A: Mathematical Analysis, 11, 569-587.
[13] Grossberg, S. (1975). On the development of feature detectors in the visual cortex with applications to learning and reaction-diffusion systems. Biological Cybernetics, 21, 145-159.
[14] Tversky, A., and Kahneman, D. (1981). The framing of decisions and the rationality of choice. Science, 211, 453-458.
[15] Tversky, A., and Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124-1131.
[16] Hogarth, R. M. (1981). Beyond discrete biases: Functional and dysfunctional aspects of judgmental heuristics. Psychological Bulletin, 90, 197-217.
[17] Jagacinski, R. J., and Flach, J. M. (2003). Control theory for humans: Quantitative approaches to modeling performance. Mahwah, NJ: Erlbaum.
[18] Barron, G., and Erev, I. (2003). Small feedback-based decisions and their limited correspondence to description-based decisions. Journal of Behavioral Decision Making, 16, 215-233.
[19] Busemeyer, J. R., and Townsend, J. T. (1993). Decision field theory: A dynamic-cognitive approach to decision making in an uncertain environment. Psychological Review, 100, 432-459.
[20] Levine, D. S., Mills, B. A., and Estrada, S. (2005). Modeling emotional influences on human decision making under risk. Proceedings of the International Joint Conference on Neural Networks, August 2005, 1657-1662.
[21] Hardy, L. C., Levine, D. S., and Liu, D. (2009). On the neurohydrodynamics of neural networks. Post-Conference Proceedings of the 13th World Multi-Conference on Systemics, Cybernetics and Informatics: WMSCI 2009.
[22] Wagar, B. M., and Thagard, P. (2004). A neurocomputational theory of cognitive-affective integration in decision making. Psychological Review, 111, 67-79.
[23] Brown, J. W., Bullock, D., and Grossberg, S. (2004). How laminar frontal cortex and basal ganglia circuits interact to control planned and reactive saccades. Neural Networks, 17, 471-510.
[24] Dranias, M., Grossberg, S., and Bullock, D. (2008). Dopaminergic and non-dopaminergic value systems in conditioning and outcome-specific revaluation. Brain Research, 1238, 239-287.
[25] Frank, M. J., and Claus, E. D. (2006). Anatomy of a decision: Striato-orbitofrontal interactions in reinforcement learning, decision-making, and reversal. Psychological Review, 113, 300-326.
[26] Kim, S., Hwang, J., Seo, H., and Lee, D. (2009). Valuation of uncertain and delayed rewards in primate prefrontal cortex. Neural Networks, 22, 294-304.
[27] Rushworth, M. F. S., and Behrens, T. E. J. (2008). Choice, uncertainty and value in prefrontal and cingulate cortex. Nature Neuroscience, 11, 389-397.
[28] Busemeyer, J. R., Jessup, R. K., Johnson, J. G., and Townsend, J. T. (2006). Building bridges between neural models and complex decision making behavior. Neural Networks, 19, 1047-1058.
[29] Cohen, M. A., and Grossberg, S. (1983). Absolute stability of global pattern formation and parallel memory storage by competitive neural networks. IEEE Transactions on Systems, Man, and Cybernetics, 13, 815-826.

[30] Grossberg, S., and Levine, D. S. (1987). Neural dynamics of attentionally modulated Pavlovian conditioning: Blocking, inter-stimulus interval, and secondary reinforcement. Applied Optics, 26, 5015-5030.
[31] Grossberg, S., and Schmajuk, N. A. (1987). Neural dynamics of attentionally modulated Pavlovian conditioning: Conditioned reinforcement, inhibition, and opponent processing. Psychobiology, 15, 195-240.
[32] Carpenter, G. A., and Grossberg, S. (1987). A massively parallel architecture for a self-organizing neural pattern recognition machine. Computer Vision, Graphics, and Image Processing, 37, 54-115.
[33] Carpenter, G. A., and Grossberg, S. (1987). ART 2: Self-organization of stable category recognition codes for analog input patterns. Applied Optics, 26, 4919-4930.
[34] Carpenter, G. A., and Grossberg, S. (1990). ART 3: Hierarchical search using chemical transmitters in self-organizing pattern recognition architectures. Neural Networks, 3, 129-152.
[35] Wyatt, R. E. (2005). Quantum dynamics with trajectories: Introduction to quantum hydrodynamics. Springer.
[36] Lee, S., Son, Y., and Jin, J. (2008). Decision field theory extensions for behavior modeling in dynamic environments using Bayesian belief networks. Information Sciences, 178, 2297-2314.
[37] Hutter, M. (2005). Universal artificial intelligence: Sequential decisions based on algorithmic probability. Heidelberg: Springer-Verlag.
[38] Sutton, R., and Barto, A. (1998). Reinforcement learning. MIT Press.
[39] Grossberg, S. (1967). Nonlinear difference-differential equations in prediction and learning theory. Proceedings of the National Academy of Sciences, 58, 1329-1334.
[40] Wan, L., and Zhou, Q. (2008). Exponential stability of stochastic reaction-diffusion Cohen-Grossberg neural networks with delays. Applied Mathematics and Computation, 206, 818-824.
[41] Wu, R., and Zhang, W. (2009). Global exponential stability of delayed reaction-diffusion neural networks with time-varying coefficients. Expert Systems with Applications, 36, 9834-9838.
[42] Bielecki, A., and Kalita, P. (2008). Model of neurotransmitter fast transport in axon terminal of presynaptic neuron. Journal of Mathematical Biology, 56, 559-576.
[43] Wickens, J. (1993). Theory of the striatum. New York: Pergamon Press.
[44] Wickens, J. (1997). Basal ganglia: Structure and computations. Network: Computation in Neural Systems, 8, R77-R109.
[45] Doyon, J., Bellec, P., Amsel, R., Penhune, V., Monchi, O., Carrier, J., Lehericy, S., and Benali, H. (2009). Contributions of the basal ganglia and functionally related brain structures to motor learning. Behavioural Brain Research, 199, 61-75.
[46] de Oliveira, A. R., Reimer, A. E., and Brandao, M. L. (2009). Role of dopamine receptors in the ventral tegmental area in conditioned fear. Behavioural Brain Research, 199, 271-277.
[47] Gray, J. (1995). A model of the limbic system and basal ganglia: Applications to anxiety and schizophrenia. In Gazzaniga, M. S. (Ed.), The Cognitive Neurosciences, Cambridge: MIT Press, 1165-1176.
[48] Simmons, D. A., and Neill, D. B. (2009). Functional interaction between the basolateral amygdala and the nucleus accumbens underlies incentive motivation for food reward on a fixed ratio schedule. Neuroscience, 159, 1264-1273.
[49] Botvinick, M. M., Cohen, J. D., and Carter, C. S. (2004). Conflict monitoring and anterior cingulate cortex: An update. Trends in Cognitive Sciences, 8, 539-546.
[50] Levine, D. S. (2009). Brain pathways for cognitive-emotional decision making in the human animal. Neural Networks, 22, 286-293.
[51] McClure, S. M., Laibson, D. I., Loewenstein, G., and Cohen, J. D. (2004). Separate neural systems value immediate and delayed monetary rewards. Science, 306, 503-507.
[52] De Martino, B., Kumaran, D., Seymour, B., and Dolan, R. (2006). Frames, biases, and rational decision-making in the human brain. Science, 313, 684-687.

[53] De Neys, W., Vartanian, O., and Goel, V. (2008). Smarter than we think: When our brain detects we're biased. Psychological Science, 19, 483-489.
[54] Bechara, A., Damasio, A. R., Damasio, H., and Anderson, S. W. (1994). Insensitivity to future consequences following damage to human prefrontal cortex. Cognition, 50, 7-15.
[55] Dias, R., Robbins, T. W., and Roberts, A. (1996). Dissociation in prefrontal cortex of attentional and affective shifts. Nature, 380, 69-72.
[56] Pacini, R., and Epstein, S. (1999). The interaction of three facets of concrete thinking in a game of chance. Thinking and Reasoning, 5, 303-325.
[57] Bechara, A., Tranel, D., Damasio, H., and Damasio, A. R. (1996). Failure to respond autonomically to anticipated future outcomes following damage to prefrontal cortex. Cerebral Cortex, 6, 215-225.
[58] Gureckis, T. M., and Love, B. C. (2009). Learning in noise: Dynamic decision-making in a variable environment. Journal of Mathematical Psychology, 53, 180-193.