Over the past several decades, economists, psychologists, and neuroscientists have conducted experiments in which a subject, human or animal, repeatedly chooses between alternative actions and is rewarded based on choice history. While individual choices are unpredictable, aggregate behavior typically follows Herrnstein's matching law: the average reward per choice is equal for all chosen alternatives. In general, matching behavior does not maximize the overall reward delivered to the subject, and therefore matching appears inconsistent with the principle of utility maximization. Here we show that matching can be made consistent with maximization by regarding the choices of a single subject as being made by a sequence of multiple selves, one for each instant of time. If each self is blind to the state of the world and discounts future rewards completely, then the resulting game has at least one Nash equilibrium that satisfies both Herrnstein's matching law and the unpredictability of individual choices. This equilibrium is, in general, Pareto suboptimal, and can be understood as a mutual defection of the multiple selves in an intertemporal prisoner's dilemma. The mathematical assumptions about the multiple selves should not be interpreted literally as psychological assumptions. Humans and animals do remember past choices and care about future rewards. However, they may be unable to comprehend or take into account the relationship between past and future. This relationship can be made more explicit when a mechanism that converges on the equilibrium, such as reinforcement learning, is considered. Using specific examples, we show that there exist behaviors that satisfy the matching law but are not Nash equilibria. We expect that these behaviors will not be observed experimentally in animals and humans. If this is the case, the Nash equilibrium formulation can be regarded as a refinement of Herrnstein's matching law.
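The divergence between matching and maximization can be illustrated numerically. In this sketch the reward schedule is hypothetical (illustrative parameters, not taken from the paper): the average reward per choice of each alternative declines as that alternative is chosen more often, matching equalizes the two returns, and maximization picks the allocation with the highest overall reward rate.

```python
import numpy as np

# Hypothetical schedule: the average reward per choice of each alternative
# falls as that alternative is chosen more often.
def returns(p):
    """p = fraction of choices allocated to alternative 1."""
    r1 = 1.0 - 0.8 * p          # average reward per choice of alternative 1
    r2 = 0.2 + 0.4 * p          # average reward per choice of alternative 2
    return r1, r2

p = np.linspace(0.0, 1.0, 100_001)
r1, r2 = returns(p)

p_match = p[np.argmin(np.abs(r1 - r2))]   # matching: equal returns
total = p * r1 + (1 - p) * r2             # overall reward per choice
p_max = p[np.argmax(total)]               # maximization

print(f"matching at p = {p_match:.3f}, maximizing at p = {p_max:.3f}")
```

For these parameters the matching allocation (p = 2/3) earns strictly less overall reward than the maximizing allocation (p = 1/2), in line with the Pareto suboptimality of the matching equilibrium.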
The ability to represent time is an essential component of cognition but its neural basis is unknown. Although extensively studied both behaviorally and electrophysiologically, a general theoretical framework describing the elementary neural mechanisms used by the brain to learn temporal representations is lacking. It is commonly believed that the underlying cellular mechanisms reside in high order cortical regions but recent studies show sustained neural activity in primary sensory cortices that can represent the timing of expected reward. Here, we show that local cortical networks can learn temporal representations through a simple framework predicated on reward dependent expression of synaptic plasticity. We assert that temporal representations are stored in the lateral synaptic connections between neurons and demonstrate that reward-modulated plasticity is sufficient to learn these representations. We implement our model numerically to explain reward-time learning in the primary visual cortex (V1), demonstrate experimental support, and suggest additional experimentally verifiable predictions.
It is widely believed that learning is due, at least in part, to long-lasting modifications of the strengths of synapses in the brain. Theoretical studies have shown that a family of synaptic plasticity rules, in which synaptic changes are driven by covariance, is particularly useful for many forms of learning, including associative memory, gradient estimation, and operant conditioning. Covariance-based plasticity, however, is inherently sensitive: even a slight mistuning of the parameters of a covariance-based plasticity rule is likely to result in substantial changes in synaptic efficacies. Therefore, the biological relevance of covariance-based plasticity models is questionable. Here, we study the effects of mistuning parameters of the plasticity rule in a decision making model in which synaptic plasticity is driven by the covariance of reward and neural activity. An exact covariance plasticity rule yields Herrnstein's matching law. We show that although the effect of slight mistuning of the plasticity rule on the synaptic efficacies is large, the behavioral effect is small. Thus, matching behavior is robust to mistuning of the parameters of the covariance-based plasticity rule. Furthermore, the mistuned covariance rule results in undermatching, which is consistent with experimentally observed behavior. These results substantiate the hypothesis that approximate covariance-based synaptic plasticity underlies operant conditioning. However, we show that the mistuning of the mean subtraction makes behavior sensitive to the mistuning of the properties of the decision making network. Thus, there is a tradeoff between the robustness of matching behavior to changes in the plasticity rule and its robustness to changes in the properties of the decision making network.
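One half of the robustness claim, that mistuning strongly perturbs the efficacies but only weakly perturbs behavior, can be sketched in a minimal simulation. Everything here is illustrative rather than the paper's network model: two scalar efficacies set the choice probability, the reward schedule is hypothetical (returns fall with use, so matching occurs at a choice fraction of 2/3), and the mean-reward subtraction in the covariance rule is mistuned by a factor (1 − eps).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical schedule: the return of each alternative falls as it is chosen
# more often; the returns are equal (matching) at p = 2/3.
def bernoulli_reward(choice, p):
    r = 1.0 - 0.8 * p if choice == 0 else 0.2 + 0.4 * p
    return float(rng.random() < r)

w = np.array([1.0, 1.0])             # two synaptic efficacies
eta, eps, r_bar = 1e-3, 0.1, 0.0     # eps = 10% mistuning of the mean subtraction
p_hist = []
for t in range(200_000):
    p = w[0] / w.sum()               # probability of choosing alternative 0
    c = 0 if rng.random() < p else 1
    R = bernoulli_reward(c, p)
    r_bar += 0.01 * (R - r_bar)      # running estimate of the mean reward
    w[c] += eta * (R - (1.0 - eps) * r_bar)   # mistuned covariance update
    w = np.maximum(w, 1e-3)
    p_hist.append(p)

print(f"total efficacy grew from 2.0 to {w.sum():.1f}")
print(f"choice fraction ≈ {np.mean(p_hist[-50_000:]):.3f} (matching point: 2/3)")
```

In this toy version the mistuning makes both efficacies drift upward by a large factor, while the choice fraction, which depends only on their ratio, stays close to the matching point.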
The probability of choosing an alternative in a long sequence of repeated choices is proportional to the total reward derived from that alternative, a phenomenon known as Herrnstein's matching law. This behavior is remarkably conserved across species and experimental conditions, but its underlying neural mechanisms are still unknown. Here, we propose a neural explanation of this empirical law of behavior. We hypothesize that there are forms of synaptic plasticity driven by the covariance between reward and neural activity and prove mathematically that matching is a generic outcome of such plasticity. Two hypothetical types of synaptic plasticity, embedded in decision-making neural network models, are shown to yield matching behavior in numerical simulations, in accord with our general theorem. We show how this class of models can be tested experimentally by making reward not only contingent on the choices of the subject but also directly contingent on fluctuations in neural activity. Maximization is shown to be a generic outcome of synaptic plasticity driven by the sum of the covariances between reward and all past neural activities.
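The core idea, that covariance-driven plasticity converges to matching, can be sketched with a single scalar policy rather than the paper's network models. In this hedged toy version (hypothetical reward schedule, illustrative learning rate), the choice probability p is updated in proportion to (R − R̄)(c − p), an estimate of the covariance of reward with the choice; its fixed point equalizes the expected reward of the two alternatives, which is matching.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical schedule: returns fall with use; they are equal at p = 2/3.
def bernoulli_reward(c, p):
    r = 1.0 - 0.8 * p if c == 1 else 0.2 + 0.4 * p
    return float(rng.random() < r)

p, r_bar, eta = 0.5, 0.0, 1e-3       # choice probability, mean reward, learning rate
history = []
for t in range(200_000):
    c = int(rng.random() < p)        # 1 = choose alternative 1
    R = bernoulli_reward(c, p)
    p += eta * (R - r_bar) * (c - p)  # covariance-driven update
    p = min(max(p, 0.01), 0.99)
    r_bar += 0.01 * (R - r_bar)      # slow estimate of the mean reward
    history.append(p)

print(f"asymptotic choice fraction ≈ {np.mean(history[-50_000:]):.3f} (matching: 2/3)")
```

At the stochastic fixed point the covariance of reward with the choice vanishes, so the average reward per choice is equal on both alternatives, as the matching law requires.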
A persistent change in neuronal activity after brief stimuli is a common feature of many neuronal microcircuits. This persistent activity can be sustained by ongoing reverberant network activity or by the intrinsic biophysical properties of individual cells. Here we demonstrate that rat and guinea pig cerebellar Purkinje cells in vivo show bistability of membrane potential and spike output on the time scale of seconds. The transition between membrane potential states can be bidirectionally triggered by the same brief current pulses. We also show that sensory activation of the climbing fiber input can switch Purkinje cells between the two states. The intrinsic nature of Purkinje cell bistability and its control by sensory input can be explained by a simple biophysical model. Purkinje cell bistability may have a key role in the short-term processing and storage of sensory information in the cerebellar cortex.
Many neurons in the brain remain active even when an animal is at rest. Over the past few decades, it has become clear that, in some neurons, this activity can persist even when synaptic transmission is blocked and is thus endogenously generated. This “spontaneous” firing, originally described in invertebrate preparations (Alving, 1968; Getting, 1989), arises from specific combinations of intrinsic membrane currents expressed by spontaneously active neurons (Llinas, 1988). Recent work has confirmed that, far from being a biophysical curiosity, spontaneous firing plays a central role in transforming synaptic input into spike output and encoding plasticity in a wide variety of neural circuits. This mini-symposium highlights several key recent advances in our understanding of the origin and significance of spontaneous firing in the mammalian brain.
The calculation and memory of position variables by temporal integration of velocity signals is essential for posture, the vestibulo-ocular reflex (VOR) and navigation. Integrator neurons exhibit persistent firing at multiple rates, which represent the values of memorized position variables. A widespread hypothesis is that temporal integration is the outcome of reverberating feedback loops within recurrent networks, but this hypothesis has not been proven experimentally. Here we present a single-cell model of a neural integrator. The nonlinear dynamics of calcium gives rise to propagating calcium wave-fronts along dendritic processes. The wave-front velocity is modulated by synaptic inputs such that the front location covaries with the temporal sum of its previous inputs. Calcium-dependent currents convert this information into concomitant persistent firing. Calcium dynamics in single neurons could thus be the physiological basis of the graded persistent activity and temporal integration observed in neurons during analog memory tasks.
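The front-based integration principle can be sketched with a generic bistable reaction–diffusion medium rather than the paper's calcium model (all parameters here are illustrative). A cubic nonlinearity u(1 − u)(u − a) supports a standing front at a = 0.5; an input tilts a, so the front moves with a velocity proportional to the input, and its position, which persists when the input is removed, stores the temporal integral.

```python
import numpy as np

N, dx, dt, D = 400, 0.1, 0.005, 0.5
u = np.where(np.arange(N) * dx < 20.0, 1.0, 0.0)   # front near x = 20

def front_position(u):
    i = np.argmax(u < 0.5)                  # first grid point below the front
    return (i - 1 + (u[i - 1] - 0.5) / (u[i - 1] - u[i])) * dx

def run(u, steps, inp):
    a = 0.5 - inp                           # input tilts the bistable nonlinearity
    for _ in range(steps):
        lap = np.empty_like(u)
        lap[1:-1] = u[2:] - 2 * u[1:-1] + u[:-2]
        lap[0], lap[-1] = u[1] - u[0], u[-2] - u[-1]   # no-flux boundaries
        u = u + dt * (D * lap / dx**2 + u * (1 - u) * (u - a))
    return u

u = run(u, 2000, 0.0)                       # no input: the front stands still
p0 = front_position(u)
u = run(u, 2000, 0.2)                       # 10 time units of input: the front moves
p1 = front_position(u)
u = run(u, 2000, 0.0)                       # input off: the new position persists
p2 = front_position(u)
print(f"rest {p0:.2f} -> after input {p1:.2f} -> after delay {p2:.2f}")
```

The front position after the input pulse is shifted by an amount proportional to the pulse's duration and strength, and is then held without further input, a graded, persistent memory.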
The cerebellar cortex contains the majority of the neurons in the central nervous system, which are well organized in a lattice-like structure. Despite this apparent simplicity, the function of the olivo-cerebellar system is still largely unknown. In this thesis I have tried to contribute to the understanding of the system by studying three aspects of the dynamics of neurons and their relation to the function (see below). Although these questions have emerged from the study of the olivo-cerebellar system, the results are more general, and relate to neuronal dynamics and the function of other brain structures.
In many biological systems, the electrical coupling of nonoscillating cells generates synchronized membrane potential oscillations. This work describes a dynamical mechanism in which the electrical coupling of identical nonoscillating cells destabilizes the homogeneous fixed point and leads to network oscillations via a Hopf bifurcation. Each cell is described by a passive membrane potential and additional internal variables. The dynamics of the internal variables, in isolation, is oscillatory, but their interaction with the membrane potential damps the oscillations and therefore constructs nonoscillatory cells. The electrical coupling reveals the oscillatory nature of the internal variables and generates network oscillations. This mechanism is analyzed near the bifurcation point, where the spatial structure of the membrane potential oscillations is determined by the network architecture and in the limit of strong coupling, where the membrane potentials of all cells oscillate in-phase and multiple cluster states dominate the dynamics. In particular, we have derived an asymptotic behavior for the spatial fluctuations in the limit of strong coupling in fully connected networks and in a one-dimensional lattice architecture.
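The destabilization mechanism can be sketched with a linear caricature plus a cubic saturation (illustrative parameters, not the paper's equations). Each cell has a passive membrane potential V driven by an internal variable x; the internal pair (x, y) oscillates in isolation but is damped through its interaction with V. Electrical coupling g adds leak only to the anti-phase mode of V, weakening that damping, and beyond a critical g the homogeneous fixed point loses stability via a Hopf bifurcation.

```python
import numpy as np

alpha, beta, gamma, eps, omega = 2.0, 1.0, 1.0, 0.2, 1.0   # illustrative values

def simulate(g, T=300.0, dt=0.01):
    V = np.array([0.01, -0.01])     # tiny anti-symmetric perturbation
    x = np.zeros(2)
    y = np.zeros(2)
    xs = []
    for _ in range(int(T / dt)):
        dV = -alpha * V + beta * x + g * (V[::-1] - V)   # electrical coupling
        dx = eps * x + omega * y - gamma * V - x**3      # saturated internal dynamics
        dy = -omega * x
        V, x, y = V + dt * dV, x + dt * dx, y + dt * dy
        xs.append(x[0])
    return np.array(xs)

quiet = simulate(g=0.0)   # isolated cells: the perturbation decays
osc = simulate(g=5.0)     # strongly coupled pair: sustained anti-phase oscillation
print(f"isolated std {np.std(quiet[-5000:]):.4f}, coupled std {np.std(osc[-5000:]):.4f}")
```

With these parameters the isolated cell is a stable (nonoscillating) fixed point, while the coupled pair settles into a sustained anti-phase limit cycle, consistent with the coupling "revealing" the oscillatory internal variables.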
Tremor is a potentially disabling pathology that affects millions of people. The inferior olive (IO) has been implicated in several types of tremor [1,2]. In particular, electrical synapses have been shown to be essential for the generation of oscillatory activity in the IO [3], which may manifest as tremor. In a recent paper [4], we described how the electrical coupling of non-oscillating cells can generate oscillatory network behavior. Here we apply this dynamic mechanism to the IO and discuss the possible clinical applications...
In several biological systems, the electrical coupling of nonoscillating cells generates synchronized membrane potential oscillations. Because the isolated cell is nonoscillating and electrical coupling tends to equalize the membrane potentials of the coupled cells, the mechanism underlying these oscillations is unclear. Here we present a dynamic mechanism by which the electrical coupling of identical nonoscillating cells can generate synchronous membrane potential oscillations. We demonstrate this mechanism by constructing a biologically feasible model of electrically coupled cells, characterized by an excitable membrane and calcium dynamics. We show that strong electrical coupling in this network generates multiple oscillatory states with different spatio-temporal patterns and discuss their possible role in the cooperative computations performed by the system.