NEURAL NETWORKS TO SIMULATE HUMAN LEARNING: A SHIFT TOWARDS ‘MODULAR’ ARCHITECTURES

Syed Sibte Raza Abidi
School of Computer Sciences
Universiti Sains Malaysia
11800 Penang, Malaysia
Email: [email protected]

ABSTRACT

Neural networks have a natural propensity for learning - learning from being instructed or from experience. We believe that neural networks provide substantial opportunities for simulating various human learning activities, in that these networks emphasise an inherent adaptive learning ability, either supervised (based on instructions) or unsupervised (based on observations). However, to use neural networks for simulating aspects of human cognition and development, it remains of interest to examine the parallels, if any, between human learning and neural network learning mechanisms. To that end, we suggest a possible interpretation of traditional psychological notions of human learning in a neural network terminology. We argue that in order to simulate aspects of human learning, it is important to use a ‘modular’ neural network architecture that integrates a variety of neural networks in some principled fashion. We propose a framework for developing ‘modular’ neural network architectures, and present ACCLAIM, an exemplar ‘modular’ neural network architecture for simulating the development of early child language.

1. INTRODUCTION

Learning is an extensively studied subject, particularly in psychology and, more recently, in Artificial Intelligence and Neural Networks (or Connectionist Networks). Neural networks emphasise an inherent adaptive learning ability, either supervised (based on instructions) or unsupervised (based on observations). Both the neural network community and psychologists widely suggest that, due to their learning capabilities, neural networks provide substantial opportunities for simulating various human learning activities. However, it remains of interest to examine the parallels, if any, between human learning and neural network learning mechanisms. The influence of psychology on neural network learning has always been very direct: Hebbian learning, that is, reinforcing the connection weights between simultaneously active units, was inspired by early Pavlovian learning models. More recently, some neural network researchers and philosophers of science, including McClelland (1989), Bechtel & Abrahamsen (1991) and Seidenberg (1993), have suggested a neural network based interpretation of psychological notions of learning, in particular the notions of human learning proposed by the eminent Swiss psychologist Jean Piaget. In this paper, we first provide an interpretation of psychological notions of human learning in a neural network terminology. This exercise leads towards establishing the aptness of neural networks for simulating aspects of human learning and cognition. Next, we argue that to perform realistic simulations of human learning activities it is important to use ‘modular’ neural network architectures, so as to incorporate the variety of constraints that need to be addressed by a realistic simulation. In this regard, we propose a framework for developing modular neural network architectures. Finally, we demonstrate how neural networks can be used to simulate - or, more accurately, be trained to mimic - the development of language amongst children during infancy.
We present an exemplar modular neural network architecture -- ACCLAIM, which not only simulates aspects of the development of child language at both the one-word and two-word stage, but also produces child-like one-word utterances and two-word sentences.
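The Hebbian rule mentioned above - reinforcing the connection weights between simultaneously active units - can be sketched in a few lines. This is a minimal illustration; the learning rate and the particular unit activations are our own assumptions, not taken from the paper:

```python
import numpy as np

def hebbian_update(w, x, y, eta=0.1):
    """Reinforce the connections between simultaneously active units:
    delta_w[i, j] = eta * y[i] * x[j]."""
    return w + eta * np.outer(y, x)

# Two input units, two output units, all weights initially zero.
w = np.zeros((2, 2))
x = np.array([1.0, 0.0])   # only the first input unit is active
y = np.array([1.0, 1.0])   # both output units are active
w = hebbian_update(w, x, y)
# Only the connections from the active input unit are strengthened.
```

Note that the rule is purely local: each weight changes only as a function of the activity of the two units it connects, which is what gives it its Pavlovian, association-by-co-occurrence character.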

2. AN INTERPRETATION OF HUMAN LEARNING IN NEURAL NETWORK TERMINOLOGY

According to the eminent developmental psychologist Jean Piaget, “learning is the acquisition of knowledge due to some particular information provided by the environment. Learning is inconceivable without a theoretically prior interior structure of equilibrium which provides the capacity to learn and the structuring of the learning process; in a wide sense, it includes both” (Furth, 1969: 294). Furthermore, cognitive development is made possible through an interaction between what Piaget calls assimilation - a process by which perceptual stimuli are absorbed and interpreted - and accommodation - a co-occurring process whereby the ‘internal structures’ are adjusted to facilitate the assimilation of new perceptual stimuli. It may be noted that Piaget’s definition of human learning includes references to an environment, a prior interior structure, a capacity to learn and a learning process. Piagetian notions of learning, synthesising biological growth and environmental influence, have a computational interpretation, albeit a rather simplistic one. We argue that a computational interpretation of psychological notions of learning needs to incorporate data structures that can learn. By learning we mean that the data structures should have the tendency to modify or expand themselves to incorporate new information acquired by way of continuous interaction with the environment. Traditional AI structures may suffice to represent knowledge, but they lack the ability to learn in a developmental, time-varying manner. By contrast, neural networks have a natural propensity to learn - either from experience or from being told - and their learning mechanisms bear some affinity to Piaget’s notions of learning.
Our attempt to reinterpret Piaget’s notions of learning in neural network terminology takes into account James McClelland’s (1989) seminal paper, in which he provides an introductory exposition of connectionism’s[1] influence on modelling aspects of cognition. McClelland’s essay includes the description of ‘the learning principle’ governing cognitive development: “adjust the parameters of the mind in proportion to the extent to which their adjustments can produce a reduction in the discrepancy between the expected and observed events” (1989: 20). It is of interest to note that from McClelland’s learning principle, which claims to capture the residue of Piaget’s notions of human learning, emerges a neural network based interpretation of Piaget’s notions of learning, as noted in Table 1.

Piagetian Construct           | Analogous Neural Network Notion
------------------------------|------------------------------------------------
Parameters of the mind        | Connections among units; both are amenable to alteration through experience.
Expected event                | Desired pattern of activation over the network’s output units.
Observed event                | Actual pattern of activation produced over the network’s output units.
Adjustment of the parameters  | Connectionist learning processes that involve the adjustment of connections.
Discrepancy reduction         | ‘Error minimisation’ during connectionist learning, reducing the error between the expected and observed patterns of activation.

Table 1: Correspondence between Piagetian constructs and analogous neural network notions.
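McClelland’s learning principle - adjust the parameters in proportion to the discrepancy between expected and observed events - corresponds, for a single linear unit, to the familiar delta rule. The following sketch is illustrative only; the input pattern, target value and learning rate are our assumptions:

```python
import numpy as np

# One output unit with two input connections (the 'parameters of the mind').
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=2)

x = np.array([1.0, 1.0])    # presented input pattern
expected = 1.0              # 'expected event': desired output activation
eta = 0.5                   # learning rate (illustrative)

for _ in range(50):
    observed = w @ x                 # 'observed event': actual activation
    error = expected - observed      # the discrepancy
    w += eta * error * x             # adjust in proportion to the discrepancy
```

After training, the discrepancy between the expected and observed activation has been driven to (numerically) zero, which is exactly the ‘error minimisation’ row of Table 1.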

We believe that an advantage of suggesting a neural network based interpretation of Piagetian notions of learning is that these notions can now be implemented in neural networks and observed by simulating a variety of cognition-oriented scenarios, for instance the simulation of concept development, language development, language production and so on.

3. A SHIFT TOWARDS ‘MODULAR’ NEURAL NETWORK ARCHITECTURES

[1] We use the words Connectionism and Connectionist as synonyms of neural networks.

Over the years neural network technology has certainly matured: in a theoretical sense, new neural network architectures and learning algorithms have been formulated, and the philosophical implications of neural networks now seem more well-grounded. We believe that now, when the efficacy of neural networks is widely accepted, the scope of cognitive modelling using neural networks needs to be expanded. Previously, many neural network researchers attempted to simulate aspects of human learning, in particular linguistic behaviour, using a single neural network and learning mechanism -- the multi-layered backpropagation network, a controlled feedback loop implementing a supervised learning algorithm. The success of such simulations was determined in terms of the ability of the neural network to associate a set of input patterns with a corresponding set of output patterns. Although the strategy employed by early researchers is seemingly valid for simulating ‘low-level’ cognitive activities, a single neural network would certainly prove inadequate for simulating a ‘high-level’ cognitive activity, which involves an interplay of a variety of cognitive aspects. Developmental psychologists have consistently argued that human development - which may include the development of language, sensori-motor control, visual recognition, object permanence, etc. - is achieved through different learning mechanisms, for instance error correction, classical conditioning, self-organisation, pattern classification and so on. Therefore, to perform a realistic simulation of some aspect of human cognitive development, in our case language development, one at least needs a unified framework that (a) incorporates a variety of learning mechanisms; (b) manipulates a variety of inputs - perceptual, verbal, functional, etc.; (c) includes both localist and distributed representation schemes; and (d) satisfies multiple simultaneous constraints.
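The single-network approach discussed above - one multi-layered backpropagation network judged by its ability to associate input patterns with output patterns - can be sketched in plain numpy. The pattern-association task, layer sizes, learning rate and epoch count below are our illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)

# Associate 4 input patterns with 4 target patterns (toy data).
X = np.eye(4)
T = np.roll(np.eye(4), 1, axis=1)

# One hidden layer of 8 units, trained by plain gradient descent.
W1 = rng.normal(scale=0.5, size=(4, 8))
W2 = rng.normal(scale=0.5, size=(8, 4))
eta = 1.0

def forward(inputs):
    H = sigmoid(inputs @ W1)
    return H, sigmoid(H @ W2)

_, Y0 = forward(X)
initial_loss = np.mean((T - Y0) ** 2)

for _ in range(2000):
    H, Y = forward(X)
    dY = (Y - T) * Y * (1 - Y)        # output-layer delta
    dH = (dY @ W2.T) * H * (1 - H)    # hidden-layer delta, backpropagated
    W2 -= eta * H.T @ dY              # adjust weights down the error gradient
    W1 -= eta * X.T @ dH

_, Y = forward(X)
final_loss = np.mean((T - Y) ** 2)
```

By the paper’s own criterion - the discrepancy between the produced and the target output patterns - such a network ‘succeeds’; the argument of this section is that this criterion alone is too weak for high-level cognitive activities.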
This brings into relief the need for a ‘modular’ neural network architecture: an architecture that integrates both supervised and unsupervised learning algorithms in a unified framework, thus exploiting the collective strengths of a variety of neural networks to provide a more ‘realistic’ simulation. In a modular neural network architecture the individual neural networks retain their structural and functional distinctness and can be viewed as independent ‘modules’ of a model. Development of modular neural network architectures, in simple terms, requires the ‘mixing and matching’ of the relative strengths of a variety of neural networks. We propose a framework for developing modular neural network architectures that distinguishes candidate neural networks on the basis of their intrinsic characteristics, such as learning mechanisms, input/output representation schemes, environmental considerations and so on (Abidi, 1994 & 1996). Our framework mainly emphasises (i) psychological and neurobiological distinctions between various neural networks when selecting neural networks to simulate certain tasks; (ii) architectural specifications - determining the number of layers, number of units in a layer, activation update functions and learning parameters; (iii) a plausible connectivity scheme by which various neural networks can efficiently communicate with each other; and (iv) a variety of training strategies, including (a) one neural network learning its training data independently; (b) two or more neural networks learning their specified training data simultaneously; and (c) a co-operative training strategy where one or more neural networks transform the training data to a representation scheme that is interpretable by the principal neural network being trained.

We present below a seven-phase strategy for developing modular neural network architectures:

I. Identify the sub-tasks constituting a complex cognitive activity, and use an individual neural network to simulate each sub-task. Such a neural network can be regarded as an independent module of the modular architecture.
II. Design appropriate neural networks that can simulate the sub-tasks. The design metrics are the number of layers, the number of units in each layer, the connectivity pattern of the layers and the activation function of the units.
III. Develop a knowledge representation scheme that can be shared by other neural networks, i.e. the knowledge stored in one network is accessible to other networks in the modular architecture.
IV. Establish a communication mechanism among the neural networks so that information is accessible throughout the modular architecture.
V. Train each neural network with its respective stimuli, either separately or, if needed, in conjunction with other related networks.
VI. Represent explicitly the knowledge learnt by each neural network, such that it is understandable and has some significance to an external observer.
VII. Formulate a processing scheme to synchronise the overall operation of the modular architecture. The processing scheme should retain concurrency, enhance the processing strengths of the various networks and at the same time avoid unnecessary cross-talk (influence) between the neural networks.

4. ACCLAIM - A MODULAR NEURAL NETWORK ARCHITECTURE

Language development is an exemplar ‘high-level’ human cognitive activity: a complex activity that seems improbable to simulate using just a single neural network. Rather, a realistic simulation of child language development would require a variety of multi-layered neural networks: for instance, one to learn to process lexical input and output, another to learn phonology, and more networks to learn concepts, semantic relations and word order. Based on earlier proposals advocating the ‘modularity’ approach for simulating high-level cognitive activities, we present a modular neural network model, ACCLAIM (A Connectionist Child LAnguage Development & Imitation Model), which simulates child language development within the age group of 9 - 24 months. ACCLAIM systematically synthesises both supervised and unsupervised learning neural networks (including Kohonen Maps, Backpropagation networks, Hebbian Connections and the Spreading Activation mechanism), based on our psycholinguistic model of child language development. ACCLAIM (see Figure 1b) has been used to simulate the development and categorisation of concepts amongst children, together with the lexicalisation of these concepts: the ‘concept memory’ and ‘word lexicon’ have been simulated using two independent Kohonen Maps that are linked together through a Hebbian Connection based ‘naming connection network’.
Backpropagation networks have been used to implement a ‘conceptual relation network’ (for one-word sentence production) and a ‘word-order network’ (for two-word sentence production). Children’s evolving semantic performance has been simulated by a ‘semantic relation network’ using a Hebbian Connection Network. The training data used for our simulation is based on ‘realistic’ child language data acquired from various child language studies.
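The naming-connection arrangement described above - two self-organising maps linked by Hebbian connections - might be sketched roughly as follows. This is a toy reconstruction under our own assumptions: the map sizes, feature vectors, training schedule and neighbourhood function are illustrative, not ACCLAIM’s actual parameters:

```python
import numpy as np

rng = np.random.default_rng(2)

def train_som(data, n_units=6, epochs=200, eta=0.5):
    """A 1-D Kohonen map: move the winning unit (and, more weakly,
    its neighbours) towards each presented pattern."""
    w = rng.random((n_units, data.shape[1]))
    for t in range(epochs):
        for x in data:
            win = np.argmin(np.linalg.norm(w - x, axis=1))
            for u in range(n_units):
                h = np.exp(-abs(u - win))              # neighbourhood function
                w[u] += eta * (1 - t / epochs) * h * (x - w[u])
    return w

def winner(w, x):
    return np.argmin(np.linalg.norm(w - x, axis=1))

# Toy perceptual features for two concepts, and two word forms.
concepts = np.array([[1.0, 0.0], [0.0, 1.0]])
words    = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])

concept_map = train_som(concepts)   # 'concept memory'
word_map    = train_som(words)      # 'word lexicon'

# Hebbian 'naming connections': strengthen the link between the
# co-active winners on the two maps.
naming = np.zeros((6, 6))
for c, wd in zip(concepts, words):
    naming[winner(concept_map, c), winner(word_map, wd)] += 1.0

# Naming a concept = following the strongest connection to the word map.
def name(concept):
    return int(np.argmax(naming[winner(concept_map, concept)]))
```

The design point the sketch is meant to convey is that each map is trained unsupervised on its own stimuli, while the associative link between them is learnt by simple Hebbian co-activation, so the two networks remain structurally distinct modules.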

Uni tary Concept Modul e Perceptual Stimuli

Concept Memory

concepts

Conceptual Relations Network

One-word utterance Concept Memory Kohonen Map

Nami ng Connecti on Modul e Concepts

Concept Memory

Naming Connections Network

Word Lexicon

Perceptual Stimuli

Naming Connection Network Hebbian Connections

Word Lexicon Kohonen Map

Words

Semanti c Relati on Modul e concept Concept categories Semantic Relations Memory Network

LINGUISTIC INPUT (Adult two-word collocations)

Conceptual Relations Network

Semantic Relations Network

Backpropagation

Hebbian Connections

Learnt Semantic Relations

Word-Order Testing Network Backpropagation

Word Order Module Two-word collocation

Word Lexicon

words

(a)

Word-order Network

Learnt Word-order

ONE-WORD UTTERANCES

TWO-WORD SENTENCES

(b)

Figure 1: (a) Four neural network modules each comprising more than one neural network and simulating some aspect of child language development; (b) The modular architecture of ACCLAIM - an integration of various neural networks each simulating an aspect of child language development

Each of ACCLAIM’s constituent neural networks can be envisaged as an individual entity, embodying a different kind of knowledge. These neural networks can then be configured, based on our psycholinguistic model, to realise a variety of neural network ‘modules’, where each module simulates an aspect of child language development. For instance, the naming connection module, which simulates concept naming, comprises three neural networks - the concept memory, the word lexicon and the naming connection network. It should be noted that within a neural network module the constituent neural networks retain their identity and interact with each other. In ACCLAIM, four different neural network modules (shown in Figure 1a) relevant to child language development are implemented by integrating various neural networks. One of the advantages of the modular design of ACCLAIM is that knowledge learnt by an individual neural network is utilised by more than one module; for instance, the concepts stored in the concept memory are used by three modules - the one-word module, the naming connection module and the semantic relation module. At a deeper level, each module can again be envisaged as an independent model, capable of simulating a psycholinguistic process on its own. For instance, a simulation of the child’s development of semantic relations can be performed by employing just the semantic relation module. The modular approach of ACCLAIM makes it possible to work with one module at a time, enabling the simulation of a process with a variety of data without taking other modules into account. More attractively, at a later stage the results of a simulation obtained from one module can be used in simulations involving other modules.
Finally, the efficacy of our simulation of child language development carried out using ACCLAIM is demonstrated by the fact that ACCLAIM is able to produce both one-word and two-word sentences in a given situation that are similar to the kind of sentences produced by a child in the same real-life situation. Furthermore, ACCLAIM is able to handle novel, noisy and incomplete input by generalising to produce adequate and meaningful responses.

5. CONCLUSIONS

We have suggested a neural network interpretation of psychological notions of human learning which should assist researchers investigating the role of neural networks in simulating human learning and cognition. The move from single towards modular neural network architectures may form the basis for more elaborate and realistic simulations of human cognitive activities that involve an active interplay between a variety of processes. Furthermore, by way of ACCLAIM we have demonstrated how neural networks can be used to simulate high-level cognitive activities, in particular child language development. The architecture of ACCLAIM and the resultant processing capabilities should indicate how functionally and structurally divergent neural networks, when synthesised in a meaningful manner, i.e. based on a psycholinguistic model, can simulate a high-level cognitive activity.

REFERENCES

Abidi, S.S.R. (1996) Neural Networks and Child Language Development: Towards a ‘Conglomerate’ Neural Network Simulation Architecture. To be presented at the International Conference on Neural Information Processing ‘96, Hong Kong.
Abidi, S.S.R. & Ahmad, K. (1996) Child Language Development: A Connectionist Simulation of the Evolving Concept Memory. In M. Aldridge (Ed.), Child Language. Clevedon: Multilingual Matters Ltd.
Abidi, S.S.R. & Ahmad, K. (1994) Connectionism as a Model for Child Language Development. In Artificial Intelligence & Cognitive Science, Seventh Annual Irish Conference, Dublin.
Bechtel, W. & Abrahamsen, A. (1991) Connectionism and the Mind. Oxford: Basil Blackwell. Furth, H. G. (1969) Piaget and Knowledge. Chicago: The University of Chicago Press.

McClelland, J. (1989) PDP: Implications for Cognition and Development. In R. Morris (Ed.), Parallel Distributed Processing: Implications for Psychology and Neurobiology. Oxford: Clarendon Press.
Seidenberg, M. (1993) Connectionist Models and Cognitive Theory. Psychological Science, Vol. 4, pp. 228-235.