author. We wish to thank George Kiss for supplying a magnetic tape version ...... .51 saw scissors .25 .18 .08 .29 climbing hanging .20 .16 .11 .14 hammer fist .20.
H ILLINOI
S
UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
PRODUCTION NOTE University of Illinois at Urbana-Champaign Library Large-scale Digitization Project, 2007.
.Technical Report No.
296
SAPIENS: SPREADING ACTIVATION PROCESSOR FOR INFORMATION ENCODED IN NETWORK STRUCTURES Andrew Ortony University of Illinois at Urbana-Champaign Dean I. Radin Bell Telephone Laboratories,
Columbus,
Ohio
October 1983
Center for the Study of Reading READING EDUCATION REPORTS
o0
o)\a^ ^
-
UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN 51 Gerty Drive Champaign, Illinois 61820 BOLT BERANEK AND NEWMAN INC. 50 Moulton Street Cambridge, Massachusetts 02238
The Nationa Institute o: Educatior U.S. Departmento Education Washington. D.C. 20201
CENTER
FOR THE
Technical
STUDY
OF
READING
Report No. 296
SPREADING ACTIVATION PROCESSOR FOR SAPIENS: INFORMATION ENCODED IN NETWORK STRUCTURES Andrew Ortony University of Illinois at Urbana-Champaign
Dean I. Radin Bell Telephone Laboratories,
Columbus,
Ohio
October 1983
University of Illinois
at Urbana-Champaign 51 Gerty Drive Champaign, Illinois 61820
Bolt Beranek and Newman Inc. 50 Moulton Street Cambridge, Massachusetts 02238
The research reported herein was supported in part by the National Institute of Education under Contract No. US-NIE-C-400-76-Q116, and in part by a Spencer Fellowship awarded by the National Academy of Education to the first author. We wish to thank George Kiss for supplying a magnetic tape version of the Associative Thesaurus, the Coordinated Science Laboratory and Computing Services Office of the University of Illinois for granting computer time on the DEC-10, David Waltz for the use of a disk pack which housed a modified version of the Associative Thesaurus, and Harry Blanchard and Glenn Kleiman for their helpful comments on earlier versions of this manuscript.
EDITORIAL
BOARD
William Nagy Editor Harry Blanchard
Patricia Herman
Nancy Bryant
Asghar Iran-Nejad
Pat Chrosniak
Margi Laff
Avon Crismore
Margie Leys
Linda Fielding
Theresa Rogers
Dan Foertsch
Behrooz Tavakoli
Meg Gallagher
Terry Turner
Beth Gudbrandsen
Paul Wilson
Ortony & Radin
1
SAPIENS
2
Ortony & Radin
SAPIENS
SAPIENS: Spreading Activation Processor for
Abstract
Information Encoded in Network Structures This report describes
a
process
memory
in
semantic
computer
implementation
of
a
spreading
and discusses its performance on some tasks often
employed in psychological studies of human language processing. thesaurus
containing
over
between them was used as
the
activation
16,000
words
and
data
base,
thus
all
An
associative
free-associative strengths
making
SAPIENS
confront
the
information and computation problems inherent in large data base manipulation.
One of the hallmarks of intelligence is and
utilize
information
discounting irrelevant comprehension
or
relevant
to
information.
perception)
the solution of a problem while largely
In many cognitive tasks (such
this
accessed from memory more or less
the ability to efficiently identify
means
that
immediately,
exhaustive, or near exhaustive search.
thus
information--is
important
practical
described
in
this
an
information must be
precluding
any
kind
of
subset
of
the
totality
of
important theoretical question in psychology and an
question report
language
Consequently, the relevance problem--the
problem of how to identify a potentially relevant stored
relevant
as
for
takes
AI an
(Artificial
Intelligence).
The
work
AI perspective on the problem, using the
context of natural language processing as its
basis,
although
the
principles
upon which it is based are quite general.
Research in AI has devised various domain-specific mechanisms with
the relevance problem.
Robinson's
(1965, 1968)
early example of an approach that proved fruitful
in
for
dealing
resolution principle is an the
field
of
automatic
theorem proving. In the domain of scene analysis Waltz (1975) solved the problem by taking advantage of distinguishing
the
the
huge
reduction
in
data
storage
problem
solving
from
physically possible pairs from the logically possible pairs
of line junctions in a two dimensional representation of a scene. of
resulting
proper,
a
In
the
area
general guiding principle has been the careful
choice of knowledge representation.
It quickly became apparent that the
choice
could have a dramatic influence on the ease of problem solution, the old problem
SAPIENS
3
Ortony & Radin
Two developments in AI and
to
pertinent
especially
are
psychology
the
The first is the emergence from earlier but vaguer accounts
problem.
relevance
not
primarily
example, Raphael, 1976).
so
on
much
representations (which is the problem of schemas,
scripts,
of
structure the
rather
memory,
and
scripts (e.g. Schank & Abelson, 1977),
be
is the recognition that the distinction between programs and data need
much
less sharp than was generally supposed. The notion is that much information that and
appears on the surface to be declarative in nature can be, represented
as
observation
This
procedural.
important underlying principle of Planner-like languages evidenced
also
the
in
the
devaluation
consequence
of
generalized
knowledge
to
tended
the
the
rely upon associative networks.
(c.f.
is
often
more an
constitutes Hewitt,
1972),
(whether implemented or not) have generally been
structure
of
the
things
organization
knowledge
However, associative networks
system of
structures,
the here
of
Norman
program/data to
be
Rumelhart (1975).
and
called
distinction
is
viewed
merely
as
spaces
in
conduct a specific kind of search operation known as the intersection
to
which
1975).
search (Quillian, 1968; Collins & Quillian, 1969; Collins & Loftus,
The
general mechanism employed is that of spreading activation, and the mechanism is considered reached
is
upon
(e.g.
schemas
have
and
overall
The second, related, development, primarily from AI,
Rumelhart & Ortony, 1977).
advantageously
than
Models of the macrostructure of
represented in memory. (e.g. Minsky, 1975),
the
In other words, it depends on
knowledge representations, variously called frames
generalized
of
nature
on
about
proposals
detailed
1952) of increasingly
Piaget,
working
those
on the overall organization and interrelations of
as
etc.)
to
concern
chief
individual knowledge
of
structure
internal
the
such representations in the system. (e.g., Bartlett, 1932;
depends
Efficient access to all and only the relevant knowledge structures
for
(see,
example
convincing
of the mutilated chess board being a simple but
SAPIENS
4
Ortony & Radin
to have succeeded when it discovers an intersecting node that can
from
the
be
This limited use, however, fails to
different source nodes.
A
that
"schemas," are partly
capitalize on the power both of network representations themselves, and spreading activation mechanism. the potential power of the
The principal purpose of SAPIENS was
spreading
activation
mechanism
the
and
of
the
to harness semantic
procedural and partly declarative. network
representation
to
simulate
schema
selection.
Furthermore, this was
Because the utilization of a schema (in an AI system or in human cognition) undertaken in the context of results
a data base of
sufficient size that the
of automatically, it offers a powerful way of dealing with part
of
the
system's
design could be generally applicable rather than relegated to
relevance
the
the category of ungeneralizable "toy" problems. problem,
but
it
does only deal with part of it. for
bringing
the
into
play in the first place.
of
schema
utilization),
it
was possible to ignore the structure of the
The nodes are simply English
words,
although
we
make
the
In this report we describe a assumption
computer implementation of a process
to
appropriate nodes in the net.
schemas
Given this goal (as opposed
The missing component is the that
schema selection mechanism which is responsible candidate
principles
a great deal of potentially relevant information becoming available
in
for doing this--a process proposed
that
as
such,
they
can, in principle, be treated as the names of
earlier schemas.
in Ortony (1978). Our assumption that a network of words can be regarded schema
names
is
an
important
simplifying
assumption
as
which
a
netweork warrants
of some
Ortony & Radin
elaboration. he
5
SAPIENS
In a complete representation, we assume that there would
at least three different
episodic level.
have
levels--a lexical level, a conceptual level, and an
The lexical level represents connections between words
without
distinguishing between distinctly different meanings that a word might have. the lexical level a word like bank could have connections to other of
which
to
words,
At some
Ortony & Radin
6
Consequently,
the
process
cannot presuppose that a semantic representation of
the input has already been achieved. process
has
to
operate
SAPIENS
at
This
means
that
the
schema
selection
a relatively impoverished semantic level.
Of the
three levels that we have proposed, only the lexical level is devoid of semantic content.
(e.g., money) have to do with the "financial institution" sense, some A spreading activation mechanism ought to have at least the following
(e.g., weeds) with the "side of
a river" sense, and some with
other
senses
characteristics: the word such as those related to basketball, airplanes, etc. level represents associations between words, not between connections
are
represented
at the next,
Thus, the lexical
concepts.
conceptual, level.
five
of
Conceptual
(a)
context
sensitivity,
different patterns of activation for context conditions;
the
which
same
permits
input
the
string
production of
under
different
(b) efficiency, permitting a mechanism to operate in a space
At this level, we containing perhaps
tens of thousands of nodes;
(c)
decreasing
activation
over
suppose that there are separate schemas for the distinct meanings of a word like time, bank.
Furthermore,
so
as to prevent every input from activating the entire network forever;
these schemas are ordinarily not directly connected to one (d) summation of activation from different sources, so as to permit differential
another.
However, they are connected
through the lexical
level
in
the
sense activation
that
they are all directly connected
levels on equally distant nodes;
and (e) an activation threshold for
to the word or words that constitute their each node which determines whether it will transmit activation to other nodes.
labels or names.
Finally, we assume that representations involving some of
more
specific
the Based on these principles, a processor
noteworthy
experiences
centered
developed. represented at the episodic level.
At this
for
operating
on
a
network
was
around particular schemas are
level,
individual
In
it,
spreading
activation
is used as a mechanism not just for
representations finding intersections, but for identifying constellations of candidate nodes for
are
again directly associated with particular schemas, perhaps indexed in terms employment
of notable deviations from the canonical representations (see Schank, 1982).
in
restrict the the
present
the
process
at hand.
set of potentially relevant nodes.
The ability to select
report, we investigate the degree to which schema selection can be set
facilitated through processes
that are restricted to the lexical
level.
of
both practical and
theoretical reasons why this that it
is much easier
to
of potentially relevant nodes for possible use in subsequent processing an
important
component
in
an
intelligent
construct
a
data
base
While
we acknowledge that it is not sufficient to endow a system with
of genuine wisdom, we were unable to resist the name SAPIENS--Spreading
lexical
associations
than
it
Processor The theoretical reason is that
the
for
Information
Encoded
in
Network
Structures.
The program was
schema written in MACLISP and implemented on a DEC-10 computer. The data
process
is
of
Activation
is to construct a comparably sized data base of
conceptual and episodic structures. selection
small
is a worthwhile enterprise. system.
The practical reason is
a
There is, as we have already suggested,
are
In other words, the mechanism is used to
In
necessity
a pre-semantic process.
associative process whose goal is to permit a determination of the meaning
base
was
an
That is, it is a of
some
thesaurus
consisting of over 16,000 words and all free-associative
input. strengths between them (Kiss,
1968).
Since
the
data
base
was
empirically
Ortony
7
& Radin
generated
(by
SAPIENS
soliciting associative responses from students), and since it is
quite large, the network can be regarded as a reasonable facsimile of (part
of)
small subset of relevant concepts potentially usable for further processing
Four tasks were used to examine the performance of SAPIENS, While
It was considered important to use a large, empirically realistic data base
these
two
emphasize that they are not intended as
reasons.
First,
simulated process, semantic
it was necessary to have a data base with
diversity
and
trivializing the problem. called
in order to be able to test the effectiveness of the
"combinatorial
with
sufficiently
explosion"
problem contain
great
interconnections
Second, since a large data base was
simulated semantic network models example,
rich
a
than
Quillian, 1968, encoded about 850 nodes).
a
to
used,
had to be addressed. less
deal
are
aware
of.
avoid
the
so-
Most computer
thousand
nodes
(for
The present system works on
a data base an order of magnitude larger than any other semantic network we
of
system
Clearly the time requirements resulting from the massively
by,
for example, inference or problem solving mechanisms.
some arbitrary individual's lexical map.
for
SAPIENS
Ortony & Radin
tasks
they
were
some
of
suggested by published experimental work, it is important to simulations
of
experiments.
Rather,
should be viewed as illustrations of the kind of problems that SAPIENS can
handle and of the way in which it handles them.
Thus, we do not view the
tasks
as merely being relevant to the question of whether a spreading activation model can, in principle, provide simple solutions to the schema selection problem to
such
issues
as
lexical
disambiguation.
other can in principle solve some set
and
Proposals that some mechanism or
of problems are not very
compelling.
We
view performance on the tasks as demonstrating that a spreading activation model employing a realistically large number
of
nodes
does
solve
these
problems.
increased number of possible paths in a network of 16,000 nodes put much greater
Furthermore, we consider it important that we have a working program to do this,
demands
rather
on
the
processor.
Finally, we felt that
only with a relatively large
and semantically diverse data base would it be possible to explore the potential
from
the
entire
16,000
the
names
of
it
quickly
word network a restricted set of 10 to 20 These words can be
thought
of
as
the best candidate schemas for subsequent top-down processing by
other mechanisms. In this respect, SAPIENS is analogous to the filtering process in
Waltz's
(1975)
words.
line segments (out of
by a semantic description
most-
thousands of possibilities) for further processing
mechanism.
proposal--the
SAPIENS takes an input string and finds a
The
enterprise,
therefore,
is
an
AI
be
used
to
disambiguate
ambiguous
second task seeks to show how standard typicality effects found in
various laboratory tasks
can
performance
cued
in
a
mock
be
accommodated
recall
by
"experiment"
SAPIENS. is
Third,
examined.
describe a simple examination of the effects of manipulating word
SAPIEN's
Finally, we
order
of
an
input string.
program that generates semantic descriptions of scenes with
shadows. This filtering process takes a scene and finds a small subset of likely
theoretical
The first task shows how context can
The main result of SAPIENS is that, given several input words,
relevant words without extensive searching.
a
enterprise.
of the system for dealing with a broad range of tasks.
identifies
than
It should be emphasized that we do not claim that the mechanism we is
sufficient
to
realize
complete
solutions to
schema utilization is equally important. conceived,
a
spreading
activation
all such problems.
However, we do
mechanism
may
well
claim be
that, a
propose Clearly properly
fundamentally
Ortony & Radin
SAPIENS
important ingredient insofar as it represents a viable solution selection problem.
to
the
schema
We consider it significant that so much can be achieved with
so little, considering that
the
only
data
that
SAPIENS
operates
Ortony & Radin
with
are
associative strengths between words.
10
SAPIENS
The words that are used as input strings to SAPIENS are called seeds. seed
is
weighted so that the original associative strengths can effectively be
altered to simulate context effects and decreasing expansion
consists
of
accepting
each
input
activation
seed
over
complete, Before discussing the design characteristics of
the
program
it
will
be
define and clarify our use of a number of key terms.
First, in
the the
discussion that follows we do not distinguish between nodes, words, and
intersections
are
found
within
relevant
forward environment or RFE.
the
expanded
list,
concepts.
This
is
time-slice
of
English
words.
(and therefore the RFE) occur within one
or spread.
This breadth-first
expansion in one time-slice simulates
computation process, and thus spreading activation in SAPIENS can
be
However, our theoretical orientation regarded
views words as
comprise
The expansion of all the input seeds
The network contains nodes which a parallel
representations
them,
these
not because there is no difference, but because in the
present context the differences do not matter. are
and
schemas and the identification of intersections
or
An
in turn as a stimulus and
intersections, along with the activation levels associated with to
time.
creating a single list of all responses to these stimuli. After the expansion is
Design Features
helpful
Each
labels or names for schemas or
concepts.
We
can
as a parallel process which simultaneously spreads activation out from
consistently several input seeds.
maintain
this
position
even
though
underlying schematic structure that
we have made no attempt
to implement the The operation of SAPIENS
such nodes would have to have if
they
solution.
to be employed in a full-fledged natural language processor. The network links
itself, then, consists of some 16,000 English words
and
is analogous to growing a crystal in
a
saturated
were Input
seeds
used
chemical seeds
used
interconnected
associative
to
to
start
a
start
spreading
crystal-growing
activation process,
are similar to
and
the
densely
their network
is similar to a densely saturated chemical
which represent the empirically based associative strengths between them. solution. The seeds constitute a core around which layer upon layer of molecules
Associative
strength is an integer measure (1-100) of how many times a
response (or
(R)
was
given
to
a
specific
stimulus
(S)
by 100 people (Kiss, 1968).
nodes)
cluster.
This
clustering
around the seed core. The first layer is proportion of
the associative strength "sent" from S to R is called
formed
of
molecules
attracted
to
the
seeds.
are
the total amount of activation currently being
is the sum of all activations
coming
into
that
further
any
other stimulus words that
set
of
In a
are "transmitting" or "sending" activation.
The associative thesaurus contains both forward (S-R) and The
point,
word an attraction threshold fails to be reached and the growing process stops.
from
most
After several layers have formed, there is
less of a tendency for the crystal to continue growing. At some received at that word. It
that
activation. strongly
The activation level of a word is
produces an onion-like series of shells
The
reverse
nodes that can be reached from (i.e. as responses
(R-S)
similar fashion,
the
connected
around
first
SAPIENS
spread
creates
a
cluster
of
strongly
data. nodes
the input seed core.
As each new layer is formed, the
to) a particular size of the cluster increases, and more nodes are pulled into the RFE.
word is called the forward environment. layer, however, is formed of successively more remotely related nodes.
Each new
11
Ortony & Radin
related to the core. clusters
after
Normally, SAPIENS stops
Design Example
are
mildly
related
at
a
find
small subset of
since
spread
second
the
but more importantly, the
best,
purpose for the spreading has been fulfilled in one or two spreads, quickly
second
namely,
to
for simplicity's sake, fictitious but realistically representative data.
using,
of
weights
all responses and associative strengths from the
useful
for
notice
likely
be
to
Figure 1.
entire network.
that
On average, each node in the network spreads out to a
algorithm,
goal-searching be
eventually
reached, It is
interconnected. prevent untold
especially
thousands since
20
about
upon
thousands
the
network
associates. would
nodes
of
so
is
the was
fruit
In
0.51.
other
words,
they
seeds
plus the intersections are re-input as
is
important
empirically-based
are
Insert Figures
the in the expanded space and to
result.
will
as illustrated in Figure 2.
As
as that
is,
an
area
of
very
input
for
next
the
time-slice.
The new input list contains the original the
intersection
words
at
weights
form as more intersections are added to the inputs. which
The
is created
high seeds with input weights 1, together with
to
create
seeds, it becomes more likely that most
of the activation will circle in on itself, begin
transition
1 and 2 about here
Once the intersections are found and the strengths summed, a new list
will
fact
the
---------------a new list of the intersections and summed strengths
activation
as shown in
new seeds. The result of
this re-input is to restrict the kinds of intersections that more and more intersections are input as
to
probabilities.
The next step is to find intersections original
It
seeds.
proportion of subjects producing banana as a response to the stimulus
densely
spread
input
with
The value 51, for example, from fruit to banana, refers to
this same density that enables the clustering approach to
thousands of nodes from being activated. At each new
both
the total activation output from each seed equals 100,
that
further processing, but the ones that are useful are found without searching the
fruit,
First, apple and fruit are expanded to create a list containing
1.
Not
are
and
Suppose the input seeds are the two words, apple
relevant concepts for further processing.
all of the words found represent concepts that
In
SAPIENS
12
We shall now describe a fairly detailed example of the operation of SAPIENS
another layer about the first, but is associatively less strongly
wraps
successive
Ortony & Radin
The
The first spread creates a tight cluster around the seed core. spread
SAPIENS
inputs are recursive, each input containing part of the
previous
one,
reflect
the
proportion of total activation on each word (see Figure 3).
and The purpose of using proportional weights on the intersected nodes is to prevent
the
effect
is
to quickly excise from the network the relevance structure that an
activation explosion on the next time-slice.
The weights on a seed are used
exists around the original inputs. to multiply the associative strengths example,
since
response from red activation
the
of
each
response
to
that
seed.
For
weight on the node red is 0.3, the activation sent to each
will
be
multiplied
by
0.3.
This
means
that
the
total
sent from red will be 0.3 x 100, or 30, because each node originally
outputs a total of 100.
The original seeds are always re-input at weight 1
and
SAPIENS
13
Ortony & Radin
The maximum strength sum attainable after any spread is 100xN + 100,
successive intersection nodes are always input at proportional weights which sum to 1 to simulate the attention being placed on the input seeds (high activation) of
process
the
and
activation
spreading
operating
activation)
(low
on
N
the
equals
number
outputs a strength of intersected
intersected nodes.
100 (weight = 1),
the
second
and
is forced to equal 100
nodes
spread,
sum
the
of
strengths
(sum of weights = 1).
injected into the network at
the
to prevent an activation explosion),
to 1 (so as
decreases rapidly. the
spread,
is constrained to sum
the activation on intersections
increases in length on each
As the new seed list
be
to
happen
word
each
on
activation
in
intersections
too,
for these particular responses are the only
It is,
of
possible,
course,
and
different strength sums. For example, if the maximum strength is 300, and
desirable,
proportional
activations
on
continues because some nodes
additional
receive
will
seeds produces a sum
of
60
activation
percent.
(and one time-slice)
is complete.
spreads,
another
and
the new input list, the spread
available
activation
Thus,
the first pair of seeds
and
the
second
pair
represents
is better "integrated" (in the sense of
"inter-related") than the second pair by 10 percent of the available activation.
apple
containing
always
more
intersections and
The
are
exceptions
included
in
the
RFE
it
which
is
Each word in the RFE has an activation
smaller activation strength sum. "Association" in this sense is
whether they received
levels
associated
the total amount of activation received by that word in the
time-slice that just ended. The activation activation
strength
of
strength
nodes currently in the RFE.
activation strength sum is 74.
sum
is
the
sum
of
a
the
thus better thought of as activation or not.
hence a higher activation strength sum.
Weakly associated seeds may produce an RFE with only a few intersections, and sources.
correspondingly are
an
Normally all nodes that appear in the RFE must
have received activation from at least two which
produce
and RFE
seeds,
10
At the end of each cycle the result is
Notice, as can be seen in Figure 4, that the input seeds
fruit do not have strength sums.
pair
a sum of 30 after 2 spreads, we can say that the first pair represents
Input seeds that are strongly associated with one another will
a new RFE.
with
2
from 20 percent of the
Continuing the example, with the creation of
after
RFE words will change as the
relevant
newly accessed nodes.
input
intersections)
the relevant forward
produces
cycle
of
beginning
nodes receiving activation in the network.
one pair of
spread
In our example,
successive
among
the
all
A maximum attainable strength sum enables us to make meaningful comparisons
proportional
environment (RFE) must decrease. that
from
300 is the maximum attainable strength sum. This can
then
occur only if all of the responses (from the input seeds and the Since the total input weight of all intersected nodes
where
This is because each input seed always
input seeds.
of
if a total activation of 300 is Insert Figures 3 and 4 about here
SAPIENS
14
Ortony & Radin
all
In the present example, the
a measure of "inter-relatedness" between words.
SAPIENS
15
Ortony & Radin
the
Program Performance
usual
SAPIENS
16
Ortony & Radin
are vague, being consistent with a wide range of
rather they
sense,
very different kinds of referents. Lexical Disambiguation described
The first task to be
The most
this problem is the need to disambiguate homonyms on
the basis of contextual information. of
the
comprehension]
with
problems is
models
puts it:
As J. Anderson (1976) all
more
deal with multiple word senses or multiple syntactic possibilities spreading
model provides
activation
simulate
on
a
computer,
question of their psychological validity. (p.
The degree to which SAPIENS is disambiguation
problem was
able
448
than
The one
solution
the
to
of
The
input seeds.
intersections
meaning
of
the
For example, consider the words mint, bank, bar, fruit, and the usual sense:
between the sense of mint as candy, and as
they
each
a
place
for
have
manufacturing
in distinguishing the sense of bank as
institution from the sense in distinguish the sense of bar as
which
to
ought
be
if yellow were part of the
whereas,
The results of the simulation
Insert Table 1 about here
more
degree of
rivers
have
banks.
And
we
coins.
a financial wanted
to
a place to have a drink, from the sense in which
a rod is a bar. The remaining cases, fruit and game, are not ambiguous words
in
interconnectedness or integration of
an input. Thus, for example,the first two rows pecuniary
mint
of
sense
is
better
in
the
RFE
in
Table than
1
suggest
that
river appears high in the RFE (see
Row
3
that the original seeds
of
the
Table
1)
the
the gastronomic sense. appear
only
receive activation from their associates.
they themselves
if
of
measure
the word clusters associated with
integrated
SAPIENS is
Another interesting property of
because
it
So,
received
activation from water, stream, and/or other activated elements in the RFE.
distinct meaning. In particular, we were interested in distinguishing
Similarly, we were interested
lemon,
context, one might expect the reverse to be true.
... are hard
investigated by injecting a context-setting word and
first three are ambiguous in
the context for fruit, apple
Recall that the activation strength sum can be taken as a
that were clearly associated with the contextually appropriate
game.
or
instances of the concept.
are presented in Table 1.
found in one or two spreads were usually sufficient to produce clusters of words
word.
banana,
than
impose
The
)
contribute
to
the target (ambiguous) word into the network as
disambiguous
available
to
ought
is irrelevant to the
difficulty
this
but
...
provide
to
ought
the potential for associative context
to prime a word's meaning. These parallel, strength mechanisms to
constraints on the relative accessibility of different Thus, for example, if red is part of
they run into difficulties when they must
that
it
context
terms,
vague
an ambiguous word. In the case of the
language
[computer-based
current
context
of greater availability of words related to the appropriate meaning of
evidence
typical example of
One
the
in the face of more than one alternative.
appropriate word meanings
selection of
of
problem
to
sensitive
If SAPIENS is appropriately the
with
concerned
is
A general observation that can be made about these results ranked
highly
nodes in the RFE are, as
cases where the RFE is bank
in
related, related
large (as
is that the most
a rule, semantically highly related. In
indicated by a high strength sum),
for
example
the context of money, the least highly ranked concepts are very weakly often
representing
concepts.
For
syntagmatically,
example,
make
rather
than
paradigmatically
appears very low on the list for bank.
Ortony & Radin
17
SAPIENS
Presumably it is only there at all because one puts money that the
bank.
One
suppose that and
the
way
makes
into
to interpret the varying numbers of nodes in the RFE is
the more nodes there are in it, the better integrated the
RFE
to is,
more knowledge there is associated with the particular use of the term
(compared only, of words,
one
uses
course, to alternative uses
giving
rise
to
larger
frequency, or more typical ones.
RFEs
of
the
might
same
be
term).
regarded
In
other
as the higher
However, caution should be exercised
in
Ortony & Radin
18
SAPIENS
drink and The mother made a drink suggest different kinds of bar
was
crowded
and
The
while
The
bar was rusted imply different kinds of bars. Since
SAPIENS finds items relevant to all of followed
drinks
the
input
words,
the
RFE
for
drink
by bar will reflect both a bar sense of drink and a drink sense of bar.
Similarly, finding a relation between red and fruit, and fruit and red entails a comparable process, except that the seeds are (technically) unambiguous.
this Typicality and Verification Tasks
respect: it
is not being claimed that, for example, the most probable use of the
word mint,
is the pecuniary sense, but only that that
than
particular
the
present
data base.
preferred
sense
is
more
alternatives that were investigated in the context of the
It is perfectly possible that mint as
sense in some other data base.
herb,
would
be
the
Furthermore, a realistic test of this
would need to control for the frequency of
the context-setting word,
It is now a well-established finding
probable
which
was
which
true
typical the example, x, is of the category, y, Shoben,
&
schema
selection
(presumably
in
top
down
processing),
it
can
be
concluded
speed
with
Rips, 1974).
(see, e.g.,
Rosch,
1973;
Smith,
If apple is a more typical fruit than strawberry, then
This fact is, of course, consistent with the view that
exemplars are more available than less typical
ones.
The
second
for set of
use
the
the sentence An apple is a fruit will take less time to verify than the sentence
more typical If the program is viewed as a model of
that
of the form "An x is a y" can be verified depends on how
A strawberry is a fruit.
not done in the present case.
later
sentences
in Psychology
simulations was conducted to see whether or not SAPIENS would demonstrate
that SAPIENS is such differential availability.
reasonably successful. small
relevant
subset
This relevant subset, related
to
the
The clustering process was supposed to of
find
a
the nodes without resorting to extensive searching.
the RFE, ought to have, and
input
quickly
seeds
taken together.
did include, nodes
that
were
The program, it can therefore be
concluded, is able to restrict the candidate nodes for subsequent processing that
so
those candidates are likely to be relevant to the appropriate meaning of
a
Injecting apple followed by fruit containing
that SAPIENS does not know which seed is
the context-setting word
activation
the
RFE
By comparison, if
the
with
several
different
kinds
of referents.
an
RFE
listed
in
order
of
seed
nodes
are
strawberry
activation
proportion
of
0.21
and
summer, and
(sum
strength
64).
Thus,
as expected,
the more typical exemplar, apple, produces an typical one.
Similarly,
robin
are followed
compatible
are
comprises cream, juice, jam, raspberry, pie, tart,
RFE with greater overall strength than the less Both words
in
and
the target word. If both nodes are somewhat ambiguous, a "complementary
disambiguation" can occur. Consider for example, drink and bar.
results
level; the proportion of available activation was 0.32 (sum strength
for this RFE was 97).
equals which is
network
the words pie, pear, orange juice, tree, juicy, tart, banana, green,
food after one spread, with an Note
the
sauce, and summer after just one spread. These words
fruit,
word as determined by the context.
into
The bartender made a
by bird gives bird, song, sparrow, swallow, thrush, and starling after
SAPIENS
19
Ortony & Radin
two spreads, with an activation proportion of
while
0.38 (sum strength of 114),
penguin followed by bird gives bird, fly, black, feathers, feather, flight, skky, and wing after two spreads, with a smaller activation proportion
of
(sum
0.29
Finally,
observations
of
couple
of
First,
RFEs.
it
for
again,
present
really
not
is
this
purposes
should be mentioned that psychological experiments on that
are
subjects
faster
at
the
about
rejecting non-examples like A
are very fast at contents
it
fly. As it is currently set up, SAPIENS cannot handle
examples of category members than poor ones, but also that they
good
verifying a
making
worth
is
but
negative properties, important.
to
sentence verification routinely reveal not only
strength of 87) than in the robin case. At this point it
unable
are
penguins
SAPIENS
20
Ortony & Radin
is
pencil
a
bird.
RFEs
The
should be remembered that the data that form the corresponding
the
from
resulting
inputs
can
be
interpreted
in
manner
a
basis of the network were collected several years ago and represent some amalgam consistent with this finding, for, in the case where the term refers to of
a
non-
Assuming
number of undergraduate students at Edinburgh, Scotland.
large
a
of the category, the RFE tends to be empty so that the sum strength is
instance English than the general tendency of the dialect to be closer to that of British
equal to or close to zero. to
it
English,
American
be
should
recognized that robins (being relatively
uncommon in Britain) are not, in fact, the best thrushes,
and
peculiarities
of
this
kind.
more
drink bar.
by
U.S.
standards
In Britain one of
(especially at the bar) is
(perhaps
green as they are red
pies
in
used
causes no real problems.
and
highly activated.
wing
Nurses have to be licensed.
example
present most
typical
things
to
drink
in
a
pub
Sometimes
as
a
word
this is just a plural form, sometimes a
Furthermore, it is
interesting to note that the second It is possible
that
words
contributed enough activation to fly for it to become
On the other hand,
perhaps
fly
refers
to
the
fact
that
cue.
cue
remote
In
the
actress would be the close cue for sentence (1) and the remote
cue for sentence (2), while doctor would be the close cue for sentence the
highest RFE word for the pair penguin/bird was fly. feathers
(2)
Later, subjects were given either a "close" or a "remote" recall
a pint (of beer).
feather and feathers).
Nurses are often beautiful.
Another
tarts.
and
(1)
might be the occurrence of pint in the RFE for
the
subjects studied a
number of simple sentences such as (1) and (2).
verb form, and so on. Ideally, these would have been removed, but their presence
like
In an experiment reported by Anderson and Ortony (1975),
of
the experience of most
Notice also, that the RFE often contains more than one cognate (e.g.
number
so) because there is a subcategory of apples known as cooking apples
which are always green (and sour) and are peculiarity
There are a
For example, it is
English people that apples are typically just as even
Cued Recall
starlings are all better. Notice that these other good instances
in fact find their way into the RFE for robin and bird. other
birds--sparrows,
of
exemplars
for
sentence
(1).
The
experiment
showed
that
(2)
and
what
was
semantically close or remote depended on the entire sentence rather than on part of
it
since
sentences
the
cues
were
which included key
not words
sufficient
to permit the recall of control
like beautiful and licensed.
For
example,
21
Ortony & Radin
the cues were not
SAPIENS
differentially effective, indeed, they were not
all for control sentences like (la)
effective
at
and (2a).
Ortony & Radin
just
one
source
Landscapes are often beautiful.
(2a)
Taverns have to be licensed.
cue
probability
account
sentences: distant
is
an
5d).
example
The of
intersection a
seed-cue
between
the
intersection
RFE2
and
the
(Figure 5e).
The
activation strength sum of the seed-cue intersection is directly related to
Ortony (1978) has suggested that a model along the lines of the present one can
SAPIENS
will not exceed its threshold because insufficient activation
will be received (Figure expanded
(la)
22
for
these
results.
close recall cues
recall
cues
Different RFEs are created for the different
integrate strongly with their respective RFEs,
integrate
weakly
and
sum is
of
the
recalling the input given a cue because the activation strength
a measure of the overlap between the input seed RFE and the cue
RFE.
It
is reasonable to assume that if a cue RFE completely overlaps (and restimulates) the input
seed RFE that the probability of recalling the
inputs
will
be
very
with them. These integration strengths high.
On the other hand, if a cue RFE does not overlap any portion of the input
reflect the probability of a cue eliciting recall of a sentence. The task in the Anderson and Ortony first
creating
(1975)
experiment
was
simulated
by
RFEs from input seeds corresponding to the substantive words in
the to-be-learned
sentences.
Then the expanded cues were intersected
input seed RFEs to find the amount of operations used by SAPIENS is
integration between them.
with
the
The sequence of
seed RFE, it is unlikely that the
input
associates
reactivated.
of
the
inputs
were
will
be
recalled Therefore,
strength sum is greater than another, it seems reasonable former
to
because if
no
close
one activation
assume
that
the
sum will reflect a greater probability of input sentence recall than the
latter sum.
The results
are summarized in Table 2.
illustrated in Figure 5. Insert Table 2 about here
Insert Figure 5 about here In almost all cases The process begins when seeds A and B are injected into the network (Figure 5a).
Intersections are then found between expanded A and expanded B. This area
is called RFE1 (Figure 5b).
The seeds and RFE1
(Figure
in
5c)
as
explained
are
re-input
the basic design section.
to
the
This gives rise to a
second RFE (RFE2) resulting from the intersections of seeds A and B, (Figure
5d).
part of this that
a
Any
word
that
with
RFE1
receives activation from at least two sources is
(and subsequent) RFEs because the notion of
certain
network
a
threshold
amount of activation is received by a node before
transmit activation to other nodes.
requires
it begins to
In SAPIENS, a node receiving activation from
Ortony (1975).
the trends are in the direction found by
in
any way affected the validity of the test.
the cue sexy, for the sentences about nurses was the
original
network. revert
and
Not all the words from the original experiment were available in
the network, consequently some minor changes were needed. changes
Anderson
cue
used
None of these wording For example, the use of
necessitated by the
fact
that
in Anderson and Ortony (1975), actress, was not in the
As an aside, it should perhaps be mentioned that
this
forced
us
to
(for the purposes of science, only) to the standards of sexism that were
pervasive at that time the data base was assembled some 20 years ago.
SAPIENS
23
Ortony & Radin
It is important to emphasize that no
claims
specific comprehension and memory processes. whatever the comprehension process is,
are
being
made
about
here
All that is being supposed is that
bus
is weighted high, and bus-related items are ranked high when the input seed is weighted high.
an
of
it does involve the establishment
SAPIENS
24
Ortony & Radin
Insert Figure 6 about here RFE
to
limit
also assumed that that RFE could be used as part of the
representation
is
It
the schemas that are brought into focus for that process.
the
of
the
This result suggests that the salience of remembered
sentence
reproduced
(or
from
it),
input
an
play
can
words
a part that apparently can be role
important
in
determining
The
the activation levels of nodes in an RFE.
gainfully employed in the retrieval process. tend
activation levels are important because the nodes with the most activation Word Order Effects
to be more relevant than lower activated nodes.
Compare sentences (3a) and (3b).
Needless to say, assigning weights of 10 to the agent, and 1 and the patient
(3a)
The boy smashed the bus.
(3b)
The bus smashed the boy.
is rather an arbitrary way of handling the problem, yet it seems
to work for all that. weights In sentence (3a) images of vandalism and boy-related items are likely to come to mind, whereas in sentence (3b),
accidents and bus-related items come to mind.
Ordinarily SAPIENS would accept boy, smash, and bus parallel input string.
exactly
the
one
simultaneous,
same
That is, boy, smash, bus and bus, RFE.
Since
different meanings, it would be nice if
sentences
smash,
and
boy
(3a) and (3b) actually have
be
would
A more sophisticated approach
select
to
the
the basis of some kind of optimization procedure. Since SAPIENS has
on
For
role.
case
no parsing ability, the user has to apply the weights to each
natural language it would be necessary to recognize, for example, that
handling
Thus the syntactic features of subject and predicate, or
actor and object, are lost. produce
as
verb
the
to
the
be made. The purpose of the present demonstration, however, is only
to
weights
to
adjustments
appropriate
the input was in passive form so as to permit
to show that,
in principle, SAPIENS has
to
flexibility
the
be
sensitive
to
simple syntactic constraints.
the relevant syntactic information could Conclusion
somehow be preserved. Natural language processing, like other One way in which this can be done is to give a greater weight to the
areas
of
AI,
has
to
face
the
agent problem of how to reduce the search set of candidate representations if it hopes
of
the
input sentence.
The agent (subject) might, for example, be assigned an to utilize appropriate ones to facilitate top down processing.
input weight of 10, and the verb and patient might be assigned weights of,
to 1.
proved
be
a
somewhat
stubborn
problem
in
the
past,
and
one
which
becomes
SAPIENS
appears
Such weights could be regarded as reflecting the salience of each case role. increasingly difficult as the size of the data base increases.
Figure 6 shows the effects on the RFE that such a manipulation of
weights
has. to
As
This has
say,
one
might expect, boy-related items are ranked high when the input seed boy
be
a viable general solution to this problem, because it quickly produces a
Ortony
& Radin
small set help
25
SAPIENS
behind the design of mechanisms
that
SAPIENS
would
be
are
essentially
employed
independent
AI
domains
as
well.
embody a realistic representation of
the
particular
in a complete natural language processing
The chief constraint on SAPIENS the
associative
SAPIENS
References
The principles
of
system, and, in fact, there is no reason why SAPIENS could not
nodes.
26
of candidates and is able to do so in a manner that would permit it to
in a range of important natural language processing tasks.
other
Ortony & Radin
be
utilized
in
is that it has to
connections
between
its
In our implementation the onerous task of assembling these data was done
Anderson, J. R.
Language, memory, and thought.
Anderson, R. C., & Ortony, A. polysemy.
New York:
On putting apples into
Wiley, 1976.
bottles -- A
problem
of
Cognitive Psychology, 1975, 7, 167-180.
Bartlett, F. C.
Remembering.
Cambridge:
Collins, A. M., & Loftus, E. F. processing.
A
Cambridge University Press,
spreading
activation
theory
of
1932. semantic
Psychological Review, 1975, 82, 407-428.
independently. Collins, A. M., & Quillian, M. R. While spreading activation occasionally
used)
before
been shown to be viable or significant
size.
proposal, and as since
it
SAPIENS
mechanisms
have
been
proposed
(and
in
AI,
in both psychology and AI, such a mechanism has not efficient is
when
applied
to
a
data
base
of
any
an implemented system, rather than an abstract
such, the specific details
of
its
design
become
important,
is just such details that distinguish approaches that could only work
in principle from those that work in fact.
Retrieval time from semantic memory.
Journal
of Verbal Learning and Verbal Behavior, 1969, 8, 240-247. Hewitt, C.
Description and theoretical analysis (using schemata)
language
for
A
proving theorems and manipulating models in a robot (AI Tech.
Report No. 258). Kiss, G. R.
of planner:
Cambridge, Mass.:
M.I.T., 1972.
Word associations and networks.
Journal
of
Verbal
Learning
and
Verbal Behavior, 1968, 7, 707-713. Minsky, M.
A framework for representing knowledge.
psychology of computer vision. Norman, D. A., & Rumelhart, D. E.
New York:
In P. H. Winston (Ed.), The
McGraw-Hill, 1975.
Explorations in
cognition.
San
Francisco:
Freeman, 1975. Ortony, A.
Remembering, understanding, and representation.
Cognitive
Science,
1978, 2, 53-69. Piaget, J.
The origins of intelligence in children.
New
York:
International
Universities Press, 1952. Quillian, M. R. processing. Raphael, B.
Semantic memory.
In
Cambridge, Mass.:
M.I.T. Press, 1968.
The thinking computer.
M. Minsky
San Francisco:
(Ed.),
Semantic
Freeman, 1976.
information
SAPIENS
27
Ortony & Radin
Robinson, J. A.
A machine-oriented logic based
on
the
resolution
Journal of the Association for Computing Machinery, 1965, 12, Robinson, J. A.
The generalized
resolution
principle.
In
principle.
23-41.
D. Michie
(Ed.),
Table 1 Machine
Intelligence 3.
Edinburgh:
Edinburgh University Press, 1968.
Results of Disambiguation Simulation Rosch, E. H.
On the internal structure of perceptual and
In T. F. Moore (Ed.), New York:
Cognitive development and the acquisition of language.
& Ortony, A.
C. Anderson,
Dynamic Memory.
Schank, R. C.,
& Abelson,
Hillsdale, N.J.:
The representation of knowledge in
R. J. Spiro
acquisition of knowledge. Schank, R.
&
W. E. Montague
Hillsdale, N.J.:
(Eds.),
memory.
Schooling
Number of Nodes in RFE
Strength Sum
Proportion of Total Activation
3
22
.07
11
77
.26
9
87
.29
23
123
.41
3
23
.07
beer, drink, pint
7
83
.28
fruit
orange,
4
25
.08
red
fruit
apple,
green, hot
3
31
.10
card
game
play,
poker, board
3
28
.09
ball
game
play,
football,
3
33
.11
chess
game
board,
2
40
.13
First Three RankOrdered Nodes in the RFEa
Context Word
Target Word
chocolate
mint
sweet,
coin
mint
money, cash, gold
river
bank
river, water, stream
money
bank
money,
metal
bar
iron, door,
drink
bar
yellow
In
and the
milk,
taffee
Erlbaum, 1977.
New York: Cambridge University Press, 1982. R.
P.
Scripts,
plans,
goals
and
silver, book
understanding.
rod
Erlbaum, 1977.
Smith, E. E., Shoben, E. J., & Rips. A
categories.
Academic Press, 1973.
Rumelhart, D. E., R.
semantic
Structure and process in semantic
featural model for semantic decisions.
memory:
Psychology Review, 1974, 81,
green, bananna
214-
241. Waltz, D.
Understanding line drawings
of scenes with shadows.
(Ed.), The psychology of computer vision.
New York:
In P. H. Winston
tennis
McGraw-Hill, 1975.
aRelevant
play
Forward Environment
Table 2 Figure Captions Results of Cued Recall
Simulation ____
Proportion of Total Activation With
Target Sentence Paira
The nurse washed her
Close Cue
Remote Cue
shampoo
detergent
hair.
The nurse washed her clothes.
.36
.18
.18
.22
Figure 1.
Expansion of nodes apple and fruit.
Figure 2.
red and pie are
intersections found
in the expanded
environment of apple and fruit. Figure 3.
Original
seeds, in this example the nodes apple and fruit,
are always weighted at 1.0.
Intersections, here the nodes red and pie,
are proportionally weighted at fractions which reflect
sexy Nurses are often beautiful
.23
Nurses give health care.
.09 saw
doctor .17
.51
of total
the proportion
activation on each node.
Figure 4.
The end result after one time-slice
is a relevant forward
scissors environment (RFE).
The farmer cut the wood.
.25
.18
The farmer
.08
.29
climbing
hanging
The ivy covered the walls.
.20
.16
The picture covered the walls.
.11
.14
cut the
fabric.
Figure 5. in the cued
recall
Figure 6.
hammer The man hit the nail.
.20
The man hit the jaw.
.07 tv
The man watched
the show.
The man listened to the
aWords b
in italics
show.
fist
.11 .19 radio
.11
.01
.13
.11
represent input seeds.
As presented, the close cue usually produces a higher proportion of total activation for the first member of the pair and a lesser proportion for the second. Similarly, the remote cue produces a lesser proportion for the first member and a higher proportion for the second.
Steps
involved in creating the seed-are
experiment
intersection used
simulation.
Result of simulation of syntactic effects.
31
21
12
44
10
8
red t=22/74=.3
pie wt=52/74= .7
(a)
(b)
(c) (d)
(e)
1 higher activation
RFE rank
lower activation
I 2 3 4 5 6 7 8 9 10 I 12
/III
\
-
I
)y2 ) -
N.
/ N.>
-~ c: C) 0
IE
C-)
boy-related
J) E
E
01) .0l
items
L. L.
L.
1
-0
c} 0
C0
-a
T3
O 0 0
t-JC ot
L,
L0) 1L.
Q.
c L.
4-
bus-related items
I
This straight line shows the ranking of nodes in the RFE produced by boy smash bus.
2
This curve shows how the nodes are ranked lower for boy-related items and higher for bus-related items when the input seeds are bus smash boy.
I