Memory & Cognition 1999, 27 (1), 106-115

A beautiful day in the neighborhood: What factors determine the generation effect for simple multiplication problems?

BRYAN J. PESTA
University of Akron, Akron, Ohio, and Cleveland State University, Cleveland, Ohio

and RAYMOND E. SANDERS and MARTIN D. MURPHY
University of Akron, Akron, Ohio

In four experiments, we examined the generation effect for the free recall of simple multiplication answers. Large-product-size problems showed a consistent generation-effect advantage over small-product-size problems, except when each answer was generated twice, via two different sets of operands (Experiment 2). Also, measures of problem-solution time and strategy use accounted for the large-product-size advantage. Across experiments, however, small-product-size problems (but not large-product-size problems) showed considerable variation in the size of their generation effect. We discovered that solving small-product-size problems via direct memory retrieval increased the episodic recall probability of other problems that were near neighbors to the generated answer, and we attribute this result to a spreading activation mechanism in semantic memory. A measure of neighbor activations, combined with RT to solve each problem, accounted for 51% of the observed generation-effect variance.

The authors thank Les Fisher for his assistance with this manuscript and Peter Graf, Alice Healy, Reed Hunt, and an anonymous reviewer for valuable comments. Correspondence should be addressed to B. J. Pesta, 14610 Harley, Cleveland, OH 44111 (e-mail: [email protected]).

Copyright 1999 Psychonomic Society, Inc.

The generation effect is better memory for items actively produced by a participant than for items merely supplied by an experimenter (Jacoby, 1978; Slamecka & Graf, 1978). For example, memory for a word will be better if the participant generates it from a fragment (e.g., Cuck) than if the participant simply reads it (e.g., truck). Similarly, memory for a multiplication answer will be better if the participant solves the problem (e.g., 3 X 7 = ??) than if the participant just reads the answer (e.g., 3 X 7 = 21). In Slamecka and Graf's (1978) original experiments, generation effects were found using synonym, antonym, rhyme, associative, and category generation rules. The generation effect is also robust. It occurs with words (Slamecka & Graf, 1978), nonwords (McElroy & Slamecka, 1982; Nairne & Widner, 1987), and numbers (Crutcher & Healy, 1989; Gardiner & Rowley, 1984).

Evidence exists, however, that the size of the generation effect depends on the relative degree of difficulty involved in generating an item or set of items. Tyler, Hertel, McCallum, and Ellis (1979) found that more difficult word stem completions were remembered better than were less difficult ones. Fiedler, Lachnit, Fay, and Krug (1992) obtained a similar result comparing memory for word stems that varied in how much context they provided about the to-be-generated items. The generate condition providing the most context (e.g., a car rolls on four wh_C) produced a smaller generation effect than did the generate condition providing no context (e.g., wh_C).

Problem difficulty may also moderate the generation effect for simple multiplication problems. For example, Pesta, Sanders, and Nemec (1996) had participants read and solve simple multiplication problems. They manipulated problem difficulty via product size, which refers to the size of each problem's answer. Typically, small-product-size and large-product-size problems are solved in different ways: Most adults solve the former via direct memory retrieval of the answer and the latter via computation (see, e.g., Ashcraft, 1992). Differences in solution strategy are indeed the primary reason that large-product-size problems are more difficult to solve than small-product-size problems.

In current models of mental arithmetic, though, problem familiarity determines whether an answer is computed or retrieved from memory. Since, developmentally, children are exposed to more small-product-size problems than large-product-size problems in elementary school (see, e.g., Ashcraft, 1992; Ashcraft & Christy, 1995), small-product-size problems in turn are more often retrieved from memory than are large-product-size problems. Even after an item's answer is stored in memory, however, familiarity can still affect its rate of retrieval via strength of representation in memory (Ashcraft, 1992). Therefore, product size represents a manipulation of both stimulus familiarity and problem difficulty.

Pesta et al. (1996) hypothesized that differences in solution strategy for small-product-size and large-product-size problems may translate into differences in the size of


GENERATION EFFECT

their resulting generation effects. Because computation requires more attention or "cognitive effort" than memory retrieval, computed multiplication answers should show a larger generation effect than retrieved multiplication answers. Consistent with this hypothesis, Pesta et al. found a strong generation effect for large-product-size problems but no generation effect for small-product-size problems. Therefore, in contrast to studies using verbal materials, where familiarity increases the generation effect (see, e.g., Gardiner, Gregg, & Hampton, 1988; Nairne & Widner, 1988), in Pesta et al., more difficult, less familiar problems showed the largest generation effect.

That problem difficulty moderates the generation effect, however, is more an observation than an explanation. Simple effort accounts of the generation effect have fallen out of favor in the literature, partly because of their poor specification (see Greene, 1992, and Mitchell & Hunt, 1989, for discussions). Also, in some studies, a generation effect was found by using rules that require little effortful processing. For example, Donaldson and Bass (1980) used a generate rule that was trivially simple: Participants merely added the letter e to the end of each fragment (e.g., tabl_, appl_). Yet, even this small difference between reading and generating produced the effect. One can also obtain the effect with simple letter-switching generation rules (see, e.g., Nairne & Widner, 1987). Effortful processing is thus not a necessary condition for producing the generation effect. So, something else must explain the product size X generate condition interaction that exists for multiplication problems.

Before pursuing such an explanation, why should psychologists be interested in memory for multiplication answers? We believe there are several good reasons. First, studies using multiplication problems speak directly to theories of the generation effect based on verbal items. Gardiner and Rowley (1984) first examined the generation effect for multiplication problems primarily to rule out a strong version of the lexical activation hypothesis (the idea that an item must have lexical status in semantic memory to show a generation effect). Second, multiplication facts can be ideal stimuli for anyone interested in the organization of knowledge in memory. The set of multiplication problems and answers is far smaller, with a more knowable structure, than the number of words in the average person's vocabulary. Finally, as we show, some interesting effects that exist in this area have strong parallels in the verbal domain.

So, why does problem difficulty moderate the generation effect? Our explanation starts with the procedural account of McNamara and Healy (1995), which appeals to the operand-retrieval strategy. At memory test, participants may retrieve the operands to previously studied multiplication problems. The operands serve as powerful memory cues by allowing the regeneration of potential target answers. The participant then decides whether the regenerated answers occurred in the study phase of the


experiment. The operand-retrieval strategy takes advantage of transfer appropriate processing and encoding specificity (see, e.g., Roediger, 1990; Tulving & Thomson, 1973) for the generated, but not read, study items. These mechanisms then produce the generation effect.¹

The procedural account alone, however, does not explain why product size moderates the generation effect. Small-product-size and large-product-size problems contain the same number of operands, so the account would need to make the odd assumption that participants retrieve mostly large-product-size operands at memory test.

An important difference exists, though, between solving small-product-size and large-product-size problems, which is not incorporated into the procedural account. Solving a large-product-size problem via computation results in the generation of intermediate steps. These steps may be retrieved for use as cues to the target answers, just as the operands presumably are. In effect, solving large-product-size problems via computation generates more potential cues to the target item than does retrieving a small-product-size answer directly from memory.

For example, one computational strategy for solving the problem 8 X 12 = ?? is to first generate the intermediate steps of 8 X 10 = 80 and 8 X 2 = 16, and then to add the two results for the final product, 96. At test, the participant could retrieve the operands, the intermediate steps, or both for use as cues to the target answer. In contrast, solving a small-product-size problem via direct memory retrieval (e.g., 3 X 2 = ??) does not involve intermediate steps. The number of memory cues for these items is therefore limited to the operands that were present at study.

EXPERIMENT 1

Combining the procedural account with the intermediate-steps hypothesis yields a precise, testable explanation for why problem difficulty moderates the generation effect. Our first concern, however, was to replicate the basic effect. We thus had college students read and solve multiplication problems varying in product size and then administered a free recall test for all answers.

Method

Participants and Design. The participants for our experiments were undergraduates in (voluntary) classroom demonstrations of the generation effect, and each experiment used different people. The participants here were 6 males and 18 females with a mean age of 22.3 (range = 18-38) years. The design was a 2 X 2 factorial, with product size (small, large) and generate condition (read, generate) as within-subjects variables.

Materials. The experimental packet contained 24 multiplication problems selected from the 2 X 2 through 12 X 12 table. The first two problems (i.e., 10 X 10 = ?? and 10 X 9 = ??) reappeared as the last two problems and served as filler items used to reduce primacy and recency effects. The remaining 20 problems consisted of 10 small-product-size (i.e., answers ranging from 8 to 28) and 10 large-product-size (i.e., answers ranging from 42 to 108) problems.


PESTA, SANDERS, AND MURPHY

We dichotomized product size to use an analysis of variance (ANOVA), but no conclusions changed when treating the variable as continuous in regression analyses. Within each level of product size, 5 problems appeared as read items (e.g., 2 X 6 = 12, or 6 X 7 = 42), and the other 5 appeared as generate items (e.g., 4 X 5 = ??, or 8 X 9 = ??). We then arranged these into the packet vertically, one on every fourth line. The participants either solved (for generate items) or copied (for read items) their answers onto an answer space that was printed horizontally next to each problem. The items were ordered to alternate between small-product-size and large-product-size problems, and between read and generate problems.

A second experimental packet exactly reversed the assignment of read and generate conditions to items. The generate items in the first packet (e.g., 4 X 5 = ??, 8 X 9 = ??) served as the read items in the second packet (e.g., 4 X 5 = 20, 8 X 9 = 72), and vice versa. We distributed the packets at random to the participants.

The instructions, printed on the first page of each packet, included two sample problems, one read and one generate, that the participants worked through. The instructions also required the participants to (1) work through the packet at their own pace, (2) not go back to a problem previously worked on, and (3) remember all of the answers in the packet, whether read or solved.

Procedure. The 20-min experiment was conducted in a classroom, and it began with distribution of the experimental packets, handed out face-down. The experimenter then read the instructions aloud, pausing as the participants worked through the samples. Each participant then completed the problems at his or her own pace. When finished, each participant exchanged the study packet for a recall sheet. Instructions on the recall sheet asked the participants to write down, in any order, all of the answers they could remember, whether read or solved. The participants were also cautioned not to guess and were asked to write down an answer only if reasonably sure it appeared in the study packet.
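The counterbalancing described in the Materials section (a second packet that reverses the read and generate assignments of the first) can be illustrated with a short sketch. This is only an illustration, not the authors' procedure for building packets: the function name `reverse_conditions`, the tuple layout, and the four-item packet order are assumptions made for the example, and only the four sample problems come from the text.

```python
from typing import List, Tuple

# (problem, condition); condition is either "read" or "generate".
Item = Tuple[str, str]


def reverse_conditions(packet: List[Item]) -> List[Item]:
    """Return a packet with the same item order but read/generate swapped."""
    flip = {"read": "generate", "generate": "read"}
    return [(problem, flip[condition]) for problem, condition in packet]


# Hypothetical four-item excerpt; the item order here is assumed, not reported.
packet_1: List[Item] = [
    ("2 X 6", "read"),
    ("8 X 9", "generate"),
    ("4 X 5", "generate"),
    ("6 X 7", "read"),
]
packet_2 = reverse_conditions(packet_1)

for (problem, cond_1), (_, cond_2) in zip(packet_1, packet_2):
    print(f"{problem}: packet 1 = {cond_1}, packet 2 = {cond_2}")
```

Across the two packets, every item is generated by one group and read by the other, which is what lets item effects be separated from the read/generate manipulation.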

Results and Discussion

The level of statistical significance used for all analyses was p < .05. A preliminary statistical test found no significant effects involving the counterbalancing factor. Also, the mean percentage of errors made when generating the 10 multiplication answers was low (M = 2.5%, SD = 0.44; no errors were made when copying the read answers in any experiment). Finally, the mean number of recall intrusion errors (where the participant recalled an answer not contained in the study packet) was also low (M = 0.29, SD = 0.46).

Mean free recall (out of a possible five per cell) and standard deviations are shown in Table 1. For each ANOVA described in this study, a parallel analysis was run using proportion correct recall as the dependent variable. These analyses excluded trials in which the participant generated a wrong answer to a problem. Given the small number of errors made, however, the conclusions reached in each analysis were identical, and so we report only the former.

The ANOVA on the Experiment 1 data revealed significant effects of generate condition [F(1,22) = 17.0, MSe = 1.11], product size [F(1,22) = 9.0, MSe = 1.72], and the generate condition X product size interaction [F(1,22) = 5.6, MSe = 1.45]. As shown in Table 1, a generation effect (i.e., a significant difference between recall of generate and read items) existed for large-product-size but not small-product-size problems. Furthermore, postexperimental tests in Table 1 show that recall of the generate answers rather than read answers drove the interaction.

Table 1
Recall Means, Standard Deviations, and the Generation Effect by Experiment
Condition Read
Generate
SD
Difference
1.32 1.00
0.29 1.46*
Small Large
Experiment 2, Same Operand 1.91a 1.17 1.84a 1.25 1.16 2.94 b 1.29 1.59a
0.07 1.35*
Small Large
Experiment 2, Different Operand 1.24 1.53a 1.39 2.53' 0.80 1.23 1.53a 2.66'
1.00* 1.13*
Small Large
1.89b 2.63'
Experiment 3 1.13 1.02a 1.17 1.20a
0.94 1.14
0.87* 1.43*
Small Large
3.00 a 4.03 b
Experiment 4 1.16 2.79 a 0.91 2.73 a
1.20 1.50
0.21 1.30*
Product Size
M
Small Large
1.79a 3.17 b
SD
M
Experiment 1 1.56 1.50a 1.05 1.71"
Note: Maximum score = 5. Means within experiments not sharing superscripts differ at p < .05 (via Tukey LSD). *A generation effect significantly different from zero.

In sum, we replicated the pattern reported earlier by Pesta et al. (1996): a generation effect for large-product-size multiplication problems but not for small-product-size multiplication problems. The operand-retrieval strategy, alone, cannot explain this result, because small-product-size and large-product-size problems contain the same number of operands. If combined with the intermediate-steps hypothesis, though, the operand-retrieval strategy can perhaps account for the generation-effect difference across product size. Operand retrieval, by itself, may not provide enough retrieval cues to produce a strong generation effect for small-product-size problems. The generation of large-product-size answers, however, more often involves computation, which allows for both the operands and the intermediate steps to serve as retrieval cues. The result is a strong generation effect for these items. If this explanation is correct, increasing the size of the generation effect for small-product-size problems should be possible, simply by increasing the number of retrieval cues that generating these items creates.

EXPERIMENT 2

To test this hypothesis, in Experiment 2, each multiplication answer was presented twice. In the same-operand condition, the operands at first and second presentations were identical (e.g., 3 X 6 = 18, followed 20 problems later by 3 X 6 = 18). In the different-operand condition, different problems with the same answers were given

across presentations (e.g., 3 X 6 = 18, followed 20 problems later by 2 X 9 = 18). The operand-retrieval strategy predicts a larger generation effect in the different-operand condition than in the same-operand condition, because twice as many potential retrieval cues (i.e., operands) for the target answers are present in that condition.

Method

Participants and Design. The participants were 64 undergraduate psychology students, with mean ages of 25.4 (SD = 7.2) and 24.4 (SD = 6.1) years (same-operand and different-operand conditions, respectively). Each group contained 8 males and 24 females (by chance), with 1 additional participant excluded for not completing all multiplication problems. The design was a 2 X 2 X 2 mixed factorial, with product size (small, large) and generate condition (read, generate) as within-subjects variables. The between-subjects variable was operand condition (same or different operands at second presentation), and we randomly assigned participants to its levels.

Materials and Procedure. The experimental packets contained 40 multiplication problems, not counting the primacy and recency items described previously. The 40 problems had only 20 unique answers, because each answer occurred twice, using either the same operand or a different operand at the second presentation. Also, 8 of the 40 problems contained an operand larger than 12 (e.g., 27 X 3 = ??, 14 X 2 = ??). We included these problems because there are too few answers in the 2 X 2 through 12 X 12 multiplication table that are solvable by more than one unique combination of operands (which was also our reason for manipulating operand condition between subjects). An equal number of these problems, however, occurred in the read/generate and same-/different-operand conditions.

Ten of the 20 products were small (i.e., range = 8-36) and 10 were large (i.e., range = 42-108); half of each type were read and half were generated. The items were ordered into the packet by alternating between read and generate and between small and large product size and by separating the repetition of each answer by 20 other problems. We then compiled three other experimental packets, counterbalancing the items in the read/generate and same-/different-operand conditions across packets. We distributed the packets at random to the participants, and the procedure of Experiment 2 was identical to that of Experiment 1.
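The constraint mentioned in the Materials section (answers in the 2 X 2 through 12 X 12 table that can be reached by more than one combination of operands) can be enumerated directly. The sketch below is illustrative only. The function names are hypothetical, and it assumes that a X b and b X a count as the same combination, which the text does not state explicitly.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def operand_pairs_by_product(low: int = 2, high: int = 12) -> Dict[int, List[Tuple[int, int]]]:
    """Map each product to the unordered operand pairs (a <= b) that yield it."""
    pairs: Dict[int, List[Tuple[int, int]]] = defaultdict(list)
    for a in range(low, high + 1):
        for b in range(a, high + 1):  # b >= a, so a X b and b X a count once
            pairs[a * b].append((a, b))
    return dict(pairs)


def multi_pair_products(low: int = 2, high: int = 12) -> Dict[int, List[Tuple[int, int]]]:
    """Keep only products reachable by more than one operand pair."""
    return {
        product: pairs
        for product, pairs in operand_pairs_by_product(low, high).items()
        if len(pairs) > 1
    }


candidates = multi_pair_products()
print(candidates[18])    # [(2, 9), (3, 6)], the example pairing used in the text
print(28 in candidates)  # False: within the table, 28 is reachable only as 4 X 7
```

Under this assumption, an answer such as 28 needs an operand outside the table (14 X 2) to get a second problem, which is consistent with the example problems the authors report adding.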

Results and Discussion

No effects involving the counterbalancing factor were significant. The mean percentage of errors made when solving the multiplication problems was 8.1% (SD = 0.82) for the different-operand condition and 5.3% (SD = 0.88) for the same-operand condition; this difference was not significant (t < 1.0). The participants in the former condition, however, made more recall intrusions (M = 1.3, SD = 1.3) than did the participants in the latter condition (M = 0.63, SD = 1.0) [t(62) = 2.13]. This effect was probably due to the fact that the different-operand participants solved twice as many different problems as the same-operand participants.

Mean recall level (out of a possible five per cell) and standard deviations for each cell appear in Table 1. The ANOVA revealed a significant main effect of generation [F(1,60) = 50, MSe = 1.1], a significant generation X product size interaction [F(1,60) = 9.9, MSe = 0.8], and a significant generation X product size X operand condition interaction [F(1,60) = 6.7, MSe = 0.8]. The operand manipulation, by itself, did not interact with the generation effect [F(1,60) = 1.87, MSe = 1.1]. The difference scores and postexperimental tests listed in Table 1 show generation effects of similar size for all conditions except same-operand, small-product-size problems. Furthermore, as in Experiment 1, recall of the generate answers drove the highest order interaction.

The operand-retrieval strategy predicted the data pattern for small-product-size problems here. A generation effect existed for these problems only in the different-operand condition, where twice as many retrieval cues were present as in the same-operand condition. A result not predicted by this hypothesis, alone, was that the operand manipulation did not affect large-product-size problems. An explanation may follow, however, by combining the operand-retrieval strategy and the intermediate-steps hypothesis. Because no intermediate steps are involved in solving a small-product-size problem, the use of different operands effectively doubled the number of retrieval cues to these answers at memory test. In contrast, large-product-size problems may have already contained enough retrieval cues to produce a strong generation effect, because the generation of intermediate steps is a necessary by-product of computing these answers.

The logic described here appeals to diminishing returns: Adding more retrieval cues to an already memorable target item (i.e., a computed large-product-size answer) had a smaller impact on the generation effect than did doubling the number of retrieval cues available to a less memorable target item (i.e., a retrieved small-product-size answer). Thus, our operand manipulation had an impact on the generation effect for small-product-size problems, but not for large-product-size problems.

EXPERIMENT 3

Experiments 1 and 2 did not provide a direct link between the type of mental processing participants engage in when solving a multiplication problem and the size of the generation effect. Instead, we assumed that computation was more likely when solving large-product-size problems than when solving small-product-size problems (see, e.g., Ashcraft, 1992). In Experiment 3, we directly tested whether problem-solution strategy (i.e., computation versus memory retrieval) moderates the generation effect: We had participants report strategy use for each multiplication problem they solved.

The mental arithmetic literature shows convergence between self-reports of strategy use and on-line measures of problem-solution time (Geary, Frensch, & Wiley, 1993; Geary & Wiley, 1991; LeFevre et al., 1996). Therefore, we asked participants for self-reports of strategy use after they solved each multiplication problem. Unlike Experiment 2, however, each problem was presented only once. If our logic holds, the self-report categories should moderate the generation effect, with computed answers showing larger generation effects than retrieved answers.


Method

Participants and Design. The participants were 40 female and 14 male undergraduates with a mean age of 23.S (range = 17-39) years. Two additional participants were excluded because they solved the read items, as shown by their self-reports. The design replicated that in Experiment 1.

Materials. The experimental packets contained 20 items, not counting the previously described primacy and recency filler items. The 20 items included 10 small-product-size (i.e., answers ranging from 12 to 30) and 10 large-product-size (i.e., answers ranging from 42 to 108) problems. Two of the 10 large-product-size problems contained an operand greater than 12 (i.e., 4 X 16 = ??, and 18 X 3 = ??). We included these to assess the validity of our self-report data. Because few people should have these answers committed to memory, computation should be selected as the solution strategy.

The study packets used here contained four self-report categories printed next to each problem's answer space. As the participants read or solved the problems, they checked a space next to one of the four possible strategies, which were (a) The answer was provided for me (for read items), (b) I remembered the answer immediately, (c) I remembered the answer after some time, and (d) I had to figure out the answer. We had the participants rate strategy use for read items to unconfound the act of rating with the read/generate independent variable.

Procedure. The instructions were similar to those in Experiment 1 but also included information on strategies for solving simple multiplication problems. Examples of easy (e.g., 2 X 2 = ??) and difficult (e.g., 11 X 11 = ??) multiplication problems were provided to illustrate the difference between retrieving an answer from memory and having to figure out or compute an answer. The participants were then introduced to our self-report categories, and other sample problems were provided for which strategy use was rated. A read problem was included in the samples, and the participants were asked to use the category, The answer was provided for me, exclusively for these problems. The procedure of Experiment 3 was otherwise identical to that of Experiment 1.

Results and Discussion

No source of variance involving the counterbalancing factor was significant. The mean percentage of errors made when solving the multiplication problems was 2.4% (SD = 0.47), and the mean number of recall intrusion errors was 0.24 (SD = 0.51). Table 1 shows mean recall values and standard deviations by product size and generate condition for this experiment. The ANOVA revealed main effects of product size [F(1,52) = 11.4, MSe = 1.01] and generate condition [F(1,52) = 63, MSe = 1.13], and a significant product size X generate condition interaction [F(1,52) = 4.6, MSe = 0.90]. As shown in Table 1, although the difference in recall of the read answers across product size was again not significant, small-product-size problems showed a small but significant generation effect. We suspect that rating strategy use for problems here increased the memorability of the generate answers. The key result, though, is that large-product-size problems again showed a larger generation effect than did small-product-size problems.

The strategy self-report data, summarized in Table 2, are based on the ratings of half the sample (i.e., 27 participants for each problem), because half the items any participant experienced occurred in the stimulus set as read items. The table excludes data on eight small-product-size problems solved exclusively via immediate memory retrieval of the answer.

Table 2
The Generation Effect and the Percentage of Participants Reporting Computation, Immediate, or Delayed Memory Retrieval as the Solution Strategy for Experiment 3 Problems
Strategy Type
Problem
Generation Immediate Retrieval Effect (%)* (%)
3 x S = 24
22
93
3x9=27 6 X 7 = 42 6xS=48 18X3=54 7 x S = 56 7 x 9 = 63 4 x 16 = 64 6 x 12 = 72 7 x 12 = 84 S x 12 = 96 9 x 12 = 108
33 IS 26 30
89 89
Delayed Retrieval
Answer Computation
(%)
(%)
4 II II
4
o
37 7 30
93 II 78 78 19 30
22
22
33 37
IS
IS 19 30
II
22
33
II 19 19 II
o o
7 78 3 3 71 55 59 55 67
Note: The self-report percentages are based on n = 27. *Calculated by subtracting the percent recall for the 27 participants reading the item from the percent recall for the 27 other participants generating the item.

We conducted several analyses on the self-report data, but all yielded the same conclusion: Solution strategy did not moderate the generation effect for large-product-size problems. Consider just our initial analysis, which involved calculating percent correct recall values for large-product-size problems across each possible level of strategy. The means were 44% (SD = 19) for immediate retrieval, 64% (SD = 39) for delayed retrieval, and 50% (SD = 36) for computed answers. The ANOVA was not significant, nor was any pairwise comparison. Again, this conclusion followed in all other analyses (these also included recall of the read items to assess generation-effect differences among the items more directly).

The breakdown of strategy use by problem in Table 2 is consistent with commonsense notions of problem difficulty, yet the self-report data did not moderate the generation effect. In our attempt to make sense of this result, we discovered the peculiar data pattern illustrated in Table 3. The table lists generation effects, ranked within experiments and ordered by product size, for each problem used in more than one of our experiments. For example, the problem 2 X 8 = 16 appeared in all three experiments, but its generation effect varied dramatically across experiments (i.e., large in Experiment 1, intermediate in Experiment 2, and small in Experiment 3). In fact, small-product-size problems showed a modest inverse correlation [r(19) = -.10, p > .05] for the generation-effect ranks across all pairs of same answers.²

The data followed a different pattern for large-product-size problems. The generation-effect ranks were relatively stable for the same problems across experiments, and the reliability correlation [r(21) = .51] was positive and significant. We believe the Table 3 data reflect a


Table 3
Actual and Predicted Generation-Effect Ranks for Identical Problems as a Function of Experiment

Problem

Experiment 1

Experiment 2

Experiment 3

Actual Predicted Actual Predicted Actual Predicted Small Products

4x2=8 3x3=9 3 X 4 = 12 2 X 8 = 16 2 X 9 = 18 4 X 5 = 20 3 X 8 = 24 3 X 9 = 27 7 X 4 = 28

9 10 7 2 I 7 5 3 7

2.5 9 9 2.5 I 5.5 5.5 5.5 9

6 6 7 7 7 8 9

9 8 7 6 5 4 I

6 9.5 2.5 6 6 9.5 6

3 9 1.5 5 7.5 7.5 5

2.5 7.5 2.5 7.5 4.5 7.5 7.5

1.5 8.5 8.5 3.5 5 1.5 3.5

6 8 8 2.5 2.5 2.5 6

8 5.5 3.5 1.5 7 3.5 1.5

7 3 3 7 3 3 7

Large Products X X X X

7 = 42 8 = 48 8 = 56 9 = 63 X 12 = 84 X 12 = 96 X 12 = 108

8 3.5 5.5 5.5 3.5 I 3.5

2.5 2.5 2.5 6 6 8 6

Note-Small ranks indicate larger generation effects, and ranks are within product size. The correlations between actual and predicted values are r(23) = +.64 for small-product-size problems, and r(21) = - .18 for large-product-size problems. The correlations between the actual generation-effect ranks, using all possible pairs of same answers (i.e., reliability) are r(l9) = -.10 for small-product-size problems and r(21) = +.51 for large-product-size problems.

spreading activation mechanism in semantic memory, occurring whenever the participants retrieved the answers to single-digit multiplication problems. This mechanism affected the episodic recall probability of other problems associated with the generated items. We describe this mechanism next and then explain why our self-report categories failed to moderate the generation effect for large-product-size problems. Afterward, we present a final experiment, designed to rule out an alternative explanation for the data so far and to test, a priori, the hypothesis we describe below.

SEMANTIC MEMORY AND THE GENERATION EFFECT

We propose that both semantic and episodic memory processes determine the size of the generation effect for a given multiplication problem, and we begin the explanation with Ashcraft's (1992) network-retrieval model of semantic memory (see also Campbell, 1995; Siegler, 1988). In the network model, multiplication answers are stored in semantic memory as nodes in an associative network. Familiarity or experience with a problem determines the memory strength of each node. Overall, more experience with a problem leads to a stronger associative link between it and its correct answer.

Each answer in the network connects to every other answer, but associative strength determines the semantic distance between two nodes. Two answers are near neighbors if adding or subtracting 1 to either of the first answer's operands results in the second answer (see, e.g., Ashcraft, 1992). For example, the problem (3 X 7 =) 21 has the following neighbors: (3 X 6 =) 18, (3 X 8 =) 24, (2 X 7 =) 14, and (4 X 7 =) 28.

When a person retrieves a multiplication answer from the network, activation from the operands spreads to the target answer node and then to its neighbors. For example, solving the problem 3 X 7 = ?? would activate the answer node 21 and then, to a smaller extent, the neighbor nodes 18, 24, 14, and 28. We suggest that neighbor-node activation in semantic memory affects the recall probability for these items on an episodic memory test. In other words, solving 3 X 7 = ?? can make the answer 24 more memorable. But, in a generation-effect experiment, both read and generate items can receive neighbor activation. Therefore, when a generate item is activated, the resulting generation effect for that item should be increased. Conversely, when a read item is activated, its generation effect should be decreased.

To test this idea, we first identified neighbor nodes for every generate item used in our experiments, excluding 12s problems and the different-operand condition of Experiment 2. The key was to separate generate items into groups as a function of the particular counterbalancing packet in which they occurred. Recall that, within experiments, the items generated by half the participants were read by the other half. So, which particular neighbors were activated depended on which particular counterbalancing packet the participant received. For each read or generate answer within counterbalancing sets, we tallied the number of activations received from neighbors generated in the same set. Again, across the two counterbalancing sets, each answer was generated once and was read once.
The difference in neighbor activations for each item when generated versus read was then used to predict the size of the generation effect for each problem, relative to every other problem (i.e., rank-predicted generation effects for each problem within an experiment). Consider the problem 3 X 4 = 12, used in Experiment 1. When presented as a generate item in one packet (i.e., 3 X 4 = ??), the answer was activated as a near neighbor by two other generate items in the same packet (i.e., 2 X 4 = ?? and 3 X 6 = ??). When presented as a read item in the other packet (i.e., 3 X 4 = 12), the answer was activated as a near neighbor by three generate items in the same packet (i.e., 3 X 3 = ??, 4 X 4 = ??, and 2 X 7 = ??). The net influence of neighbor-node activations on the generation effect for this item was -1.0, the difference between the number of generate and read activations. For comparison, the Experiment 1 problem 2 X 9 = 18 was activated twice when it occurred as a generate item (i.e., by 6 X 4 = ?? and 9 X 3 = ??) and zero times when it occurred as a read item. The net influence of neighbor-node activations on the generation effect for this item was +2.0. We therefore predicted the answer 18 to show a larger generation effect than the answer 12. In fact, the answer 18 showed a positive generation effect (i.e., of +5 items), and the answer 12 showed a negative generation effect (i.e., of -1 item).

On the basis of neighbor activations, predicted generation-effect values were ranked within experiments and were paired with the observed generation-effect ranks for each problem. The ranks within experiments were combined into one data set (without reranking across experiments) to maximize statistical power. We then regressed the actual generation-effect ranks observed for each problem on product size (as a continuous variable ranging from 8 to 81) and the neighbor-node-predicted values. The regression equation resulted in a multiple correlation of .66, showing that 43% of the generation-effect variance was explained by product size [β = .50, t(40) = 4.2] and the neighbor-node values [β = .43, t(40) = 3.6] (these weights are standardized).

The apparent generation-effect unreliability for small-product-size problems across experiments appears to have been due to neighbor activation (see the predicted values for these problems in Table 3). Large-product-size problems, however, did not require consideration of neighbor activation to produce reasonable generation-effect data. The contrast between small-product-size and large-product-size problems probably results from differences in stimulus familiarity: Small-product-size problems are more familiar, and, therefore, they activate their neighbors more strongly than do large-product-size problems.

In sum, neighbor activation can partially explain the generation effects we observed for simple multiplication problems. That is, neighbor activation in semantic memory makes an item more memorable in episodic memory. The net influence on the generation effect, however, depends on whether the activated neighbor appeared in the stimulus set as a read item or a generate item.
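The neighbor-node tally described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`neighbor_answers`, `activations`, `net_activation`) are hypothetical, no bounds are placed on the operands, and the problem lists contain only the activating problems named in the text for the answers 12 and 18, not the full counterbalancing packets.

```python
from typing import Iterable, Set, Tuple

Problem = Tuple[int, int]  # (first operand, second operand)


def neighbor_answers(problem: Problem) -> Set[int]:
    """Answers reached by adding 1 to, or subtracting 1 from, either operand."""
    a, b = problem
    return {(a - 1) * b, (a + 1) * b, a * (b - 1), a * (b + 1)}


def activations(answer: int, generated: Iterable[Problem]) -> int:
    """Count the generated problems that have `answer` as a near neighbor."""
    return sum(1 for p in generated if answer in neighbor_answers(p))


def net_activation(
    answer: int,
    generated_when_target_generated: Iterable[Problem],
    generated_when_target_read: Iterable[Problem],
) -> int:
    """Activations received as a generate item minus those received as a read item."""
    return activations(answer, generated_when_target_generated) - activations(
        answer, generated_when_target_read
    )


# Neighbors of 3 X 7 = 21, as listed in the text.
print(sorted(neighbor_answers((3, 7))))  # [14, 18, 24, 28]

# Answer 12 (3 X 4): activated by 2 X 4 and 3 X 6 when generated,
# and by 3 X 3, 4 X 4, and 2 X 7 when read.
print(net_activation(12, [(2, 4), (3, 6)], [(3, 3), (4, 4), (2, 7)]))  # -1

# Answer 18 (2 X 9): activated by 6 X 4 and 9 X 3 when generated, never when read.
print(net_activation(18, [(6, 4), (9, 3)], []))  # 2
```

A larger net value predicts a larger generation effect for that answer relative to the other answers in the same experiment, which is why the answer 18 is predicted to outrank the answer 12.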

EPISODIC MEMORY AND THE GENERATION EFFECT

Our initial hypothesis that operands (see, e.g., McNamara & Healy, 1995) and intermediate steps serve as retrieval cues clearly predicted that strategy use would moderate the generation effect in Experiment 3. Self-reports should account for the correlation between product size and the generation effect. Our analysis, though, may have simply suffered from a lack of statistical power, in part because only 12 problems showed variability in reported solution strategy. To explore this possibility, we borrowed data from a recent mental arithmetic study. LeFevre et al. (1996) examined solution strategies for solving single-digit multiplication problems and reported strategy use and solution-time data on individual problems. Their data allow us to examine the relationship between the generation effect and strategy use across every single-digit operand problem we used. Again, the critical prediction given the cue-retrieval explanation is that the product-size/generation-effect correlation should disappear when strategy use is controlled.

We first calculated simple correlations between four variables: (1) the actual generation-effect ranks observed for problems within experiments, (2) product size, (3) reaction time (RT) to solve each problem, and (4) the percentage of participants reporting memory retrieval as the solution strategy for each problem. Data comprising the last two variables, taken from the LeFevre et al. (1996) appendix, were also ranked (using raw instead of rank values resulted in the same conclusions). The actual generation-effect ranks correlated significantly with product size [r(43) = .50], RT [r(43) = .59], and percentage use of direct memory retrieval [r(43) = .53]. The product-size correlation disappeared, however, when strategy use was controlled [r(40) = -.23, p = .14] or when RT was controlled [r(40) = -.12, p = .47] [the simple correlation between strategy use and RT was r(43) = .73]. In contrast, when product size was partialed out of the correlation between the actual generation effect and either the strategy use or the RT variable, significant correlations remained [strategy use, r(40) = .32; RT, r(40) = .38].

The partial correlations support the idea that problem-solution strategy moderates the generation effect for simple multiplication problems. Our experiments used product size as an indicator of solution strategy, and product size, in fact, moderated the generation effect. The key result, however, is that measures of strategy use and solution time each accounted for the shared variance between product size and the actual generation effect. In other words, the generation effect is larger when a participant computes an answer than when he or she retrieves an answer from memory.

As hinted at above, however, a possible alternative explanation exists for our data (see Note 3). Notice that RT correlates more highly with product size than does the strategy-use variable. In our analyses, we assumed both RT and strategy use to indicate how a problem is solved. It is possible (but uninteresting) that the critical interaction exists simply because participants spend more time computing large-product-size answers than they do retrieving small-product-size answers.
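The logic of partialing one variable out of a correlation can be illustrated with the standard first-order partial correlation formula. The sketch below is a hedged illustration, not a reanalysis: the function name `partial_corr` is hypothetical, and the third input value (.80, standing in for a product-size/RT correlation) is a made-up number, because this section does not report that correlation.

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation between x and y, controlling for z.

    r_xy, r_xz, and r_yz are the three simple (zero-order) correlations.
    """
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("z is perfectly correlated with x or y; partial r is undefined")
    return (r_xy - r_xz * r_yz) / denom


# x = generation-effect rank, y = product size, z = RT.
# .50 and .59 are the simple correlations reported in the text;
# .80 is a HYPOTHETICAL product-size/RT correlation used only for illustration.
r_partial = partial_corr(r_xy=0.50, r_xz=0.59, r_yz=0.80)
print(round(r_partial, 3))
```

With these illustrative inputs the partial correlation falls close to zero, which shows how a sizable simple correlation can vanish once a strongly related third variable is controlled.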

EXPERIMENT 4

We therefore present a final experiment in which the amount of time participants spent studying small-product-size and large-product-size problems was controlled. We accomplished this by encouraging participants to continue processing large-product-size and small-product-size answers after solving them for an upcoming memory test. An a priori test of the neighbor-node model was also constructed, predicting the actual generation-effect ranks to correlate with predicted values based on neighbor-node activations. Furthermore, because the stimuli here were identical to those in Experiment 1, the ranked generation effects for small-product-size problems should correlate across these experiments (unlike the -.10 correlation reported in Table 3). A final goal of Experiment 4 was to see whether presenting the items in a pseudorandom order, as opposed to the alternating order used in the previous experiments, would affect the critical interaction.

Method

Participants and Design. The participants were 5 males and 19 females with a mean age of 21.6 (range = 17-35) years. The design replicated that in Experiment 1.

Materials and Procedure. The 24 multiplication problems in Experiment 1 were printed separately on 6 X 8 in. index cards in boldface 72-point font. We compiled two sets of index cards that corresponded to the two counterbalanced stimulus packets used in Experiment 1. The first set was used with half the participants, and the second set was used with the other half. Unlike Experiment 1, however, the multiplication problems were arranged in a pseudorandom order, fixed for all participants.

The procedure differed from Experiment 1 only as follows. First, the experiment was run in groups ranging from 2 to 6 participants. Second, the stimuli were presented at a rate of 5 sec each. During this period, the participants wrote down the problem and its answer on a blank sheet of paper and spent any remaining time studying the answer for later recall. After stimulus presentation, the participants engaged in a 30-sec distractor task, circling the letter I as it appeared in campus newspaper articles (we checked to ensure that the participants were doing this, but we did not analyze these data). The distractor task was used to avoid possible ceiling effects, because, in previous experiments, some delay existed between the time the participant finished the study packet and the time the participant received the recall sheet. Notice that, even with the distractor task, recall for all items here was elevated relative to recall in the first three experiments (see Table 1). After the distractor task, the participants free-recalled their answers on a second blank sheet of paper.

Results and Discussion

Table 1 lists mean recall values by product size and generate condition for Experiment 4. No source of variance involving the counterbalancing factor was significant. The mean percentage of solution errors made was 2.1% (SD = 0.41), and the mean number of recall intrusions was 0.92 (SD = 1.2). The ANOVA revealed main effects of generate condition [F(1,23) = 8.0, MSe = 1.6], product size (marginal) [F(1,23) = 4.12, MSe = 1.3, p = .054], and the generate condition X product size interaction [F(1,23) = 12.2, MSe = 0.72]. As in the previous experiments, only large-product-size problems showed a significant generation effect, and recall of the generate, rather than read, answers drove the interaction (see Table 1). These data show that product size indeed moderates the generation effect, even with study time controlled.

Turning to semantic memory, the first prediction was that the actual and neighbor-node-predicted generation-effect ranks would correlate. Although in the right direction, the observed correlation was not significant, given the small number of observations it was based on [r(15) = .24]. [Looking at small-product-size problems only, the correlation was larger, but still nonsignificant, r(9) = .37.] Despite power concerns, however, the second prediction was supported. To review, Table 3 shows the lack of generation-effect stability for small-product-size problems across experiments. Because Experiments 1 and 4 used the same stimuli, the pattern of neighbor activations for each item was the same across experiments. The model therefore predicts stability in the generation effect for these items. In fact, the observed generation-effect correlation for small-product-size problems across Experiments 1 and 4 was significant with 10 observations [r(9) = .59, p < .05, one-tailed]. This correlation contrasts sharply with the -.10 value observed in Table 3 (see Note 4).

In sum, Experiment 4 showed that the generate condition X product size interaction exists even when controlling for study time. The experiment also offers some degree of a priori evidence for the neighbor-node hypothesis. To bolster our argument, however, we provide three more analyses related to this hypothesis, using data from all our experiments.

So far, the neighbor-node hypothesis predicted only generation-effect differences among problems. If the neighbor effect is real, however, activations should also correlate significantly with the percent correct recall values for each item, rather than just the generation-effect difference scores. Moreover, our hypothesis restricted the effect of neighbor activations to generate items only. It is possible, though, that reading an item (e.g., 3 X 4 = 12) also results in some degree of spreading activation that affects the episodic recall probability of its neighbors.

We addressed these issues by conducting a regression analysis, using four variables to predict percent recall for each single-digit operand problem in our study: (1) neighbor activations resulting from generating other problems, (2) neighbor activations resulting from reading other problems, (3) product size, and (4) whether the target item itself was read or generated. The multiple correlation was .53, showing that our independent variables explained 29% of the recall variance. Neighbor activation from reading an item, however, did not predict recall [β = .11, t(113) = 1.02], but the other three variables did [generate condition, β = .48, t(113) = 4.5; product size, β = .32, t(113) = 3.4; neighbor activations from generated items, β = .26, t(113) = 2.4].

The next analysis involved first categorizing small-product-size answers based on how many activations they received from generated neighbors and then looking for percent correct recall differences across the categories. For example, 16 answers in our study were not activated as neighbors by generate problems in the same stimulus set. These answers averaged 38% (SD = 15) correct recall. The mean value for the 13 answers receiving one activation was similar: 35% (SD = 14). For the 9 problems receiving two (n = 7) or three (n = 2) activations, however, correct recall was at 59% (SD = 18). The one-way ANOVA conducted on these data resulted in a main effect of number of activations [F(2,35) = 6.5, MSe = 254]. Postexperimental tests revealed that answers in the latter group were recalled best, with no recall difference between the former two groups.

Finally, we conducted a regression analysis on the actual generation effect for all single-digit operand problems in this study. The independent variables were the neighbor-node-predicted values and mean RT to solve each problem (from LeFevre et al., 1996). We excluded the self-report measure due to its correlation with RT [i.e., r(43) = .73] and assumed both measure the same thing. The multiple correlation was .71, showing that the neighbor nodes [β = .38, t(56) = 3.9] and RT [β = .64, t(56) = 6.7] explained 51% of the generation-effect variance.

GENERAL DISCUSSION

We propose that the generation effect for multiplication problems is influenced by both semantic and episodic memory processes. The model describing these effects is process-oriented, as opposed to item-oriented, appealing to spreading activation in semantic memory and to cue retrieval in episodic memory. Specifically, the generation effect for any single multiplication problem is influenced by (1) the frequency with which the answer occurred as a neighbor node for other generate problems in the same stimulus set, (2) the use of the problem's operands as retrieval cues to the answer (i.e., McNamara & Healy's, 1995, operand-retrieval strategy), and (3) the use of any intermediate steps, if computing the answer, as an additional source of retrieval cues.

The idea that semantic memory plays a role in producing the generation effect is not new, nor is its influence limited to multiplication items. Several studies using verbal items have implicated semantic memory processes. These include studies examining generation-effect differences among experts versus novices (see, e.g., Peynircioglu & Mungan, 1993; Reardon, Durso, Foley, & McGahan, 1987; see also Pesta et al., 1996), high- versus low-frequency words (see, e.g., Gardiner et al., 1988; Nairne & Widner, 1988), and words versus nonwords (see, e.g., McElroy & Slamecka, 1982; Nairne & Widner, 1987).

Whether our findings generalize to addition or division problems is an empirical question, but nonetheless an interesting one. For example, McNamara and Healy (1995) observed no generation effect for addition problems unless they were solved in a study list also containing multiplication problems. The neighbor-node idea may serve as an additional aid in explaining the generation effect (or lack thereof) for addition problems.

The present data also tie to the verbal domain in the work of Roediger and McDermott (1995), who found strong false-memory effects for critical lure items that were associates of other words in a study list. For example, participants incorrectly remembered the word sleep after studying a list of sleep words (e.g., pillow, bed, rest, tired). Indeed, participants "remembered" critical lures as often as target items, a result explained by the large amount of residual activation the lure received from processing of its associates. In other words, semantic neighbor activation affected an item's episodic recall probability.

Two differences, however, between the Roediger and McDermott (1995) study and ours deserve mention. First, we used the idea of neighbor activation to explain recall of other target items in the same stimulus set. Roediger and McDermott used it to predict recall intrusion errors.

We did not have enough intrusion errors to conduct a meaningful analysis with the present data, although the neighbor-node idea predicts intrusion error frequency to vary with how often nontarget multiplication answers are residually activated. Second, the strength of the experimental manipulations across studies was vastly different. Roediger and McDermott's impressive results came from using an extreme design: The critical lure was related to every studied list item. The pattern of residual activation here, however, was more subtle. At most, other studied items activated a target item three times. The result was weaker, but consistent, effects. Nonetheless, even relatively small amounts of residual activation in semantic memory can sometimes influence an item's recall probability on an episodic memory test, as shown here.

To conclude, the memory-systems explanation for the generation effect provides specific, testable predictions and accounts for the interesting effects observed here, including the product-size interaction and the unreliability of the generation effect across experiments. The explanation ties to verbal items in the work of Roediger and McDermott (1995), and, finally, our data suggest that both semantic and episodic memory processes can influence the size of an observed generation effect.

REFERENCES

ASHCRAFT, M. H. (1992). Cognitive arithmetic: A review of data and theory. Cognition, 44, 75-106.
ASHCRAFT, M. H., & CHRISTY, K. S. (1995). The frequency of arithmetic facts in elementary texts: Addition and multiplication in grades 1-6. Journal for Research in Mathematics Education, 26, 396-421.
CAMPBELL, J. I. D. (1995). Mechanisms for simple addition and multiplication: A modified network-interference theory and simulation. Mathematical Cognition, 1, 121-164.
CRUTCHER, R. J., & HEALY, A. F. (1989). Cognitive operations and the generation effect. Journal of Experimental Psychology: Learning, Memory, & Cognition, 14, 669-675.
DONALDSON, W., & BASS, M. (1980). Relational information and memory for problem solutions. Journal of Verbal Learning & Verbal Behavior, 19, 26-35.
FIEDLER, K., LACHNIT, H., FAY, D., & KRUG, C. (1992). Mobilization of cognitive resources and the generation effect. Quarterly Journal of Experimental Psychology, 45A, 149-171.
GARDINER, J. M., GREGG, V. H., & HAMPTON, J. A. (1988). Word frequency and generation effects. Journal of Experimental Psychology: Learning, Memory, & Cognition, 14, 687-693.
GARDINER, J. M., & ROWLEY, J. M. C. (1984). A generation effect with numbers rather than words. Memory & Cognition, 12, 443-445.
GEARY, D. C., FRENSCH, P. A., & WILEY, J. G. (1993). Simple and complex mental subtraction: Strategy choice and speed-of-processing differences in younger and older adults. Psychology & Aging, 8, 242-256.
GEARY, D. C., & WILEY, J. G. (1991). Cognitive addition: Strategy choice and speed-of-processing differences in young and elderly adults. Psychology & Aging, 6, 474-483.
GREENE, R. L. (1992). Human memory: Paradigms and paradoxes. Hillsdale, NJ: Erlbaum.
JACOBY, L. L. (1978). On interpreting the effects of repetition: Solving a problem versus remembering a solution. Journal of Verbal Learning & Verbal Behavior, 17, 649-667.
LEFEVRE, J., BISANZ, J., DALEY, K. E., BUFFONE, L., GREENHAM, S. L., & SADESKY, G. S. (1996). Multiple routes to solution of single-digit multiplication problems. Journal of Experimental Psychology: General, 125, 284-306.
McELROY, L. A., & SLAMECKA, N. J. (1982). Memorial consequences of generating nonwords: Implications for semantic memory interpretations of the generation effect. Journal of Verbal Learning & Verbal Behavior, 21, 249-259.
McNAMARA, D. S., & HEALY, A. F. (1995). A procedural explanation of the generation effect: The use of an operand retrieval strategy for multiplication and addition problems. Journal of Memory & Language, 34, 399-416.
MITCHELL, D. B., & HUNT, R. R. (1989). How much "effort" should be devoted to memory? Memory & Cognition, 17, 337-348.
NAIRNE, J. S., & WIDNER, R. L. (1987). Generation effects with nonwords: The role of test appropriateness. Journal of Experimental Psychology: Learning, Memory, & Cognition, 13, 164-171.
NAIRNE, J. S., & WIDNER, R. L. (1988). Familiarity and lexicality as determinants of the generation effect. Journal of Experimental Psychology: Learning, Memory, & Cognition, 14, 694-699.
PESTA, B. J., SANDERS, R. E., & NEMEC, R. J. (1996). Older adults' strategic superiority with mental multiplication: A generation effect assessment. Experimental Aging Research, 22, 155-169.
PEYNIRCIOGLU, Z. F., & MUNGAN, E. (1993). Familiarity, relative distinctiveness, and the generation effect. Memory & Cognition, 21, 367-374.
REARDON, R., DURSO, F. T., FOLEY, M. A., & McGAHAN, J. R. (1987). Expertise and the generation effect. Social Cognition, 5, 336-348.
ROEDIGER, H. L., III (1990). Implicit memory: Retention without remembering. American Psychologist, 45, 1043-1056.
ROEDIGER, H. L., III, & McDERMOTT, K. B. (1995). Creating false memories: Remembering words not presented in lists. Journal of Experimental Psychology: Learning, Memory, & Cognition, 21, 803-814.
SIEGLER, R. S. (1988). Strategy choice procedures and the development of multiplication skill. Journal of Experimental Psychology: General, 117, 258-275.
SLAMECKA, N. J., & GRAF, P. (1978). The generation effect: Delineation of a phenomenon. Journal of Experimental Psychology: Human Learning & Memory, 4, 592-604.
TULVING, E., & THOMSON, D. M. (1973). Encoding specificity and retrieval processes in episodic memory. Psychological Review, 80, 352-373.
TYLER, S. W., HERTEL, P. T., McCALLUM, M. C., & ELLIS, H. C. (1979). Cognitive effort and memory. Journal of Experimental Psychology: Human Learning & Memory, 5, 607-617.

NOTES

1. A reviewer wondered why people would bother remembering operands, when they could perhaps more readily remember the target answers. We believe that people do remember target answers, sometimes. When memory for the target answer fails, however, the operand-retrieval strategy provides a second route for retrieval.

2. The correlations are similar in size when number of items, instead of ranks, is used to index the generation effect. We used ranks, however, because the hypothesis we describe predicts generation-effect magnitude differences among problems within an experiment, as opposed to actual generation-effect values for each problem.

3. The product size X generate condition interaction may also exist because large-product-size problems are more distinct than small-product-size problems. The following analysis, though, rules out this explanation: We operationalized distinctiveness based on distance and solution space. For each answer used in this study, we calculated its absolute mean distance from every other answer. Across product size, this resulted in a curvilinear function, with the smallest mean distance occurring for answers in the 10s and 20s decades (i.e., these being least distinct). Distinctiveness, however, did not correlate with the generation effect, after controlling for product size (conversely, product size's correlation with the generation effect was unchanged, even after controlling for distinctiveness).

4. Admittedly, the reported correlations have less than ideal statistical power. We are in the process, however, of designing more powerful and direct tests of the neighbor-node model.

(Manuscript received March 27, 1997; revision accepted for publication December 8, 1997.)


Copyright 1999 Psychonomic Society, Inc.

The generation effect is better memory for items actively produced by a participant than for items merely supplied by an experimenter (Jacoby, 1978; Slamecka & Graf, 1978). For example, memory for a word will be better if the participant generates it from a fragment (e.g., Cuck) than if the participant simply reads it (e.g., truck). Similarly, memory for a multiplication answer will be better if the participant solves the problem (e.g., 3 X 7 = ??) than if the participant just reads the answer (e.g., 3 X 7 = 21). In Slamecka and Graf's (1978) original experiments, generation effects were found using synonym, antonym, rhyme, associative, and category generation rules.

The generation effect is also robust. It occurs with words (Slamecka & Graf, 1978), nonwords (McElroy & Slamecka, 1982; Nairne & Widner, 1987), and numbers (Crutcher & Healy, 1989; Gardiner & Rowley, 1984). Evidence exists, however, that the size of the generation effect depends on the relative degree of difficulty involved in generating an item or set of items. Tyler, Hertel, McCallum, and Ellis (1979) found that more difficult word stem completions were remembered better than were less difficult ones. Fiedler, Lachnit, Fay, and Krug (1992) obtained a similar result comparing memory for word stems that varied in how much context they provided about the to-be-generated items. The generate condition providing the most context (e.g., a car rolls on four wh_C) produced a smaller generation effect than did the generate condition providing no context (e.g., wh_C).

Problem difficulty may also moderate the generation effect for simple multiplication problems. For example, Pesta, Sanders, and Nemec (1996) had participants read and solve simple multiplication problems. They manipulated problem difficulty via product size, which refers to the size of each problem's answer. Typically, small-product-size and large-product-size problems are solved in different ways: Most adults solve the former via direct memory retrieval of the answer and the latter via computation (see, e.g., Ashcraft, 1992). Differences in solution strategy are indeed the primary reason that large-product-size problems are more difficult to solve than small-product-size problems.

In current models of mental arithmetic, though, problem familiarity determines whether an answer is computed or retrieved from memory. Since, developmentally, children are exposed to more small-product-size problems than large-product-size problems in elementary school (see, e.g., Ashcraft, 1992; Ashcraft & Christy, 1995), small-product-size problems in turn are more often retrieved from memory than are large-product-size problems. Even after an item's answer is stored in memory, however, familiarity can still affect its rate of retrieval via strength of representation in memory (Ashcraft, 1992). Therefore, product size represents a manipulation of both stimulus familiarity and problem difficulty.

Pesta et al. (1996) hypothesized that differences in solution strategy for small-product-size and large-product-size problems may translate into differences in the size of their resulting generation effects. Because computation requires more attention or "cognitive effort" than memory retrieval, computed multiplication answers should show a larger generation effect than retrieved multiplication answers. Consistent with this hypothesis, Pesta et al. found a strong generation effect for large-product-size problems but no generation effect for small-product-size problems. Therefore, in contrast to studies using verbal materials, where familiarity increases the generation effect (see, e.g., Gardiner, Gregg, & Hampton, 1988; Nairne & Widner, 1988), in Pesta et al., more difficult, less familiar problems showed the largest generation effect.

That problem difficulty moderates the generation effect, however, is more an observation than an explanation. Simple effort accounts of the generation effect have fallen out of favor in the literature, partly because of their poor specification (see Greene, 1992, and Mitchell & Hunt, 1989, for discussions). Also, in some studies, a generation effect was found by using rules that require little effortful processing. For example, Donaldson and Bass (1980) used a generate rule that was trivially simple: Participants merely added the letter e to the end of each fragment (e.g., tabl_, appl_). Yet, even this small difference between reading and generating produced the effect. One can also obtain the effect with simple letter-switching generation rules (see, e.g., Nairne & Widner, 1987). Effortful processing is thus not a necessary condition for producing the generation effect. So, something else must explain the product size X generate condition interaction that exists for multiplication problems.

Before pursuing such an explanation, why should psychologists be interested in memory for multiplication answers? We believe there are several good reasons. First, studies using multiplication problems speak directly to theories of the generation effect based on verbal items.
Gardiner and Rowley (1984) first examined the generation effect for multiplication problems primarily to rule out a strong version of the lexical activation hypothesis (the idea that an item must have lexical status in semantic memory to show a generation effect). Second, multiplication facts can be ideal stimuli for anyone interested in the organization of knowledge in memory. The set of multiplication problems and answers is far smaller, with a more knowable structure, than the number of words in the average person's vocabulary. Finally, as we show, some interesting effects that exist in this area have strong parallels in the verbal domain. So, why does problem difficulty moderate the generation effect? Our explanation starts with the procedural account of McNamara and Healy (1995), which appeals to the operand-retrieval strategy. At memory test, participants may retrieve the operands to previously studied multiplication problems. The operands serve as powerful memory cues by allowing the regeneration of potential target answers. The participant then decides whether the regenerated answers occurred in the study phase of the

experiment. The operand-retrieval strategy takes advantage of transfer appropriate processing and encoding specificity (see, e.g., Roediger, 1990; Tulving & Thomson, 1973) for the generated, but not read, study items. These mechanisms then produce the generation effect.¹

The procedural account alone, however, does not explain why product size moderates the generation effect. Small-product-size and large-product-size problems contain the same number of operands, so the account would need to make the odd assumption that participants retrieve mostly large-product-size operands at memory test.

An important difference exists, though, between solving small-product-size and large-product-size problems, which is not incorporated into the procedural account. Solving a large-product-size problem via computation results in the generation of intermediate steps. These steps may be retrieved for use as cues to the target answers, just as the operands presumably are. In effect, solving large-product-size problems via computation generates more potential cues to the target item than does retrieving a small-product-size answer directly from memory. For example, one computational strategy for solving the problem 8 X 12 = ?? is to first generate the intermediate steps of 8 X 10 = 80 and 8 X 2 = 16, and then to add each result for the final product, 96. At test, the participant could retrieve the operands, the intermediate steps, or both for use as cues to the target answer. In contrast, solving a small-product-size problem via direct memory retrieval (e.g., 3 X 2 = ??) does not involve intermediate steps. The number of memory cues for these items is therefore limited to the operands that were present at study.

EXPERIMENT 1

By combining the procedural account with the intermediate-steps hypothesis, a precise testable explanation exists for why problem difficulty moderates the generation effect. Our first concern, however, was to replicate the basic effect. We thus had college students read and solve multiplication problems varying in product size and then administered a free recall test for all answers.

Method

Participants and Design. The participants for our experiments were undergraduates in (voluntary) classroom demonstrations of the generation effect, and each experiment used different people. The participants here were 6 males and 18 females with a mean age of 22.3 (range = 18-38) years. The design was a 2 X 2 factorial, with product size (small, large) and generate condition (read, generate) as within-subjects variables.

Materials. The experimental packet contained 24 multiplication problems selected from the 2 X 2 through 12 X 12 table. The first two problems (i.e., 10 X 10 = ??, and 10 X 9 = ??) reappeared as the last two problems and served as filler items used to reduce primacy and recency effects. The remaining 20 problems consisted of 10 small-product-size (i.e., answers ranging from 8 to 28) and 10 large-product-size (i.e., answers ranging from 42 to 108) problems.

PESTA, SANDERS, AND MURPHY

We dichotomized product size to use an analysis of variance (ANOVA), but no conclusions changed when treating the variable as continuous in regression analyses. Within each level of product size, 5 problems appeared as read items (e.g., 2 X 6 = 12, or 6 X 7 = 42), and the other 5 appeared as generate items (e.g., 4 X 5 = ??, or 8 X 9 = ??). We then arranged these into the packet vertically, one on every fourth line. The participants either solved (for generate items) or copied (for read items) their answers onto an answer space that was printed horizontally next to each problem. The items were ordered to alternate between small-product-size and large-product-size problems, and between read and generate problems.

A second experimental packet exactly reversed the assignment of read and generate conditions to items. The generate items in the first packet (e.g., 4 X 5 = ??, 8 X 9 = ??) served as the read items in the second packet (e.g., 4 X 5 = 20, 8 X 9 = 72), and vice versa. We distributed the packets at random to the participants. The instructions, printed on the first page of each packet, included two sample problems, one read and one generate, that the participants worked through. The instructions also required the participants to (1) work through the packet at their own pace, (2) not go back to a problem previously worked on, and (3) remember all of the answers in the packet, whether read or solved.

Procedure. The 20-min experiment was conducted in a classroom, and it began with distribution of the experimental packets, handed out face-down. The experimenter then read the instructions aloud, pausing as the participants worked through the samples. Each participant then completed the problems at his or her own pace. When finished, each participant exchanged the study packet for a recall sheet. Instructions on the recall sheet asked the participants to write down, in any order, all of the answers they could remember, whether read or solved. The participants were also cautioned not to guess and were asked to write down an answer only if reasonably sure it appeared in the study packet.

Results and Discussion

The level of statistical significance used for all analyses was p < .05. A preliminary statistical test found no significant effects involving the counterbalancing factor. Also, the mean percentage of errors made when generating the 10 multiplication answers was low (M = 2.5%, SD = 0.44; no errors were made when copying the read answers in any experiment). Finally, the mean number of recall intrusion errors (where the participant recalled an answer not contained in the study packet) was also low (M = 0.29, SD = 0.46).

Mean free recall out of five possible for any cell and standard deviations are shown in Table 1. For each ANOVA described in this study, a parallel analysis was run using proportion correct recall as the dependent variable. These analyses excluded trials in which the participant generated a wrong answer to a problem. Given the small number of errors made, however, the conclusions reached in each analysis were identical, and so we report only the former.

The ANOVA on the Experiment 1 data revealed significant effects of generate condition [F(1,22) = 17.0, MSe = 1.11], product size [F(1,22) = 9.0, MSe = 1.72], and the generate condition X product size interaction [F(1,22) = 5.6, MSe = 1.45]. As shown in Table 1, a generation effect (i.e., a significant difference between recall of generate and read items) existed for large-product-size but not small-product-size problems. Furthermore, postexperimental tests in Table 1 show that recall of the generate answers rather than read answers drove the interaction.

Table 1
Recall Means, Standard Deviations, and the Generation Effect by Experiment

                                 Generate         Read
Product Size                     M       SD       M       SD      Difference
Experiment 1
  Small                          1.79a   1.56     1.50a   1.32    0.29
  Large                          3.17b   1.05     1.71a   1.00    1.46*
Experiment 2, Same Operand
  Small                          1.91a   1.17     1.84a   1.25    0.07
  Large                          2.94b   1.16     1.59a   1.29    1.35*
Experiment 2, Different Operand
  Small                          2.53c   --       1.53a   --      1.00*
  Large                          2.66c   --       1.53a   --      1.13*
Experiment 3
  Small                          1.89b   1.13     1.02a   0.94    0.87*
  Large                          2.63c   1.17     1.20a   1.14    1.43*
Experiment 4
  Small                          3.00a   1.16     2.79a   1.20    0.21
  Large                          4.03b   0.91     2.73a   1.50    1.30*

Standard deviations for the different-operand rows, in extracted order (assignment to the generate and read columns not preserved): small, 1.24 and 1.39; large, 0.80 and 1.23.

Note. Maximum score = 5. Means within experiments not sharing superscripts differ at p < .05 (via Tukey LSD). *A generation effect significantly different from zero.

In sum, we replicated the generation effect for large-product-size multiplication problems but not small-product-size multiplication problems, reported earlier by Pesta et al. (1996). The operand-retrieval strategy, alone, cannot explain this result, because small-product-size and large-product-size problems contain the same number of operands. If combined with the intermediate-steps hypothesis, though, the operand-retrieval strategy can perhaps account for the generation-effect difference across product size. Operand retrieval, by itself, may not provide enough retrieval cues to produce a strong generation effect for small-product-size problems. The generation of large-product-size answers, however, more often involves computation, which allows for both the operands and the intermediate steps to serve as retrieval cues. The result is a strong generation effect for these items. If this explanation is correct, increasing the size of the generation effect for small-product-size problems should be possible, simply by increasing the number of retrieval cues that generating these items creates.

EXPERIMENT 2

To test this hypothesis, in Experiment 2, each multiplication answer was presented twice. In the same-operand condition, the operands at first and second presentations were identical (e.g., 3 X 6 = 18, followed 20 problems later by 3 X 6 = 18). In the different-operand condition, different problems (with the same answers) were given across presentations (e.g., 3 X 6 = 18, followed 20 problems later by 2 X 9 = 18). The operand-retrieval strategy predicts a larger generation effect in the different-operand condition than in the same-operand condition, because twice as many potential retrieval cues (i.e., operands) for the target answers are present in that condition.

Method

Participants and Design. The participants were 64 undergraduate psychology students, with mean ages of 25.4 (SD = 7.2) and 24.4 (SD = 6.1) years (same-operand and different-operand conditions, respectively). Each group contained 8 males and 24 females (by chance), with 1 additional participant excluded for not completing all multiplication problems. The design was a 2 X 2 X 2 mixed factorial, with product size (small, large) and generate condition (read, generate) as within-subjects variables. The between-subjects variable was operand condition (same or different operands at second presentation), and we randomly assigned participants to its levels.

Materials and Procedure. The experimental packets contained 40 multiplication problems, not counting the primacy and recency items described previously. The 40 problems had only 20 unique answers, because each answer occurred twice, using either the same operand or a different operand at the second presentation. Also, 8 of the 40 problems contained an operand larger than 12 (e.g., 27 X 3 = ??, 14 X 2 = ??). We included these problems because there are too few answers in the 2 X 2 through 12 X 12 multiplication table that are solvable by more than one unique combination of operands (which was also our reason for manipulating operand condition between subjects). An equal number of these problems, however, occurred in the read/generate and same-/different-operand conditions. Ten of the 20 products were small (i.e., range = 8-36) and 10 were large (i.e., range = 42-108); half of each type were read and half were generated. The items were ordered into the packet by alternating between read and generate and between small and large product size and by separating the repetition of each answer by 20 other problems. We then compiled three other experimental packets, counterbalancing the items in the read/generate and same-/different-operand conditions across packets. We distributed the packets at random to the participants, and the procedure of Experiment 2 was identical to that of Experiment 1.

Results and Discussion

No effects involving the counterbalancing factor were significant. The mean percentage of errors made when solving the multiplication problems was 8.1% (SD = 0.82) for the different-operand condition and 5.3% (SD = 0.88) for the same-operand condition; this difference was not significant (t < 1.0). The participants in the former condition, however, made more recall intrusions (M = 1.3, SD = 1.3) than did the participants in the latter condition (M = 0.63, SD = 1.0) [t(62) = 2.13]. This effect was probably due to the fact that the different-operand participants solved twice as many different problems as the same-operand participants.

Mean recall level out of five possible and standard deviations for each cell appear in Table 1. The ANOVA revealed a main effect of generation [F(1,60) = 50, MSe = 1.1], a generation X product size interaction [F(1,60) = 9.9, MSe = 0.8], and a generation X product size X operand condition interaction [F(1,60) = 6.7, MSe = 0.8]. The operand manipulation, by itself, did not interact with the generation effect [F(1,60) = 1.87, MSe = 1.1]. The difference scores and postexperimental tests listed in Table 1 show generation effects of similar size for all conditions except same-operand, small-product-size problems. Furthermore, as in Experiment 1, recall of the generate answers drove the highest order interaction.

The operand-retrieval strategy predicted the data pattern for small-product-size problems here. A generation effect existed for these problems only in the different-operand condition, where twice as many retrieval cues were present as in the same-operand condition. A result not predicted by this hypothesis, alone, was that the operand manipulation did not affect large-product-size problems. An explanation may follow, however, by combining the operand-retrieval strategy and the intermediate-steps hypothesis. Because no intermediate steps are involved in solving a small-product-size problem, the use of different operands effectively doubled the number of retrieval cues to these answers at memory test. In contrast, large-product-size problems may have already contained enough retrieval cues to produce a strong generation effect, because the generation of intermediate steps is a necessary by-product of computing these answers. The logic described here appeals to diminishing returns: adding more retrieval cues to an already memorable target item (i.e., a computed large-product-size answer) had a smaller impact on the generation effect than did doubling the number of retrieval cues available to a less memorable target item (i.e., a retrieved small-product-size answer). Thus, our operand manipulation had an impact on the generation effect for small-product-size problems, but not for large-product-size problems.
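The cue-counting logic behind this argument can be made concrete with a small Python sketch. It is only an illustration under stated assumptions, not part of the reported method: the function names, the tagging of cues as operands or steps, and the tens-and-units decomposition are hypothetical, chosen to mirror the 8 X 12 example given earlier. The text does not specify how participants actually decomposed each problem.

```python
def intermediate_steps(a, b):
    """Partial products from splitting the larger operand into tens and units.

    ASSUMPTION (hypothetical decomposition): 8 X 12 -> 8 X 10 = 80 and 8 X 2 = 16.
    Returns an empty list when the larger operand has no tens digit or no
    units digit, i.e., when this particular split yields no intermediate steps.
    """
    small, large = sorted((a, b))
    tens, units = divmod(large, 10)
    if tens == 0 or units == 0:
        return []
    return [small * tens * 10, small * units]


def retrieval_cues(presentations, computed):
    """Set of potential retrieval cues for one answer across its presentations.

    presentations: list of (operand, operand) pairs shown at study.
    computed: True if the answer is assumed to be computed rather than
    retrieved directly from memory.
    """
    cues = {("operand", x) for problem in presentations for x in problem}
    if computed:
        for a, b in presentations:
            cues.update(("step", s) for s in intermediate_steps(a, b))
    return cues


# Small-product-size answer 18, assumed to be retrieved directly.
same_operand = retrieval_cues([(3, 6), (3, 6)], computed=False)
different_operand = retrieval_cues([(3, 6), (2, 9)], computed=False)

# Large-product-size answer 96, assumed to be computed, presented once.
large_computed = retrieval_cues([(8, 12)], computed=True)

print(len(same_operand), len(different_operand), len(large_computed))  # 2 4 4
```

Under these assumptions, the different-operand manipulation doubles the cue count for a retrieved small-product-size answer (from 2 to 4), a count that a single computed presentation of 8 X 12 already reaches.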

EXPERIMENT 3

Experiments 1 and 2 did not provide a direct link between the type of mental processing participants engage in when solving a multiplication problem and the size of the generation effect. Instead, we assumed that computation was more likely when solving large-product-size problems than when solving small-product-size problems (see, e.g., Ashcraft, 1992). In Experiment 3, we directly tested whether problem-solution strategy (i.e., computation versus memory retrieval) moderates the generation effect: We had participants report strategy use for each multiplication problem they solved. The mental arithmetic literature shows convergence between self-reports of strategy use and on-line measures of problem-solution time (Geary, Frensch, & Wiley, 1993; Geary & Wiley, 1991; LeFevre et al., 1996). Therefore, we asked participants for self-reports of strategy use after they solved each multiplication problem. Unlike Experiment 2, however, each problem was presented only once. If our logic holds, the self-report categories should moderate the generation effect, with computed answers showing larger generation effects than retrieved answers.

Method

Participants and Design. The participants were 40 female and 14 male undergraduates with a mean age of 23.8 (range = 17-39) years. Two additional participants were excluded because they solved the read items, as shown by their self-reports. The design replicated that in Experiment 1.

Materials. The experimental packets contained 20 items, not counting the previously described primacy and recency filler items. The 20 items included 10 small-product-size (i.e., answers ranging from 12 to 30) and 10 large-product-size (i.e., answers ranging from 42 to 108) problems. Two of the 10 large-product-size problems contained an operand greater than 12 (i.e., 4 X 16 = ??, and 18 X 3 = ??). We included these to assess the validity of our self-report data. Because few people should have these answers committed to memory, computation should be selected as the solution strategy.

The study packets used here contained four self-report categories printed next to each problem's answer space. As the participants read or solved the problems, they checked a space next to one of the four possible strategies, which were (a) The answer was provided for me (for read items), (b) I remembered the answer immediately, (c) I remembered the answer after some time, and (d) I had to figure out the answer. We had the participants rate strategy use for read items to unconfound the act of rating with the read/generate independent variable.

Procedure. The instructions were similar to those in Experiment 1 but also included information on strategies for solving simple multiplication problems. Examples of easy (e.g., 2 X 2 = ??) and difficult (e.g., 11 X 11 = ??) multiplication problems were provided to illustrate the difference between retrieving an answer from memory and having to figure out or compute an answer. The participants were then introduced to our self-report categories, and other sample problems were provided for which strategy use was rated. A read problem was included in the samples, and the participants were asked to use the category, The answer was provided for me, exclusively for these problems. The procedure of Experiment 3 was otherwise identical to that of Experiment 1.

Results and Discussion

No source of variance involving the counterbalancing factor was significant. The mean percentage of errors made when solving the multiplication problems was 2.4% (SD = 0.47), and the mean number of recall intrusion errors was 0.24 (SD = 0.51). Table 1 shows mean recall values and standard deviations by product size and generate condition for this experiment.

The ANOVA revealed main effects of product size [F(1,52) = 11.4, MSe = 1.01], generate condition [F(1,52) = 63, MSe = 1.13], and the product size X generate condition interaction [F(1,52) = 4.6, MSe = 0.90]. As shown in Table 1, although the difference in recall of the read answers across product size was again not significant, small-product-size problems showed a small but significant generation effect. We suspect that rating strategy use for problems here increased the memorability of the generate answers. The key result, though, is that large-product-size problems again showed a larger generation effect than did small-product-size problems.

The strategy self-report data, summarized in Table 2, are based on the ratings of half the sample (i.e., 27 participants for each problem), because half the items any participant experienced occurred in the stimulus set as read items. The table excludes data on eight small-product-size problems solved exclusively via immediate memory retrieval of the answer. We conducted several analyses on the self-report data, but all yielded the same conclusion: Solution strategy did not moderate the generation effect for large-product-size problems. Consider just our initial analysis, which involved calculating percent correct recall values for large-product-size problems across each possible level of strategy. The means were 44% (SD = 19) for immediate retrieval, 64% (SD = 39) for delayed retrieval, and 50% (SD = 36) for computed answers. The ANOVA was not significant, nor was any pairwise comparison. Again, this conclusion followed in all other analyses (these also included recall of the read items to assess generation-effect differences among the items more directly).

Table 2
The Generation Effect and the Percentage of Participants Reporting Computation, Immediate, or Delayed Memory Retrieval as the Solution Strategy for Experiment 3 Problems

Columns: Problem; Generation Effect (%)*; Strategy Type: Immediate Retrieval (%), Delayed Retrieval (%), Answer Computation (%)

Problems (in row order): 3 X 8 = 24; 3 X 9 = 27; 6 X 7 = 42; 6 X 8 = 48; 18 X 3 = 54; 7 X 8 = 56; 7 X 9 = 63; 4 X 16 = 64; 6 X 12 = 72; 7 X 12 = 84; 8 X 12 = 96; 9 X 12 = 108

Cell values in extracted order (row and column alignment not preserved): 22, 93, 33, 15, 26, 30, 89, 89, 4, 11, 11, 4, 0, 37, 7, 30, 93, 11, 78, 78, 19, 30, 22, 22, 33, 37, 15, 15, 19, 30, 11, 22, 33, 11, 19, 19, 11, 0, 0, 7, 78, 3, 3, 71, 55, 59, 55, 67

Note. The self-report percentages are based on n = 27. *Calculated by subtracting the percent recall for the 27 participants reading the item from the percent recall for the 27 other participants generating the item.

The breakdown of strategy use by problem in Table 2 is consistent with commonsense notions of problem difficulty, yet the self-report data did not moderate the generation effect. In our attempt to make sense of this result, we discovered the peculiar data pattern illustrated in Table 3. The table lists generation effects, ranked within experiments and ordered by product size, for each problem used in more than one of our experiments. For example, the problem 2 X 8 = 16 appeared in all three experiments, but its generation effect varied dramatically across experiments (i.e., large in Experiment 1, intermediate in Experiment 2, and small in Experiment 3). In fact, small-product-size problems showed a modest inverse correlation [r(19) = -.10, p > .05] for the generation-effect ranks across all pairs of same answers.²

The data followed a different pattern for large-product-size problems. The generation-effect ranks were relatively stable for the same problems across experiments, and the reliability correlation [r(21) = .51] was positive and significant. We believe the Table 3 data reflect a

Table 3
Actual and Predicted Generation-Effect Ranks for Identical Problems as a Function of Experiment

Columns: Problem; Experiment 1 (Actual, Predicted); Experiment 2 (Actual, Predicted); Experiment 3 (Actual, Predicted)

Small products (in row order): 4 X 2 = 8; 3 X 3 = 9; 3 X 4 = 12; 2 X 8 = 16; 2 X 9 = 18; 4 X 5 = 20; 3 X 8 = 24; 3 X 9 = 27; 7 X 4 = 28

Large products (in row order): 6 X 7 = 42; 6 X 8 = 48; 7 X 8 = 56; 7 X 9 = 63; 7 X 12 = 84; 8 X 12 = 96; 9 X 12 = 108

Rank values, one extracted column per line (column labels not preserved):
9 10 7 2 1 7 5 3 7
2.5 9 9 2.5 1 5.5 5.5 5.5 9
9 8 7 6 5 4 1
6 9.5 2.5 6 6 9.5 6
3 9 1.5 5 7.5 7.5 5
2.5 7.5 2.5 7.5 4.5 7.5 7.5
1.5 8.5 8.5 3.5 5 1.5 3.5
6 8 8 2.5 2.5 2.5 6
8 5.5 3.5 1.5 7 3.5 1.5
7 3 3 7 3 3 7
8 3.5 5.5 5.5 3.5 1 3.5
2.5 2.5 2.5 6 6 8 6

Note. Small ranks indicate larger generation effects, and ranks are within product size. The correlations between actual and predicted values are r(23) = +.64 for small-product-size problems, and r(21) = -.18 for large-product-size problems. The correlations between the actual generation-effect ranks, using all possible pairs of same answers (i.e., reliability) are r(19) = -.10 for small-product-size problems and r(21) = +.51 for large-product-size problems.

spreading activation mechanism in semantic memory, occurring whenever the participants retrieved the answers to single-digit multiplication problems. This mechanism affected the episodic recall probability of other problems associated with the generated items. We describe this mechanism next, and then explain why our self-report categories failed to moderate the generation effect for large-product-size problems. Afterward, we present a final experiment, designed to rule out an alternative explanation for the data so far and to test, a priori, the hypothesis we describe below.

SEMANTIC MEMORY AND THE GENERATION EFFECT

We propose that both semantic and episodic memory processes determine the size of the generation effect for a given multiplication problem, and we begin the explanation with Ashcraft's (1992) network-retrieval model of semantic memory (see also Campbell, 1995; Siegler, 1988). In the network model, multiplication answers are stored in semantic memory as nodes in an associative network. Familiarity or experience with a problem determines the memory strength of each node. Overall, more experience with a problem leads to a stronger associative link between it and its correct answer.

Each answer in the network connects to every other answer, but associative strength determines the semantic distance between two nodes. Two answers are near neighbors if adding or subtracting 1 to either of the first answer's operands results in the second answer (see, e.g., Ashcraft, 1992). For example, the problem (3 X 7 =) 21 has the following neighbors: (3 X 6 =) 18, (3 X 8 =) 24, (2 X 7 =) 14, and (4 X 7 =) 28.

When a person retrieves a multiplication answer from the network, activation from the operands spreads to the target answer node and then to its neighbors. For example, solving the problem 3 X 7 = ?? would activate the answer node 21 and then, to a smaller extent, the neighbor nodes 18, 24, 14, and 28. We suggest that neighbor-node activation in semantic memory affects the recall probability for these items on an episodic memory test. In other words, solving 3 X 7 = ?? can make the answer 24 more memorable. But, in a generation-effect experiment, both read and generate items can receive neighbor activation. Therefore, when a generate item is activated, the resulting generation effect for that item should be increased. Conversely, when a read item is activated, its generation effect should be decreased.

To test this idea, we first identified neighbor nodes for every generate item used in our experiments, excluding 12s problems, and the different-operand condition of Experiment 2. The key was to separate generate items into groups as a function of the particular counterbalancing packet in which they occurred. Recall that, within experiments, the items generated by half the participants were read by the other half. So, which particular neighbors were activated depended on which particular counterbalancing packet the participant received. For each read or generate answer within counterbalancing sets, we tallied the number of activations received from neighbors generated in the same set. Again, across the two counterbalancing sets, each answer was generated once and was read once.
The difference in neighbor activations for each item when generated versus read was then used to predict the size of the generation effect for each problem, relative to every other problem (i.e., rank-predicted generation effects for each problem within an experiment).

Consider the problem 3 X 4 = 12, used in Experiment 1. When presented as a generate item in one packet (i.e., 3 X 4 = ??), the answer was activated as a near neighbor by two other generate items in the same packet (i.e., 2 X 4 = ?? and 3 X 6 = ??). When presented as a read item in the other packet (i.e., 3 X 4 = 12), the answer was activated as a near neighbor by three generate items in the same packet (i.e., 3 X 3 = ??, 4 X 4 = ??, and 2 X 7 = ??). The net influence of neighbor-node activations on the generation effect for this item was -1.0, the difference between the number of generate and read activations. For comparison, the Experiment 1 problem, 2 X 9 = 18 was activated twice when it occurred as a generate item (i.e., 6 X 4 = ?? and 9 X 3 = ??) and zero times when it occurred as a read item. The net influence of neighbor-node activations on the generation effect for this item was +2.0. We therefore predicted the answer 18 to show a larger generation effect than the answer 12. In fact, the answer 18 showed a positive generation effect (i.e., of +5 items), and the answer 12 showed a negative generation effect (i.e., of -1 item).

On the basis of neighbor activations, predicted generation-effect values were ranked within experiments and were paired with the observed generation-effect ranks for each problem. The ranks within experiments were combined into one data set (without reranking across experiments) to maximize statistical power. We then regressed product size (as a continuous variable ranging from 8 to 81) and the neighbor-node-predicted values on the actual generation-effect ranks observed for each problem. The regression equation resulted in a multiple correlation of .66, showing that 43% of the generation-effect variance was explained by product size [β = .50, t(40) = 4.2] and the neighbor-node values [β = .43, t(40) = 3.6] (these weights are standardized).

The apparent generation-effect unreliability for small-product-size problems across experiments appears to have been due to neighbor activation (see the predicted values for these problems in Table 3). Large-product-size problems, however, did not require consideration of neighbor activation to produce reasonable generation-effect data. The contrast between small-product-size and large-product-size problems probably results from differences in stimulus familiarity: Small-product-size problems are more familiar, and, therefore, they activate their neighbors more strongly than do large-product-size problems.

In sum, neighbor activation can partially explain the generation effects we observed for simple multiplication problems. That is, neighbor activation in semantic memory makes an item more memorable in episodic memory. The net influence on the generation effect, however, depends on whether the activated neighbor appeared in the stimulus set as a read item or a generate item.
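The neighbor tally described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' scoring procedure: the function names are hypothetical, and the problem lists contain only the generate items that the text names for the answers 12 and 18, not the full Experiment 1 packets.

```python
def neighbor_answers(a, b):
    """Answers reached by adding or subtracting 1 to either operand of a X b."""
    return {(a - 1) * b, (a + 1) * b, a * (b - 1), a * (b + 1)}


def activations(answer, generated_problems):
    """Number of generated problems that have `answer` as a near neighbor."""
    return sum(answer in neighbor_answers(a, b) for a, b in generated_problems)


def net_activation(answer, generated_when_generate, generated_when_read):
    """Activations received as a generate item minus those received as a read item.

    Each argument lists the (operand, operand) problems generated in the
    counterbalancing packet where the target was a generate or a read item.
    """
    return (activations(answer, generated_when_generate)
            - activations(answer, generated_when_read))


print(sorted(neighbor_answers(3, 7)))  # [14, 18, 24, 28]

# Answer 12: activated by 2 X 4 and 3 X 6 when generated,
# and by 3 X 3, 4 X 4, and 2 X 7 when read.
print(net_activation(12, [(2, 4), (3, 6)], [(3, 3), (4, 4), (2, 7)]))  # -1

# Answer 18: activated by 6 X 4 and 9 X 3 when generated, never when read.
print(net_activation(18, [(6, 4), (9, 3)], []))  # 2
```

The sketch treats an answer as activated whenever any operand pair producing it is one step away from a generated problem, which is what the worked examples imply (3 X 6 activates 12 through 2 X 6). Ranking these net values within an experiment would give the predicted generation-effect ranks.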

EPISODIC MEMORY AND THE GENERATION EFFECT

Our initial hypothesis that operands (see, e.g., McNamara & Healy, 1995) and intermediate steps serve as retrieval cues clearly predicted strategy use to moderate the generation effect in Experiment 3. Self-reports should account for the correlation between product size and the generation effect. Our analysis, though, may have simply suffered from a lack of statistical power, in part because only 12 problems showed variability in reported solution strategy.

To explore this possibility, we borrowed data from a recent mental arithmetic study. LeFevre et al. (1996) examined solution strategies for solving single-digit multiplication problems and reported strategy use and solution-time data on individual problems. Their data allow us to examine the relationship between the generation effect and strategy use across every single-digit operand problem we used. Again, the critical prediction given the cue retrieval explanation is that the product-size/generation-effect correlation should disappear when strategy use is controlled.

We first calculated simple correlations between four variables: (1) the actual generation-effect ranks observed for problems within experiments, (2) product size, (3) reaction time (RT) to solve each problem, and (4) the percentage of participants reporting memory retrieval as the solution strategy for each problem. Data comprising the last two variables, taken from the LeFevre et al. (1996) appendix, were also ranked (using raw instead of rank values resulted in the same conclusions). The actual generation-effect ranks correlated significantly with product size [r(43) = .50], RT [r(43) = .59], and percentage use of direct memory retrieval [r(43) = .53]. The product-size correlation disappeared, however, when strategy use was controlled [r(40) = -.23, p = .14] or when RT was controlled [r(40) = -.12, p = .47] [the simple correlation between strategy use and RT was r(43) = .73]. In contrast, when product size was partialed out of the correlation between the actual generation effect and either the strategy use or the RT variable, significant correlations remained [strategy use, r(40) = .32; RT, r(40) = .38].

The partial correlations support the idea that problem-solution strategy moderates the generation effect for simple multiplication problems. Our experiments used product size as an indicator of solution strategy, and product size, in fact, moderated the generation effect. The key result, however, is that measures of strategy use and solution time each accounted for the shared variance between product size and the actual generation effect. In other words, the generation effect is larger when a participant computes an answer than when he or she retrieves an answer from memory.

As hinted at above, however, a possible alternative explanation exists for our data (see Note 3). Notice that RT correlates more highly with product size than does the strategy-use variable. In our analyses, we assumed both RT and strategy use to indicate how a problem is solved. It is possible (but uninteresting) that the critical interaction exists simply because participants spend more time computing large-product-size answers than they do retrieving small-product-size answers.
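The logic of the partial correlations reported above can be illustrated with the standard first-order partial correlation formula. The sketch below is only an illustration: the text does not say how the partial correlations were computed, the function name `partial_corr` is hypothetical, and the input values are invented for demonstration (the correlation between product size and RT is not reported here, so the published values cannot be reproduced from this section alone).

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y, controlling for z.

    Uses the textbook formula
        r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    Inputs are pairwise correlations (these may be rank-based).
    """
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("undefined when x or y correlates perfectly with z")
    return (r_xy - r_xz * r_yz) / denom


# Illustrative values only (assumptions, NOT data from the article):
# x = generation effect, y = product size, z = solution time.
shrunk = partial_corr(r_xy=0.5, r_xz=0.6, r_yz=0.8)    # x-y link mostly carried by z
vanished = partial_corr(r_xy=0.4, r_xz=0.5, r_yz=0.8)  # x-y link fully carried by z
print(f"partial r (mostly shared with z): {shrunk:.3f}")
print(f"partial r (fully shared with z):  {vanished:.3f}")
```

When the simple correlation between two variables equals the product of their correlations with the control variable, the partial correlation is zero. That is the pattern the cue retrieval explanation predicts for product size once strategy use or RT is controlled.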

EXPERIMENT 4

We therefore present a final experiment in which the amount of time participants spent studying small-product-size and large-product-size problems was controlled. We accomplished this by encouraging participants to continue processing large-product-size and small-product-size answers after solving them for an upcoming memory test. An a priori test of the neighbor-node model was also constructed, predicting the actual generation-effect ranks to correlate with predicted values based on neighbor-node activations. Furthermore, because the stimuli here were identical to those in Experiment 1, the ranked generation effects for small-product-size problems should correlate across these experiments (unlike the -.10 correlation reported in Table 3). A final goal of Experiment 4 was to see whether presenting the items in a pseudorandom order, as opposed to the alternating order used in the previous experiments, would affect the critical interaction.

Method

Participants and Design. The participants were 5 males and 19 females with a mean age of 21.6 (range = 17-35) years. The design replicated that in Experiment 1.

Materials and Procedure. The 24 multiplication problems in Experiment 1 were printed separately on 6 X 8 in. index cards in boldface 72-point font. We compiled two sets of index cards that corresponded to the two counterbalanced stimulus packets used in Experiment 1. The first set was used with half the participants, and the second set was used with the other half. Unlike Experiment 1, however, the multiplication problems were arranged in a pseudorandom order, fixed for all participants.

The procedure differed from Experiment 1 only as follows. First, the experiment was run in groups ranging from 2 to 6 participants. Second, the stimuli were presented at a rate of 5 sec each. During this period, the participants wrote down the problem and its answer on a blank sheet of paper and spent any remaining time studying the answer for later recall. After stimulus presentation, the participants engaged in a 30-sec distractor task, circling the letter I as it appeared in campus newspaper articles (we checked to ensure that the participants were doing this, but we did not analyze these data). The distractor task was used to avoid possible ceiling effects, because, in previous experiments, some delay existed between the time the participant finished the study packet and the time the participant received the recall sheet. Notice that, even with the distractor task, recall for all items here was elevated relative to recall in the first three experiments (see Table 1). After the distractor task, the participants free-recalled their answers on a second blank sheet of paper.

Results and Discussion

Table 1 lists mean recall values by product size and generate condition for Experiment 4. No source of variance involving the counterbalancing factor was significant. The mean percentage of solution errors made was 2.1% (SD = 0.41), and the mean number of recall intrusions was 0.92 (SD = 1.2). The ANOVA revealed main effects of generate condition [F(1,23) = 8.0, MSe = 1.6], product size (marginal) [F(1,23) = 4.12, MSe = 1.3, p = .054], and the generate condition X product size interaction [F(1,23) = 12.2, MSe = 0.72]. As in the previous experiments, only large-product-size problems showed a significant generation effect, and recall of the generate, rather than read, answers drove the interaction (see Table 1). These data show that product size indeed moderates the generation effect, even with study time controlled.

Turning to semantic memory, the first prediction was that the actual and neighbor-node-predicted generation-effect ranks would correlate. Although in the right direction, the observed correlation was not significant, given the small number of observations it was based on [r(15) = .24]. [Looking at small-product-size problems only, the correlation was larger, but still nonsignificant, r(9) = .37.] Despite power concerns, however, the second prediction was supported. To review, Table 3 shows the lack of generation-effect stability for small-product-size problems across experiments. Because Experiments 1 and 4 used the same stimuli, the pattern of neighbor activations for each item was the same across experiments. The model therefore predicts stability in the generation effect for these items. In fact, the observed generation-effect correlation for small-product-size problems across Experiments 1 and 4 was significant with 10 observations [r(9) = .59, p < .05, one-tailed]. This correlation contrasts sharply with the -.10 value observed in Table 3 (see Note 4).

In sum, Experiment 4 showed that the generate condition X product size interaction exists even when controlling for study time. The experiment also offers some degree of a priori evidence for the neighbor-node hypothesis. To bolster our argument, however, we provide three more analyses related to this hypothesis, using data from all our experiments.

So far, the neighbor-node hypothesis predicted only generation-effect differences among problems. If the neighbor effect is real, however, activations should also correlate significantly with the percent correct recall values for each item, rather than just the generation-effect difference scores. Moreover, our hypothesis restricted the effect of neighbor activations to generate items only. It is possible, though, that reading an item (e.g., 3 X 4 = 12) also results in some degree of spreading activation that affects the episodic recall probability of its neighbors.

We addressed these issues by conducting a regression analysis, using four variables to predict percent recall for each single-digit operand problem in our study: (1) neighbor activations resulting from generating other problems, (2) neighbor activations resulting from reading other problems, (3) product size, and (4) whether the target item itself was read or generated. The multiple correlation was .53, showing that our independent variables explained 29% of the recall variance. Neighbor activation from reading an item, however, did not predict recall [β = .11, t(113) = 1.02], but the other three variables did [generate condition, β = .48, t(113) = 4.5; product size, β = .32, t(113) = 3.4; neighbor activations from generated items, β = .26, t(113) = 2.4].

The next analysis involved first categorizing small-product-size answers based on how many activations they received from generated neighbors and then looking for percent correct recall differences across the categories. For example, 16 answers in our study were not activated as neighbors by generate problems in the same stimulus set. These answers averaged 38% (SD = 15) correct recall. The mean value for the 13 answers receiving one activation was similar: 35% (SD = 14). For the 9 problems receiving two (n = 7) or three (n = 2) activations, however, correct recall was at 59% (SD = 18). The one-way ANOVA conducted on these data resulted in a main effect of number of activations [F(2,35) = 6.5, MSe = 254]. Postexperimental tests revealed that answers in the latter group were recalled best, with no recall difference between the former two groups.

Finally, we conducted a regression analysis on the actual generation effect for all single-digit operand problems in this study. The independent variables were the neighbor-node-predicted values and mean RT to solve each problem (from LeFevre et al., 1996). We excluded the self-report measure due to its correlation with RT [i.e., r(43) = .73] and assumed both measure the same thing. The multiple correlation was .71, showing that the neighbor nodes [β = .38, t(56) = 3.9] and RT [β = .64, t(56) = 6.7] explained 51% of the generation-effect variance.
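As a rough consistency check on the one-way ANOVA over activation categories reported above, an F ratio can be rebuilt from group sizes, means, and standard deviations alone. The sketch below assumes that the reported SDs are sample standard deviations computed across answers within each category. The helper name `anova_from_summary` is hypothetical. Because the inputs are the rounded values printed in the text, the result should be read as an approximation and not as a reproduction of the published analysis.

```python
from typing import Sequence, Tuple


def anova_from_summary(
    ns: Sequence[int], means: Sequence[float], sds: Sequence[float]
) -> Tuple[float, int, int, float]:
    """One-way ANOVA from summary statistics.

    Returns (F, df_between, df_within, MS_within). Assumes each SD is a
    sample standard deviation (n - 1 in the denominator).
    """
    n_total = sum(ns)
    k = len(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    # Between-groups sum of squares: weighted squared deviations of group means.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Within-groups sum of squares recovered from each group's variance.
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between = k - 1
    df_within = n_total - k
    ms_within = ss_within / df_within
    f_ratio = (ss_between / df_between) / ms_within
    return f_ratio, df_between, df_within, ms_within


# Rounded values from the text: 0, 1, and 2-3 activations.
f_ratio, df_b, df_w, ms_w = anova_from_summary(
    ns=[16, 13, 9], means=[38.0, 35.0, 59.0], sds=[15.0, 14.0, 18.0]
)
print(f"F({df_b},{df_w}) = {f_ratio:.2f}, MSe = {ms_w:.1f}")
```

With these inputs the degrees of freedom match the reported F(2,35). The error term and F ratio come out in the same general range as the reported values but are not identical to them, which is expected when whole-number percentages and SDs stand in for the raw recall scores.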

GENERAL DISCUSSION

We propose that the generation effect for multiplication problems is influenced by both semantic and episodic memory processes. The model describing these effects is process-oriented, as opposed to item-oriented, appealing to spreading activation in semantic memory and to cue retrieval in episodic memory. Specifically, the generation effect for any single multiplication problem is influenced by (1) the frequency with which the answer occurred as a neighbor node for other generate problems in the same stimulus set, (2) the use of the problem's operands as retrieval cues to the answer (i.e., McNamara & Healy's, 1995, operand-retrieval strategy), and (3) the use of any intermediate steps, if computing the answer, as an additional source of retrieval cues.

The idea that semantic memory plays a role in producing the generation effect is not new, nor is its influence limited to multiplication items. Several studies using verbal items have implicated semantic memory processes. These include studies examining generation-effect differences among experts versus novices (see, e.g., Peynircioglu & Mungan, 1993; Reardon, Durso, Foley, & McGahan, 1987; see also Pesta et al., 1996), high- versus low-frequency words (see, e.g., Gardiner et al., 1988; Nairne & Widner, 1988), and words versus nonwords (see, e.g., McElroy & Slamecka, 1982; Nairne & Widner, 1987).

Whether our findings generalize to addition or division problems is an empirical question, but nonetheless an interesting one. For example, McNamara and Healy (1995) observed no generation effect for addition problems unless they were solved in a study list also containing multiplication problems. The neighbor-node idea may serve as an additional aid in explaining the generation effect (or lack thereof) for addition problems.

The present data also tie to the verbal domain in the work of Roediger and McDermott (1995), who found strong false-memory effects for critical lure items that were associates of other words in a study list. For example, participants incorrectly remembered the word sleep after studying a list of sleep words (e.g., pillow, bed, rest, tired). Indeed, participants "remembered" critical lures as often as target items, a result explained by the large amount of residual activation the lure received from processing of its associates. In other words, semantic neighbor activation affected an item's episodic recall probability.

Two differences, however, between the Roediger and McDermott (1995) study and ours deserve mention. First, we used the idea of neighbor activation to explain recall of other target items in the same stimulus set. Roediger and McDermott used it to predict recall intrusion errors.

We did not have enough intrusion errors to conduct a meaningful analysis with the present data, although the neighbor-node idea predicts intrusion error frequency to vary with how often nontarget multiplication answers are residually activated. Second, the strength of the experimental manipulations across studies was vastly different. Roediger and McDermott's impressive results came from using an extreme design: The critical lure was related to every studied list item. The pattern of residual activation here, however, was more subtle. At most, other studied items activated a target item three times. The result was weaker, but consistent, effects. Nonetheless, even relatively small amounts of residual activation in semantic memory can sometimes influence an item's recall probability on an episodic memory test, as shown here.

To conclude, the memory-systems explanation for the generation effect provides specific, testable predictions and accounts for the interesting effects observed here, including the product-size interaction and the unreliability of the generation effect across experiments. The explanation ties to verbal items in the work of Roediger and McDermott (1995), and, finally, our data suggest that both semantic and episodic memory processes can influence the size of an observed generation effect.

REFERENCES

ASHCRAFT, M. H. (1992). Cognitive arithmetic: A review of data and theory. Cognition, 44, 75-106.
ASHCRAFT, M. H., & CHRISTY, K. S. (1995). The frequency of arithmetic facts in elementary texts: Addition and multiplication in grades 1-6. Journal for Research in Mathematics Education, 26, 396-421.
CAMPBELL, J. I. D. (1995). Mechanisms for simple addition and multiplication: A modified network-interference theory and simulation. Mathematical Cognition, 1, 121-164.
CRUTCHER, R. J., & HEALY, A. F. (1989). Cognitive operations and the generation effect. Journal of Experimental Psychology: Learning, Memory, & Cognition, 14, 669-675.
DONALDSON, W., & BASS, M. (1980). Relational information and memory for problem solutions. Journal of Verbal Learning & Verbal Behavior, 19, 26-35.
FIEDLER, K., LACHNIT, H., FAY, D., & KRUG, C. (1992). Mobilization of cognitive resources and the generation effect. Quarterly Journal of Experimental Psychology, 45A, 149-171.
GARDINER, J. M., GREGG, V. H., & HAMPTON, J. A. (1988). Word frequency and generation effects. Journal of Experimental Psychology: Learning, Memory, & Cognition, 14, 687-693.
GARDINER, J. M., & ROWLEY, J. M. C. (1984). A generation effect with numbers rather than words. Memory & Cognition, 12, 443-445.
GEARY, D. C., FRENSCH, P. A., & WILEY, J. G. (1993). Simple and complex mental subtraction: Strategy choice and speed-of-processing differences in younger and older adults. Psychology & Aging, 8, 242-256.
GEARY, D. C., & WILEY, J. G. (1991). Cognitive addition: Strategy choice and speed-of-processing differences in young and elderly adults. Psychology & Aging, 6, 474-483.
GREENE, R. L. (1992). Human memory: Paradigms and paradoxes. Hillsdale, NJ: Erlbaum.
JACOBY, L. L. (1978). On interpreting the effects of repetition: Solving a problem versus remembering a solution. Journal of Verbal Learning & Verbal Behavior, 17, 649-667.
LEFEVRE, J., BISANZ, J., DALEY, K. E., BUFFONE, L., GREENHAM, S. L., & SADESKY, G. S. (1996). Multiple routes to solution of single-digit multiplication problems. Journal of Experimental Psychology: General, 125, 284-306.


McELROY, L. A., & SLAMECKA, N. J. (1982). Memorial consequences of generating nonwords: Implications for semantic memory interpretations of the generation effect. Journal of Verbal Learning & Verbal Behavior, 21, 249-259.
McNAMARA, D. S., & HEALY, A. F. (1995). A procedural explanation of the generation effect: The use of an operand retrieval strategy for multiplication and addition problems. Journal of Memory & Language, 34, 399-416.
MITCHELL, D. B., & HUNT, R. R. (1989). How much "effort" should be devoted to memory? Memory & Cognition, 17, 337-348.
NAIRNE, J. S., & WIDNER, R. L. (1987). Generation effects with nonwords: The role of test appropriateness. Journal of Experimental Psychology: Learning, Memory, & Cognition, 13, 164-171.
NAIRNE, J. S., & WIDNER, R. L. (1988). Familiarity and lexicality as determinants of the generation effect. Journal of Experimental Psychology: Learning, Memory, & Cognition, 14, 694-699.
PESTA, B. J., SANDERS, R. E., & NEMEC, R. J. (1996). Older adults' strategic superiority with mental multiplication: A generation effect assessment. Experimental Aging Research, 22, 155-169.
PEYNIRCIOGLU, Z. F., & MUNGAN, E. (1993). Familiarity, relative distinctiveness, and the generation effect. Memory & Cognition, 21, 367-374.
REARDON, R., DURSO, F. T., FOLEY, M. A., & McGAHAN, J. R. (1987). Expertise and the generation effect. Social Cognition, 5, 336-348.
ROEDIGER, H. L., III (1990). Implicit memory: Retention without remembering. American Psychologist, 45, 1043-1056.
ROEDIGER, H. L., III, & McDERMOTT, K. B. (1995). Creating false memories: Remembering words not presented in lists. Journal of Experimental Psychology: Learning, Memory, & Cognition, 21, 803-814.
SIEGLER, R. S. (1988). Strategy choice procedures and the development of multiplication skill. Journal of Experimental Psychology: General, 117, 258-275.
SLAMECKA, N. J., & GRAF, P. (1978). The generation effect: Delineation of a phenomenon. Journal of Experimental Psychology: Human Learning & Memory, 4, 592-604.
TULVING, E., & THOMSON, D. M. (1973). Encoding specificity and retrieval processes in episodic memory. Psychological Review, 80, 352-373.
TYLER, S. W., HERTEL, P. T., McCALLUM, M. C., & ELLIS, H. C. (1979). Cognitive effort and memory. Journal of Experimental Psychology: Human Learning & Memory, 5, 607-617.

NOTES

1. A reviewer wondered why people would bother remembering operands, when they could perhaps more readily remember the target answers. We believe that people do remember target answers, sometimes. When memory for the target answer fails, however, the operand-retrieval strategy provides a second route for retrieval.

2. The correlations are similar in size when number of items, instead of ranks, is used to index the generation effect. We used ranks, however, because the hypothesis we describe predicts generation-effect magnitude differences among problems within an experiment, as opposed to actual generation-effect values for each problem.

3. The product size X generate condition interaction may also exist because large-product-size problems are more distinct than small-product-size problems. The following analysis, though, rules out this explanation: We operationalized distinctiveness based on distance and solution space. For each answer used in this study, we calculated its absolute mean distance from every other answer. Across product size, this resulted in a curvilinear function, with the smallest mean distance occurring for answers in the 10s and 20s decades (i.e., these being least distinct). Distinctiveness, however, did not correlate with the generation effect, after controlling for product size (conversely, product size's correlation with the generation effect was unchanged, even after controlling for distinctiveness).

4. Admittedly, the reported correlations have less than ideal statistical power. We are in the process, however, of designing more powerful and direct tests of the neighbor-node model.

(Manuscript received March 27, 1997; revision accepted for publication December 8, 1997.)