
SAGE Open - Original Manuscript

Probing the Effects of Task Types on EFL Learners’ Receptive and Productive Vocabulary Knowledge: The Case of Involvement Load Hypothesis

SAGE Open
July-September 2017: 1–10
© The Author(s) 2017
DOI: 10.1177/2158244017730596
https://doi.org/10.1177/2158244017730596
journals.sagepub.com/home/sgo

Maryam Tahmasbi¹,² and Mohammad Taghi Farvardin²

Abstract

This study examined the effects of task types on English as a foreign language (EFL) learners’ receptive and productive vocabulary knowledge. To this end, 130 (70 female and 60 male) EFL learners were randomly assigned to one of six tasks for learning 30 target words. The design of the tasks was based on the involvement load hypothesis (ILH), which argues that the learning of unfamiliar words is contingent on the amount of task-induced involvement. The components of involvement in the ILH are need (N), search (S), and evaluation (E). In this study, the tasks induced the same or different involvement loads depending on the presence and strength of each component: paragraph writing (+N, +S, ++E), sentence writing (+N, +S, +E), combining (+N, –S, +E), fill in the blank (+N, –S, +E), translation (+N, –S, +E), and control (–N, –S, –E). After the last treatment session, both receptive and productive knowledge of the target words were measured. Moreover, a delayed posttest was administered 1 month later. The results revealed that all output tasks were more effective than the control task in enhancing the participants’ receptive and productive vocabulary knowledge. Moreover, the paragraph writing task was found to be the most effective task.

Keywords
EFL learners, involvement load hypothesis, output task, receptive vocabulary knowledge, productive vocabulary knowledge

Introduction

Vocabulary is viewed as the essential building block of second language (L2) learning (McCarty, 2005). Vocabulary knowledge is usually classified into productive and receptive knowledge (Nation & Meara, 2002). Productive vocabulary knowledge refers to learners’ ability to comprehend something they hear or read and to express their ideas using proper vocabulary in writing or speaking (Nation, 2003). Receptive vocabulary knowledge, however, refers to recalling words through listening and reading (Schmitt, 2000). As vocabulary knowledge plays a crucial role in L2 learning, teachers need to select and use proper vocabulary learning tasks (Bao, 2015). Moreover, language researchers should address the reasons for the superiority of some tasks over others in L2 vocabulary learning. In this regard, it has been argued that the learning and retention of words in an L2 depend on the extent of involvement with a task, a claim that Laufer and Hulstijn (2001) called the involvement load hypothesis (ILH). The ILH argues that the greater the demand a vocabulary task places on an L2 learner, the more likely the target words are to be acquired (Laufer & Hulstijn, 2001). The ILH, as a major development in the field of L2 vocabulary research, has received a great deal of attention because it is clear, precise, and can be operationalized (Bao, 2015; Keating, 2008; Tang & Treffers-Daller, 2016; Zou, 2017).

The ILH consists of three components: need, search, and evaluation. Need is a noncognitive, motivational factor. Search is the attempt to find the relationship between the form and meaning of unknown words. Evaluation involves deciding on the appropriate word and its meaning in context. Each component is scored according to the degree to which a task induces it. If a component is absent (–), the score is 0; if it is moderate (+), the score is 1; and if it is strong (++), the score is 2. It is believed that the best result in learning new vocabulary is obtained through a task with the highest degree of involvement load (Marmol & Sanchez-Lafunte, 2013).

¹Department of ELT, Khouzestan Science and Research Branch, Islamic Azad University, Ahvaz, Iran
²Department of ELT, Ahvaz Branch, Islamic Azad University, Ahvaz, Iran

Corresponding Author:
Mohammad Taghi Farvardin, Department of English Language Teaching, Ahvaz Branch, Islamic Azad University, Ahvaz 6134937333, Iran.
Email: [email protected]

Creative Commons CC BY: This article is distributed under the terms of the Creative Commons Attribution 4.0 License (http://www.creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).

There are two reasons behind this study. First, as the ILH claims, the presence of an involvement component leads to more vocabulary learning than its absence. However, little research has been done to examine this issue (e.g., Bao, 2015). Second, the ILH proposes that the same amount of word learning is obtained when tasks induce the same involvement components, regardless of the type of vocabulary learning task. However, little research has been conducted to probe this issue either (e.g., Bao, 2015; Kim, 2008). Therefore, this study examined the effect of different output tasks on English as a foreign language (EFL) learners’ vocabulary knowledge.

Literature Review

The ILH argues that tasks inducing higher levels of need (N), search (S), and evaluation (E) are more effective for vocabulary learning than tasks with lower involvement loads (Laufer & Hulstijn, 2001). Need is a motivational construct dealing with the “need to achieve” (Laufer & Hulstijn, 2001, p. 14), while search and evaluation are cognitive constructs having to do with paying attention to the form–meaning relationship. Tasks may induce these involvement elements to three possible degrees: none, moderate, and strong. A task’s involvement load is the combination of these involvement elements, each of which can be absent or present, moderate or strong (Bao, 2015). A moderate involvement is given an index of 1 and a strong involvement receives an index of 2. It is argued that “the higher the scores of need, search, and evaluation are, the greater the involvement load in learning an unknown word is” (Bao, 2015, p. 85).

In assessing the ILH, L2 reading passages have been used as stimuli in most previous studies, following Laufer and Hulstijn (2001). Laufer and Hulstijn (2001) compared three incidental vocabulary learning tasks without controlling time on task: reading with marginal glosses, with an index of 1 (+N, –S, –E); fill in the blank, with an index of 2 (+N, –S, +E); and composition writing, with an index of 3 (+N, –S, ++E). The study investigated the effect of task type on the retention of 10 target words by EFL learners. The participants were 97 advanced EFL university learners in the Netherlands and 128 in Israel, forming six intact groups: three parallel groups in the Netherlands and three in Israel. The findings showed that the writing task yielded better results than the other two tasks. Likewise, the reading plus fill-in-the-blank task was more effective than the reading task. This experiment provided strong support for the ILH.
Therefore, it was claimed that the retention of unknown words depends on how much tasks involve learners through different degrees of need, search, and evaluation. However, their study lacked a control group and there was no control for word type. Moreover, substantiating or rejecting the ILH relies to some extent on the design of the tasks and the handling of task time (Webb, 2005). It should also be noted that the type of vocabulary knowledge measured can determine whether the ILH is accepted or rejected (Keating, 2008).

Most follow-up studies have provided inconclusive evidence for the ILH. Keating (2008), for example, investigated the effects of three tasks with different involvement loads on the vocabulary retention of 79 Spanish learners: a reading comprehension task (involvement load of 1), a reading plus fill-in task (involvement load of 2), and a sentence writing task (involvement load of 3). The results showed that the participants learned more words in the sentence writing task, which had the highest involvement load. Nevertheless, Keating (2008) did not include a control group in his study.

In another study, Marmol and Sanchez-Lafunte (2013) studied the effects of four types of tasks on EFL vocabulary learning. The participants were 28 primary school English as a second language (ESL) learners in Spain. Eighteen words, including six nouns, six adjectives, and six verbs, were selected randomly from a short story. The participants were assigned to four different tasks with different involvement loads: reading comprehension with marginal glosses, reading comprehension and gap-filling, writing with marginal glosses, and writing and dictionary use. All the participants took a receptive and a productive vocabulary test. The results showed that doing the task with the highest involvement load led to the best result in L2 vocabulary learning. Their study, however, lacked a control group, and the sample size was small.

Some researchers (e.g., Folse, 2006; Webb, 2005; Zou, 2017) have emphasized writing tasks as effective ones in vocabulary learning. In the same vein, Feng (2014) included a sentence writing task in his study. Feng (2014) examined the effects of three translation tasks on EFL learners’ vocabulary learning based on the ILH.
In this study, 30 verbs were selected from business documents to be taught to 60 EFL learners via three different translation tasks: translation-only mode, translation plus fill-in exercises, and translation plus sentence writing. The results indicated that sentence writing could significantly improve passive and active word learning and retention, whereas the translation-only task had the lowest effect. However, Feng’s (2014) study lacked a control group, did not control for word type, and did not provide sufficient information for power analyses.

More recently, Bao (2015) investigated how task type affects EFL learners’ vocabulary knowledge. To this end, 153 Chinese EFL learners (144 females and 9 males) were selected. The participants learned 18 target words through five tasks: control (involvement load of 0), definition (involvement load of 2), combining (involvement load of 2), translation (involvement load of 2), and sentence writing (involvement load of 3). The results revealed that all tasks were significantly better than the control task. Although Bao (2015) included a control group and the sentence writing task, he did not examine the effects of different writing tasks such as paragraph writing on EFL learners’ vocabulary knowledge.

Tang and Treffers-Daller (2016) have recently examined the effects of different tasks on L2 incidental vocabulary learning based on the predictions of the ILH. To this end, 230 Chinese EFL learners whose proficiency was at A2 level on the Common European Framework of Reference for languages (CEFR) were selected. Six different tasks with different involvement loads were designed. The results showed that tasks with a higher involvement load were significantly better than tasks with a lower involvement load in both the immediate and the delayed posttests. However, like most previous studies, this study suffers from the lack of a control group and insufficient information for power analyses.

In the Iranian context, few studies have been conducted on the effect of task types and involvement indices on EFL learners’ vocabulary learning. Yaqubi, Rayati, and Allemzade Gorgi (2010), for instance, randomly assigned 60 EFL learners to three groups: Group 1 completed an input-oriented task with an involvement load of 3, Group 2 was given the same type of task but with an involvement load of 2, and Group 3 completed an output-oriented task with an involvement load of 3. The results were contrary to the prediction of the ILH; that is, Task 2 was superior to Task 1, which had a higher index. Moreover, the learners who had completed Task 3 did significantly better than those who did Task 1, despite the equivalence of their indices. However, this study lacked a control group, had a small sample size, provided no information for power analyses, and did not measure the participants’ productive knowledge.

In the same vein, Soleimani and Rahmanian (2015) randomly assigned 33 Iranian EFL learners to three groups: fill in the blank (involvement load of 1), reading comprehension (involvement load of 2), and sentence writing (involvement load of 3). The results showed that the sentence writing task was significantly better than the other two tasks. The results supported the ILH assumptions.
Nevertheless, this study suffers from some limitations such as no control for word type, a small sample size, no measurement of productive knowledge, and insufficient information for power analyses.

The present study attempted to overcome the limitations of the previous studies, such as the lack of a control group, the failure to measure both receptive and productive vocabulary knowledge, and the lack of a distinction between input and output orientation. In addition, other tasks like paragraph writing and writing a word definition have barely been used in the literature. Finally, little research has been conducted to examine the hypothesis that the same presence of an involvement component leads to the same amount of word learning. Moreover, this study focused on output tasks and their effectiveness in L2 vocabulary learning based on the predictions of the ILH. To fulfill the objectives of the study, the following research questions were raised:

Research Question 1: Are different output tasks (i.e., paragraph writing, sentence writing, translation, combining, and fill in the blank) all conducive to EFL learners’ receptive vocabulary knowledge?
Research Question 2: Are different output tasks (i.e., paragraph writing, sentence writing, translation, combining, and fill in the blank) all conducive to EFL learners’ productive vocabulary knowledge?
Research Question 3: Which output task (i.e., paragraph writing, sentence writing, translation, combining, and fill in the blank) is more effective in enhancing EFL learners’ receptive vocabulary knowledge?
Research Question 4: Which output task (i.e., paragraph writing, sentence writing, translation, combining, and fill in the blank) is more effective in enhancing EFL learners’ productive vocabulary knowledge?

Method

This experimental study adopted a pretest/posttest control group design. The research was based on five different output tasks (paragraph writing, sentence writing, combining, fill in the blank, and translation) and a control task. The between-subjects factor was task type.

Participants

Initially, a total of 144 (78 female and 66 male) third-grade junior high school students participated in this study. They were selected from two high schools in Ahvaz, Iran. The participants’ ages ranged between 14 and 16 (M = 14.7, SD = 0.35). They had not studied English abroad or in an English institute, which meant their English background was restricted to school. The participants were equally assigned to six vocabulary learning tasks (n = 24 each) through simple random sampling. However, after the delayed posttest, the number of participants was reduced to 130 (70 female and 60 male). The final number of participants in each vocabulary learning task was as follows: paragraph writing (n = 20), sentence writing (n = 22), translation (n = 24), fill in the blank (n = 21), combining (n = 23), and control (n = 20).

As the Oxford Placement Test (OPT) is easy to administer and has well-established reliability and validity (Allan, 2004), it was adopted in this study. The OPT includes two sections, grammar and listening, each of which consists of 100 items. The required time to complete the test is 60 min. Each correct item received 1 point; therefore, the maximum possible score was 200. The participants’ scores ranged from 105 to 118 (M = 111.36, SD = 2.75), suggesting that they were at the elementary level of English proficiency.
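The equal random assignment of the 144 initial participants to six tasks can be illustrated with a short sketch. The article does not describe how the assignment was carried out in practice, so the function name `assign_equally`, the numeric participant IDs, and the fixed seed below are assumptions made purely for illustration.

```python
import random

# The six task groups named in the study.
GROUPS = [
    "paragraph writing", "sentence writing", "translation",
    "fill in the blank", "combining", "control",
]


def assign_equally(participant_ids, groups, seed=None):
    """Shuffle participants and split them into equally sized groups.

    Hypothetical helper: the original assignment procedure is not reported.
    """
    ids = list(participant_ids)
    if len(ids) % len(groups) != 0:
        raise ValueError("participants cannot be split equally across groups")
    rng = random.Random(seed)  # local generator; the seed only aids reproducibility
    rng.shuffle(ids)
    size = len(ids) // len(groups)
    return {g: ids[i * size:(i + 1) * size] for i, g in enumerate(groups)}


# Participant IDs 1..144 are placeholders, not data from the study.
assignment = assign_equally(range(1, 145), GROUPS, seed=2017)
for group, members in assignment.items():
    print(group, len(members))
```

Each group receives 24 participants, matching the initial design; the later attrition to 130 participants is not modeled here.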

Instruments and Materials

Target words. In this study, first, 42 English target words, equally distributed across parts of speech, were randomly selected from 504 Absolutely Essential Words (Bromberg, Lieb, & Traiger, 2011). This book includes 42 lessons, and one word was randomly selected from each lesson. Equal numbers of each part of speech were selected to control for the possible confounding of part of speech with task type and to enhance the generalizability of the findings. Moreover, the target words were selected from 504 Absolutely Essential Words because its words are a little beyond the participants’ proficiency level; in other words, the book is appropriate for intermediate-level EFL learners. The reason behind this was to ensure that the target words would be unknown to all participants. However, four experienced EFL teachers confirmed that the sentences in which the target words were used in the main study were appropriate for elementary-level EFL learners. To select the target words, a test was also administered to 60 nonparticipants at another high school. The testees were asked to write the Persian equivalents of the words. The nonparticipants were at the same elementary level as the participants in the experiment. After administering the test, 30 words (i.e., 10 nouns, 10 verbs, and 10 adjectives) unknown to all the testees were selected as the target words (see Appendix A).

EFL reading task. In this study, 30 sentences, in each of which a target word was embedded, were adapted from the Cambridge Advanced Learner’s Dictionary (Walter, Woodford, & Good, 2008; see Appendix B). The Persian equivalents were taken from The Aryanpur Progressive English–Persian Dictionary (Aryanpur Kashani & Aryanpur, 2008). The teacher-researchers and two other EFL teachers checked the appropriacy of the vocabulary and syntax of the reading sentences for the participants. The 30 sentences were randomly divided into three sets; each set was presented in one session. Each set included 10 sentences and 10 target words. After each sentence, the corresponding gloss for the target word was provided in brackets. All the groups received the sentences in the same random order.
A sample sentence is presented below:

Her talent for music showed at an early age. (talent: n. استعداد، ذوق)

In this example, the target word is talent and its gloss provides its part of speech (a noun) and Persian translation (استعداد، ذوق).

Vocabulary knowledge test. This study adopted Min’s (2008) four-item Vocabulary Knowledge Scale (VKS), in which the unknown and known word categories are separated (see Appendix C). “Categories of VKS offer no clues to the target words and thus can more accurately reflect the students’ knowledge about the target words” (Min, 2008, p. 85). In this study, both the receptive and productive vocabulary knowledge of the participants were measured: Categories III and IV in the VKS measured receptive and productive vocabulary knowledge, respectively.

Procedures

This study followed the procedure of Bao’s (2015) study. First, the participants were randomly assigned to five experimental groups, each receiving a vocabulary learning task (i.e., paragraph writing, sentence writing, translation, fill in the blank, or combining), and a control group receiving a control task. The tasks other than the control task induced the same or different involvement loads so that the contribution of each task type to vocabulary learning could be tested. The paragraph writing task induced an involvement load index of 4 (+N, +S, ++E), the sentence writing task induced an involvement load index of 3 (+N, +S, +E), and the combining, fill in the blank, and translation tasks all induced an involvement load index of 2 (+N, –S, +E).

Two weeks before the study, all participants took the VKS. In the main study, the experimental groups were taught the target words by the teacher-researchers in three sessions. In each session, 10 target words were taught through a vocabulary learning task presented immediately after a reading task in which the target words appeared. For the control group, the task included meaning-matching exercises in which the test words were not the target words. For each set of exercises, 10 test words had to be matched with 12 definitions. There were more definitions than test words to reduce guessing.

For the sentence and paragraph writing tasks, the participants were required to write semantically acceptable and grammatically correct sentences or paragraphs in 15 min. The teacher-researchers were available to answer any questions, and the participants had access to both monolingual and bilingual dictionaries. In both groups, the participants did the tasks individually. For the combining task, each sentence was segmented into separate parts, and the combining group was required to combine all segments into a grammatically correct sentence. The participants in the translation group had to translate the same sentences from L2 to L1. The participants could use an English–Persian dictionary during each task. For the fill-in-the-blank task, the participants were to fill in the blanks with the appropriate given (target) words.

At the end of the last treatment session, the immediate posttest was administered in 20 min. To assess the participants’ long-term retention of the target words, a delayed posttest was also administered 1 month after the experiment.
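The involvement load indices assigned to the six tasks follow directly from the ILH scoring scheme (absent = 0, moderate = 1, strong = 2, summed over need, search, and evaluation). The sketch below reproduces that arithmetic; the names `DEGREE_SCORES`, `TASKS`, and `involvement_load` are illustrative choices, and an ASCII hyphen stands in for the minus sign used in the text.

```python
# Score for each degree of an involvement component under the ILH:
# absent (-) = 0, moderate (+) = 1, strong (++) = 2.
DEGREE_SCORES = {"-": 0, "+": 1, "++": 2}

# (need, search, evaluation) degrees for the six tasks, as listed in the text.
TASKS = {
    "paragraph writing": ("+", "+", "++"),
    "sentence writing": ("+", "+", "+"),
    "combining": ("+", "-", "+"),
    "fill in the blank": ("+", "-", "+"),
    "translation": ("+", "-", "+"),
    "control": ("-", "-", "-"),
}


def involvement_load(need, search, evaluation):
    """Return the involvement load index: the sum of the three component scores."""
    return sum(DEGREE_SCORES[degree] for degree in (need, search, evaluation))


INDICES = {task: involvement_load(*degrees) for task, degrees in TASKS.items()}

for task, index in INDICES.items():
    print(f"{task}: {index}")
```

Under this scheme the three tasks with components (+N, –S, +E) necessarily share the same index, which is what allows the study to compare task types while holding involvement load constant.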

Scoring and Data Analysis

In this study, Min’s (2008) scoring was followed to measure the participants’ receptive and productive vocabulary knowledge of the target words. For Categories III and IV, each response received either 1 point or 0 points. Category III measured the receptive knowledge of a target word. If the given synonym or translation was incorrect, or no response was given, 0 points were awarded; a correct synonym or translation received 1 point. Category IV measured the productive knowledge of a target word. If no point was given for Category III, 0 points were given for Category IV. If a target word was used inappropriately or ungrammatically in the sentence context, 0 points were given. One point was given when the target word was used in a semantically and grammatically correct way in the sentence, even if other parts of the sentence contained errors.

The participants’ responses were scored independently by two experienced EFL teachers. In the pretest, the Cohen’s kappa interrater reliability indices for the receptive and productive tests were 0.98 and 0.95, respectively. Then, the two raters discussed all discrepancies in scoring until they reached unanimous agreement. Hence, the interrater agreement was 100% in the immediate and delayed posttests. To address the research questions, a mixed 3 × 6 ANOVA, one-way ANOVAs, and Tukey post hoc tests were run on the receptive and productive test scores separately. The significance level was set at .05.

Table 1. Results of Mixed ANOVA on Receptive Vocabulary Tests.

Source              Type III SS   MS          df    F           Significance   Partial η²
Between-subject
  Task type         1,561.792     312.358     5     17.163      .000           .409
  Error             2,256.682     18.199      124
Within-subject
  Time              15,793.102    7,896.551   2     1,724.471   .000           .933
  Time × Task type  848.790       84.879      10    18.536      .000           .428
  Error (time)      1,135.620     4.579       248

Note. SS = sum of squares; MS = mean square.
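The scoring rule for Categories III and IV, and the Cohen's kappa index used to check interrater reliability, can be sketched as follows. This is a minimal illustration rather than the authors' scoring procedure: the function names are invented here, and the two rater vectors in the example are hypothetical values, not data from the study.

```python
from collections import Counter


def score_item(receptive_correct, productive_correct):
    """Score one target word on Categories III (receptive) and IV (productive).

    Each category is worth 0 or 1 point, and Category IV earns no point
    unless Category III earned one.
    """
    receptive = 1 if receptive_correct else 0
    productive = 1 if (receptive and productive_correct) else 0
    return receptive, productive


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical scores of the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty item list")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical 0/1 scores from two raters for ten responses (illustration only).
rater_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
rater_2 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
print(score_item(True, False), cohen_kappa(rater_1, rater_2))
```

Kappa corrects raw agreement for chance, which is why it is reported for the pretest instead of a simple percentage of matching scores.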

Results

First, the normality of the data was checked through the Shapiro–Wilk test. All significance values in the test were above the significance level of .05, implying a normal distribution of the data. The results also revealed that the paragraph writing group had the highest mean scores on both receptive and productive measures of vocabulary knowledge in the immediate and delayed posttests. Then, a mixed 3 × 6 ANOVA, with time (i.e., pretest, immediate posttest, and delayed posttest) and task type (i.e., paragraph writing, sentence writing, translation, fill in the blank, combining, and control) as the two main factors, was conducted. A one-way ANOVA along with post hoc Tukey tests was also conducted for the overall comparison of the six groups on all tests.

Results of the mixed ANOVA on the receptive tests are displayed in Table 1. The results show a significant main effect of task type, F(5, 124) = 17.163, p < .001, partial η² = .409. There were also a significant effect of time, F(2, 248) = 1,724.471, p < .001, partial η² = .933, and a significant interaction between time and task type, F(10, 248) = 18.536, p < .001, partial η² = .428.
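The F ratios and partial eta squared values reported for the receptive tests can be recomputed from the sums of squares and degrees of freedom in Table 1, using F = MS_effect / MS_error and partial η² = SS_effect / (SS_effect + SS_error). The sketch below does that arithmetic; the function name `anova_effect` is an illustrative choice, and the inputs are the values printed in Table 1.

```python
def anova_effect(ss_effect, df_effect, ss_error, df_error):
    """Return (mean square, F ratio, partial eta squared) for one effect."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    f_ratio = ms_effect / ms_error
    partial_eta_sq = ss_effect / (ss_effect + ss_error)
    return ms_effect, f_ratio, partial_eta_sq


# Sums of squares and degrees of freedom as reported in Table 1.
task_type = anova_effect(1561.792, 5, 2256.682, 124)       # between-subject effect
time_effect = anova_effect(15793.102, 2, 1135.620, 248)     # within-subject effect
interaction = anova_effect(848.790, 10, 1135.620, 248)      # time x task type

for name, (ms, f, eta) in [
    ("Task type", task_type),
    ("Time", time_effect),
    ("Time x Task type", interaction),
]:
    print(f"{name}: MS = {ms:.3f}, F = {f:.3f}, partial eta squared = {eta:.3f}")
```

The between-subject effect is tested against the between-subject error term, while time and the interaction are both tested against the within-subject error term, which is why the last two calls share the same error values.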

The effect size for the mixed ANOVA is calculated by partial eta squared (Pallant, 2007). Moreover, it is recommended that L2 researchers adopt the following benchmarks to interpret the practical significance of L2 research effects more precisely: small = 0.40, medium = 0.70, and large = 1.00 (Plonsky & Oswald, 2014). Therefore, the results suggest an almost large effect size for time and a small effect size for task type.

To further examine the differences between the groups, a one-way ANOVA was performed for each receptive test. No significant difference was found in the pretest, F(5, 124) = 0.117, p = .988. However, significant differences were found in the immediate posttest, F(5, 124) = 24.546, p < .001, and the delayed posttest, F(5, 124) = 15.629, p < .001. Table 2 illustrates the findings of post hoc analyses on the posttests.

Table 2 shows that on the immediate posttest, the paragraph writing group did significantly better than all groups but the sentence writing group. Moreover, the sentence writing group outperformed the combining and control groups. All experimental groups also significantly outperformed the control group. Post hoc tests also showed that the paragraph writing group obtained significantly higher scores than the combining and control groups on the delayed posttest (p < .001). Moreover, all groups except the combining group significantly outperformed the control group on the delayed posttest (p < .001).

A mixed 3 × 6 ANOVA, with time (i.e., pretest, immediate posttest, and delayed posttest) and task type (i.e., paragraph writing, sentence writing, translation, fill in the blank, combining, and control) as the two main factors, was also conducted on the productive vocabulary tests. In addition, a one-way ANOVA along with post hoc Tukey tests was run for the overall comparison of the six groups on all tests. Results of the mixed ANOVA on the productive tests are shown in Table 3. The results showed a significant main effect of task type, F(5, 124) = 9.985, p < .001, partial η² = .287. There were also a significant effect of time, F(2, 248) = 726.611, p < .001, partial η² = .854, and a significant interaction between time and task type, F(10, 248) = 14.227, p < .001, partial η² = .365. The results suggest a medium effect size for time and small effect sizes for both task type and the interaction between time and task type.

Table 2. Results of Tukey Post Hoc Tests on Receptive Posttests.

(I) Group             (J) Group           Mean difference (I-J)   SE      Significance
Immediate posttest
  Paragraph writing   Sentence writing    1.077                   1.047   .907
  Paragraph writing   Translation         3.225                   1.026   .025
  Paragraph writing   Fill in the blank   3.540                   1.058   .014
  Paragraph writing   Combining           6.089                   1.036   .000
  Paragraph writing   Control             10.250                  1.071   .000
  Sentence writing    Translation         2.148                   1.000   .270
  Sentence writing    Fill in the blank   2.463                   1.033   .170
  Sentence writing    Combining           5.012                   1.010   .000
  Sentence writing    Control             9.173                   1.047   .000
  Translation         Fill in the blank   0.315                   1.012   1.00
  Translation         Combining           2.864                   0.988   .50
  Translation         Control             7.025                   1.026   .000
  Fill in the blank   Combining           2.549                   1.022   .134
  Fill in the blank   Control             6.710                   1.058   .000
  Combining           Control             4.164                   1.036   .001
Delayed posttest
  Paragraph writing   Sentence writing    0.895                   1.105   .965
  Paragraph writing   Translation         3.058                   1.083   .060
  Paragraph writing   Fill in the blank   2.564                   1.118   .204
  Paragraph writing   Combining           5.720                   1.094   .000
  Paragraph writing   Control             8.350                   1.131   .000
  Sentence writing    Translation         2.163                   1.056   .322
  Sentence writing    Fill in the blank   1.669                   1.092   .646
  Sentence writing    Combining           4.824                   1.067   .000
  Sentence writing    Control             7.455                   1.105   .000
  Translation         Fill in the blank   –0.494                  1.069   .997
  Translation         Combining           2.661                   1.044   .118
  Translation         Control             5.292                   1.083   .000
  Fill in the blank   Combining           3.155                   1.080   .000
  Fill in the blank   Control             5.786                   1.118   .000
  Combining           Control             2.630                   1.094   .163

p < .05.

Table 3. Results of Mixed ANOVA on Productive Vocabulary Tests.

Source              Type III SS   MS          df    F           Significance   Partial η²
Between-subject
  Task type         315.130       63.026      5     9.985       .000           .287
  Error             782.667       6.312       124
Within-subject
  Time              1,817.093     908.546     2     726.611     .000           .854
  Time × Task type  177.899       17.790      10    14.227      .000           .365
  Error (time)      310.096       1.250       248

Note. SS = sum of squares; MS = mean square.

To further examine the differences between the groups, a one-way ANOVA was performed for each productive test. No significant difference was found in the pretest, F(5, 124) = 0.072, p = .996. However, significant differences were found in the immediate posttest, F(5, 124) = 9.301, p < .001, and the delayed posttest, F(5, 124) = 19.562, p < .001. Table 4 depicts the findings of post hoc analyses on the posttests.

Table 4 shows that, on the immediate posttest, the paragraph writing group did significantly better than all groups except the sentence writing group. Moreover, the sentence writing group outperformed the control group, although its advantage over the combining group was not significant (p = .633). It was also found that the translation and the fill in the blank groups were superior to the control group. Post hoc tests also showed that the paragraph writing group obtained significantly higher scores than all groups on the delayed posttest. Moreover, the paragraph writing, the sentence writing, and the translation groups significantly outperformed the control group on the delayed posttest (p < .05).

Table 4. Results of Tukey Post Hoc Tests on Productive Posttests.

(I) Group             (J) Group           Mean difference (I-J)   SE      Significance
Immediate posttest
  Paragraph writing   Sentence writing    1.655                   .612    .081
  Paragraph writing   Translation         2.158                   .599    .006
  Paragraph writing   Fill in the blank   2.271                   .618    .005
  Paragraph writing   Combining           2.570                   .605    .001
  Paragraph writing   Control             4.150                   .626    .000
  Sentence writing    Translation         0.504                   .584    .955
  Sentence writing    Fill in the blank   0.617                   .604    .910
  Sentence writing    Combining           0.915                   .590    .633
  Sentence writing    Control             2.495                   .612    .001
  Translation         Fill in the blank   0.113                   .591    1.00
  Translation         Combining           0.411                   .578    .980
  Translation         Control             1.992                   .599    .015
  Fill in the blank   Combining           0.298                   .597    .996
  Fill in the blank   Control             1.879                   .618    .034
  Combining           Control             1.580                   .605    .102
Delayed posttest
  Paragraph writing   Sentence writing    1.891                   .550    .010
  Paragraph writing   Translation         3.217                   .539    .000
  Paragraph writing   Fill in the blank   3.610                   .556    .000
  Paragraph writing   Combining           3.800                   .545    .000
  Paragraph writing   Control             5.000                   .563    .000
  Sentence writing    Translation         1.326                   .526    .126
  Sentence writing    Fill in the blank   1.719                   .543    .024
  Sentence writing    Combining           1.909                   .531    .006
  Sentence writing    Control             3.109                   .550    .000
  Translation         Fill in the blank   .393                    .532    .977
  Translation         Combining           .583                    .520    .871
  Translation         Control             1.783                   .539    .015
  Fill in the blank   Combining           –.190                   .538    .999
  Fill in the blank   Control             1.390                   .556    .132
  Combining           Control             1.200                   .545    .243

p < .05.

Discussion

The results revealed that all output tasks were significantly better than the control task on the immediate receptive vocabulary test. Moreover, the findings showed that the paragraph writing task was more effective than all tasks except the sentence writing task. In addition, the sentence writing group outperformed the combining and control groups. The relative efficacy of the tasks can be summarized as follows: paragraph writing (index = 4) ≥ sentence writing (index = 3) ≥ translation (index = 2) = fill in the blank (index = 2) ≥ combining (index = 2). However, some discrepancies were detected when the patterns for each type of vocabulary knowledge were examined separately.

In the immediate receptive vocabulary test, the paragraph writing task was better than the other tasks except the sentence writing task, implying that the higher the involvement load of a task, the more effective the task is for L2 vocabulary learning. The findings of this study are in line with Laufer and Hulstijn (2001), who found that the effectiveness of an output task depends on its involvement load. The fill in the blank, translation, and combining tasks had the same involvement load components (+N, –S, +E). Although the translation and fill in the blank tasks showed no significant difference, the combining task had the lowest score in the immediate and delayed posttests. However, the findings of this study are not in line with some previous studies supporting the ILH (e.g., Bao, 2015; Keating, 2008; Marmol & Sanchez-Lafunte, 2013). Comparing the groups’ performance in the immediate and delayed posttests revealed that the paragraph writing task, with an involvement index of 4, was not superior to the sentence writing task, with an involvement index of 3. In addition, the participants who had completed the fill in the blank and translation tasks, with an involvement index of 2, outperformed those who did the combining task, even though its involvement index is also 2. In other words, the combining task, with the involvement load components (+N, –S, +E), had the lowest score in comparison with the fill in the blank and translation tasks, which have the same components.

In the combining task, the learners were required to arrange a string of words into a proper sentence. Hence, the learners might not have paid special attention to the target words. Moreover, the participants might have been too preoccupied with combining the sentences per se to notice the semantic and syntactic aspects of the target words. The disconnected strings of words given to the participants might have significantly diminished the contextual clueing, which led the combining task to be the least effective output task.

The results of the delayed receptive vocabulary test, however, revealed that the paragraph writing task was similar to the sentence writing, translation, and fill in the blank tasks. The reason might be the opportunity the participants had to “retrieve more grammatical information about those target words whose meanings they remembered” (Bao, 2015, p. 92).

Regarding productive vocabulary knowledge, paragraph writing was better than the other tasks on both the immediate and delayed productive tests. The reason might be that it drew the participants’ attention more to both the semantic and grammatical features of the target words, which might have helped them remember the meaning of the words better. Moreover, it can be argued that when participants wrote a paragraph with the target words, they evaluated the appropriacy of using the target words in context. The sentence writing task, however, was not superior to the other output tasks on the immediate posttest. The results revealed that “the same involvement loads did not necessarily lead to similar vocabulary learning, nor did the higher involvement loads necessarily lead to better word learning” (Bao, 2015, p. 91). This may imply that a writing task might be superior to other output tasks in productive vocabulary knowledge when a more challenging writing task, such as writing a paragraph or essay, is required. In this regard, Keating (2008) found that the sentence writing task had no superiority over the reading plus fill-in task on the immediate recall test. Keating (2008) argued that “producing connected sentences might involve more elaborate processing of the target words than disconnected sentences” (Bao, 2015, p. 93). The findings of the present study confirm Keating’s (2008) claim about the higher effectiveness of more complex writing tasks such as paragraph writing on EFL learners’ productive vocabulary knowledge. The results are also in line with Williams’ (2012) claim that writing tasks can push learners to evaluate and analyze their linguistic knowledge and hence improve their L2 vocabulary knowledge. Therefore, it can be concluded that writing tasks can encourage EFL learners to extend their knowledge of new lexical items (McDonough & Fuentes, 2015; Zou, 2017).

In the delayed productive test, the paragraph writing task was also superior to the other tasks in the retention of the target words.

As the results showed, a decline in the mean scores of all groups was observed from the first posttest to the second posttest, implying that some newly learned words are often forgotten after a few days. Hence, “reinforcement of newly learned words is still needed if students are to remember them in the longer term, regardless of the amount of involvement load of the vocabulary learning task” (Tang & Treffers-Daller, 2016, p. 138). Therefore, it seems necessary to take other factors, such as attention, contextual clueing, and learners’ awareness of the target words, into account.

Conclusion

The results of this study confirm the predictions of ILH that tasks with a higher involvement load better help EFL learners recall and retain the target words. The findings also have important implications for EFL practitioners. Teachers can arrange activities that help students develop their vocabulary through tasks with a high involvement load and strong evaluation. Moreover, teachers should create opportunities for learners to expand their vocabulary knowledge and to use unknown words meaningfully through tasks such as paragraph writing and sentence writing. Furthermore, teachers can increase the effectiveness of vocabulary learning by noticing and considering the learners’ needs. EFL teachers may sometimes integrate various tasks, for instance, translation and sentence writing tasks, to draw on the merits of each task for vocabulary learning.

The findings of the present study are subject to a number of limitations. First, this study was conducted in Iran, an EFL context. Hence, future studies can examine the effectiveness of different vocabulary learning tasks in ESL and other EFL contexts. Second, the effect of each output task was studied separately. Thus, researchers can integrate the output tasks and investigate their effect on EFL learners’ productive and receptive vocabulary knowledge. Third, this study explored the predictions of ILH in vocabulary learning tasks. However, the role of other factors, such as learners’ awareness of target words and contextual clues in EFL vocabulary learning, was not examined. Therefore, further studies are needed to investigate other influential factors in L2 vocabulary learning. Finally, future studies can compare and contrast ILH with other theoretical frameworks such as Technique Feature Analysis (Nation & Webb, 2011) to better examine the predictive power of these frameworks.

Appendix A

Target Words

Nouns: decade, disaster, event, homicide, lecture, menace, obstacle, panic, talent, vicinity.
Verbs: anticipate, commence, conceal, detect, detest, excel, perish, pursue, squander, vanish.
Adjectives: excessive, frigid, massive, mediocre, rural, urgent, thorough, vacant, vague, weary.

Appendix B

Sentences in the EFL Reading Task

1. Her talent for music showed at an early age.
2. The hospital has no vacant beds.
3. I tried to conceal my surprise when she told me her age.
4. Few plants can grow in such a frigid weather.
5. The child vanished while on her way home from school.
6. Life in rural areas is simpler and cheaper.
7. Human ear cannot detect some sounds.
8. They did a thorough search of the area but found nothing.
9. The earth is a massive planet.
10. Parents don’t want their children going to mediocre schools.
11. There are several hotels in the vicinity of the station.
12. Rebecca always excelled in mathematics at school.
13. Drunk drivers are a menace to everyone.
14. Many people moved out of this city in the last decade.
15. Many people are in urgent need of food and water.
16. He commenced speaking before all the guests had finished eating.
17. The car was pursued by helicopters.
18. Carmel was in a panic about her exam.
19. Three hundred people perished in the earthquake.
20. I think he’s a little weary after his long journey.
21. At this stage, we can’t really anticipate what will happen.
22. The number of homicides in the city has highly increased.
23. I detest getting up early in the morning.
24. Do not squander your money by buying what you cannot use.
25. Excessive exercise can sometimes cause health problems.
26. It will be a disaster for me if I lose my job.
27. Who’s giving the lecture this afternoon?
28. I do have a vague memory of meeting her many years ago.
29. The Olympic Games are the biggest sporting event in the world.
30. The biggest obstacle in our way was a tree trunk in the road.

Appendix C

Modified Vocabulary Knowledge Scale (Min, 2008)

I. I don’t remember having seen this word before.
II. I have seen this word before, but I don’t know what it means.
III. I know this word. It means …………………. (Give the meaning in English or Persian.)
IV. I can use this word in a sentence. (Write a sentence.) (If you do this section, please also complete III.)

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

Allan, D. (2004). Oxford placement test. Oxford, UK: Oxford University Press.
Aryanpur Kashani, M. A., & Aryanpur, M. (2008). The Aryanpur progressive English-Persian dictionary. Tehran, Iran: AmirKabir Publications.
Bao, G. (2015). Task type effects on English as a foreign language learners’ acquisition of receptive and productive vocabulary knowledge. System, 53, 84-95.
Bromberg, M., Lieb, J., & Traiger, A. (2011). 504 absolutely essential words. Hauppauge, NY: Barron’s Educational Series.
Feng, T. (2014). Involvement load in translation tasks and EFL vocabulary learning. The New English Teacher, 9(1), 83-101.
Folse, K. S. (2006). The effect of type of written exercise on L2 vocabulary retention. TESOL Quarterly, 40, 273-293.
Keating, G. D. (2008). Task effectiveness and word learning in a second language: The involvement load hypothesis on trial. Language Teaching Research, 12, 365-386.
Kim, Y. (2008). The role of task-induced involvement and learner proficiency in L2 vocabulary acquisition. Language Learning, 58, 285-325.
Laufer, B., & Hulstijn, J. (2001). Incidental vocabulary acquisition in a second language: The construct of task-induced involvement. Applied Linguistics, 22, 1-26.
Marmol, G. A., & Sanchez-Lafunte, A. A. (2013). The involvement load hypothesis: Its effect on vocabulary learning in primary education. Revista Española de Lingüística Aplicada, 26, 11-24.
McCarty, M. (2005). Discourse analysis for language teachers. Cambridge, UK: Cambridge University Press.
McDonough, K., & Fuentes, C. G. (2015). The effect of writing task and task conditions on Colombian EFL learners’ language use. TESL Canada Journal, 32, 67-79.
Min, H. T. (2008). EFL vocabulary acquisition and retention: Reading plus vocabulary enhancement activities and narrow reading. Language Learning, 58, 73-115.
Nation, P. (2003). Vocabulary. In D. Nunan (Ed.), Practical English language teaching (pp. 129-152). New York, NY: McGraw-Hill.
Nation, P., & Meara, P. (2002). Vocabulary. In N. Schmitt (Ed.), An introduction to applied linguistics (pp. 35-54). London, England: Arnold.
Nation, P., & Webb, S. (2011). Researching and analyzing vocabulary. Boston, MA: Heinle.
Pallant, J. (2007). SPSS survival manual: A step-by-step guide to data analysis using SPSS version 15. New York, NY: McGraw-Hill.
Plonsky, L., & Oswald, F. L. (2014). How big is “big”? Interpreting effect sizes in L2 research. Language Learning, 64, 878-912.
Schmitt, N. (2000). Vocabulary in language teaching. Cambridge, UK: Cambridge University Press.
Soleimani, H., & Rahmanian, M. (2015). Vocabulary acquisition and task effectiveness in involvement load hypothesis: A case in Iran. International Journal of Applied Linguistics & English Literature, 4, 198-205.
Tang, C., & Treffers-Daller, J. (2016). Assessing incidental vocabulary learning by Chinese EFL learners: Testing the Involvement Load Hypothesis. In G. Yu & Y. Yin (Eds.), Assessing Chinese learners of English: Language constructs, consequences and conundrums (pp. 121-148). London, UK: Palgrave.
Walter, E., Woodford, K., & Good, M. (Eds.). (2008). Cambridge advanced learner’s dictionary (3rd ed.). Cambridge, UK: Cambridge University Press.
Webb, S. (2005). Receptive and productive vocabulary learning: The effects of reading and writing on word knowledge. Studies in Second Language Acquisition, 27, 33-52.
Williams, J. (2012). The potential role(s) of writing in second language development. Journal of Second Language Writing, 21, 321-331.
Yaqubi, B., Rayati, R. A., & Allemzade Gorgi, N. (2010). The involvement load hypothesis and vocabulary learning: The effect of task types and involvement index on L2 vocabulary acquisition. Journal of Teaching Language Skills, 29, 145-163.
Zou, D. (2017). Vocabulary acquisition through cloze exercises, sentence-writing and composition-writing: Extending the evaluation component of the involvement load hypothesis. Language Teaching Research, 21, 54-75.

Author Biographies

Maryam Tahmasbi holds an MA in TEFL from the Islamic Azad University of Ahvaz, Iran. She has been teaching English for 20 years. Her research interests include L2 vocabulary learning and psycholinguistics.

Mohammad Taghi Farvardin is an assistant professor of TEFL at the Department of English Language Teaching, Islamic Azad University, Ahvaz Branch, Iran. His research interests include L2 vocabulary learning, reading in a second/foreign language, and psycholinguistics. He teaches at both undergraduate and postgraduate levels, and also supervises postgraduate students.