English Teaching, Vol. 71, No. 4, Winter 2016
DOI: 10.15858/engtea.71.4.201612.141
Effects of Task Complexity
on L2 Reading and L2 Learning
Jookyoung Jung
(UCL Institute of Education)
Jung, Jookyoung. (2016). Effects of task complexity on L2 reading and L2
learning. English Teaching, 71(4), 141-166.
Task-based language teaching (TBLT) has propelled much research into how task
type, condition, or demand affects L2 learners’ linguistic performance and language
learning. To date, however, TBLT has mainly been researched in connection with
learners’ production, while its applicability to L2 reading has largely been
unattended to. To fill this gap, the present study explored whether and how
cognitive complexity of L2 reading tasks would affect L2 English reading
comprehension and learning of target L2 constructions contained in the texts. The
study employed a pretest, posttest, delayed-posttest design with two treatment
sessions. The target features were 17 English unaccusative verbs and ten pseudo-words. Participants included 52 Korean college students learning L2 English who
were randomly assigned to either – or + complex condition. Reading comprehension
was measured with 14 multiple-choice items for each text, and learning of the target
constructions was assessed with a grammaticality judgment test and word form and
meaning recognition tests. The results of mixed-effects modeling indicated that
increased task complexity had limited effects on reading comprehension scores as
well as learning of the target unaccusative verbs. Also, task complexity had
significant negative effects on vocabulary form recognition scores in the delayed
posttest. The results are discussed in relation to models of task-based learning and
L2 reading.
Key words: TBLT, task complexity, L2 reading, L2 learning, mixed-effects
modeling
1. INTRODUCTION
In the 1980s, TBLT, in which tasks are deemed to serve as a platform where learners
can enjoy natural opportunities for meaning-oriented communication as well as a medium
for infusing focus on form (Ellis, 2003, 2009; Robinson, 2011; Skehan, 1998, 2009), was
proposed as a potential approach to L2 instruction, and it has attracted growing attention
ever since. A task can be defined as a meaning-oriented activity that requires learners to
use the target language (TL) in order to achieve a specified objective (Bygate, Skehan, &
Swain, 2001), and it serves not as a mere vehicle for delivering isolated linguistic features,
but as a platform where learners can enjoy opportunities for meaning-oriented TL use. The
role of task complexity, i.e., the cognitive task demands imposed on learners, has drawn
particular attention from researchers, especially with respect to how it influences learners’
linguistic performance and language learning that accrues from performing the task. To
date, however, the task-based approach has mainly been researched in connection with
learners’ oral and written production, while its applicability to receptive skills, such as L2
reading, has largely been neglected. In order to fill this gap in the literature, the present
study explored whether task complexity affected L2 reading comprehension and learning
of L2 constructions contained in the reading texts.
2. LITERATURE REVIEW
2.1. Task Complexity in TBLT
Within the TBLT framework, individual tasks must be designed, selected, and
sequenced so as to match the learner’s developmental stage and thereby facilitate optimal
L2 learning. Among various taxonomies on how to analyze and sequence pedagogic tasks,
two rival approaches, i.e., Skehan’s Limited Capacity Model (Skehan, 1998, 2009; Skehan
& Foster, 2001) and Robinson’s Cognition Hypothesis (Robinson, 2001, 2011), have
exerted a substantial impact on recent empirical research on task sequencing and grading.
From an attentional capacity perspective, both models attempt to explain and predict how
various features of tasks will affect task-generated cognitive demands and in so doing the
allocation of learners’ attention during task completion. In the Limited Capacity Model,
Skehan proposes that the level of task demands depends on three task-related factors: (a)
code complexity, which pertains to linguistic complexity and variety involved in the task,
(b) cognitive complexity, which entails processing and computational requirements, and (c)
communicative stress, which includes time pressure, number of participants, opportunity to
control, and so on. Based on a single resource view, all of these factors are considered to
have an important bearing on how learners’ attention during a task will be shared out and
how task performance will be affected in terms of linguistic complexity, accuracy, and
fluency (henceforth, CAF). He also suggests competition between linguistic complexity
and accuracy triggered by the natural limitation of attentional resources.
Robinson, on the other hand, defines task complexity as “the result of the attentional,
memory, reasoning, and other information-processing demands imposed by the structure of
the task on the language learner” (Robinson, 2001, p. 28). As reflected in his definition,
Robinson claims that only task-inherent features, not linguistic elements involved in the
task, should be considered when determining task complexity. Robinson also proposes the
Multiple Attentional Resources Model based on Wickens’s (1992) work in cognitive
psychology, and claims that increasing task complexity, due to the existence of multiple
resource pools, can result in increased linguistic complexity without sacrificing
accuracy in learner production. The Multiple Attentional Resources Model also provides
motivation for the Cognition Hypothesis and its associated Triadic Componential
Framework. Within this framework, Robinson classifies task features into two dimensions,
i.e., resource-directing and resource-dispersing. Along the resource-directing dimension, a
task can become more demanding by increasing the number of elements involved, the
amount of reasoning required, or making reference to a displaced past-time event. The
cognitive and conceptual need to formulate complex content has the effect of channelling
learners’ attention towards lexical and grammatical encoding, which results in greater
complexity and accuracy while negatively affecting fluency. By contrast, a task can also
become more demanding along the resource-dispersing dimension by reducing the
planning time allowed to learners or using an unfamiliar task type, content, or structures. In
this case, learners’ attention is steered towards the consolidation of, and faster access to,
the existing interlanguage (IL) system, resulting in a trade-off between linguistic
complexity and accuracy. The Cognition Hypothesis further claims that increased task
complexity facilitates L2 development. According to Robinson, more complex tasks
encourage learners to seek more help from the provided input, which results in greater
depth of processing (Craik & Tulving, 1975) and long-term memory of input.
Motivated by these two models, many studies have delved into examining whether and
how manipulating task complexity affects learners’ task performance, interactional patterns,
and L2 development. The majority of studies have explored how varying levels of task
complexity moderated learners’ language production, mostly in terms of CAF (e.g., Foster
& Tavakoli, 2009; Michel, 2011, 2013; Révész, 2011). Studies on monologic speech
production have produced moderately converging findings: increasing task complexity has
small positive effects on accuracy and small negative effects on fluency and linguistic
complexity (see Jackson & Suethanapornkul, 2013). In the written mode, availability of
planning time seemed to have an effect, as learners were inevitably allowed more
time to prepare and adjust their production (e.g., Kormos & Trebits, 2012). Recently, task
complexity has also been shown to have differential effects on face-to-face versus
computer-mediated tasks (Baralt, 2013; Yilmaz, 2011).
Researchers have recently started to investigate whether and how task complexity
affected the incidence of various interactional features, such as negotiation of meaning
(e.g., confirmation check, comprehension check, and clarification request), self-repair or
modified output, language-related episodes (LREs), or learners’ uptake, all of which are
deemed conducive to L2 learning (e.g., Gilabert, Barón, & Llanes, 2009; Nuevo, 2006;
Révész, 2011; Révész, Sachs, & Mackey, 2011; Robinson, 2007). The results of the
studies have shown that, in general, cognitively complex tasks are likely to increase the
amount of negotiation of meaning and the number of LREs (see, however, Nuevo, 2006).
There are also studies that measured learning of a specific target form through engaging in
interactive tasks, mostly employing pretest-posttest-delayed posttest designs (e.g., Baralt,
2013; Kim & Tracy-Ventura, 2011; Nuevo, 2006; Révész, 2009). The results of these studies are
mixed: increasing task complexity had either no significant (e.g., Nuevo, 2006), marginal
(e.g., Baralt, 2013), or positive effects on L2 development (e.g., Kim & Tracy-Ventura,
2011; Révész, 2009).
A critical methodological problem in the existing research on task complexity is that
studies so far have largely neglected to verify the construct of task complexity, even
though it is fundamental to assure the internal validity of task complexity manipulations
(Révész, 2014). The few researchers who attempted to check the validity of task
complexity manipulations have typically asked learners to respond to post-hoc
questionnaires designed to elicit learners’ perceptions about how challenging and difficult
a task was. The information obtained was used as a basis for inferring the level of
cognitive demands induced by the task (e.g., Baralt, 2013; Gilabert et al., 2009; Révész,
2009; Robinson, 2001, 2005, 2007). However, such self-report measures remain open to
concerns about internal inconsistency and response bias. In order to examine whether the supposedly more
complex tasks are indeed more demanding to the learners, new research methods seem
highly recommendable, such as subjective time estimation, dual-task performance, and
physiological measures (Brünken, Plass, & Leutner, 2003; Norris & Ortega, 2009; Révész,
2014).
2.2. Task Effects on L2 Reading
Reading is a complex psycholinguistic process where a variety of associated component
skills come into play (Grabe, 2009; Khalifa & Weir, 2009; Urquhart & Weir, 1998). The
entire reading process is affected by the purpose of reading through making metacognitive
decisions regarding the scope, depth, and speed of reading required (Khalifa & Weir,
2009). In the case of L2 reading, the incomplete linguistic proficiency of learners may
render the importance of regulating the reading process across different tasks even more
prominent (Horiba, 2000, 2013; Jung, 2012; Taillefer, 1996). For instance, in Taillefer’s
(1996) and Jung’s (2012) studies, the amount of variance in L2 reading comprehension
accounted for by L2 proficiency decreased considerably in a less complex reading task (i.e.,
scanning) compared to a more complex task (i.e., reading for comprehension). In other
words, L2 linguistic processing may make a more marked contribution to a more complex
L2 reading task.
While there are not many studies on L2 reading for different purposes, Horiba’s (2000,
2013) studies that employed think-aloud protocols provide some useful insights. In her
studies that investigated how task instructions affected L1 and L2 readers’ text processing
and comprehension products, L1 readers were competent and flexible in controlling their
reading processes and allocating their cognitive resources strategically according to the
type of the text and the task, whereas L2 readers were not able to do so mainly due to
linguistic demands. Also, the task effects materialized more clearly at the level of text
processing than in the products of comprehension. In other words, different patterns of text
processing for different reading goals may not necessarily materialize in the outcome of
the reading, which underscores the need for employing concurrent process measures to
show more clearly how tasks affect reading processes.
In the field of SLA, Yoshimura (2006) investigated whether manipulating
foreknowledge of the expected task outcomes leads to different reading behavior, text
comprehension, and noticing of L2 form. The reading tasks in this study included reading
to memorize the text, reading to retell the text, and reading to visualize the imagery. After
reading the text, however, the post-reading tasks were not administered. Rather, the
participants were asked to (a) report their reading behaviour by completing a retrospective
questionnaire, (b) answer true-or-false comprehension check questions, and (c) fill in the
blanks in the text with appropriate verbs. The results showed that the output groups (i.e.,
reading for memorization and reading for retelling) used more diverse reading strategies. It
was further revealed that scores on the verb production test were higher for the
memorization group, lower for the retelling group and the lowest for the visualization
group. By contrast, comprehension scores were not significantly different across groups.
From the findings, Yoshimura suggested that learners’ reading processes could be affected
by foreknowledge of the required task output, and that different task instructions might
have a differential impact on language processing. More importantly, Yoshimura’s (2006)
study demonstrates that it is viable to use tasks to promote learners’ reading for acquisition
without interrupting reading for comprehension. That is, if the target construction is
regarded as essential for task completion, learners may pay more attention to TL features
in the text during reading.
The review of previous studies on the effects of task complexity suggests that the scope
of TBLT research has been confined to productive skills and thus needs to be expanded
into other language skills, such as L2 reading, for a more nuanced understanding of task
effects on L2 performance and learning. In addition, given the small number of studies on
how different task features affect L2 reading, more empirical investigations appear
imperative. To fill these gaps, the following research questions will be addressed in the
present study:
1. To what extent do the cognitive demands of second language reading tasks affect
reading comprehension?
2. To what extent do the cognitive demands of second language reading tasks affect
development in the knowledge of target language constructions?
3. METHODOLOGY
3.1. Design
This study examined the impact of task complexity on Korean speakers’ L2 English
reading and learning. As illustrated in Figure 1, the study employed a pretest, posttest and
delayed posttest design, with two treatment sessions. The participants were randomly
assigned to either – or + complex condition, and completed two treatment sessions. In each
session, participants read a passage taken from a TOEFL exam, while simultaneously
answering reading comprehension items. Development in the knowledge of target
constructions was measured with a grammaticality judgement test and vocabulary form
and meaning recognition tests. More detailed explanations of the research instruments and
procedures are provided in the following sections.
FIGURE 1
Experimental Design and Procedure (– Complex: n = 26; + Complex: n = 26)
Week 1, Session 1: Pretest, background questionnaire, & L2 proficiency test
Week 1, Session 2: Treatment 1 & post-reading questionnaire
Week 2, Session 3: Treatment 2, post-reading questionnaire, & immediate posttest
Week 4, Session 4: Delayed posttest & exit questionnaire
3.2. Participants
The participants comprised 14 male and 38 female undergraduate students enrolled in a
university in Korea. Their L1 was Korean and their average age was 22.84 years (SD =
1.94). They had no explicit instruction on the target construction (i.e., English unaccusative
verbs) prior to this study. To ensure the homogeneity of participants’ English ability, their
English proficiency level was measured with the Reading and Use of English section of a
practice Cambridge Proficiency: English (CPE) test, developed and provided by
University of Cambridge ESOL Examinations. Based on their scores, stratified random
sampling was applied in order to reduce sampling error and ensure equivalence among the
groups in terms of English proficiency.
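Stratified random assignment of this kind can be sketched as follows. This is an illustrative implementation only; the exact pairing procedure, participant identifiers, and random seed are assumptions, not details reported in the study:

```python
import random

def stratified_assign(scores, seed=0):
    """Split participants into two conditions, stratified by proficiency.

    scores: dict mapping participant ID -> proficiency score.
    Participants are ranked by score; each adjacent pair is then split
    at random between the two conditions, so the resulting groups have
    closely matched proficiency distributions.
    """
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get)
    groups = {"- complex": [], "+ complex": []}
    for i in range(0, len(ranked), 2):
        pair = ranked[i:i + 2]
        rng.shuffle(pair)
        groups["- complex"].append(pair[0])
        if len(pair) > 1:
            groups["+ complex"].append(pair[1])
    return groups
```

Ranking before splitting is what reduces sampling error relative to simple randomization: each condition receives one member of every score-adjacent pair.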
3.3. Texts
For the treatment of this study, two expository texts were selected from passages used
for real past TOEFL tests developed by the ETS. The texts were chosen based on two
criteria: (a) a sufficient number of occurrences of the target constructions and (b) a topic
difficult and/or unfamiliar to the participants. As shown in Table 1, the two texts were
comparable in terms of length and readability.1 Texts were presented to the participants in
a counter-balanced order.
TABLE 1
Text Characteristics
                      Text 1                Text 2
Title                 Petroleum Resources   The Cambrian Explosion
Number of words       682                   699
Average readability   11.6                  13.4
3.4. Targeted L2 Features
One target L2 feature of the present study was the English unaccusative construction,
which Korean learners have been reported to have persistent difficulty in acquiring (e.g.,
Chung, 2014; Hwang, 1999, 2001; Lee, Miyata, & Ortega, 2008; No & Chung, 2006). Ten
pseudo-words were additionally included in order to examine the effects of task
complexity on incidental learning of lexical items.
1 Average readability was calculated from various readability indices, including the Flesch-Kincaid grade level, Gunning-Fog score, Coleman-Liau index, SMOG index, and Automated Readability index.
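A minimal sketch of the averaging described in this footnote, using two of the named indices whose formulas are standard (Flesch-Kincaid grade and the Automated Readability index); the word, sentence, syllable, and character counts below are illustrative inputs, and the counting tool actually used in the study is not specified:

```python
def flesch_kincaid_grade(words, sentences, syllables):
    # Standard FK grade formula: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def automated_readability_index(characters, words, sentences):
    # Standard ARI formula: 4.71*(chars/words) + 0.5*(words/sentences) - 21.43
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43

def average_readability(grade_estimates):
    # The footnote's "average readability": the mean of several grade-level indices.
    return sum(grade_estimates) / len(grade_estimates)
```

Averaging several indices smooths out the idiosyncrasies of any single formula, which is presumably why the study reports one composite value per text.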
3.4.1. English unaccusative verbs
Perlmutter (1978) first introduced the Unaccusativity Hypothesis in which intransitive
verbs are classified into either unergatives or unaccusatives. Whereas an unergative verb
assigns an agent-role of a volitional act to its subject, the subject of an unaccusative verb
lacks volitional control, performing a patient-role (e.g., Unergative: Mary danced.;
Unaccusative: The snow melted.). SLA researchers have consistently found that L2
learners of English tend to overuse passive structures with unaccusative verbs (e.g., The
car was disappeared.). Multiple factors may be partly responsible for this learnability
problem. For example, some researchers (Hwang, 1999, 2001) suggest that if an
unaccusative verb has its transitive counterpart (e.g. ship, change, close, break), then L2
English learners may have a stronger tendency to make overpassivization errors than with
non-alternating unaccusative verbs (e.g. happen, result, arrive, disappear). Lack of a
conceptualizable agent (Ju, 2000), L1 transfer (No & Chung, 2006), and low input
frequency (Lee, Miyata, & Ortega, 2008) have also been explored as potential factors
increasing the difficulty in acquiring English unaccusativity.
In the present study, 17 English unaccusative verbs were identified from the two
treatment texts and selected as target constructions. As illustrated in Table 2, all target
verbs included in the texts were low frequency, and hence the participants were expected
to have limited knowledge of those verbs. Out of the 17 verbs, six were non-alternating unaccusative verbs, whereas the rest were alternating. Each of the unaccusative
verbs appeared in the texts once.
TABLE 2
Target English Unaccusative Verbs

Text 1
   Unaccusative verb   Alternating   Frequency (per 450 million)
 1 decompose           A                 312
 2 subside             NA                568
 3 ascend              A                 759
 4 accumulate          A                1814
 5 cease               A                2554
 6 diminish            A                2701
 7 drift               NA               4477
 8 collect             A               10525
 9 settle              A               10873

Text 2
   Unaccusative verb   Alternating   Frequency (per 450 million)
 1 fossilize           A                  11
 2 date to             A                 743
 3 originate           A                1022
 4 consist of          NA               2140
 5 persist             NA               2684
 6 evolve              A                3184
 7 disappear           NA               7581
 8 emerge              NA               9116

Note. A = alternating; NA = non-alternating.
3.4.2. Pseudo-words
In addition to the English unaccusative verbs, ten lexical items were also included as
target constructions in this study (see Table 3). The lexical targets were carefully selected
from the two texts based on the following conditions: (a) the word is a noun (to control the
part of speech) and (b) the word appears once (to control for frequency). Five words were
selected from each text and replaced with pseudo-words that followed English
orthographic and morphological rules (Pulido, 2007). When the original word was in a
plural form, plurality was also marked in the corresponding pseudo-word by attaching –s.
Each of the pseudo-words consisted of two syllables, containing seven letters, in order to
control the length. The target unaccusative verbs were not substituted with pseudo-words
as the focus was learning the unaccusative usage of the verbs rather than their forms or
meanings.
TABLE 3
Target Pseudo-words

Text 1                              Text 2
   Pseudo-word   Original word         Pseudo-word   Original word
 1 stragon       bottom              1 cabrons       changes
 2 golands       spouts              2 fration       absence
 3 phosens       discoveries         3 zenters       clues
 4 klaners       parks               4 morbits       descendants
 5 stovons       beaches             5 tralion       predator
3.5. Treatment Tasks
The treatment task in this study is fundamentally what learners do when taking the
reading section of a TOEFL test, i.e., reading the provided text and answering multiple-choice comprehension questions. In other words, the reading comprehension measure was
embedded in the learning task so that the level of participants’ text understanding could be
measured in tandem with task completion. The multiple-choice reading
comprehension items were also taken from the previous TOEFL tests developed and tested
by ETS. The reading comprehension items asked participants to identify factual/negative
factual information, make inferences, understand rhetorical purpose, recognize vocabulary
meaning, determine reference, simplify/paraphrase a sentence, insert a sentence into a
paragraph, and select main ideas of the text (Educational Testing Service, 2012). The texts
were divided into five segments, comprised of either one or two paragraphs, and followed
by reading comprehension questions relevant to each segment.
In this paper, task complexity was defined as task-induced demands made on learners’
cognitive resources while performing a task. In the – complex condition, participants were
asked to read and answer the comprehension questions as they normally would when
working on a reading section of a TOEFL test. In the + complex condition, the segments
were jumbled and presented to participants in a mixed order. Thus, in addition to reading
the paragraphs and answering comprehension questions, participants in the + complex
group also had to reorder the segments into the correct order to form a coherent text. The latter
task was judged as cognitively more demanding in that readers’ comprehension is
substantially influenced by the degree of clarity and coherence of text structure (Meyer &
Ray, 2011). There was no time limit for task completion. The total score for
reading comprehension was 15 for each text.
3.6. Assessment Tasks
L2 learning in this study was operationalized as (a) the ability to recognize
grammaticality of English unaccusative verbs and (b) the ability to recognize the form and
meaning of the target lexical items.
3.6.1. Grammaticality judgment test
The GJT used for this study contained 80 sentences in total, including (a) 34 sentences
for target unaccusative verbs, (b) 16 for novel unaccusative verbs, and (c) 30 distracters.
First, for each of the 17 target unaccusative verbs, one grammatical and one ungrammatical
passive sentence were created, resulting in 34 sentences (e.g. The sun was soon
disappeared vs. The tension soon disappeared).
Also, in order to explore if participants’ learning from the treatment transferred to other
unaccusative verbs, eight additional verbs (i.e., occur, remain, appear, fall, burn, stop,
break, and change) were selected from the list of 2000 most frequently used English words,
consulting the Compleat Lexical Tutor version 6.2. For these eight verbs, 8 grammatical
and 8 ungrammatical sentences were produced, totaling in 16 sentences. Caution was paid
to control the number of syllables, syntactic complexity, semantic plausibility, vocabulary
familiarity, and the position of the unaccusative verbs for each of the 25 pairs of the
unaccusative sentences.
Lastly, thirty sentences, 15 grammatical and 15 ungrammatical, were included as
distracters. The grammatical rules the distracters draw on included gerunds, to-infinitives,
subjective moods, comparatives, participial adjectives, reflexives, relative pronouns,
inversion, and prepositions, which covered the topics generally dealt with in English grammar
lessons in Korea (No & Chung, 2006). Across the pretest, posttest, and delayed posttest,
the same 80 sentences were randomly presented to participants.
The GJT for this study was constructed using E-Prime 2.0. The 80 sentences were
presented on a computer screen, and participants were asked to press the “z” key if the
sentence seemed grammatical and the “m” key if the sentence seemed ungrammatical.
These particular keys were chosen considering their placement on the QWERTY keyboard,
which is the standard layout in Korea. Participants were instructed to make their decision as
fast as they could. Each sentence remained on the screen until the decision on the sentence
well-formedness was made. Between each stimulus, a fixation cross appeared at the center of
the screen for 500 milliseconds to signal an upcoming sentence. The total score was 50 (34
for the target verbs and 16 for the novel verbs), and the test took approximately 10 to 12
minutes to complete.
3.6.2. Vocabulary form recognition test
In order to measure whether task complexity affected form recognition of the target pseudo-words, 20 items were constructed using E-Prime 2.0. Participants were asked to press
either the “z (yes)” or the “m (no),” depending on whether they remembered seeing the
word from the texts. Ten items were the target pseudo-words, whereas the other ten were
distracters that were constructed drawing upon the pseudo-words in Godfroid, Housen, and
Boers (2013). Each of the distracters contained two syllables and seven letters as the target
pseudo-words. In addition to the 20 form recognition items, 20 additional multiple-choice
items, modeled after Martinez-Fernández’s (2010) meaning recognition test, asked
participants to select a correct Korean translation of the given target word out of three
choices. Among these, ten items were the target words while the other ten were the
distracters used in the form recognition test. As in the GJT, each set of the form
recognition and meaning recognition items was randomized and presented on a
computer screen, and participants were asked to choose the answer for each item. Again,
between each stimulus, a fixation cross appeared on the screen for 500 milliseconds to
signal the next item. The total score for each of the vocabulary form and meaning recognition
tests was 10 (1 for each correct and 0 for each incorrect item), and each test took
approximately 2 to 3 minutes.
3.7. Questionnaires
Participants were asked to answer a background questionnaire, a post-reading
questionnaire, and an exit questionnaire. The aim of the background questionnaire was to
collect information about participants’ demographics and English learning experiences.
The post-reading questionnaires asked participants to provide their retrospective subjective
time estimation taken to complete the given reading task and familiarity with the topic of
the reading text. As the time estimation task was conducted under a retrospective paradigm
(Fink & Neubauer, 2001), participants were unaware of the upcoming duration judgement
task until it had to be done. As such, subjective time estimations were only collected after
the first treatment session. It was expected that the duration ratio would increase after
performing cognitively more demanding tasks. Finally, the exit questionnaire asked
participants to give retrospective comments about the treatment sessions. All
questionnaires were administered in Korean.
4. PROCEDURE
The data were collected over four weeks. All participants took the pretest, background
questionnaire, and the L2 proficiency test in the first session. In sessions 2 and 3, they took
part in the treatment sessions, each followed by a post-reading questionnaire. In session 3,
they also completed an immediate posttest. In the fourth week, participants in the
experimental conditions completed a delayed posttest. Each session took approximately 45
minutes to an hour. The participants in the experimental conditions carried out the sessions
in a computer-laboratory at a university as a group.
5. ANALYSIS
SPSS 22.0 for Mac was used for examining reliability of the tests as well as computing
descriptive and correlational statistics of the data. More specifically, the reliability of the
different tests was determined using Cronbach’s alpha, and interrelationships between the
various test scores were computed using Pearson’s coefficients. The level of significance
for this study was set at alpha level of p < .05. The relationships between the variables
were analyzed with mixed-effects models constructed using the package
lme4 (Bates, Maechler & Bolker, 2012), using the statistical program R version 3.3.0 (R
Development Core Team, 2016). The fixed effect was Complexity and the random effects
were Subjects and Items. For GJT scores and vocabulary recognition scores, Time (pretest
– posttest – delayed posttest) was put into the models as an additional fixed effect in order
to explore changes in the data over the repeated measurements. For t statistics, absolute t-values above 2.0 were taken as indicating significance (Gelman & Hill, 2007).
Effect sizes for the linear mixed-effects models were computed using the r.squaredGLMM
function in the package MuMIn (Barton, 2015), whereas those of the logit mixed-effects
models were calculated with the concordance index (C) using the somers2 function in the Hmisc
package (Harrell & Dupont, 2015). Following Plonsky and Oswald (2014), R2 values
of .06, .16, and .36 were interpreted as small, medium, and large, respectively. A C-index
of .70 was considered a moderate, .80 a good, and .90 and above an excellent fit for the
data (Baayen, 2008; Rogers, 2016). For t-tests, Cohen’s d was calculated to examine effect
sizes. As suggested by Plonsky and Oswald, the benchmarks were .40 for small, .70 for
medium, and 1.00 for large effect sizes for independent-samples t-tests, and .60 for small,
1.00 for medium, and 1.50 for large effect sizes for paired-samples t-tests.
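The benchmark-based interpretation above can be expressed directly; a small sketch, with Cohen's d for independent samples computed from the pooled standard deviation (the helper names are my own, not the study's):

```python
import math

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d for independent samples, using the pooled SD."""
    pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled

def label_r_squared(r2):
    """Plonsky and Oswald's (2014) R-squared benchmarks: .06 / .16 / .36."""
    if r2 >= .36:
        return "large"
    if r2 >= .16:
        return "medium"
    if r2 >= .06:
        return "small"
    return "below small"

def label_d_independent(d):
    """Benchmarks for independent-samples t-tests: .40 / .70 / 1.00."""
    d = abs(d)
    if d >= 1.00:
        return "large"
    if d >= .70:
        return "medium"
    if d >= .40:
        return "small"
    return "below small"
```

For instance, applying cohens_d to the group statistics in Table 5 (M = 14.62, SD = 4.97 vs. M = 13.96, SD = 4.96, n = 26 each) gives d ≈ .13, well below the .40 small benchmark, consistent with the non-significant group difference reported below.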
6. RESULTS
6.1. Preliminary Analysis
Prior to answering the research questions, some preliminary steps were taken to ensure
the reliability of the instruments and validity of the results. The following methodological
concerns were taken into consideration: reliability of the tests, participants' prior
knowledge of the target items, potential effects of topic familiarity on reading
comprehension scores, and validation of task complexity.
6.1.1. Test reliability
In order to check consistency and stability of the instruments, reliability coefficients for
the proficiency test, reading comprehension tests, grammaticality judgment tests, and
vocabulary recognition tests were computed using Cronbach’s alpha. As summarized in
Table 4, values of Cronbach’s alpha were found to be high for the proficiency test and
grammaticality judgment tests, but low for the reading comprehension tests and vocabulary
recognition tests. Also, the variances in the scores of reading comprehension tests and
vocabulary recognition tests were relatively small, presumably contributing to the low
reliability coefficients. In addition, the mean reading comprehension scores appeared to
imply a ceiling effect, which could have further contributed to the low internal consistency
reliability of the reading comprehension tests.
TABLE 4
Descriptive Statistics for Test Scores
Test                                      N    M      SD     Cronbach's alpha
CPE test                                  52   14.29  4.94   .70
Reading comprehension (Text 1)            52   11.04  2.22   .47
Reading comprehension (Text 2)            52   12.85  1.51   .37
Grammaticality judgment (Target items)    52   58.52  11.77  .80
Grammaticality judgment (Novel items)     52   31.92  6.07   .67
Vocabulary recognition (Form)             52   11.64  3.65   .59
Vocabulary recognition (Meaning)          52   5.64   2.96   .45
Note. Maximum score for: CPE test = 45, reading comprehension = 15, grammaticality judgment (target) = 102, grammaticality judgment (novel) = 48, vocabulary form recognition = 20, vocabulary meaning recognition = 20.
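The internal consistency coefficient reported above is computed as alpha = k/(k − 1) × (1 − Σ s²_item / s²_total). A minimal sketch with invented scores (the data below are illustrative, not the study's):

```python
def cronbach_alpha(item_scores):
    """Cronbach's alpha. item_scores: one inner list per test item,
    aligned by test-taker across lists."""
    def var(xs):  # population variance; the n/(n-1) factor cancels in the ratio
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    k = len(item_scores)
    item_var_sum = sum(var(item) for item in item_scores)
    totals = [sum(col) for col in zip(*item_scores)]  # each taker's total score
    return k / (k - 1) * (1 - item_var_sum / var(totals))

# Three illustrative dichotomous items answered by five test-takers.
items = [
    [1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 1, 1, 0],
]
print(round(cronbach_alpha(items), 2))  # 0.79
```

As the note in the code indicates, small between-subject variance (as in the reading comprehension and vocabulary tests above) shrinks the total-score variance and pulls alpha down.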
6.1.2. Equivalence among groups
To check the equivalence of English proficiency level among the groups, a mixed-effects model was constructed, with the CPE scores as the dependent variable, Group as a
fixed effect, and Subject and Item as random effects. When compared with a null model
that contained only random effects, the results showed that the inclusion of Group as a
fixed effect did not make a significant difference to the null model, χ2(1) = .29, p = .59. In
other words, the groups did not significantly differ from each other in terms of their
English proficiency (for descriptive statistics, see Table 5).
TABLE 5
Descriptive Statistics for Proficiency Test by Group
Group        N    M      SD    SE
– Complex    26   14.62  4.97  .98
+ Complex    26   13.96  4.96  .98
Note. Maximum score = 45.
Next, in order to test whether the four groups started out at a developmentally parallel
stage, another set of likelihood ratio tests were conducted on pretest GJT scores comparing
null models with random effects only and models additionally containing Group as a fixed
effect (for descriptive statistics, see Table 11). The results indicated that Group did not
improve the null models to a significant degree, Target verbs: χ2(1) = .25, p = .62; Novel
verbs: χ2(1) = 1.14, p = .29. In sum, the results showed that, at the time of the pretest, there
was no significant difference among the groups in their ability to judge the grammaticality
of the English unaccusative sentences.
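Each of these model comparisons is a likelihood ratio test: χ2 = 2(ℓ_full − ℓ_null), referred to a chi-square distribution with degrees of freedom equal to the number of added parameters. For df = 1 the p-value has a closed form via the complementary error function. A minimal sketch — the log-likelihood values are invented for illustration, since the study reports only the χ2 and p values:

```python
import math

def likelihood_ratio(llf_null, llf_full):
    """Chi-square statistic for comparing nested models by log-likelihood."""
    return 2 * (llf_full - llf_null)

def lrt_pvalue_df1(chi_sq):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom:
    P(X > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(chi_sq / 2))

# Invented log-likelihoods whose difference reproduces the reported chi2(1) = .29:
chi_sq = likelihood_ratio(-100.0, -99.855)
print(round(chi_sq, 2), round(lrt_pvalue_df1(chi_sq), 2))  # 0.29 0.59
```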
6.1.3. Effects of topic familiarity
To assess the potential impact of topic knowledge on comprehension of the treatment
texts, participants’ familiarity with the two topics was measured using post-reading
questionnaire items (i.e., Item 1: I thought this topic of the reading was familiar, Item 2: I
had some background knowledge about the reading topic).
The descriptive statistics are presented in Table 6. The responses to the two items were
significantly correlated with each other, Text 1: r(52) = .68, p < .01, Text 2: r(52) = .56, p
< .01, suggesting that the items assessed overlapping constructs. In order to examine the
effects of topic familiarity on reading comprehension scores, likelihood ratio tests were
conducted comparing a null model with the random effects and models additionally
including topic familiarity as a fixed effect. The dependent variables were the reading
comprehension scores for Text 1 and Text 2. The results showed that adding topic
familiarity did not make a significant improvement to the null models, Text 1: χ2(1) = .01,
p = .91; Text 2: χ2(1) = 2.25, p = .13. In short, the participants' topic familiarity with the
texts did not affect their scores on the reading comprehension items.

TABLE 6
Descriptive Statistics for Topic Familiarity by Item
               Text 1              Text 2
Item    N      M     SD    SE      M     SD    SE
#1      52     3.60  1.75  .24     3.10  1.76  .24
#2      52     3.48  1.69  .23     2.46  1.34  .19
Total   52     7.08  3.16  .44     5.55  2.75  .38
Note. Maximum value for each item = 7.
6.1.4. Validation of task complexity
To validate the operationalization of task complexity, all participants were asked to
estimate, immediately upon completing each task, how long the task had taken them. As
mentioned earlier, only the time estimations made after completing the first
task were analyzed. In order to examine whether the subjective time estimations differed as
a function of task manipulation, the estimated-to-target duration ratios were calculated by
dividing estimated time by real time taken to complete the given task. In the retrospective
time estimation paradigm, duration judgment ratio is expected to increase with greater
cognitive load.
As shown in Table 7, for both Text 1 and Text 2, duration judgment ratios in the +
complex conditions were on average larger than those in the – complex conditions. The
results from independent samples t-tests on the duration judgment ratios across + and –
complex conditions also revealed significant effects of task complexity for both Text 1 and
Text 2, Text 1: t(50) = 2.86, p = .01, 95% CI [.04, .22]; Text 2: t(50) = 3.85, p < .01, 95%
CI [.11, .36]. Cohen’s ds were .79 and 1.09 respectively, which were considered as
medium and large effect sizes. In other words, duration judgment ratios in the + complex
conditions were significantly greater than those in the – complex conditions, implying that
the + complex tasks induced heavier cognitive loads on the participants compared to the –
complex tasks.
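The duration judgment ratio and the between-condition effect size can be reconstructed from the summary statistics in Table 7. The sketch below uses the reported (rounded) means and SDs, so the results are approximate:

```python
import math

def duration_ratio(estimated, actual):
    """Retrospective duration judgment ratio: estimated time / actual time on task."""
    return estimated / actual

def cohens_d_independent(m1, sd1, m2, sd2):
    """Cohen's d for two equal-sized independent groups, using the pooled SD."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (m1 - m2) / pooled_sd

# A learner who felt a 10-minute task took 12 minutes:
print(duration_ratio(12, 10))  # 1.2

# Text 1, Table 7: + complex M = 1.16 (SD = .17) vs. - complex M = 1.03 (SD = .16)
print(round(cohens_d_independent(1.16, .17, 1.03, .16), 2))  # 0.79, as reported
```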
To gauge the amount of mental effort that task complexity imposed on the
participants, three questionnaire items were included in the post-task questionnaires (Item
3: I thought this task was difficult, Item 4: I invested a large amount of mental effort to
complete this task, Item 5: I thought this task was demanding). The Cronbach’s alpha for
the three items was .63 for Text 1 and .75 for Text 2. Descriptive statistics for the
responses to the three items are presented in Table 8.
TABLE 7
Descriptive Statistics for Duration Judgment Ratio
                    Text 1        Text 2
Condition    N      M (SD)        M (SD)
– Complex    13     1.03 (.16)    .95 (.11)
+ Complex    13     1.16 (.17)    1.19 (.29)
Total        26     1.09 (.18)    1.08 (.25)
TABLE 8
Descriptive Statistics for Reported Mental Effort
                           Text 1              Text 2
Item    Condition    N     M      SD    SE     M      SD    SE
#3      – Complex    26    4.23   1.03  .20    3.77   1.03  .20
        + Complex    26    4.65   .80   .16    4.46   1.07  .21
#4      – Complex    26    5.12   1.11  .22    4.69   1.12  .22
        + Complex    26    4.85   1.19  .23    4.42   1.42  .28
#5      – Complex    26    4.00   1.17  .23    3.62   1.10  .22
        + Complex    26    4.46   1.42  .28    4.19   1.27  .25
Total   – Complex    26    13.35  2.50  .49    12.08  2.79  .55
        + Complex    26    13.96  2.71  .53    13.08  3.02  .59
Note. Maximum value for each item = 7.
In order to see whether there were significant differences between the + and the – complex
conditions in participants' ratings of perceived task difficulty, independent-samples t-tests
were conducted. The results revealed no significant difference between the
conditions, Text 1: t(50) = .852, p = .40, 95% CI [-.83, 2.07]; Text 2: t(50) = 1.242, p = .22, 95%
CI [-.62, 2.62]. Cohen's ds were .23 and .34, respectively, indicating small effect sizes. In
short, participants' perceived level of task difficulty appeared comparable regardless of the
task manipulation.
6.2. Effects of Task Complexity on L2 Reading Comprehension
The descriptive statistics for the reading comprehension scores of each group are
displayed in Table 9. Reading comprehension scores on Text 2 were on average higher
than those on Text 1.
In order to examine whether task complexity had a significant impact on L2 reading
comprehension scores, linear mixed-effects models were constructed with R. Null models
contained random effects (i.e., Subject and Item) only, and Complexity was entered and
compared against the null models with likelihood ratio tests using χ2 statistics. As
summarized in Table 10, task complexity was shown to have no significant effects on
reading comprehension scores.
TABLE 9
Descriptive Statistics for Reading Comprehension Scores
                     Text 1              Text 2
Group        N       M      SD    SE     M      SD    SE
– Complex    26      11.08  2.26  .44    13.08  1.13  .22
+ Complex    26      11.00  2.23  .44    12.62  1.81  .36
Total        52      11.04  2.22  .31    12.85  1.51  .21
Note. Maximum score = 15.
TABLE 10
Summary of Likelihood Ratio Tests for Complexity on Reading Comprehension Scores
          χ2     df    p     R2
Text 1    .02    1     .90   .23
Text 2    .40    1     .53   .14
6.3. Effects of Task Complexity on Learning of Unaccusative Verbs
Table 11 presents the descriptive statistics for the GJT scores by group. It can be noticed
that mean gain scores were overall higher in the delayed posttest than in the immediate
posttest.
TABLE 11
Descriptive Statistics for Gains in GJT
                                       Target items               Novel items
Group       Test                 N     M      Mean gain   SD      M      Mean gain   SD
– Complex   Pretest              26    17.92              4.34    10.50              2.57
            Immediate posttest   26    19.58  1.58        5.05    10.38  -.04        2.22
            Delayed posttest     26    20.54  2.54        4.89    11.15  .73         2.41
+ Complex   Pretest              26    17.35              3.36    10.04              2.91
            Immediate posttest   26    19.77  2.62        4.05    10.77  .58         2.27
            Delayed posttest     26    21.73  4.35        4.06    11.23  1.19        2.58
Note. Maximum score for: target GJT items = 34, novel GJT items = 16.
In order to examine whether task complexity had a significant impact on target GJT
scores, logit mixed-effects models were constructed with R. The dependent variable was
the GJT gain score, i.e., the change in the GJT scores relative to the pretest scores. The
fixed effect was Complexity. The null models contained only the random effects, i.e.,
Subject, Item, and Time (changes in the GJT scores over the repeated measures). Then,
Complexity was added and tested against the null models to see whether the inclusion of
the fixed effects significantly improved the model fit. As shown in Table 12, Complexity
was shown to have no significant influence on the GJT gain scores for the target verbs.
TABLE 12
Summary of Likelihood Ratio Tests for Complexity on GJT Gain Scores for Target Verbs
                  χ2     df    p     R2
Immediate gain    .34    1     .85   .81
Delayed gain      2.40   1     .30   .81
Next, another series of likelihood ratio tests was conducted to identify whether
Complexity made a significant difference to the null models on the GJT gain scores for the
novel verbs. As summarized in Table 13, no significant effects were found, indicating that
transfer of learning did not occur.
TABLE 13
Summary of Likelihood Ratio Tests for Complexity on GJT Gain Scores for Novel Verbs
                  χ2     df    p     R2
Immediate gain    1.92   1     .38   .81
Delayed gain      2.94   1     .23   .82
6.4. Effects of Task Complexity on Learning of Pseudo-Words
Table 14 presents the descriptive statistics for the vocabulary recognition scores by
group. The mean scores on the form recognition test were overall higher than those on the
meaning recognition test. Also, the mean form recognition scores from the delayed
posttest were higher than those from the immediate posttest, whereas the mean meaning
recognition scores on the delayed posttest were lower than those on the immediate posttest.
TABLE 14
Descriptive Statistics for Vocabulary Recognition Scores
                                       Form           Meaning
Group       Test                 N     M      SD      M      SD
– Complex   Immediate posttest   26    5.38   2.28    3.04   1.66
            Delayed posttest     26    6.77   2.05    2.35   1.41
+ Complex   Immediate posttest   26    5.50   1.98    3.15   1.85
            Delayed posttest     26    5.62   1.85    2.96   1.78
Note. Maximum score for: form recognition = 10, meaning recognition = 10.
In order to examine whether task complexity improved null models to a significant
degree, another series of likelihood ratio tests was conducted using χ2 statistics. The dependent
variables were the scores in the immediate posttest and those in the delayed posttest. The null
models included random effects only (i.e., Subject and Item), and Complexity was
added and tested against the null models. As shown in Table 15, Complexity improved the
null model in the delayed posttest.
TABLE 15
Summary of Likelihood Ratio Tests for Complexity on Vocabulary Form Recognition Scores
                  χ2     df    p     R2
Immediate gain    .26    1     .61   .77
Delayed gain      4.98   1     .03   .79
Then, logit mixed-effects models were constructed with Complexity on the delayed
posttest scores. As Table 16 presents, Complexity had significant negative effects in the
delayed posttest. The C index of concordance was .79, which indicated good model fit for
the data. In short, participants in the + complex conditions scored significantly less in the
delayed vocabulary form recognition test than those in the – complex conditions.
TABLE 16
Results for the Best-Fit Logit Mixed-Effects Model on Delayed Vocabulary Form Recognition Scores
              Fixed effects                      Random effects
                                                 by-Subject   by-Item
              Estimate   SE    z      p          SD           SD
Intercept     .59        .24   2.48   .01        .45          .62
Complexity    -.63       .27   -2.30  .02        .93          .06
Note. Formula: VF ~ Complexity + (Complexity | Subject) + (Complexity | Item); C = .79.
Finally, effects of task complexity on vocabulary meaning recognition were explored,
beginning with another series of likelihood ratio tests using χ2 statistics. Again, the null
models included random effects only, and the fixed effect, i.e., Complexity, was added to
the null models, and it was examined whether this improved them to a significant
extent. As summarized in Table 17, Complexity had no significant effects on vocabulary
meaning recognition scores.
TABLE 17
Summary of Likelihood Ratio Tests for Complexity on Vocabulary Meaning Recognition Scores
                  χ2     df    p     R2
Immediate gain    .31    1     .58   .80
Delayed gain      1.64   1     .20   .83
7. DISCUSSION AND CONCLUSION
In this study, it was investigated whether task complexity affected Korean
undergraduate students’ English reading comprehension and their learning of target lexical
constructions contained in the texts. Task complexity was manipulated by disarranging
paragraphs of each text, based on the understanding that coherent and clear text structure
considerably facilitates reading comprehension (Meyer & Ray, 2011). In this study,
reading comprehension scores were not affected by task complexity of the texts. It should
be noted, however, that as shown in the relatively high mean scores and small SDs,
participants overall performed well in the reading comprehension tests, and thus a ceiling
effect might have masked between-group differences. More difficult reading
comprehension tests may result in higher SDs and reliability of the reading comprehension
tests. In addition, there was no time limit and participants were allowed to stay on the task
as long as they felt necessary, which might have contributed to the non-significant effects
of task manipulation on reading comprehension scores. Another possibility is that
participants’ reading processes could have been affected by task complexity, although the
effects did not surface in the reading comprehension scores. In order to explore this issue,
participants’ verbal reports on their internal processes while performing the tasks, or eye-movement data, would be highly informative.
It was also found that task complexity failed to affect the learning of the English
unaccusative verbs. It seems possible to assume that paragraph-ordering task in the +
complex conditions did not necessarily encourage participants to process the target
unaccusative verbs to a significantly greater extent. More specifically, the ordering task
might have led participants to depend on the initial or the final part of each paragraph
selectively, rather than paying attention to each paragraph thoroughly. Also, arranging
paragraphs might have promoted higher-level conceptual reasoning rather than lower-level
text-based processing, thereby not influencing learning of the target verbs. In other words,
when re-arranging the paragraphs, participants might have focused more on the main idea
of each paragraph and tried to figure out the logical order among the global ideas. In
addition, the task in the + complex conditions in this study may not have been sufficiently
more complex than the – complex task. Indeed, the task manipulation was shown to be successful only
by the subjective time estimations, but not by self-reports on the perceived level of task
difficulty.
Task complexity, though, had significant negative effects on form recognition in the
delayed posttest. That is, participants assigned in the + complex conditions were less
successful in recognizing target word forms than those in the – complex conditions. It
seems possible to assume that the increased level of task complexity could have driven
participants’ attention to the paragraph-ordering task, and accordingly away from attending to
the pseudo-word forms. When participants were allowed to read the text in a coherent
order under the – complex conditions, they might have had spare mental resources to
devote to processing the target word forms. This remains open to empirical investigation,
preferably using on-line methodologies such as intro- or retrospective verbal reports or
eye-movement data.
The results from this study offer valuable insights for future research. Firstly, it was
speculated that a ceiling effect could have masked the effects of task complexity on
reading comprehension scores. Indeed, mean scores were high while variances were small,
suggesting an inherent limitation in detecting significant effects of task complexity on
reading comprehension scores. Therefore, the difficulty of the reading
comprehension tests might need to be increased in future studies so that the variance
among the participants could be greater. It was also concluded that, in order to better
detect the effects of task complexity, task manipulation had to be conducted on a more
localized level in such a way that text-bound linguistic processing could be facilitated in a
more complex condition. In this study, the tasks were manipulated at the global level (re-arranging paragraphs into a coherent order) and thus led the participants to rely on top-down conceptual processing, which in turn failed to affect linguistic processing of the
target constructions. It was speculated that local-level task manipulation might encourage
learners to read the given text more thoroughly so that their processing of target features
could be more likely to differ across the + and – complex task conditions. It was also
considered that a time limit might also play a role in magnifying the effects of cognitive
complexity of each task by placing additional cognitive demands on participants.
Last but not least, the reading tasks in this study involved reading the passages provided
while answering multiple-choice reading comprehension questions, which is
fundamentally what learners would normally do when taking a language aptitude or
proficiency test. In this regard, there may be concerns regarding the ecological validity of
the task in terms of its resemblance to real-world reading tasks. Yet, the
reading tasks used in this study are directly relevant to the many academic settings in which
learners take exams. Also, in the present study, the extent to which the reading task
invoked the kind of cognitive processes that are essential in performing a real-world task
was considered more important than how much the task approximated to a target task in its
appearance. For instance, if a learner can read a given text and identify the author’s
intention, as one of the reading comprehension questions required, we can make a valid
assumption that the learner is likely to perform other similar tasks, such as reading an
editorial and understanding the author’s opinion. It should be acknowledged, however, that
ecologically more valid tasks that closely resemble real-world reading practices would also
generate insightful findings as to the task-based approach to L2 reading instruction.
REFERENCES
Baayen, R. H. (2008). Analyzing linguistic data: A practical introduction to statistics.
Cambridge, UK: Cambridge University Press.
Baralt, M. (2013). The impact of cognitive complexity on feedback efficacy during online
versus face-to-face interactive tasks. Studies in Second Language Acquisition, 35,
689-725.
Barton, K. (2015). MuMIn: Multi-model inference. R package version 1.13.4. http://cran.r-project.org/package=MuMIn.
Bates, D. M., Maechler, M., & Bolker, B. (2012). lme4: Linear mixed-effects models using
S4 classes. R package version 0.999999-0.
Brünken, R., Plass, J. L., & Leutner, D. (2003). Direct measurement of cognitive load in
multimedia learning. Educational Psychologist, 38(1), 53-61.
Bygate, M., Skehan, P., & Swain, M. (2001). Researching pedagogic tasks: Second
language learning, teaching and testing. Harlow: Longman.
Chung, T. (2014). Multiple factors in the L2 acquisition of English unaccusative verbs.
IRAL, 52(1), 59-87.
Craik, F. I. M., & Tulving, E. (1975). Depth of processing and the retention of words in
episodic memory. Journal of Experimental Psychology: General, 104, 268-294.
Educational Testing Service. (2012). Official TOEFL iBT tests. New York, NY: McGraw-Hill.
Ellis, R. (2003). Task-based language learning and teaching. Oxford, UK: Oxford
University Press.
Ellis, R. (2009). Task-based language teaching: Sorting out the misunderstandings.
International Journal of Applied Linguistics, 19(3), 221-246.
Fink, A., & Neubauer, A. C. (2001). Speed of information processing, psychometric
intelligence, and time estimation as an index of cognitive load. Personality and
Individual Differences, 30, 1009-1021.
Foster, P., & Tavakoli, P. (2009). Native speakers and task performance: Comparing
effects on complexity, fluency, and lexical diversity. Language Learning, 59(4),
866-896.
Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical
models. New York, NY: Cambridge University Press.
Gilabert, R., Barón, J., & Llanes, À. (2009). Manipulating cognitive complexity across task
types and its impact on learners’ interaction during oral performance. IRAL, 47,
367-395.
Godfroid, A., Housen, A., & Boers, F. (2013). An eye for words: Gauging the role of
attention in incidental L2 vocabulary acquisition by means of eye-tracking. Studies
in Second Language Acquisition, 35, 483-517.
Grabe, W. (2009). Reading in a second language: Moving from theory to practice. New
York: Cambridge University Press.
Harrell, F. E., & Dupont, C. (2015). Hmisc: Harrell miscellaneous. R package version 3.17-0. https://CRAN.R-project.org/package=Hmisc.
Horiba, Y. (2000). Reader control in reading: Effects of language competence, text type,
and task. Discourse Processes, 29, 223-267.
Horiba, Y. (2013). Task-induced strategic processing in L2 text comprehension. Reading
in a Foreign Language, 25(2), 98-125.
Hwang, J. B. (1999). L2 acquisition of English unaccusative verbs under implicit and
explicit learning conditions. English Teaching, 54(4), 145-176.
Hwang, J. B. (2001). Focus on form and the L2 learning of English unaccusative verbs.
English Teaching, 56(3), 111-133.
Jackson, D. O., & Suethanapornkul, S. (2013). The cognition hypothesis: A synthesis and
meta-analysis of research on second language task complexity. Language Learning,
63(2), 330-367.
Ju, M. K. (2000). Overpassivization errors by second language learners: The effect of
conceptualizable agents in discourse. Studies in Second Language Acquisition, 22,
85-111.
Jung, J. (2012). Relative roles of grammar and vocabulary in different L2 reading tasks.
English Teaching, 67(1), 57-77.
Khalifa, H., & Weir, C. J. (2009). Examining reading: Research and practice in assessing
second language learning. Cambridge, UK: Cambridge University Press.
Kim, Y.-J., & Tracy-Ventura, N. (2011). Task complexity, language anxiety, and the
development of the simple past. In P. Robinson (Ed.), Researching second
language task complexity: Task demands, language learning and language
performance (pp. 287-306). Amsterdam: John Benjamins.
Kormos, J., & Trebits, A. (2012). The role of task complexity, modality, and aptitude in
narrative task performance. Language Learning, 62(2), 439-472.
Lee, S.-K., Miyata, M., & Ortega, L. (2008). A usage-based approach to overpassivization:
The role of input and conceptualization biases. Paper presented at the 26th Second
Language Research Forum, Honolulu, HI, October 17-19.
Martinez-Fernández, A. M. (2010). Experiences of remembering and knowing in SLA, L2
development, and text comprehension: A study of levels of awareness, type of
glossing, and type of linguistic item. Unpublished dissertation, Georgetown
University, Washington, D.C.
Meyer, B. J. F., & Ray, M. N. (2011). Structure strategy interventions: Increasing reading
comprehension of expository text. International Electronic Journal of Elementary
Education, 4(1), 127-152.
Michel, M. C. (2011). Effects of task complexity and interaction on L2 performance. In P.
Robinson (Ed.), Researching second language task complexity: Task demands,
language learning and language performance (pp. 141-173). Amsterdam: John
Benjamins.
Michel, M. C. (2013). The use of conjunctions in cognitively simple versus complex oral
L2 tasks. The Modern Language Journal, 97(1), 178-195.
No, G., & Chung, T. (2006). Multiple effects and the learnability of English unaccusatives.
English Teaching, 61(1), 19-39.
Norris, J. M., & Ortega, L. (2009). Towards an organic approach to investigating CAF in
instructed SLA: The case of complexity. Applied Linguistics, 30(4), 555-578.
Nuevo, A. (2006). Task complexity and interaction: L2 learning opportunities and
interaction. Unpublished doctoral dissertation. Georgetown University,
Washington D.C.
Perlmutter, D. M. (1978). Impersonal passives and the unaccusative hypothesis. Proceedings of the
4th Annual Meeting of the Berkeley Linguistics Society, 157-190.
Plonsky, L., & Oswald, F. L. (2014). How big is “big”?: Interpreting effect sizes in L2
research. Language Learning, 64(4), 878-912.
Pulido, D. (2007). The effects of topic familiarity and passage sight vocabulary on L2
lexical inferencing and retention through reading. Applied Linguistics, 28(1), 66-86.
R Development Core Team (2016). R: A language and environment for statistical
computing. R Foundation for Statistical Computing, Vienna, Austria. URL
http://www.R-project.org/.
Révész, A. (2009). Task complexity, focus on form, and second language development.
Studies in Second Language Acquisition, 31, 437-470.
Révész, A. (2011). Task complexity, focus on L2 constructions, and individual differences:
A classroom-based study. The Modern Language Journal, 95, 162-181.
Révész, A. (2014). Towards a fuller assessment of cognitive models of task-based
learning: Investigating task-generated cognitive demands and processes. Applied
Linguistics, 35(1), 87-92.
Révész, A., Sachs, R., & Mackey, A. (2011). Task complexity, uptake of recasts, and
second language development. In P. Robinson (Ed.), Researching second language
task complexity: Task demands, language learning and language performance (pp.
203-236). Amsterdam: John Benjamins.
Robinson, P. (2001). Task complexity, cognitive resources and syllabus design: A triadic
framework for examining task influences on SLA. In. P. Robinson (Ed.), Cognition
and second language instruction (pp. 193-226). Cambridge: Cambridge University
Press.
Robinson, P. (2005). Cognitive complexity and task sequencing: Studies in a componential
framework for second language task design. International Review of Applied
Linguistics, 43, 1-32.
Robinson, P. (2007). Task complexity, theory of mind, and intentional reasoning: Effects
on L2 speech production, interaction, uptake and perceptions of task difficulty.
IRAL, 45, 193-213.
Robinson, P. (2011). Researching second language task complexity: Task demands,
language learning and language performance. Amsterdam: John Benjamins.
Rogers, J. R. (2016). Developing implicit and explicit knowledge of L2 case marking under
incidental learning conditions. Unpublished dissertation, University College
London Institute of Education, London, UK.
Skehan, P. (1998). A cognitive approach to language learning. Oxford: Oxford University
Press.
Skehan, P. (2009). Modelling second language performance: Integrating complexity,
accuracy, fluency, and lexis. Applied Linguistics, 30(4), 510-532.
Skehan, P., & Foster, P. (2001). Cognition and tasks. In P. Robinson (Ed.), Cognition and
second language learning (pp. 183-205). New York: Cambridge University Press.
Taillefer, G. E. (1996). L2 reading ability: Further insight into the short-circuit hypothesis.
The Modern Language Journal, 80, 461-477.
Urquhart, A. H., & Weir, C. J. (1998). Reading in a second language: Process, product,
and practice. New York: Longman.
Wickens, C. D. (1992). Engineering psychology and human performance. New York, NY:
Harper Collins.
Yilmaz, Y. (2011). Task effects on focus on form in synchronous computer-mediated
communication. The Modern Language Journal, 95(1), 115-132.
Yoshimura, F. (2006). Does manipulating foreknowledge of output tasks lead to
differences in reading behavior, text comprehension and noticing of language
form? Language Teaching Research, 10(4), 419-434.
Application levels: Secondary
Jookyoung Jung
Department of Culture, Communication and Media
University College London
Gower Street, London
WC1E 6BT
Phone: 44 (0)20-7679-2000
Email: jookyoungjung14@ucl.ac.uk
Received on September 1, 2016
Reviewed on October 15, 2016
Revised version received on November 15, 2016