When the factors involved in second language (L2) reading are under scrutiny, vocabulary knowledge is usually regarded as the main one. For that reason, efforts to look into how L2 readers get to the meaning of unknown words play a unique role in L2 reading research. Taking this into account, the objective of this article is to present the results of a study on the lexical inferential strategies and knowledge sources employed by sixteen proficient readers, based on Nassaji (2003), pointing to possible correlations between strategy and resource type use and participants’ lexical inference performance.

In the three following sections of the article, a brief literature review is presented. While in section 2 the more general relationship between vocabulary knowledge and reading comprehension is analyzed, in sections 3 and 4 the more specific topics of L2 vocabulary acquisition and L2 lexical inference are discussed. The methodology is described in detail in the fifth section, followed by an analysis of the data. The main findings of the study are presented in the seventh and last section, accompanied by a reflection on the possible causes that may have led to them.


While vocabulary knowledge does not account on its own for reading comprehension, various scholars have highlighted it as the main predictor of successful reading. According to Grabe and Stoller, vocabulary knowledge is given special relevance among the three essential components that a skillful reader should possess, the other two being reading fluency and reading rate (2002: 183). Similarly, the importance of vocabulary recognition not only for reading and writing acquisition, but also for listening and speaking in the L2, is emphasized in Gass and Selinker’s statement that “the lexicon may be the most important component for the learner” (1994: 270), as well as in the following quotation from Vermeer:

Knowing words is the key to understand and being understood. The bulk of learning a new language consists of learning new words. Grammatical knowledge does not make for great proficiency in a language (Vermeer 1992:147).

Evidence from a number of studies focusing on the relationship between lexical knowledge and reading comprehension supports such points of view, among them Anderson and Freebody (1981), Laufer (1991), Coady, Magoto, Hubbard, Graney and Mokhtari (1993), Koda (1989) and Khodadady (2000). In a literature review of the factors that most contribute to reading, Anderson and Freebody (1981) found that lexical knowledge was the strongest predictor of reading comprehension, more important than variables related to sentence structure, inferential skills and the ability to recognize the main ideas of the text.

Regarding L2 reading, Laufer’s study (1991) pointed to significant correlations between two vocabulary tests and reading scores of L2 learners. Koda (1989) and Coady et al. (1993) found similar results in their work: while the former pointed to a high correlation between a vocabulary test and two different measures of reading comprehension, the cloze test and paragraph comprehension, the latter showed, through two tests of diverse nature, that an increase in high-frequency vocabulary proficiency led to an increase in reading proficiency.

In Khodadady’s study (2000), the objective was to identify which type of vocabulary was decisive for successful reading, whether contextual or global, the first type being directly related to the words presented in a specific text, and the second not directly related to them. In order to achieve that, the scores on contextual vocabulary comprehension tests, on global vocabulary comprehension tests, and on reading comprehension tests of 123 native speakers and of 64 non-native speakers were correlated. Statistical analysis revealed significant correlations between the reading comprehension test and contextual vocabulary knowledge, pointing to this element as a decisive one for successful reading, both in the first language (L1) and in the L2.

Taking into account the interrelationship between lexical knowledge and reading performance, an issue that has received special attention from scholars is the number of words readers should know to guarantee effective reading, especially in view of research evidence showing that the most significant setback for L2 readers is not inadequate use of reading strategies such as world knowledge and parsing, but limited vocabulary knowledge (Laufer & Sim 1985; Haynes & Baker 1993). Most researchers understand that both L1 and L2 readers should know between 95 and 98% (Nation 2008) of the vocabulary presented in a text to be able to understand what they read.

As far as reading in a foreign language is concerned, the acquisition of three thousand word families, corresponding approximately to five thousand lexical items, has been suggested as the minimum to enable a skillful reader in his/her L1 to transfer his/her reading strategies to the L2 (Laufer 1997). Within this context, the comparative study carried out by Hu and Nation (2000) to look into the effects of the proportion of known words in a text, with variations ranging from 80 to 100% among university students who were speakers of English as an L2, is quite revealing: none of the readers who knew approximately 80% of the total number of words was able to grasp the text meaning, and only a few of those who knew 95% of the words were able to understand the text, while the majority of them were not.
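The notion of the percentage of known words in a text can be made concrete with a short calculation. The sketch below is only an illustration, not the procedure used by Hu and Nation (2000): the function name `lexical_coverage`, the simple tokenization rule, and the example passage and vocabulary are all assumptions introduced here.

```python
import re

def lexical_coverage(text, known_words):
    """Return the proportion of running words in `text` found in `known_words`."""
    # Assumption: a token is any run of letters or apostrophes, lowercased.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)

# Hypothetical 20-word passage in which a single word ("favela") is unknown.
passage = ("the film follows two boys who grow up in a violent favela "
           "and take very different paths in later life")
known_vocabulary = set(passage.split()) - {"favela"}

coverage = lexical_coverage(passage, known_vocabulary)
print(f"coverage: {coverage:.0%}")
```

Under these assumptions, one unknown word in twenty running words already places the reader at the lower bound of the 95 to 98% range cited above.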

Such data seem to illustrate two interconnected factors: the central role played by vocabulary acquisition in L2 reading comprehension and the complex task of every L2 learner who wants to become a skillful reader, which is to acquire the largest number of words within the shortest possible period of time. Unlike L1 vocabulary acquisition, which occurs incidentally most of the time, L2 vocabulary acquisition requires learners to overcome a series of obstacles to learn new lexical items, making the process longer and more arduous, as we will see in the following section.


The first category deals with unfamiliarity with the words presented in a text and is directly related to the minimum number of words that should be known by a reader so that he/she is able to transfer his/her L1 reading strategies to the L2, as previously discussed. The less the reader automatically recognizes the lexical items in a text, regardless of the context, the more he/she will need to make use of extra cognitive capacity to understand them. Conversely,

automatic recognition of a large vocabulary…will free one’s cognitive resources for (1) making sense of the unfamiliar or slightly familiar vocabulary and (2) interpreting the global meaning of the text (Laufer 1997: 23).

In the second category, the pseudo-familiar words – the ones which the reader assumes to be known, but actually are not – are discussed. The author names them deceptively transparent words: because they give the false impression that clues leading to their meaning are available, they prevent the reader from realizing that he/she does not know them.

Laufer subdivides the deceptively transparent words into five different types. Words with a deceptive morphological structure, the ones that “look as if they were composed of meaningful morphemes” (25), are grouped in the first type. As an example, the author cites the attempt made by a learner of English as an L2 to find out the meaning of the word outline by splitting it into two morphemes, out and line, with the resulting meaning being out of the line. In the second type are idioms, when translated literally, word by word, such as hit and miss, sit on the fence, and a shot in the dark.

The third case of lexical pseudo-familiarity deals with the false friends between the learner’s L1 and L2, and it takes place when he/she assumes that if the form of the word in the L2 is similar to the form of the word in the L1, the same will hold true of its meaning. The examples given by the author are the translations of sympathetic, tramp and novel as nice, lift and short story (in Hebrew, simpati, tremp and novela) by a Hebrew native speaker.

Polysemic or homonymic words make up the fourth type of pseudo-familiarity. In this case, lexical comprehension difficulties take place when the L2 learner knows only one meaning of this kind of word and is reluctant to abandon it, even when it is not appropriate for the context. To illustrate, Laufer mentions the comprehension of the word state as a region within a country, instead of situation, and also the interpretation of abstract as non-concrete, instead of summary.

Finally, the last and largest type of deceptively transparent words comprises the words with similar lexical forms. In some cases, the similarity is in the sound (for instance, the pairs cute/acute and price/prize), and in others it is in the morphology (for instance, the pairs economic/economical and reduce/deduce). The lexical comprehension problems happen here for reasons similar to the ones for polysemic or homonymic words: either the learner knows the meaning of only one of the words, or he/she may know both meanings but is not sure as to which one is more appropriate in the given context.

Besides the reading problems that may stem from the words a reader either does not know or only believes he/she knows, Laufer also foresees comprehension difficulties associated with words whose meanings are not easily predictable from context.

Even though the author acknowledges that guessing the meaning of unknown words from a given context is a valuable activity that actually occurs, what is hard to accept, in the author’s view, is:

…taking for granted that guessing in L2 is indeed possible with most unknown words and that successful guessing depends mainly on the learner’s guessing strategies. This seems to be a naïve belief since a variety of factors will interfere with the guessing attempts of the reader. (Laufer 1997: 28)

Among these factors, the author highlights four. The first is related to the fact that the contextual clues may in fact be nonexistent, since the author regards as overoptimistic the basic assumption that the text will offer the necessary clues to guide readers to the meaning of unknown words. In support of this view, the author reports study findings by Kelly (1990) and Bensoussan and Laufer (1984), in which most L2 readers’ attempts to find out the meaning of unknown words through context were ineffective. The second factor regards the impossibility of actually making use of the contextual clues available because they are also unfamiliar to the L2 reader. Therefore, at least for the reader, the contextual clues do not in fact exist, since they cannot be used.

According to the author, the third and fourth factors that interfere with the readers’ attempts to use the context as a guide to find out the meaning of new words can be witnessed, respectively, when the contextual clues are either partial and misleading, leading to an incorrect guess, or suppressed. This last case occurs when readers “disregard information that, according to their world view, seems unimportant, add information that ‘should’ be there, and focus their attention on what, in their opinion, is essential” (Steffensen & Joag-Dev 1984).

Taking into account the multifaceted nature of the three main obstacles to be overcome by L2 learners to acquire vocabulary knowledge, as well as the intimate bond between lexical knowledge and reading comprehension discussed in the previous section, a summary of studies on the strategies used by L2 learners to acquire new vocabulary is presented in the following section, placing special stress on the inference strategy.


Among the various cognitive resources L2 readers apply to learn the meaning of new words, the role played by vocabulary inference has been constantly emphasized and corroborated by research. As a means to illustrate it, study findings from Fraser (1999), Paribakht & Wesche (1999), Cooper (1999) and Alves & Baldo (2008) are presented below.

In Fraser’s study, for instance, the cases of inference use corresponded to 58% of the total number of strategies employed by adult learners of an L2. The second most used strategy, consulting a dictionary, accounted for 39%. Paribakht & Wesche (1999) found an even larger percentage of inferencing use when analyzing how intermediate students of English as an L2 dealt with unknown words in a text: 80%. The remaining 20% were divided into two strategies: repeating the word aloud or rereading, and requesting help, both through direct questioning of the interviewer and through checking in a dictionary.

Similar results were also found in Cooper (1999) and Alves & Baldo (2008). When examining the strategies employed by 18 university-level learners of English as an L2 to figure out the meaning of 20 commonly used idiomatic expressions, Cooper concluded that contextual inference was the most frequent one, accounting for 28% of the total. Another two strategies were also significantly employed: idiom analysis, with 24% of the total, and use of literal meaning, with 19%. Learners also made use of other strategies, but to a lesser extent: while requesting further information was responsible for 8% of the total number of strategies, paraphrase or repetition, use of previous knowledge and use of the L1 were each responsible for 7% of the total.

Alves & Baldo also verified that the use of context was the main strategy adopted by six Brazilian university students, three of them basic learners and three of them intermediate learners of English as an L2, to get to the meaning of ten lexical items presented in a text. Irrespective of their levels of proficiency in the L2, context was used as an inference strategy twice as often as the second most employed strategy, use of the L1, even though the intermediate-level readers employed it more often than the basic-level readers.

Research evidence of the vital role played by the use of the inference strategy in the acquisition of new vocabulary items in an L2 has triggered a number of questions, such as the factors that are involved in successful inferencing and the contribution of different strategies and knowledge sources to effective lexical inference.

The factors, the strategies and the knowledge sources related to lexical inferencing are of different kinds. Amongst the factors that influence the efficacy of the inferential process, it is important to consider the nature of the word and the text, the kind of textual information available, the importance of the word for textual comprehension, the degree of cognitive and mental effort demanded by the task, and the degree of information accessible in the surrounding context. Amongst the strategies, three main types must be considered: the ones related to the internal structure of the word and its components, the ones related to information about syntactic and semantic relations among words, and the ones related to discursive and extratextual clues (Nassaji 2003). Finally, among the knowledge sources, grammatical, morphological and phonological knowledge ought to be considered, as well as world knowledge and knowledge about the association between words and cognates (De Bot, Paribakht & Wesche 1997).

As far as the contribution of different strategies and knowledge sources to successful lexical inferencing by L2 learners is concerned, it is worth mentioning here Nassaji’s study (2003), whose main objective was to find out how successful 21 intermediate students of English as an L2 would be in their attempts to infer the meaning of ten unknown words presented in a text through context. After classifying the strategies and knowledge sources employed by the learners, the author related the types of strategies and knowledge sources employed with lexical inference success.

Verbal protocol analysis revealed that, out of 199 answers, learners were unsuccessful in 55.8% of them, successful in 25.6%, and partially successful in the remaining 18.6%. The individual analysis of each word suggested that learners’ inferential success was related to the physical form and appearance of the word, since many of the students who interpreted specific words incorrectly mistook them for others of similar appearance that were semantically unrelated. Besides, out of the five types of knowledge sources identified by the author – grammatical, morphological, discursive, L1 and world knowledge – the most used was world knowledge (46.2%), followed by morphological knowledge (26.9%). Amongst the six types of strategies – repeating (word and section repeating), verifying, self-inquiry, analyzing, monitoring and analogy – repeating was the most used, accounting for 63.7% of all strategy occurrences. All the other strategies were employed far less systematically, each accounting for less than 10% of the total number of occurrences. Finally, regarding the relationship between knowledge sources and strategies and degree of inferential success, Nassaji found that morphological knowledge was the most important knowledge source for successful inferencing, the same being true of the verifying, self-inquiry and section repeating strategies. However, the author warns that, although some of the strategies were more related than others to effective lexical inference attempts, the global contribution of these strategies was partial and limited.

The following section presents the methodology adopted in our study, based to a great extent on Nassaji’s work.


Sixteen Brazilian post-graduate students of languages who had worked as teachers of English as an L2 for at least five years were chosen to participate in the study, in order to guarantee that all participants were proficient in the L2. There were two research instruments: (i) a vocabulary activity, made up of four lexical items taken from a review of the Brazilian movie City of God by Fernando Meirelles, published in the newspaper The Philadelphia Inquirer on January 24th, 2003; (ii) introspective verbal protocols of the subjects while doing the activity, through which the knowledge sources and strategies employed by them were inferred.

The selection of the words for the vocabulary activity was made by two experts in the teaching of English as an L2, taking into account the participants’ proficiency level and the text genre. The words are displayed in Table 1, with the surrounding context in which they appear in the text.


The vocabulary inference activity started with a silent reading of the text, followed by the answers to the lexical items through the verbal protocol technique – i.e., the participants were asked to verbalize their thoughts while doing the activity, so that the researcher could infer the cognitive processes that were being employed.

In individual sessions, participants received prior training in the verbal protocol technique. Once the researcher was assured that they had understood the methodology, they read the text silently and tried to infer the meaning of the words. The answers were tape-recorded for later transcription. Based on the transcriptions, the researcher inferred both the strategies and knowledge sources used in the participants’ inferential processes and the appropriateness of the answers, according to a previous evaluation by the experts in the teaching of English as an L2. Finally, these new data were analyzed taking into account the objectives of the study.

Three levels of vocabulary inference appropriateness were described – appropriate (A), partially appropriate (PA), and inappropriate (I) – corresponding to four, two and zero points, respectively. The strategies and knowledge sources were based on the classification set forth by Nassaji (2003), as mentioned in the previous section, with minor adaptations. They are shown in Tables 2 and 3 below, with the respective definitions.
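The scoring scheme just described can be illustrated with a minimal sketch. The point values come from the text; the function name `inference_score` and the example ratings are hypothetical and do not reproduce any participant's actual answers.

```python
# Points per appropriateness level, as stated in the text.
POINTS = {"A": 4, "PA": 2, "I": 0}

def inference_score(ratings):
    """Sum the points for one participant's rated answers."""
    return sum(POINTS[rating] for rating in ratings)

# Hypothetical participant: two appropriate answers, one partially
# appropriate and one inappropriate, for the four lexical items.
example_ratings = ["A", "A", "PA", "I"]
example_score = inference_score(example_ratings)

# Highest possible score with four lexical items, all rated appropriate.
max_score = inference_score(["A"] * 4)

print(example_score, max_score)
```

With four lexical items in the activity, a participant's score under this scheme ranges from 0 to 16.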


This section is subdivided into two parts. The strategies and knowledge sources used by the participants are described in the first part, and the results of the correlations between the use of these resources and vocabulary inference performance are discussed in the second one.


Participants’ verbal protocols revealed that the most frequently used strategies, with more than 20 occurrences each, were rereading and analysis, while the least used, with 10 occurrences or fewer each, were analogy, self-inquiry and automatic retrieval, as shown in Table 4.

Looking at Table 4, it is possible to see that rereading was used 40 times throughout the vocabulary inferential activity. Given that the total number of strategies was 122, that means that rereading alone was responsible for 32.8% of the totality of the strategies used. The second most used strategy was analysis, with 22 occurrences. Therefore, participants’ attempts to find out the meaning of a word by analyzing its constituent parts accounted for 18% of the total.

On the other hand, as previously mentioned, analogy, self-inquiry and automatic retrieval were the strategies with the fewest occurrences, together accounting for 15.6% of the total number of strategies used in the vocabulary activity. While analogy – i.e., attempting to learn the meaning of a new word based on the similarity of sound or form with other words – was employed nine times (7.4% of the total), self-inquiry was employed six times (4.9% of the total), and automatic retrieval, four (3.3% of the total).

When comparing these findings with the ones reported in Nassaji’s study described in section 4, one aspect seems worth noticing: while in Nassaji’s study rereading was the only significantly used strategy, accounting for more than 60% of the total number of strategies, in this study two strategies were classified as the most used ones, rereading and analysis, even though rereading was still employed more often than analysis. A possible explanation for that can be found in the participants’ L2 proficiency level in both studies: while Nassaji’s subjects were categorized as intermediate learners of English as an L2, the subjects of this study were placed at a higher level of proficiency. Therefore, it may be assumed that their larger L2 knowledge enabled them to make use of the analysis strategy much more often than the intermediate learners did.

As far as the knowledge sources are concerned, data show that the most utilized, with at least 20 occurrences throughout the vocabulary activity, were the ones related to morphological and discursive knowledge. On the other hand, world knowledge, L1 knowledge and grammatical knowledge were the least used, each one of them with fewer than 10 occurrences, as depicted in Table 5.

Bearing in mind that a total of 93 knowledge sources were used by the participants while attempting to infer the meaning of the lexical items, the 51 occurrences of discursive knowledge corresponded to 54.8% of the total, and the 26 occurrences of morphological knowledge to 28.0%.

The remaining three knowledge sources were used far less: there were eight occurrences of world knowledge, five occurrences of L1 knowledge, and three occurrences of grammatical knowledge as a means to find out the meaning of unknown words. In percentages, out of 93 knowledge source occurrences, world knowledge accounted for 8.6% of the total; L1 knowledge, for 5.4%; and grammatical knowledge, for 3.2%.
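The shares reported for the five knowledge sources can be recomputed directly from the occurrence counts given in the text. The following sketch does only that; the dictionary keys and variable names are labels chosen here for illustration, and percentages are rounded to one decimal place, so a value may differ from a truncated figure by a tenth of a point.

```python
# Occurrence counts of each knowledge source, as reported in the text.
knowledge_source_counts = {
    "discursive": 51,
    "morphological": 26,
    "world": 8,
    "L1": 5,
    "grammatical": 3,
}

total_occurrences = sum(knowledge_source_counts.values())

# Share of each knowledge source, in percent, rounded to one decimal place.
shares = {
    source: round(100 * count / total_occurrences, 1)
    for source, count in knowledge_source_counts.items()
}

print(total_occurrences)
for source, share in shares.items():
    print(f"{source}: {share}%")
```

The five counts add up to the 93 occurrences mentioned above, which confirms that these categories cover all the knowledge sources identified in the protocols.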

Three comments seem relevant here. The first one is related to the parallelism between the most frequently used knowledge sources and the participants’ two preferred strategies, rereading and analysis, a fact that can be regarded as coherent to a great extent and, therefore, expected. Given that discursive knowledge is related to inter- and intra-sentence connections and to the devices that link the different parts of the text, when the subjects chose to rely on discursive knowledge to find out the meaning of the new words, no other strategy would be as appropriate as rereading the surrounding context in which the lexical item was included. Following the same rationale, the participants’ option for resorting to their L2 morphological knowledge – i.e., knowledge based on the structure and formation of the word – to answer the vocabulary items should be accompanied, to be successful, by the analysis strategy.

The second aspect worth mentioning regards the mismatch between the most used knowledge source in Nassaji’s study and in this study: while in the former world knowledge accounted for 46.2% of the total number of knowledge sources adopted by the participants, in the latter discursive knowledge was the most used one, with 54.8% of the total number of occurrences. It seems reasonable to think that one of the possible explanations for that is the same one used to justify the different percentages in the use of lexical inference strategies by the two groups: their L2 proficiency level. While participants in this study were more able to use the contextual clues, recognizing and/or exploring the intra- and inter-sentence connections, most likely due to a more solid knowledge of the L2, participants in Nassaji’s study did not possess enough L2 linguistic background to do so, at least not as often as desirable. What the data seem to suggest is that they replaced discursive knowledge with world knowledge, a more general (and, hence, less precise) source of knowledge. 1

Finally, the third comment regards the efficient use of strategies and knowledge sources by the participants, as their scores displayed in Table 6 can confirm. As explained previously, three levels of answer appropriateness were considered: appropriate (A), partially appropriate (PA), and inappropriate (I), with the corresponding values being four, two and zero points, respectively. Given that there were four lexical items in the vocabulary activity, the highest possible score per participant was 16. Out of the 64 answers given by the sixteen participants, 53 were considered appropriate, corresponding to 82.8% of the total, with only eleven of them (17.2%) classified as inappropriate or partially appropriate.

Based on such data, two conclusions can be drawn. The first one is that the high percentage of appropriate lexical inferences matches the participants’ proficiency level in the L2, showing that, despite the fact that the words were unknown to them, they were able to use the context efficiently, through the frequent use of rereading and of discursive knowledge, as well as through the knowledge of the L2 structure, given the constant use of the analysis strategy and of morphological knowledge, as just mentioned.

The second conclusion is related to the difference between the successful use of context as a resource for lexical inference observed among the participants in this study, and the findings of studies that investigated learners with lower levels of proficiency in the L2, which identified only a limited efficacy in the use of context in the inferential answers (Bensoussan & Laufer 1984; Kelly 1990; Nassaji 2003; Alves & Baldo 2008; see section 4). A plausible explanation for this fact seems to lie not specifically in the frequency of use of such a resource by the more and less proficient subjects in the L2, but in the kind of context that is presented to the less proficient ones: in other words, a context with limited or unavailable clues, as emphasized by Laufer (1997; see section 3), leading to a mistaken interpretation of the vocabulary meaning.

6.2 Strategies and Knowledge Sources and Vocabulary Inference Performance

In order to check possible significant correlations between strategies and knowledge sources used by the participants and their performances in lexical inferencing, Pearson correlation tests were used.

Even though most answers were appropriate, test results showed that the vocabulary activity scores did not correlate significantly with either of the two most used strategies, rereading and analysis. Both correlations were actually quite low: -.132 for rereading and .208 for analysis. Similarly, no significant correlations were found between the participants’ scores and the two most employed knowledge sources, discursive and morphological: the coefficients were -.194 for discursive knowledge and .317 for morphological knowledge.
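For readers less familiar with the test, the sketch below shows how a Pearson coefficient is computed and how a reported coefficient can be checked for significance. It is an illustration under stated assumptions, not a reconstruction of the original analysis: it assumes one pair of values per participant (n = 16), a two-tailed test at the .05 level (critical t of 2.145 for 14 degrees of freedom), and uses made-up data for the `pearson_r` demonstration, since the individual scores are not given in the text.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("samples must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariation / math.sqrt(ss_x * ss_y)

def t_statistic(r, n):
    """t value used to test whether a correlation r from n pairs differs from zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Made-up data, only to show the function at work (perfectly linear relation).
demo_r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])

# Assumptions: n = 16 participants, two-tailed test, alpha = .05, df = 14.
N_PARTICIPANTS = 16
T_CRITICAL = 2.145

# Coefficients as reported in the text.
reported = {
    "rereading": -0.132,
    "analysis": 0.208,
    "discursive knowledge": -0.194,
    "morphological knowledge": 0.317,
}
for resource, r in reported.items():
    t = t_statistic(r, N_PARTICIPANTS)
    verdict = "significant" if abs(t) > T_CRITICAL else "non-significant"
    print(f"{resource}: r = {r:+.3f}, t = {t:+.2f}, {verdict}")
```

Under these assumptions, all four reported coefficients fall below the critical value, which is consistent with the absence of significant correlations described above.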

It is relevant to note that this finding differs, to a certain degree, from the study by Nassaji discussed here, in which significant correlations between the level of success of lexical inferences and knowledge sources employed were found (see section 4). However, Nassaji’s final interpretation of his findings led him to state that, even though certain strategies and knowledge sources were more related to successful inferencing than others, “their global contributions were partial and limited” (662). Considering that, he suggested that success in lexical inferencing may not depend only on the use of certain strategies, but also on the extent to which such use is “combined and coordinated with other sources of information in and outside the text” (663).

Therefore, what matters most seems to be the way these strategies and knowledge sources are employed, rather than the frequency of use, a statement already present in Anderson’s study (1991) about reading strategies in the L2 and now corroborated by this study, grounded on the absence of significant correlations between use of strategies and knowledge sources and lexical inference performance.


The objective of this article was to present the results of a study that investigated the cognitive resources used by sixteen Brazilian proficient readers of English as an L2 in their attempts to infer the meaning of four lexical items. The resources were subdivided into strategies and knowledge sources, based on Nassaji (2003). The aims of the study were to identify the most frequently used resources, as well as possible correlations between these resources and vocabulary inference performance.

Based on the verbal protocols, it was possible to verify that the rereading and the analysis strategies, as well as the discursive and the morphological knowledge sources, were the resources most frequently employed by the participants. Furthermore, no significant correlations between the use of these resources and the level of participants’ inferential success were found.

Data analysis led to two main conclusions. The first one is related to the efficacy of using the intra- and inter-sentential context as a means to learn the meaning of unknown words, as long as such use is backed up by a solid knowledge of the L2. The second regards the complex task of classifying some inference strategies as more efficient than others, given the variety of factors that play a role in L2 vocabulary inference success. Even though these conclusions should be evaluated with caution and cannot be generalized, given the quantitative nature of the study, we hope that they contribute to the debate on how L2 learners of different proficiency levels infer the meaning of unknown words, since, as previously emphasized here, lexical acquisition is one of the fundamental steps for successful acquisition of a second language.


1 Within this context, it is interesting to note that, among the participants of this study, world knowledge was much less used: 8.6% of the total number of knowledge sources used.


