Session 9: Lecture notes

Learning, teaching and assessing vocabulary (III): Current vocabulary test development; vocabulary size; the relationship between vocabulary teaching and assessment

We have so far covered vocabulary acquisition and learning strategies, and vocabulary teaching principles. In this session (9), we focus on vocabulary assessment, covering current vocabulary test development, vocabulary size, and the relationship between vocabulary teaching and assessment. Specifically, we address four questions: 1) why we test vocabulary; 2) what vocabulary (words) to test; 3) how we test the vocabulary; and 4) what we should generally consider in assessing vocabulary.

I.     Why we test vocabulary

Leading experts in vocabulary studies and vocabulary testing, e.g. Norbert Schmitt (2000), Paul Nation (2001) and John Read (2002), hold that vocabulary knowledge is a very important component of both first and second language proficiency, and that it is natural to assess the speaker's and/or learner's vocabulary knowledge in some way. According to Read (2002, p. 304), 'vocabulary, along with grammar and reading comprehension, was the aspect of language that was most commonly included in the new objective tests'.

Although modern vocabulary tests are often embedded in integrative tests (covering listening, speaking, reading and writing), there are a number of reasons for testing vocabulary, or for testing language skills with a 'lexical' focus. These reasons correspond to the commonly known types of tests.

Perhaps the most common one is to find out if students have learned the words that were taught, or that they were expected to learn (achievement test). Alternatively, a teacher may want to find where students' vocabularies have gaps, so that specific attention can be given to those areas (diagnostic test).

Vocabulary tests can also be used to help place students in the proper class level (placement test). Vocabulary items that are part of commercial proficiency tests, such as the TOEFL, provide some indication of a learner's vocabulary size, which is related to overall language proficiency.

Other possibilities include utilizing tests as a means to motivate students to study, to show students their progress in learning new words, and to make selected words more salient by including them on a test.

On a more general basis, a test's purpose is related to the kind of lexical information desired. A typical purpose for testing is to obtain an estimate of the size of learners' vocabularies, that is, how many words or word families they know. This is sometimes referred to as breadth of knowledge. Another possibility is to measure how well target words or word families are known (depth or quality of knowledge) (Schmitt 2000, p. 164).


II.    What vocabulary (words) to test

The words and the range of words to be selected for testing purposes vary from test to test. Basically, 'if the teacher wants to test students' class achievement, then the words tested should obviously be drawn from the ones covered in the course' (Schmitt 2000, p. 164).

Vocabulary tests used for placement or diagnostic purposes may need to sample from a more general range of words. If the students to be tested all come from the same school, or have been taught from similar syllabi, then it is possible to draw words from those taught in their courses. However, if students come from different schools with different syllabi and language-teaching methodologies, as may be the case in a university placement situation, then the selection must be more broadly based. In these cases, words are often taken from word-frequency lists.

Vocabulary tests that are part of proficiency tests need to include the broadest range of words of all. Many universities rely on commercial proficiency tests to control admissions. Therefore, the tests must include a range of words that will provide a fair evaluation of people of different nationalities, native languages, and cultures, as well as proficiency levels. Some of the words on these tests must be uncommon enough to allow the highest-level test takers to demonstrate their superior knowledge (Schmitt 2000, p. 165).


III.     How we test the vocabulary

How we test the selected vocabulary depends, to a large extent, on what aspects of these selected words we want to test. As we have discussed earlier in this module, 'knowing a word' involves a number of aspects. For example, according to Nation (2001, p. 27), knowing a word involves knowing its 'form', 'meaning' and 'use':

Form
    spoken                       What does the word sound like?
                                 How is the word pronounced?
    written                      What does the word look like?
                                 How is the word written and spelled?
    word parts                   What parts are recognizable in this word?
                                 What word parts are needed to express the meaning?

Meaning
    form and meaning             What meaning does this word form signal?
                                 What word form can be used to express the meaning?
    concept and referents        What is included in the concept?
                                 What items can the concept refer to?
    associations                 What other words does this make us think of?
                                 What other words could we use instead of this one?

Use
    grammatical functions        In what patterns does the word occur?
                                 In what patterns must we use this word?
    collocations                 What words or types of words occur with this one?
                                 What words or types of words must we use with this one?
    constraints on use           Where, when, and how often would we expect to meet this word?
    (register, frequency ...)    Where, when, and how often can we use this word?


As pre-service English language teachers, and as learners of English (as a second/foreign or native language), we have intuitions about how to test the different aspects of knowing a word. For example, to test whether students know how the selected words are written and spelled, we can give a 'dictation'; to test whether they know what other words could be used instead of the selected words, we can ask them to identify the words with the closest meanings in a multiple-choice vocabulary test; and to test whether they know the grammatical functions, collocations and constraints on use of the selected words, we can set a cloze test or a guided writing test to elicit the relevant lexical knowledge.

There are various formats for testing vocabulary. As with vocabulary learning and teaching strategies, or with testing other aspects of language, there is no single correct or best format. The choice of item format(s) has to be based on the purpose(s) of the test, the profile of the test takers, and other considerations such as the validity, reliability, practicality, and washback (or backwash) effect of the test(s).

The following are some example item formats that Schmitt (2000), Nation (2001), and Read (2002) provide for testing vocabulary (note that 'dictation' and 'interview' formats are not included, as they involve 'verbal' instructions):

1.     L1 translation: write your native-language translation for the target word

e.g.       serious ______

            support ______

            experience ______

            difficult ______                                                                                             (Schmitt 2000)

or         Translate the underlined words into your first language

1)      You can see how the town has developed.            ______

2)      I cannot say much about his character.                 ______

3)      Her idea is a very good one.                                   ______

4)      I want to hear only the facts.                                   ______                       (Nation 2001)

2.    Synonym matching

e.g.       firm                  A. deep              B. hard             C. warm           D. clean     

(Schmitt 2000)

            foolish              A. clever              B. mild             C. silly             D. frank

(Read 2002)

He was guilty because he did those things deliberately.

            A. both                                                B. noticeably

            C. intentionally                                    D. absolutely                             (Nation 2001)

            Nutritionists categorize food into seven basic groups.

            A. clarify         B. grind            C. classify           D. channel                 (Read 2002)


3.    Filling in blanks (or Sentence completion)

e.g.       A ______ is a large cat with stripes that lives in the jungle.                 (Schmitt 2000)

or         Fill in the blank with the correct option

            A ______ is used to eat with.

            A. plow              B. fork             C. hammer           D. needle               (Read 2002)    


or         Choose one word from the list on the right to complete the sentence. Do not use the same word twice.


            1) A journey straight to a place is ______                                            acute

            2) An illness that is very serious is ______                                          common

            3) A river that is very wide is ______                                                    bare

            4) Part of your body that is not covered by any clothes is ______        alien

            5) Something that happens often is ______                                         broad


(Nation 2001)

4.    Identifying the meanings of words

e.g.       chronic means

A. lasting for a long time            B. dissatisfied

            C. to greatly decrease               D. effective and harmless

            E. don't know                                                                                     (Nation 2001)


casualty means

A.     someone killed or injured

B.     noisy and happy celebration

C.     being away from other people

D.     middle class people                                                                     (Nation 2001)


The writing on the page was illegible.

            A. handwritten in ink               B. written in large letters

            C. difficult to read                   D. written in many colors                    (Schmitt 2000)


5.    Matching

e.g.      1) bitter

            2) independent             ______ small

            3) lovely                        ______ beautiful

            4) merry                        ______ liked by many people

            5) popular

            6) slight                                                                                                (Schmitt 2000)


1)      He saw a bull        

2)      She was a champion           ______ formal and serious manner

3)      He lost his dignity                 ______ winner of a sporting event

4)      This is like hell                     ______ building where valuable objects are shown

5)      She liked the museum                   

6)      This is a good solution                                                                (Nation 2001)


6.    Checklist tests

Target words are presented on a list and learners are merely required to check if they know them or not. ... Non-words that look like real words but are not, such as flinder or trebron, are put onto the test along with the real words. If some of these non-words are 'checked', that indicates that the student is overestimating his or her vocabulary knowledge.                        (Schmitt 2000)
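
The scoring idea behind the checklist format can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name score_checklist, the sample response sheet, and the simple 'hit rate minus false-alarm rate' adjustment are all hypothetical choices made here, and are not a scoring procedure given by Schmitt (2000).

```python
def score_checklist(checked, real_words, non_words):
    """Summarize one learner's yes/no checklist responses.

    checked    -- items the learner ticked as 'known'
    real_words -- the real target words on the list
    non_words  -- invented items that only look like words
    """
    checked = set(checked)
    hits = len(checked & set(real_words))         # real words ticked
    false_alarms = len(checked & set(non_words))  # non-words ticked

    hit_rate = hits / len(real_words)
    false_alarm_rate = false_alarms / len(non_words)

    # Assumption for illustration only: discount the hit rate by the
    # false-alarm rate, and never report a negative score.
    adjusted = max(0.0, hit_rate - false_alarm_rate)

    return {
        "hit_rate": hit_rate,
        "false_alarm_rate": false_alarm_rate,
        "adjusted": adjusted,
    }


# Hypothetical response sheet: three real words and one non-word ticked.
result = score_checklist(
    checked=["serious", "support", "experience", "flinder"],
    real_words=["serious", "support", "experience", "difficult"],
    non_words=["flinder", "trebron"],
)
print(result)
```

A false-alarm rate above zero is the signal mentioned above: the learner has ticked at least one non-word, so the raw count of ticked real words probably overstates what he or she knows.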


7.    The word associates test

The target word is followed by eight options, four of which have some relationship with the target word and four of which do not (normally, two out of the four words from each box have some relationship with the target word).

e.g.       sudden

beautiful   quick   surprising   thirsty

change   doctor   noise   school

                                                                                                                                  (Schmitt 2000)


arithmetic   film   pole   publishing

revise   risk   surface   text

                                                                                                                                 (Read 2002)

or         Circle the entry that does not fit with the rest of the group.

A.        editorial                                                 B.         court

            business section                                               lawyer

            cartoons                                                            jail

            weather                                                            author

            research paper                                                 judge

            advertisements                                                 jury                                      (Schmitt 2000)


8.    Making a sentence with the target word

e.g.       Please make a sentence with each of the following words to show that you know them.



9.    C-test (cloze test) with or without choices (options)

e.g.       The following C-test is created by selecting a short text and deleting the second half of every second word in the text.

What is so interesting about work that a whole branch of sociology can be devoted to it?
In t______ first pl______, no mat______ how affl______ our soc______ becomes, t__ necessity t______ work wi______ still rem______ the cen______ of o______ existence. Seco______, the nat______ of wo______ is chan______ so rap______ at t______ present ti______ that ma______ people a______ bewildered b______ it, ma______ sociologists bel______ they c______ help th______ to avoid many of the mistakes made in the past.

(Read 2002)
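
The deletion rule used in the C-test above (remove the second half of every second word) is mechanical enough to sketch in Python. This is a minimal sketch under assumptions: the function name make_c_test is hypothetical, counting starts so that the second word is the first one mutilated, words with an odd number of letters lose the larger half (as in 'the' becoming 't'), and tokens containing apostrophes or hyphens, as well as one-letter words, are simply left intact.

```python
import re


def make_c_test(text, blank="______"):
    """Delete the second half of every second word in `text`."""
    tokens = text.split()
    result = []
    for index, token in enumerate(tokens):
        # Separate the letters of the word from any trailing punctuation.
        match = re.match(r"^([A-Za-z]+)(\W*)$", token)
        is_second_word = index % 2 == 1  # 2nd, 4th, 6th ... token
        if is_second_word and match and len(match.group(1)) > 1:
            word, tail = match.groups()
            keep = len(word) // 2        # odd lengths lose the larger half
            result.append(word[:keep] + blank + tail)
        else:
            result.append(token)         # left intact
    return " ".join(result)


print(make_c_test("In the first place, no matter how affluent our society becomes,"))
```

Because the opening sentence of a C-test passage is normally left whole (as in the example above), only the text after that sentence would be passed to the function.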

or         Read the following passage carefully and choose the most appropriate answer for each blank.

Vocabulary is learned incrementally and this obviously means that lexical acquisition requires multiple exposures to a word. This is certainly true for ___1)___ learning, as the chances of learning and retaining a word from one exposure when reading are only about 5%-14%. Other studies suggest that it requires five to sixteen or more ___2)___ for a word to be learned. If recycling is neglected, many ___3)___ known words will be forgotten, wasting all the effort already put into learning them. Fortunately, this recycling occurs naturally as ___4)___ frequent words appear repeatedly in texts and conversations. This repetition does not happen to nearly as great an extent for ___5)___ frequent words, so teachers should look for ways to bolster learner ___6)___ input to offset this. ___7)___ reading seems to be one effective method.

For explicit learning, however, recycling has to be consciously built into any study program. Teachers must guard against presenting a word once and then forgetting about it, ___8)___ their students will do the same. This implies developing a more structured way of presenting vocabulary that ___9)___ words repeatedly in classroom activities. Learning activities themselves need to be designed to require ___10)___ manipulations of a word, such as in vocabulary notebooks in which students have to go back and add additional information about the words. Understanding how memory behaves can help us design programs that give maximum benefit from revision time spent.

1)         A. incidental         B. accidental        C. partial                      D. deep

2)         A. chances          B. repetitions       C. sessions                  D. cycles

3)         A. completely       B. well                  C. hardly                      D. partially

4)         A. more                B. far                   C. better                       D. much

5)         A. so                    B. less                 C. the                           D. fewer

6)         A. output              B. input               C. interest                     D. motivation

7)         A. Intensive          B. Compulsive     C. Voluntary                 D. Extensive

8)         A. in case             B. so that            C. or else                     D. only if

9)         A. reintroduces     B. reiterates       C. reinstates                 D. reinterprets

10)       A. multiple             B. complex        C. sophisticated            D. prolonged

(Marc Xu, Zhichang)


10.    Embedded in reading comprehension

e.g.      In a democratic society suspected persons are presumed innocent until proven guilty. The establishment of guilt is often a difficult task. One consideration is whether or not there remains a reasonable doubt that the suspected persons committed the acts in question. Another consideration is whether or not the acts were committed deliberately. Still another concern is whether or not the acts were premeditated.

Please answer the following questions or complete the following statements by choosing the most appropriate option, based on the passage.

4) The word deliberately in line four of the passage means

                        A. both                   B. noticeably                C. intentionally                D. absolutely

(Nation 2001)


IV.     What we should generally consider in assessing vocabulary

We should consider a number of issues when we design a vocabulary test, or an integrative test with a 'vocabulary' focus.

1.   In general, a good vocabulary test has plenty of items (around 30 is probably a minimum for a reliable test). It uses a test item type which requires learners to use the kind of vocabulary knowledge that you want to test. It is easy enough to make, mark and interpret, and it has a good effect on the learning and teaching that leads up to the test and follows it (Nation 2001, p. 345).

2.  The same criteria of reliability, validity, practicality and backwash need to be considered when designing and evaluating vocabulary tests (Nation 2001, p. 344).

Validity ... refers to how well a test measures what it is supposed to measure. In other words, do learners' responses to the test items represent their actual knowledge of the target words?

Reliability concerns the stability or consistency of a test's behavior over time. If an examinee took a test several times, without his or her ability changing, the test would ideally produce the same score on each administration (perfect reliability).

In terms of validity and reliability, longer tests are better. But unreasonably extensive tests often fail in terms of practicality; for example, tests with hundreds of words are unlikely to be of much use in the classroom (Schmitt 2000, pp. 166-167).

Tests have consequences far beyond providing estimates of examinees' abilities. They also shape the way learners view the content of a course. Most teachers are aware that learners partially judge the importance of classroom material by whether it appears on subsequent tests or not. This effect is called backwash (or washback) (Schmitt 2000, p. 163).
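
The notion of reliability as consistency across administrations can be illustrated numerically. The sketch below assumes that the same learners sit the same vocabulary test twice and that the two sets of scores are compared with a Pearson correlation; this test-retest approach is one common way of estimating reliability, not a procedure prescribed in the sources cited here. The function name pearson_r and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    n = len(first)
    mean_first = sum(first) / n
    mean_second = sum(second) / n

    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((a - mean_first) * (b - mean_second)
                for a, b in zip(first, second))
    squares_first = sum((a - mean_first) ** 2 for a in first)
    squares_second = sum((b - mean_second) ** 2 for b in second)

    return cross / sqrt(squares_first * squares_second)


# Hypothetical scores of five learners on two sittings of a 30-item test.
first_sitting = [18, 25, 12, 29, 21]
second_sitting = [19, 24, 13, 28, 22]

print(round(pearson_r(first_sitting, second_sitting), 2))
```

A value close to 1 would suggest that the test ranks learners consistently across the two sittings; a much lower value would suggest that scores depend heavily on something other than the learners' vocabulary knowledge.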

3.  John Read has proposed three dimensions that determine the nature of vocabulary tests.

Discrete                                      Embedded
A measure of vocabulary knowledge or          A measure of vocabulary that forms part
use as an independent construct               of the assessment of some other, larger
                                              construct

Selective                                     Comprehensive
A measure in which specific vocabulary        A measure that takes account of the whole
items are the focus of the assessment         vocabulary content of the input material
                                              (reading / listening tasks) or the test
                                              taker's response (writing / speaking tasks)

Context-independent                           Context-dependent
A vocabulary measure in which the test        A vocabulary measure that assesses the
taker can produce the expected response       test taker's ability to take account of
without referring to any context              contextual information in order to produce
                                              the expected response

(Schmitt 2000, p. 173)

In practice, items set on traditional tests have mainly been selective and context-independent, while the tests themselves have tended to be discrete. But the more test writers wish to measure learners' ability to actually use words in real-world situations, the further the tests need to move toward the embedded, comprehensive, and context-dependent ends of the continuums (Schmitt 2000, p. 174).

4.   There is a general feeling that first language translations should not be used in the teaching and testing of vocabulary. This attitude is quite wrong. ... Translation ... may be discouraged for political reasons, because teachers do not know the learners' first language, or because first language use is seen as reducing opportunities for second language practice. However, the use of the first language to convey and test word meaning is very efficient. ... The greatest value of the first language in vocabulary testing is that it allows learners to respond to vocabulary items in a way that does not draw on second language knowledge which is not directly relevant to what is being tested (Nation 2001, p. 351).

5.    Multiple-choice items are popular because they are easy to mark. ... Comparison with other item types like translation, asking the learners to use the word in a sentence, and blank filling with choices shows that it is generally the easiest of the item types for ... language learners to answer (Nation 2001, p. 349).

6.    Under the influence of the concept of communicative competence in applied linguistics, language testers have moved during the last 15 years towards the design of test tasks that incorporate characteristics of ... 'target language use' situations. In other words, test-takers should be given tasks that simulate situations in which they are likely to use the second language outside of the learning environment (Read 2002, p. 318).

Context has a whole variety of potential effects: on the particular meaning of the word, its connotations, the appropriateness of its use, its interpretability within the linguistic environment, the motivation of the learner to understand it, and so on (Read 2002, p. 319). ... Decontextualized tests will have a negative washback effect, encouraging learners to concentrate just on studying dictionaries and word lists, to the detriment of their acquisition of a more broadly based lexical ability (Read 2002, p. 320).

7.    We have seen that no test format gives a complete specification of how well a word is known. This means that vocabulary tests give incomplete information about an examinee's lexical knowledge. Thus, you need to be careful about how you interpret test results (Schmitt 2000, p. 178).

8.    The future trend in vocabulary testing is likely to be towards the design of integrative tests (of listening, speaking, reading and writing skills) that have a strong lexical focus but in which vocabulary ability is one of several factors that contribute to test-taker performance (Read 2002, p. 320). ... Perhaps we need to adopt a broader view of what constitutes a vocabulary test, beyond the dominant notion of a measure of learner knowledge of specific words (Read 2002, p. 303).




Nation, I. S. P. (2001). Learning Vocabulary in Another Language. Cambridge, New York, Cambridge University Press.

Read, J. (2002). Vocabulary and testing. In N. Schmitt and M. McCarthy (Eds.), Vocabulary: Description, Acquisition and Pedagogy. Cambridge, Cambridge University Press: 303-320.

Schmitt, N. (2000). Vocabulary in Language Teaching. Cambridge, New York, Cambridge University Press.