An Analysis of a Test of Intensive Reading of English

2017-03-20 21:03 Song Ping
Campus English (校园英语·中旬), Mid-Month Edition, 2017, Issue 2

Song Ping (宋萍)

【Abstract】This paper has five sections. The first explains why I decided to evaluate a final-term test that I wrote. The second reviews the relevant literature on how to evaluate a language test appropriately. The third describes the context in which the test was taken. The fourth analyzes the test in the light of language testing (LT) theories, and the last draws a conclusion.

【Key words】Intensive Reading of English; Analysis on a Test

1. Introduction

I have been teaching for nine years. The reason I chose to evaluate this test in this paper is that, at Tongren University, Comprehensive English is the first core course approved by the Education Office of Guizhou Province. The teaching materials we use in this course, such as the syllabus, PPT slides, assignments and the test item bank, will be promoted to other colleges and universities in Guizhou Province. Therefore, as part of the construction of the core course, it is essential to design excellent tests through continuous assessment by myself or my peers.

2. Literature Review

2.1 The definition of LT

Bachman introduced the fundamental concepts of measurement, including the terms “measurement”, “evaluation” and “test”, and showed how they are distinct from each other (Bachman, 1990 pp18-23). Bachman takes language tests (LT) as “criterion measures of language abilities in second language acquisition research” (Bachman, 1990 p2). Here language abilities are viewed as “a set of finite components—grammar, vocabulary, pronunciation, and spelling-that were realized as four skills—listening, speaking, reading and writing” (Bachman & Palmer 1996 p4). Zeng Yongqiang defined a test as “a standardized procedure for sampling behavior and describing it with categories or scores” (Zeng Yongqiang, 2006). Accordingly, LT can be understood as an instrument for measuring learners’ and teachers’ performance in language learning.

2.2 What is reliability in LT?

Wood’s, Henning’s and Bachman’s views on reliability in LT

Name | View on reliability in LT
Robert Wood | “Reliability is concerned with the consistency of examinee performance” (1993, p132).
G. Henning | “Reliability has to do with accuracy of measurement. This kind of accuracy is reflected in the obtaining of similar results when measurement is repeated on different occasions or with different instruments or by different persons. This characteristic of reliability is sometimes termed as consistency” (1987, p73). “Reliability has been shown to be another word for consistency of measurement.” (1987, p75)
Lyle F. Bachman | Reliability can be defined as “consistency of measurement”; Bachman holds that “a reliable test score will be consistent across different characteristics of the testing situation” (1996, p19).
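Consistency of measurement, as the three authors describe it, is usually estimated numerically once item-level scores are available. The sketch below is not part of the analysis in this paper, since no score data were available for the test. It assumes a hypothetical matrix of right/wrong item scores and computes the Kuder-Richardson 20 (KR-20) coefficient, one common internal-consistency estimate for dichotomously scored items. The function name `kr20` and the sample data are illustrative assumptions only.

```python
from statistics import pvariance


def kr20(scores):
    """Estimate internal consistency (KR-20) for 0/1 item scores.

    scores: list of rows, one row per candidate, each row a list of
    0/1 values, one per item. All rows must have the same length.
    """
    n_people = len(scores)
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("KR-20 needs at least two items")

    # Variance of candidates' total scores (population variance,
    # to match the p * (1 - p) item variances used below).
    totals = [sum(row) for row in scores]
    total_var = pvariance(totals)
    if total_var == 0:
        raise ValueError("total scores have no variance")

    # Sum of item variances: p is the proportion answering correctly.
    pq_sum = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in scores) / n_people
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)


if __name__ == "__main__":
    # Hypothetical data: 5 candidates, 4 objective items.
    sample = [
        [1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 0, 1, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0],
    ]
    print(round(kr20(sample), 3))
```

A value closer to 1 indicates that the items rank candidates more consistently. KR-20 applies only to objective items scored right or wrong, so the translation and writing items of a test like this one would need a different estimate.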

2.3 What is construct validity in LT?

Henning (1987) states that validity in general “refers to the appropriateness of a given test or any of its component parts as a measure of what it is purported to measure” (1987, p 89). He also identifies threats to test validity, such as invalid application of tests, inappropriate selection of content, imperfect cooperation of the examinee, inappropriate referent or norming population, poor criterion selection, sample truncation and use of invalid constructs (1987 pp91-93).

3. The context where the final-term test was taken

3.1 Course Description

Comprehensive English, also called Intensive Reading of English, is a compulsory course for university English majors, and falls under the English professional skills courses according to the classification in the Syllabus of 2000. It focuses on training students’ comprehensive language ability and skills in listening, speaking, reading and writing. The Professional English Syllabus for Institutions of Higher Education (2000) divides the teaching requirements of the Comprehensive English course into eight levels. Each term is taken as one level, with specific requirements for pronunciation, grammar, vocabulary, listening, speaking, reading, translation and writing. Because the candidates for the test evaluated in this paper are students in their fourth term at university, the assessment standard can refer to the teaching requirements of level four in the Syllabus 2000.

Table 3: The teaching requirements of level four in the Syllabus 2000

English skill | Teaching requirements
Grammar | (1) skillfully master the use of subject clauses, appositive clauses, inversion and conditionals; (2) have a preliminary command of cohesion between sentences and between paragraphs
Vocabulary | recognize 5,500 to 6,500 words, of which 3,000 to 4,000 can be used accurately and skillfully
Listening | (1) understand native speakers’ daily conversation and listening materials of medium difficulty; (2) catch the main idea of normal-speed VOA and BBC English news
Reading | (1) comprehend news reports such as those in Newsweek and novels such as Sons and Lovers; (2) read at 120-180 words per minute with a comprehension rate over 70%; finish reading a passage of medium difficulty of about 1,000 words in 5 minutes and grasp its main idea
Translation | independently complete the translation exercises in the textbooks in accordance with the original texts
Writing | (1) given a topic, outline, graph or figures, write a composition of 150-200 words in 30 minutes with relevant content, rigorous structure, clear expression, accurate grammar, smooth language and elegant expression; (2) finish a piece of practical writing of about 60 words in 10 minutes

3.2 Test Description

The test lasts 120 minutes and has 6 item types with 56 items altogether. As an achievement test, the final-term examination was intended to measure the learners’ acquisition of the knowledge taught in the Comprehensive English course over one term. The candidates are second-year students who have studied Fundamental English for two years in the English Department of Tongren University.

There are altogether six types of items in the evaluated test:

(1) Fill in the word according to its definition and first letter given. (Blank filling, 10 items) Students are required to fill in a word learned from this term’s texts, given a complete definition and the first letter. This item type tests whether the students have mastered the spelling of the new words in this term’s texts.

(2) Words and structure. (Multiple-choice questions, 10 items) Students are required to choose the correct answer from four options marked A, B, C and D to complete a sentence. This item type tests the students’ competence in applying words and structures in English sentences.

(3) Put the following expressions with the proper forms in the blanks. (Blank filling, 10 items) Ten words have been removed from ten sentences. Students are required to choose the suitable word for each sentence from the word box and, if necessary, change it into the correct form according to the sentence content. This item type tests, first, whether the students have mastered the new words learned in this term’s texts and, second, whether they can put the words into the correct forms for the sentence context.

(4) Translate the following paragraph into Chinese. (Translation, 1 item) Students are required to translate into Chinese a short paragraph from a text studied this term. This item type tests whether the students have a good understanding of the text content, especially the core words and patterns.

(5) Reading comprehension. (Multiple-choice questions, 15 items) Students are required to answer multiple-choice questions after reading three passages of around 300 words each, presumably containing no new words. Each passage is accompanied by five questions, which cover the main idea or facts of the text, choosing a topic for the passage, drawing logical conclusions, making accurate inferences, understanding the link between certain sentences, and making sound judgments of the author’s attitude. This item type measures the students’ integrated reading comprehension skills.

(6) Writing. (Writing a passage, 1 item) Students are required to write a passage in English based on their own understanding of a quoted paragraph. This item type tests the students’ English writing skills.

4. The evaluation of the final-term examination

In the previous parts, I described in detail the context of the Comprehensive English course and the final-term test. In this part I evaluate the test against the two most important concepts in Bachman’s theory of usefulness: reliability and validity. The evaluation is mainly a theory-based descriptive analysis because no quantitative data were collected. The data were unavailable in May and June because the English Department and three other departments of Tongren University were being evaluated by experts of the National Teaching Evaluation Committee on the university’s qualification to confer Bachelor’s degrees, and all the teaching materials and test papers had therefore been collected by the Department of Teaching Affairs.

4.1 Reliability of the test

Bachman (1990) holds that investigating the reliability of a test involves both logical analysis and empirical research (Bachman, 1990 p161). Because the test scores were unavailable, I assess the reliability of the test mainly through theoretical analysis of the test paper. Measurement specialists have long recognized that the assessment of reliability depends on our ability to distinguish the effects of the abilities we want to measure from the effects of other factors, so I assess the reliability of the test from this aspect first. Bachman (1990) identifies test method facets, personal attributes and random factors as the factors that affect language test scores.

Firstly, test method facets include the testing environment and the test rubric. The final-term examination was held under the strict control of the Department of Teaching Affairs according to the Testing Rules of Tongren University. Moreover, a specific test rubric was attached below every item type; for the writing part in particular, a detailed rubric in line with that of TEM4 was listed. The test is thus reliable from the aspect of test method facets.

Secondly, personal attributes are attributes of individuals that are not related to language ability, including individual characteristics and group characteristics. When I wrote the test paper, I tried to avoid any content biased with respect to sex, race or ethnic background that might affect the candidates’ scores. For the reading passages and the writing topic in particular, I chose popular material familiar to most of the candidates. Therefore, the test is reliable from the aspect of personal attributes.

Lastly, random factors include unpredictable and largely temporary conditions, such as mental alertness or emotional state, and uncontrolled differences in test method facets. The final-term examination was held smoothly on the appointed day without any obvious changes, so the test is reliable from the aspect of random factors. Therefore, by Bachman’s (1990) standard, we can conclude that the final-term examination is reliable overall, because it minimizes the factors other than the abilities we want to measure that may affect candidates’ test scores.

Next, I evaluate the final-term test by Hughes’s standard. As Hughes (2003) identifies, the two components of test reliability are the performance of candidates from occasion to occasion and the reliability of the scoring (Hughes, 2003 p44).

Firstly, Hughes holds that to be more reliable, a test should take enough samples of behaviour. That is to say, other things being equal, the more items a test has, the more reliable it will be. However, the test cannot be so long that candidates become so tired that the behaviour they exhibit becomes unrepresentative of their ability. Li Xiaoju (1997) points out that if a comprehensive test is mainly composed of objective items, the total number of items should be no fewer than 80-100, and each item type measuring a particular ability or area of knowledge should contain more than 10-20 items (Li, 1997 p71). In the final-term test paper, most item types that assess a particular language ability contain over 10 items, but the total number of items is 56, far fewer than the number put forward by Li. The reliability of the test paper needs to be improved in this respect.

Secondly, Hughes states that “items on which strong students and weak students perform with similar degrees of success contribute little to the reliability of a test” (Hughes, 2003 p45). That is to say, a test paper should not contain too many items that are so easy or so difficult that all candidates perform similarly on them. I wrote every item according to the Bi-directional Breakdown designed by the Department of Teaching Affairs, which gives a detailed specification of the items on a test paper, such as the numbers of extra easy, easy, normal, difficult and extra difficult items, and the numbers of subjective and objective items. Therefore, the final-term test paper is reliable from the aspect of test difficulty.

Thirdly, Hughes holds that a reliable test should not allow candidates too much freedom, and that test writers should pay particular attention to this when designing the writing item. In the final-term test paper, candidates first read a quoted paragraph and had to write the composition according to their understanding of it, so more control was imposed on what they wrote than if they had simply been given a topic. Therefore, the writing item is more reliable.

Fourthly, Hughes holds that ambiguous items reduce the reliability of a test paper, and that the best way to reduce the number of ambiguous items is to discuss them with colleagues. As I taught all three second-year Comprehensive English classes at that time, it was my duty to design the final-term test paper independently. I wrote the test paper by myself without any discussion with my colleagues in the teaching and research group, and I did not pre-test the items on a small group of students. There must therefore be some ambiguous items on the test paper. For instance, when I wrote the translation item, I did not provide a key because I could not find a standard Chinese translation in the reference books. Thus, when I marked the candidates’ translations, I could not award marks objectively. Consequently, there must be some error in the candidates’ test scores, which might lower the reliability of the test paper.

Fifthly, Hughes holds that a reliable test paper should use items that permit scoring that is as objective as possible. The final-term test paper consists of 60% objective items with only one correct answer; the possible answers are therefore limited, which facilitates more reliable scoring.

Lastly, Hughes holds that a reliable test paper should provide a detailed scoring key. As mentioned in 3.2.4, I listed the scoring rubric of each part of the test in detail, which guaranteed the reliability of the scoring of candidates’ possible answers. Therefore, the final-term examination paper is generally reliable when assessed by Hughes’s (2003) standard.
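Two of the points above, test length and the contribution of individual items, can be illustrated with standard item-analysis calculations. The sketch below is a hedged illustration only: the paper had no score data, so every number in it is hypothetical, and the function names are my own. It assumes the Spearman-Brown prophecy formula for the effect of lengthening a test with comparable items, a facility value (proportion correct) for item difficulty, and an upper-group minus lower-group index for discrimination.

```python
def spearman_brown(reliability, old_length, new_length):
    """Predict reliability after changing the number of comparable items."""
    if old_length <= 0 or new_length <= 0:
        raise ValueError("test lengths must be positive")
    factor = new_length / old_length
    return factor * reliability / (1 + (factor - 1) * reliability)


def facility(item_scores):
    """Proportion of candidates answering a 0/1 item correctly."""
    return sum(item_scores) / len(item_scores)


def discrimination(item_scores, total_scores, fraction=0.27):
    """Upper-group facility minus lower-group facility for one item.

    Candidates are ranked by total score; the top and bottom `fraction`
    of them form the two groups. Values near 0 mean strong and weak
    candidates perform alike on the item.
    """
    n = len(total_scores)
    group = max(1, round(n * fraction))
    order = sorted(range(n), key=lambda i: total_scores[i])
    low, high = order[:group], order[-group:]
    p_high = sum(item_scores[i] for i in high) / group
    p_low = sum(item_scores[i] for i in low) / group
    return p_high - p_low


if __name__ == "__main__":
    # Hypothetical: a 56-item paper with reliability 0.70,
    # lengthened to 80 comparable items.
    print(round(spearman_brown(0.70, 56, 80), 3))

    # Hypothetical item answered correctly only by the stronger half.
    item = [1, 1, 1, 0, 0, 0]
    totals = [50, 48, 45, 20, 18, 15]
    print(facility(item), discrimination(item, totals, fraction=0.5))
```

The Spearman-Brown prediction holds only if the added items are comparable in quality to the existing ones, which is an assumption and not something the test paper itself can guarantee.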

4.2 Validity of the test

Measurement experts hold that a test is valid if it measures accurately what it is intended to measure. Although some aspects that help ensure the validity of LT are mentioned in part 2.4.2.2, I evaluate the validity of the final-term test from the aspect of content validity rather than construct validity, for two reasons. First, “construct validity pertains to the meaningfulness and appropriateness of the interpretations that we make on the basis of test scores” (Bachman & Palmer, 1996 p21), yet the candidates’ test scores were unavailable for data analysis in this study. Second, and more importantly, as Hughes (2000) holds, the content of a final achievement test should be based directly on a detailed course syllabus or on the books or other materials used (Hughes, 2000 p11). I therefore analyze whether the test covers the knowledge and language abilities specified in the syllabus in order to evaluate its degree of content validity.

Firstly, when designing the test paper, I complied strictly with the requirements of level four in the Syllabus and with the textbook. Over 70% of the items are based on the content of the textbook, and few items were so difficult as to go beyond the language ability specified in the Syllabus.

Secondly, most of the items test the candidates’ specific language abilities effectively, except the writing part. In the writing part, the quoted paragraph that candidates read before writing was intended to help them narrow their thinking to a focused point and so write better, but the candidates performed quite poorly. Their weakness was not in writing beautiful sentences or varied structures; rather, nearly half of the students could not write an essay relevant to the given paragraph because they had not understood it well. Consequently, because their essays were irrelevant to the main idea, these candidates could only receive low marks. I designed the writing item to measure candidates’ writing competence, but that competence could not be fully reflected in the test because of the influence of their reading competence. Writing and reading were combined in the writing item, which lowers the validity of the test.

5. Conclusion

In this paper, I evaluated the final-term examination in terms of reliability and validity, the two most important qualities of language tests. In general, the final-term examination is reliable and valid, but some weaknesses in the test paper were identified against the standards of Bachman and Hughes. With the theories of language testing and my continuous efforts, I am confident that I can design increasingly reliable and valid test papers in my future teaching practice.

References:

[1] Alderson, J. C., Caroline Clapham & Dianne Wall. 1995. Language Test Construction and Evaluation. Cambridge: Cambridge University Press.

[2] Bachman, L. F. 1990. Fundamental Considerations in Language Testing. Oxford: Oxford University Press.

[3] Bachman, L. F. & Palmer, A. S. 1996. Language Testing in Practice. Oxford: Oxford University Press.

[4] Li Xiaoju. 2001. The Science and Art of Language Test. Changsha: Hunan Education Press.

[5] Robert Wood. 1993. Assessment and Testing: A Survey of Research. Cambridge: Cambridge University Press.

[6] Wall, D. 1997. Impact and Washback in Language Testing. In C. Clapham & D. Corson (Eds.), Encyclopedia of Language and Education, Vol. 7 (pp. 291-302). Dordrecht: Kluwer Academic Publishers.

[7] Zeng Yongqiang. 2006. Lectures on Language Testing. FECL, Guangdong University of Foreign Studies.