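Online publication date: 2022-09-13
Funding
* This study is an interim result of the 2019 Shandong Provincial Social Science Planning Project "Research on a Formative Feedback Model for English Writing Based on China's Standards of English Language Ability" (Grant No. 19CWZJ11).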
Pinpointing Analytic Rating Criteria for EFL Writing Assessment from Raters’ Perspectives
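In recent years, the rating criteria used in large-scale foreign language writing tests have attracted widespread attention, and many researchers agree that rating criteria represent the construct that a writing test actually measures. Accordingly, this study takes the writing test of the College English Test Band 4 (the CET-4 writing test) as an example and explores rating criteria suited to it. On the basis of a theoretical review and literature analysis, the paper first identifies a preliminary set of rating criteria potentially applicable to the CET-4 writing test. It then uses a mixed-methods approach, combining questionnaires and interviews, to survey raters' opinions of these criteria. The results show that, apart from "task fulfillment", the remaining nine criteria are all reasonably effective in rating CET-4 writing. These criteria are also largely covered by the current construct framework of the CET-4 writing test, which indicates that they meet its theoretical construct requirements. Theoretically and methodologically, the study offers implications for defining the construct of large-scale foreign language writing tests and for examining the validity of rating scales.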
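Zou Shaoyan, Fan Jinsong. Pinpointing Analytic Rating Criteria for EFL Writing Assessment from Raters' Perspectives[J]. Contemporary Foreign Languages Studies (当代外语研究), 2022, 22(4): 133-143. DOI: 10.3969/j.issn.1674-8921.2022.04.013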
In recent years, the rating criteria adopted in large-scale EFL writing assessments have received increasing research attention, owing to the widespread consensus that rating criteria represent the de facto test construct of writing assessment. This study was therefore conducted to pinpoint the most useful rating criteria for the writing component of the College English Test Band 4 (CET-4 writing). Using a mixed-methods approach, the study investigated how CET-4 raters perceived the usefulness of a set of rating criteria derived from a theoretical and literature review. The results showed that all the rating criteria were perceived to be useful except one, task fulfillment. Given that the remaining criteria were relevant to the construct components of CET-4 writing, we can have confidence that they represent the construct of CET-4 writing. The study also found that the proficiency level of a CET-4 writing performance could significantly affect raters' perception of the usefulness of the rating criteria. This, to some extent, could pose a challenge to the validity of the holistic scoring approach adopted by CET-4 writing. In conclusion, this study offers theoretical and methodological implications for future research in delineating the construct components of large-scale EFL assessments, as well as in examining the validity of the rating scales those assessments currently use.