Pinpointing Analytic Rating Criteria for EFL Writing Assessment from Raters’ Perspectives


Online published: 2022-09-13

Abstract

In recent years, the rating criteria adopted in large-scale EFL writing assessments have received increasing research attention, owing to the widespread consensus that rating criteria represent the de facto test construct of a writing assessment. This study was therefore conducted to pinpoint the most useful rating criteria for the writing component of the College English Test Band Four (CET-4 writing). Relying on a mixed-methods approach, the study investigated how CET-4 raters perceived the usefulness of a set of rating criteria elicited on the basis of a theoretical and literature review. The results showed that all the rating criteria were perceived to be useful except for one, task fulfillment. Given that the remaining criteria were relevant to the construct components of CET-4 writing, we could have confidence in their representativeness of the construct of CET-4 writing. Meanwhile, the study also found that the proficiency level of CET-4 writing performance could significantly affect raters' perceptions of the usefulness of the rating criteria, which could, to some extent, pose a challenge to the validity of the holistic scoring approach adopted in CET-4 writing. In conclusion, this study provides theoretical and methodological implications for future research in delineating the construct components of large-scale EFL assessments, as well as in examining the validity of the existing rating scales adopted by such assessments.

Cite this article

ZOU Shaoyan, FAN Jingsong. Pinpointing Analytic Rating Criteria for EFL Writing Assessment from Raters' Perspectives[J]. Contemporary Foreign Languages Studies, 2022, 22(4): 133-143. DOI: 10.3969/j.issn.1674-8921.2022.04.013
