[1] Cicchetti, D. V. 1994. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology[J]. Psychological Assessment 6(4): 284-290.

[2] Colina, S. 2008. Translation quality evaluation: Empirical evidence for a functionalist approach[J/OL]. The Translator 14(1): 97-134. https://doi.org/10.1080/13556509.2008.10799251

[3] Colina, S. 2009. Further evidence for a functionalist approach to translation quality evaluation[J/OL]. Target 21(2): 235-264. https://doi.org/10.1075/target.21.2.02col

[4] Cui, Y. & M. Liang. 2024. Automated scoring of translations with BERT models: Chinese and English language case study[J]. Applied Sciences 14(5): 1925.

[5] Fernandes, P., D. Deutsch, M. Finkelstein, et al. 2023. The devil is in the errors: Leveraging large language models for fine-grained machine translation evaluation[R]. Singapore, Singapore. Proceedings of the Eighth Conference on Machine Translation (WMT): 1066-1083.

[6] Gong, M. 2025. The neural network algorithm-based quality assessment method for university English translation[J]. Network: Computation in Neural Systems 36(3): 649-661.

[7] Han, C. & X. Lu. 2023. Can automated machine translation evaluation metrics be used to assess students' interpretation in the language learning classroom?[J]. Computer Assisted Language Learning 36(5-6): 1064-1087.

[8] Han, C. 2025. Quality assessment in multilingual, multimodal, and multiagent translation and interpreting (QAM3 T&I): Proposing a unifying framework for research[J]. Interpreting and Society 5(1): 27-55.

[9] Huang, X., Z. Zhang, X. Geng, et al. 2024. Lost in the source language: How large language models evaluate the quality of machine translation[R]. Bangkok, Thailand. Findings of the Association for Computational Linguistics: ACL 2024: 3546-3562.

[10] Kocmi, T. & C. Federmann. 2023. Large language models are state-of-the-art evaluators of translation quality[R/OL]. Tampere, Finland. Proceedings of the 24th Annual Conference of the European Association for Machine Translation: 193-203. https://arxiv.org/abs/2302.14520

[11] Koo, T. K. & M. Y. Li. 2016. A guideline of selecting and reporting intraclass correlation coefficients for reliability research[J]. Journal of Chiropractic Medicine 15(2): 155-163.

[12] Lu, Q., B. Qiu, L. Ding, et al. 2023. Error analysis prompting enables human-like translation evaluation in large language models: A case study on ChatGPT[R]. Bangkok, Thailand. Findings of the Association for Computational Linguistics: ACL 2024: 8801-8816.

[13] Qian, S., A. Sindhujan, M. Kabra, et al. 2024. What do large language models need for machine translation evaluation?[R/OL]. Miami, Florida, USA. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: 3660-3674. https://arxiv.org/abs/2410.03278

[14] Williams, M. 2004. Translation Quality Assessment: An Argumentation-Centred Approach[M]. Ottawa: University of Ottawa Press.

[15] Yang, H., M. Zhang, et al. 2023. TeacherSim: Cross-lingual machine translation evaluation with monolingual embedding as teacher[R]. Pyeongchang, South Korea. 2023 25th International Conference on Advanced Communication Technology (ICACT): 283-287.

[16] 郜洁. 2025. The double-edged sword effect of generative artificial intelligence: An analysis of the application advantages and potential risks of DeepSeek in foreign language education[J]. 当代外语研究 (3): 140-151.

[17] 江进林. 2013. Automated quantification of language quality in English-Chinese translation[J]. 现代外语 36(1): 85-91, 110.

[18] 江进林、文秋芳. 2012. Constructing a machine scoring model for students' English-Chinese translation in large-scale testing[J]. 外语电化教学 (2): 3-8.

[19] 李晶洁、陈秋燕. 2025. The evolution of human-machine collaborative intelligent writing: Analysis and implications[J]. 当代外语研究 (1): 73-83.

[20] 王金铨. 2008. Research on and construction of a computer-assisted scoring model for Chinese students' Chinese-English translation[D]. 北京: 北京外国语大学.

[21] 王金铨、朱周晔. 2017. Automated assessment of Chinese-English translation competence[J]. 中国外语 14(2): 66-71.

[22] 王巍巍、王轲、张昱琪. 2022. Exploring approaches to automated interpreting scoring based on the CSE interpreting scales[J]. 外语界 (2): 80-87.

[23] 袁煜. 2016. Feature sets for automated translation quality assessment[J]. 外语教学与研究 48(5): 776-787, 801.

[24] 张静. 2024. Constructing a teaching model for higher-order thinking in translation in the context of generative artificial intelligence[J]. 中国翻译 (3): 71-80.