Contemporary Foreign Languages Studies (当代外语研究) ›› 2025, Vol. 25 ›› Issue (4): 15-27. doi: 10.3969/j.issn.1674-8921.2025.04.002
Publication date: 2025-08-28
Release date: 2025-08-26
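About the author:
LIN Ling (林铃) is an associate professor and doctoral supervisor at the School of Foreign Languages, Shanghai Jiao Tong University. Main research interests: corpus linguistics and academic discourse analysis. Email: kathyll@sjtu.edu.cn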
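Abstract:
Phraseology has long been one of the core topics of corpus linguistics. This paper systematically reviews existing methods and models of phraseological research and focuses on a new approach to it, the part-of-speech gram (PoS gram) approach. It sets out the definition and basic characteristics of the approach and highlights its significance for phrase extraction and analysis, the study of phrase functions, the study of discourse features, and language teaching and learning. Although the approach has long been overlooked by linguists both internationally and in China, it offers new ideas, new tools and new findings for phrase extraction and analysis, and it usefully complements existing phraseological research. It merits further in-depth study and wider adoption.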
LIN Ling, LIU Ming. Towards a Part-of-Speech (PoS) Gram Approach to Corpus-based Phraseology (基于语料库的短语研究新路径:“词性序列”研究)[J]. Contemporary Foreign Languages Studies, 2025, 25(4): 15-27.
Table 2
The 10 most frequent 6-word PoS grams in the “English Phrases” (英语短语) database
Rank | PoS gram | Frequency | Example |
---|---|---|---|
1 | PRP AT0 NN1 PRF AT0 NN1 | 20,932 | at the end of the day |
2 | PRP AT0 AJ0 NN1 PRF AT0 | 5,784 | on the other side of the |
3 | AT0 AJ0 NN1 PRF AT0 NN1 | 4,024 | the other side of the scale |
4 | AT0 NN1 PRF AT0 AJ0 NN1 | 3,768 | the end of the Cold War |
5 | NN1 PRP AT0 NN1 PRF AT0 | 3,194 | light at the end of the |
6 | AT0 NN1 PRP AT0 NN1 PRF | 3,035 | an increase in the number of |
7 | PRP AT0 NN1 PRF DPS NN1 | 2,991 | for the rest of his life |
8 | AT0 NN1 PRF AT0 NN1 PRF | 2,955 | a hell of a lot of |
9 | PRP AT0 NN1 PRF AT0 AJ0 | 2,634 | at the end of a long |
10 | VVN PRP AT0 NN1 PRF AT0 | 2,229 | brought to the attention of the |
Table 3
Examples of the most frequent 6-word chunks in the “English Phrases” (英语短语) database
Rank | Example | Frequency | Rank | Example | Frequency |
---|---|---|---|---|---|
1 | at the end of the day | 747 | 11 | by the end of the month | 108 |
2 | by the end of the year | 441 | 12 | before the end of the year | 104 |
3 | in the middle of the night | 273 | 13 | on the edge of the bed | 104 |
4 | at the end of the year | 230 | 14 | at the beginning of the year | 91 |
5 | by the end of the century | 220 | 15 | by the department of the environment | 90 |
6 | at the turn of the century | 207 | 16 | on the floor of the house | 84 |
7 | at the end of the month | 178 | 17 | in the middle of the road | 83 |
8 | by the end of the decade | 130 | 18 | for the rest of the day | 82 |
9 | at the end of the season | 113 | 19 | at the back of the house | 80 |
10 | at the end of the war | 111 | 20 | in the middle of the room | 80 |
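
The relationship between the two tables can be illustrated with a minimal sketch: each word sequence in Table 3 is one lexical realisation of a PoS gram of the kind listed in Table 2. The Python below is an illustration only, not the authors' extraction procedure. The function name `count_pos_grams`, the three toy sentences and their hand-assigned tags (copied from the tag sequences and examples in Table 2) are assumptions; a real study would start from an automatically tagged corpus.

```python
from collections import Counter, defaultdict


def count_pos_grams(tagged_sentences, n=6):
    """Count n-token tag sequences and the word sequences that realise them.

    tagged_sentences: list of sentences, each a list of (word, tag) pairs.
    Windows never cross a sentence boundary.
    """
    counts = Counter()                    # PoS gram -> frequency
    realisations = defaultdict(Counter)   # PoS gram -> {word sequence: frequency}
    for sentence in tagged_sentences:
        for start in range(len(sentence) - n + 1):
            window = sentence[start:start + n]
            tags = " ".join(tag for _, tag in window)
            words = " ".join(word.lower() for word, _ in window)
            counts[tags] += 1
            realisations[tags][words] += 1
    return counts, realisations


# Hypothetical toy input: tags are hand-assigned following Table 2.
toy_sentences = [
    [("at", "PRP"), ("the", "AT0"), ("end", "NN1"),
     ("of", "PRF"), ("the", "AT0"), ("day", "NN1")],
    [("by", "PRP"), ("the", "AT0"), ("end", "NN1"),
     ("of", "PRF"), ("the", "AT0"), ("year", "NN1")],
    [("on", "PRP"), ("the", "AT0"), ("other", "AJ0"),
     ("side", "NN1"), ("of", "PRF"), ("the", "AT0")],
]

counts, reals = count_pos_grams(toy_sentences, n=6)

# List each PoS gram with its frequency and the word sequences it covers.
for pos_gram, freq in counts.most_common():
    print(freq, pos_gram, sorted(reals[pos_gram]))
```

On this toy input, `at the end of the day` and `by the end of the year` are counted under the same PoS gram. This is why the frequency of a PoS gram in Table 2 (20,932 for the top pattern) can far exceed that of any single word sequence in Table 3 (747 for the most frequent one).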