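Supervised by: Ministry of Industry and Information Technology of the People's Republic of China; Sponsored by: Harbin Institute of Technology; Editor-in-chief: LI Longqiu (李隆球); ISSN 0367-6234; CN 23-1235/T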


Cite this article: SUN Guang-lu, LANG Fei, XUE Yi-bo. Chinese chunking method based on conditional random fields and semantic classes[J]. Journal of Harbin Institute of Technology, 2011, 43(7): 135. DOI: 10.11918/j.issn.0367-6234.2011.07.028

Chinese chunking method based on conditional random fields and semantic classes

SUN Guang-lu^1,2, LANG Fei^3, XUE Yi-bo^1

1. Research Institute of Information Technology, Tsinghua University, 100084 Beijing, China; 2. School of Computer Science and Technology, Harbin University of Science and Technology, 150080 Harbin, China; 3. School of Foreign Languages, Harbin University of Science and Technology, 150080 Harbin, China

Abstract:

To address the limited accuracy of Chinese chunking and the failure to use the semantic information of words, a Chinese chunking method based on conditional random fields (CRFs) and semantic classes is proposed. Based on an analysis of the Chinese chunking task and its sequential nature, a CRF model is used to combine different types of features and to overcome the label bias problem. Semantic class features extracted from a semantic dictionary are applied to Chinese chunking to improve accuracy. Experiments show that the method achieves an F-score of 92.77% on Chinese chunking. Further experiments show how the choice of feature templates and the size of the training corpus affect chunking performance.

Key words: conditional random fields; Chinese chunking; feature template; semantic dictionary

DOI: 10.11918/j.issn.0367-6234.2011.07.028

CLC number: TP391.1

Funding: National Natural Science Foundation of China (60903083); Natural Science Foundation of Heilongjiang Province (F200936); New Century Excellent Talents Fund for Institutions of Higher Education of Heilongjiang Province (1155-ncet-008)
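
The abstract mentions feature templates and semantic class features without listing them. As a rough illustration only, the Python sketch below shows how templates over words, part-of-speech tags, and dictionary-derived semantic classes might be expanded into string features for a CRF sequence labeler. The lexicon `SEMANTIC_CLASSES`, the template set `TEMPLATES`, the field names `w`/`p`/`s`, and the example sentence are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon mapping words to coarse semantic classes.
# The paper's actual semantic dictionary and class inventory are not shown here.
SEMANTIC_CLASSES: Dict[str, str] = {
    "北京": "LOCATION",
    "大学": "ORGANIZATION",
    "学生": "PERSON",
}

# Each template is a tuple of (field, relative offset) pairs; templates with
# several pairs yield conjoined features. This set is illustrative only.
TEMPLATES: List[Tuple[Tuple[str, int], ...]] = [
    (("w", 0),), (("w", -1),), (("w", 1),),
    (("p", 0),), (("p", -1), ("p", 0)),
    (("s", 0),), (("s", -1), ("s", 0)),
]


def token_fields(word: str, pos: str) -> Dict[str, str]:
    """Bundle the word, POS tag, and semantic class of one token."""
    return {"w": word, "p": pos, "s": SEMANTIC_CLASSES.get(word, "UNK")}


def extract_features(sentence: List[Tuple[str, str]], i: int) -> List[str]:
    """Expand every template at position i into one string feature."""
    tokens = [token_fields(w, p) for w, p in sentence]
    features = []
    for template in TEMPLATES:
        parts = []
        for field, offset in template:
            j = i + offset
            # Positions outside the sentence are filled with a padding symbol.
            value = tokens[j][field] if 0 <= j < len(tokens) else "<PAD>"
            parts.append(f"{field}[{offset}]={value}")
        features.append("|".join(parts))
    return features


# Hypothetical (word, POS) input; a CRF would predict one chunk tag per token.
sentence = [("北京", "NR"), ("大学", "NN"), ("学生", "NN")]
for i, (word, _) in enumerate(sentence):
    print(word, extract_features(sentence, i))
```

Feature strings of this kind could be passed to any CRF toolkit that accepts per-token feature lists. Removing the `s` templates gives a baseline without semantic classes, which is one way to measure the effect of template selection that the abstract reports.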
