Cite this article: YUAN Lei, GAO Shu, GUO Miao, YUAN Ziyong. Paraphrase identification based on hierarchical neural network[J]. Journal of Harbin Institute of Technology, 2020, 52(10): 175. DOI: 10.11918/201910183
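Abstract:
Paraphrase identification (PI) is widely used in question answering systems, plagiarism detection, personalized recommendation, and related fields. To address the lack of an effective feature extraction mechanism in existing paraphrase identification methods, a new paraphrase identification model is proposed. Unlike the traditional “encoding-matching” pattern, the model follows an “encoding-matching-extraction” pattern, adding a feature extraction layer to further extract classification information. The model consists of six layers: an input layer, an embedding layer, an encoding layer, a matching layer, a feature extraction layer, and an output layer. In the encoding layer, an attention-based contextual bidirectional long short-term memory network (BiLSTM) encodes the text context, making full use of the forward and backward context of each sentence. In the matching layer, several matrix operations capture sentence-pair matching information from different perspectives. In the feature extraction layer, an Xception network extracts classification information from the matching results more effectively. In addition, a multi-feature fusion approach is adopted: GloVe pre-trained word vectors, character vectors, and additional feature vectors are concatenated to form the final word vectors, which carry richer semantic information than ordinary word vectors. Experimental results show that the model achieves competitive performance on two public datasets, Quora and SemEval-2015 PIT, taken as representatives of large and of small-to-medium datasets respectively.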
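Keywords: natural language processing; paraphrase identification; Xception; attention mechanism; bidirectional long short-term memory network
DOI: 10.11918/201910183
CLC number: TP391
Document code: A
Funding: National Natural Science Foundation of China (51679180)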
Paraphrase identification based on hierarchical neural network |
YUAN Lei, GAO Shu, GUO Miao, YUAN Ziyong
(School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430000, China)
Abstract:
Paraphrase identification is widely used in question answering systems, plagiarism detection, and personalized recommendation. Because existing paraphrase identification techniques lack an effective feature extraction mechanism, a new paraphrase identification model is proposed. Unlike previous work, which normally adopts the “encoding-matching” mode, the proposed model adopts the “encoding-matching-extraction” mode, adding a feature extraction layer to better acquire classification information. The proposed model consists of six layers: input layer, embedding layer, encoding layer, matching layer, feature extraction layer, and output layer. The encoding layer uses a contextual bidirectional long short-term memory network (BiLSTM) with self-attention to encode the context of sentences, making full use of contextual information in both the forward and reverse directions of a sentence. The matching layer uses several matrix operations to obtain sentence-pair matching information from different angles. The feature extraction layer uses Xception as the feature extractor to better extract classification information from the matching results. Moreover, GloVe word vectors, character vectors, and additional feature vectors are combined as the final embeddings, which carry richer information than ordinary pretrained embeddings. Results show that the proposed model achieves competitive results on two public datasets: Quora Question Pairs (representing large datasets) and SemEval-2015 PIT (representing small and medium datasets).
Keywords: natural language processing; paraphrase identification; Xception; attention mechanism; BiLSTM
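
The embedding and matching steps summarized in the abstract can be illustrated with a minimal sketch. The abstract does not say which matrix operations the matching layer uses, so the three pairwise operations below (dot product, cosine similarity, Euclidean distance) are assumptions chosen for illustration, and the function names `concat_embedding` and `matching_channels` are hypothetical. In the described model the token vectors would come from the attention-based BiLSTM encoder, and the stacked matrices would then go to the Xception feature extractor; both are omitted here.

```python
import math

def concat_embedding(word_vec, char_vec, extra_vec):
    """Concatenate word, character, and additional feature vectors for one token."""
    return list(word_vec) + list(char_vec) + list(extra_vec)

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def matching_channels(sent_a, sent_b, ops=(dot, cosine, euclidean)):
    """Build one len(sent_a) x len(sent_b) matrix per pairwise operation.

    The result is a list of matrices (one "channel" per operation), a shape
    that an image-style convolutional feature extractor could consume.
    The choice of operations is an assumption, not taken from the paper.
    """
    return [[[op(u, v) for v in sent_b] for u in sent_a] for op in ops]

# Toy encoded sentences: 2 tokens and 3 tokens, each a 2-dimensional vector.
sent_a = [[1.0, 0.0], [0.0, 2.0]]
sent_b = [[1.0, 0.0], [0.0, 1.0], [3.0, 4.0]]
channels = matching_channels(sent_a, sent_b)
```

Each channel views the same sentence pair through a different pairwise comparison, which is one plausible reading of "matching information from different angles".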