Paraphrase identification based on hierarchical neural network
Affiliation:

(School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430000, China)

CLC Number: TP391

    Abstract:

    Paraphrase identification is widely used in question answering systems, plagiarism detection, and personalized recommendation. Because existing paraphrase identification techniques lack an effective feature extraction mechanism, this paper proposes a new paraphrase identification model. Unlike previous work, which typically adopts an “encoding-matching” scheme, the proposed model adopts an “encoding-matching-extraction” scheme, adding a feature extraction layer to better capture the information needed for classification. The model consists of six layers: an input layer, an embedding layer, an encoding layer, a matching layer, a feature extraction layer, and an output layer. The encoding layer uses a bi-directional long short-term memory network (BiLSTM) with self-attention to encode sentence context, making full use of contextual information in both the forward and backward directions of a sentence. The matching layer applies several matrix operations to obtain sentence-pair matching information from different angles. The feature extraction layer uses Xception as the feature extractor to better extract classification information from the matching results. In addition, the model combines GloVe word vectors, character vectors, and additional feature vectors as the final embeddings, which carry richer information than ordinary pretrained embeddings. Results show that the proposed model achieves competitive results on two public datasets: Quora Question Pairs (representing large datasets) and SemEval-2015 PIT (representing small and medium datasets).
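
    The matching step can be illustrated with a minimal sketch. The abstract does not name the matrix operations used, so the three measures below (dot product, cosine similarity, Euclidean distance) are assumptions chosen for illustration, and the function names and toy vectors are hypothetical. Each measure yields one token-by-token matrix; stacking them gives an image-like, multi-channel input of the kind a convolutional extractor such as Xception could consume.

```python
import math

def dot(u, v):
    # Inner product of two equal-length vectors.
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    # Cosine similarity; returns 0.0 if either vector is all zeros.
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0

def euclidean(u, v):
    # Euclidean distance between two equal-length vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def matching_channels(enc_a, enc_b):
    """Return one len(enc_a) x len(enc_b) matrix per measure.

    enc_a and enc_b are lists of per-token encoded vectors, e.g. the
    outputs of an encoding layer. The choice of measures is an
    assumption, not the paper's confirmed set of operations.
    """
    measures = (dot, cosine, euclidean)
    return [[[m(u, v) for v in enc_b] for u in enc_a] for m in measures]

# Toy encoded sentences (made-up values): 2 tokens and 3 tokens, hidden size 2.
sent_a = [[1.0, 0.0], [0.0, 2.0]]
sent_b = [[1.0, 0.0], [0.0, 1.0], [3.0, 4.0]]

channels = matching_channels(sent_a, sent_b)
print(len(channels), len(channels[0]), len(channels[0][0]))
```

    In this sketch the result has shape (channels, tokens in sentence A, tokens in sentence B). A real pipeline would pad or resize these matrices to a fixed size before passing them to the feature extractor.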

History
  • Received: October 27, 2019
  • Online: September 27, 2020