Journal of Shanghai University (Natural Science Edition), 2021, Vol. 27, Issue 3: 544-552. doi: 10.12066/j.issn.1007-2861.2158


Text classification model based on essential $n$-grams and gated recurrent neural network

ZHAO Qian, WU Yue, LIU Zongtian

  1. School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
  • Received: 2019-03-27 Online: 2021-06-30 Published: 2021-06-27
  • Contact: WU Yue E-mail: ywu@mail.shu.edu.cn

Abstract:

This paper proposes an effective text classification model based on $n$-grams and a gated recurrent neural network. First, we replace the traditional convolutional layer with a simpler, more efficient pooling layer that extracts essential $n$-grams as important semantic features. Second, we construct a bidirectional gated recurrent unit (GRU) to capture the global dependency features of the input text. Finally, we fuse the two kinds of features and apply the combined representation to the text classification task. We evaluate the model on sentiment and topic categorization tasks over multiple public datasets. Experimental results show that the proposed method improves text classification effectiveness over traditional models, achieving accuracy improvements of up to 1.95% on the 20newsgroup corpus and 1.55% on the Rotten Tomatoes corpus.
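The pipeline described in the abstract — essential $n$-gram extraction by max-over-time pooling, a bidirectional GRU for global dependencies, and feature fusion for classification — can be sketched as follows. This is a minimal NumPy toy, not the authors' implementation; the dimensions, random weights, and function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, hid, n, n_cls = 8, 16, 3, 4   # embedding dim, GRU size, n-gram length, classes

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ngram_max_pool(emb, n):
    """Element-wise max over all length-n windows of word vectors
    (a pooling layer standing in for a convolutional layer)."""
    T, _ = emb.shape
    windows = np.stack([emb[i:i + n].ravel() for i in range(T - n + 1)])
    return windows.max(axis=0)                      # shape (n * d,)

def make_gru_params(in_dim, hid_dim):
    # Three gates: update z, reset r, candidate h~.
    W = rng.normal(0, 0.1, (3, in_dim, hid_dim))
    U = rng.normal(0, 0.1, (3, hid_dim, hid_dim))
    b = np.zeros((3, hid_dim))
    return W, U, b

def gru_cell(x, h, params):
    W, U, b = params
    z = sigmoid(x @ W[0] + h @ U[0] + b[0])
    r = sigmoid(x @ W[1] + h @ U[1] + b[1])
    h_tilde = np.tanh(x @ W[2] + (r * h) @ U[2] + b[2])
    return (1 - z) * h + z * h_tilde

def bigru_final(emb, fwd, bwd):
    """Run a GRU over the sequence in both directions and
    concatenate the two final hidden states."""
    hf = np.zeros(hid)
    for x in emb:
        hf = gru_cell(x, hf, fwd)
    hb = np.zeros(hid)
    for x in emb[::-1]:
        hb = gru_cell(x, hb, bwd)
    return np.concatenate([hf, hb])                 # shape (2 * hid,)

# Toy "sentence": 10 random word embeddings (a trained model
# would look these up from a learned embedding table).
emb = rng.normal(0, 1, (10, d))
fwd, bwd = make_gru_params(d, hid), make_gru_params(d, hid)

# Fusion: concatenate pooled n-gram features with the BiGRU features,
# then classify with a linear layer and softmax.
features = np.concatenate([ngram_max_pool(emb, n), bigru_final(emb, fwd, bwd)])
W_out = rng.normal(0, 0.1, (features.size, n_cls))
logits = features @ W_out
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

In this sketch the fusion is plain concatenation before the output layer; the paper's actual fusion mechanism and training procedure are not specified in the abstract.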

Key words: text classification, gated recurrent unit (GRU), $n$-grams, natural language processing
