Digital Film and Television Technology

Sentiment analysis of Chinese movie reviews based on deep learning

1. School of Software Engineering, Suzhou Institute for Advanced Study, University of Science and Technology of China, Suzhou 215123, Jiangsu, China
2. Shanghai Film Academy, Shanghai University, Shanghai 200072, China

Received date: 2018-07-02

Online published: 2018-10-26

Abstract

With the rise of social networks, more and more people choose to express their opinions on the internet, which allows film and television investors to collect audience feedback more easily. The Douban movie-review site is one such platform through which investors can learn viewers' tastes and preferences, and thereby make better investment decisions in the film and television industry. Analyzing such a large amount of data must be done by means of computer technology. Sentiment analysis, also known as sentiment orientation analysis, is a branch of natural language processing (NLP) that aims to determine whether a piece of text expresses a positive or negative attitude. To improve the accuracy of movie-review sentiment classification, multiple sets of contrast experiments are conducted to select the optimal parameters, and Chinese character vectors and word vectors are compared as the input matrix of a bidirectional long short-term memory (Bi-LSTM) model and a convolutional neural network (CNN). A Bagging algorithm that uses CNN models as weak classifiers is then proposed: multiple CNN models are trained, and the final classification result is determined by majority voting. This ensemble method reduces the deviation caused by any single model; its accuracy is 5.10% higher than that of a single Bi-LSTM model and 1.34% higher than that of a single CNN model.
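The ensemble step described in the abstract, in which several independently trained CNN weak classifiers are combined by voting, can be sketched as follows. This is a minimal illustration of the voting logic only; the lambda "models" below are hypothetical stand-ins for the paper's trained CNNs, which would each be fitted on a bootstrap sample of the review corpus.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the most models.

    Ties are broken by first-seen order (Counter preserves
    insertion order in Python 3.7+).
    """
    return Counter(predictions).most_common(1)[0][0]

def bagging_predict(models, review):
    """Apply every trained weak classifier to the review and vote."""
    return majority_vote([m(review) for m in models])

# Toy stand-ins for three weak classifiers; real ones would be
# CNNs trained on character- or word-vector input matrices.
model_a = lambda r: "positive" if "好" in r else "negative"
model_b = lambda r: "positive" if len(r) > 4 else "negative"
model_c = lambda r: "negative"

label = bagging_predict([model_a, model_b, model_c], "这部电影很好看")
print(label)  # two of three toy models agree: "positive"
```

Because each weak classifier errs on different inputs, the vote tends to cancel individual mistakes, which is the bias-reduction effect the abstract attributes to the ensemble.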

Cite this article

ZHOU Jingyi, GUO Yan, DING Youdong. Sentiment analysis of Chinese movie reviews based on deep learning[J]. Journal of Shanghai University, 2018, 24(5): 703-712. DOI: 10.12066/j.issn.1007-2861.2075
