Journal of Shanghai University (Natural Science Edition), 2018, Vol. 24, Issue (5): 703-712. doi: 10.12066/j.issn.1007-2861.2075

• Digital Film and Television Technology •

Sentiment analysis of Chinese movie reviews based on deep learning

ZHOU Jingyi1, GUO Yan1, DING Youdong2

  1. School of Software Engineering, Suzhou Institute for Advanced Study, University of Science and Technology of China, Suzhou 215123, Jiangsu, China
    2. Shanghai Film Academy, Shanghai University, Shanghai 200072, China
  • Received: 2018-07-02 Online: 2018-10-30 Published: 2018-10-26
  • Contact: GUO Yan, E-mail: guoyan@ustc.edu.cn

Abstract:

With the rise of social networks, more people choose to express their opinions online, which makes it easier for film and television investors to collect audience feedback. Douban movie reviews are one such platform, through which investors can learn viewers' tastes and preferences and thereby make better investment decisions in the film and television industry. Analyzing such a large amount of data requires computer technology. Sentiment analysis, also known as emotional tendency analysis, is a branch of natural language processing (NLP) that aims to determine whether a text expresses a positive or negative attitude. To improve the accuracy of sentiment classification for movie reviews, multiple sets of comparative experiments are conducted to select the optimal parameters, and Chinese character vectors and word vectors are compared as the input matrix for both the bidirectional long short-term memory (Bi-LSTM) model and the convolutional neural network (CNN) model. A Bagging algorithm with the CNN model as the weak classifier is proposed: multiple CNN models are trained, and the final classification result is determined by voting. The ensemble method reduces the deviation caused by a single model. Its accuracy is 5.10% higher than that of a single Bi-LSTM model and 1.34% higher than that of a single CNN model.
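
The Bagging-with-voting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the weak classifier here is a hypothetical token-count scorer (TokenVoteClassifier) standing in for a CNN, and the function names, the toy data format (token list plus a 0/1 label), and the seed parameter are all assumptions made for illustration. Only the overall scheme (train several models on bootstrap resamples, then take a majority vote) follows the text.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement (the resampling step of Bagging)."""
    return [rng.choice(data) for _ in data]


class TokenVoteClassifier:
    """Hypothetical stand-in for a CNN weak classifier (assumption, not from the paper).

    It scores a review by how often its tokens occurred in positive
    versus negative training reviews.
    """

    def fit(self, samples):
        self.pos, self.neg = Counter(), Counter()
        for tokens, label in samples:
            (self.pos if label == 1 else self.neg).update(tokens)
        return self

    def predict(self, tokens):
        score = sum(self.pos[t] - self.neg[t] for t in tokens)
        # Non-negative score is treated as positive (1), otherwise negative (0).
        return 1 if score >= 0 else 0


def train_bagging(data, n_models, seed=0):
    """Train n_models weak classifiers, each on its own bootstrap resample."""
    rng = random.Random(seed)
    return [TokenVoteClassifier().fit(bootstrap_sample(data, rng))
            for _ in range(n_models)]


def predict_by_vote(models, tokens):
    """Return the label predicted by the majority of the models."""
    votes = Counter(model.predict(tokens) for model in models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy, made-up reviews: (tokens, label) with 1 = positive, 0 = negative.
    reviews = [
        (["great", "moving"], 1),
        (["wonderful", "great"], 1),
        (["boring", "bad"], 0),
        (["bad", "awful"], 0),
    ]
    ensemble = train_bagging(reviews, n_models=5)
    print(predict_by_vote(ensemble, ["great", "moving"]))
```

Using an odd number of models avoids tied votes in the two-class case. In a setting closer to the paper, each weak classifier would be a CNN over character or word vectors, and only the resampling and voting logic would carry over.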

Key words: bidirectional long short-term memory (Bi-LSTM) model, convolutional neural network (CNN) model, Bagging algorithm, word embedding vector, sentiment analysis of movie reviews
