Journal of Shanghai University (Natural Science Edition), 2023, Vol. 29, Issue (1): 118-128. doi: 10.12066/j.issn.1007-2861.2308

• Research Article •

BERT-based sentiment analysis model for financial text

ZHU He, LU Xiaofeng, XUE Lei

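  1. School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China
  • Received: 2020-12-22 Online: 2023-02-28 Published: 2023-03-28
  • Corresponding author: XUE Lei E-mail: xuelei@shu.edu.cn
  • About the author: XUE Lei (born 1963), male, associate professor, Ph.D.; research interest: pattern recognition. E-mail: xuelei@shu.edu.cn
  • Funding:
    Supported by the Science and Technology Commission of Shanghai Municipality (19511105503)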

Emotional analysis model of financial text based on the BERT

ZHU He, LU Xiaofeng, XUE Lei

  1. School of Communication & Information Engineering, Shanghai University, Shanghai 200444, China
  • Received:2020-12-22 Online:2023-02-28 Published:2023-03-28
  • Contact: XUE Lei E-mail:xuelei@shu.edu.cn

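Abstract:

In the financial sector, a growing number of investors share their views on internet platforms. As carriers of public opinion, these comment texts can fully reflect investor sentiment and influence investment decisions and market trends. Sentiment analysis, an important branch of natural language processing (NLP), provides an effective means of analyzing the sentiment types of massive volumes of financial text. Because domain-specific text is highly specialized and large labeled datasets are not applicable, sentiment analysis of financial text poses a major challenge to traditional sentiment analysis models, which perform poorly in accuracy and recall. To overcome these challenges, for the sentiment analysis task on financial text and starting from the word representation model, a BERT (bidirectional encoder representations from Transformers) preprocessing model for the financial domain, based on full word coverage and feature enhancement, is proposed.

Keywords: sentiment analysis, word embedding vector, BERT, part-of-speech features, named entity recognition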

Abstract:

In the financial sector, more and more investors choose to express their opinions on internet platforms. These comment texts can fully reflect investor sentiment and influence investment decisions and market trends. Sentiment analysis, an important branch of natural language processing (NLP), provides an effective means of analyzing the sentiment types of large volumes of financial text. However, because domain-specific texts are highly specialized and large labeled data sets are not applicable, sentiment analysis in the financial field poses a great challenge to traditional sentiment analysis models. When a general sentiment analysis model is applied to a specific field such as finance, its accuracy and recall are poor. To overcome these challenges, a BERT (bidirectional encoder representations from Transformers) preprocessing model for the financial field, based on full word coverage and feature enhancement, is proposed for the sentiment analysis of financial text from the perspective of the word representation model.

Key words: sentiment analysis, word embedding vector, BERT, bag-of-POS (part of speech), named entity recognition

CLC number: