Publications

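WORKSHOP (DOMESTIC) A Proposed Method for Acquiring Distributed Document Representations Using Social Media Sentiment and Topics for Nikkei VI Prediction (日経 VI 予測のためのソーシャルメディアの感情とトピックを用いた文書分散表現獲得手法の提案)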

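Kentaro Ueda (Nara Institute of Science and Technology), Hirohiko Suwa (Nara Institute of Science and Technology), Yuki Ogawa (Ritsumeikan University), Eiichi Umehara (Niigata University of International and Information Studies), Tatsuo Yamashita, Kota Tsubouchi, Keiichi Yasumoto (Nara Institute of Science and Technology)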

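Japanese Society for Artificial Intelligence (JSAI) "AI in Society" Study Group, 43rd Meeting ("Joint Workshop on Social Systems and Intelligence") (SIG-SAI2022)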

March 10, 2022

Predicting financial indicators with machine learning is an important topic in the field of Fintech, but it remains a challenging task because of market uncertainty. The Sparse Composite Document Vectors (SCDV) algorithm combines the syntax and semantics learned by a word embedding method with a latent topic model that can handle different word senses, yielding a highly expressive distributed representation of documents. Sentiment information is known to improve the accuracy of social media-based financial index prediction, but the SCDV algorithm may not make good use of it. We develop a distributed representation acquisition method (SSCDV) that captures both the sentiment and the topic information of a document by extending SCDV so that sentiment information is reflected in the embedding. To verify the effectiveness of the proposed method, we obtain distributed representations from social media using SSCDV and use them to predict financial indicators with machine learning. The evaluation results show that the model using the proposed method outperforms models using existing distributed representation acquisition methods (Latent Dirichlet Allocation (LDA), Doc2Vec (PV-DM), BERT, and SCDV), confirming the effectiveness of the proposed method.
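For readers unfamiliar with SCDV, the following is a minimal sketch of how the baseline SCDV composition is commonly described: each word vector is scaled by its soft cluster (topic) probabilities, the scaled copies are concatenated and weighted by IDF, the resulting word-topic vectors are summed per document, and small components are zeroed out. It is not the authors' implementation, and it does not include the sentiment extension of SSCDV, which the abstract does not specify. All function names, the toy vocabulary, and the toy probabilities are hypothetical; in practice the cluster probabilities would typically come from a Gaussian mixture model fitted on the word embeddings.

```python
from math import sqrt

def word_topic_vector(word_vec, cluster_probs, idf):
    """Concatenate one scaled copy of the word vector per cluster, weighted by IDF."""
    return [idf * p * x for p in cluster_probs for x in word_vec]

def document_vector(doc_words, wtv):
    """Sum the word-topic vectors of a document's words and L2-normalize the result."""
    dim = len(next(iter(wtv.values())))
    total = [0.0] * dim
    for w in doc_words:
        if w in wtv:  # skip out-of-vocabulary words
            for j, x in enumerate(wtv[w]):
                total[j] += x
    norm = sqrt(sum(x * x for x in total))
    return [x / norm for x in total] if norm > 0 else total

def sparsify(doc_vecs, p=0.04):
    """Zero out components whose magnitude falls below a corpus-level threshold."""
    a_min = sum(min(v) for v in doc_vecs) / len(doc_vecs)
    a_max = sum(max(v) for v in doc_vecs) / len(doc_vecs)
    threshold = p * (abs(a_min) + abs(a_max)) / 2
    return [[x if abs(x) >= threshold else 0.0 for x in v] for v in doc_vecs]

# Toy inputs (assumed for illustration): 2-dimensional embeddings, 2 clusters.
word_vecs = {"stock": [1.0, 0.0], "fear": [0.0, 1.0]}
cluster_probs = {"stock": [0.9, 0.1], "fear": [0.2, 0.8]}
idf = {"stock": 1.0, "fear": 2.0}

wtv = {w: word_topic_vector(word_vecs[w], cluster_probs[w], idf[w]) for w in word_vecs}
docs = [["stock", "fear"], ["fear", "fear"]]
dense_vecs = [document_vector(d, wtv) for d in docs]
doc_vecs = sparsify(dense_vecs)
```

Each document vector has length (number of clusters) × (embedding dimension), here 4. A sentiment-aware variant such as SSCDV would need an additional step that injects sentiment into these vectors; how that is done is left open here because the abstract gives no details.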