Publications

Conference (International) Doc2Vec-based Approach for Extracting Diverse Evaluation Expressions from Online Review Data

Kosuke Kurihara, Yoshiyuki Shoji (Aoyama Gakuin univ.), Sumio Fujita, Martin J. Dürst (Aoyama Gakuin univ.)

The 23rd International Conference on Information Integration and Web Intelligence (iiWAS 2021)

2021.11.29

This paper proposes a method for extracting diverse expressions from online movie review texts for a given keyword query. When people watch a movie that makes them cry, they generally do not say “I cried.” Instead, they use euphemistic language such as “I needed a handkerchief” or “My makeup was running.” To enable information retrieval based on audience reactions such as “movies that make me cry” using review texts, a variety of paraphrased expressions must be collected for arbitrary queries. Our proposed method extracts such expressions from review datasets by applying two extensions to Doc2Vec: 1) it changes the granularity of the training sentences to mitigate a lack of context, and 2) it applies query expansion in advance of the similarity calculation. We conducted a large-scale crowdsourcing experiment with 1.29 million actual sentences taken from Yahoo! Movies, Japan. The experimental results showed that changing the training data granularity and adding query expansion are both effective for accurately collecting more diverse expressions that have a meaning similar to the given query.
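The two extensions named in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: a bag-of-words count vector stands in for a trained Doc2Vec vector, and the function names, the `window` parameter, the sample sentences, and the expansion dictionary are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

# Hypothetical sample data; the paper uses 1.29 million real review sentences.
reviews = [
    "the ending made me cry",
    "i needed a handkerchief",
    "the popcorn was stale",
    "the seats were comfortable",
    "tears ran down my face",
]
# Hypothetical expansion terms for the query (assumption, not from the paper).
expansions = {"cry": ["tears"]}


def widen_context(sentences, window=1):
    """Extension 1 (sketch): pair each sentence with its neighbours so that
    a short sentence is represented together with surrounding context."""
    units = []
    for i in range(len(sentences)):
        lo = max(0, i - window)
        hi = min(len(sentences), i + window + 1)
        units.append(" ".join(sentences[lo:hi]))
    return units


def embed(text):
    """Stand-in for a Doc2Vec vector: a plain bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def expand_query(query, expansions):
    """Extension 2 (sketch): add related terms to the query before the
    similarity calculation."""
    return " ".join([query] + expansions.get(query, []))


def rank(query, sentences, expansions, window=1, top_k=3):
    """Return up to top_k sentences whose context unit resembles the query."""
    units = widen_context(sentences, window)
    q_vec = embed(expand_query(query, expansions))
    scored = [(cosine(q_vec, embed(u)), s) for u, s in zip(units, sentences)]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable sort by score
    return [s for score, s in scored[:top_k] if score > 0]


if __name__ == "__main__":
    print(rank("cry", reviews, {}, window=0))          # no extensions
    print(rank("cry", reviews, expansions, window=0))  # query expansion only
    print(rank("cry", reviews, expansions, window=1))  # both extensions
```

In this toy setting, a sentence such as "i needed a handkerchief" shares no word with the query and becomes retrievable only once its neighbouring sentence is included in its context unit, while "tears ran down my face" becomes retrievable only through the expanded query. A real system would replace `embed` with vectors inferred by a trained Doc2Vec model.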

Paper: Doc2Vec-based Approach for Extracting Diverse Evaluation Expressions from Online Review Data (external site)