CONFERENCE (INTERNATIONAL) Classifying Community QA Questions That Contain an Image

Kenta Tamaki (Waseda Univ.), Riku Togashi, Sosuke Kato (Waseda Univ.), Sumio Fujita, Hideyuki Maeda (CyberAgent, Inc.), Tetsuya Sakai (Waseda Univ.)

The 4th ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR 2018)

September 14, 2018

We consider the problem of automatically assigning a category to a given question posted to a Community Question Answering (CQA) site, where the question contains not only text but also an image. For example, CQA users may post a photograph of a dress and ask the community "Is this appropriate for a wedding?" where the appropriate category for this question might be "Manners, Ceremonial occasions." We tackle this problem using Convolutional Neural Networks with a DualNet architecture for combining the image and text representations. Our experiments with real data from Yahoo Chiebukuro and crowdsourced gold-standard categories show that the DualNet approach outperforms a text-only baseline (p=.0000), a sum-and-product baseline (p=.0000), Multimodal Compact Bilinear pooling (MCB) (p=.0000), and a combination of sum-and-product and MCB (p=.0000), where the p-values are based on a Tukey Honestly Significant Difference test with B = 5000 trials.
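To make the idea of combining image and text representations concrete, the following is a minimal sketch of a sum-and-product style fusion. It assumes that "sum-and-product" means concatenating the element-wise sum and the element-wise product of two equal-length feature vectors; the abstract does not spell this out. The function names, the toy feature values, and the linear scoring step are all hypothetical illustrations and are not taken from the paper.

```python
from typing import List, Sequence


def sum_and_product_fusion(
    image_vec: Sequence[float], text_vec: Sequence[float]
) -> List[float]:
    """Concatenate the element-wise sum and product of two feature vectors.

    Assumption: this is one plausible reading of "sum-and-product";
    the paper's actual baseline may differ.
    """
    if len(image_vec) != len(text_vec):
        raise ValueError("image and text features must have the same dimension")
    summed = [i + t for i, t in zip(image_vec, text_vec)]
    multiplied = [i * t for i, t in zip(image_vec, text_vec)]
    return summed + multiplied


def predict_category(
    fused: Sequence[float],
    weights: Sequence[Sequence[float]],
    categories: Sequence[str],
) -> str:
    """Score each category with a (hypothetical) linear layer; return the top one."""
    scores = [sum(w * x for w, x in zip(row, fused)) for row in weights]
    return categories[scores.index(max(scores))]


if __name__ == "__main__":
    # Toy feature vectors standing in for CNN outputs (hypothetical values).
    image_features = [0.5, 1.0]
    text_features = [2.0, -1.0]
    fused = sum_and_product_fusion(image_features, text_features)
    print(fused)  # [2.5, 0.0, 1.0, -1.0]
```

In a real system the two input vectors would come from trained image and text encoders, and the classifier weights would be learned rather than supplied by hand.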

Paper: Classifying Community QA Questions That Contain an Image (external link)