Other (International) A Step from VQA to CQA: Adapting Visual QA Models for Community QA Tasks
Avikalp Srivastava (CMU), HsinWen Liu (Waseda Univ.), Sumio Fujita
In this paper, we study and develop methods for deriving high-level representations of image-text pairs in image-based community question answering (CQA), and use them for two tasks of practical significance on such a platform: automated question classification and finding experts to answer a question. We also intend this work as a step towards basic question answering on image-based CQA, and we want to build on advances in visual question answering (VQA) models. To that end, we analyze the differences between VQA and CQA datasets, examine the limitations of applying VQA models directly to CQA data and tasks, and make novel augmentations to VQA-inspired models to best exploit the multimodal data in Yahoo! Chiebukuro’s CQA dataset.