Yuya Ozawa, Taichi Yatsuka, Sumio Fujita

The 12th NTCIR Conference (NTCIR-12)

June 07, 2016

The Yahoo Japan Search Technology (YJST) team participated in the Japanese iUnit Ranking and Summarization subtasks of NTCIR-12 MobileClick-2. For the iUnit Ranking subtask, we adopted a language model (LM) based approach implemented on top of the organizers’ baseline system. We examined LM-based iUnit ranking using both KL-divergence and negative cross entropy with several model smoothing methods, such as Bayesian smoothing with Dirichlet priors, which is commonly used for document ranking in language modeling IR, and the comparatively new Pitman-Yor process smoothing. Our system achieved a Q-measure of 0.807 on the Japanese ranking test set. For the iUnit Summarization subtask, we used the organizers’ LM-based two-layer iUnit summarization baseline system, but replaced its ranking module with our extended ranking system described above. Because it relies on word-based matching, the baseline intent identification for the second-layer allocation fails to identify any intent when an iUnit and an intent share no common word. We introduced context-based word embedding representations of both iUnits and intents to identify the intent of iUnits that do not contain any explicit intent words. Finally, our system achieved an M-measure of 25.8498 on the Japanese summarization test set.
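As an illustration of the kind of scoring the abstract refers to, the following is a minimal sketch of ranking short texts by negative cross entropy under a Dirichlet-smoothed unigram language model. It is not the YJST system or the organizers' baseline: the function names (`dirichlet_lm`, `neg_cross_entropy`, `rank_iunits`), the toy collection probabilities, the value of `mu`, and the choice to score each iUnit model against a query model are all assumptions made for this example.

```python
import math
from collections import Counter


def dirichlet_lm(tokens, collection_probs, mu=10.0):
    """Return p(w | text) with Dirichlet-prior smoothing (hypothetical helper).

    p(w | text) = (count(w, text) + mu * p(w | collection)) / (len(text) + mu)
    """
    counts = Counter(tokens)
    length = len(tokens)

    def prob(word):
        # Assumption: words unseen in the collection get a tiny floor probability.
        background = collection_probs.get(word, 1e-9)
        return (counts[word] + mu * background) / (length + mu)

    return prob


def neg_cross_entropy(query_tokens, text_prob):
    """Negative cross entropy: sum over w of p_ml(w | query) * log p(w | text)."""
    q_counts = Counter(query_tokens)
    q_len = len(query_tokens)
    return sum((c / q_len) * math.log(text_prob(w)) for w, c in q_counts.items())


def rank_iunits(query_tokens, iunits, collection_probs, mu=10.0):
    """Return iUnit ids sorted from highest to lowest score."""
    scored = [
        (neg_cross_entropy(query_tokens, dirichlet_lm(toks, collection_probs, mu)), uid)
        for uid, toks in iunits.items()
    ]
    return [uid for _, uid in sorted(scored, reverse=True)]


# Toy data, invented for illustration only.
collection_probs = {"tokyo": 0.25, "hotel": 0.25, "price": 0.25, "weather": 0.25}
iunits = {
    "u1": ["tokyo", "hotel", "price"],
    "u2": ["weather", "price"],
}
ranking = rank_iunits(["tokyo", "hotel"], iunits, collection_probs)
print(ranking)
```

In this toy setting the iUnit that shares words with the query receives the higher score. Swapping in another smoothing method, such as one based on the Pitman-Yor process, would only require replacing `dirichlet_lm` with a different probability estimator.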

Paper: YJST at the NTCIR-12 MobileClick-2 Task