Conference (International) TKB48 at TREC 2021 News Track

Lirong Zhang (University of Tsukuba), Hideo Joho (University of Tsukuba), Sumio Fujita



TKB48 incorporated document expansion methods, namely docT5query and keyword extraction, into indexing to address the background linking problem. Using a transformer-based model, we calculated the text similarity of queries and documents at a semantic level and combined this semantic similarity with the BM25 score to re-rank background articles. We examined different combinations of re-ranking factors, such as semantic similarities between expanded documents and attributes of topics. We found that adding index fields produced by the docT5query model and the keyword extraction model was beneficial. At the same time, re-ranking performance was influenced by the number of semantic similarity factors and their weight in the total relevance score. To examine the effectiveness of document expansion and of our method using temporal recency, we further generated several unofficial runs incorporating a temporal topic classifier and a learning-to-rank method. However, the lack of temporal topics limited the performance of the model. Our proposed algorithm outperformed the learning-to-rank method. Our future work will focus on fine-tuning the docT5query model.
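
The combination of semantic similarity and BM25 score described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything below is an assumption made for illustration: the linear interpolation, the min-max normalization of BM25 scores, the `weight` parameter, and the function names are all hypothetical, and the short vectors stand in for embeddings that a transformer-based model would produce.

```python
from math import sqrt


def min_max(scores):
    """Scale raw scores to [0, 1]; all-equal inputs map to 0.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rerank(query_vec, candidates, weight=0.5):
    """Re-rank candidates by an assumed linear mix of BM25 and similarity.

    candidates: list of (doc_id, bm25_score, doc_vec) tuples.
    weight: hypothetical share given to semantic similarity (0.0 to 1.0).
    Returns (doc_id, combined_score) pairs, best first.
    """
    if not candidates:
        return []
    bm25 = min_max([c[1] for c in candidates])
    sims = [cosine(query_vec, c[2]) for c in candidates]
    combined = [
        (c[0], (1 - weight) * b + weight * s)
        for c, b, s in zip(candidates, bm25, sims)
    ]
    return sorted(combined, key=lambda pair: pair[1], reverse=True)


# Toy data: document ids, BM25 scores, and 2-d stand-in embeddings.
query = [1.0, 0.0]
docs = [
    ("a", 10.0, [0.0, 1.0]),
    ("b", 5.0, [1.0, 0.0]),
    ("c", 0.0, [1.0, 1.0]),
]
ranking = [doc_id for doc_id, _ in rerank(query, docs, weight=0.5)]
```

Changing `weight` shifts the ranking between pure BM25 order (`weight=0.0`) and pure semantic order (`weight=1.0`), which is one simple way to see why the weight of the similarity factors in the total relevance score can affect re-ranking performance.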

Paper: TKB48 at TREC 2021 News Track (external site)