カンファレンス (国際) Simpler is Much Faster: Fair and Independent Inner Product Search
Kazuyoshi Aoyama (Osaka University), Daichi Amagata (Osaka University), Sumio Fujita, Takahiro Hara (Osaka University)
The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023)
The problem of inner product search (IPS) is important in many fields. Although maximum inner product search (MIPS) is often considered, its result is usually skewed and static. Users are hence hard to obtain diverse and/or new items by using the MIPS problem. Motivated by this, we formulate a new problem, namely the fair and independent IPS problem. Given a query, a threshold, and an output size 𝑘, this problem randomly samples 𝑘 items from a set of items such that the inner product of the query and item is not less than the threshold. For each item that satisfies the threshold, this problem is fair, because the probability that such an item is outputted is equal to that for each other item. This fairness can yield diversity and novelty, but this problem faces a computational challenge. Some existing (M)IPS techniques can be employed in this problem, but they require 𝑂(𝑛) or 𝑜 (𝑛) time, where 𝑛 is the dataset size. To scale well to large datasets, we propose a simple yet efficient algorithm that runs in 𝑂(log𝑛 + 𝑘) expected time. We conduct experiments using real datasets, and the results demonstrate that our algorithm is up to 330 times faster than baselines.