Fast Distance Oracles for Any Symmetric Norm

05/30/2022
โˆ™
by   Yichuan Deng, et al.
โˆ™
0
โˆ™

In the Distance Oracle problem, the goal is to preprocess n vectors x_1, x_2, โ‹ฏ, x_n in a d-dimensional metric space (๐•^d, ยท_l) into a cheap data structure, so that given a query vector q โˆˆ๐•^d and a subset SโІ [n] of the input data points, all distances q - x_i _l for x_iโˆˆ S can be quickly approximated (faster than the trivial โˆผ d|S| query time). This primitive is a basic subroutine in machine learning, data mining and similarity search applications. In the case of โ„“_p norms, the problem is well understood, and optimal data structures are known for most values of p. Our main contribution is a fast (1+ฮต) distance oracle for any symmetric norm ยท_l. This class includes โ„“_p norms and Orlicz norms as special cases, as well as other norms used in practice, e.g. top-k norms, max-mixture and sum-mixture of โ„“_p norms, small-support norms and the box-norm. We propose a novel data structure with ร•(n (d + mmc(l)^2 ) ) preprocessing time and space, and t_q = ร•(d + |S| ยทmmc(l)^2) query time, for computing distances to a subset S of data points, where mmc(l) is a complexity-measure (concentration modulus) of the symmetric norm. When l = โ„“_p , this runtime matches the aforementioned state-of-art oracles.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset
Success!
Error Icon An error occurred

Sign in with Google

×

Use your Google Account to sign in to DeepAI

×

Consider DeepAI Pro