A faster subquadratic algorithm for finding outlier correlations

Tutkimustuotos: Lehtiartikkelivertaisarvioitu

Standard

A faster subquadratic algorithm for finding outlier correlations. / Karppa, Matti; Kaski, Petteri; Kohonen, Jukka.

julkaisussa: ACM Transactions on Algorithms, Vuosikerta 14, Nro 3, 31, 01.07.2018, s. 1-26.

Tutkimustuotos: Lehtiartikkelivertaisarvioitu

Harvard

APA

Vancouver

Author

Bibtex - Lataa

@article{8739f9567c874027b2d34c29896fec10,
title = "A faster subquadratic algorithm for finding outlier correlations",
abstract = "We study the problem of detecting outlier pairs of strongly correlated variables among a collection of n variables with otherwise weak pairwise correlations. After normalization, this task amounts to the geometric task where we are given as input a set of n vectors with unit Euclidean norm and dimension d, and for some constants 0 < τ < ρ < 1, we are asked to find all the outlier pairs of vectors whose inner product is at least ρ in absolute value, subject to the promise that all but at most q pairs of vectors have inner product at most τ in absolute value. Improving on an algorithm of Valiant [FOCS 2012; J. ACM 2015], we present a randomized algorithm that for Boolean inputs ({−1, 1}-valued data normalized to unit Euclidean length) runs in time (Equation presented), where 0 < γ < 1 is a constant tradeoff parameter and M(μ, ν) is the exponent to multiply an [nμ] × [nν ] matrix with an [nν] × [nμ] matrix and Δ = 1/(1 − logτ ρ). As corollaries we obtain randomized algorithms that run in time (Equation presented) and in time (Equation presented), where 2 ≤ ω < 2.38 is the exponent for square matrix multiplication and 0.3 < α ≤ 1 is the exponent for rectangular matrix multiplication. The notation {\~O} (·) hides polylogarithmic factors in n and d whose degree may depend on ρ and τ. We present further corollaries for the light bulb problem and for learning sparse Boolean functions.",
keywords = "Correlation, Fast matrix multiplication, Light bulb problem, Rectangular matrix multiplication, Similarity search",
author = "Matti Karppa and Petteri Kaski and Jukka Kohonen",
year = "2018",
month = "7",
day = "1",
doi = "10.1145/3174804",
language = "English",
volume = "14",
pages = "1--26",
journal = "ACM Transactions on Algorithms",
issn = "1549-6325",
number = "3",

}

RIS - Lataa

TY - JOUR

T1 - A faster subquadratic algorithm for finding outlier correlations

AU - Karppa, Matti

AU - Kaski, Petteri

AU - Kohonen, Jukka

PY - 2018/7/1

Y1 - 2018/7/1

N2 - We study the problem of detecting outlier pairs of strongly correlated variables among a collection of n variables with otherwise weak pairwise correlations. After normalization, this task amounts to the geometric task where we are given as input a set of n vectors with unit Euclidean norm and dimension d, and for some constants 0 < τ < ρ < 1, we are asked to find all the outlier pairs of vectors whose inner product is at least ρ in absolute value, subject to the promise that all but at most q pairs of vectors have inner product at most τ in absolute value. Improving on an algorithm of Valiant [FOCS 2012; J. ACM 2015], we present a randomized algorithm that for Boolean inputs ({−1, 1}-valued data normalized to unit Euclidean length) runs in time (Equation presented), where 0 < γ < 1 is a constant tradeoff parameter and M(μ, ν) is the exponent to multiply an [nμ] × [nν ] matrix with an [nν] × [nμ] matrix and Δ = 1/(1 − logτ ρ). As corollaries we obtain randomized algorithms that run in time (Equation presented) and in time (Equation presented), where 2 ≤ ω < 2.38 is the exponent for square matrix multiplication and 0.3 < α ≤ 1 is the exponent for rectangular matrix multiplication. The notation Õ (·) hides polylogarithmic factors in n and d whose degree may depend on ρ and τ. We present further corollaries for the light bulb problem and for learning sparse Boolean functions.

AB - We study the problem of detecting outlier pairs of strongly correlated variables among a collection of n variables with otherwise weak pairwise correlations. After normalization, this task amounts to the geometric task where we are given as input a set of n vectors with unit Euclidean norm and dimension d, and for some constants 0 < τ < ρ < 1, we are asked to find all the outlier pairs of vectors whose inner product is at least ρ in absolute value, subject to the promise that all but at most q pairs of vectors have inner product at most τ in absolute value. Improving on an algorithm of Valiant [FOCS 2012; J. ACM 2015], we present a randomized algorithm that for Boolean inputs ({−1, 1}-valued data normalized to unit Euclidean length) runs in time (Equation presented), where 0 < γ < 1 is a constant tradeoff parameter and M(μ, ν) is the exponent to multiply an [nμ] × [nν ] matrix with an [nν] × [nμ] matrix and Δ = 1/(1 − logτ ρ). As corollaries we obtain randomized algorithms that run in time (Equation presented) and in time (Equation presented), where 2 ≤ ω < 2.38 is the exponent for square matrix multiplication and 0.3 < α ≤ 1 is the exponent for rectangular matrix multiplication. The notation Õ (·) hides polylogarithmic factors in n and d whose degree may depend on ρ and τ. We present further corollaries for the light bulb problem and for learning sparse Boolean functions.

KW - Correlation

KW - Fast matrix multiplication

KW - Light bulb problem

KW - Rectangular matrix multiplication

KW - Similarity search

UR - http://www.scopus.com/inward/record.url?scp=85052553224&partnerID=8YFLogxK

U2 - 10.1145/3174804

DO - 10.1145/3174804

M3 - Article

VL - 14

SP - 1

EP - 26

JO - ACM Transactions on Algorithms

JF - ACM Transactions on Algorithms

SN - 1549-6325

IS - 3

M1 - 31

ER -

ID: 28318629