GPU accelerated k-nearest neighbor kernel for sparse feature datasets

Brady Zhou, George Biros
UT MS Research
[pdf] [code]

In this report, we present an efficient library for computing k-nearest neighbors (kNN) on datasets with sparse features (that is, most of the features per database entry are zero). Our work uses advances in parallel computing and optimized GPU routines. This GPU implementation utilizes highly parallel routines that exploit the sparsity property and in cases of extreme sparsity, we are able to achieve over 100x speedup in time compared to other state-of-the-art approaches designed for more general (dense) features.