LSH Algorithm and
Locality-Sensitive Hashing (LSH)
is an algorithm for solving the approximate or exact
Near Neighbor Search in high dimensional spaces. This webpage
links to the newest LSH algorithms in Euclidean and
Hamming spaces, as well as
the E2LSH package, an implementation of an
early practical LSH algorithm.
This research was supported in part by NSF CAREER Grant #0133849 "Approximate
Algorithms for High-dimensional Geometric Problems".
Newest (not quite) LSH algorithms (2014):
These algorithms achieve performance better than the classic LSH algorithms by
using data-dependent hashing. They improve over classic LSH
algorithms for both Hamming and Euclidean space. These algorithms are
not dynamic however, in contrast to the classic LSH algorithms, which
use data-independent hashing and hence allow updates to the pointset.
Data-Dependent Hashing for Approximate Near Neighbors
(by Alexandr Andoni and Ilya Razenshteyn). Manuscript 2014.
Locality Sensitive Hashing
(by Alexandr Andoni, Piotr Indyk, Huy L. Nguyen, and Ilya Razenshteyn). In SODA'14.
Slides: Here are some
slides by Alexandr Andoni on the early version from SODA'14.
Survey of LSH in CACM (2008):
"Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in
High Dimensions" (by Alexandr Andoni and Piotr Indyk).
Communications of the ACM, vol. 51, no. 1, 2008, pp. 117-122.
also available directly from
CACM (for free).
Most Not so
recent algorithm for Euclidean space (2006):
Hashing Algorithms for Near Neighbor Problem in High Dimensions"
(by Alexandr Andoni and Piotr Indyk). In FOCS'06.
on this LSH algorithm from a talk given by Piotr Indyk.
Earlier algorithm for Euclidean
space (2006): a good introduction to LSH, and the description of
affairs as of 2006, is in the following book chapter
Hashing Scheme Based on p-Stable
Distributions (by Alexandr Andoni, Mayur Datar, Nicole Immorlica,
Piotr Indyk, and Vahab Mirrokni), appearing in the book
Neighbor Methods in Learning and
Vision: Theory and Practice,
by T. Darrell and P. Indyk and G. Shakhnarovich (eds.), MIT Press, 2006.
See also the book
introduction for a smooth introduction to NN problem and LSH.
Original LSH algorithm (1999):
the best algorithm for the Hamming space remains
previous version of the algorithm for the
Hamming distance is described in [GIM'99]
Implementation of LSH:
download the E2LSH package (alpha-version). The code is based on the algorithm
described in the book chapter (2006) from above.
You can download the manual for the code. The code has been developed by Alex Andoni in 2004-2005.