Locality Sensitive Hashing

Published: June 01, 2016

Locality-sensitive hashing (LSH) reduces the dimensionality of high-dimensional data, as it is the text data. LSH hashes input items so that similar items map to the same “buckets” with high probability (the number of buckets being much smaller than the universe of possible input items). LSH differs from conventional and cryptographic hash functions because it aims to maximize the probability of a “collision” for similar items. Locality-sensitive hashing has much in common with data clustering and nearest neighbor search.

Material

Python Locality Sensitive Hashing library that optionally supports persistence via redis

Papers

Datar, M.; Immorlica, N.; Indyk, P.; Mirrokni, V.S. (2004). Locality-Sensitive Hashing Scheme Based on p-Stable Distributions. Proceedings of the Symposium on Computational Geometry.
Pauleve, L.; Jegou, H.; Amsaleg, L. (2010). Locality sensitive hashing: A comparison of hash function types and querying mechanisms. Pattern recognition Letters.
Buhler, J. (2001). Efficient large-scale sequence comparison by locality-sensitive hashing. Bioinformatics, 17(5), 419-428.
Slaney, M., & Casey, M. (2008). Locality-sensitive hashing for finding nearest neighbors [lecture notes]. IEEE Signal Processing Magazine, 25(2), 128-131.
Kulis, B., & Grauman, K. (2012). Kernelized locality-sensitive hashing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(6), 1092-1104.
Satuluri, V., & Parthasarathy, S. (2012). Bayesian locality sensitive hashing for fast similarity search. Proceedings of the VLDB Endowment, 5(5), 430-441.
Koga, H., Ishibashi, T., & Watanabe, T. (2007). Fast agglomerative hierarchical clustering algorithm using Locality-Sensitive Hashing. Knowledge and Information Systems, 12(1), 25-53.

Books

Rajaraman, A.; Ullman, J. (2010). Mining of Massive Datasets, Ch. 3. Cambridge University Press

Share on

Twitter Facebook Xing LinkedIn Telegram Whatsapp

Locality Sensitive Hashing

See also

Material

Papers

Books

Share on

You May Also Enjoy

¿Para qué sirven los modelos?

La importancia de como mirar los datos (I): Introducción

Futur de la llar a mig termini

Chatbot: oportunidades e a miña propia proposición