# emnlp - Hashing sketching and other approximate algorithms...

1 Hashing, sketching, and other approximate algorithms for high-dimensional data
Piotr Indyk, MIT

2 Plan
• Intro
  – High dimensionality
  – Problems
• Technique: randomized projection
  – Intuition
  – Proofoid
• Applications:
  – Sketching/streaming
  – Nearest Neighbor Search
• Conclusions
• Refs

3 High-Dimensional Data
• Document: “To be or not to be …”
• Its word-count vector, with coordinates for “to”, “be”, “or”, “not”:
  (…, 2, …, 2, …, 1, …, 1, …)
• Further vectors shown on the slide:
  (…, 1, …, 4, …, 2, …, 2, …)
  (…, 6, …, 1, …, 3, …, 6, …)
  (…, 1, …, 3, …, 7, …, 5, …)
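
The word-count representation on this slide can be sketched in a few lines of Python. This is only an illustration: the tokenizer (lowercasing and splitting on whitespace) and the helper name `word_histogram` are assumptions, not something the slides specify.

```python
from collections import Counter


def word_histogram(text: str) -> Counter:
    """Count how often each word occurs (hypothetical helper).

    Assumption: words are lowercased and separated by whitespace.
    """
    return Counter(text.lower().split())


hist = word_histogram("To be or not to be")
# Each distinct word is one coordinate of a very high-dimensional vector;
# for any single document almost all coordinates are zero.
print(hist["to"], hist["be"], hist["or"], hist["not"])  # 2 2 1 1
```

A `Counter` stores only the nonzero coordinates, which is why it is a convenient stand-in for a vector whose dimension equals the vocabulary size.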

4 Problems
• Storage
  – How to represent the data “accurately” using “small” space
• Search
  – How to find “similar” documents
• Learning, etc…

5 Randomized Dimensionality Reduction

6 Randomized Dimensionality Reduction (a.k.a. “Flattening Lemma”)
• Johnson-Lindenstrauss lemma (1984)
  – Choose the projection plane “at random”
  – The distances are “approximately” preserved with “high” probability

7 Dimensionality Reduction, Formally
• JL: For any set of $n$ points $X$ in $\mathbb{R}^d$ under the Euclidean norm, there is a $(1+\varepsilon)$-distortion embedding of $X$ into $\mathbb{R}^{d'}$, for $d' = O(\log n / \varepsilon^2)$
• JL’: There is a distribution over random linear mappings $A: \mathbb{R}^d \to \mathbb{R}^{d'}$, such that for any vector $x$ we have $\|Ax\| = (1 \pm \varepsilon)\,\|x\|$ with probability $1 - e^{-C d' \varepsilon^2}$
• Questions:
  – What is the distribution?
  – Why does it work?
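
As a rough empirical illustration of the JL statement, the sketch below projects a handful of random points with a Gaussian matrix and reports the worst relative change in any pairwise distance. The sizes, the uniform test points, the rescaling by 1/sqrt(d'), and the names `random_projection` and `max_distortion` are all assumptions made for illustration. The slides fix none of them, and no particular constant in d' = O(log n / ε²) is implied.

```python
import math
import random


def random_projection(points, d_out, rng):
    """Map each point x to A x / sqrt(d_out), where A has i.i.d. N(0, 1) entries."""
    d_in = len(points[0])
    A = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]
    scale = 1.0 / math.sqrt(d_out)
    return [[scale * sum(a_i * x_i for a_i, x_i in zip(row, x)) for row in A]
            for x in points]


def max_distortion(points, images):
    """Largest relative change of any pairwise Euclidean distance."""
    worst = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            before = math.dist(points[i], points[j])
            after = math.dist(images[i], images[j])
            worst = max(worst, abs(after / before - 1.0))
    return worst


rng = random.Random(0)           # fixed seed so the run is repeatable
n, d, d_out = 10, 1000, 200      # illustrative sizes, chosen by hand
X = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(n)]
Y = random_projection(X, d_out, rng)
print(f"max pairwise distortion: {max_distortion(X, Y):.3f}")
```

The same matrix is applied to every point, so the mapping is linear and the distance between two images is the norm of the image of their difference.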

8 Normal Distribution
• Normal distribution:
  – Range: $(-\infty, \infty)$
  – Density: $f(x) = e^{-x^2/2} / \sqrt{2\pi}$
  – Mean = 0, Variance = 1
• Basic facts:
  – If $X$ and $Y$ are independent r.v. with normal distribution, then $X+Y$ has normal distribution
  – $\mathrm{Var}(cX) = c^2\,\mathrm{Var}(X)$
  – If $X, Y$ are independent, then $\mathrm{Var}(X+Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$
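
The two variance facts can be checked numerically. The following is a minimal sketch using only the Python standard library. The sample size, the constant `c`, and the helper `variance` are arbitrary choices made here, not taken from the slides.

```python
import random


def variance(samples):
    """Population variance of a list of numbers."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


rng = random.Random(1)   # fixed seed so the run is repeatable
N = 100_000              # number of samples (arbitrary)
c = 3.0                  # scaling constant (arbitrary)

xs = [rng.gauss(0.0, 1.0) for _ in range(N)]   # X ~ N(0, 1)
ys = [rng.gauss(0.0, 1.0) for _ in range(N)]   # Y ~ N(0, 1), independent of X

var_x = variance(xs)
var_cx = variance([c * x for x in xs])
var_sum = variance([x + y for x, y in zip(xs, ys)])

print(f"Var(X)   = {var_x:.3f}   (theory: 1)")
print(f"Var(cX)  = {var_cx:.3f}   (theory: c^2 = {c * c:.0f})")
print(f"Var(X+Y) = {var_sum:.3f}   (theory: 2)")
```

The estimates fluctuate around the theoretical values. Only the identity Var(cX) = c² Var(X) holds for the sample itself up to rounding.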

9 Back to the Embedding
• We use the mapping $x \mapsto Ax$, where each entry of $A$ has normal distribution
• Let $a^1, \dots, a^{d'}$ be the rows of $A$
• Consider a single row $a = a^j$ and $Z = a^j \cdot x = a \cdot x = \sum_i a_i x_i$
• Each term $a_i x_i$
  – Has normal distribution
  – With variance $x_i^2$
• Thus, $Z$ has normal distribution with variance $\sum_i x_i^2 = \|x\|^2$
• This holds for each $a^j$
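
The variance claim follows step by step from the basic facts on the Normal Distribution slide. Written out for one row $a$ of $A$, assuming its entries $a_i$ are independent with mean 0 and variance 1:

$$
\mathrm{Var}(Z)
= \mathrm{Var}\Big(\sum_i a_i x_i\Big)
= \sum_i \mathrm{Var}(a_i x_i)
= \sum_i x_i^2\,\mathrm{Var}(a_i)
= \sum_i x_i^2
= \|x\|^2 .
$$

The second equality uses independence of the $a_i$, the third uses $\mathrm{Var}(cX) = c^2\,\mathrm{Var}(X)$ with $c = x_i$, and the fourth uses $\mathrm{Var}(a_i) = 1$. $Z$ is normal because it is a sum of independent normal variables, and it has mean 0 because every term does.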

10 What is $\|Ax\|^2$?
• $\|Ax\|^2 = (a^1 \cdot x)^2 + \dots + (a^{d'} \cdot x)^2 = Z_1^2 + \dots + Z_{d'}^2$, where:
  – All $Z_i$’s are independent
  – Each has normal distribution with variance $\|x\|^2$
• Therefore, $E[\|Ax\|^2] = d' \cdot E[Z_1^2] = d'\,\|x\|^2$
• By the “law of large numbers” (quantitative): $\Pr\big[\,\big|\|Ax\|^2 - d'\|x\|^2\big| > \varepsilon\, d'\,\|x\|^2\,\big] < e^{-C d' \varepsilon^2}$ for some constant $C$
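
One way to connect this bound back to the JL and JL’ statements on slide 7 is sketched below. The rescaling by $1/\sqrt{d'}$ and the union bound are standard steps that the slide does not spell out, so treat this as a reconstruction rather than the author's exact argument. For a unit vector the deviation threshold is $\varepsilon d'$. For general $x$, apply the bound to $x/\|x\|$ and use linearity of $A$, which gives

$$
(1-\varepsilon)\,\|x\|^2 \;\le\; \Big\|\tfrac{1}{\sqrt{d'}}\,Ax\Big\|^2 \;\le\; (1+\varepsilon)\,\|x\|^2
\quad\text{with probability at least } 1 - e^{-C d' \varepsilon^2}.
$$

For $0 < \varepsilon < 1$ we have $\sqrt{1-\varepsilon} \ge 1-\varepsilon$ and $\sqrt{1+\varepsilon} \le 1+\varepsilon$, so the rescaled map satisfies $\|\tfrac{1}{\sqrt{d'}}Ax\| = (1 \pm \varepsilon)\,\|x\|$, which is JL’. To obtain JL for a set of $n$ points, apply JL’ to the difference $x = u - v$ of every pair and take a union bound:

$$
\Pr[\text{some pair fails}]
\;\le\; \binom{n}{2}\, e^{-C d' \varepsilon^2}
\;<\; \tfrac{1}{2}\, n^2\, e^{-C d' \varepsilon^2}
\;\le\; \tfrac{1}{2}
\quad\text{whenever } d' \ge \frac{2 \ln n}{C\,\varepsilon^2}.
$$

So $d' = O(\log n / \varepsilon^2)$ suffices for a random $A$ to preserve all pairwise distances up to a factor $1 \pm \varepsilon$ with probability at least $1/2$.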

11 Streaming/sketching implications
• Can replace $d$-dimensional vectors by $d'$-dimensional ones
  – Cost: $O(dd')$ per vector
  – Faster method known [Ailon-Chazelle’06]
• Can avoid storing the original $d$-dimensional vectors in the first place (thanks to linearity of the mapping $A$)
  – Suppose:
    • $x$ is the histogram of a document
    • We are receiving a stream of document words
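
The preview cuts off here, so the following is only a sketch of what linearity makes possible, not the slide's own continuation. Each incoming word adds 1 to one coordinate of the histogram $x$, so $Ax$ grows by the matching column of $A$. Regenerating that column from a generator seeded with the word is an assumption made here to avoid storing $A$, and the names `column` and `sketch_stream` are hypothetical.

```python
import random


def column(word: str, d_out: int, seed: int = 0):
    """Column of A for `word`, regenerated on demand (hypothetical helper).

    Assumption: seeding a generator with the word stands in for storing A;
    the same word always yields the same d_out Gaussian entries.
    """
    rng = random.Random(f"{seed}:{word}")
    return [rng.gauss(0.0, 1.0) for _ in range(d_out)]


def sketch_stream(words, d_out: int, seed: int = 0):
    """Maintain A x while the histogram x itself is never materialized."""
    sketch = [0.0] * d_out
    for word in words:
        # Seeing `word` adds 1 to one coordinate of x, so by linearity
        # A x grows by the corresponding column of A.
        for k, a in enumerate(column(word, d_out, seed)):
            sketch[k] += a
    return sketch


doc = "to be or not to be".split()
s = sketch_stream(doc, d_out=8)

# Linearity check: the sketch equals the count-weighted sum of columns.
expected = [2 * t + 2 * b + o + n for t, b, o, n in zip(
    column("to", 8), column("be", 8), column("or", 8), column("not", 8))]
print(all(abs(u - v) < 1e-9 for u, v in zip(s, expected)))  # True
```

Memory use is d' numbers per document regardless of vocabulary size, at the price of regenerating a column for every word seen.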
