
# Chapter 15: Vector Semantics


$$t = \frac{\bar{x} - \mu}{\sqrt{\frac{s^2}{N}}} \tag{15.12}$$

When applied to association between words, the null hypothesis is that the two words are independent, and hence $P(a,b) = P(a)P(b)$ correctly models the relationship between the two words. We want to know how different the actual MLE probability $P(a,b)$ is from this null hypothesis value, normalized by the variance. The variance $s^2$ can be approximated by the expected probability $P(a)P(b)$ (see Manning and Schütze (1999)). Ignoring $N$ (since it is constant), the resulting t-test association measure is thus (Curran, 2003):

$$\text{t-test}(a,b) = \frac{P(a,b) - P(a)P(b)}{\sqrt{P(a)P(b)}} \tag{15.13}$$

See the Historical Notes section for a summary of various other weighting factors for distributional models of meaning.

## 15.3 Measuring similarity: the cosine

To define similarity between two target words $v$ and $w$, we need a measure for taking two such vectors and giving a measure of vector similarity. By far the most common similarity metric is the cosine of the angle between the vectors. In this section we'll motivate and introduce this important measure.

The cosine, like most measures for vector similarity used in NLP, is based on the **dot product** operator from linear algebra, also called the **inner product**:

$$\text{dot-product}(\vec{v}, \vec{w}) = \vec{v} \cdot \vec{w} = \sum_{i=1}^{N} v_i w_i = v_1 w_1 + v_2 w_2 + \dots + v_N w_N \tag{15.14}$$

As we will see, most metrics for similarity between vectors are based on the dot product. The dot product acts as a similarity metric because it will tend to be high just when the two vectors have large values in the same dimensions. Alternatively, vectors that have zeros in different dimensions (orthogonal vectors) will have a dot product of 0, representing their strong dissimilarity.

This raw dot product, however, has a problem as a similarity metric: it favors long vectors. The **vector length** is defined as

$$|\vec{v}| = \sqrt{\sum_{i=1}^{N} v_i^2} \tag{15.15}$$

The dot product is higher if a vector is longer, with higher values in each dimension. More frequent words have longer vectors, since they tend to co-occur with more words and have higher co-occurrence values with each of them. The raw dot product thus will be higher for frequent words. But this is a problem; we'd like a similarity metric that tells us how similar two words are regardless of their frequency.
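The quantities in Equations 15.13–15.15 are straightforward to compute directly. Below is a minimal sketch in plain Python; the function names and the toy inputs are illustrative, not from the text:

```python
import math

def t_test_assoc(p_ab, p_a, p_b):
    """t-test association measure (Eq. 15.13): how far the observed joint
    probability P(a,b) is from the independence estimate P(a)P(b),
    normalized by sqrt(P(a)P(b))."""
    expected = p_a * p_b
    return (p_ab - expected) / math.sqrt(expected)

def dot_product(v, w):
    """Dot (inner) product of two equal-length vectors (Eq. 15.14)."""
    return sum(vi * wi for vi, wi in zip(v, w))

def vector_length(v):
    """Euclidean length |v| (Eq. 15.15)."""
    return math.sqrt(sum(vi * vi for vi in v))

# Toy probabilities (illustrative): a pair that co-occurs more often than
# chance predicts gets a positive score; an independent pair scores near 0.
print(t_test_assoc(0.02, 0.1, 0.1))   # observed 0.02 > expected 0.01
print(t_test_assoc(0.01, 0.1, 0.1))   # observed equals expected, near 0

print(dot_product([1, 2, 3], [4, 5, 6]))  # 4 + 10 + 18 = 32
print(vector_length([3, 4]))              # 5.0
```

Positive t-test scores indicate a pair that co-occurs more often than the independence assumption predicts; scores near zero indicate no association.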
The simplest way to modify the dot product to normalize for the vector length is to divide the dot product by the lengths of each of the two vectors. This normalized dot product turns out to be the same as the cosine of the angle between the two vectors, following from the definition of the dot product between two vectors $\vec{a}$ and $\vec{b}$:

$$\vec{a} \cdot \vec{b} = |\vec{a}||\vec{b}|\cos\theta$$

$$\frac{\vec{a} \cdot \vec{b}}{|\vec{a}||\vec{b}|} = \cos\theta \tag{15.16}$$

The **cosine** similarity metric between two vectors $\vec{v}$ and $\vec{w}$ thus can be computed as:

$$\text{cosine}(\vec{v}, \vec{w}) = \frac{\vec{v} \cdot \vec{w}}{|\vec{v}||\vec{w}|} = \frac{\sum_{i=1}^{N} v_i w_i}{\sqrt{\sum_{i=1}^{N} v_i^2}\,\sqrt{\sum_{i=1}^{N} w_i^2}} \tag{15.17}$$

For some applications we pre-normalize each vector, by dividing it by its length, creating a **unit vector** of length 1. Thus we could compute a unit vector from $\vec{a}$ by dividing it by its length $|\vec{a}|$.
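Both routes to the cosine described above, dividing the dot product by the two vector lengths (Eq. 15.17) and pre-normalizing each vector to a unit vector first, can be sketched in plain Python (function names and example vectors are illustrative):

```python
import math

def cosine(v, w):
    """Cosine similarity (Eq. 15.17): dot product over the product of lengths."""
    dot = sum(vi * wi for vi, wi in zip(v, w))
    len_v = math.sqrt(sum(vi * vi for vi in v))
    len_w = math.sqrt(sum(wi * wi for wi in w))
    return dot / (len_v * len_w)

def unit_vector(v):
    """Pre-normalize: divide each component by |v|, giving a length-1 vector."""
    length = math.sqrt(sum(vi * vi for vi in v))
    return [vi / length for vi in v]

v = [1.0, 2.0, 2.0]           # |v| = 3
w = [2.0, 4.0, 4.0]           # same direction, twice as long
print(cosine(v, w))           # parallel vectors: cosine 1.0
print(cosine([1, 0], [0, 1])) # orthogonal vectors: cosine 0.0

# After unit-normalizing, the cosine is just the plain dot product.
u, x = unit_vector(v), unit_vector(w)
print(sum(ui * xi for ui, xi in zip(u, x)))  # same as cosine(v, w), up to rounding
```

With unit vectors the cosine reduces to a plain dot product, which is why pre-normalization is common when many pairwise similarities must be computed.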
