t = \frac{\bar{x} - \mu}{\sqrt{\frac{s^2}{N}}}    (15.12)

When applied to association between words, the null hypothesis is that the two words are independent, and hence P(a,b) = P(a)P(b) correctly models the relationship between the two words. We want to know how different the actual MLE probability P(a,b) is from this null hypothesis value, normalized by the variance. The variance s^2 can be approximated by the expected probability P(a)P(b) (see Manning and Schütze (1999)).
Ignoring N (since it is constant), the resulting t-test association measure is thus (Curran, 2003):

\text{t-test}(a,b) = \frac{P(a,b) - P(a)P(b)}{\sqrt{P(a)P(b)}}    (15.13)

See the Historical Notes section for a summary of various other weighting factors for distributional models of meaning.

15.3 Measuring similarity: the cosine

To define similarity between two target words v and w, we need a measure that takes two such vectors and gives a measure of their similarity. By far the most common similarity metric is the cosine of the angle between the vectors. In this section we'll motivate and introduce this important measure.

The cosine, like most measures for vector similarity used in NLP, is based on the dot product operator from linear algebra, also called the inner product:

\text{dot-product}(\vec{v}, \vec{w}) = \vec{v} \cdot \vec{w} = \sum_{i=1}^{N} v_i w_i = v_1 w_1 + v_2 w_2 + \ldots + v_N w_N    (15.14)

As we will see, most metrics for similarity between vectors are based on the dot product. The dot product acts as a similarity metric because it will tend to be high just when the two vectors have large values in the same dimensions. Alternatively, vectors that have zeros in different dimensions (orthogonal vectors) will have a dot product of 0, representing their strong dissimilarity.

This raw dot product, however, has a problem as a similarity metric: it favors long vectors. The vector length is defined as

|\vec{v}| = \sqrt{\sum_{i=1}^{N} v_i^2}    (15.15)

The dot product is higher if a vector is longer, with higher values in each dimension. More frequent words have longer vectors, since they tend to co-occur with more words and have higher co-occurrence values with each of them. The raw dot product thus will be higher for frequent words. But this is a problem; we'd like a similarity metric that tells us how similar two words are regardless of their frequency.
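To make equations (15.14) and (15.15) concrete, here is a minimal Python sketch (our own illustration, not code from the chapter); the helper names dot_product and vector_length and the toy co-occurrence vectors are made up for the example:

```python
import math

def dot_product(v, w):
    # Eq. (15.14): sum of the element-wise products of the two vectors.
    return sum(v_i * w_i for v_i, w_i in zip(v, w))

def vector_length(v):
    # Eq. (15.15): square root of the sum of the squared dimensions.
    return math.sqrt(sum(v_i ** 2 for v_i in v))

# Hypothetical co-occurrence vectors: w belongs to a more frequent word,
# so its values (and hence its length) are larger across the board.
v = [1, 0, 2, 3]
w = [10, 4, 15, 20]

print(dot_product(v, w))    # 100
print(vector_length(v))     # ~3.74
print(vector_length(w))     # ~27.22
```

Doubling every value of w would double the raw dot product without making the two words any more alike, which is exactly the frequency bias that the normalization below removes.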
The simplest way to modify the dot product to normalize for the vector length is to divide the dot product by the lengths of each of the two vectors. This normalized dot product turns out to be the same as the cosine of the angle between the two vectors, following from the definition of the dot product between two vectors \vec{a} and \vec{b}:

\vec{a} \cdot \vec{b} = |\vec{a}|\,|\vec{b}| \cos\theta
\frac{\vec{a} \cdot \vec{b}}{|\vec{a}|\,|\vec{b}|} = \cos\theta    (15.16)

The cosine similarity metric between two vectors \vec{v} and \vec{w} can thus be computed as:

\text{cosine}(\vec{v}, \vec{w}) = \frac{\vec{v} \cdot \vec{w}}{|\vec{v}|\,|\vec{w}|} = \frac{\sum_{i=1}^{N} v_i w_i}{\sqrt{\sum_{i=1}^{N} v_i^2}\,\sqrt{\sum_{i=1}^{N} w_i^2}}    (15.17)

For some applications we pre-normalize each vector by dividing it by its length, creating a unit vector of length 1. Thus we could compute a unit vector from \vec{a} by dividing it by its length |\vec{a}|.
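As a small worked sketch of equation (15.17) and of unit-vector normalization (again our own illustration; NumPy is assumed to be available, and the function names cosine and unit_vector are made up):

```python
import numpy as np

def cosine(v, w):
    # Eq. (15.17): dot product divided by the product of the two vector lengths.
    v, w = np.asarray(v, dtype=float), np.asarray(w, dtype=float)
    return np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))

def unit_vector(a):
    # Divide a vector by its length to get a unit vector of length 1.
    a = np.asarray(a, dtype=float)
    return a / np.linalg.norm(a)

v = [1, 0, 2, 3]
w = [10, 4, 15, 20]

print(cosine(v, w))                             # ~0.98
print(cosine(v, [x * 2 for x in w]))            # same value: cosine ignores vector length
print(np.dot(unit_vector(v), unit_vector(w)))   # ~0.98: dot product of unit vectors
```

Because the cosine of two unit vectors is just their dot product, pre-normalizing the vectors once lets later similarity computations be plain dot products.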