High Performance Python for Humans, O'Reilly Media, Inc., 2020 (p. 583)

This extreme reduction in memory use is partly why the speeds are so much better. In addition to running the multiplication operation only for elements that are nonzero (thus reducing the number of operations needed), we also don't need to allocate as much space to save our result in. This is the push and pull of speedups with sparse arrays: it is a balance between losing the use of efficient caching and vectorization versus not having to do many of the calculations associated with the zero values of the matrix.

One operation that sparse matrices are particularly good at is cosine similarity. In fact, when creating a DictVectorizer, as we did in "Introducing DictVectorizer and FeatureHasher", it's common to use cosine similarity to see how similar two pieces of text are. In general, for these item-to-item comparisons (where the value of a particular matrix element is compared to another matrix element), sparse matrices do quite well. Since the calls
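As a minimal sketch of the cosine-similarity point above, the following computes pairwise cosine similarity directly on a SciPy sparse matrix. The hand-built CSR matrix and the tiny five-word vocabulary are illustrative assumptions (not from the book); in practice the matrix would come from a DictVectorizer as described in the text.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import norm as sparse_norm

# Two bag-of-words count vectors as CSR rows over a small shared
# vocabulary; a DictVectorizer would produce a matrix of this shape.
X = sparse.csr_matrix(np.array([
    [2, 1, 1, 0, 0],   # doc 0
    [1, 0, 0, 1, 2],   # doc 1
]))

# Cosine similarity: normalize each row to unit length, then the
# pairwise dot products X_norm @ X_norm.T are the cosines.
row_norms = sparse_norm(X, axis=1)
X_norm = sparse.diags(1.0 / row_norms) @ X
sims = (X_norm @ X_norm.T).toarray()

print(round(sims[0, 1], 3))  # -> 0.333
```

Because the dot products only touch stored (nonzero) entries, the item-to-item comparison skips all the zero-valued vocabulary positions, which is exactly where sparse representations pay off.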