ploys a hashing algorithm to assign token frequencies to columns. We’ve already looked at hash functions in “How Do Dictionaries and Sets Work?”. A hash converts a unique item (in this case a text token) into a number. Multiple unique items might map to the same hashed value, in which case we get a collision. Good hash functions cause few collisions, but collisions are inevitable if we’re hashing many unique items to a smaller representation. One feature of a hash function is that it can’t easily be reversed, so we can’t take a hashed value and convert it back to the original token.

In Example 11-24, we ask for a fixed-width 10-column matrix. The default is a fixed-width matrix of 1 million elements, which is sensible for many applications, but we’ll use a tiny matrix here to show a collision.
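To make the idea of hashing tokens into a fixed number of columns concrete, here is a minimal standard-library sketch. It is an illustration under stated assumptions, not the implementation any particular library uses: the helper names `column_for` and `hash_tokens` are hypothetical, MD5 is chosen only because it gives the same result on every run, and the sample sentence is made up. Because the sentence has more than 10 unique tokens and the row has only 10 columns, at least two tokens must share a column.

```python
import hashlib
from collections import Counter, defaultdict


def column_for(token: str, n_columns: int) -> int:
    """Map a token to a column index with a stable hash.

    MD5 is an arbitrary choice for this sketch; it is used only so that
    the mapping is identical across runs and machines.
    """
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_columns


def hash_tokens(tokens, n_columns=10):
    """Return a fixed-width row of token counts.

    Also returns a column -> tokens map, kept purely so we can inspect
    collisions. A real hashed representation would keep only the row,
    which is why the original tokens cannot be recovered from it.
    """
    row = [0] * n_columns
    buckets = defaultdict(set)
    for token, count in Counter(tokens).items():
        col = column_for(token, n_columns)
        row[col] += count  # colliding tokens add into the same column
        buckets[col].add(token)
    return row, dict(buckets)


# 16 unique tokens squeezed into 10 columns: a collision is guaranteed.
text = ("the quick brown fox jumps over the lazy dog "
        "while a small cat sleeps near the warm fire")
tokens = text.split()
row, buckets = hash_tokens(tokens, n_columns=10)

# Columns that received more than one distinct token.
collisions = {col: sorted(toks) for col, toks in buckets.items() if len(toks) > 1}

print("row:", row)
print("collisions:", collisions)
```

The row always has 10 entries regardless of vocabulary size, and its entries sum to the total number of tokens. Increasing `n_columns` makes collisions rarer at the cost of a wider (but typically sparse) matrix, which is the trade-off behind a large default width.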