Galloping Search: doubling range until overshoot, then binary...

Galloping Search
Double the range until it overshoots, then binary search between $I_{n-1}$ and $I_n$; the cost is $2\log_2 n + 1 = \Theta(\log n)$.

Blocked sort-based indexing (BSBI)
Corpus $\gg M$, so $n = \Theta(C/M)$.

Single-Pass In-Memory indexing (SPIMI)
Step 1: Find the minimum element, $\Theta(1)$, and delete it, $\Theta(\log k)$.
Step 2: Insert element from Step 1 to min-heap, $\Theta(\log k)$.
Final cost with a heap is $\Theta(\log k)$; otherwise scan the whole list, $\Theta(k)$.

Immediate Merge: $N_{\text{index}} = 1$; $D_{\text{index}} = \Theta(C^2/M)$
No Merge (never merge sub-indexes): $N_{\text{index}} = \Theta(C/M)$; $D_{\text{index}} = \Theta(C)$; $D_{\text{fetch}} = \Omega(C/M)$
n-way Logarithmic Merge (merge in groups of $n$, default $n = 2$): $N_{\text{index}} = \Theta(\log_n (C/M))$; $D_{\text{merge}} = \Theta(C \log_n (C/M))$
Auxiliary, main index: $D_{\text{index}} = \Theta(C^2/M)$; $D_{\text{fetch}} = \Theta(\log_n T)$

Heaps' Law
$M = kT^b$; $30 \le k \le 100$ and $b \approx 0.5$
$M$ = vocabulary size, $T$ = number of tokens in the collection

Zipf's Law
$cf_i = K/i$; $K$ = normalizing constant, $i$ = term rank
$\log cf_i = \log K - \log i$; $t_1 = cf_1$, $t_2 = cf_1/2$, $t_3 = cf_1/3$, $\ldots$

Unary
Represent $n$ as $n$ 1s with a final 0, e.g. 3 -> 1110
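Galloping search can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `galloping_search`, the `start` parameter, and the choice to return the first position whose value is at least the target are all assumptions made for the example.

```python
from bisect import bisect_left


def galloping_search(postings, target, start=0):
    """Return the smallest index i >= start with postings[i] >= target.

    Returns len(postings) if no such element exists.
    `postings` is assumed to be sorted in ascending order.
    """
    n = len(postings)
    if start >= n:
        return n
    if postings[start] >= target:
        return start

    # Gallop: double the step until we overshoot the target or the list end.
    low = start          # invariant: postings[low] < target
    step = 1
    high = low + step
    while high < n and postings[high] < target:
        low = high
        step *= 2
        high = low + step
    high = min(high, n)

    # Binary search in the last bracket (low, high].
    return bisect_left(postings, target, low + 1, high)


postings = [2, 5, 8, 13, 21, 34, 55, 89]
print(galloping_search(postings, 21))
print(galloping_search(postings, 22))
```

The doubling phase takes about $\log_2 n$ probes to bracket the target and the binary search takes about as many again, which is where the $2\log_2 n + 1$ figure comes from.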

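The two min-heap steps listed under SPIMI can be illustrated with a k-way merge of sorted lists. This sketch reads Step 2 as "insert the next element from the list that supplied the minimum", which is an interpretation of the notes; the name `k_way_merge` and the tuple layout are hypothetical.

```python
import heapq


def k_way_merge(sorted_lists):
    """Merge k ascending lists into one ascending list using a min-heap."""
    heap = []
    # Seed the heap with the first element of every non-empty list.
    for list_id, lst in enumerate(sorted_lists):
        if lst:
            heapq.heappush(heap, (lst[0], list_id, 0))

    merged = []
    while heap:
        # Step 1: the minimum sits at the root; removing it costs O(log k).
        value, list_id, pos = heapq.heappop(heap)
        merged.append(value)
        # Step 2: push the successor from the same list, O(log k).
        nxt = pos + 1
        if nxt < len(sorted_lists[list_id]):
            heapq.heappush(heap, (sorted_lists[list_id][nxt], list_id, nxt))
    return merged


print(k_way_merge([[1, 4, 9], [2, 3, 10], [], [5]]))
```

Without the heap, each output element would need a scan over the k list heads, which is the $\Theta(k)$ alternative mentioned in the notes.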

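The unary code above is simple enough to sketch directly. The helper names `unary_encode` and `unary_decode` are hypothetical, and bit strings are represented as Python `str` values of "0" and "1" purely for readability.

```python
def unary_encode(n):
    """Encode a non-negative integer as n ones followed by a single zero."""
    if n < 0:
        raise ValueError("unary code is defined for non-negative integers")
    return "1" * n + "0"


def unary_decode(bits):
    """Decode a concatenation of unary codewords into a list of integers."""
    decoded = []
    count = 0
    for bit in bits:
        if bit == "1":
            count += 1
        elif bit == "0":
            # A zero terminates the current codeword.
            decoded.append(count)
            count = 0
        else:
            raise ValueError(f"unexpected symbol: {bit!r}")
    if count:
        raise ValueError("input ends in the middle of a codeword")
    return decoded


print(unary_encode(3))  # 1110
print(unary_decode("1110" + "0" + "10"))
```

Because every codeword ends at its first 0, codewords can be concatenated without separators and still be decoded unambiguously.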