2. If the estimated size of the relation is n_r and the number of records per block is f_r, allocate (n_r / f_r) × (1 + d) buckets instead of (n_r / f_r) buckets. Here d is a fudge factor [4], typically around 0.2. Some space is wasted: about 20% of the space in the buckets will be empty. The benefit is that some of the skew is handled and the probability of overflow is reduced.

Handling bucket overflows

Bucket overflows can be handled using two techniques:

1. Closed Hashing. If records must be inserted into a bucket and the bucket is already full, they are inserted into overflow buckets, which are chained together in a linked list. Overflow handling using such a linked list is called overflow chaining. The form of hash structure that we have just described is sometimes referred to as closed hashing.

2. Open Hashing. In this technique, the set of buckets is fixed, and there are no overflow chains. Instead, if a bucket is full, the system inserts records in some other bucket in the initial set of buckets. When a new entry has to be inserted, the buckets are examined, starting with the hashed-to slot and proceeding in some probe sequence, until an unoccupied slot is found. The probe sequence can be any of the following:

   a. Linear probing. The interval between probes is fixed (usually 1).
   b. Quadratic probing. The interval between probes increases by some constant (usually 1) after each probe.
   c. Double hashing. The interval between probes is computed by another hash function.

[4] Fudge factor: A quantity that is added or subtracted in order to increase the accuracy of a scientific measure.

Figure: Bucket overflow handling using overflow chaining.
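The bucket-count rule and the three probe sequences described above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from these notes: the function names, the assumption of integer search keys, the use of `key % n` as the hash function, and the second hash `1 + key % (n - 1)` for double hashing are all hypothetical choices made for the example.

```python
import math


def bucket_count(n_r, f_r, d=0.2):
    """Buckets to allocate: (n_r / f_r) * (1 + d), rounded up.

    The round() call only guards against floating-point noise
    before taking the ceiling.
    """
    return math.ceil(round((n_r / f_r) * (1 + d), 9))


def probe_sequence(start, num_buckets, method="linear", step=1):
    """Yield bucket numbers in probe order, beginning at the hashed-to slot.

    linear:    the interval stays 1
    quadratic: the interval grows by 1 after each probe
    double:    the interval is `step`, supplied by a second hash function
    """
    position = start % num_buckets
    interval = step if method == "double" else 1
    for _ in range(num_buckets):
        yield position
        position = (position + interval) % num_buckets
        if method == "quadratic":
            interval += 1


def insert_open(buckets, capacity, key, method="linear"):
    """Open hashing: place key in the first probed bucket with free space.

    Assumes integer keys; both hash functions here are illustrative.
    """
    n = len(buckets)
    step = 1 + key % (n - 1)  # hypothetical second hash, used only for "double"
    for b in probe_sequence(key % n, n, method, step):
        if len(buckets[b]) < capacity:
            buckets[b].append(key)
            return b
    raise OverflowError("no bucket with free space was found")


# Usage: 10,000 records, 20 records per block, d = 0.2
print(bucket_count(10000, 20))                   # 600 instead of 500 buckets
print(list(probe_sequence(3, 8, "linear")))      # [3, 4, 5, 6, 7, 0, 1, 2]
print(list(probe_sequence(3, 8, "quadratic")))   # [3, 4, 6, 1, 5, 2, 0, 7]
print(list(probe_sequence(3, 8, "double", 3)))   # [3, 6, 1, 4, 7, 2, 5, 0]
```

Note that with double hashing the probe sequence reaches every bucket only when the step and the number of buckets share no common factor, which is one reason table sizes are often chosen to be prime.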
Comparative analysis of closed and open hashing

Open hashing has been used to construct symbol tables for compilers and assemblers, but closed hashing is preferable for database systems. The reason is that deletion under open hashing is troublesome. Usually, compilers and assemblers perform only lookup and insertion operations in their symbol tables. In a database system, however, insertions and deletions occur frequently. Thus, open hashing is of only minor importance in database implementation.

Hash Indices

• Hashing can be used not only for file organization, but also for index-structure creation.
• A hash index organizes the search keys, with their associated record pointers, into a hash file structure.
• We apply a hash function on a search key to identify a bucket, and store the key and its associated pointers in the bucket (or in overflow buckets).
• Hash indices are always secondary indices: if the file itself is organized using hashing, a separate primary hash index on it using the same search key is unnecessary. However, we use the term hash index to refer to both secondary index structures and hash-organized files.

Static and Dynamic Hashing

In static hashing, we need to fix the set B of bucket addresses. Since most databases grow over time, if we are to use static hashing for such a database, we have three classes of options:

1. Choose a hash function based on the current size of the database. This option will result in performance degradation as the database grows.
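To make the degradation concrete, the following is a minimal sketch of a static hash index that uses overflow chaining. The class and method names are hypothetical, the example assumes small non-negative integer search keys (for which Python's built-in `hash` is the identity), and the record pointers are stand-in strings rather than real disk addresses.

```python
class Bucket:
    """A bucket holding (search_key, record_pointer) entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []
        self.overflow = None  # next overflow bucket in the chain, if any


class StaticHashIndex:
    """Hash index with a fixed set of buckets and overflow chaining."""

    def __init__(self, num_buckets, capacity):
        self.capacity = capacity
        self.buckets = [Bucket(capacity) for _ in range(num_buckets)]

    def _primary(self, key):
        # The hash function is fixed when the index is created.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, pointer):
        bucket = self._primary(key)
        # Walk the chain until a bucket with free space is found,
        # appending a new overflow bucket when the chain is exhausted.
        while len(bucket.entries) >= bucket.capacity:
            if bucket.overflow is None:
                bucket.overflow = Bucket(self.capacity)
            bucket = bucket.overflow
        bucket.entries.append((key, pointer))

    def lookup(self, key):
        """Return (matching pointers, number of buckets read)."""
        bucket, pointers, reads = self._primary(key), [], 0
        while bucket is not None:
            reads += 1
            pointers += [p for k, p in bucket.entries if k == key]
            bucket = bucket.overflow
        return pointers, reads


# Usage: 4 buckets sized for the initial 8 records.
index = StaticHashIndex(num_buckets=4, capacity=2)
for k in range(8):
    index.insert(k, f"rec{k}")
print(index.lookup(7))   # (['rec7'], 1): one bucket read

# The database grows to 40 records, but the bucket set stays fixed.
for k in range(8, 40):
    index.insert(k, f"rec{k}")
print(index.lookup(7))   # (['rec7'], 5): the whole overflow chain is read
```

In this sketch the same lookup goes from one bucket read to five after the file grows fivefold, because every primary bucket has accumulated a chain of overflow buckets.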