Lecture #8 - Index Structures - SS ZG515: Data Warehousing...

SS ZG515: Data Warehousing

Index Structures

An index is any data structure that takes as input a property of records (typically the values of one or more fields) and finds the records with that property quickly. An index lets us find records while looking at only a small fraction of all possible records. The field(s) on which the index is based is called the search key. For a given data file, we create an index file consisting of key-pointer pairs.

There are many different data structures that serve as indexes:
1. Primary indexes on sorted files
2. Secondary indexes on unsorted files
3. B-trees, a commonly used index on any file
4. Hash indexes

Indexes can be classified as:
Single-level. Examples: primary, secondary, clustering
Multi-level. Examples: ISAM (Indexed Sequential Access Method), B-tree, B+-tree

Indexes can also be classified as:
Dense: the index file contains the same number of records as the data file
Sparse: the index file contains fewer records than the data file

Single-Level Indexes

1. Primary Indexes

These indexes require a sequential file (i.e., the file must be sorted on the search key field). When the search key is a key of the relation, the index is called a primary index; when the search key is not a key of the relation, the index is called a clustering index. The following examples illustrate the difference.

Example 1: Consider the data file sorted on the key field.

[Figure: Data File holding records with keys 10, 20, ..., 100, two records per block; Index File with entries 10, 30, 50, 70, 90, one entry per block of the data file.]
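The key-pointer structure of Example 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the notes' own implementation: the in-memory list-of-blocks layout, the names `data_file`, `sparse_index`, `dense_index` and `sparse_lookup`, and the use of `(block number, slot)` tuples as pointers are all assumptions made for the example.

```python
from bisect import bisect_right

# Data file from Example 1: records sorted on the key, two records per block.
# Representing blocks as Python lists is an assumption made for illustration.
RECORDS_PER_BLOCK = 2
keys = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
data_file = [keys[i:i + RECORDS_PER_BLOCK]
             for i in range(0, len(keys), RECORDS_PER_BLOCK)]

# Sparse index: one (key, block number) pair per block,
# taking the first key stored in each block.
sparse_index = [(block[0], block_no) for block_no, block in enumerate(data_file)]

# Dense index: one (key, (block number, slot)) pair per record.
dense_index = [(key, (block_no, slot))
               for block_no, block in enumerate(data_file)
               for slot, key in enumerate(block)]


def sparse_lookup(search_key):
    """Return (block_no, slot) for search_key, or None if it is absent."""
    index_keys = [k for k, _ in sparse_index]
    # Find the last index entry whose key is <= search_key.
    pos = bisect_right(index_keys, search_key) - 1
    if pos < 0:
        return None  # search_key is smaller than every key in the file
    block_no = sparse_index[pos][1]
    block = data_file[block_no]
    # Only this one block has to be scanned.
    if search_key in block:
        return (block_no, block.index(search_key))
    return None


print(sparse_index)
print(sparse_lookup(60))
```

The sparse index holds 5 entries for the 10-record file, while the dense index holds 10; a lookup through the sparse index does a binary search over the index entries and then scans a single block.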
Properties of a primary index:
A primary index requires that the ordering field of the data file have a distinct value for each record.
A primary index is sparse: it contains as many records as there are blocks in the data file (there are 5 blocks in this example, and each block can hold only 2 records).
The first record in each block of the data file is called the anchor record of the block, or simply the block anchor.
There can be only one primary index on a table.

A dense index on the above data file would have 10 records, one for each key value, and would hold record pointers instead of block pointers.

2. Clustering Indexes

If the records of a file are physically ordered on a nonkey field, that field is called the clustering field. We can create a clustering index to speed up the retrieval of records that have the same value for the clustering field. This differs from a primary index, which requires that the ordering field of the data file have a distinct value for each record.
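A clustering index can be sketched in the same style. The text above does not specify the index layout, so this sketch assumes a common textbook arrangement: one index entry per distinct value of the clustering field, pointing to the first block that contains that value. The sample records, the `dept` field, and the function names are hypothetical and chosen only for illustration.

```python
RECORDS_PER_BLOCK = 2  # assumed block capacity, as in Example 1

# Hypothetical records (dept, name), physically ordered on the nonkey field dept.
records = [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (2, "e"), (3, "f"), (3, "g")]
blocks = [records[i:i + RECORDS_PER_BLOCK]
          for i in range(0, len(records), RECORDS_PER_BLOCK)]


def build_clustering_index(blocks):
    """Map each distinct clustering value to the first block containing it."""
    index = {}
    for block_no, block in enumerate(blocks):
        for value, _ in block:
            index.setdefault(value, block_no)  # keep only the first block seen
    return index


def clustering_lookup(index, blocks, value):
    """Return all records whose clustering field equals value."""
    start = index.get(value)
    if start is None:
        return []
    matches = []
    # Records with equal values are adjacent, so scan forward from the
    # first block and stop once a larger value appears.
    for block in blocks[start:]:
        for record in block:
            if record[0] == value:
                matches.append(record)
            elif record[0] > value:
                return matches
    return matches


clustering_index = build_clustering_index(blocks)
print(clustering_index)
print(clustering_lookup(clustering_index, blocks, 2))
```

Because several records share a clustering value, the index here has one entry per distinct value rather than one per record, and a lookup may read more than one consecutive block.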