'; zhtm += ''; zhtm += '

' + pPage + ''; zhtm += ''; figDoc.write(zhtm); figDoc.close(); popUpWin.focus(); } //--> Using Oracle8 -- Chapter 8 - Adding Segments for Different Types of Indexes

Using Oracle8


Chapter 8

Adding Segments for Different Types of Indexes


Why Index?

Although the primary reason for adding indexes to tables is to speed data retrieval, you may use indexes for these additional reasons:

Why indexes are important
Imagine looking for a document in a filing cabinet that contains documents in a random order. You might have to look at each and every document before finding what you're looking for. The effort required to find the document will increase as the size of the filing cabinet and the number of documents within it increases. A database without an index is similar to such an unorganized filing cabinet. More than 50 percent of systems reporting a performance problem suffer from lack of an index or from the absence of an optimum index.

The first and third items are discussed in more detail in Chapter 17, "Using Constraints to Improve Your Application Performance," which is devoted to integrity constraints. Index clusters, mentioned in the second bullet, are covered in Chapter 18, "Using Indexes, Clusters, Caching, and Sorting Effectively." The fourth item needs some additional comments here.

SEE ALSO
To learn about using indexes with unique constraints,
How to use indexes with foreign key constraints,
How to create and manage index clusters,

Oracle tries to avoid accessing any more blocks than necessary when executing SQL statements. If a query is written in such a way that an index can be used to identify which rows are needed, the server process finds the required index entries. The process usually uses the rowids-pointers to the file, block, and record where the data is stored-to find the required block and move it into memory, if it's not already there. However, if the columns in the query's SELECT clause are all present in the index entry, the server process simply retrieves the values from the index entry and thus avoids the additional block search in the table itself. This technique, the biggest benefit of which is saving time, can also help out if there's a problem with the base table or the file where it's stored. Queries that can be satisfied from index entries will continue to function, even if the base table is unavailable.

Figure 8.1 explains the basic concept of locating data with an index. The user looking for specific information first looks for a keyword in the index. This keyword can be easily located because the index is sorted. The index contains the keyword with the detailed information's address. The desired data is quickly located by using this address information.

Figure 8.1 : Indexes provide a quick access path to the data.

An Oracle index is a structure, maintained as an independent segment, that contains an ordered set of entries from one or more columns in a table. These ordered entries are stored on a set of blocks known as leaf blocks. To provide fast access to any specific value in these leaf blocks, a structure of pointers is also maintained in the index. These pointers are stored on branch blocks. Each branch block contains pointers for a specific range of indexed values. The pointers themselves may point to a leaf block where the value can be found, or to another branch block that contains a specific subset of the value range.

Oracle uses a b*tree index structure, which guarantees that the chain (or number of blocks that must be examined to get from the highest level branch block to the required leaf block) is the same no matter what value is being requested. The number of blocks, or levels, in such a chain defines the height of a b*tree. The larger the height, the greater the number of blocks that have to be examined to reach the leaf block, and consequently, the slower the index. Figure 8.2 shows the logical structure of a b*tree index.

Figure 8.2 : A b*tree index consists of a set of ordered leaf blocks with a structure of branch blocks to aid navigation to the leaves.

When a leaf block fills up, an empty block is recruited to be a new leaf block; some records from the full block are moved into this new block. This activity is called "splitting a block." The branch block pointing to the original leaf block adds a new entry for the split block. If the branch block doesn't have room for the new entry, it also splits. This, in turn, requires that the branch block pointing to it needs to add a new entry for the split block. The very first branch block, called the "root block," is at the top of the index. If it fills up, it too will split, but in this case the original root block and the split block become the second level in the b*tree. A new root block is created, pointing initially to the two blocks that are now at the next level.

The Mechanics of Index Block Splits

Most indexes will fall into one of two categories-new entries will be randomly inserted between existing values, or new entries will always have a larger value than any existing value. The former is most often found on columns of character data, such as a last name column. The latter would typically be an index on a numeric primary key column, where a sequence generator or some other tool is used to increment the value of each new entry.

In indexes where values are inserted in an apparent random order, very few entries will be stored on the very last leaf block, but are placed on a leaf block somewhere in the sorted sequence of leaf blocks. When one of these blocks fills up and has to be split, Oracle can't make any assumptions about future entries that might be added in the same value range. Therefore, it splits the block in the middle, leaving the first 50 percent of the entries on the current block and moving the second 50 percent of the entries to the new leaf block. This 50-50 split maximizes the available space for new entries that, statistically, are just as likely to fall in the first part of the range as in the second.

Splitting the last leaf block
The algorithm for splitting the last index leaf block is optimal when the index is on a column that contains ever-increasing values, such as sequence numbers. Only the last entry is moved to the split block, leaving the preceding entries in place. On rare occasions, the index entry on a column containing unordered data, such as last names, causes the last leaf block to split. If the value is currently the highest in the index in such cases, the split will place just this entry on the new leaf block. This can lead to a further block split if other values need to be inserted in the former last leaf block. Although this is less efficient than splitting the block 50-50 the first time it fills, it occurs too infrequently to be a significant performance factor.

When a new index entry has a higher value than other current entries, it will be added to the last leaf block in the index. When this block fills up and needs to be split, Oracle may not perform a 50-50 block split. If the new entry is the highest value so far, Oracle simply adds a new leaf block and stores just the new record on it. In indexes where all new entries have higher values, this scheme will provide the maximum possible space on the new leaf block. That is where all new entries will be stored. It also saves the overhead of moving 50 percent of the entries from the current leaf block, which would free up space that would never be used anyway.

Managing a Standard B*Tree Index

The most common type of index is a standard b*tree index. We will take some time examining the various characteristics of this index type and learn how to build and manage one. If you've read through Chapter 5 "Managing Your Database Space," or Chapter 17, you're already introduced to some index issues. Chapter 5 emphasizes the general usefulness of placing indexes in their own tablespaces, away from the tables on which they're built. Chapter 17 briefly discusses the management of indexes required to support certain types of constraints.

SEE ALSO
For information on tablespace usage for different segment types,
To see when to use b*tree indexes,
Column order in composite indexes can differ from the table
The columns in a composite index may be defined in any order, irrespective of their order in the base table definition. They don't even have to be built on columns adjacent to each other in the underlying table.

Before proceeding, you need to be introduced to two terms that appear throughout this chapter. A composite index, also called a concatenated index, is an index that you create on multiple columns within a table.

Sizing an Index

As mentioned in Chapter 5 an index is typically smaller than the table on which it's based. This is because the index generally contains only a subset of the columns from the table and thus requires less storage for each entry than is taken by an entire row. If the index is on many or even all columns, though, it will almost certainly be larger than the table it's indexing. This is because each index entry stores not just the data from the table's columns, but also a rowid, which embodies the physical location of the row in the table. In addition, the index stores one or more branch blocks at every level of the b*tree structure. A table does not need any such blocks; it stores only the data itself.

Another space issue for an index is the use of empty space. When new rows are being added to a table, the rows can be stored on any available block. As blocks fill, rows are placed on empty blocks and continue to be stored there until they too get full. This can't be done with index entries because they're stored in a specific order. Even if a block is almost empty, Oracle can't store an index entry on it if the entry doesn't belong in the range of values now assigned to that block. Following a block split, an index will have two partially full blocks, neither of which can be used for entries outside either block range. A table, on the other hand, doesn't have to move records around when blocks fill up; it simply adds new records to empty blocks, which it can then continue to fill with other new rows, regardless of their values.

As discussed in Chapter 7 "Adding Segments for Tables," Oracle provides detailed descriptions of algorithms that you can use to estimate the total space required by various types of segments. As with computations for table sizes, some of the numbers you have to plug into the formulae are estimates. These include the average lengths of fields stored in indexed columns and estimates of how many rows will have NULLs in the indexed columns, if any. The calculations don't take into account how many block splits might occur while data is being added to the index, so they become less and less reliable for sizing long-term growth. The space required for branch blocks is included for in the calculations, but it's simply based on an expected ratio of branch blocks to leaf blocks. The actual number of branch blocks required depends on the number of distinct values in the index, a factor that isn't included in the space estimate.

Sizing an index with sample data
I recommend that, if you have good sample data, you should consider building a test index with this data and extrapolate the full index size based on the size of the test index. If you don't have good sample data, it's a somewhat pointless exercise to evaluate the index size by using Oracle's calculations- your input will be a guess.

You don't really need to know how big an index will be before you create one, unless you're very short of disk space. In this case, you should ensure that you'll have sufficient space for the entire index. Unlike a table, you rarely need to read an entire index from start to finish, so there's no real requirement to keep its blocks on contiguous disk space. Therefore, you don't need to define large extents for an index; you can afford to create it or let it grow via many small extents. Again, you don't need to have very precise sizing predictions if you plan to use small extents-you won't end up wasting too much space even if the last extent isn't very full, something you can't be sure of if you use large extents. If you want to work on the detailed sizing calculations, you can find Oracle's formulae in Appendix A of the Oracle8 Server Administrator's Guide.

Reuse table-sizing scripts
You may want to look at the sizing section in Chapter 7and review the scripts that compute sizing requirements for tables based on sample data. These scripts can be modified, if you want to use them, to estimate overall index size.

SEE ALSO
Get table-sizing details on

Creating an Index

In Chapter 18, "Using Indexes, Clusters, Caching, and Sorting Effectively," you learn what criteria help determine how useful an index would be in optimizing queries or other table access. The other main reasons to use an index were summarized at the start of this chapter. In this section it's assumed you've determined that you need a standard b*tree index on an existing table. You look at the syntax and how to use it to build an effective index. Most indexes you build will use the CREATE INDEX command. In Chapter 17 you can find out about indexes that Oracle builds automatically, if they're needed, to support certain integrity constraints.

SEE ALSO
For information on constraints that require indexes,

The syntax for the CREATE INDEX command to build a standard b*tree index on a table is shown in Listing 8.1.


Listing 8.1  Creating an index with the CREATE INDEX command

     01:     CREATE [UNIQUE] INDEX [index_schema.]index_name
     02:         ON [table_schema.]table_name (
     03:         column_name [ASC][DESC] [,...] )
     04:         [parallel_clause]
     05:         [NO[LOGGING]]
     06:         [TABLESPACE tablespace_name]
     07:         [NOSORT]
     08:         [storage_clause]
     09:         [space_utilization_clause]

Numbering of code lines
Line numberings were included in Listing 8.1 and other code listings to make discussion about this code easier to reference. The numbers should not be included with any command-line commands, as part of any Oracle scripts, or within SQL statements.

CREATE INDEX...ON on lines 1 and 2 are the required keywords for the command.

On line 1, UNIQUE creates an index in which every entry must be different from every other entry; by default, an index is non-unique and allows duplicate entries. index_schema is the name of the owner of the index; by default, it's the user creating the index. Finally, index_name is the name given to the index.

On line 2, table_schema is the name of the owner of the table on which the index is being built; by default, it's assumed to be in the schema of the user creating the index. Also on line 2, table_name is the name of the table on which the index is being created.

column_name on line 3 is the name of the index's leading column. You can include up to 31 additional columns, as long as the total length of an entry is less than half of the Oracle block size for the database.

Also on line 3, ASC and DESC are keywords provided for compatibility with standards; they have no impact on how the index is created. You can use only one of these two keywords per column, but you can apply either one to different columns in a composite index. Finally, [,...] indicates that you can include more than one column in the index, naming them in a comma-separated list.

On line 4, parallel_clause is one of



Causes all access to index to be serialized
Allows some parallel access
Set the number of query slaves to be used in an instance to build index in parallel; only one format can be used per statement
Set number of parallel server instances to be used when building index with intern-ode parallel operations; only one format can be used per statement

On line 5 of Listing 8.1, LOGGING and NOLOGGING determine whether creation of the index and subsequent activities will be logged (LOGGING, the default) or not logged (NOLOGGING) into the redo logs. The additional activities subject to this setting are direct loads through SQL*Loader and direct-load INSERT commands.

TABLESPACE tablespace_name on line 6 identifies the tablespace where the index will be created. By default, it's built in the default tablespace of the user creating the index.

On line 7 of Listing 8.1, NOSORT is used to prevent a sort when the rows are stored in the table in ascending order by the index key. The CREATE INDEX command will fail if any row is out of order. By default, Oracle assumes that the rows aren't in order and sorts the indexed data.

When to use CREATE INDEX's NOSORT option
Oracle's tables, like all relational tables, aren't guaranteed to be stored in any specific order. For NOSORT to work when you're creating an index, the table must have been loaded by using a single process with no parallel operations, and with a source of data already sorted in the order of the indexed column(s). The rows can be entered manually, one row at a time with INSERT statements, or with SQL*Loader in conventional or direct mode. The index needs to be created following such a load, before any addition-al DML statements are issued against the table; the row order may not be preserved by such commands.

The storage_clause on line 8 is as follows:

STORAGE (
     [INITIAL integer [K|M]]
     [NEXT integer [K|M]]
     [PCTINCREASE integer]
     [MINEXTENTS integer]
     [MAXEXTENTS [integer|UNLIMITED]]
     [FREELISTS integer]
     [FREELIST GROUPS integer]
     [BUFFER POOL [KEEP|RECYCLE|DEFAULT]] )

Order of columns in a composite index
If you're building a composite index and aren't sure which columns will be referenced most often, create the index with the columns ordered from the most to the least discriminating. For example, an index on the honorific (Mr., Ms., Dr., and so on), the first initial, and the last name columns: Put the last name first (many different values), the first initial (26 values), and the honorific (a handful of values).

KEEPAssigns blocks to the kept buffer pool
RECYCLEAssigns blocks to the recycle buffer pool
DEFAULTAssigns blocks to neither pool; this is the default if you don't include the BUFFER POOL option

The space_utilization_clause on line 9 is as follows:



Reserves space for new entries on a block (default is 10)
Sets number of transaction slots reserved in each block (default is 2)
Sets maximum number of transaction slots that can be created in a block (default is 255)

You don't have to name the columns in a composite index in the same order as they're defined in the table-nor do you, as this implies, have to use adjacent columns. Your best option is to include the most queried column first-a query that provides a value for the leading column of a composite index can use the index to find the required rows, even if the indexed columns aren't referenced by the query. You should include the other columns in descending order of frequency of reference for the same reason-Oracle can use as many of the leading columns of an index as are identified in an SQL statement's WHERE clause.

SEE ALSO
Find out more about parallel operations,
For information about SQL*Loader and its options, including direct and parallel direct loads,

Unique Indexes

You should rarely need to create an index as UNIQUE. A unique constraint, rather than a unique index, should be used to enforce uniqueness between rows. A unique constraint does use an index, and you can create one for this purpose as discussed in Chapter 17, but it doesn't have to be a unique index.

A composite unique index will ensure only that the set of values in each entry is distinct from all other values. It will allow the same value to be repeated in a column multiple times, as long as at least one other column has a different value from any existing entry. An entry will be stored in the index if at least one column has a non-NULL value. A NULL value in a column will be treated as potentially containing the same value as another NULL value in that same column. Consequently, an entry containing one or more NULLs, but with all the same values in the non-NULL columns, would be considered in violation of the unique condition. A row with these characteristics therefore couldn't be stored.

NULLs aren't considered for uniqueness
A row with a NULL in the indexed column won't be recorded in the index, so a unique index won't prevent multiple rows with a NULL in the indexed column from being stored.

Index Sort Order

All indexes are created with the values in ascending order, regardless of the ASC or DESC options settings on each column. Internally, indexes can be scanned in either direction, due to forward and backward pointers between the leaf blocks.

ASC and DESC index options
The options for ordering indexes in ascending or descending order via the ASC and DESC keywords are included only for compliance with SQL standards. Retrieval from an Oracle index in either direction is equally efficient.

Parallel Operations on Indexes

Unless your index is partitioned, only the creation of the index can be processed in parallel. Subsequent access to the index will be done through a single serial server. Without the parallel clause in the CREATE INDEX command, the creation will also be serial, regardless of the underlying table's definition. The PARALLEL clause's INSTANCES option is of significance only if you're running Oracle Parallel Server.

Check availability of parallel server processes
Parallel operations on indexes, as well as any other Oracle objects, will occur only if your instance (or instances) is running with a sufficient number of parallel server processes available. This is deter-mined by the PARALLEL_ MIN_SERVERS and PARALLEL_MAX_SERVERS parameter settings in your initialization file.

Parallel creation is usually much faster than serial creation, particularly if you have multiple CPUs on your database server and the table is striped across multiple disk drives. For this reason, you may want to build all your indexes with an appropriate degree of parallelism (and number of instances, if applicable). In particular, you should consider creating any indexes needed to enforce constraints with the CREATE INDEX command so that you can include the PARALLEL clause. You don't have the option of defining a parallel operation when an index is built automatically as a result of creating or enabling the constraint.

SEE ALSO
Read about partitioned indexes and their parent segments,

Logging Index Operations

You can speed the processing of your CREATE INDEX commands by not creating redo log entries. After you're done, however, you should try to back up the tablespace where the index is stored-recovery following a media failure can't reconstruct the index correctly without the normally created redo log entries. In addition, you should repeat the backup any time you perform one of the other operations that won't be logged due to the NOLOGGING setting, for exactly the same reason. Of course, in some cases you may not realize that a subsequent operation isn't generating redo entries.

Index activities affected by the NOLOGGING option
In addition to the creation of the index, the two other activities that aren't logged when the NOLOGGING option is in place are direct loads with SQL*Loader and direct load inserts with the INSERT...SELECT command.

To avoid any problems with future unlogged changes to your index, you might want to turn on logging after its creation. You can still create the index without logs-use the ALTER INDEX command after it's built.

If you don't include the LOGGING or NOLOGGING option in the CREATE INDEX command, the index will be built in, and future activities will use logging mode for non-partitioned indexes. Partitioned indexes will acquire their logging mode from the parent segment.

Index Tablespaces

By default, indexes are created in your default tablespace just like any other segment. Chapter 5explains why you should consider using different tablespaces for tables and for the indexes on them. Your default tablespace is typically the one where you build your tables, which probably isn't where you want your indexes. It's important to consider where you really want any new index to be created and, if necessary, to include the TABLESPACE clause in your CREATE INDEX command.

Index Space-Utilization Parameters

The space-utilization parameters behave differently for indexes than for tables, which can be confusing:

Creating Indexes at the Right Time

Unless you have a table that has all its rows physically stored in the order of an index that you need (in which case you can use the NOSORT option), the indexed columns will have to be sorted as part of the index creation. The only way you can guarantee that rows will be in the required order is if you sort them first and load them in that order, after which you prevent any further DML against the table until the index is built.

After an index is built, any changes to the underlying table that affect the indexed columns will be automatically included in the index. This adds to the overhead of processing the statements. Unless you need the index to speed the processing or to support concurrent query processing, you should consider dropping it during periods of heavy activity against the table.

Drop indexes that have frequent block splits
Indexes prone to block splitting are good candidates for dropping during periods of heavy DML-either because they're old and have no more free space for new rows on many of their blocks, or because their entries are frequently updated (which results in a DELETE of the old entry and an INSERT of the new, so that ordering is preserved). They can be recreated immediately or, if performance doesn't suffer too much without them, when the underlying table is no longer so busy.

Monitoring Space Usage

As mentioned earlier, the space freed in an index when an entry is deleted isn't available to any new entry added. The "hole" left by the deleted entry is in a specific location in the sort order of the index. For a simple analogy, consider a file cabinet. The drawers correspond to an index leaf block and the folders within are akin to index entries, assuming that they're filed in alphabetical order. If you discard a folder labeled BROWN, it leaves room in the file cabinet between the folders labeled BRONX and BRUIN. If you need to file a new folder for GREEN, you won't place it in the drawer from where the BROWN folder was taken, but in the drawer with the other GR… folders-say, between GRAY and GREY. However, a new folder for BROWNING could legitimately occupy the space vacated by the old BROWN folder.

An index works in much the same way. Freed space can be reused only if the entry's value allows it to fit into the value range opened by the deleted entry.

Index leaf entries aren't dropped only when their corresponding row is dropped from the table, but also when the row is updated, if the update affects the indexed column(s). An update is equivalent to taking a folder from the theoretical file cabinet, changing the label (say, from BROWN to WHITE), and then refiling it. The folder is obviously not going to be of any use filed in its original position; it has to be inserted in the correct position among the other folders beginning with W. As far as the index is concerned, the old entry is removed and a new one is created. The space left behind by the removed entry is subject to reuse in the same way as space left by an actual record deletion.

As you can possibly see, an index on a table that undergoes a lot of updates and deletions can end up with lots of free space that may or may not become reusable, depending on the nature of new records being inserted into the table. One type of table and corresponding index will very likely generate lots of free space on the leaf blocks. This is a table with a primary key based on some form of sequence number, either obtained from an Oracle sequence generator or created by the application.

Consider an ORDERS table that has an index on the ORDER_NUMBER column and where each new order is given the next highest unused number. Orders are added to the table when they're received and dropped after having been filled and the invoice paid. Over time the leaf blocks with the older orders start emptying out as they're filled and paid. In most cases they will become completely empty, and then they can be recycled for use with new orders. What if some are from delinquent customers who have never paid the bill, or for standing orders that remain in the system for years? The leaf blocks containing such order numbers will have to be maintained, even if there's just one order on them. The free space on these blocks can't be reused by new orders because they will have numbers much higher than the range reserved for these blocks.

Consider order numbers 1023 and 3345 on two separate index blocks (on blocks 22 and 28, for example). The blocks between these blocks-23 through 27-may have already been emptied and recycled with higher order numbers. However, the pointers between adjacent leaf blocks that allow an index to be scanned in ascending or descending order will "connect" blocks 22 and 28. Any order outside the range of 1023-3345 therefore can't be placed on either block because it would be out of logical order.

It's possible for an index to gradually become burdened by a large number of almost empty blocks. Such an index wastes disk space and results in slow access for values on the sparsely populated blocks. This chapter discusses an index option (reverse-key indexing) that can help you avoid this type of situation.

You should regularly evaluate the space usage-and hence, efficiency-of your indexes. Over time you'll learn which indexes are prone to space problems and which aren't, due to their underlying tables being relatively static or the indexed values being random and, hence, able to reuse space. You can use an ANALYZE command's option to see how well or how poorly an index is using its space, particularly with respect to deleted entries.

Using the INDEX_STATS view
INDEX_STATS is a temporary view created by the ANALYZE INDEX...VALIDATE STRUCTURE command. It exists only for the duration of the session that created it and can contain information for only one index at a time. If you execute a second ANALYZE INDEX...VALIDATE STRUCTURE command in your session, the INDEX_STATS view will contain only information about the second index analyzed. Only the session that created it can see the INDEX_STATS view, so another user-or even your userid connected to a different session-won't see the view. When you log out of your session, the view is removed and you'll need to rerun the ANALYZE command to recreate it.

The following command will populate the INDEX_STATS view with statistical information about the index:

ANALYZE INDEX [schema.]index_name VALIDATE STRUCTURE

Of particular interest are the columns LF_ROWS and DEL_LF_ROWS, which show the current number of entry slots in leaf blocks. They also show the total number of entries deleted from leaf blocks, respectively, and LF_ROWS_LEN and DEL_LF_ROWS_LEN, which show the total number of bytes associated with these entries. A rule of thumb is that when the number of, or space used by, deleted entries is greater than 20 percent of total entries, you should consider rebuilding the index to reclaim the space. However, you should also check the PCT_USED column. If this is 80 percent or more-an average amount of space you can expect to see used in a typical index-you may not want to incur the work of rebuilding the index. You should continue to monitor it, however, to ensure that the statistics stay in the preferred ranges.

Monitoring the number of keys (leaf entries) versus the number of levels in the b*tree over time is another measure you can apply to an index to see if it's becoming overburdened with deleted entry space. The latter is shown under the HEIGHT column of INDEX_STATS and shouldn't change if the total number of index entries stays the same. If the index height keeps increasing, it indicates that more branch block levels are being added. This behavior is to be expected if more entries are being stored in the leaf blocks. If, on the other hand, the additional branch levels are supporting the same number (or thereabouts) of leaf entries, the structure is becoming top-heavy with branch blocks. This occurs when branch blocks are being maintained for partially emptied leaf blocks.

Build a history of index statistics
If you want to keep a record of an index's statistics over time, you can issue the command CREATE TABLE table_name AS SELECT * FROM index_stats, where you use a date or sequence number as well as the index name as part of table_name. Remember to do this before you end your session or issue another ANALYZE command.

The statistics in INDEX_STATS aren't used by the Oracle optimizers, and the existence of the view in a session won't change the default optimizer behavior. This behavior is different from the statistics collected with the ANALYZE INDEX...COMPUTE STATISTICS or ANALYZE INDEX...ESTIMATE STATISTICS commands. However, if you use these commands, you'll see slightly different values in the DBA_INDEXES than you see in INDEX_STATS. This is because some values in the latter may reflect rows that have been deleted, whereas the values in the former are based only on the current index contents.

SEE ALSO
To learn more about the use of statistics by Oracle's optimizer,

Rebuilding an Index

You may have a number of reasons to rebuild an index. Here are some of the more common reasons:

You can use two methods to make these changes. The first is to drop the index and recreate it by using the CREATE INDEX command discussed earlier in this chapter. The second is to use the REBUILD option of the ALTER INDEX command. Each method has its advantages and disadvantages, which Table 8.1 summarizes.

Table 8.1  Alternatives for recreating an index
Drop and Rebuild:
Use REBUILD Option:
Can rename indexCan't rename index
Can change between UNIQUE and non-UNIQUE Can't change between UNIQUE and non-UNIQUE
Can change between b*tree and bitmapCan't change between b*tree and bitmap
Needs space for only one copy of the index Needs space for duplicate index temporarily
Requires a sort if data existsNever requires a sort
Index temporarily unavailable for queries Index remains available for queries
Can't use this method if index is used to support a constraint Can use this method for an index supporting a constraint

The biggest advantage to dropping and recreating your index is that you don't need space for the original index and the new index to exist at the same time. However, you can't assume that this means the process has no overhead. To build the new version of the index, Oracle will have to perform a sort of the column data in all the existing rows. This will require memory and, for large tables, may even require the use of temporary segments on disk. The sort process will also be time-consuming for a large table, and the index will be unavailable between the time it's dropped and new version is ready. As you may guess, the sort space overhead and time for the work to be done are the biggest disadvantages to this approach.

If you elect to use the drop-and-recreate approach to rebuild an index, you need to issue the DROP INDEX command (discussed in the next section), and then use the appropriate CREATE INDEX command (discussed earlier in this chapter). If the index is now being used to enforce a constraint, you can't use this method-Oracle will prevent you from successfully dropping the index. Of course, you can temporarily disable the constraint as long as you're prepared to deal with any changes to the table that may prevent you from reenabling again.

The biggest advantages and disadvantages to the rebuild option are exactly the opposite of those for the drop-and-create option. When rebuilding an index, Oracle simply reads the leaf block information, which is already in sorted order, to create the new index. When the index is built, it drops the old copy automatically. Because a sort isn't required, the process is relatively fast. Also, it leaves the original index in place for use by queries that may occur concurrently with the rebuild. The disadvantage is that you must have room in your database for the current and new versions of the index simultaneously. This shouldn't be a problem if you're moving the index to a different tablespace, but may be a deterrent if you need to use the same tablespace.

The syntax of the ALTER INDEX...REBUILD command is as follows:

ALTER INDEX index_name
     REBUILD
     [parallel_clause]
     [NO[LOGGING]]
     [TABLESPACE tablespace_name]
     [NO[REVERSE]]
     [storage_clause]
     [space_utilization_clause]

The parallel_clause takes the same format as that discussed with the CREATE INDEX command. Here it determines whether the rebuilt operation itself can be done in parallel. If it can, each parallel server process will be responsible for retrieving a subset of the current entries and building the new leaf blocks for them. This may cause more extents to be used than a serial rebuild because each slave process will create and use its own extents. Some of these extents may be trimmed back at the end of the operation to remove any unused blocks.

The REVERSE/NOREVERSE option determines whether the replacement index is (REVERSE) or isn't (NOREVERSE) a reverse-key index. You read more about reverse-key indexes later in this chapter; for now, simply note that this option allows you to build the replacement either way, regardless of how the current index is structured.

The TABLESPACE option can be used if you want to move the index to a different tablespace. If you don't include this option, the index will be rebuilt in the same tablespace as the original index, not in your default tablespace.

Back up your work after executing commands without redo entries
If you select the NOLOGGING option, you may want to back up the tablespace in which the new index resides as soon as possible. This option will preclude the creation of redo log entries for the replacement index build just as it does when used with the CREATE INDEX command, with the same consequences discussed.

The other clauses in the command all work exactly as they do for the CREATE INDEX command. You can refer to the earlier section, "Creating an Index," where this command and these options are explained in detail.

Dropping an Index

Unless your index was created for you when a constraint was enabled, you should be able to drop it at any time. You may decide to drop an index for any number of reasons, including the need to recreate it (as discussed in the previous section). You may also want to drop an index because it's no longer being used by the statements accessing the table, or because you're going to perform a major data load (or other intensive DML activity) and don't want to incur the overhead of concurrently maintaining the index. Parallel direct data loads, in particular, can't maintain any indexes on a table, so you should always drop them before using this loading technique.

The syntax for dropping an index is very straightforward:

DROP INDEX [schema.]index_name

You need to include the schema name only if the index doesn't belong to you.

If you need to drop an index that's supporting a constraint, you must first disable or drop the constraint. You can find details on these steps in Chapter 17. If the index had been created automatically as part of the constraint definition and enabling, it will also be dropped automatically.

Managing Bitmap Indexes

A bitmap index is a special type of index particularly suited to large tables and to columns with a small number of distinct values. Their structure also makes them very efficient when two or more conditions have to be met by a query and the columns involved each have a bitmap index. This structure, however, makes them less efficient than a regular b*tree index when indexed columns are updated, or when rows are added or deleted from the table.

Bitmap indexes behave like other indexes when created
After a bitmap index is created, it works behind the scenes just like any other index. The users don't have to do anything different to get a bitmap index to be used by a query than they do for a standard index. In addition, bitmap indexes are maintained transparently as records in the underlying table are inserted, updated, or deleted.

SEE ALSO
For when to use bitmap indexes,

Bitmap Index Internals

A bitmap index is actually stored inside a regular b*tree index structure. The entries in the leaf blocks consist of a value, the rowid of the first row represented in the bitmap, the bitmap itself, and some entry-management bytes. These leaf block entries are stored in order by value, just as in a regular b*tree index, and the index can also contain branch blocks to speed access to a required leaf block value.

Even more bitmap index internals
Before storing them, Oracle applies a compression algorithm to the bitmaps; this reduces the space required to store consecutive bits with the same value. Although this makes the index more compact (and, hence, reduces the number of blocks that have to read into memory when it's being scanned), it slows the process of changing a bit's value. To help reduce the overhead of managing com-pressed bitmaps, each bitmap actually stores a bit for every potential row in its range rather than just for stored rows. In this way, only the bitmap containing the value for the indexed column( s) of a newly inserted row will need to have its bitmap changed, flipping the bit for the corresponding row from 0 to 1. The other bitmap values stay unchanged.

The value portion of the bitmap entries correspond to the values from the table's indexed column. A small table will have one entry for every distinct value in the column, including an entry for NULLs. The bitmap for each entry will contain a bit for every row in the table. Each bit will be set according to the value in the corresponding row-a 1 if the column's value matches the index value, a 0 if it doesn't. This means that there will be only one index entry with a 1-bit for any given row in the index. Figure 8.3 shows a logical view of a bitmap index on a column with four different values.

Figure 8.3 : A bitmap index consists of index entries (values) and strings of bits (bitmaps) stored in a b*tree structure.

For larger tables, where bitmap indexes are most effective, a single bitmap can't store a bit for every row in the table. This is because each index entry-including the value, the bitmap, and the overhead bytes-must fit into less than half a block, a restriction placed on every Oracle index. To cover a large table, multiple index entries are created for each distinct value. Each entry contains a bitmap that represents a disjoint subset of the rows in the table. These bitmaps work the same way as described earlier-a rowid in each entry points to the first row covered by the entry's bitmap, and each bitmap uses a 1-bit to indicate that the value is in the corresponding row and a 0-bit for all other rows.

Although bitmap indexes are typically built on a single column, they can cover multiple columns. Composite bitmap indexes work exactly as previously described, except that each entry or set of equal-valued entries corresponds to distinct values in the combination of columns.

Using Bitmap Indexes

Bitmap indexes were designed to support the type of tables that are becoming common in data warehouse databases-very large, relatively static, and not normalized. They can support large tables easily because they're relatively compact, even on tables with millions of rows. They're best for static tables because they use compression algorithms that require heavy maintenance when indexed values are added, dropped, or updated. The same values will tend to occur frequently in tables that are not normalized, which wouldn't make their columns good candidates for a standard index.

Bitmap indexes are even efficient when two or more values are requested from a column because the bitmaps from each value can be combined with a Boolean OR. They're very efficient at finding rows based on multiple column queries, such as the following:

WHERE region = 'West' AND week = 43 AND salesman = 202 AND
product = 1497 AND color = 'RED'

With a bitmap index on each column named in this example, the required rows can be found by performing a Boolean AND on each bitmap. The resulting bitmap will have a 1-bit only in the positions corresponding to rows having all the required values. Boolean ANDs and ORs are some of the most efficient operations a computer can perform.

Use a large sample size when estimating bitmap index storage requirements
A single bitmap can store only a finite number of bits, representing a fixed number of rows. In a small table, every row can be covered by one bitmap for each distinct value in the indexed column. At some point, the insertion of a new row will require a second bitmap to be built for each value, doubling the number of blocks needed to store the bitmap entries. An insertion will cause the third bitmap to later be built for each value, increasing the number of leaf blocks required by another 50 percent. Until your index is large enough so that the addition of a new set of bitmaps per value doesn't increase its size significantly, you can't extrapolate its final size accurately.

Building a Bitmap Index

To create a bitmap index, you should use the CREATE INDEX command discussed earlier, except that you have to include the keyword BITMAP where the optional UNIQUE keyword appears. All the other options shown in Listing 8.1 for the CREATE INDEX command can be used for a bitmap index.

You can also rebuild a bitmap index by using the ALTER INDEX...REBUILD command. The only option shown in the previous section that you can't use for a bitmap index is REVERSE. Of course, you can always drop and rebuild a bitmap index by using the standard DROP INDEX command.

One aspect of bitmap indexes that's difficult to manage is planning how big they will be, using a formula such as the one published by Oracle for regular b*tree indexes. The amount of data compression isn't going to be obvious from an examination of sample data. Your best way to estimate the size of the final index is to extrapolate it from an index built on a sample table.

Managing Reverse-Key Indexes

A reverse-key index is one in which the value portions of the entries in the leaf blocks are saved in reverse bit order. The entries themselves are stored in the order of the resulting bit patterns rather than in the order of the original values. The branch blocks, of course, also reflect the stored values-that is, the reversed bit values rather than the values as found in the tables. The result of this bit reversal is to make somewhat random the locations of entries throughout the index. Values that normally would be stored sequentially are likely to be scattered throughout the index on different blocks.

You use a reverse-key index to help avoid the problem of having an index develop lots of empty spaces because entries were dropped and no new records could reuse the space. This is likely to happen in a normally organized index when new records have increasingly higher values, such as sequence numbers, but not all the older entries are removed from other blocks.

A good example of this type of index is one on an employee ID column. In a growing company, more new employees are added than existing employees leave. The remaining employees have index entries for their ID numbers on the older leaf blocks, which are probably partially empty as a result of some earlier employees leaving. However, the new employees' ID numbers will be too high in value to be stored within the value ranges of these blocks. Over time, the index may become inefficient due to increasing amounts of unusable free space.

As discussed earlier, you would have to drop and rebuild the index to compress the earlier values into full blocks. Rather than have to compress the index regularly, you could build it as a reverse-key index, which should spread high and low values around the used blocks. When entries are removed, new ones stand a reasonable chance of fitting into the ranges opened by the deletions.

SEE ALSO
To learn when to use reverse-key indexes,

Creating a Reverse-Key Index

Use the CREATE INDEX command shown in Listing 8.1 to create a reverse-key index, but include the keyword REVERSE where the optional NOSORT keyword appears. All the other options shown in Listing 8.1 for the CREATE INDEX command can be used for a reverse-key index.

To convert an existing index into a reverse-key index, use the ALTER INDEX...REBUILD command as described earlier. Use the keyword REVERSE as shown in that section. Of course, you can always drop a regular index and rebuild it as a reverse-key index, using the standard DROP INDEX command and the CREATE INDEX command as just described.

Rebuilding Reverse-Key Indexes

To rebuild a reverse-key index, you can drop the original index and create it again. If you decide to use the ALTER INDEX...REBUILD command, you can make any other changes you need, such as changing the tablespace or storage values.

Changing the type of index during a rebuild
A reverse-key index will be rebuilt as a reverse-key index whether or not you use the optional keyword REVERSE. However, if you use the NOREVERSE keyword, your rebuilt index will no longer be a reverse-key index, but will instead be a standard index.

Managing Index-Organized Tables

An index-organized table is a hybrid structure consisting of b*tree index and regular table elements. The b*tree components are the use of ordered entries on leaf blocks, and branch blocks that provide fast access to the required leaf blocks. The table-like features include the use of standard SQL commands for table management, such as creation and alteration, and the use of standard DML and SELECT commands to manipulate rows. (See Chapter 7for information on table creation and management.) The complete rows-not just the indexed column data-are stored in the b*tree block structure.

Index-organized tables provide the fast data access that's normally achieved only by creating an index on a table but don't require the storage space, or the management, of a separate index. You should consider using an index-organized table for lookup tables, which are almost exclusively used to find a value, or set of values, based on a key. For example, a table containing city names you find by entering the postal zip code are an ideal candidate for an index-organized table.

Indexed-organized tables do have some restrictions. Unlike a regular table, you can't create additional indexes; therefore it's unlikely that you would want to use this structure for tables with columns with different columns that you typically access by specific value. Remember that UPDATE and DELETE commands on specific rows, not just queries, can benefit from indexes on the columns referenced in the WHERE clause.

The ordering of an indexed-organized table is based on its primary key, so the table must have a primary key defined. This, of course, is recommended for all relational tables, but isn't a requirement and makes a second restriction (if you consider it such) on index-organized tables.

Finally, as with a regular index, the space vacated by a dropped row can be reused only if a subsequently inserted row has the same value, or a value that lies between the adjacent column values. This can result in a table with more free space than you would expect to see in a regular table where the PCTUSED setting controls the reuse of released space.

SEE ALSO
To see when to use index-organized tables,

Why Index-Organized Tables Don't Support Additional Indexes

An index uses a rowid to point to a specific row in a table. The rowid identifies where the row is physically stored-the block number within that file and, to ensure that each rowid is unique, the row number on that block. It's this last piece that makes rowids unusable for entries in index-organized tables. The row number for a regular table's row is actually the number of the row's entry in the block's row directory. The row directory is simply an ordered list of byte addresses, entry one containing the byte where row one starts, entry two containing the entry where row two starts, and so on. If the row has to be moved to another location in the block due to an UPDATE that extends its length, for example, Oracle simply moves the row and updates its directory entry to reflect its new starting byte address.

In an index-organized table, the rows have to be stored in sorted order, just as the entries in an index are stored. If a new row is inserted into the table with a primary key value that lies between two existing values, the row must be logically stored between the two rows containing these values. Suppose that the table contains the values "Baltimore" and "Boston," and that they are in row positions 10 and 11 on a particular block-that is, the tenth entry in block directory contains the starting byte address for the row containing "Baltimore" and its eleventh entry points to the "Boston" row.

Index-organized tables can't support unique constraints
A unique constraint is enforced by Oracle with an index built on the constrained columns. Due to the limitation that index-organized tables can't have additional indexes, you can't add a unique constraint to such a table.

If we add a row with "Biloxi" as its primary key, it now needs to use the 11th slot in the block row directory for its byte address; the entry for "Boston," and any other rows following it, have to be moved up to the next slot. Were the rows being treated as regular table entries, this action would effectively change the rowid addresses for the "Boston" row and the rows following it on the block. Furthermore, if we had been able to build indexes on this table, Oracle would have to find every index entry that pointed to these rows and update the rowid values, even if the rows themselves didn't move by even one byte from their original locations.

You can help control space usage within an index-organized table by splitting each row into two pieces, the first of which acts more like a regular index entry and the second of which behaves more like a table row. Oracle will build distinct segments to store the index-organized table when you do this. The first segment will hold the portion of each row that constitutes the indexed, or ordered, values; the second segment will hold the remainder of each row. This approach reduces the amount of wasted space that can occur when a row is dropped or the primary key value is changed, causing the row to be relocated. Only the row piece in the first of the two segments, the ordered segment, leaves space that has to be reused by an equivalent value. The contents in the second segment are treated like the rows in any other table-space can be reused by any other row piece that fits on the block, and an UPDATE doesn't necessarily cause the row to migrate to a new block.

To build your index-organized table in two such segments, you begin by determining how much of an average row you want to keep in the index portion and how much in the table portion. Generally, you want the primary-key columns in the former and the balance of the row in the latter. You then need to compute the percentage of the block that the index portion consumes for an average row. When that's done, you can build the index-organized table and identify this percentage with storage information for the table's two parts. You can even identify a column that determines where the break between the index portion and the row portion is made; this will ensure that the entire primary key is kept together, even if its length is less than the specified percentage.

Creating an Index-Organized Table

To create an index-organized table, you use the CREATE TABLE command (discussed in detail in Chapter 7 with some additional keywords and certain restrictions. Listing 8.2 shows the syntax for this command.


Listing 8.2  The CREATE TABLE command for an index-organized table

     01:          CREATE TABLE [table_schema.]table_name
     02:          ([column_description [,...],
     03:          [CONSTRAINT constraint_name]
     04:          PRIMARY KEY (column_n|ame [,...]))
     05:          ORGANIZATION INDEX
     06:          [TABLESPACE tablespace_name]
     07:          [storage_clause]
     08:          [space_utilization_clause]
     09:          [[PCTTHRESHOLD integer]
     10:          [INCLUDING column_name]
     11:          OVERFLOW
     12:               [TABLESPACE tablespace_name]
     13:               [storage_clause]
     14:               [space_utilization_clause]
     15:               [PCTUSED integer]]

On line 2, column_description includes the column name, the column datatype, an optional size, and an optional constraint.

Restrictions on constraints in index-organized tables
Because an index-organized table can't have additional indexes, you can't use a unique constraint on any column or column combination. You can include any other type of constraint, however.

SEE ALSO

To learn about the column-definition options used when building a table,
The syntax to add a constraint to a column definition,

PRIMARY KEY on line 3 can be a named or an unnamed constraint, defined as a column or a table constraint on a single column or composite key.

SEE ALSO
For details on primary key constraints,

Line 4's ORGANIZATION INDEX are required keywords.

On line 8 PCTTHRESHOLD integer sets the percentage of space in a block that any row can consume; the balance of a row that exceeds this threshold is stored in an overflow area. The default value is 50.

Interaction of PCTTHRESHOLD and OVERFLOW clauses
If you include a value for the PCTTHRESHOLD option and don't include an OVERFLOW clause, any row that exceeds the percentage of block space defined by PCTTHRESHOLD will be rejected.

INCLUDING column_name on line 9 names the first column that will be placed in the overflow area, if needed; the column must be the last column defined in the primary key or a non-primary key column.

OVERFLOW on line 10 is a required keyword that introduces the definition of the overflow segment on lines 11-through 14:

Monitoring Index-Organized Tables

The data dictionary stores information about index-organized tables in a number of different tables that you can see through the DBA_/ALL_/USER_TABLES and the DBA_/ALL_/USER_INDEXES views. The *_TABLES views contain the basic definition of the table under the name you provide in the CREATE TABLE command. These views also contain information about the overflow segment, which, if present, contains values for the extent sizes and related storage information. The overflow segment also holds the statistics generated from the ANALYZE TABLE command, whether you name the base table or the overflow table name in the command.

The *_INDEXES views contain most of the segment's description, including its storage and space utilization values, as well as statistics generated with the ANALYZE command. These views also show the name given to the index in which the rows are actually stored. If you don't specify an overflow option, these views will contain the only detailed information about the physical storage of the segment.

Identify the index characteristics' names and, possibly, its overflow segment to find out all the information about an index-organized table in the data dictionary. You can use the following query to find values in the data dictionary columns specific to the objects comprising index-organized tables:

Finding the extents belonging to an index-organized table
If you query the DBA_EXTENTS data dictionary view to find the segments belonging to an index-organized table, you won't find them under the segment name you gave to the table. Instead, you find them listed under the Oracle-sup-plied segment names for the index segment and, if there is one, the overflow table segment. To relate these extents to a specific index-organized table, therefore, you have to use a query, like the one provided, to find the names of its extents.

SELECT t.owner AS "Owner",
     t.table_name AS "Table Name",
     i.index_name AS "Index Name",
     i.tablespace_name AS "Index TS",
     o.table_name AS "Overflow Segment",
     o.tablespace_name AS "Overflow TS",
     i.pct_threshold AS "Overflow Pct",
     i.include_column AS "Include Col"
FROM dba_tables t, dba_indexes i, dba_tables o
WHERE i.pct_threshold IS NOT NULL
     AND t.table_name = i.table_name
     AND o.iot_name(+) = t.table_name;

You can, of course, restrict this query by naming a specific table or a specific owner (for example, adding the clause AND t.table_name = 'table_name' or AND t.owner = 'owner_name', respectively). Of course, after you use this query to identify the names Oracle has given to the indexes and, when appropriate, to the overflow objects, you can use their names to query the *_TABLES and *_INDEXES views to see other information about them.

To find out how well a specific indexed-organized table is behaving, you can use the ANALYZE INDEX...VALIDATE STRUCTURE on the index built for the table. You should look for the same criteria discussed earlier regarding determining whether a regular index needs reorganizing. However, if you do find problems with an index for an index-organized table, you can't use the ALTER INDEX...REBUILD command to recreate it. If you determine that it does need reorganizing, you may have to drop and recreate the entire index-organized table. However, you can use the ALTER INDEX command on the index itself or the ALTER TABLE command on the base table name in order to make some changes to the index that you could otherwise make to any other primary-key index with the ALTER INDEX command.

You can also monitor the block usage in an index-organized table's overflow segment. To do this, use the ANALYZE TABLE command to compute or estimate statistics on the overflow segment. As mentioned earlier, you can name the base table name or the overflow segment name in the command. If you determine that there are storage problems, you may need to drop and recreate the whole index-organized table. Unfortunately, you aren't allowed to make changes to the overflow segment directly with an ALTER TABLE command in the current release.


© Copyright, Macmillan Computer Publishing. All rights reserved.