
Using Oracle8


Chapter 22

Tuning for Different Types of Applications


Transaction Processing Tuning

Online Transaction Processing (OLTP) systems have several characteristics:

Typical OLTP applications include airline reservation systems and banking applications. When you design an OLTP system, you must ensure that a large number of concurrent users doesn't degrade system performance, while still providing high availability, speed, and recoverability. You need to consider several issues when using OLTP systems:

Tuning Rollback Segments

In an OLTP environment, a large number of transactions occur concurrently, which may require many rollback segments. When you use parallel DML statements, rollback segments become even more important. Rollback segments should belong to tablespaces with plenty of free space, and they should be able to grow without limit; either set MAXEXTENTS to UNLIMITED or use a very high value for MAXEXTENTS.
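For example, a rollback segment for a busy OLTP system might be created along these lines (the segment and tablespace names and the storage values are only illustrative, not recommendations):

-- Rollback segment in a tablespace with plenty of free space,
-- allowed to grow without limit
CREATE ROLLBACK SEGMENT rbs_oltp
  TABLESPACE rbs
  STORAGE (INITIAL 1M NEXT 1M MINEXTENTS 20 MAXEXTENTS UNLIMITED);

ALTER ROLLBACK SEGMENT rbs_oltp ONLINE;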

SEE ALSO
More information on tuning rollback segments,

Using Discrete Transactions

Discrete transactions can be used concurrently with standard transactions. Discrete transactions are useful for transactions with certain characteristics:

ORA-1555 during discrete transactions
Undo information isn't generated during a discrete transaction. Therefore, you may get "snapshot too old" errors if a discrete transaction starts and commits during a long query and modifies blocks used by that query.

When you use a discrete transaction, all the changes made to any data will be deferred until the transaction commits. Redo information is generated but undo information isn't generated. The redo information is stored in a separate location in memory and gets written to the redo log when the transaction commits.

Changes made during discrete transactions
Discrete transactions can't see their own changes. Therefore, they can't perform inserts or updates on both tables involved in a referential integrity constraint.

Use discrete transactions

  1. Set the parameter DISCRETE_TRANSACTIONS_ENABLED to TRUE in the initialization file. (If DISCRETE_TRANSACTIONS_ENABLED is set to FALSE, all the transactions will be run as standard transactions.)
  2. Use the procedure BEGIN_DISCRETE_TRANSACTION as the first statement in a transaction.

Scope of a discrete transaction
The procedure BEGIN_DISCRETE_TRANSACTION is effective only for the current transaction. As soon as the transaction commits or rolls back, the next transaction will be treated as a standard transaction.

The example in Listing 22.1 uses a discrete transaction for a movie theater application. The name of the movie, the show time, and the number of tickets are passed as arguments to this procedure. The procedure checks the number of tickets available. If enough tickets are available, it issues them; otherwise, it reports how many tickets remain for that movie and show time.


LISTING 22.1  A movie theater application using a discrete transaction

CREATE OR REPLACE PROCEDURE BUY_TICKETS (
     movie_name       IN  VARCHAR2,
     num_of_tickets   IN  NUMBER,
     movie_datetime   IN  DATE,
     status           OUT VARCHAR2)
AS
     theatre_capacity    NUMBER(3);
     tickets_sold        NUMBER(3);
     tickets_available   NUMBER(3);
BEGIN
     dbms_transaction.begin_discrete_transaction;
     FOR i IN 1..2 LOOP
          BEGIN
               SELECT theatre_max, tics_sold
               INTO theatre_capacity, tickets_sold
               FROM movies
               WHERE movie_title = movie_name
               AND show_time = movie_datetime
               FOR UPDATE;

               tickets_available := theatre_capacity - tickets_sold;

               IF tickets_available < num_of_tickets THEN
                    status := 'Sorry. Only ' || tickets_available ||
                              ' tickets are available';
               ELSE
                    UPDATE movies
                    SET tics_sold = tickets_sold + num_of_tickets
                    WHERE movie_title = movie_name
                    AND show_time = movie_datetime;

                    status := 'Requested number of tickets available.';
               END IF;

               COMMIT;
               EXIT;
          EXCEPTION
               WHEN dbms_transaction.discrete_transaction_failed THEN
                    ROLLBACK;
          END;
     END LOOP;
END;


Discrete transaction failure
In this example, if the discrete transaction fails, the DISCRETE_TRANSACTION_FAILED exception occurs and the transaction is rolled back. The second iteration of the loop will re-execute the transaction but this time as a standard transaction.

Transaction Processing Monitors (TPMs)

When a large number of concurrent users connect to the database, the demand for resources can become very heavy. On UNIX, the limiting factor is usually physical memory. On Windows NT, each process can address a maximum of 4GB; 2GB of that address space is reserved for the system and shared between processes. Because the shadow processes are implemented as threads within a single process, the limiting factor becomes the remaining 2GB of address space. There are several solutions to this problem, such as the Multi-Threaded Server, the concurrent manager, and connection multiplexing.

Fault tolerance using TPM
The application client isn't aware of the application server servicing it; therefore, fault tolerance can be provided by using the transaction monitor architecture.

Another approach is to split the application logic into an application client and an application server. The application client is responsible for collecting and presenting the data, whereas the application server provides business services that it implements by accessing various data stores (resource managers). The application server is normally written in a 3GL such as C or COBOL, and it interfaces with the resource manager through the normal precompiler or API mechanisms. This environment is typically managed with a transaction processing monitor (also called a transaction monitor), such as BEA Tuxedo. The TPM will normally provide the following:

A typical application server's logic

  1. Connect to Oracle:
       EXEC SQL CONNECT

  2. Wait for the message from the application client (an ATMI call for BEA Tuxedo).
  3. Access Oracle:
       EXEC SQL SELECT
       EXEC SQL UPDATE

  4. Commit:
       EXEC SQL COMMIT;

  5. Send a reply to the application client (an ATMI call for BEA Tuxedo).
  6. Return to step 2 and repeat the process.

In a high-concurrency environment, many application clients can share the same application server. The TPM is responsible for routing messages. As the load increases, the TPM can also spawn more application servers.

Monitoring transactions in pending state
The DBA_2PC_PENDING and DBA_2PC_NEIGHBORS views can be used to determine whether a transaction is in a pending (in-doubt) state.
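For instance, a quick check of pending distributed transactions might look like this (the queries simply list the columns most often needed to decide what to do next):

-- In-doubt transactions on this database
SELECT local_tran_id, global_tran_id, state, mixed
FROM dba_2pc_pending;

-- Incoming and outgoing connections involved in those transactions
SELECT local_tran_id, in_out, database, interface
FROM dba_2pc_neighbors;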

Distributed transaction processing can be achieved in the transaction-processing (TP) architecture by routing messages to different application servers. The data stores can be homogeneous (Oracle to Oracle) or heterogeneous (Oracle to Sybase). The limitation is that the transaction is committed separately on each data store, so you need XA to achieve global heterogeneous transactions. In the XA architecture, each resource manager exports an XA API. The TPM becomes the two-phase commit (2PC) coordinator and uses the XA API to control the prepare, commit, and rollback of transactions within the resource managers. Direct commits and rollbacks from the application servers are replaced with calls to the TPM through the XA interface.

Transaction Recovery in the XA Environment

If the TPM crashes or becomes unavailable while performing a 2PC, a transaction may be left in a pending state. In this state, it may be holding locks and preventing other users from proceeding. You can use the COMMIT FORCE or ROLLBACK FORCE statement to manually force the transaction to commit or roll back.
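A minimal sketch of forcing the outcome; the transaction ID shown is hypothetical and would in practice be taken from the LOCAL_TRAN_ID column of DBA_2PC_PENDING:

-- '1.21.17' is a hypothetical local transaction ID from DBA_2PC_PENDING
COMMIT FORCE '1.21.17';

-- or, if the coordinator's decision was to abort:
ROLLBACK FORCE '1.21.17';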

Purchasing a TPM

Several vendors provide UNIX TPMs. The following support XA:

The following don't currently support XA:

Writing an Oracle TPM Application

The TPM vendor documentation describes the actual APIs used to talk to the TPM. However, all of them provide a way to indicate where a transaction begins and ends and a way to send a request from a client to a server and receive a response. The following examples use Tuxedo /T verbs; note how, in the XA case, the SQL COMMIT is replaced with the TPM commit.

The first example is an Oracle-managed transaction. For the client, use

tpcall("debit_credit");

Use the following for the server:

debit_credit_service(TPSVCINFO *input)
{
     /* extract data from the input */
     EXEC SQL UPDATE debit_data;
     EXEC SQL UPDATE credit_data;
     EXEC SQL COMMIT WORK;
     tpreturn(output_data);
}

The next example is a TPM-managed transaction using XA. Use the following for the client:

tpbegin();
tpcall("debit");
tpcall("credit");
tpcommit();

For server 1, use

debit_service(TPSVCINFO *input)
{
     /* extract data from the input */
     EXEC SQL UPDATE debit_data;
     tpreturn(output_data);
}

For server 2, use the following:

credit_service(TPSVCINFO *input)
{
     /* extract data from the input */
     EXEC SQL UPDATE credit_data;
     tpreturn(output_data);
}

For an overview of the TPM API and programming model, consult the vendor's documentation.

DSS and Data-Warehousing Tuning

DSS systems generally perform queries on huge amounts of data that has been gathered from OLTP systems. You need to consider several issues when using DSS systems:

Adding Indexes

An index is generally used to provide a fast access path to the data. When you use indexes in a DSS environment, where you deal with huge amounts of data, you need to take several special measures:

Using the UNRECOVERABLE option
Using the UNRECOVERABLE option during index creation can speed it up because no redo log records are generated; however, you should back up the database after the index is created.
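For example, an index on a large DSS table might be built without redo generation like this (the index, table, and column names are illustrative):

-- No redo is generated for the build; take a backup afterward
CREATE INDEX sales_part_idx ON sales (part_key)
UNRECOVERABLE;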

In a DSS system, the data doesn't change much, so you can usually afford to create more indexes without worrying about update performance. In general, though, be careful when creating indexes because unnecessary indexes can degrade performance.

When you're using star queries, the indexes on the fact table can be partitioned or non-partitioned. Local partitioned indexes are the simplest to manage, but their disadvantage is that a search of a local non-prefixed index requires probing all the index partitions.

Managing Sort Space

In a DSS environment, you usually will end up with queries that require some kind of sorting. Several initialization parameters can be manipulated to manage the amount of sort space used by these operations:

Effect of sort area size
As the sort area size decreases, the amount of I/O increases due to the need to perform merges on a large number of runs. A large sort area, on the other hand, can result in a high operating system paging rate.

Large sort areas can be used effectively by combining a large SORT_AREA_SIZE with a minimal SORT_AREA_RETAINED_SIZE.

Release memory after completing sorts
The SORT_AREA_RETAINED_SIZE parameter lets you specify the size to which the sort area is reduced (releasing the rest of the memory) as soon as a sort completes. If memory isn't released until the user disconnects, large sorts can create problems in the system.
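A sketch of how these two parameters might be adjusted for a session that runs large sorts; the values are illustrative, not recommendations:

-- Allow up to 10MB of sort memory while a sort is running
ALTER SESSION SET SORT_AREA_SIZE = 10485760;

-- Shrink the retained sort area to 1MB as soon as each sort completes
ALTER SESSION SET SORT_AREA_RETAINED_SIZE = 1048576;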

Managing Hash Join Space

If Cx(Y) is used to represent the cost to perform operation x on table Y, the cost to perform a hash join is as follows (provided that you have sufficient memory available):

Chj(T1,T2) < Cread(T1) + Cread(T2) + Chash(T1,T2)

When you use hash join operations, an in-memory hash table is built from the smaller table; the larger table is then scanned and joined by probing the hash table. Hash joins apply to equijoins, anti-joins, and outer joins. Indexes aren't required to perform a hash join.

Suppose that you have two tables, french and engineers, and want the names of all the French engineers. french is the smaller of the two tables.

SELECT /*+ use_hash(french) */ french.name
FROM french, engineers
WHERE french.name = engineers.name

To perform the hash join, an area of memory known as the hash memory is allocated (see Figure 22.1).

Figure 22.1 : The in-memory hash table is obtained by using the french table.

Applying hash functions to data
When applying hash functions to data, Key1 = Key2 => hash(Key1) = hash(Key2), but the reverse isn't true. (hash represents a hash function, and Key1 and Key2 represent data values.)

The first stage of the join involves scanning and partitioning the french table and building an in-memory hash filter and hash table. This is the build phase, and the french table is known as the build input. The french table is hash-partitioned into smaller chunks so that at least one partition can fit in the hash memory, which reduces the number of comparisons during the join. As the build input is scanned, parts of some partitions may be written to disk (a temporary segment).

A hash filter is created for each partition and stays in memory even if the partition doesn't fit. The hash filter is used to efficiently discard rows that don't join. After the french table is scanned completely, the size of each partition is known, as many partitions as possible are loaded into memory, and a single hash table is built on the in-memory partitions.

Case 1: The hash memory is large enough for all the partitions of the french table; therefore, the entire join is completed by simply scanning the engineers table and probing the build.

Case 2: The hash memory isn't large enough to fit all the partitions of the french table. In this case, the engineers table is scanned, and each row is partitioned using the same method (see Figure 22.2). Then for each row,

Figure 22.2 : Hash joins are performed by using the in-memory hash table and the hash filters.

After the engineers table is scanned, phase 2 begins. In phase 2, the smaller of each pair of french and engineers partitions is scanned into memory and a hash table is built from it. The larger partition is then scanned, and the join is completed by probing the hash table. If a partition won't fit in memory, the join degenerates to a nested-loops-style mechanism.

You can set the following parameters in the initialization file or by using the ALTER SESSION command from Server Manager:

Set the size of hash area appropriately
Each process that performs a parallel join operation uses an amount of memory equal to HASH_AREA_SIZE. Setting HASH_AREA_SIZE too large can cause the system to run out of memory, whereas a setting that's too small can degrade performance.
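A sketch of setting the usual hash join parameters (HASH_JOIN_ENABLED, HASH_AREA_SIZE, and HASH_MULTIBLOCK_IO_COUNT are assumed here) at the session level; the values are illustrative only:

ALTER SESSION SET HASH_JOIN_ENABLED = TRUE;

-- Roughly 16MB of hash memory for this session's hash joins
ALTER SESSION SET HASH_AREA_SIZE = 16777216;

-- Blocks read or written per I/O while scanning and spilling partitions
ALTER SESSION SET HASH_MULTIBLOCK_IO_COUNT = 4;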

Designing Tables for Star Queries

Star queries are most suitable for data warehouses, where there's a huge amount of data but relatively few data changes. A star schema is a natural data representation for many data warehousing applications; it consists of one very large table, referred to as a fact table, and many smaller tables, called dimension tables. (Note that the rule-based optimizer doesn't recognize star schemas.) Dimension tables aren't related to each other; however, each dimension table is related to the fact table through a primary key/foreign key relationship. Usually, there's a b*tree concatenated index on the columns of the fact table that are related to the dimension tables.

An example of a star schema would be a retail environment (see Figure 22.3).

Figure 22.3 : In a star schema, the relationship between the involved tables represents a structure.

The following tables can be part of a retail environment:

The sales table can be the fact table and contain several million records for the different sales transactions. The dimension tables (suppliers, customers, parts, and stores) are relatively small and provide additional details about each sale.

If you need to find sales information, use a star query that applies specific details from the dimension tables. For example, if you want to find the total sales of a specific part purchased by a specific group of customers from a specific store, grouped by supplier, use the following query:

SELECT suppliers.name, SUM(sales.total_sales)
FROM sales, customers, parts, suppliers, stores
WHERE
sales.customer_key = customers.customer_key AND
sales.part_key = parts.part_key AND
sales.supplier_key = suppliers.supplier_key AND
sales.store_key = stores.store_key AND
customers.name IN ('IBM', 'HP', 'COMPAQ') AND
parts.name = '2 GB disk' AND
stores.name = 'EGG HEAD SOFTWARE'
GROUP BY suppliers.name;

Tuning Star Queries

To use star queries efficiently, you must use the cost-based optimizer (set OPTIMIZER_MODE to CHOOSE) and analyze all the tables involved in the star query with COMPUTE STATISTICS. To analyze a table, use the following command from the Server Manager prompt:

SVRMGR> ANALYZE TABLE tablename COMPUTE STATISTICS;

You can also tune star queries in other ways:

STAR hint versus USE_NL hint
The STAR hint /*+ STAR */ forces a star query plan to be used, if possible. The USE_NL hint /*+ USE_NL(table) */ causes each specified table to be joined to another row source with a nested loops join, using the specified table as the inner table.
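As a sketch, the STAR hint could be applied to a query against the retail schema described earlier (the columns and filter values are illustrative):

-- Request a star query plan for a small report by store
SELECT /*+ STAR */ stores.name, SUM(sales.total_sales)
FROM sales, parts, stores
WHERE sales.part_key = parts.part_key
AND sales.store_key = stores.store_key
AND parts.name = '2 GB disk'
GROUP BY stores.name;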

CREATE VIEW product AS
SELECT /*+ NO_MERGE */ *
FROM brands, manufacturers
WHERE brands.mfkey = manufacturers.mfkey;

This improves performance by caching the result of the view and reducing the number of times the small-table join is executed.

You also can use a star transformation by setting STAR_TRANSFORMATION_ENABLED to TRUE in the initialization file and using the STAR_TRANSFORMATION hint in the query. Tables with the following characteristics can't be used with a star transformation, however:

The star transformation is ideal under any of the following conditions:

The star transformation doesn't rely on computing a Cartesian product of the dimension tables; instead, it uses bitmap indexes on individual fact table columns. It works by generating new subqueries that can be used to drive a bitmap index access path for the fact table.
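A minimal sketch of enabling and requesting a star transformation for the same retail schema, assuming bitmap indexes exist on the fact table's key columns (the query itself is illustrative):

ALTER SESSION SET STAR_TRANSFORMATION_ENABLED = TRUE;

-- Ask the optimizer to consider the star transformation for this query
SELECT /*+ STAR_TRANSFORMATION */ suppliers.name, SUM(sales.total_sales)
FROM sales, customers, parts, suppliers
WHERE sales.customer_key = customers.customer_key
AND sales.part_key = parts.part_key
AND sales.supplier_key = suppliers.supplier_key
AND customers.name IN ('IBM', 'HP', 'COMPAQ')
AND parts.name = '2 GB disk'
GROUP BY suppliers.name;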

SEE ALSO
More information on using the cost-based optimizer,
See how to use hints,
More information on bitmap indexes,

Parallel Operations

Oracle can perform the following operations in parallel:

You should try to parallelize operations that have high elapsed time or process a large number of rows.

Tuning Parallel Operations in a DSS Environment

You can use several techniques to optimize parallel operations.

One way is to increase the default degree of parallelism for I/O-bound operations and decrease it for memory-bound operations. Follow these guidelines when adjusting the degree of parallelism:

You also can verify that all parts of the query plan for SQL statements that process huge amounts of data are executing in parallel. Using EXPLAIN PLAN, verify that the plan steps have an OTHER_TAG of PARALLEL_TO_PARALLEL, PARALLEL_TO_SERIAL, PARALLEL_COMBINED_WITH_PARENT, or PARALLEL_COMBINED_WITH_CHILD; any other keyword (or null) indicates serial execution and a possible bottleneck. A query against PLAN_TABLE, sketched after this paragraph, shows these tags. Follow these guidelines to improve the parallelism of the SQL statements:
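A sketch of how the OTHER_TAG values can be checked; the table name and degree of parallelism are illustrative, and PLAN_TABLE is assumed to have been created with utlxplan.sql:

EXPLAIN PLAN SET STATEMENT_ID = 'par_query' FOR
SELECT /*+ FULL(sales) PARALLEL(sales, 8) */ COUNT(*) FROM sales;

-- Any step whose OTHER_TAG is null or not PARALLEL_* runs serially
SELECT id, operation, options, object_name, other_tag
FROM plan_table
WHERE statement_id = 'par_query'
ORDER BY id;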

You can create and populate tables in parallel by using the PARALLEL and the NOLOGGING options with the CREATE TABLE statement. For example,

CREATE TABLE new_table PARALLEL NOLOGGING
AS SELECT col1, col2, col3 FROM old_table;

You also can create indexes by using the PARALLEL and NOLOGGING clauses of the CREATE INDEX statement. Index creation takes place serially unless you specify the PARALLEL clause. An index created with an INITIAL of 5MB and a PARALLEL DEGREE of 8 will use at least 40MB during index creation because the STORAGE clause refers to the storage of each subindex created by the query server processes.
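For instance, a parallel, non-logged index build corresponding to the 5MB INITIAL and degree 8 case might look like this (the names and sizes are illustrative):

-- Each of the eight query server processes allocates its own INITIAL extent,
-- so expect roughly 40MB of space to be used during the build
CREATE INDEX sales_cust_idx ON sales (customer_key)
PARALLEL (DEGREE 8)
NOLOGGING
STORAGE (INITIAL 5M NEXT 5M);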

The number of CPUs can affect the amount of parallelism
If the degree of parallelism isn't specified in the PARALLEL clause of CREATE INDEX, the number of CPUs is used as the degree of parallelism.

Another technique to optimize parallel operations is to set the initialization parameters correctly:

Parallelism is influenced by the usage of hints
A non-zero setting of OPTIMIZER_PERCENT_PARALLEL is overridden if you use a FIRST_ROWS hint or set OPTIMIZER_MODE to FIRST_ROWS.

2 × CPUs × number of concurrent users
(CPUs + 2) × PARALLEL_MIN_SERVERS × 1.5 × BLOCK_SIZE

Set PARALLEL_MIN_SERVERS appropriately
You can use the V$PQ_SYSSTAT view to determine whether you've set the value of PARALLEL_MIN_SERVERS too low or too high. If the Servers Started statistic is continuously increasing, you need to increase this parameter. On the other hand, if very few parallel server processes are busy at any given time, you should decrease this value.
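A quick way to watch these statistics is to query V$PQ_SYSSTAT directly; a sketch:

-- Shows statistics such as Servers Started, Servers Busy, and Servers Idle
SELECT statistic, value
FROM v$pq_sysstat
WHERE statistic LIKE 'Servers%';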


© Copyright, Macmillan Computer Publishing. All rights reserved.