
Using Oracle8


Chapter 22

Tuning for Different Types of Applications


Transaction Processing Tuning

Online Transaction Processing (OLTP) systems have several characteristics:

Typical OLTP applications include airline reservation systems and banking applications. When you design an OLTP system, you must ensure that a large number of concurrent users doesn't degrade system performance, while still providing high availability, speed, and recoverability. You need to consider several issues when using OLTP systems:

Tuning Rollback Segments

In an OLTP environment, a large number of transactions occur concurrently, which may require many rollback segments. When you use parallel DML statements, rollback segments become even more important. Rollback segments should belong to tablespaces with plenty of free space, and they should be able to grow without limit; either set MAXEXTENTS to UNLIMITED or use a very high value for MAXEXTENTS.
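For example, a rollback segment for a busy OLTP system might be created along these lines (the segment and tablespace names and the storage values are only illustrative, not recommendations):

-- Rollback segment in a tablespace with plenty of free space,
-- allowed to grow without limit
CREATE ROLLBACK SEGMENT rbs_oltp
  TABLESPACE rbs
  STORAGE (INITIAL 1M NEXT 1M MINEXTENTS 20 MAXEXTENTS UNLIMITED);

ALTER ROLLBACK SEGMENT rbs_oltp ONLINE;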

SEE ALSO
More information on tuning rollback segments,

Using Discrete Transactions

Discrete transactions can be used concurrently with standard transactions. Discrete transactions are useful for transactions with certain characteristics:

ORA-1555 during discrete transactions
Undo information isn't generated during a discrete transaction. Therefore, you may get "snapshot too old" errors if a discrete transaction starts and commits during a long query and modifies blocks used by that query.

When you use a discrete transaction, all the changes made to any data will be deferred until the transaction commits. Redo information is generated but undo information isn't generated. The redo information is stored in a separate location in memory and gets written to the redo log when the transaction commits.

Changes made during discrete transactions
Discrete transactions can't see their own changes. Therefore, they can't perform inserts or updates on both tables involved in a referential integrity constraint.

Use discrete transactions

  1. Set the parameter DISCRETE_TRANSACTIONS_ENABLED to TRUE in the initialization file. (If DISCRETE_TRANSACTIONS_ENABLED is set to FALSE, all the transactions will be run as standard transactions.)
  2. Use the procedure BEGIN_DISCRETE_TRANSACTION as the first statement in a transaction.

Scope of a discrete transaction
The procedure BEGIN_DISCRETE_TRANSACTION is effective only for the current transaction. As soon as the transaction commits or rolls back, the next transaction will be treated as a standard transaction.

The example in Listing 22.1 uses a discrete transaction for a movie theater application. The name of the movie, the show time, and the number of tickets are passed as arguments to this procedure. The procedure checks the number of tickets available. If enough tickets are available, it issues them; otherwise, it reports how many tickets remain for that movie and show time.


LISTING 22.1  A movie theater application using a discrete transaction

CREATE OR REPLACE PROCEDURE BUY_TICKETS (
     movie_name       IN  VARCHAR2,
     num_of_tickets   IN  NUMBER,
     movie_datetime   IN  DATE,
     status           OUT VARCHAR2)
AS
     theatre_capacity    NUMBER(3);
     tickets_sold        NUMBER(3);
     tickets_available   NUMBER(3);
BEGIN
     dbms_transaction.begin_discrete_transaction;
     FOR i IN 1..2 LOOP
          BEGIN
               SELECT theatre_max, tics_sold
               INTO theatre_capacity, tickets_sold
               FROM movies
               WHERE movie_title = movie_name
               AND show_time = movie_datetime
               FOR UPDATE;

               tickets_available := theatre_capacity - tickets_sold;

               IF tickets_available < num_of_tickets THEN
                    status := 'Sorry. Only ' || tickets_available ||
                              ' tickets are available';
               ELSE
                    UPDATE movies
                    SET tics_sold = tickets_sold + num_of_tickets
                    WHERE movie_title = movie_name
                    AND show_time = movie_datetime;

                    status := 'Requested number of tickets available.';
               END IF;

               COMMIT;
               EXIT;
          EXCEPTION
               WHEN dbms_transaction.discrete_transaction_failed THEN
                    ROLLBACK;
          END;
     END LOOP;
END;


Discrete transaction failure
In this example, if the discrete transaction fails, the DISCRETE_TRANSACTION_FAILED exception occurs and the transaction is rolled back. The second iteration of the loop will re-execute the transaction but this time as a standard transaction.

Transaction Processing Monitors (TPMs)

When a large number of concurrent users connect to the database, the demand for resources can become very heavy. On UNIX, the limiting factor is usually physical memory. On Windows NT, each process can address a maximum of 4GB; 2GB of that address space is reserved for the system and shared between processes. Because the shadow processes are implemented as threads within a single process, the limiting factor becomes the remaining 2GB of address space. There are several solutions to this problem, such as the Multi-Threaded Server, the concurrent manager, and connection multiplexing.

Fault tolerance using TPM
The application client isn't aware of the application server servicing it; therefore, fault tolerance can be provided by using the transaction monitor architecture.

Another approach is to split the application logic into an application client and an application server. The application client is responsible for collecting and presenting the data, whereas the application server provides business services that it implements by accessing various data stores (resource managers). The application server is normally written in a 3GL such as C or COBOL, and it interfaces with the resource manager through the normal precompiler or API mechanisms. This environment is typically managed with a transaction processing monitor (also called a transaction monitor), such as BEA Tuxedo. The TPM will normally provide the following:

A typical application server's logic

  1. Connect to Oracle:
       EXEC SQL CONNECT

  2. Wait for the message from the application client (an ATMI call for BEA Tuxedo).
  3. Access Oracle:
       EXEC SQL SELECT
       EXEC SQL UPDATE

  4. Commit:
       EXEC SQL COMMIT;

  5. Send a reply to the application client (an ATMI call for BEA Tuxedo).
  6. Return to step 2 and repeat the process.

In a high-concurrency environment, many application clients can share the same application server. The TPM is responsible for routing messages. As the load increases, the TPM can also spawn more application servers.

Monitoring transactions in pending state
The DBA_2PC_PENDING and DBA_2PC_NEIGHBORS views can be used to determine whether a transaction is in a pending (in-doubt) state.
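For instance, a quick check of pending distributed transactions might look like this (the queries simply list the columns most often needed to decide what to do next):

-- In-doubt transactions on this database
SELECT local_tran_id, global_tran_id, state, mixed
FROM dba_2pc_pending;

-- Incoming and outgoing connections involved in those transactions
SELECT local_tran_id, in_out, database, interface
FROM dba_2pc_neighbors;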

Distributed transaction processing can be achieved in the transaction-processing (TP) architecture by routing messages to different application servers. The data stores can be homogeneous (Oracle to Oracle) or heterogeneous (Oracle to Sybase). The limitation is that the transaction is committed separately on each data store, so you need XA to achieve global heterogeneous transactions. In the XA architecture, each resource manager exports an XA API. The TPM becomes the two-phase commit (2PC) coordinator and uses the XA API to control the prepare, commit, and rollback of transactions within the resource managers. Direct commits and rollbacks from the application servers are replaced with calls to the TPM through the XA interface.

Transaction Recovery in the XA Environment

If the TPM crashes or becomes unavailable while performing a 2PC, a transaction may be left in a pending state. In this state, it may be holding locks and preventing other users from proceeding. You can use the COMMIT FORCE or ROLLBACK FORCE statement to manually force the transaction to commit or roll back.
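A minimal sketch of forcing the outcome; the transaction ID shown is hypothetical and would in practice be taken from the LOCAL_TRAN_ID column of DBA_2PC_PENDING:

-- '1.21.17' is a hypothetical local transaction ID from DBA_2PC_PENDING
COMMIT FORCE '1.21.17';

-- or, if the coordinator's decision was to abort:
ROLLBACK FORCE '1.21.17';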

Purchasing a TPM

Several vendors provide UNIX TPMs. The following support XA:

The following don't currently support XA:

Writing an Oracle TPM Application

The TPM vendor documentation describes the actual APIs used to talk to the TPM. However, all of them provide a way to indicate where a transaction begins and ends and a way to send a request from a client to a server and receive a response. The following examples use Tuxedo /T verbs; note how, in the XA case, the SQL COMMIT is replaced with the TPM commit.

The first example is an Oracle-managed transaction. For the client, use

tpcall("debit_credit");

Use the following for the server:

debit_credit_service(TPSVCINFO *input)
{
     /* extract data from the input */
     EXEC SQL UPDATE debit_data;
     EXEC SQL UPDATE credit_data;
     EXEC SQL COMMIT WORK;
     tpreturn(output_data);
}

The next example is a TPM-managed transaction using XA. Use the following for the client:

tpbegin();
tpcall("debit");
tpcall("credit");
tpcommit();

For server 1, use

debit_service(TPSVCINFO *input)
{
     /* extract data from the input */
     EXEC SQL UPDATE debit_data;
     tpreturn(output_data);
}

For server 2, use the following:

credit_service(TPSVCINFO *input)
{
     /* extract data from the input */
     EXEC SQL UPDATE credit_data;
     tpreturn(output_data);
}

For an overview of the TPM API and programming model, consult the vendor's documentation.

DSS and Data-Warehousing Tuning

DSS systems generally perform queries on huge amounts of data that has been gathered from OLTP systems. You need to consider several issues when using DSS systems:

Adding Indexes

An index is generally used to provide a fast access path to the data. When you use indexes in a DSS environment, where you deal with huge amounts of data, you need to take several special measures:

Using the UNRECOVERABLE option
Using the UNRECOVERABLE option during index creation can speed it up because no redo log records are generated; however, you should back up the database after the index is created.
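For example, an index on a large DSS table might be built without redo generation like this (the index, table, and column names are illustrative):

-- No redo is generated for the build; take a backup afterward
CREATE INDEX sales_part_idx ON sales (part_key)
UNRECOVERABLE;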

In a DSS system, the data doesn't change much, so you can usually afford to create more indexes without worrying about update performance. In general, though, be careful when creating indexes because unnecessary indexes can degrade performance.

When you're using star queries, the indexes on the fact table can be partitioned or non-partitioned. Local partitioned indexes are the simplest to manage, but their disadvantage is that a search of a local non-prefixed index requires probing all the index partitions.

Managing Sort Space

In a DSS environment, you usually will end up with queries that require some kind of sorting. Several initialization parameters can be manipulated to manage the amount of sort space used by these operations:

Effect of sort area size
As the sort area size decreases, the amount of I/O increases due to the need to perform merges on a large number of runs. A large sort area, on the other hand, can result in a high operating system paging rate.

Large sort areas can be used effectively by combining a large SORT_AREA_SIZE with a minimal SORT_AREA_RETAINED_SIZE.

Release memory after completing sorts
The SORT_AREA_RETAINED_SIZE parameter lets you specify the size to which the sort area is reduced (releasing the rest of the memory) as soon as a sort completes. If memory isn't released until the user disconnects, large sorts can create problems in the system.
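A sketch of how these two parameters might be adjusted for a session that runs large sorts; the values are illustrative, not recommendations:

-- Allow up to 10MB of sort memory while a sort is running
ALTER SESSION SET SORT_AREA_SIZE = 10485760;

-- Shrink the retained sort area to 1MB as soon as each sort completes
ALTER SESSION SET SORT_AREA_RETAINED_SIZE = 1048576;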

Managing Hash Join Space

If Cx(Y) is used to represent the cost to perform operation x on table Y, the cost to perform a hash join is as follows (provided that you have sufficient memory available):

Chj(T1,T2) < Cread(T1) + Cread(T2) + Chash(T1,T2)

When you use hash join operations, an in-memory hash table is built from the smaller table; the larger table is then scanned and joined by probing the hash table. Hash joins apply to equijoins, anti-joins, and outer joins. Indexes aren't required to perform a hash join.

Suppose that you have two tables, french and engineers, and want the names of all the French engineers. french is the smaller of the two tables.

SELECT /*+ use_hash(french) */ french.name
FROM french, engineers
WHERE french.name = engineers.name

To perform the hash join, an area of memory known as the hash memory is allocated (see Figure 22.1).

Figure 22.1 : The in-memory hash table is obtained by using the french table.

Applying hash functions to data
When applying hash functions to data, Key1 = Key2 => hash(Key1) = hash(Key2), but the reverse isn't true. (hash represents a hash function, and Key1 and Key2 represent data values.)

The first stage of the join involves scanning and partitioning the french table and building an in-memory hash filter and hash table. This is the build phase, and the french table is known as the build input. The french table is hash-partitioned into smaller chunks so that at least one partition can fit in the hash memory, which reduces the number of comparisons during the join. As the build input is scanned, parts of some partitions may be written to disk (a temporary segment).

A hash filter is created for each partition and stays in memory even if the partition doesn't fit. The hash filter is used to efficiently discard rows that don't join. After the french table is scanned completely, the size of each partition is known, as many partitions as possible are loaded into memory, and a single hash table is built on the in-memory partitions.

Case 1: The hash memory is large enough for all the partitions of the french table; therefore, the entire join is completed by simply scanning the engineers table and probing the build.

Case 2: The hash memory isn't large enough to fit all the partitions of the french table. In this case, the engineers table is scanned, and each row is partitioned using the same method (see Figure 22.2). Then for each row,

Figure 22.2 : Hash joins are performed by using the in-memory hash table and the hash filters.

After the engineers table is scanned, phase 2 begins. In phase 2, the smaller of each pair of french and engineers partitions is scanned into memory and a hash table is built from it. The larger partition is then scanned, and the join is completed by probing the hash table. If a partition won't fit in memory, the join degenerates to a nested-loops-style mechanism.

You can set the following parameters in the initialization file or by using the ALTER SESSION command from Server Manager:

Set the size of hash area appropriately
Each process that performs a parallel join operation uses an amount of memory equal to HASH_AREA_SIZE. Setting HASH_AREA_SIZE too large can cause the system to run out of memory, whereas a setting that's too small can degrade performance.
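A sketch of setting the usual hash join parameters (HASH_JOIN_ENABLED, HASH_AREA_SIZE, and HASH_MULTIBLOCK_IO_COUNT are assumed here) at the session level; the values are illustrative only:

ALTER SESSION SET HASH_JOIN_ENABLED = TRUE;

-- Roughly 16MB of hash memory for this session's hash joins
ALTER SESSION SET HASH_AREA_SIZE = 16777216;

-- Blocks read or written per I/O while scanning and spilling partitions
ALTER SESSION SET HASH_MULTIBLOCK_IO_COUNT = 4;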

Designing Tables for Star Queries

Star queries are most suitable for data warehouses, where there's a huge amount of data but relatively few data changes. A star schema is a natural data representation for many data warehousing applications; it consists of one very large table, referred to as a fact table, and many smaller tables, called dimension tables. (Note that the rule-based optimizer doesn't recognize star schemas.) Dimension tables aren't related to each other; however, each dimension table is related to the fact table through a primary key/foreign key relationship. Usually, there's a b*tree concatenated index on the columns of the fact table that are related to the dimension tables.

An example of a star schema would be a retail environment (see Figure 22.3).

Figure 22.3 : In a star schema, the relationship between the involved tables represents a structure.

The following tables can be part of a retail environment:

The sales table can be the fact table and contain several million records for the different sales transactions. The dimension tables (suppliers, customers, parts, and stores) are relatively small and provide additional details about each sale.

If you need to find sales information, use a star query that applies specific details from the dimension tables. For example, if you want to find the total sales of a specific part purchased by a specific group of customers from a specific store, grouped by supplier, use the following query:

SELECT suppliers.name, SUM(sales.total_sales)
FROM sales, customers, parts, suppliers, stores
WHERE
sales.customer_key = customers.customer_key AND
sales.part_key = parts.part_key AND
sales.supplier_key = suppliers.supplier_key AND
sales.store_key = stores.store_key AND
customers.name IN ('IBM', 'HP', 'COMPAQ') AND
parts.name = '2 GB disk' AND
stores.name = 'EGG HEAD SOFTWARE'
GROUP BY suppliers.name;

Tuning Star Queries

To use star queries efficiently, you must use the cost-based optimizer (set OPTIMIZER_MODE to CHOOSE) and analyze all the tables involved in the star query with COMPUTE STATISTICS. To analyze a table, use the following command from the Server Manager prompt:

SVRMGR> ANALYZE TABLE tablename COMPUTE STATISTICS;

You can also tune star queries in other ways:

STAR hint versus USE_NL hint
The STAR hint /*+ STAR */ forces a star query plan to be used, if possible. The USE_NL hint /*+ USE_NL(table) */ causes each specified table to be joined to another row source with a nested loops join, using the specified table as the inner table.
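As a sketch, the STAR hint could be applied to a query against the retail schema described earlier (the columns and filter values are illustrative):

-- Request a star query plan for a small report by store
SELECT /*+ STAR */ stores.name, SUM(sales.total_sales)
FROM sales, parts, stores
WHERE sales.part_key = parts.part_key
AND sales.store_key = stores.store_key
AND parts.name = '2 GB disk'
GROUP BY stores.name;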

CREATE VIEW product AS
SELECT /*+ NO_MERGE */ *
FROM brands, manufacturers
WHERE brands.mfkey = manufacturers.mfkey;

This improves performance by caching the result of the view and reducing the number of times the small-table join is executed.

You also can use a star transformation by setting STAR_TRANSFORMATION_ENABLED to TRUE in the initialization file and using the STAR_TRANSFORMATION hint in the query. Tables with the following characteristics can't be used with a star transformation, however:

The star transformation is ideal under any of the following conditions:

The star transformation doesn't rely on computing a Cartesian product of the dimension tables; instead, it uses bitmap indexes on individual fact table columns. It works by generating new subqueries that can be used to drive a bitmap index access path for the fact table.
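A minimal sketch of enabling and requesting a star transformation for the same retail schema, assuming bitmap indexes exist on the fact table's key columns (the query itself is illustrative):

ALTER SESSION SET STAR_TRANSFORMATION_ENABLED = TRUE;

-- Ask the optimizer to consider the star transformation for this query
SELECT /*+ STAR_TRANSFORMATION */ suppliers.name, SUM(sales.total_sales)
FROM sales, customers, parts, suppliers
WHERE sales.customer_key = customers.customer_key
AND sales.part_key = parts.part_key
AND sales.supplier_key = suppliers.supplier_key
AND customers.name IN ('IBM', 'HP', 'COMPAQ')
AND parts.name = '2 GB disk'
GROUP BY suppliers.name;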

SEE ALSO
More information on using the cost-based optimizer,
See how to use hints,
More information on bitmap indexes,

Parallel Operations

Oracle can perform the following operations in parallel:

You should try to parallelize operations that have high elapsed time or process a large number of rows.

Tuning Parallel Operations in a DSS Environment

You can use several techniques to optimize parallel operations.

One way is to increase the default degree of parallelism for I/O-bound operations and decrease it for memory-bound operations. Follow these guidelines when adjusting the degree of parallelism:

You also can verify that all parts of the query plan for SQL statements that process huge amounts of data are executing in parallel. Using EXPLAIN PLAN, verify that the plan steps have an OTHER_TAG of PARALLEL_TO_PARALLEL, PARALLEL_TO_SERIAL, PARALLEL_COMBINED_WITH_PARENT, or PARALLEL_COMBINED_WITH_CHILD; any other keyword (or null) indicates serial execution and a possible bottleneck. A query against PLAN_TABLE, sketched after this paragraph, shows these tags. Follow these guidelines to improve the parallelism of the SQL statements:
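A sketch of how the OTHER_TAG values can be checked; the table name and degree of parallelism are illustrative, and PLAN_TABLE is assumed to have been created with utlxplan.sql:

EXPLAIN PLAN SET STATEMENT_ID = 'par_query' FOR
SELECT /*+ FULL(sales) PARALLEL(sales, 8) */ COUNT(*) FROM sales;

-- Any step whose OTHER_TAG is null or not PARALLEL_* runs serially
SELECT id, operation, options, object_name, other_tag
FROM plan_table
WHERE statement_id = 'par_query'
ORDER BY id;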

You can create and populate tables in parallel by using the PARALLEL and the NOLOGGING options with the CREATE TABLE statement. For example,

CREATE TABLE new_table PARALLEL NOLOGGING
AS SELECT col1, col2, col3 FROM old_table;

You also can create indexes by using the PARALLEL and NOLOGGING clauses of the CREATE INDEX statement. Index creation takes place serially unless you specify the PARALLEL clause. An index created with an INITIAL of 5MB and a PARALLEL DEGREE of 8 will use at least 40MB during index creation because the STORAGE clause refers to the storage of each subindex created by the query server processes.
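For instance, a parallel, non-logged index build corresponding to the 5MB INITIAL and degree 8 case might look like this (the names and sizes are illustrative):

-- Each of the eight query server processes allocates its own INITIAL extent,
-- so expect roughly 40MB of space to be used during the build
CREATE INDEX sales_cust_idx ON sales (customer_key)
PARALLEL (DEGREE 8)
NOLOGGING
STORAGE (INITIAL 5M NEXT 5M);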

The number of CPUs can affect the amount of parallelism
If the degree of parallelism isn't specified in the PARALLEL clause of CREATE INDEX, the number of CPUs is used as the degree of parallelism.

Another technique to optimize parallel operations is to set the initialization parameters correctly:

Parallelism is influenced by the usage of hints
A non-zero setting of OPTIMIZER_PERCENT_PARALLEL is overridden if you use a FIRST_ROWS hint or set OPTIMIZER_MODE to FIRST_ROWS.

2 × CPUs × number of concurrent users
(CPUs + 2) × PARALLEL_MIN_SERVERS × 1.5 × BLOCK_SIZE

Set PARALLEL_MIN_SERVERS appropriately
You can use the V$PQ_SYSSTAT view to determine whether you've set the value of PARALLEL_MIN_SERVERS too low or too high. If the Servers Started statistic is continuously increasing, you need to increase this parameter. On the other hand, if very few parallel server processes are busy at any given time, you should decrease this value.
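A quick way to watch these statistics is to query V$PQ_SYSSTAT directly; a sketch:

-- Shows statistics such as Servers Started, Servers Busy, and Servers Idle
SELECT statistic, value
FROM v$pq_sysstat
WHERE statistic LIKE 'Servers%';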


© Copyright, Macmillan Computer Publishing. All rights reserved.