Saturday, October 10, 2009

Oracle Performance Improvement Method

Oracle Performance Improvement Method

1. Perform the following initial standard checks:
Get candid feedback from users. Determine the performance project’s scope
and subsequent performance goals, as well as performance goals for the
future. This process is key in future capacity planning.
Get a full set of operating system, database, and application statistics from the
system when the performance is both good and bad. If these are not available,
then get whatever is available. Missing statistics are analogous to missing
evidence at a crime scene: They make detectives work harder and it is more
Sanity-check the operating systems of all systems involved with user
performance. By sanity-checking the operating system, you look for hardware
or operating system resources that are fully utilized. List any over-used
resources as symptoms for analysis later. In addition, check that all hardware
shows no errors or diagnostics.

2. Check for the top ten most common mistakes with Oracle, and determine if any of
these are likely to be the problem. List these as symptoms for later analysis. These
are included because they represent the most likely problems. ADDM
automatically detects and reports nine of these top ten issues.

3. Build a conceptual model of what is happening on the system using the symptoms
as clues to understand what caused the performance problems.

4. Propose a series of remedy actions and the anticipated behavior to the system,
then apply them in the order that can benefit the application the most. ADDM
produces recommendations each with an expected benefit. A golden rule in
performance work is that you only change one thing at a time and then measure
the differences. Unfortunately, system downtime requirements might prohibit
such a rigorous investigation method. If multiple changes are applied at the same
time, then try to ensure that they are isolated so that the effects of each change can be independently validated.

5. Validate that the changes made have had the desired effect, and see if the user's
perception of performance has improved. Otherwise, look for more bottlenecks,
and continue refining the conceptual model until your understanding of the
application becomes more accurate.

6. Repeat the last three steps until performance goals are met or become impossible
due to other constraints.

This method identifies the biggest bottleneck and uses an objective approach to
performance improvement. The focus is on making large performance improvements
by increasing application efficiency and eliminating resource shortages and
bottlenecks. In this process, it is anticipated that minimal (less than 10%) performance gains are made from instance tuning, and large gains (100% +) are made from isolating application inefficiencies.

Following analysis assumes that statistics for both the operating system and the database have been gathered.

1. Is the response time/batch run time acceptable for a single user on an empty or
lightly loaded system?
If it is not acceptable, then the application is probably not coded or designed
optimally, and it will never be acceptable in a multiple user situation when system
resources are shared. In this case, get application internal statistics, and get SQL
Trace and SQL plan information. Work with developers to investigate problems in
data, index, transaction SQL design, and potential deferral of work to
batch/background processing.

2. Is all the CPU being utilized?
If the kernel utilization is over 40%, then investigate the operating system for
network transfers, paging, swapping, or process thrashing. Otherwise, move onto
CPU utilization in user space. Check to see if there are any non-database jobs
consuming CPU on the system limiting the amount of shared CPU resources, such
as backups, file transforms, print queues, and so on. After determining that the
database is using most of the CPU, investigate the top SQL by CPU utilization.
These statements form the basis of all future analysis. Check the SQL and the
transactions submitting the SQL for optimal execution. Oracle provides CPU
statistics in V$SQL and V$SQLSTATS.
If the application is optimal and there are no inefficiencies in the SQL execution,
consider rescheduling some work to off-peak hours or using a bigger system.

3. At this point, the system performance is unsatisfactory, yet the CPU resources are
not fully utilized.
In this case, you have serialization and unscalable behavior within the server. Get the WAIT_EVENTS statistics from the server, and determine the biggest
serialization point. If there are no serialization points, then the problem is most
likely outside the database, and this should be the focus of investigation.
Elimination of WAIT_EVENTS involves modifying application SQL and tuning
database parameters. This process is very iterative and requires the ability to drill
down on the WAIT_EVENTS systematically to eliminate serialization points.

No comments: