You were brought to this page based on an internet search
and as a free service to Oracle DBAs.
The text below is an excerpt from the book,
Oracle Performance Firefighting, written by
Craig Shallahamer of
OraPub, Inc.
Figures and tables are not included on this page, only their reference.
©2009, 2010 by Craig Shallahamer. This is copyrighted material.
Out of respect for those involved in the creation of the book and also for
their families, we ask you to respect the copyright both in intent and deed. Thank you.
-------------------------------
The Oracle response-time analysis shows Oracle processes are primarily waiting to get cache buffer chain (CBC) latches, and Oracle processes are consuming nearly all of the available database server CPU. (This is not uncommon with severe latch contention.) CBCs are used to answer the question, "Is the block in the buffer cache?" CBCs get stressed when the answer is usually, "Yes, the block is in the cache." Repeatedly asking this question and accessing buffers in memory stresses the CPU subsystem. As I'll detail in Chapter 6, an Oracle-focused CBC solution is to increase the number of CBC latches by changing an instance parameter. This is a perfect example of why you must know Oracle's architecture. A fantastic diagnosis is important, but you must know what to do with that stellar diagnosis! This is why this book focuses on both diagnosis and Oracle internals.
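To make the "more latches" idea concrete, here is a minimal toy model, not Oracle's actual hashing or latching code. It assumes each buffer hashes to a chain, each latch protects several chains, and sessions pick chains at random. The function name, the chain count, the session count, and the modulo mapping are all illustrative assumptions. The sketch estimates how often two or more sessions need the same latch at the same moment.

```python
import random


def latch_collision_rate(num_latches, num_chains=1024, sessions=16,
                         rounds=2000, seed=42):
    """Estimate how often concurrent sessions need the same latch.

    Toy model only (all values are assumptions, not Oracle internals):
    every session touches one random hash chain per round, and each
    latch protects num_chains / num_latches chains.
    Returns the fraction of rounds with at least one latch collision.
    """
    rng = random.Random(seed)
    rounds_with_collision = 0
    for _ in range(rounds):
        latches_in_use = set()
        collided = False
        for _ in range(sessions):
            chain = rng.randrange(num_chains)  # stand-in for hashing a block address
            latch = chain % num_latches        # one latch covers several chains
            if latch in latches_in_use:
                collided = True                # another session already needs this latch
            latches_in_use.add(latch)
        if collided:
            rounds_with_collision += 1
    return rounds_with_collision / rounds


if __name__ == "__main__":
    for latches in (32, 128, 1024):
        rate = latch_collision_rate(latches)
        print(f"{latches:5d} latches: collision in {rate:.0%} of rounds")
```

In this model, spreading the same chains across more latches lowers the chance that two sessions compete for one latch, which is the intuition behind increasing the number of CBC latches. A real system has skewed access patterns (hot blocks), so adding latches does not help when the contention is concentrated on a single chain.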
Based on the Oracle analysis, you can anticipate what you will find in the application and the operating system analyses. In this situation, you would expect the operating system to be experiencing a raging CPU bottleneck and the application SQL to be asking for a lot of buffers (that is, buffer gets as opposed to block gets). Also, you will want to check the order-entry SQL. It's likely that the order-entry SQL is either causing the intense logical IO activity, and hence CBC activity, or being negatively affected by it.
A quick operating system analysis clearly shows a raging CPU bottleneck. As expected, the CPU subsystem is 90% busy on average, with a run queue nearly always greater than the number of CPU cores (which means processes must wait for CPU resources). While the IO subsystem is doing some real work, no volume is busier than 60%, and response times are well under the rule-of-thumb limit of 10 ms. There is no memory swapping. As in many cases, the network has been deemed out of scope. A solution focused on the operating system is to acquire more CPU cycles for Oracle processes. There are many ways to do this, such as identifying processes that do not need to be, or should not be, running during peak times.
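The operating system checks described above can be sketched as a small script. This is only an illustration under stated assumptions: the sample values would come from tools such as vmstat, sar, or iostat, the OsSample fields and the assess_os function are hypothetical names, the 90% CPU threshold simply mirrors the figure observed in this case, and "nearly always" is arbitrarily taken to mean 80% of samples. Only the 10 ms IO response-time rule of thumb comes directly from the text.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class OsSample:
    """One interval of OS statistics (hypothetical structure)."""
    cpu_busy_pct: float      # average CPU utilization in the interval
    run_queue: int           # CPU run queue length
    io_response_ms: float    # worst volume response time in the interval
    swapping: bool = False   # True if memory swapping was observed


def assess_os(samples, cpu_cores, cpu_busy_limit=90.0,
              queued_fraction_limit=0.8, io_response_limit_ms=10.0):
    """Classify likely OS bottlenecks from a list of OsSample objects."""
    avg_cpu = mean(s.cpu_busy_pct for s in samples)
    # Fraction of samples where processes had to wait for a CPU core.
    queued = sum(1 for s in samples if s.run_queue > cpu_cores) / len(samples)
    worst_io_ms = max(s.io_response_ms for s in samples)
    return {
        "avg_cpu_busy_pct": avg_cpu,
        "cpu_bottleneck": avg_cpu >= cpu_busy_limit
                          and queued >= queued_fraction_limit,
        "io_bottleneck": worst_io_ms > io_response_limit_ms,
        "memory_pressure": any(s.swapping for s in samples),
    }


if __name__ == "__main__":
    # Made-up samples for a 4-core server, for illustration only.
    samples = [
        OsSample(92.0, 6, 4.1),
        OsSample(89.0, 5, 3.8),
        OsSample(93.0, 7, 5.0),
        OsSample(91.0, 6, 4.4),
    ]
    print(assess_os(samples, cpu_cores=4))
```

The point of the sketch is the combination of conditions: high average utilization alone is not the signal. It is high utilization together with a run queue that exceeds the core count that indicates processes are waiting for CPU.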