SQL Trace and TKPROF
SQL_TRACE (10046 trace)
SQL_TRACE is the main method for collecting SQL execution information in Oracle. It records a wide range of information and statistics that can be used to tune SQL operations.
Enabling SQL_TRACE
The SQL Trace facility can be enabled or disabled for an individual session or at the instance level. If the initialization parameter SQL_TRACE is set to TRUE in the instance's init.ora, all sessions are traced. Note that enabling the SQL trace facility for the entire instance this way can have a severe performance impact.
The quickest way to capture the SQL being processed by a session is to switch on SQL trace, or set the 10046 event, for a representative period of time. The resulting trace files can be read in their raw state or translated using the tkprof utility.
Tracing a SQL session
Start session trace
To start a SQL trace for the current session, execute:
ALTER SESSION SET sql_trace = true;
You can also add an identifier to the trace file name for later identification:
ALTER SESSION SET sql_trace = true;
ALTER SESSION SET tracefile_identifier = 'mysqltrace';
Stop session trace
To stop SQL tracing for the current session, execute:
ALTER SESSION SET sql_trace = false;
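Since SQL_TRACE corresponds to event 10046 at level 1, a session can alternatively be traced by setting the 10046 event directly, which also allows bind values and wait events to be captured. A sketch (level 12 traces binds and waits; level 8 waits only, level 4 binds only):

```sql
-- Start extended tracing for the current session (level 12: binds and waits)
ALTER SESSION SET EVENTS '10046 trace name context forever, level 12';

-- ... run the statements to be traced ...

-- Stop tracing
ALTER SESSION SET EVENTS '10046 trace name context off';
```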
Tracing other users' sessions
DBAs can use DBMS_SYSTEM.SET_SQL_TRACE_IN_SESSION to trace problematic database sessions. Steps:
- Get the SID and SERIAL# for the process you want to trace.
SQL> select sid, serial# from sys.v_$session where ...

       SID    SERIAL#
---------- ----------
         8      13607
- Enable tracing for your selected process:
SQL> ALTER SYSTEM SET timed_statistics = true;
SQL> execute dbms_system.set_sql_trace_in_session(8, 13607, true);
- Ask the user to run just enough of the application to demonstrate the problem.
- Disable tracing for your selected process:
SQL> execute dbms_system.set_sql_trace_in_session(8, 13607, false);
- Look for trace file in USER_DUMP_DEST:
$ cd /app/oracle/admin/oradba/udump
$ ls -ltr
total 8
-rw-r-----   1 oracle   dba      2764 Mar 30 12:37 ora_9294.trc
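On 10g and later, DBMS_MONITOR is the supported package for tracing another session and can also capture wait and bind information. A sketch using the same SID/SERIAL# as above:

```sql
-- Enable extended trace for session 8, serial# 13607 (waits on, binds off)
EXECUTE dbms_monitor.session_trace_enable(session_id => 8, serial_num => 13607, waits => TRUE, binds => FALSE);

-- ... reproduce the problem ...

-- Disable it again
EXECUTE dbms_monitor.session_trace_disable(session_id => 8, serial_num => 13607);
```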
Tracing an entire database
To enable SQL tracing for the entire database, execute:
ALTER SYSTEM SET sql_trace = true SCOPE=MEMORY;
To stop, execute:
ALTER SYSTEM SET sql_trace = false SCOPE=MEMORY;
Identifying trace files
Trace output is written to the database's UDUMP directory.
The default name for a trace file is INSTANCE_ora_PID_TRACEID.trc where:
- INSTANCE is the name of the Oracle instance;
- PID is the operating system process ID (V$PROCESS.SPID); and
- TRACEID is the optional character string set via TRACEFILE_IDENTIFIER.
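The trace directory and the PID component of the file name can be queried from the data dictionary. A sketch, assuming access to the V$ views (on 11g and later, V$DIAG_INFO also reports the current session's trace file directly):

```sql
-- Trace directory
SELECT value FROM v$parameter WHERE name = 'user_dump_dest';

-- OS process ID of the current session (the PID part of the file name)
SELECT p.spid
  FROM v$process p, v$session s
 WHERE p.addr = s.paddr
   AND s.sid = (SELECT sid FROM v$mystat WHERE ROWNUM = 1);
```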
Size of trace files
The trace file size is limited by the parameter MAX_DUMP_FILE_SIZE. If you do not specify the K or M suffix, the value of this parameter is interpreted in OS blocks.
Be sure this parameter is set to a value high enough for your purpose (e.g. some MB); how much depends on the number and complexity of the statements to be run while tracing. If the value is set too low, the dump file size limit may be reached before the crucial statements execute, and the trace file will be closed before the interesting parts can be recorded in it.
On the other hand, when this parameter is set to UNLIMITED (the default), if the traced program keeps running and tracing is never stopped, the trace file can grow without limit, that is, until the associated file system or disk is full. A DBA can stop the trace of a session using the DBMS_MONITOR (10g and up), DBMS_SYSTEM or DBMS_SUPPORT packages.
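For example (the 100M value is only an illustration, and exact syntax can vary slightly by version; quote the value when using a K/M suffix):

```sql
-- Limit trace files for this session to roughly 100 MB
ALTER SESSION SET max_dump_file_size = '100M';

-- Or remove the limit instance-wide (the default)
ALTER SYSTEM SET max_dump_file_size = unlimited;
```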
Formatting output
Raw trace output is difficult to read. However, Oracle provides a utility, called TKProf, that can be used to format it.
==========================================================================
TKPROF Usage
TKPROF allows you to analyse a trace file to determine where time is
being spent and what query plans are being used on SQL statements.
1 - Set TIMED_STATISTICS if required at database level.
2 - Get SQL_TRACE output for the session you wish to monitor
3 - Find the appropriate trace file (in USER_DUMP_DEST, default $ORACLE_HOME/rdbms/log on Unix).
    On Unix you can find the most recent trace files with:
        ls -ltr
    This will list the most recent files LAST.
4 - Run tkprof on the trace file thus:
        tkprof tracefile outfile [explain=user/password] [options...]

TKPROF Options
~~~~~~~~~~~~~~
print=integer    List only the first 'integer' SQL statements.
insert=filename  List SQL statements and data inside INSERT statements.
sys=no           TKPROF does not list SQL statements run as user SYS.
record=filename  Record statements found in the trace file.
sort=option      Set of zero or more of the following sort options:
    prscnt  number of times parse was called
    prscpu  cpu time parsing
    prsela  elapsed time parsing
    prsdsk  number of disk reads during parse
    prsqry  number of buffers for consistent read during parse
    prscu   number of buffers for current read during parse
    prsmis  number of misses in library cache during parse
    execnt  number of times execute was called
    execpu  cpu time spent executing
    exeela  elapsed time executing
    exedsk  number of disk reads during execute
    exeqry  number of buffers for consistent read during execute
    execu   number of buffers for current read during execute
    exerow  number of rows processed during execute
    exemis  number of library cache misses during execute
    fchcnt  number of times fetch was called
    fchcpu  cpu time spent fetching
    fchela  elapsed time fetching
    fchdsk  number of disk reads during fetch
    fchqry  number of buffers for consistent read during fetch
    fchcu   number of buffers for current read during fetch
    fchrow  number of rows fetched
    userid  userid of user that parsed the cursor
============================================================================
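Putting steps 3 and 4 together, a typical invocation might look like the following (the file names and the scott/tiger account are placeholders; the sort keys rank statements by elapsed time in each phase):

```shell
tkprof orcl_ora_9294.trc orcl_ora_9294.prf explain=scott/tiger sys=no sort=prsela,exeela,fchela
```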
TKProf Interpretation
TKProf is an executable that 'parses' Oracle trace files to produce more readable output. In the default mode, all the information in TKProf is available from the base trace file.

TKProf Structure
TKProf output for an individual cursor has the following structure:

SELECT NULL FROM DUAL

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0          0
Execute      1      0.00       0.00          0          0          0          0
Fetch        2      0.00       0.00          0          3          0          1
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total        4      0.00       0.00          0          3          0          1

Misses in library cache during parse: 0
Optimizer goal: FIRST_ROWS
Parsing user id: 271

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS FULL DUAL (cr=3 r=0 w=0 time=21 us)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  SQL*Net message from client                     2        0.00          0.00

Overall the structure is:
- SQL Statement
- Parse/Execute/Fetch statistics and timings
- Library Cache information
- Row source plan
- Events waited for by the statement
Parse/Execute/Fetch statistics and timings

This section contains the bulk of the useful timing information for each statement. It can be used in conjunction with the 'Row source plan' and 'Events waited for by the statement' sections to give the full picture. Columns in the Parse/Execute/Fetch table have the following meanings:

call     Statistics for each cursor's activity are divided into 3 areas, and a total is also calculated:
         Parse   - statistics from parsing the cursor. This includes information for plan generation etc.
         Execute - statistics for the execution phase of a cursor.
         Fetch   - statistics for actually fetching the rows.
count    Number of times each individual activity has been performed on this particular cursor.
cpu      CPU time used by this cursor.
elapsed  Elapsed time for this cursor (includes the CPU time).
disk     The number of blocks read from disk. Generally it is preferable for blocks to be read from the buffer cache rather than from disk.
query    Incremented if a buffer is read in consistent mode. A consistent mode buffer is one that has been generated to give a consistent read snapshot for a long running transaction.
current  Incremented if a buffer is found in the buffer cache that is new enough for the current transaction and is in current mode (and is not a CR buffer). This applies to buffers that have been read into the cache as well as buffers that already exist in the cache in current mode.
rows     Rows retrieved by this step.

Library Cache information

Tracing a statement records some information regarding library cache usage, which TKProf externalises in this section. Most important here is "Misses in library cache during parse:", which shows whether or not a statement is being re-parsed. If a statement is being shared well, you should see a minimal number of misses here (preferably 1 or 0). High values in this field indicate that sharing is not occurring.
Row source plan

This section displays the access path used at execution time for each statement, along with timing and actual row counts returned by each step in the plan. This can be very useful for a number of reasons. Example:

Rows     Row Source Operation
-------  ---------------------------------------------------
[A]   1  TABLE ACCESS FULL DUAL [B] (cr=3 [C] r=0 [D] w=0 [E] time=21 us [F])

>> Row count [A] - the row counts output in this section are the actual number of rows returned at each step in the query execution. These actual counts can be compared with the estimated cardinalities (row counts) from an optimizer explain plan. Any differences may indicate a statistical problem that may result in a poor plan choice.
>> Row Source Operation [B] - shows the operation executed at this step in the plan.
>> IO Stats - for each step in the plan, [C] is the consistent reads, [D] is the physical reads and [E] is the writes. These statistics can be useful in identifying steps that read or write a particularly large proportion of the overall data.
>> Timing - [F] shows the cumulative elapsed time for the step and the steps that preceded it. This is very useful when looking for the point in an access path where the time is spent; by finding where the majority of the time originates, it is possible to narrow down a number of problems.
Events waited for by the statement

This section displays all wait events that a statement has waited for during the tracing. It can be very useful when used in conjunction with the statistics and row source information for tracking down the causes of problems associated with long wait times. High numbers of waits, or waits with a long total duration, may be candidates for investigation depending on the wait itself.

==============================================================

Potential TKProf Usage Examples

Spotting Relatively High Resource Usage

update ... where ...

-----------------------------------------------------------------------
| call    | count | cpu | elapsed | disk | query   | current | rows   |
|---------|-------|-----|---------|------|---------|---------|--------|
| Parse   | 1     | 7   | 122     | 0    | 0       | 0       | 0      |
| Execute | 1     | 75  | 461     | 5    | [H] 297 | [I] 3   | [J] 1  |
| Fetch   | 0     | 0   | 0       | 0    | 0       | 0       | 0      |
-----------------------------------------------------------------------

This statement is a single execute of an update. [H] shows that this query is visiting 297 buffers to find the rows to update. [I] shows that only 3 buffers are visited performing the update. [J] shows that only 1 row is updated. Reading 297 buffers to update 1 row is a lot of work and tends to indicate that the access path being used is not particularly efficient. Perhaps there is an index missing that would improve the access performance?

Spotting Over Parsing

select ...
-------------------------------------------------------------------------
| call    | count | cpu     | elapsed | disk | query  | current | rows  |
|---------|-------|---------|---------|------|--------|---------|-------|
| Parse   | [M] 2 | [N] 221 | 329     | 0    | 45     | 0       | 0     |
| Execute | 3     | [O] 9   | [P] 17  | 0    | 0      | 0       | 0     |
| Fetch   | 3     | 6       | 8       | 0    | [L] 4  | 0       | [K] 1 |
-------------------------------------------------------------------------

Misses in library cache during parse: 2 [Q]

Here we have a select that we suspect may be a candidate for over parsing. [K] shows that the query has returned 1 row. [L] shows that 4 buffers were read to get this row back. This is fine.
[M] shows that the statement is parsed twice; this is not desirable, especially as the parse CPU usage [N] is high in comparison to the execute figures [O] & [P] (i.e. the elapsed time for execute is 17, but the statement spends over 300 in the parse phase determining the access path etc.). [Q] shows that these parses are hard parses. If [Q] were 1, the statement would have had 1 hard parse followed by a soft parse (which just looks up the already parsed detail in the library cache). This is not a particularly bad example in terms of total counts, since the query has only been executed a few times. However, if this pattern is reproduced for each execution, it could be a significant issue. Excessive parsing should be avoided as far as possible by ensuring that code is shared:
- using bind variables
- making the shared pool large enough to hold query definitions in memory long enough for them to be reused.
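As an illustration of the bind-variable point, using the classic EMP demo table (a placeholder here), compare:

```sql
-- Each distinct literal is a different statement text, so each gets its own parse:
SELECT ename FROM emp WHERE empno = 7369;
SELECT ename FROM emp WHERE empno = 7499;

-- With a bind variable the text is constant, so one shared cursor serves all executions:
VARIABLE v_empno NUMBER
EXECUTE :v_empno := 7369
SELECT ename FROM emp WHERE empno = :v_empno;
```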
Spotting Queries that Execute too Frequently

The following query has a high elapsed time and is a candidate for investigation:

UPDATE ... SET ... WHERE COL = :bind1;

call     count       cpu    elapsed       disk      query    current     rows
------- ------  -------- ---------- ---------- ---------- ---------- --------
Parse        0      0.00       0.00          0          0          0        0
Execute 488719  66476.95   66557.80          1     488729    1970566   488719
Fetch        0      0.00       0.00          0          0          0        0
------- ------  -------- ---------- ---------- ---------- ---------- --------
total   488719  66476.95   66557.80          1     488729    1970566   488719

From the above, the update executes 488,719 times and takes in total ~65,000 seconds to do so. The majority of the time is spent on CPU. A single row is updated per execution. For each row updated, ~1 buffer is queried, and ~2 million buffers are visited in total to perform the updates. On average the elapsed time is ~0.1 second per execution. A sub-second execution time would normally be acceptable for most queries, but if the query is not scaleable and is executed numerous times, the time can quickly add up to a large number. It would appear that in this case the update may be part of a loop where individual values are passed in and 1 row is updated per value. This structure does not scale to large numbers of values, meaning that it can become inefficient. One potential solution is to 'batch up' the updates so that multiple rows are updated within the same execution. As Oracle releases have progressed, a number of optimizations and enhancements have been made to improve the handling of 'batch' operations and make them more efficient. In this way, code modifications to replace frequently executed, relatively inefficient statements with more scaleable operations can have a significant impact.
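As a sketch of the 'batching up' idea (the table and column names are hypothetical), the row-at-a-time loop can often be replaced either by a single set-based statement or by a PL/SQL bulk bind, reducing hundreds of thousands of executes to one (or a few):

```sql
-- Instead of executing this once per value ...
--   UPDATE t SET flag = 'Y' WHERE col = :bind1;

-- ... update all qualifying rows in one execution:
UPDATE t SET flag = 'Y' WHERE col IN (SELECT col FROM driving_table);

-- Or, where the values come from a collection, use FORALL to batch the executions:
DECLARE
  TYPE t_vals IS TABLE OF t.col%TYPE;
  l_vals t_vals;
BEGIN
  SELECT col BULK COLLECT INTO l_vals FROM driving_table;
  FORALL i IN 1 .. l_vals.COUNT
    UPDATE t SET flag = 'Y' WHERE col = l_vals(i);
END;
/
```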
SQL trace files contain detailed timing information. By default, Oracle does not track timing, so all timing figures in trace files will show as zero. If you would like to see legitimate timing information, you need to enable timed statistics. You can do this at the instance level by setting the following parameter in the instance parameter file and restarting the instance:

timed_statistics = true

You can also dynamically enable or disable timed statistics collection at either the instance or the session level with the following commands:

ALTER SYSTEM SET timed_statistics = TRUE|FALSE;
ALTER SESSION SET timed_statistics = TRUE|FALSE;

There is no known way to enable timed statistics collection for an individual session from another session (akin to the SYS.dbms_system.set_sql_trace_in_session built-in).

There is very high overhead associated with enabling SQL trace. Some DBAs believe the performance penalty could be over 25%. Another concern is that enabling SQL trace causes the generation of potentially large trace files. For these reasons, you should use SQL trace sparingly: only trace what you need to trace, and think very carefully before enabling SQL trace at the instance level.

On the other hand, there is little, if any, measurable performance penalty in enabling timed statistics collection. Many DBAs run production databases with timed statistics collection enabled at the system level so that various system statistics (more than just SQL trace files) will include detailed timing information.