Thursday, February 26, 2015

SQL Trace, TKPROF



SQL_TRACE (10046 trace)

SQL_TRACE is the main method for collecting SQL Execution information in Oracle. It records a wide range of information and statistics that can be used to tune SQL operations.

Enabling SQL_TRACE

The SQL Trace facility can be enabled or disabled for an individual session or at the instance level. If the initialization parameter SQL_TRACE is set to TRUE in an instance's init.ora, then all sessions will be traced. Note that enabling the SQL Trace facility for the entire instance this way can have a severe performance impact.


------------------------------
The quickest way to capture the SQL being processed by a session is to switch on SQL trace or set the 10046 event for a representative period of time. The resulting trace files can be read in their raw state or formatted using the tkprof utility.

------------------------------------------------


Tracing a SQL session


Start session trace

To start a SQL trace for the current session, execute:
ALTER SESSION SET sql_trace = true;
You can also add an identifier to the trace file name for later identification:
ALTER SESSION SET sql_trace = true;
ALTER SESSION SET tracefile_identifier = 'mysqltrace';
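If you need more detail than plain SQL_TRACE provides, the same trace can be switched on through the 10046 event mentioned above; level 8 adds wait events, and level 12 adds both waits and bind values (a sketch, run in your own session):

```sql
-- 10046 at level 12: SQL statements plus wait events and bind values
ALTER SESSION SET EVENTS '10046 trace name context forever, level 12';

-- ... run the workload to be traced ...

-- switch the trace off again
ALTER SESSION SET EVENTS '10046 trace name context off';
```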


Stop session trace

To stop SQL tracing for the current session, execute:
ALTER SESSION SET sql_trace = false;


Tracing other users' sessions

DBAs can use DBMS_SYSTEM.SET_SQL_TRACE_IN_SESSION to trace problematic database sessions. Steps:

  • Get the SID and SERIAL# for the process you want to trace.
SQL> select sid, serial# from sys.v_$session where ...
       SID    SERIAL#
---------- ----------
         8      13607
  • Enable tracing for your selected process:
SQL> ALTER SYSTEM SET timed_statistics = true;
SQL> execute dbms_system.set_sql_trace_in_session(8, 13607, true);
  • Ask the user to run just enough of the workload to demonstrate the problem.
  • Disable tracing for your selected process:
SQL> execute dbms_system.set_sql_trace_in_session(8,13607, false);
  • Look for trace file in USER_DUMP_DEST:
$ cd /app/oracle/admin/oradba/udump
$ ls -ltr
total 8
-rw-r-----    1 oracle   dba         2764 Mar 30 12:37 ora_9294.trc
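From Oracle 10g onwards, DBMS_MONITOR is the supported alternative to DBMS_SYSTEM for tracing another session, and it lets you request wait and bind information explicitly (a sketch, using the SID and SERIAL# obtained as above):

```sql
-- Enable tracing for SID 8, SERIAL# 13607 (values from the query above),
-- including wait events but not bind values
BEGIN
  dbms_monitor.session_trace_enable(session_id => 8,
                                    serial_num => 13607,
                                    waits      => TRUE,
                                    binds      => FALSE);
END;
/

-- ... reproduce the problem, then switch tracing off again:
BEGIN
  dbms_monitor.session_trace_disable(session_id => 8, serial_num => 13607);
END;
/
```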


Tracing an entire database

To enable SQL tracing for the entire database, execute:
ALTER SYSTEM SET sql_trace = true SCOPE=MEMORY;
To stop, execute:
ALTER SYSTEM SET sql_trace = false SCOPE=MEMORY;


Identifying trace files

Trace output is written to the database's UDUMP directory.
The default name for a trace file is INSTANCE_ora_PID_TRACEID.trc where:
  • INSTANCE is the name of the Oracle instance;
  • PID is the operating system process ID (V$PROCESS.SPID); and
  • TRACEID is the TRACEFILE_IDENTIFIER string of your choosing.
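Rather than hunting through UDUMP by timestamp, you can ask the database for the current session's trace file name. On 11g and later, V$PROCESS exposes it directly (a sketch; on older releases you have to assemble the name from the components above):

```sql
-- trace file of the current session (11g+: v$process.tracefile)
SELECT p.tracefile
  FROM v$process p
  JOIN v$session s ON s.paddr = p.addr
 WHERE s.sid = SYS_CONTEXT('USERENV', 'SID');
```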


Size of trace files

The trace file size is limited by the parameter MAX_DUMP_FILE_SIZE. If you don't specify the K or M suffix, the unit of this parameter is OS blocks.
Be sure this parameter is set to a value high enough for your purpose (e.g. some MB). Of course this depends on the number and complexity of statements that have to be run while tracing. If this value is set too low, the dump file size limit may be reached before the crucial statements execute, and the trace file will be closed before the interesting parts can be recorded in it.
On the other hand, when this parameter is set to UNLIMITED (the default value), and the program being traced keeps running with tracing still enabled, the trace file can grow without limit, that is, until the associated file system or disk is full. A DBA can stop the trace of a session using the DBMS_MONITOR (10g and up), DBMS_SYSTEM or DBMS_SUPPORT package.
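Setting the limit explicitly before tracing avoids both truncated and runaway trace files (a sketch; pick a size that suits your workload):

```sql
-- allow up to 100 MB of trace output for this session
ALTER SESSION SET max_dump_file_size = '100M';

-- or remove the limit entirely (watch your disk space)
ALTER SESSION SET max_dump_file_size = unlimited;
```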


Formatting output

Raw trace output is quite hard to read. However, Oracle provides a utility, called TKProf, that can be used to format it.

==========================================================================

TKPROF Usage


TKPROF allows you to analyse a trace file to determine where time is being spent and what query plans are being used on SQL statements.

   1 - Set TIMED_STATISTICS if required at database level.
  
   2 - Get SQL_TRACE output for the session you wish to monitor
   
   3 - Find the appropriate trace file (In USER_DUMP_DEST, default
       $ORACLE_HOME/rdbms/log on Unix). 
       You can find the most recent trace files on Unix with the command:
         ls -ltr
       This will list the most recent files LAST

   4 - Run tkprof on the trace file thus:

         tkprof tracefile outfile [explain=user/password] [options...]

TKPROF Options
~~~~~~~~~~~~~~
   print=integer    List only the first 'integer' SQL statements.
   insert=filename  List SQL statements and data inside INSERT statements.
   sys=no           TKPROF does not list SQL statements run as user SYS.
   record=filename  Record statements found in the trace file.
   sort=option      Set of zero or more of the following sort options:

     prscnt  number of times parse was called
     prscpu  cpu time parsing
     prsela  elapsed time parsing
     prsdsk  number of disk reads during parse
     prsqry  number of buffers for consistent read during parse
     prscu   number of buffers for current read during parse
     prsmis  number of misses in library cache during parse

     execnt  number of times execute was called
     execpu  cpu time spent executing
     exeela  elapsed time executing
     exedsk  number of disk reads during execute
     exeqry  number of buffers for consistent read during execute
     execu   number of buffers for current read during execute
     exerow  number of rows processed during execute
     exemis  number of library cache misses during execute

     fchcnt  number of times fetch was called
     fchcpu  cpu time spent fetching
     fchela  elapsed time fetching
     fchdsk  number of disk reads during fetch
     fchqry  number of buffers for consistent read during fetch
     fchcu   number of buffers for current read during fetch
     fchrow  number of rows fetched

     userid  userid of user that parsed the cursor
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
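Putting the options together, a typical invocation sorts statements by the biggest time consumers and suppresses recursive SYS SQL (a sketch; the file names and the scott/tiger credentials are illustrative):

```shell
# format the raw trace, sorting by parse/execute/fetch elapsed time,
# hiding SYS-owned recursive SQL
tkprof orcl_ora_9294_mysqltrace.trc report.txt \
    sort=prsela,exeela,fchela sys=no

# optionally include execution plans by connecting as the parsing user
tkprof orcl_ora_9294_mysqltrace.trc report.txt \
    explain=scott/tiger sort=exeela sys=no
```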


TKProf Interpretation
TKProf is an executable that parses Oracle trace files to produce more readable output.
In its default mode, all of the information TKProf reports is derived from the base trace file.

TKProf Structure

TKProf output for an individual cursor has the following structure:

SELECT NULL FROM DUAL

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0          0
Execute      1      0.00       0.00          0          0          0          0
Fetch        2      0.00       0.00          0          3          0          1
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total        4      0.00       0.00          0          3          0          1

Misses in library cache during parse: 0
Optimizer goal: FIRST_ROWS
Parsing user id: 271

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS FULL DUAL (cr=3 r=0 w=0 time=21 us)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  SQL*Net message from client                     2        0.00          0.00
Overall the structure is:
  • SQL Statement
  • Parse/Execute/Fetch statistics and timings
  • Library Cache information
  • Row source plan
  • Events waited for by the statement
Parse/Execute/Fetch statistics and timings

This section contains the bulk of the useful timing information for each statement. It can be used in conjunction with the 'Row source plan' and 'Events waited for by the statement' sections to give the full picture. Columns in the Parse/Execute/Fetch table have the following meanings:

  call      Statistics for each cursor's activity are divided into 3 areas,
            and a total is also calculated:
              Parse    statistics from parsing the cursor. This includes
                       information for plan generation etc.
              Execute  statistics for the execution phase of the cursor
              Fetch    statistics for actually fetching the rows
  count     number of times each individual activity has been performed
            on this particular cursor
  cpu       cpu time used by this cursor
  elapsed   elapsed time for this cursor (includes the cpu time)
  disk      the number of blocks read from disk. Generally it would be
            preferable for blocks to be read from the buffer cache rather
            than from disk.
  query     incremented if a buffer is read in consistent mode. A consistent
            mode buffer is one that has been generated to give a consistent
            read snapshot for a long running transaction.
  current   incremented if a buffer is found in the buffer cache that is new
            enough for the current transaction and is in current mode (and is
            not a CR buffer). This applies to buffers that have been read in
            to the cache as well as buffers that already exist in the cache
            in current mode.
  rows      rows retrieved by this step

Library Cache information

Tracing a statement records some information regarding library cache usage, which TKProf externalises in this section. Most important here is "Misses in library cache during parse:", which shows whether or not a statement is being re-parsed. If a statement is being shared well, then you should see a minimal number of misses here (1 or 0 preferably). If sharing is not occurring, high values in this field can indicate that.
Row source plan

This section displays the access path used at execution time for each statement, along with timings and the actual row counts returned by each step in the plan. This can be very useful for a number of reasons. Example:

Rows     Row Source Operation
-------  ---------------------------------------------------
[A]   1  TABLE ACCESS FULL DUAL [B] (cr=3 [C] r=0 [D] w=0 [E] time=21 us [F])

>> Row count [A] - the row counts output in this section are the actual number of rows returned at each step in the query execution. These actual counts can be compared with the estimated cardinalities (row counts) from an optimizer explain plan. Any differences may indicate a statistical problem that may result in a poor plan choice.
>> Row Source Operation [B] - shows the operation executed at this step in the plan.
>> IO Stats - for each step in the plan, [C] is the consistent reads, [D] is the physical reads and [E] is the writes. These statistics can be useful in identifying steps that read or write a particularly large proportion of the overall data.
>> Timing - [F] shows the cumulative elapsed time for the step and the steps that preceded it. This is very useful when looking for the point in an access path that takes all the time. By looking for the point where the majority of the time originates, it is possible to narrow down a number of problems.
Events waited for by the statement

This section displays all wait events that a statement has waited for during tracing. It can be very useful when used in conjunction with the statistics and row source information for tracking down the causes of problems associated with long wait times. High numbers of waits, or waits with a long total duration, may be candidates for investigation depending on the wait itself.
==============================================================
Potential TKProf Usage Examples

Spotting Relatively High Resource Usage

update ... where ...

-----------------------------------------------------------------------
| call    | count | cpu | elapsed | disk | query   | current | rows   |
|---------|-------|-----|---------|------|---------|---------|--------|
| Parse   | 1     | 7   | 122     | 0    | 0       | 0       | 0      |
| Execute | 1     | 75  | 461     | 5    | [H] 297 | [I] 3   | [J] 1  |
| Fetch   | 0     | 0   | 0       | 0    | 0       | 0       | 0      |
-----------------------------------------------------------------------

This statement is a single execute of an update. [H] shows that this query is visiting 297 buffers to find the rows to update. [I] shows that only 3 buffers are visited performing the update, and [J] shows that only 1 row is updated. Reading 297 buffers to update 1 row is a lot of work and would tend to indicate that the access path being used is not particularly efficient. Perhaps there is an index missing that would improve the access performance?

Spotting Over Parsing

select ...

-------------------------------------------------------------------------
| call    | count | cpu     | elapsed | disk | query  | current | rows  |
|---------|-------|---------|---------|------|--------|---------|-------|
| Parse   | [M] 2 | [N] 221 | 329     | 0    | 45     | 0       | 0     |
| Execute | 3     | [O] 9   | [P] 17  | 0    | 0      | 0       | 0     |
| Fetch   | 3     | 6       | 8       | 0    | [L] 4  | 0       | [K] 1 |
-------------------------------------------------------------------------

Misses in library cache during parse: 2 [Q]

Here we have a select that we suspect may be a candidate for over parsing. [K] shows that the query has returned 1 row. [L] shows that 4 buffers were read to get this row back. This is fine.
[M] shows that the statement is parsed twice, which is not desirable, especially as the parse cpu usage is high ([N]) in comparison to the execute figures [O] and [P] (i.e. the elapsed time for execute is 17 seconds, but the statement spends over 300 seconds determining the access path etc. in the parse phase). [Q] shows that these parses are hard parses. If [Q] was 1, then the statement would have had 1 hard parse followed by a soft parse (which just looks up the already-parsed detail in the library cache). This is not a particularly bad example in terms of total counts, since the query has only been executed a few times. However, if this pattern is reproduced for each execution, this could be a significant issue. Excessive parsing should be avoided as far as possible by ensuring that code is shared:
  • using bind variables
  • making the shared pool large enough to hold query definitions in memory long enough for them to be reused.
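The bind-variable point can be illustrated with a sketch (the table and columns are from the demo SCOTT schema; names are illustrative). Each distinct literal below produces a separate statement and a hard parse, while the bind version is parsed once and shared:

```sql
-- each literal value creates a distinct statement text => hard parse each time
SELECT ename FROM emp WHERE empno = 7369;
SELECT ename FROM emp WHERE empno = 7499;

-- with a bind variable the statement text is identical => shared cursor
VARIABLE v_empno NUMBER
EXECUTE :v_empno := 7369
SELECT ename FROM emp WHERE empno = :v_empno;
```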
Spotting Queries that Execute too Frequently

The following query has a high elapsed time and is a candidate for investigation:

UPDATE ... SET ... WHERE COL = :bind1;

call     count       cpu    elapsed       disk      query    current       rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        0      0.00       0.00          0          0          0          0
Execute 488719  66476.95   66557.80          1     488729    1970566     488719
Fetch        0      0.00       0.00          0          0          0          0
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total   488719  66476.95   66557.80          1     488729    1970566     488719

From the above, the update executes 488,719 times and takes in total ~65,000 seconds to do this.
The majority of the time is spent on CPU. A single row is updated per execution. For each row updated, ~1 buffer is queried, and ~2 million buffers are visited to perform the updates. On average the elapsed time is ~0.1 second per execution. A sub-second execution time would normally be acceptable for most queries, but if the query is not scalable and is executed numerous times, then the time can quickly add up to a large number. It would appear that in this case the update may be part of a loop where individual values are passed in and 1 row is updated per value. This structure does not scale to large numbers of values, meaning that it can become inefficient. One potential solution is to try to 'batch up' the updates so that multiple rows are updated within the same execution. As Oracle releases have progressed, a number of optimizations and enhancements have been made to improve the handling of 'batch' operations and to make them more efficient. In this way, code modifications to replace frequently executed, relatively inefficient statements with more scalable operations can have a significant impact.
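One way to 'batch up' such single-row updates, as suggested above, is to replace the per-value loop with a single set-based statement (a sketch; the table, column and staging-table names are made up):

```sql
-- instead of 488,719 executions of:
--   UPDATE t SET status = 'DONE' WHERE col = :bind1;
-- drive one execution from a staging table holding all the values:
UPDATE t
   SET status = 'DONE'
 WHERE col IN (SELECT val FROM staging_vals);
```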
============================================================================
SQL trace files contain detailed timing information. By default, Oracle does not track timing, so all timing figures in trace files will show as zero. If you would like to see legitimate timing information, then you need to enable timed statistics. You can do this at the instance level by setting the following parameter in the instance parameter file and restarting the instance:
timed_statistics = true
You can also dynamically enable or disable timed statistics collection at either the instance or the session level with the following commands:
ALTER SYSTEM SET timed_statistics = TRUE|FALSE;
ALTER SESSION SET timed_statistics = TRUE|FALSE;
There is no known way to enable timed statistics collection for an individual session from another session (akin to the SYS.dbms_system.set_sql_trace_in_session built-in).
There is very high overhead associated with enabling SQL trace. Some DBAs believe the performance penalty could be over 25%. Another concern is that enabling SQL trace causes the generation of potentially large trace files. For these reasons, you should use SQL trace sparingly. Only trace what you need to trace and think very carefully before enabling SQL trace at the instance level.
On the other hand, there is little, if any, measurable performance penalty in enabling timed statistics collection. Many DBAs run production databases with timed statistics collection enabled at the system level so that various system statistics (more than just SQL trace files) will include detailed timing information.
