Ask Tom search engine

Today my boss asked me if I know how to fix the search engine for FireFox for AskTom.oracle.com since Tom Kyte has moved it to APEX. Actually I didn’t have the search engine even installed anymore (hope that means that I’m already so good with Oracle myself ūüėČ ) but thought that it would be nice to have it again.

Therefore I quickly build the search engine for the new Ask Tom site.

To install it, download this file by right-clicking on it and choosing “Save Link As”:

SearchEngineInstaller.png

Unfortunately – or better fortunately – WordPress doesn’t allow Javascript within posts nor xml files as media. Therefore I have to work with faked png files here.
After you download the file rename it to “SearchEngineInstaller.html” and open the file in your browser.
Then click on “Install Ask Tom search engine”, install the engine and “Mission accomplished”!

If you are interested in the “code” you can download the search engine file here: AskTomSearchEngine.png
Again with right click and “Save Link As”. Then rename it to “AskTomSearchEngine.xml”

A few facts about TimesTen

  • TimesTen gets usually deployed in the Middle Tier – same location where the application runs
  • TimesTen is persistent and can run as the only database on a system
  • TimesTen persists to the disk over a background process
  • TimesTen supports High Availability
  • TimesTen can be used as cache between the application and the Oracle database
  • TimesTen can be linked into applications directly – the calls to TimesTen happens over normal subroutine calls in the code which require no context switches on the system
  • You can cache either a group of related tables or even a subset of specific rows
  • TimesTen supports read-only and update able caches
  • TimesTen supports “DynamicLoad” which loads the data from Oracle when you need it
  • You can have as much read-only subscribers as you want
  • TimesTen supports SQL and PL/SQL
  • TimesTen is certified on all main application servers and OR-Mapping tools like Hibernate

The big picture

A couple of weeks ago a customer had some troubles with the overall performance. He complained that everything is slow and (of course) nothing changed since the last few weeks. They also immediately blamed the database on their side to be the issue. The usual stuff also, I’m sure that sounds just too familiar to you.

So as they “identified” the database already as being the issue my team requested some AWR reports and as we got them I noticed a strange but all to common behavior. My team mates got the AWR reports, went to the SQL Statistics, sections “SQL ordered by Elapsed Time” and “SQL ordered by CPU Time” and identified immediately a materialized view rebuild as cause for the problem. That looked like this:

SQL ordered by Elapsed Time

  • Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code.
  • % Total DB Time is the Elapsed Time of the SQL statement divided into the Total Database Time multiplied by 100
Elapsed Time (s) CPU Time (s) Executions Elap per Exec (s) % Total DB Time SQL Id SQL Module SQL Text
1,343 679 1 1342.85 33.00 f20ccnxhvbk65 DECLARE job BINARY_INTEGER := …
770 323 1 770.24 18.93 1usnr4gmcq60d /* MV_REFRESH (DEL) */ delete …
571 355 1 571.23 14.04 gz04689vd55db /* MV_REFRESH (INS) */INSERT /…
311 311 1 310.91 7.64 0vhmfumrjchnh SQL*Plus BEGIN dm_incr_symbols_post_pro…
288 9 0 7.09 bb3f2gjndvjss oracle@crptd1 (TNS V1-V3) SELECT /*+ OPAQUE_TRANSFORM */…
258 258 1 257.76 6.34 75vtwb7j4jzdm SQL*Plus INSERT INTO SYMB_EXTRACTT SELE…
149 121 1 148.80 3.66 90wtn50vy6af6 DECLARE job BINARY_INTEGER := …
114 103 1 113.75 2.80 9993mp6h7kqkp INSERT /*+ BYPASS_RECURSIVE_CH…
48 38 2 24.00 1.18 3nkcg1h5ysqss DECLARE job BINARY_INTEGER := …
48 38 2 23.98 1.18 fvb5prrr7b0c3 MERGE INTO FT_E_UPS1 UPS1 USIN…

Back to SQL Statistics
Back to Top

SQL ordered by CPU Time

  • Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code.
  • % Total DB Time is the Elapsed Time of the SQL statement divided into the Total Database Time multiplied by 100
CPU Time (s) Elapsed Time (s) Executions CPU per Exec (s) % Total DB Time SQL Id SQL Module SQL Text
679 1,343 1 679.06 33.00 f20ccnxhvbk65 DECLARE job BINARY_INTEGER := …
355 571 1 355.50 14.04 gz04689vd55db /* MV_REFRESH (INS) */INSERT /…
323 770 1 323.18 18.93 1usnr4gmcq60d /* MV_REFRESH (DEL) */ delete …
311 311 1 310.89 7.64 0vhmfumrjchnh SQL*Plus BEGIN dm_incr_symbols_post_pro…
258 258 1 257.74 6.34 75vtwb7j4jzdm SQL*Plus INSERT INTO SYMB_EXTRACTT SELE…
121 149 1 121.05 3.66 90wtn50vy6af6 DECLARE job BINARY_INTEGER := …
103 114 1 102.58 2.80 9993mp6h7kqkp INSERT /*+ BYPASS_RECURSIVE_CH…
38 48 2 19.20 1.18 3nkcg1h5ysqss DECLARE job BINARY_INTEGER := …
38 48 2 19.20 1.18 fvb5prrr7b0c3 MERGE INTO FT_E_UPS1 UPS1 USIN…
29 29 12 2.38 0.70 68930z34bm3db SQL*Plus select ‘file[‘ || substr(trim(…
9 288 0 7.09 bb3f2gjndvjss oracle@dftg1 (TNS V1-V3) SELECT /*+ OPAQUE_TRANSFORM */…

So here we have a PL/SQL job which does the materialized view refresh (first line) and the statements for the refresh itself as second and third line. For completeness here the first statement:


DECLARE
 job BINARY_INTEGER := :job;
 next_date DATE := :mydate;
 broken BOOLEAN := FALSE;
BEGIN
 dbms_refresh.refresh('"ABTGHS"."DB_CREAT_STST_PWOR"');
 :mydate := next_date;
 IF broken THEN
 :b := 1;
 ELSE
 :b := 0;
 END IF;
END;

So all what they did was to go there, look the first SQL with high elapsed and cpu time and nearly reported back to them that this is the problem and they have to solve this. I call this the “lucky shot method”. Sometimes when you are lucky than the reason for the issue on the DB is a bad SQL or a bunch of bad SQLs which max out the DB on CPU power, or I/O or whatever. In such a case you just go to those sections, identify the SQLs, fix them and everything is good again – you were lucky. This works sometimes and you are a hero because it took you just 5 minutes for fixing the issue but sometimes it doesn’t and you blame it on some weird constellation of OS, network and something else so that nobody recognizes that you are just a fool and didn’t look at the big picture. Thinking of that I noticed that this is just far too common in IT. Tech admins, DBA, developers – in every section you have people like this. Looking 5 minutes into the issue and telling you then that this and that is the issue. You go ahead and fix them but still no change. So next round trip, and next, and next, and next. All could have been prevented if the person had just once a look into the big picture…

So what is the big picture, what do I mean by that?

I’m a fan of knowing what happens and why. Always analyze all the information you got, even request some more if you think that some vital information is missing and make your conclusions out of that. Stop the try and error method, the lucky shot method. The AWR report I got from the customer is an all too good example. There were 2 simple lines which made me curious:

Snap Id Snap Time Sessions Cursors/Session
Begin Snap: 17934 05-May-10 09:00:59 129 144.1
End Snap: 17935 05-May-10 10:00:12 130 143.0
Elapsed: 59.21 (mins)
DB Time: 67.81 (mins)

I’m talking about the last two. The first tells me that the snapshot which got compared are in a time range of 59.21 minutes. The DB time, so the time when the DB was actually working was 67.81 minutes. So something looks strange here. If I would be in a single core environment I wouldn’t be over my 59.21. If I would be in a dual-core environment then the database was just working 57% of the time (59.21 x 2 cores = 118.42 – DB time multiplies by the amount of cores where work was performed parallel). Next I had a look into the locks going on. If you have high locking then the DB time is also idle but the throughput is low. But it turned that this was also no issue. The next information confirmed then that the DB wasn’t the problem at all:

Operating System Statistics

Statistic Total
AVG_BUSY_TIME 33,088
AVG_IDLE_TIME 321,941

The average idle time statistic is much higher than the busy time. So I requested the amount of cores on that system and it turned out that there are 14 available. A busy database would have a DB time of max. 828.94 minutes. No locking going on. I followed up with them and it turned out as I thought: The database wasn’t the problem at all. In fact it was pretty much idle over the whole time. The issue was on the application side. A java application which memory got filled up so that it had to do full garbage collection all the time to continue processing.

Instead of doing the lucky shot we prevented us and them from a lot of headache by looking into the big picture.

Bitmap indexes and cardinality

In my last two posts I talked a little bit about bitmap indexes, how they work and why they lock. I also mentioned that they should be used for low cardinality but that there is no cut-and-dried answer about what low cardinality is. I’m not starting here to talk about cardinality on bitmap indexes, actually Richard Foote did a much better job on that so if you interested in that check out the following blog posts from him:

Bitmap index usecases

In my previous post I talked about bitmap indexes, where they should be used and where they shouldn’t be used or what you have to keep in mind when using them. Today I show you a example where they are very helpful. Remember I said that bitmap indexes where designed for read intensive environments and that they are useful on columns with low cardinality. Lets assume you have a table for all your employees in your company. The company has three different locations, lets say 1=New York; 2=London; 3=Paris:

SQL> CREATE TABLE employees
(
 empId      NUMBER,
 deptId     NUMBER,
 name       VARCHAR2(50),
 gender     VARCHAR2(6)
);

Table created.

Now I insert lets say 100,000 employees distributed over the three locations. The distribution on the locations is based on a random number out of DBMS_RANDOM. Same for the gender:

SQL> BEGIN
 2    FOR n IN 1..100000 LOOP
 3      INSERT INTO employees VALUES (n, ROUND(DBMS_RANDOM.VALUE(0.5,3.5),0), 'Employee ' || n, DECODE(ROUND(DBMS_RANDOM.VALUE(1,2),0),1,'MALE','FEMALE'));
 4    END LOOP;
 5    COMMIT;
 6  END;
 7  /

PL/SQL procedure successfully completed.

A quick look at the distributions:

SQL> SELECT gender, COUNT(*) FROM employees GROUP BY gender;

GENDER   COUNT(*)
------ ----------
FEMALE      49985
MALE        50015

SQL> SELECT deptId, COUNT(*) from employees GROUP BY deptId;

 DEPTID   COUNT(*)
---------- ----------
 1      33532
 2      33456
 3      33012

So as you can see the distribution is good and the cardinality is “bad”. As the next step I create a normal index on it and look how it behaves.

SQL> CREATE INDEX EMPLOYEES_GENDER_I001 ON employees (gender);

Index created.

SQL> CREATE INDEX EMPLOYEES_DEPT_I001 ON employees (deptId);

Index created.

SQL> exec DBMS_STATS.GATHER_TABLE_STATS(USER, 'EMPLOYEES', CASCADE=>TRUE);

PL/SQL procedure successfully completed.

Now lets have a closer look at some selects and their explain plans. First I just count the number of female employees and the number of employees in the New York (deptId=1) office. As I have indexes on both columns Oracle should retrieve the data directly from them – no need to go to the table:

SQL> set autotrace traceonly explain;
SQL> SELECT COUNT(*) FROM employees WHERE gender = 'FEMALE';

Execution Plan
----------------------------------------------------------
Plan hash value: 938208788

-----------------------------------------------------------------------------------------------
| Id  | Operation             | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                       |     1 |     7 |    88  (18)| 00:00:01 |
|   1 |  SORT AGGREGATE       |                       |     1 |     7 |            |          |
|*  2 |   INDEX FAST FULL SCAN| EMPLOYEES_GENDER_I001 | 50108 |   342K|    88  (18)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 2 - filter("GENDER"='FEMALE')

SQL> SELECT COUNT(*) FROM employees WHERE deptId = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 3728864332

-----------------------------------------------------------------------------------------
| Id  | Operation         | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                     |     1 |     3 |    74   (9)| 00:00:01 |
|   1 |  SORT AGGREGATE   |                     |     1 |     3 |            |          |
|*  2 |   INDEX RANGE SCAN| EMPLOYEES_DEPT_I001 | 34186 |   100K|    74   (9)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 2 - access("DEPTID"=1)

I got an INDEX FAST FULL SCAN and an INDEX RANGE SCAN – fair enough. Costs are 88 and 74. Now what if I have a more complex statement? What if I want to have all male employees who are either in the New York or the London office:

SQL> SELECT COUNT(*) FROM employees WHERE gender = 'MALE' AND deptId IN (1,2);

Execution Plan
----------------------------------------------------------
Plan hash value: 1712853197

--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |     1 |     9 |   177  (16)| 00:00:01 |
|   1 |  SORT AGGREGATE    |           |     1 |     9 |            |          |
|*  2 |   TABLE ACCESS FULL| EMPLOYEES | 34047 |   299K|   177  (16)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 2 - filter("GENDER"='MALE' AND ("DEPTID"=1 OR "DEPTID"=2))

This leads already to a TABLE ACCESS FULL with costs of 177. Now the same story with bitmap indexes on the columns:

SQL> DROP INDEX EMPLOYEES_GENDER_I001;

Index dropped.

SQL> DROP INDEX EMPLOYEES_DEPT_I001;

Index dropped.

SQL> CREATE BITMAP INDEX EMPLOYEES_GENDER_I001 ON employees (gender);

Index created.

SQL> CREATE BITMAP INDEX EMPLOYEES_DEPT_I001 ON employees (deptId);

Index created.

SQL> exec DBMS_STATS.GATHER_TABLE_STATS(USER, 'EMPLOYEES', CASCADE=>TRUE);

PL/SQL procedure successfully completed.

First lets do the counts again:

SQL> SELECT COUNT(*) FROM employees WHERE gender = 'FEMALE';

Execution Plan
----------------------------------------------------------
Plan hash value: 3004312514

-----------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                       |     1 |     7 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |                       |     1 |     7 |            |          |
|   2 |   BITMAP CONVERSION COUNT   |                       | 51145 |   349K|     3   (0)| 00:00:01 |
|*  3 |    BITMAP INDEX SINGLE VALUE| EMPLOYEES_GENDER_I001 |       |       |            |          |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 3 - access("GENDER"='FEMALE')

SQL> SELECT COUNT(*) FROM employees WHERE deptId = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 2661849472

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |     1 |     3 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |                     |     1 |     3 |            |          |
|   2 |   BITMAP CONVERSION COUNT   |                     | 34039 |    99K|     3   (0)| 00:00:01 |
|*  3 |    BITMAP INDEX SINGLE VALUE| EMPLOYEES_DEPT_I001 |       |       |            |          |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 3 - access("DEPTID"=1)

As you can see the optimizer chose another plan now for both tables. Instead of an INDEX FAST FULL SCAN and an INDEX RANGE SCAN I got now both times a BITMAP INDEX SINGLE VALUE and a BITMAP CONVERSION COUNT. Costs dropped from 88 and 74 to 3! Now the complexer statement:

SQL> SELECT COUNT(*) FROM employees WHERE gender = 'MALE' AND deptId IN (1,2);

Execution Plan
----------------------------------------------------------
Plan hash value: 597175463

-------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                       |     1 |     9 |     9   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE               |                       |     1 |     9 |            |          |
|   2 |   BITMAP CONVERSION COUNT     |                       | 33998 |   298K|     9   (0)| 00:00:01 |
|   3 |    BITMAP AND                 |                       |       |       |            |          |
|*  4 |     BITMAP INDEX SINGLE VALUE | EMPLOYEES_GENDER_I001 |       |       |            |          |
|   5 |     BITMAP OR                 |                       |       |       |            |          |
|*  6 |      BITMAP INDEX SINGLE VALUE| EMPLOYEES_DEPT_I001   |       |       |            |          |
|*  7 |      BITMAP INDEX SINGLE VALUE| EMPLOYEES_DEPT_I001   |       |       |            |          |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 4 - access("GENDER"='MALE')
 6 - access("DEPTID"=1)
 7 - access("DEPTID"=2)

I got again another plan and here you can see the power of bitmap indexes now. Oracle takes the department ids and does now a BINARY OR on the bitmap. Out of the result it does a BINARY AND with the gender bitmap and look at the result: The costs dropped from 177 on a FULL TABLE SCAN to 9 on some bitmap operations! So as you can see on this short demonstration bitmap indexes are quite powerful!