Bitmap index usecases

In my previous post I talked about bitmap indexes, where they should be used and where they shouldn’t be used or what you have to keep in mind when using them. Today I show you a example where they are very helpful. Remember I said that bitmap indexes where designed for read intensive environments and that they are useful on columns with low cardinality. Lets assume you have a table for all your employees in your company. The company has three different locations, lets say 1=New York; 2=London; 3=Paris:

SQL> CREATE TABLE employees
(
 empId      NUMBER,
 deptId     NUMBER,
 name       VARCHAR2(50),
 gender     VARCHAR2(6)
);

Table created.

Now I insert lets say 100,000 employees distributed over the three locations. The distribution on the locations is based on a random number out of DBMS_RANDOM. Same for the gender:

SQL> BEGIN
 2    FOR n IN 1..100000 LOOP
 3      INSERT INTO employees VALUES (n, ROUND(DBMS_RANDOM.VALUE(0.5,3.5),0), 'Employee ' || n, DECODE(ROUND(DBMS_RANDOM.VALUE(1,2),0),1,'MALE','FEMALE'));
 4    END LOOP;
 5    COMMIT;
 6  END;
 7  /

PL/SQL procedure successfully completed.

A quick look at the distributions:

SQL> SELECT gender, COUNT(*) FROM employees GROUP BY gender;

GENDER   COUNT(*)
------ ----------
FEMALE      49985
MALE        50015

SQL> SELECT deptId, COUNT(*) from employees GROUP BY deptId;

 DEPTID   COUNT(*)
---------- ----------
 1      33532
 2      33456
 3      33012

So as you can see the distribution is good and the cardinality is “bad”. As the next step I create a normal index on it and look how it behaves.

SQL> CREATE INDEX EMPLOYEES_GENDER_I001 ON employees (gender);

Index created.

SQL> CREATE INDEX EMPLOYEES_DEPT_I001 ON employees (deptId);

Index created.

SQL> exec DBMS_STATS.GATHER_TABLE_STATS(USER, 'EMPLOYEES', CASCADE=>TRUE);

PL/SQL procedure successfully completed.

Now lets have a closer look at some selects and their explain plans. First I just count the number of female employees and the number of employees in the New York (deptId=1) office. As I have indexes on both columns Oracle should retrieve the data directly from them – no need to go to the table:

SQL> set autotrace traceonly explain;
SQL> SELECT COUNT(*) FROM employees WHERE gender = 'FEMALE';

Execution Plan
----------------------------------------------------------
Plan hash value: 938208788

-----------------------------------------------------------------------------------------------
| Id  | Operation             | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                       |     1 |     7 |    88  (18)| 00:00:01 |
|   1 |  SORT AGGREGATE       |                       |     1 |     7 |            |          |
|*  2 |   INDEX FAST FULL SCAN| EMPLOYEES_GENDER_I001 | 50108 |   342K|    88  (18)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 2 - filter("GENDER"='FEMALE')

SQL> SELECT COUNT(*) FROM employees WHERE deptId = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 3728864332

-----------------------------------------------------------------------------------------
| Id  | Operation         | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                     |     1 |     3 |    74   (9)| 00:00:01 |
|   1 |  SORT AGGREGATE   |                     |     1 |     3 |            |          |
|*  2 |   INDEX RANGE SCAN| EMPLOYEES_DEPT_I001 | 34186 |   100K|    74   (9)| 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 2 - access("DEPTID"=1)

I got an INDEX FAST FULL SCAN and an INDEX RANGE SCAN – fair enough. Costs are 88 and 74. Now what if I have a more complex statement? What if I want to have all male employees who are either in the New York or the London office:

SQL> SELECT COUNT(*) FROM employees WHERE gender = 'MALE' AND deptId IN (1,2);

Execution Plan
----------------------------------------------------------
Plan hash value: 1712853197

--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |     1 |     9 |   177  (16)| 00:00:01 |
|   1 |  SORT AGGREGATE    |           |     1 |     9 |            |          |
|*  2 |   TABLE ACCESS FULL| EMPLOYEES | 34047 |   299K|   177  (16)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 2 - filter("GENDER"='MALE' AND ("DEPTID"=1 OR "DEPTID"=2))

This leads already to a TABLE ACCESS FULL with costs of 177. Now the same story with bitmap indexes on the columns:

SQL> DROP INDEX EMPLOYEES_GENDER_I001;

Index dropped.

SQL> DROP INDEX EMPLOYEES_DEPT_I001;

Index dropped.

SQL> CREATE BITMAP INDEX EMPLOYEES_GENDER_I001 ON employees (gender);

Index created.

SQL> CREATE BITMAP INDEX EMPLOYEES_DEPT_I001 ON employees (deptId);

Index created.

SQL> exec DBMS_STATS.GATHER_TABLE_STATS(USER, 'EMPLOYEES', CASCADE=>TRUE);

PL/SQL procedure successfully completed.

First lets do the counts again:

SQL> SELECT COUNT(*) FROM employees WHERE gender = 'FEMALE';

Execution Plan
----------------------------------------------------------
Plan hash value: 3004312514

-----------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                       |     1 |     7 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |                       |     1 |     7 |            |          |
|   2 |   BITMAP CONVERSION COUNT   |                       | 51145 |   349K|     3   (0)| 00:00:01 |
|*  3 |    BITMAP INDEX SINGLE VALUE| EMPLOYEES_GENDER_I001 |       |       |            |          |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 3 - access("GENDER"='FEMALE')

SQL> SELECT COUNT(*) FROM employees WHERE deptId = 1;

Execution Plan
----------------------------------------------------------
Plan hash value: 2661849472

---------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                     |     1 |     3 |     3   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |                     |     1 |     3 |            |          |
|   2 |   BITMAP CONVERSION COUNT   |                     | 34039 |    99K|     3   (0)| 00:00:01 |
|*  3 |    BITMAP INDEX SINGLE VALUE| EMPLOYEES_DEPT_I001 |       |       |            |          |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 3 - access("DEPTID"=1)

As you can see the optimizer chose another plan now for both tables. Instead of an INDEX FAST FULL SCAN and an INDEX RANGE SCAN I got now both times a BITMAP INDEX SINGLE VALUE and a BITMAP CONVERSION COUNT. Costs dropped from 88 and 74 to 3! Now the complexer statement:

SQL> SELECT COUNT(*) FROM employees WHERE gender = 'MALE' AND deptId IN (1,2);

Execution Plan
----------------------------------------------------------
Plan hash value: 597175463

-------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                       |     1 |     9 |     9   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE               |                       |     1 |     9 |            |          |
|   2 |   BITMAP CONVERSION COUNT     |                       | 33998 |   298K|     9   (0)| 00:00:01 |
|   3 |    BITMAP AND                 |                       |       |       |            |          |
|*  4 |     BITMAP INDEX SINGLE VALUE | EMPLOYEES_GENDER_I001 |       |       |            |          |
|   5 |     BITMAP OR                 |                       |       |       |            |          |
|*  6 |      BITMAP INDEX SINGLE VALUE| EMPLOYEES_DEPT_I001   |       |       |            |          |
|*  7 |      BITMAP INDEX SINGLE VALUE| EMPLOYEES_DEPT_I001   |       |       |            |          |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 4 - access("GENDER"='MALE')
 6 - access("DEPTID"=1)
 7 - access("DEPTID"=2)

I got again another plan and here you can see the power of bitmap indexes now. Oracle takes the department ids and does now a BINARY OR on the bitmap. Out of the result it does a BINARY AND with the gender bitmap and look at the result: The costs dropped from 177 on a FULL TABLE SCAN to 9 on some bitmap operations! So as you can see on this short demonstration bitmap indexes are quite powerful!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.