And Hello again!

Returning to the blogging world

It has been a bit more than two and a half years when I started to blog under the Oracle domain https://blogs.oracle.com/developer/. During all that time I focussed on a simple, single goal: Provide a blog for Oracle developers out there.

Continue reading “And Hello again!”

Blog update!

Hi everyone!

This blog hasn’t seen any updates since over a year now and I do apologize for that. However, the reason for that is simple. Last year I’ve decided to take on a new project. But before I was going to tell everybody about it I wanted to make sure that I also really got the time to do so. In the meantime I got the confidence that the project will succeed and that I have enough time to take it forward. So, how to say it… it’s almost like breaking up with a girlfriend. Well, here it is: I got a new one! Blog I mean of course. Last year in August I started a new blog: https://blogs.oracle.com/developer/
The intention of that blog is to show how write the right code for using Oracle technologies. Over the years I’ve seen a lot of badly written code that would never ever perform on top of technologies such as the Oracle Database. Unfortunately, I’ve also not seen too many folks that did understand Database performance and were good programmers. There seems to be a gap between the database world and the programming world, well I’m sure that that doesn’t come as a surprise to you. So, with this new blog I want to bridge the gap as good as I can.

Let’s stay friends!

However, this blog is not dead yet! The other blog is for a very specific goal. But there is more in my IT life than just that. And everything that doesn’t fit into the above one will sooner or later end up here. That being said, the above one will definitely get more attention from me going forward so you may want to add that one to your RSS Reader as well!

Last but not least: I’ve also finally managed to go live on Twitter. If you prefer that way of communication you can follow me @GeraldVenzl

2012 in review

The WordPress.com stats helper monkeys prepared a 2012 annual report for this blog.

Here’s an excerpt:

4,329 films were submitted to the 2012 Cannes Film Festival. This blog had 46,000 views in 2012. If each view were a film, this blog would power 11 Film Festivals

Click here to see the complete report.

Indexing strategies with Jonathan Lewis

Today I was part of a NYOUG seminar about “Indexing Strategies” held by Mr. Jonathan Lewis himself. If somebody doesn’t know Jon Lewis, he is one of the top Oracle gurus on the entire planet with more than 25 years of Oracle experience. He probably used Oracle before I was even born (1985)! The very interesting fact however is that he actually never ever worked for Oracle but still has all the deep inside knowledge. Usually those gurus came out from Oracle or are still working for them. I guess the great amount of years dealing with Oracle helped him there.

However, Jonathan Lewis is also one of my top favorite because he shares his knowledge on his personal blog: http://jonathanlewis.wordpress.com In fact, Jons blog is the only one which I never can catch up with because he posts literally quicker than I got time to read his fabulous posts… reminds me that I should update my Blogroll.

The seminar itself was simply great. It was not free but those 230 bucks where totally worth it. Not only is Jon Lewis a great speaker – was the first time that I actually saw him – he also gave great detailed insights into indexes how they work, what is all possible with them and why Oracle is actually taking an index and much more important when it is not! . Furthermore he combined his sessions with some real world examples that he himself dealt with in the past which makes the whole thing much less theoretical.

Of course I won’t cover the whole seminar here, but I’ll end with some interesting things that I took out of it – some of which I got reminded again, some which were new for me:

  • Indexing means: Getting to data quickly
  • Indexes focus/compact data
  • There is always a Trade-Off between loading and querying performance
  • Index maintenance is expensive – it introduces costs
  • It’s all about precision
  • Non-unique indexes include the rowid automatically – for Oracle there are no non-unique indexes
  • Index compression can save a lots of space on repetitive columns
  • Index compression means elimination of duplicates
  • Primary key constraints don’t necessarily need unique key indexes but only indexes with the columns in the right order
  • Reverse indexes scatter entries accros the line, clustering factor can end up terrible
  • Don’t duplicate indexes – e.g. FK constraints can share the first columns of another index
  • Difference between tables and indexes: Index blocks have to be at the right place
  • Don’t trust Oracle – test everything!

 

2010 in review

The stats helper monkeys at WordPress.com mulled over how this blog did in 2010, and here’s a high level summary of its overall blog health:

Healthy blog!

The Blog-Health-o-Meter™ reads Fresher than ever.

Crunchy numbers

Featured image

The average container ship can carry about 4,500 containers. This blog was viewed about 18,000 times in 2010. If each view were a shipping container, your blog would have filled about 4 fully loaded ships.

 

In 2010, there were 12 new posts, growing the total archive of this blog to 89 posts. There were 11 pictures uploaded, taking up a total of 164kb. That’s about a picture per month.

The busiest day of the year was December 10th with 127 views. The most popular post that day was BULK COLLECT & FORALL vs. CURSOR & FOR-LOOP.

Where did they come from?

The top referring sites in 2010 were google.co.in, google.com, forums.oracle.com, oracleportal.it, and yandex.ru.

Some visitors came searching, mostly for bulk collect, private strand flush not complete, bulk collect forall, bulk collect in cursor, and enq: dx – contention.

Attractions in 2010

These are the posts and pages that got the most views in 2010.

1

BULK COLLECT & FORALL vs. CURSOR & FOR-LOOP September 2007
27 comments

2

Oracle 10g enqueue waits May 2009
1 comment

3

Private strand flush not complete October 2008

4

Finally a good tool for Data Modeling: Oracle SqlDeveloper Data Modeling November 2008
6 comments

5

Prepared statement in batch mode vs. FORALL in PL/SQL November 2007

Why my job is safe

I spent the last couple of days with coding a new registration module for a private website that I administrate. The requirements were the very basic ones:

  • Make an HTML form
  • Validate the inputs
  • Send a confirmation email with a link
  • On link activation: Retrieve the data and write it into a table in the database

As those requirements are common in so many registration forms all over the web, I thought: Before coding something again, just have a look at Google and see if  you can find something useful to embed. And I found stuff, lots of stuff but surprisingly every example I found did the following:

  • HTML form
  • Write data into the table with an “active” column set to false
  • Send a confirmation email with an link with an unique identifier for the inserted row (most times a md5 hash value)
  • On link activation: Retrieve the unique identifier
  • Update the row in the database with active flag set to true

And, at least some examples, took also care about the house keeping and cleaned the inactive rows from time to time.

Well, this does the job of course but the way how it does it is simply dirty – very dirty!

From the functional aspect there are following issues:

  • Every SQL statement for retrieving the active data has to make sure to include ” AND active = TRUE” in the WHERE clause
  • If an user is unlucky, the cleanup process removes his data before he can click the activation link if the process doesn’t implement a time range

From the performance side however:

All I want to do is:

  • Validate the data from the form
  • Save it temporarily until the link is clicked or the data gets expired
  • Write it into the database

With those solutions found on the internet however I would have done following:

  • Validate the data
  • Write it into the database (with an column extension for an active flag)
  • Retrieve the data again and update it to set it active
  • Create a cleanup job to purge inactive data
  • Extend all my SQLs against that table with a “AND active = TRUE” WHERE clause

So I have more steps, more unnecessary hops to the database, probably maintain another index over time for the cleanup job. Sometimes I wonder why programmers don’t think first before the code. Of course many roads lead to Rome but that doesn’t mean that you should just take a random one. But as long as there are programmers out there who don’t think or don’t think hard enough there will be always performance problems and people like me needed to solve them. So I guess I shouldn’t complain but my dream of the perfect IT world is still alive…

Well, the webpage in my case is made in PHP which has one nice feature called: Sessions
That does the job perfectly fine and most important: Clean! It is enabled by default, allows you to store data within a session and includes also an expire functionality which is also enabled by default. It supports two ways of storing the session id: Either within a cookie in the browser (enabled by default) or you can also pass the session id over POST and GET requests. As I use a confirmation email to activate the registration I just had to make sure to deactivate the cookie storage feature first as I pass the session id over a link/GET request. I wouldn’t have to deactivate the cookie storage from a functional point of view as the cookie would be just never read again and would expire some time but from security and the “do it right” aspect this has to be deactivated. So all I did was following:

// Disable cookie storage
ini_set("session.use_cookies",0);
// Start the session
session_start();

// Save the data within the session
$_SESSION["Title"] = $Title;
$_SESSION["FirstName"] = $FistName;
$_SESSION["LastName"] = $LastName;

//Send the email with the activation link
if (mail($Email, "Please activate your registration", "http://www.mywebsite.com/activate.php?sid=".htmlspecialchars (session_id()))
{
 // If sending the mail was successful, store all session data and close the session
 session_write_close();
}
else
{
 // If sending the mail failed, report an error, delete the session data and destroy the session!
 // Error reporting....
 session_unset();
 session_destroy();
}

And all the activation.php file does, is:


// Disable cookie storage
ini_set("session.use_cookies",0);
// Set the session id from GET request
session_id($_GET["sid"]);
// Start session
session_start();
// Get variables
$Title = $_SESSION["Title"];
$FistName = $_SESSION["FirstName"];
$LastName = $_SESSION["LastName"];

// Connect to database
$hDB = @mysql_connect("localhost", "user","password");
mysql_select_db("myDB", $hDB);

// Insert data into database
if (mysql_query("INSERT INTO table (Title, FirstName, LastName) VALUES ('".mysql_real_escape_string($Title)."','".mysql_real_escape_string($FirstName)."','".mysql_real_escape_string($LastName)."'")
{
 // Destroy session
 session_unset();
 session_destroy();
}

The big picture

A couple of weeks ago a customer had some troubles with the overall performance. He complained that everything is slow and (of course) nothing changed since the last few weeks. They also immediately blamed the database on their side to be the issue. The usual stuff also, I’m sure that sounds just too familiar to you.

So as they “identified” the database already as being the issue my team requested some AWR reports and as we got them I noticed a strange but all to common behavior. My team mates got the AWR reports, went to the SQL Statistics, sections “SQL ordered by Elapsed Time” and “SQL ordered by CPU Time” and identified immediately a materialized view rebuild as cause for the problem. That looked like this:

SQL ordered by Elapsed Time

  • Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code.
  • % Total DB Time is the Elapsed Time of the SQL statement divided into the Total Database Time multiplied by 100
Elapsed Time (s) CPU Time (s) Executions Elap per Exec (s) % Total DB Time SQL Id SQL Module SQL Text
1,343 679 1 1342.85 33.00 f20ccnxhvbk65 DECLARE job BINARY_INTEGER := …
770 323 1 770.24 18.93 1usnr4gmcq60d /* MV_REFRESH (DEL) */ delete …
571 355 1 571.23 14.04 gz04689vd55db /* MV_REFRESH (INS) */INSERT /…
311 311 1 310.91 7.64 0vhmfumrjchnh SQL*Plus BEGIN dm_incr_symbols_post_pro…
288 9 0 7.09 bb3f2gjndvjss oracle@crptd1 (TNS V1-V3) SELECT /*+ OPAQUE_TRANSFORM */…
258 258 1 257.76 6.34 75vtwb7j4jzdm SQL*Plus INSERT INTO SYMB_EXTRACTT SELE…
149 121 1 148.80 3.66 90wtn50vy6af6 DECLARE job BINARY_INTEGER := …
114 103 1 113.75 2.80 9993mp6h7kqkp INSERT /*+ BYPASS_RECURSIVE_CH…
48 38 2 24.00 1.18 3nkcg1h5ysqss DECLARE job BINARY_INTEGER := …
48 38 2 23.98 1.18 fvb5prrr7b0c3 MERGE INTO FT_E_UPS1 UPS1 USIN…

Back to SQL Statistics
Back to Top

SQL ordered by CPU Time

  • Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code.
  • % Total DB Time is the Elapsed Time of the SQL statement divided into the Total Database Time multiplied by 100
CPU Time (s) Elapsed Time (s) Executions CPU per Exec (s) % Total DB Time SQL Id SQL Module SQL Text
679 1,343 1 679.06 33.00 f20ccnxhvbk65 DECLARE job BINARY_INTEGER := …
355 571 1 355.50 14.04 gz04689vd55db /* MV_REFRESH (INS) */INSERT /…
323 770 1 323.18 18.93 1usnr4gmcq60d /* MV_REFRESH (DEL) */ delete …
311 311 1 310.89 7.64 0vhmfumrjchnh SQL*Plus BEGIN dm_incr_symbols_post_pro…
258 258 1 257.74 6.34 75vtwb7j4jzdm SQL*Plus INSERT INTO SYMB_EXTRACTT SELE…
121 149 1 121.05 3.66 90wtn50vy6af6 DECLARE job BINARY_INTEGER := …
103 114 1 102.58 2.80 9993mp6h7kqkp INSERT /*+ BYPASS_RECURSIVE_CH…
38 48 2 19.20 1.18 3nkcg1h5ysqss DECLARE job BINARY_INTEGER := …
38 48 2 19.20 1.18 fvb5prrr7b0c3 MERGE INTO FT_E_UPS1 UPS1 USIN…
29 29 12 2.38 0.70 68930z34bm3db SQL*Plus select ‘file[‘ || substr(trim(…
9 288 0 7.09 bb3f2gjndvjss oracle@dftg1 (TNS V1-V3) SELECT /*+ OPAQUE_TRANSFORM */…

So here we have a PL/SQL job which does the materialized view refresh (first line) and the statements for the refresh itself as second and third line. For completeness here the first statement:


DECLARE
 job BINARY_INTEGER := :job;
 next_date DATE := :mydate;
 broken BOOLEAN := FALSE;
BEGIN
 dbms_refresh.refresh('"ABTGHS"."DB_CREAT_STST_PWOR"');
 :mydate := next_date;
 IF broken THEN
 :b := 1;
 ELSE
 :b := 0;
 END IF;
END;

So all what they did was to go there, look the first SQL with high elapsed and cpu time and nearly reported back to them that this is the problem and they have to solve this. I call this the “lucky shot method”. Sometimes when you are lucky than the reason for the issue on the DB is a bad SQL or a bunch of bad SQLs which max out the DB on CPU power, or I/O or whatever. In such a case you just go to those sections, identify the SQLs, fix them and everything is good again – you were lucky. This works sometimes and you are a hero because it took you just 5 minutes for fixing the issue but sometimes it doesn’t and you blame it on some weird constellation of OS, network and something else so that nobody recognizes that you are just a fool and didn’t look at the big picture. Thinking of that I noticed that this is just far too common in IT. Tech admins, DBA, developers – in every section you have people like this. Looking 5 minutes into the issue and telling you then that this and that is the issue. You go ahead and fix them but still no change. So next round trip, and next, and next, and next. All could have been prevented if the person had just once a look into the big picture…

So what is the big picture, what do I mean by that?

I’m a fan of knowing what happens and why. Always analyze all the information you got, even request some more if you think that some vital information is missing and make your conclusions out of that. Stop the try and error method, the lucky shot method. The AWR report I got from the customer is an all too good example. There were 2 simple lines which made me curious:

Snap Id Snap Time Sessions Cursors/Session
Begin Snap: 17934 05-May-10 09:00:59 129 144.1
End Snap: 17935 05-May-10 10:00:12 130 143.0
Elapsed: 59.21 (mins)
DB Time: 67.81 (mins)

I’m talking about the last two. The first tells me that the snapshot which got compared are in a time range of 59.21 minutes. The DB time, so the time when the DB was actually working was 67.81 minutes. So something looks strange here. If I would be in a single core environment I wouldn’t be over my 59.21. If I would be in a dual-core environment then the database was just working 57% of the time (59.21 x 2 cores = 118.42 – DB time multiplies by the amount of cores where work was performed parallel). Next I had a look into the locks going on. If you have high locking then the DB time is also idle but the throughput is low. But it turned that this was also no issue. The next information confirmed then that the DB wasn’t the problem at all:

Operating System Statistics

Statistic Total
AVG_BUSY_TIME 33,088
AVG_IDLE_TIME 321,941

The average idle time statistic is much higher than the busy time. So I requested the amount of cores on that system and it turned out that there are 14 available. A busy database would have a DB time of max. 828.94 minutes. No locking going on. I followed up with them and it turned out as I thought: The database wasn’t the problem at all. In fact it was pretty much idle over the whole time. The issue was on the application side. A java application which memory got filled up so that it had to do full garbage collection all the time to continue processing.

Instead of doing the lucky shot we prevented us and them from a lot of headache by looking into the big picture.

C is back at number 1 position

TIOBE just released the “Tiobe programming community index” for April 2010. And guess what – C is back on top with 18.058% rating closely followed by Java (of course) with 18.051%. PL/SQL fell down to position 14 (before 12) and Objective-C made a huge jump up to position 11 (before 42) – I guess there are more people outside who want to make money with the iPhone Apps! 🙂 For everyone who doesn’t know: Objective-C is the programming language to write application on the Apple architecture. For more info about Objective-C see on Wikipedia.