Wednesday, April 2, 2008

Oracle DBA interview questions

Introduction
Last night i got a call from my friend who wants to "job-hop" to get a 200% pay raise and he wanted answers to some popular interview questions.

There is no one line objective answer to these questions. The answers get better as you gain more experience in the art called database administration. If i were asked these questions below in an interview, my answers will be like the ones in this post. For simplicity, i will assume the database in question is a 9i or a 10g database and also an OLTP one. Some of the answers may not be applicable or may have got changed in other versions.
1) Suppose if i am having RAM of 30GB ,What would be the size for SGA ? what is the ideal calculation for configuring sga?
A) I will go with the assumption that you already have a database available of a particular size and now want to use newly provided 30GB RAM to configure SGA for this existing database. I will go with 1% of the DB size for SGA. So if the database size is 200 GB, the SGA i will recommend is 2GB. Similarly if the database size is 1000GB, i will go with 10 GB SGA. I will follow this formula up to 1000GB (1 Terabyte database). It is not directly proportional afterwards. For very large databases (more than 1 Terabyte), SGA sizing has to be done based on the demand and requirement after getting valuable inputs from performance reports like Statspack reports etc. If i had a 10 TB database i would stop at 10GB SGA and check performance reports to take a suitable decision whether to increase the SGA any further. This formula is purely based on my experience. There is no hard and fast rule. You are always free to add more SGA if you have the physical memory. But read the next question below to know why it may not be of good benefit if the existing SGA is already well configured.

2) If i increase/decrease the sga size what would be the effect to the database.
A) For majority of the cases there may not be any effect. Here is why. You should take statspack or other performance reports and find if the hit ratios are above 98% (I like 99%+ hit ratios). If the LIBRARY CACHE HIT RATIO and BUFFER CACHE HIT RATIO are below 95%, its better to increase the SGA. However if there are full table scans on big tables in your queries, any amount of increase of SGA (DB_CACHE_SIZE) will be futile because the LRU algorithm puts data accessed from table scan in the "Least Recently Used" category thereby causing them to get flushed out of the DB CACHE more frequently than others like those accessed through an index path. Similarly for a query which executes frequently and using literals instead of bind variables. They fill up the SHARED POOL immediately. I have seen databases getting ORA-01403 (Shared Pool out of memory) with a 5 GB SHARED_POOL_SIZE and i have also seen databases running fine with just 80MB SHARED_POOL_SIZE ! Statspack reports are going to be your bible in a 9i database, though you may rely on other components like the ADDM in 10g. In short, i would not increase the SGA unless i go through the performance reports.

3) Say my select query is going on,but some how my db buffer cache got full,so now what will happen to the query, under this scenario from where it will read the data.
A) It will read data from disk. There will not be any failure. You can see slow response (usually only the first time) from the database because it will read data from the disk. But the buffer cache will get flushed soon, because of the LRU algorithm and soon your select query will have free space in the buffer cache (unless your select query is the culprit here doing a lot of full table scans) .
4) Under what condition should i need to increase the sga max size
A) SGA_MAX_SIZE is a "nice to have" init.ora parameter. If you want to experiment with different SGA sizes while having the database up and running, you should configure SGA_MAX_SIZE. This parameter allows you to change individual SGA component sizes like DB_CACHE_SIZE, SHARED_POOL_SIZE dynamically while your database is online. You can increase these SGA components during peak load for e.g when there is a quarter end or month end reporting job that accesses a lot of data.
5) Say a query is not performing well today,but that query was running well so far,some how today it is taking a lot much time ,as a DBA what basic strategy is needed to check the performance of the query ?

A) First get the current sql of the query. You can join v$session and v$sqltext dynamic performance views based on the sql address and get the current SQL that is running (if its active). Every sql statement gets a unique address in the SHARED_POOL_SIZE (think it like an unique identifier throughout its existence in the SHARED POOL) or else you can get it from GUI tools like OEM, SQL Navigator or Embacardo tools.Then it depends why all of a sudden your query is having a higher response time today. May be there are other jobs currently running that are taking lot of resources, or there could have been a code change yesterday night by a database developer or there could have been a sudden data load into the underlying tables since last night. There are several possibilities. I will try to check the num_rows in dba_tables and the actual number of rows using a count(*) and then decide if i want to analyze the tables. Taking a trace (level 12) is a good place to start to get to the bottom of the problem. There could also be a I/O problem in the system level. So i will also check with system administrators.
6) Say like i have executed delete from a table( it containing 5 records only),but it is not processing the delete. I check and see there are some locks in the database. How much time do i need to wait for my delete operation to succeed ? If Oracle does not release the locks, do i have to bounce the database to resolve the problem ?

A) You should find out what kind of lock is there. If its an enqueue lock you have to look at the other offending session that is blocking this session. Enqueue locks are usually a result of bad code. To let the delete statement pass through, I can call the user who is running the offending session and then remove the session from the database. However the root cause lies in the code design. In future the same delete statement may have the problem. Therefore fixing the code will be a long term fix for enqueue kind of locks.
7) When ever we run statspack,we see some top 5 wait events like "log file sync" and "db file scattered read",when these events occur and how to resolve them ?

A) I will not worry about familiar events like "db sequential read" or "sql *net message from client" etc. However i will be concerned if the events are "log file sync", "enqueue" or "file scattered read" etc. if there is huge transactional activity in the database, you should get a trend of your expected log switches per day and if the latest statspack report has this event in the top, then you should find out if any long running DML jobs were running that is causing too many waits on "log file sync". Some of the reasons could be a disk I/0 problem that is causing slow log file syncs or there may be too frequent commits in your long running jobs that is causing frequent log buffer flushes to the online redo logs. DB FILE SCATTERED READ is a scary wait event. If its in the top 5, then that means user sessions are waiting frequently on data accessed from disk. This also indicates that the database is undergoing a lot of full table scans or index fast full scans which will slow down the database.

4 comments:

Joel Garry said...

What in the world does the size of the database have to do with the SGA size? The SGA needs to be a compromize between the number and type of transactions, operating system limitations and the physical hardware.

Hit ratio is bogus, and should not be used at all for determining any performance metrics. Since I came here from a thread on OTN forums where refs to this were given, I need say no more.

Full table scans are added to the other end of the LRU from regular scans. You need to read the performance manual about v$bh and look and see what is happening, as well as understand multiple buffer pools.

Statistics! Check the statistics!

You should sometimes worry more about sequential read and less about scattered read - what is really important is are they appropriate for what the inquiry is doing? Full scans can be good things, see for example http://richardfoote.wordpress.com/2008/05/12/index-scan-or-full-table-scan-the-magic-number-magic-dance/

But then again, I can hardly ever give the answers interviewers want to hear.

word: oegvh

Srinivas Ramineni said...

>>What in the world does the size of the database have to do with the SGA size?

-------------------

You cannot have a SGA of 20GB for a database of 10GB ! Will you ? or If your database size was initially 10GB and SGA size was initially 500MB, will you leave the SGA there if your current DB size has become 5 TB ??.There can be several others factors as i mentioned. I have given a practical answer for a general OLTP database.

>>Hit ratio is bogus, and should not be used at all for determining any performance metrics

---------------------

With all due respect, you need to revise your basics. I would like to politely disagree with you on your statement 100%. I worked with statspack reports extensively over the last few years. Read my second interview question above and you will know why i have said that, especially the ORA-01403 comment.

>>You should sometimes worry more about sequential read and less about scattered read

---------------------

Yes only sometimes, not majority of the times !! Infact quite rare. Physical reads do more harm than buffer gets if SGA was sized properly. I am talking about real world databases in my interview questions, not dummy databases. So my tables have multi million rows, not a couple of dozens !

Aman.... said...

Srinivas,
just reading your blog.
I am not convinced with the answers of the questions.
You mentioned that hit ratio is better of X value and increase of sga_size is also recommended.with all due respect,you can have any number for hitratio.Why sga-size increase will give the benefit?You think that shared_pool of Xmb/gb was not enough and you got shared pool memory error and the answer is is to increase the parameter?Tell me what you think it will achive?A badly written query/application will again eat up the shared pool no?
You mentioned that statspack is a good (bible for tuning).The problem with statspack and AWR (10g) are that they work upon aggregates and in the aggregates there are very well chancesthat we are presented some thing which actualy doesn't need at all attention.The same rule is applicable to all the V$ views too.
You mentione that physical reads are issues.Well that's a conception that I seeall the time.One must be worried about the data access from "memory" rather than physical reads from the hard disk as in the memory , lots of factors kick in like latches and all that.And this is enough to kill db.
Well I shall say that check out the P/T docs guide.Not to say that the answers are all wrong but they need a little modification here and there as there are lots of "researches" done overthese topics and the new factors are little bit different from the "classic remedies".

Cheers
Aman....

Anonymous said...

Interesting take on some fundamental interview questions. Although, my interview questions were only that detailed when a DBA was among the interviewers.

This article has some great tips for new Oracle DBA's looking for that first DBA job.

Related Posts Plugin for WordPress, Blogger...