Wednesday, April 2, 2008

Oracle DBA interview questions

Introduction
Last night I got a call from my friend who wants to "job-hop" to get a 200% pay raise, and he wanted answers to some popular interview questions.

There is no one-line objective answer to these questions. The answers get better as you gain more experience in the art called database administration. If I were asked the questions below in an interview, my answers would be like the ones in this post. For simplicity, I will assume the database in question is a 9i or 10g database and also an OLTP one. Some of the answers may not be applicable, or may have changed, in other versions.
1) Suppose I have 30GB of RAM. What would be the size for the SGA? What is the ideal calculation for configuring the SGA?
A) I will go with the assumption that you already have a database of a particular size and now want to use the newly provided 30GB of RAM to configure the SGA for this existing database. I will go with 1% of the DB size for the SGA. So if the database size is 200GB, the SGA I will recommend is 2GB. Similarly, if the database size is 1000GB, I will go with a 10GB SGA. I follow this formula up to 1000GB (a 1 terabyte database); it is not directly proportional afterwards. For very large databases (more than 1 terabyte), SGA sizing has to be done based on demand and requirement, after getting valuable inputs from performance reports like Statspack. If I had a 10TB database, I would stop at a 10GB SGA and check performance reports before deciding whether to increase the SGA any further. This formula is purely based on my experience; there is no hard and fast rule. You are always free to add more SGA if you have the physical memory, but read the next question below to see why it may not be of much benefit if the existing SGA is already well configured.
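A minimal sketch of that 1% rule of thumb, assuming a 10g database with sga_target available (on 9i you would size DB_CACHE_SIZE, SHARED_POOL_SIZE etc. individually; the 2G value is just the illustrative 1% of a 200GB database):

-- Rough database size first (data files only; temp and redo not counted):
SELECT ROUND(SUM(bytes)/1024/1024/1024) AS db_size_gb FROM dba_data_files;

-- For a 200GB result, the 1% heuristic suggests roughly a 2GB SGA:
ALTER SYSTEM SET sga_target = 2G SCOPE=SPFILE;  -- takes effect on the next restart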

2) If I increase/decrease the SGA size, what would be the effect on the database?
A) For the majority of cases there may not be any effect. Here is why. You should take Statspack or other performance reports and find out if the hit ratios are above 98% (I like 99%+ hit ratios). If the LIBRARY CACHE HIT RATIO and BUFFER CACHE HIT RATIO are below 95%, it is better to increase the SGA. However, if there are full table scans on big tables in your queries, any amount of increase in the SGA (DB_CACHE_SIZE) will be futile, because the LRU algorithm puts data accessed from table scans in the "least recently used" category, causing it to get flushed out of the DB cache more frequently than data accessed through an index path. The same goes for a query that executes frequently and uses literals instead of bind variables; such statements fill up the SHARED POOL immediately. I have seen databases getting ORA-04031 (unable to allocate shared memory) with a 5GB SHARED_POOL_SIZE, and I have also seen databases running fine with just an 80MB SHARED_POOL_SIZE! Statspack reports are going to be your bible in a 9i database, though you may rely on other components like ADDM in 10g. In short, I would not increase the SGA without going through the performance reports.
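If you want to compute those two hit ratios yourself rather than reading them off a Statspack report, a quick sketch using the standard v$sysstat and v$librarycache formulas looks like this:

-- Buffer cache hit ratio: 1 - physical reads / (db block gets + consistent gets)
SELECT ROUND((1 - phy.value / (cur.value + con.value)) * 100, 2) AS buffer_hit_pct
FROM v$sysstat cur, v$sysstat con, v$sysstat phy
WHERE cur.name = 'db block gets'
  AND con.name = 'consistent gets'
  AND phy.name = 'physical reads';

-- Library cache hit ratio: pins that did not require a reload
SELECT ROUND(SUM(pins - reloads) / SUM(pins) * 100, 2) AS library_hit_pct
FROM v$librarycache;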

3) Say my select query is running, but somehow my DB buffer cache got full. What will happen to the query now? Under this scenario, from where will it read the data?
A) It will read the data from disk. There will not be any failure. You may see a slow response (usually only the first time) because the database has to read the data from disk. But the buffer cache will get flushed soon because of the LRU algorithm, and your select query will soon have free space in the buffer cache (unless your select query is the culprit here, doing a lot of full table scans).
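If you are curious how "full" the cache really is, one rough sketch is to count buffers by state in v$bh; on a busy database having few or no buffers in the free state is normal, not a failure condition:

-- Buffer states in the cache: xcur/cr are in use, free is available headroom
SELECT status, COUNT(*) AS buffers
FROM v$bh
GROUP BY status;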
4) Under what condition would I need to increase SGA_MAX_SIZE?
A) SGA_MAX_SIZE is a "nice to have" init.ora parameter. If you want to experiment with different SGA sizes while keeping the database up and running, you should configure SGA_MAX_SIZE. This parameter allows you to change individual SGA component sizes, like DB_CACHE_SIZE and SHARED_POOL_SIZE, dynamically while your database is online. You can increase these SGA components during peak load, for example when a quarter-end or month-end reporting job accesses a lot of data.
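A minimal sketch of how this plays out (sizes are illustrative; SGA_MAX_SIZE itself is static, so raising it needs a restart, but the components under it can then be resized online):

-- Set a ceiling above the sum of the components; needs a bounce to take effect:
ALTER SYSTEM SET sga_max_size = 4G SCOPE=SPFILE;

-- Later, while the database is open, grow a component within that ceiling
-- for the month-end reporting window, then shrink it back afterwards:
ALTER SYSTEM SET db_cache_size = 2G SCOPE=MEMORY;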
5) Say a query is not performing well today, but it was running well so far; somehow today it is taking a lot more time. As a DBA, what basic strategy is needed to check the performance of the query?

A) First, get the current SQL of the query. You can join the v$session and v$sqltext dynamic performance views on the SQL address and get the current SQL that is running (if it is active); every SQL statement gets a unique address in the SHARED POOL (think of it as a unique identifier throughout its existence there). Or you can get it from GUI tools like OEM, SQL Navigator or Embarcadero tools. Then it depends on why your query suddenly has a higher response time today. Maybe other jobs currently running are taking a lot of resources, or there was a code change last night by a database developer, or there has been a sudden data load into the underlying tables since last night. There are several possibilities. I will check num_rows in dba_tables against the actual number of rows using a count(*) and then decide if I want to analyze the tables. Taking a trace (level 12) is a good way to get to the bottom of the problem. There could also be an I/O problem at the system level, so I will also check with the system administrators.
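A sketch of that first step, joining v$session to v$sqltext on the SQL address (with one common, though undocumented, way to switch on a level 12 trace for another session shown commented out):

-- Current SQL for active sessions, pieced together in order:
SELECT s.sid, s.username, t.piece, t.sql_text
FROM v$session s, v$sqltext t
WHERE s.sql_address = t.address
  AND s.sql_hash_value = t.hash_value
  AND s.status = 'ACTIVE'
ORDER BY s.sid, t.piece;

-- One way to trace another session at level 12 (binds + waits):
-- EXEC SYS.DBMS_SYSTEM.SET_EV(<sid>, <serial#>, 10046, 12, '');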
6) Say I have executed a delete from a table (containing only 5 records), but it is not processing the delete. I check and see there are some locks in the database. How long do I need to wait for my delete operation to succeed? If Oracle does not release the locks, do I have to bounce the database to resolve the problem?

A) You should find out what kind of lock it is. If it is an enqueue lock, you have to look at the other, offending session that is blocking this session. Enqueue locks are usually a result of bad code. To let the delete statement pass through, I can call the user who is running the offending session and then remove that session from the database; there is no need to bounce the database. However, the root cause lies in the code design, and the same delete statement may hit the problem again in the future. Fixing the code is the long-term fix for enqueue-type locks.
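A minimal sketch for finding the blocker in v$lock (BLOCK = 1 marks the holder; the waiter requests the same lock resource), followed by the kill you would issue once you have spoken to the user:

-- Pair up holders and waiters on the same lock resource:
SELECT holder.sid AS blocking_sid, waiter.sid AS waiting_sid
FROM v$lock holder, v$lock waiter
WHERE holder.block = 1
  AND waiter.request > 0
  AND holder.id1 = waiter.id1
  AND holder.id2 = waiter.id2;

-- Then remove the offending session (get the values from v$session):
-- ALTER SYSTEM KILL SESSION '<sid>,<serial#>';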
7) Whenever we run Statspack, we see some top-5 wait events like "log file sync" and "db file scattered read". When do these events occur, and how do we resolve them?

A) I will not worry about familiar events like "db file sequential read" or "SQL*Net message from client". However, I will be concerned if the top events are "log file sync", "enqueue" or "db file scattered read". If there is huge transactional activity in the database, you should get a trend of your expected log switches per day, and if the latest Statspack report has "log file sync" at the top, you should find out if any long-running DML jobs are causing too many waits on it. Some of the reasons could be a disk I/O problem causing slow log file syncs, or too-frequent commits in your long-running jobs causing frequent log buffer flushes to the online redo logs. DB FILE SCATTERED READ is a scary wait event. If it is in the top 5, it means user sessions are frequently waiting on data read from disk, and it also indicates that the database is doing a lot of full table scans or index fast full scans, which will slow down the database.
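Between Statspack snapshots, a rough sketch for eyeballing the cumulative top waits is v$system_event, filtering out the usual idle events (the exclusion list here is illustrative, not exhaustive):

-- Top non-idle waits since instance startup, worst first:
SELECT event, total_waits, time_waited
FROM v$system_event
WHERE event NOT LIKE 'SQL*Net%'
  AND event NOT IN ('rdbms ipc message', 'pmon timer', 'smon timer')
ORDER BY time_waited DESC;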