Wednesday, April 2, 2008

Oracle DBA interview questions

Introduction
Last night I got a call from a friend who wants to "job-hop" for a 200% pay raise, and he wanted answers to some popular interview questions.

There is no one-line objective answer to these questions. The answers get better as you gain more experience in the art called database administration. If I were asked the questions below in an interview, my answers would be like the ones in this post. For simplicity, I will assume the database in question is a 9i or 10g database and an OLTP one. Some of the answers may not apply, or may have changed, in other versions.
1) Suppose I have 30GB of RAM. What size should the SGA be? What is the ideal calculation for configuring the SGA?
A) I will go with the assumption that you already have a database of a particular size and now want to use the newly provided 30GB of RAM to configure the SGA for this existing database. I would go with 1% of the database size for the SGA. So if the database size is 200GB, the SGA I would recommend is 2GB. Similarly, for a 1000GB database I would go with a 10GB SGA. I follow this formula up to 1000GB (a 1 terabyte database); it is not directly proportional afterwards. For very large databases (more than 1 terabyte), SGA sizing has to be done based on demand and requirement, after getting valuable input from performance reports such as Statspack. If I had a 10TB database I would stop at a 10GB SGA and check performance reports before deciding whether to increase the SGA any further. This formula is purely based on my experience; there is no hard and fast rule. You are always free to add more SGA if you have the physical memory, but read the next question to see why it may not be of much benefit if the existing SGA is already well configured.
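As a rough illustration of this 1% rule of thumb (my own heuristic, not an Oracle-documented formula), the allocated database size and the corresponding suggested SGA can be pulled from dba_data_files:

-- Allocated database size and 1% of it, per the heuristic above
SELECT ROUND(SUM(bytes)/1024/1024/1024) AS db_size_gb,
       ROUND(SUM(bytes)/1024/1024/1024 * 0.01, 1) AS one_pct_sga_gb
FROM   dba_data_files;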

2) If I increase or decrease the SGA size, what is the effect on the database?
A) In the majority of cases there may not be any effect. Here is why. You should take Statspack or other performance reports and check whether the hit ratios are above 98% (I like 99%+ hit ratios). If the library cache hit ratio and buffer cache hit ratio are below 95%, it is better to increase the SGA. However, if your queries do full table scans on big tables, any increase in the SGA (DB_CACHE_SIZE) will be futile, because the LRU algorithm puts blocks read by full table scans at the "least recently used" end of the list, so they get flushed out of the buffer cache more quickly than blocks accessed through an index path. The same applies to a query that executes frequently using literals instead of bind variables; such statements fill up the shared pool immediately. I have seen databases hit ORA-04031 (unable to allocate shared pool memory) with a 5GB SHARED_POOL_SIZE, and I have also seen databases run fine with just an 80MB SHARED_POOL_SIZE! Statspack reports are going to be your bible in a 9i database, though you can also rely on other components such as ADDM in 10g. In short, I would not increase the SGA without going through the performance reports.
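For reference, here is a commonly used sketch of the two hit-ratio calculations taken straight from the dynamic performance views; a Statspack report derives them from the same underlying statistics, though the figures for a given interval may differ slightly:

-- Buffer cache hit ratio (classic formula)
SELECT ROUND((1 - phy.value / (cur.value + con.value)) * 100, 2) AS buffer_cache_hit_pct
FROM   v$sysstat phy, v$sysstat cur, v$sysstat con
WHERE  phy.name = 'physical reads'
AND    cur.name = 'db block gets'
AND    con.name = 'consistent gets';

-- Library cache hit ratio
SELECT ROUND(SUM(pinhits) / SUM(pins) * 100, 2) AS library_cache_hit_pct
FROM   v$librarycache;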

3) Say my select query is running but somehow the db buffer cache gets full. What happens to the query, and where does it read the data from in this scenario?
A) It will read the data from disk. There will not be any failure. You may see a slow response (usually only the first time) because the data is read from disk. The buffer cache will get flushed soon because of the LRU algorithm, and your select query will soon have free space in the buffer cache (unless your select query is the culprit here, doing a lot of full table scans).
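If you want to see what is actually sitting in the buffer cache (for example, whether one full-scanned table is flooding it), a quick sketch like the one below can help; it assumes access to v$bh and dba_objects:

-- Objects currently occupying the most buffer cache blocks
SELECT o.owner, o.object_name, COUNT(*) AS cached_blocks
FROM   v$bh b, dba_objects o
WHERE  b.objd = o.data_object_id
GROUP  BY o.owner, o.object_name
ORDER  BY cached_blocks DESC;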
4) Under what conditions would I need to increase SGA_MAX_SIZE?
A) SGA_MAX_SIZE is a "nice to have" init.ora parameter. If you want to experiment with different SGA sizes while the database is up and running, you should configure SGA_MAX_SIZE. This parameter allows you to change individual SGA component sizes such as DB_CACHE_SIZE and SHARED_POOL_SIZE dynamically while the database is online. You can increase these SGA components during peak load, for example when a quarter-end or month-end reporting job accesses a lot of data.
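A minimal sketch of such a dynamic resize is shown below; the component sizes are only examples, and the sum of all components must stay within SGA_MAX_SIZE:

-- Grow the buffer cache for a month-end reporting window (values are illustrative)
ALTER SYSTEM SET db_cache_size = 2048M SCOPE = MEMORY;
-- Shrink it back after the reporting window
ALTER SYSTEM SET db_cache_size = 1024M SCOPE = MEMORY;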
5) Say a query is not performing well today, although it was running fine until now; today it is taking much longer. As a DBA, what basic strategy would you use to check the performance of the query?

A) First get the current SQL of the query. You can join the v$session and v$sqltext dynamic performance views on the SQL address and get the SQL that is currently running (if the session is active). Every SQL statement gets a unique address in the shared pool (think of it as a unique identifier throughout its existence in the shared pool). Alternatively, you can get it from GUI tools like OEM, SQL Navigator or Embarcadero tools. Then it depends on why your query suddenly has a higher response time today. Maybe other jobs currently running are taking a lot of resources, or there was a code change by a database developer last night, or there was a sudden data load into the underlying tables. There are several possibilities. I would check num_rows in dba_tables against the actual number of rows using a count(*) and then decide whether to analyze the tables. Taking a 10046 trace (level 12) is a good place to start to get to the bottom of the problem. There could also be an I/O problem at the system level, so I would also check with the system administrators.
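A sketch of the v$session / v$sqltext join described above, for active sessions, with the SQL text pieced together in order:

-- Current SQL text of active sessions
SELECT s.sid, s.serial#, s.username, t.piece, t.sql_text
FROM   v$session s, v$sqltext t
WHERE  s.sql_address    = t.address
AND    s.sql_hash_value = t.hash_value
AND    s.status         = 'ACTIVE'
ORDER  BY s.sid, t.piece;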
6) Say I executed a delete on a table (it contains only 5 records), but the delete is not going through. I check and see there are some locks in the database. How long do I need to wait for my delete to succeed? If Oracle does not release the locks, do I have to bounce the database to resolve the problem?

A) You should first find out what kind of lock it is. If it is an enqueue lock, you have to look at the other, offending session that is blocking this one. Enqueue locks are usually the result of bad code. To let the delete statement go through, I can call the user who is running the offending session and then remove that session from the database; bouncing the database is not required. However, the root cause lies in the code design, and the same delete statement may hit the problem again in the future. Fixing the code is the long-term fix for enqueue-type locks.
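Before removing anything, the blocking and waiting sessions can be identified with a classic v$lock self-join; the kill statement is shown only as a commented placeholder:

-- Who is blocking whom on an enqueue
SELECT blocker.sid AS blocking_sid, waiter.sid AS waiting_sid
FROM   v$lock blocker, v$lock waiter
WHERE  blocker.block  = 1
AND    waiter.request > 0
AND    blocker.id1    = waiter.id1
AND    blocker.id2    = waiter.id2;

-- Remove the offending session (sid and serial# are placeholders)
-- ALTER SYSTEM KILL SESSION '<sid>,<serial#>';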
7) Whenever we run Statspack, we see top-5 wait events like "log file sync" and "db file scattered read". When do these events occur and how do we resolve them?

A) I would not worry about familiar events like "db file sequential read" or "SQL*Net message from client". However, I would be concerned if the top events are "log file sync", "enqueue" or "db file scattered read". If there is heavy transactional activity in the database, you should know your expected trend of log switches per day, and if the latest Statspack report has "log file sync" at the top, you should find out whether long-running DML jobs were causing too many waits on it. Possible causes are a disk I/O problem making log file syncs slow, or overly frequent commits in long-running jobs causing frequent log buffer flushes to the online redo logs. "db file scattered read" is a scary wait event. If it is in the top 5, user sessions are frequently waiting on data read from disk via multiblock reads, which indicates the database is doing a lot of full table scans or index fast full scans, and that will slow the database down.
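Outside of a Statspack report, the cumulative waits since instance startup can also be eyeballed directly from v$system_event; this is only a rough sketch, with a few idle events excluded by hand:

-- Top wait events since startup (time_waited is in centiseconds)
SELECT event, total_waits, time_waited
FROM   v$system_event
WHERE  event NOT LIKE 'SQL*Net%'
AND    event NOT IN ('rdbms ipc message', 'pmon timer', 'smon timer')
ORDER  BY time_waited DESC;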

Friday, March 14, 2008

Oracle Database is the skill with the biggest demand

For the last few days I have been hearing only discouraging news about the US economic slowdown and a possible recession in the coming days, but there is one news item that will bring a smile to all Oracle DBAs!

According to The Dice Report, the latest snapshot of the technology job market published by dice.com for March 2008, Oracle Database is the skill most in demand. This is certainly heartening, especially for the Oracle DBA community.


Oracle Business Intelligence (Discoverer 10g) Linux Migration

Introduction

As part of the ongoing Linux migration project, we successfully migrated Oracle Business Intelligence (Discoverer 10g) from HP-UX to Linux last weekend. The Discoverer migration is straightforward for the most part, and many steps can be done well in advance to avoid extended downtime during the cutover. Most of the procedure involves a fresh Linux installation and configuration; only part of it involves the "actual migration" of files from the source HP-UX platform to the target Linux platform. Below are the high-level steps we followed.
Advance Steps
  1. Our existing version on HPUX is 10.1.2.0.2. We downloaded the same Linux x86 version from the OTN site. Oracle Discoverer 10g is also called Oracle Business Intelligence SE (Standard Edition).
  2. Install the Linux 10.1.2.0.2 software on the target linux host.
  3. Create $ORACLE_HOME/discoverer/secure directory
  4. To provide transparency and a more user-friendly URL for end users, we got a DNS alias for the Linux host. After the installation, $ORACLE_HOME/chgip/scripts/chgiphost.sh -mid is run to update the Discoverer installation's configuration files with the DNS alias (compare it with adautocfg.sh in Oracle E-Business Suite 11i).
Cutover Steps (Downtime)
  1. Shutdown the discoverer processes on the existing HPUX platform.
  2. Migrate the Discoverer preferences from the source HP-UX platform to the target Linux platform. Discoverer preferences are settings that define the Discoverer environment and control the behavior of Discoverer Plus and Discoverer Viewer. Whenever an individual user changes preferences in Discoverer, they are stored in a hidden file, $ORACLE_HOME/discoverer/.reg_key.dc. This file contains values in a platform-specific format and therefore needs to be converted to the target platform's format:
  • Backup the target linux platform’s $ORACLE_HOME/discoverer/.reg_key.dc
  • Copy the $ORACLE_HOME/discoverer/.reg_key.dc from source platform to Linux as hpux_reg_key.dc
  • On Linux, cd to $ORACLE_HOME/discoverer/util and run the conversion script below
  • perl convertreg.pl hpux_reg_key.dc $ORACLE_HOME/discoverer/.reg_key.dc

3. Apply on Linux any one-off patches that had been applied on the source platform.

4. Migrate the tnsnames.ora and dbc files from the source platform to Linux.

5. Start the Linux Discoverer processes (opmnctl startall and emctl start iasconsole).

6. Oracle Discoverer is now LIVE on Linux!

Friday, February 29, 2008

tar gzip and gunzip command examples

Introduction
 
This article gives you some examples of tar, gzip and gunzip commands on Linux operating system.
 
How do you take an export of a 100GB database when the disk space available to store the export dump is only 20GB? Similarly, how do you make a tar ball of an E-Business Suite APPL_TOP that is already 50GB when only 15GB of disk space is available? We are in the process of migrating our existing Oracle installations from HP-UX to Linux, and for the most part we spend our time transferring files between the platforms. "Create and compress" is one option I have used successfully, and I will illustrate the syntax in this post.

Full database export using parallel compression
  1. Login to source HPUX database host
  2. mknod db_full_exp.dmp p (create a named pipe with the same name the export will write to)
  3. gzip < db_full_exp.dmp > db_full_exp.dmp.gz & nohup exp system/***** parfile=params.lst file=db_full_exp.dmp & (the backgrounded gzip compresses whatever the export writes into the named pipe, so the uncompressed dump never lands on disk)
  4. Copy the above db_full_exp.dmp.gz to the target Linux database server
  5. gunzip db_full_exp.dmp.gz
  6. imp system/***** parfile=params.lst file=db_full_exp.dmp
Tar file and parallel compression
 
The Oracle Applications 11i E-Business Suite Linux migration involves copying several source directories from the source platform to the target Linux platform. The example below deals with the $APPL_TOP copy; the same syntax can be used for the other directories that need to be migrated.
  1. On source HPUX node, cd $APPL_TOP
  2. tar cvf - . | gzip > /any_directory_having_free_space/appl_top_files.tar.gz
  3. Copy the appl_top_files.tar.gz to the target Linux node
  4. On target Linux node, cd $APPL_TOP (directory should be empty)
  5. gunzip -c appl_top_files.tar.gz | tar xvf -
Conclusion

A database administrator should automate processes and tasks, and extra knowledge of programming goes a long way toward accomplishing this. Mastery of advanced Unix commands like awk and sed also helps eliminate manual tasks for the most part. Ease with shell or Perl scripting helps one spend less time on routine tasks, leaving more time for other important, higher-priority activities.

Tuesday, February 19, 2008

Oracle Index Monitoring Benefits

Introduction

There are many new features in the Oracle 9i and 10g database releases. Out of these many new features, a few stand out from the crowd, and 'Index Monitoring' is one of them. It was introduced in the 9i release and is one of my favorites in Oracle. The benefits are enormous, while the risks are minimal to zero.

All database indexes have to be maintained by Oracle whether or not they are used by any SQL execution plan. The primary purpose of creating indexes is to reduce the cost of SQL operations, thereby improving performance and response times in the database. Sometimes database developers create indexes that will never be used by Oracle.

Example 1 :- Creating an index on a column that is used only in the select list and never in the where clause of a SQL statement (a non-driving column).

Example 2 :- Another common example is when the cost-based optimizer decides that a full table access, or an alternate index/composite index, is a cheaper access path than the index in question.

One of the easiest Oracle 9i (and 10g) features to implement is 'Index Monitoring'. The feature identifies all unused indexes in the database and updates a data dictionary view with this information. A DBA, in coordination with the development team, can review these results and proceed to drop the unused indexes.

Benefits of dropping unused indexes
1. Faster DML operations
Table insert/update/delete statements execute faster because Oracle no longer has to maintain the corresponding index entries behind the scenes.
2. Reduced Database size
Indexes occupy the next-highest amount of storage after tables in most transactional databases. If large indexes are found to be unused, dropping them helps reduce the overall database size (see the sizing query after this list).
3. Less RedoLog consumption
All index inserts/updates/deletes are logged in the redo logs (when indexes are created with the default LOGGING option, which is the norm rather than the exception). Dropping unwanted/redundant indexes means less redo is generated and fewer redo logs are switched per day, which helps overall database performance.
4. Reduced archive space
More redo generated because of unused indexes means more archivelog files.
5. Faster database backups
Dropping unused indexes decreases the database size, and smaller databases take less time to back up.
6. Reduced recovery time
Fewer archive logs translate to fewer archive logs that need to be applied during database recovery operations.
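As a rough way to see how much space indexes consume relative to tables (supporting benefit 2 above), a simple dba_segments summary can be used; owner filters and size thresholds can be added as needed:

-- Space used by tables versus indexes
SELECT segment_type, ROUND(SUM(bytes)/1024/1024/1024, 1) AS size_gb
FROM   dba_segments
WHERE  segment_type IN ('TABLE', 'INDEX')
GROUP  BY segment_type;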
Implementation Procedure
1. Identify all hot custom indexes that are larger than a threshold size (e.g. 100MB)
2. For each of these indexes, the following command needs to be executed (the index name is a placeholder)
alter index <owner>.<index_name> monitoring usage;


3. Let Oracle monitor index usage for 24 hours. The data dictionary view v$object_usage is updated if the index has been used at least once during the monitoring window. (24 hours is just an example; some sites may need 48 hours, or even up to 7 days, of monitoring.)

4. Turn off index monitoring by executing the following command (again with the actual index name)
alter index <owner>.<index_name> nomonitoring usage;

5. Get all the unused indexes reported in the view v$object_usage (see the query below)
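A minimal sketch of step 5 is below. Note that v$object_usage only lists indexes owned by the user running the query, which is a common gotcha when monitoring indexes across several schemas:

-- Unused indexes after the monitoring window (run as the index owner)
SELECT index_name, table_name, monitoring, used, start_monitoring, end_monitoring
FROM   v$object_usage
WHERE  used = 'NO';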
Are there any dependencies? Is downtime needed for Index Monitoring?
Index Monitoring has no dependencies on any database operation. It can be implemented dynamically with very little overhead, and no downtime is required.
Caveats

1. Implementing Index Monitoring in development and test databases may not yield good results and should not be used as the basis for a production implementation. Several indexes may show as unused in development databases while the same indexes are used frequently in production because of the higher transaction rate.

2. Only custom indexes should be considered for index monitoring. It is not advisable to monitor indexes that ship with a pre-seeded database such as the Oracle 11i E-Business Suite database.

3. The actual index monitoring is done entirely by Oracle, without DBA intervention. In an ideal case Oracle may not report any unused/redundant indexes at all; this is rare and happens only with a perfect database design, in which case the exercise simply ends without finding anything to drop.

4. Sometimes an index is used exclusively by a weekend or month-end job to improve its performance. Exercise sufficient caution before dropping unused indexes reported in v$object_usage; confirmation should be obtained from the index owner that the index serves no special purpose.
Conclusion
Database upgrades and migrations to new releases should be performed with a proactive frame of mind rather than a reactive one. I have seen many Oracle upgrades and migrations performed just to meet Oracle support deadlines or to satisfy a prerequisite for other software components. New features like Index Monitoring come with minimal risk and a lot of benefit. If you are starting a 'reorg' exercise in your database, Index Monitoring can be a good starting point.

Friday, February 1, 2008

Oracle Applications 11i Middle Tiers Linux Migration

We successfully migrated our Oracle Applications 11i (11.5.10.2) middle tiers (forms and web) from HP-UX to Linux over the weekend. We followed the standard Metalink Note 238276.1 for the Linux migration. Our admin tiers and database instances were on HP-UX and were left alone during the migration. Overall it was a smooth cutover; however, we did face a SEV 1 issue during the GOLIVE activities. We were able to get help within the maintenance window and release the instance on time. The details are below.

ORA-01403: No Data Found... 

After completing all the tasks and starting the applications on the target Linux servers, none of the forms would open. Every click resulted in an "ORA-01403: No Data Found" error. We reviewed all the tasks and everything had run fine. However, we were convinced that this error could be the result of just one file or one wrong parameter. After a thorough search on the Oracle Metalink site, we found an article saying that an incorrect custom.pll can cause this error when trying to open any form or click any form button. With this vital hint, we were able to demonstrate that it was indeed the custom code causing the problem. The following method was used to confirm it.

  1. Log in to forms.
  2. If "ORA-01403: No Data Found" occurs on every form click or form button click, dismiss the error window, then go to the Help menu => Diagnostics => Custom Code => the option here will be NORMAL; choose OFF.
  3. Navigate the forms again. The error no longer showed up after turning off custom code.

Having narrowed the problem down, we soon found that a .pll file added alongside CUSTOM.pll for some changes had not been compiled on Linux. The problem went away after compiling this .pll as well.

Conclusion

The adadmin program compiles only the standard code, so custom code compilation has to be taken care of separately after the standard code compilation. Failure to compile any custom code files can result in the problem described above.

Friday, January 11, 2008

Oracle STATSPACK recreation

Introduction

The PERFSTAT schema (and tablespace) contains all of the database's performance statistics. Over a period of time this tablespace can grow to a large size, and the stored statistical data can be purged to make room for more free space. Occasionally a complete reorganization of the PERFSTAT tablespace can be performed to delete all the historical statistics data, reclaiming the disk space consumed by this tablespace.

At first glance it is easy to think that this reorg can be done hot (while the database applications are up and running), but that is not quite true. This post explains the "behind the scenes" part of a STATSPACK reorganization and why it is not a straightforward procedure.

Background

Our 9.2.0.6 Oracle Applications database's PERFSTAT tablespace grew to more than 50GB with thousands of statspack snapshots in the schema, so it was just the right time for a purge and reorganization of this tablespace. Since the PERFSTAT schema is not used by anything except the statspack.snap cron job, it looked like a straightforward operation that could be done online. Based on this flawed reasoning, the reorg was carried out in our development instance. Soon users started complaining that they were getting Java exceptions as soon as they logged in to Oracle Applications 11i. All the exceptions were due to the DBMS_UTILITY package being invalidated. Because statspack could not have done this (that was the flawed assumption at the time), we guessed it might have been the result of a bad code migration from the applications team.

What was unexpected
There were no changes, nor were there any code migrations from the applications team. Upon further investigation, it was identified that the reinstallation of the STATSPACK environment (spcreate.sql) invalidates critical DBMS packages, including DBMS_UTILITY, DBMS_SNAPSHOT, DBMS_IREFRESH, UTL_RECOMP, DBMS_STATS, DBMS_PRVTAQIP and DBMS_PCLXUTIL; we recompiled them immediately to solve the problem. Metalink note 392352.1 mentions that this is expected behavior, but nothing of the sort is written in the official guide. The procedure we now follow is:
  1. Take a list of the invalid objects from the dba_objects view
  2. Reorg the statspack environment (drop it and recreate it)
  3. Take a new list of the invalid objects and compare it with the list taken above. Recompile the objects that are newly invalid after the reorg (see the sketch below).
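A sketch of steps 1 and 3 is below; the recompile is shown for one package as an example, and $ORACLE_HOME/rdbms/admin/utlrp.sql is the bulk alternative:

-- Steps 1 and 3: snapshot of invalid objects, taken before and after the reorg
SELECT owner, object_name, object_type
FROM   dba_objects
WHERE  status = 'INVALID'
ORDER  BY owner, object_name;

-- Recompile an individual package that went invalid, for example DBMS_UTILITY
-- ALTER PACKAGE SYS.DBMS_UTILITY COMPILE BODY;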
Conclusion

There is no "big change" or "small change". Every change should be given due respect and tested properly before GOLIVE or else expect the unexpected !