Friday, June 26, 2009

ASM Disk Controller numbers can get changed if the SAN Switch is replaced

Introduction


We have set up our Oracle E-Business Suite database to be highly available using Oracle 10g RAC with the ASM feature. We use EMC Symmetrix for storage and used a Brocade device for the SAN switch. Our SAN fabric was recently upgraded and, among many changes, the legacy Brocade SAN switches were replaced by Cisco switches.


Problem


Due to this change, we rebooted both Oracle RAC database nodes. The database would not start because the ASM diskgroup could not be mounted. We had hardcoded the asm_diskstring parameter to a specific set of disk controller numbers, and while mounting a diskgroup the ASM instance discovers only those devices that match the asm_diskstring parameter. Upon further investigation, we found that the disk controller numbers had changed after the new switch installation.



Note the changed disk controller numbers in the below image



Solution


We use EMC Symmetrix as our SAN storage solution. During the initial ASM conversion project, we had saved the syminq numbers of all our ASM devices. Because the syminq numbers do NOT change even when a switch is replaced, we could use the same syminq numbers to find the new disk controller numbers. We then updated the asm_diskstring initialization parameter with these new controller numbers, after which the ASM diskgroup mounted successfully and the database started.
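The lookup described above can be scripted. The following is only a minimal Python sketch: the two-column listing format, the device paths and the device IDs are hypothetical stand-ins, not real syminq output, which has more columns and would need its own parser.

```python
def parse_device_map(listing):
    """Map device ID -> OS device path.

    Assumes a simplified two-column listing: '<device path> <device ID>'.
    This layout is an assumption made for illustration only.
    """
    device_by_id = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        path, device_id = fields
        device_by_id[device_id] = path
    return device_by_id


def build_asm_diskstring(saved_ids, current_listing):
    """Return a comma-separated, quoted device list for asm_diskstring."""
    device_by_id = parse_device_map(current_listing)
    missing = [i for i in saved_ids if i not in device_by_id]
    if missing:
        raise LookupError("devices not visible after the change: %s" % missing)
    return ",".join("'%s'" % device_by_id[i] for i in saved_ids)


# Hypothetical data: the same two device IDs, now seen behind controller c4.
current = """
/dev/rdsk/c4t0d1s6 0A1B
/dev/rdsk/c4t0d2s6 0A1C
"""
print(build_asm_diskstring(["0A1B", "0A1C"], current))
```

The point of the design is that the saved IDs, not the controller numbers, are the stable key; the disk string is regenerated from whatever paths the IDs currently map to.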

Sunday, June 7, 2009

How to avoid username conflict in OID during AD synchronization

Introduction

Our master source of truth for authenticating Single Sign-On (SSO) users for access to the E-Business Suite is Active Directory (AD). We periodically synchronize data from AD to the Oracle Internet Directory (OID) using synchronization profiles.

Because Oracle SSO can work only with OID, we cannot afford any failures or mismatches between AD and OID, so we do not let the synchronization continue if an error is reported: the 'Continue after error' setting is FALSE.

One common problem in a directory is managing usernames for people with identical names. For example, an organization can have two employees who are both named John Smith. The AD administrator creates their accounts as JSMITH and JSMITH1 respectively, because the username has to be unique across the organization. During initial propagation to OID, these accounts get two separate entries in OID, again JSMITH and JSMITH1. When JSMITH leaves the organization, the other John Smith can request a change of username from JSMITH1 to JSMITH (JSMITH looks better than JSMITH1).


Problem

The update of JSMITH1 to JSMITH generates a new change number in AD, and the next synchronization cycle attempts to make the corresponding change in OID too. We had such an update in AD yesterday and OID failed to process the change. The following error was reported in the synchronization profile logfile.



Exception Doing ModRDN operation : javax.naming.NameAlreadyBoundException: [LDAP: error code 68 - Entry Already Exists]; remaining name 'cn=jsmith1,ou=us,dc=mycompany,dc=com'
Ignore modrdn.

Solution

At the time of processing the update, OID had the following two entries

'cn=jsmith1,ou=us,dc=mycompany,dc=com'
'cn=jsmith,ou=us,dc=mycompany,dc=com'

AD had only one entry JSMITH (after the username change by the administrator).

'cn=jsmith,ou=us,dc=mycompany,dc=com'

Because OID already has a jsmith entry, the update of JSMITH1 to JSMITH fails with the [LDAP: error code 68 - Entry Already Exists] error.

I solved the problem by deleting both entries, jsmith and jsmith1, in OID and making a dummy update in AD (I removed the telephone number of jsmith). The next synchronization job picked up the telephone number change in AD and created a new jsmith entry in OID. Once OID had processed the change successfully, I restored the telephone number in AD.
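To make the collision concrete, here is a small Python simulation. It is not OID or DIP code: the dictionary stands in for the directory and rename_entry is a hypothetical helper that mimics a ModRDN refusing to overwrite an existing entry (LDAP result code 68).

```python
class EntryAlreadyExists(Exception):
    """Stands in for LDAP result code 68 (entryAlreadyExists)."""


def rename_entry(directory, old_dn, new_dn):
    """Rename an entry, refusing to overwrite an existing one."""
    if new_dn in directory:
        raise EntryAlreadyExists(new_dn)
    directory[new_dn] = directory.pop(old_dn)


base = "ou=us,dc=mycompany,dc=com"
# OID still holds the departed user's entry as well as jsmith1.
oid = {
    "cn=jsmith," + base: {"owner": "departed John Smith"},
    "cn=jsmith1," + base: {"owner": "current John Smith"},
}

try:
    rename_entry(oid, "cn=jsmith1," + base, "cn=jsmith," + base)
    outcome = "renamed"
except EntryAlreadyExists:
    outcome = "rejected: entry already exists"
print(outcome)

# With the stale jsmith entry out of the way, the same rename goes through.
del oid["cn=jsmith," + base]
rename_entry(oid, "cn=jsmith1," + base, "cn=jsmith," + base)
print(sorted(oid))
```

The second half only illustrates why the stale entry is the blocker. The fix I actually applied was different: I deleted both entries and let the next synchronization recreate jsmith.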

Saturday, April 4, 2009

Unable to login to R12 after SSO integration

Introduction

I am now working on the E-Business Suite Release 12 upgrade project. The login page broke after integrating the R12 development system with Oracle Single Sign-On. We received an 'HTTP 500 Internal Server Error'. Furthermore, the local login page (/OA_HTML/AppsLocalLogin.jsp) was not rendered either.

Problem

This certainly seemed to be a Java servlet initialization problem in Oracle OC4J. I noticed the following errors in $INST_TOP/logs/ora/10.1.3/j2ee/oacore/oacore_default_group_1/application.log

Error initializing servlet
java.lang.NoClassDefFoundError: Could not initialize class oracle.apps.fnd.profiles.Profiles
html: chain failed javax.servlet.ServletException at com.evermind[Oracle Containers for J2EE 10g

I could not find a direct answer after searching the Oracle Support website. I was able to better understand the problem after reviewing the $INST_TOP/logs/ora/10.1.3/opmn/oacore_default_group_1/oacorestd.err logfile.

Exception in static block of jtf.cache.CacheManager. Stack trace is: oracle.apps.jtf.base.resources.FrameworkException: IAS Cache initialization failed.

Solution

The problem has something to do with the Java Cache. I went ahead and disabled the cache mechanism by setting the parameter LONG_RUNNING_JVM to false in $INST_TOP/ora/10.1.3/j2ee/oacore/config/oc4j.properties, and this solved the problem. I will update this article once I know the root cause of why the cache failed to initialize after the SSO integration.
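For anyone who prefers to script the change rather than edit the file by hand, the following is a minimal Python sketch of updating one key in properties-style text. The set_property helper and the sample file content are hypothetical; the sketch handles only simple key=value lines and works on a string, so it never touches a real oc4j.properties.

```python
def set_property(text, key, value):
    """Return properties text with key set to value, appending it if absent."""
    lines = text.splitlines()
    updated = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue  # leave comments and non key=value lines alone
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[i] = "%s=%s" % (key, value)
            updated = True
    if not updated:
        lines.append("%s=%s" % (key, value))
    return "\n".join(lines) + "\n"


# Hypothetical excerpt; the real file contains many more settings.
sample = "# hypothetical oc4j.properties excerpt\nLONG_RUNNING_JVM=true\n"
print(set_property(sample, "LONG_RUNNING_JVM", "false"))
```

Take a backup of the real file before applying any change of this kind.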

Thursday, March 5, 2009

Re-Registering the Oracle E-Business Suite after an SSO database RAC conversion

Introduction

After converting the Oracle SSO database to RAC for high availability, one has to re-register the provisioning applications, such as the E-Business Suite, so that they pick up the new database configuration; in this case, the new RAC TNS string.

Problem
 
One of our provisioning applications is the Oracle E-Business Suite. The provisioning application has to be deregistered first with the txkrun.pl -deregister option. We received the following error during the deregistration process.

txkrun.pl -script=SetSSOReg -deregister=Yes

txkSetSSOReg_Fri_Feb_27_15_03_19_2009.log
FUNCTION: TXK::advconfig::SSO::validateParams [ Level 1 ]
ERRORMSG: Invalid ORASSO database user credentials. Connecting to ORASSO schema using dbhost.mycompany.com:1521:OIDDB failed.

Solution

The ORASSO password was correct. So the above error has nothing to do with the password. Upon further investigation, I found out the script is unable to establish the SSO database connection. Because the database has been converted to RAC, the current SID OIDDB is not applicable anymore. During the RAC conversion, a new SID has been generated for each of the two instances in the RAC viz. OIDDB1 and OIDDB2 respectively. The service name is still the same OIDDB, but Oracle is trying to establish a connection using OIDDB as a SID, resulting in the above error.

More often than not, Oracle utility scripts list their input arguments, so one can choose the right argument for the database connection string and pass it to the script. However, txkrun.pl does not do this well; txkrun.pl -help does not list the input arguments either. The only way I found to discover the txkrun.pl input arguments was to check the logfile of the failed run, $COMMON_TOP/rgf/$CONTEXT_NAME/sso/txkSetSSOReg*.log, which shows the infraconnstr argument. Passing the full RAC connect descriptor through infraconnstr lets the script connect using the service name:

txkrun.pl -script=SetSSOReg -deregister=Yes -infraconnstr="(Description =(ADDRESS_LIST =(LOAD_BALANCE = yes)(ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))(ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521)))(CONNECT_DATA =(SERVICE_NAME = OIDDB)))"  
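A long descriptor like the one above is easy to mistype, especially the parentheses. As a minimal sketch, a few lines of Python can assemble it from the VIP host names and the service name. build_rac_connect_descriptor is a hypothetical helper and not part of any Oracle tooling; it only produces the string to pass to -infraconnstr.

```python
def build_rac_connect_descriptor(hosts, service_name, port=1521):
    """Build a load-balanced connect descriptor that uses SERVICE_NAME."""
    addresses = "".join(
        "(ADDRESS=(PROTOCOL=TCP)(HOST=%s)(PORT=%d))" % (host, port)
        for host in hosts
    )
    return (
        "(DESCRIPTION=(ADDRESS_LIST=(LOAD_BALANCE=yes)%s)"
        "(CONNECT_DATA=(SERVICE_NAME=%s)))" % (addresses, service_name)
    )


descriptor = build_rac_connect_descriptor(["node1-vip", "node2-vip"], "OIDDB")
# A quick sanity check that every parenthesis is closed.
assert descriptor.count("(") == descriptor.count(")")
print(descriptor)
```

Because the descriptor names the service (OIDDB) rather than an instance SID, it stays valid whether the connection lands on OIDDB1 or OIDDB2.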

Conclusion

I hope txkrun.pl provides a better way to find its input argument list in the future; this could be an enhancement request for Oracle. Moreover, if the infraconnstr argument is not supplied, the script should prompt for the service name instead of the database SID, as it currently does (Enter the Oracle iAS Infrastructure database SID ? ). A service name is a better way of referencing a database than a SID, because the service name remains constant across a RAC conversion while the SID does not.

Wednesday, February 25, 2009

OID connection not available. OID server may be down

Introduction


In addition to the SSO and MetaData Repository components, we are also configuring the OID server to be highly available, thus enabling all three components, viz. the SSO webserver, the MetaData Repository and the OID server, for high availability. As part of the OID server high availability setup, I have installed a new Infrastructure node, oid_serv2, with the OID and DIP components, selecting 'High Availability and Replication' in the installation option list.

Problem


A good way to test whether the new node serves Single Sign-On (SSO) requests is to shut down the existing OID server oid_serv1 in the Identity Management (IM) cluster. In addition, the OID server name has to be changed from oid_serv1 to oid_serv2 in the MetaData Repository database by running $ORACLE_HOME/sso/admin/plsql/sso/ssooconf.sql on the SSO server. However, after doing so I started receiving 'Error:Internal Server Error. Please try the operation later'.

Upon further investigation, I found the following errors in the $ORACLE_HOME/sso/log/ssoServer.log

Wed Feb 25 15:44:14 CST 2009 [ERROR] AJPRequestHandler-ApplicationServerThread-7 DirContextPool: OID connection not available. OID Server, oid_serv1.mycompany.com:636 may be down ...

Wed Feb 25 15:44:14 CST 2009 [ERROR] AJPRequestHandler-ApplicationServerThread-7 Communication Exception received. Cleaning up the stale connection
javax.naming.CommunicationException: Could not get the connection. OID Server may be down

Solution


Changing the OID server name by running $ORACLE_HOME/sso/admin/plsql/sso/ssooconf.sql alone does not seem to be enough for the SSO webserver to recognize the new OID server. I searched the configuration files in the SSO ORACLE_HOME and found a couple of additional references to the existing OID server that needed to be replaced with the newly installed OID server oid_serv2.

In the SSO ORACLE_HOME, edit the following files, replacing oid_serv1 with oid_serv2, and then bounce the processes using the opmnctl command.

  • $ORACLE_HOME/config/ias.properties (OIDhost)
  • $ORACLE_HOME/ldap/admin/ldap.ora (DIRECTORY_SERVERS)
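The search for leftover references can be sketched in Python as below. This is only an illustration under stated assumptions: find_host_references is a hypothetical helper, and the file contents are made-up excerpts passed in as strings, so nothing here reads a real ORACLE_HOME.

```python
def find_host_references(files, hostname):
    """Return (file name, line number, line) for each line mentioning hostname."""
    hits = []
    for name, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if hostname in line:
                hits.append((name, number, line.strip()))
    return hits


# Hypothetical excerpts of the two files listed above.
config_files = {
    "config/ias.properties": "InstallType=Infrastructure\nOIDhost=oid_serv1.mycompany.com\n",
    "ldap/admin/ldap.ora": "DIRECTORY_SERVERS=(oid_serv1.mycompany.com:389:636)\n",
}

for name, number, line in find_host_references(config_files, "oid_serv1"):
    print("%s:%d: %s" % (name, number, line))
```

Running the same check again after the edit, and getting no hits, is a cheap way to confirm that no reference to the old server was missed before bouncing the processes.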

Conclusion


The SSO Administration Guide mentions that running ssooconf.sql alone is enough for the SSO server to recognize the new OID server. However, as described above, the SSO server does not seem to pick up the change from the database alone; it also reads the OID host from its configuration files, which have to be updated to get this working.

Friday, February 20, 2009

ONS Port and CRS Installation

Introduction

Most installations performed by the Oracle Universal Installer (OUI) select only port numbers that are free on the host. However, one service whose port number is always fixed during installation is the Oracle Notification Service (ONS). As part of setting up the High Availability features for the Single Sign-On system, I installed Oracle Cluster Ready Services (CRS) on Linux to convert the existing Metadata Repository into a highly available database cluster. The installation completed successfully; however, configuring the Virtual IPs (vipca) was a problem. The ONS service on node2 refused to start.

Problem

Upon further investigation, I found that the OUI had assigned 6200 as the ONS port ( $CRS_ORACLE_HOME/opmn/conf/ons.config ), which was already in use by a different Oracle software installation on the same host.

Solution

  • Log in to the cluster node node2, where the ONS service did not start
  • srvctl stop nodeapps -n node2
  • $CRS_ORACLE_HOME/bin/racgons.bin remove_config node2:6200
  • $CRS_ORACLE_HOME/bin/racgons.bin add_config node2:6299 ( non-default port )
  • Update $CRS_ORACLE_HOME/opmn/conf/ons.config and replace the remoteport=6200 with remoteport=6299
  • Run $CRS_ORACLE_HOME/bin/vipca again (It should complete successfully this time)
  • $CRS_ORACLE_HOME/bin/crs_stat -t ( Verify ONS service is ONLINE )

Conclusion

As a best practice, one should install new Oracle software with all existing applications up and running, so that the OUI chooses only port numbers that are not in use. However, this does not help for the ONS port, because the OUI always uses 6200. As a good practice, one can avoid problems during future Oracle installations on a given host by changing the ONS port from the default 6200 to a non-default value in the current installation.
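Before an installation, it takes only a few lines to check whether something is already listening on the ONS port. The following Python sketch uses the standard socket module; port_in_use is a hypothetical helper, and it only probes TCP on the loopback address by default, which is an assumption about where the conflicting service listens.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return probe.connect_ex((host, port)) == 0


# 6200 is the port the OUI assigned; 6299 is the non-default port used above.
for candidate in (6200, 6299):
    state = "in use" if port_in_use(candidate) else "free"
    print("port %d is %s" % (candidate, state))
```

A port reported as free here can still be claimed by a service that is currently stopped, which is why it helps to have the existing applications running when the check is made.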

Thursday, January 22, 2009

Discoverer 10g login and password expiry in an SSO environment

Introduction

In an E-Business Suite 11i application, one can enable Password Expiration for users created locally using the DEFINE USER form. This value is stored in the PASSWORD_LIFESPAN_DAYS column of the FND_USER table. If the 11i application is integrated with Oracle Single Sign-On (SSO), the 11i SSO login mechanism ignores this column, as expected, because the password policy is defined in the LDAP directory. However, this does not seem to be the case for the Discoverer 10g SSO login functionality. The Discoverer login code seems to explicitly check the PASSWORD_LIFESPAN_DAYS column in the FND_USER table even though it is SSO enabled. This behavior is consistent in both the Discoverer Plus and Viewer components.

For example, if FND_USER.PASSWORD_DATE is 22-NOV-08, FND_USER.PASSWORD_LIFESPAN_DAYS is 30 and SYSDATE is 22-JAN-09 for user JSMITH, the Discoverer login will fail, as it does whenever PASSWORD_LIFESPAN_DAYS is less than SYSDATE - PASSWORD_DATE.
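The arithmetic in this example can be checked with a short Python sketch. It encodes only the condition as stated above (lifespan less than the password age in days); it is not Discoverer's actual code, and password_expired is a hypothetical helper.

```python
from datetime import date


def password_expired(password_date, lifespan_days, today):
    """Apply the condition PASSWORD_LIFESPAN_DAYS < SYSDATE - PASSWORD_DATE."""
    if lifespan_days is None:
        return False  # a NULL lifespan means there is no expiry check
    return lifespan_days < (today - password_date).days


password_date = date(2008, 11, 22)
today = date(2009, 1, 22)

print((today - password_date).days)                 # 61
print(password_expired(password_date, 30, today))   # True
print(password_expired(password_date, None, today)) # False
```

With a password age of 61 days against a lifespan of 30, the check fails the login; with the lifespan set to NULL the check never fires.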

Solution

Update the PASSWORD_LIFESPAN_DAYS column to NULL for all rows in the FND_USER table after the 11i application is integrated with SSO.

Monday, January 12, 2009

High Availability Architecture for SSO and OID database components

The architecture diagram contains high availability configuration details for the SSO webserver and OID database components (Metadata Repository) in an Oracle Identity Management setup.