Pages

Saturday, January 11, 2014

ORACLE R12.2 ADOP LOGFILES and STATUS CHECK

To debug Online Patching issues which utilize the adop (AD Online Patching) tool we need to collect adop log files located in the <INSTALL BASE>/fs_ne/EBSapps/log/adop directory
e.g. /u01/PROD/fs_ne/EBSapps/log/adop

Each cycle of adop creates a subdirectory corresponding to the patch session ID, e.g.

/u01/PROD/fs_ne/EBSapps/log/adop/1
/u01/PROD/fs_ne/EBSapps/log/adop/2
etc..etc

When running adop the on-screen terminal output will mention which adop session ID  is in use.
e.g. /u01/PROD/fs_ne/EBSapps/log/adop/9/apply_20121011_024437

STEP 1:  Obtain all of the log files within the session ID directory. It is best to obtain a zip of the entire directory.

e.g. obtain a zip of /u01/PROD/fs_ne/EBSapps/log/adop/9

The session ID directory will contain:

a) A trace file for each phase;
e.g.
adop_20130316_085026.log 
adop_20130316_091340.log 
adop_20130316_210950.log 

b) Logs grouped in phase directories
e.g.
prepare_20130316_085026
apply_20130316_091340 
cutover_20130316_210950 

You can see the timestamps match between each log directory and the respective trace file.
prepare_20130316_085026 directory matches adop_20130316_085026.log
apply_20130316_091340 directory matches adop_20130316_091340.log
etc etc

The same goes for fs_clone activities. i.e. you will see something like this
fs_clone_20130319_233614 --> this is a directory
adop_20130319_233614.log

In almost all cases, when debugging an adop failure the patch log directory will contain the information you need to determine root cause
e.g.
/u01/PROD/fs_ne/EBSapps/log/adop/2/cutover_20130316_210950/

In some cases the onscreen error will indicate which subroutine has failed. You will likely find a sub directory containing log files which matches the failing routine. This should be where you focus your attention
e.g.
/u01/PROD/fs_ne/EBSapps/log/adop/2/cutover_20130316_210950/VIS_ufc/TXK_CTRL_forceshutdown_Sat_Mar_16_21_14_02_2013

STEP 2: Run the following SQL statements
This will show you the status for each adop phase along with its corresponding session id. This is effectively a history of online patching in an environment.

a) select ADOP_SESSION_ID,PREPARE_STATUS,APPLY_STATUS,FINALIZE_STATUS,CUTOVER_STATUS,CLEANUP_STATUS,ABORT_STATUS,STATUS,ABANDON_FLAG,NODE_NAME from AD_ADOP_SESSIONS order by ADOP_SESSION_ID;

Note:
       Y denotes that the phase is done
       N denotes that the phase has not been completed
       X denotes that the phase is not applicable
       R denotes that the phase is running (in progress)
       F denotes that the phase has failed
       P (is applicable only to APPLY phase) denotes at least one patch is already applied for the session id
       C denotes that the status of this ADOP session has completed

Note: Numerical statuses are only relevant for the cutover phase

STEP 3: Check the current status of the adop cycle
Source the run filesystem environment file and run command
adop -status

usage
adop -status  generates a summary report
adop -status <sessionID> generates a summary report for that session ID
adop -status -detail generates a detailed report

General Problems

A.1 General problems with Finalize phase and Abort command
For phase=finalize issues run the following command and attach the adzdshowlog.out file generated:
sqlplus <apps_schema_name>/<apps_Schema_password> @$AD_TOP/sql/ADZDSHOWLOG.sql
Note: the contents of the table will be truncated every time cleanup/prepare phase is run.
Older log information is stored in adzdshowlog.out

or

select * from ad_zd_logs order by log_sequence desc;

A.2 'Duplicate keys found' during Finalize

ERROR at line 1:
ORA-20001: Error: while calling ad_zd.finalize .ORA-01452: cannot CREATE UNIQUE
INDEX; duplicate keys found
ORA-06512: at line 8

Run the following:

sqlplus <apps_schema_name>/<apps_Schema_password> @$AD_TOP/sql/ADZDSHOWLOG.sql

The output will highlight the unique index which is failing. You can then use the following sql to identify the duplicate key values

select <list of columns for which unique index creation failed>, count(*)
from <schema_name>.<table_name> group by <list of columns again> having count(*)>1
e.g.
select REPRESENTATION_CODE, TRX_NUMBER#2, ORG_ID, count(*) from OKL.OKL_TRX_CONTRACTS_ALL group by REPRESENTATION_CODE, TRX_NUMBER#2, ORG_ID having count(*)>1

A.3 Problems with Online Enablement
For failed attempts at Online Enablement i.e. failures of patch 13543062, gather the following information:
1. Patch log
2. Worker logs
3. Output of
select * from ad_zd_logs order by log_sequence desc;

4. output from:
@$AD_TOP/sql/ADZDSHOWDDLS.sql

If enablement appears to be having performance issues or seems to have hanged you can run the following script to determine if enablement is progressing or has hanged:
select count(1) from ad_zd_logs;
Run the scripts every 5 or  10 minutes. If the count is increasing then enablement is progressing.If enablement is progressing but very slowly ensure the DB initialization parameters are set as per the DB 11.2.0.3 and 12.2 requirements.

A.4 To show which patches were applied in each ADOP_SESSION_ID (patching cycle)

select * from ad_adop_session_patches order by end_date desc;

or

set pagesize 200;
set linesize 160;
column adop_session_id format 999999999999;
column bug_number format a15;
column status format a15;
column applied_file_system_base format a23;
column patch_file_system_base format a23;
column adpatch_options format a15;
column node_name format a15;
column end_date format a15;
column clone_status format a15;

select ADOP_SESSION_ID, BUG_NUMBER, STATUS, APPLIED_FILE_SYSTEM_BASE, PATCH_FILE_SYSTEM_BASE, ADPATCH_OPTIONS, NODE_NAME, END_DATE, CLONE_STATUS
from ad_adop_session_patches
order by end_date desc;

Note: STATUS
N - Not Applied In the current node but applied in other nodes
R - Patch Application is going on.
H - Patch failed in the middle. (Hard Failure)
F - Patch failed in the middle but user tried to skip some failures.
S - Patch Application succeeded after skipping the failed jobs.
Y - Patch Application succeeded.
C - Reserved for clone and config_clone. Indicates clone completed

A.5 Issues with FS_CLONE

a) select ADOP_SESSION_ID,BUG_NUMBER,CLONE_STATUS,STATUS,NODE_NAME from AD_ADOP_SESSION_PATCHES order by ADOP_SESSION_ID;
b) sqlplus <apps_schema_name>/<apps_Schema_password> @$AD_TOP/sql/ADZDSHOWLOG.sql
c) zip of files in fs1/EBSapps/comn/clone/FMW/logs, fs2/EBSapps/comn/clone/FMW/logs. These log files are also useful for clone issues during the prepare phase. ( this captures cloning failures for WLS)
d) zip of directory $APPLRGF/TXK/ohsCloneLog ( this captures cloning failures for OHS )

you can check  for fsclone issues in  Prepare phase error link

Other key things to consider in multi web node instances.

- SSH should be enabled to allow communication from the primary node to the secondary nodes. Use txkRunSSHSetup.pl to enable SSH.

- If SSH is not enabled pay very special attention to the specific requirements for running adop in this case (see the maintenance guide)

- When patching ensure you copy patches to the patch home directory of all application tiers e.g. fs_ne/EBSapps/patch

- When debugging issues make sure to you review the adop logs on all application tiers

ORACLE R12.2 ADOP ERRORS

adop errors at fsclone Stage.

Error :

1. Error occurred while executing <perl /sid/applmgr/fs2/EBSapps/appl/ad/12.0.0/patch/115/bin/txkADOPPreparePhaseSynchronize.pl
2. Error occurred while CONFIG_CLONE Patch File System from Run File System using command

Check the adop logfile for errors, and also verify the errors in fsclone logfiles
location : /sid/applmgr/fs_ne/EBSapps/log/adop/<adop session id>/prepare_20140110_003210/ohs1/TXK_SYNC_migrate_Fri_Jan_10_00_40_18_2014

Solutions for different issues.

Error 1:  (/sid/applmgr/fs_ne/EBSapps/log/adop/<adop session id>/adop_20140110_003210.log)

Running [filesystem]/fs1/EBSapps/comn/clone/FMW/pasteBinary.sh -javaHome [filesystem]/fs1/EBSapps/comn/clone/FMW/t2pjdk -al [filesystem]/fs1/EBSapps/comn/clone/FMW/FMW_Home.jar -tl [filesystem]/fs2/FMW_Home -invPtrLoc /etc/oraInst.loc -ldl [filesystem]/fs1/EBSapps/comn/clone/FMW/logs -silent true -executeSysPrereqs false
Script Executed in 7200015 milliseconds, returning status -1
Script timed out.

Solution

1. Open a new terminal.

2. Execute :
      export TIMEDPROCESS_TIMEOUT=-1
      export CONFIG_JVM_ARGS="-Xms1024m -Xmx2048m"

3. Execute :
      adop phase=fs_clone

Error 2:  (/sid/R122UPGR/fs1/EBSapps/comn/clone/FMW/logs)

FSCloneStageAppsTier_<datestamp>.log, the following error will be seen:

/d01/R122UPGR/fs1/FMW_Home/oracle_common/bin/copyBinary.sh -javaHome /d01/R122UPGR/fs1/EBSapps/comn/clone/FMW/t2pjdk -al /d01/R122UPGR/fs1/EBSapps/comn/clone/FMW/FMW_Home.jar -smw /d01/R122UPGR/fs1/FMW_Home -ldl /d01/R122UPGR/fs1/EBSapps/comn/clone/FMW/logs -invPtrLoc /etc/oraInst.loc -silent true
Script Executed in 28420 milliseconds, returning status 255
Script failed, exit code 255

In:  /d01/R122UPGR/fs1/EBSapps/comn/clone/FMW/logs the following error is seen:

SEVERE : Dec 4, 2013 6:17:41 PM - ERROR - CLONE-20275   Insufficient space to create /tmp/CLONINGCLIENT3005869153400434600

Solution

Create space in /tmp or assign more disk storage to /tmp.
OR
assign the diffrent mountpoint as temp directory

export T2P_JAVA_OPTIONS="-Djava.io.tmpdir=/<sid>/applmgr/temp"

Error 3 (/sid/applmgr/fs_ne/EBSapps/log/adop/<adop session id>/prepare_20140110_003210/ohs1/TXK_SYNC_migrate_Fri_Jan_10_00_40_18_2014/wlsTechStack_FSCloneStageAppsTier_01100041.log)

Edit session is enabled in domain. Current Editor = weblogic
Please activate all of your changes before proceed clone

Solution

Login to weblogic console, in home page click on Release configuration button and rerun the adop preprae phase

Error 4 (/sid/applmgr/fs_ne/EBSapps/log/adop/<adop session id>/adop_20140110_003210.log)

adop=prepare encountering [ERROR] Error: While executing on all other nodes

Solution

Since the all nodes are not currently implemented the issue is cause by this autoconfig variable setting.

<shared_file_system oa_var="s_shared_file_system">false </shared_file_system>

It seems your half way to enabling another app node hence this setting

Since there is no extra app node registered.
What happens is that adop then looks for another app node and cannot find the node and it fails .

Please change the setting to false, run autoconfig, and re-test adop prepare

Issue should be resolved