To debug Online Patching issues which utilize the adop (AD Online Patching) tool we need to collect adop log files located in the <INSTALL BASE>/fs_ne/EBSapps/log/adop directory
e.g. /u01/PROD/fs_ne/EBSapps/log/adop
Each cycle of adop creates a subdirectory corresponding to the patch session ID, e.g.
/u01/PROD/fs_ne/EBSapps/log/adop/1
/u01/PROD/fs_ne/EBSapps/log/adop/2
etc..etc
When running adop the on-screen terminal output will mention which adop session ID is in use.
e.g. /u01/PROD/fs_ne/EBSapps/log/adop/9/apply_20121011_024437
STEP 1: Obtain all of the log files within the session ID directory. It is best to obtain a zip of the entire directory.
e.g. obtain a zip of /u01/PROD/fs_ne/EBSapps/log/adop/9
The session ID directory will contain:
a) A trace file for each phase;
e.g.
adop_20130316_085026.log
adop_20130316_091340.log
adop_20130316_210950.log
b) Logs grouped in phase directories
e.g.
prepare_20130316_085026
apply_20130316_091340
cutover_20130316_210950
You can see the timestamps match between each log directory and the respective trace file.
prepare_20130316_085026 directory matches adop_20130316_085026.log
apply_20130316_091340 directory matches adop_20130316_091340.log
etc etc
The same goes for fs_clone activities. i.e. you will see something like this
fs_clone_20130319_233614 --> this is a directory
adop_20130319_233614.log
In almost all cases, when debugging an adop failure the patch log directory will contain the information you need to determine root cause
e.g.
/u01/PROD/fs_ne/EBSapps/log/adop/2/cutover_20130316_210950/
In some cases the onscreen error will indicate which subroutine has failed. You will likely find a sub directory containing log files which matches the failing routine. This should be where you focus your attention
e.g.
/u01/PROD/fs_ne/EBSapps/log/adop/2/cutover_20130316_210950/VIS_ufc/TXK_CTRL_forceshutdown_Sat_Mar_16_21_14_02_2013
STEP 2: Run the following SQL statements
This will show you the status for each adop phase along with its corresponding session id. This is effectively a history of online patching in an environment.
a) select ADOP_SESSION_ID,PREPARE_STATUS,APPLY_STATUS,FINALIZE_STATUS,CUTOVER_STATUS,CLEANUP_STATUS,ABORT_STATUS,STATUS,ABANDON_FLAG,NODE_NAME from AD_ADOP_SESSIONS order by ADOP_SESSION_ID;
Note:
Y denotes that the phase is done
N denotes that the phase has not been completed
X denotes that the phase is not applicable
R denotes that the phase is running (in progress)
F denotes that the phase has failed
P (is applicable only to APPLY phase) denotes at least one patch is already applied for the session id
C denotes that the status of this ADOP session has completed
Note: Numerical statuses are only relevant for the cutover phase
STEP 3: Check the current status of the adop cycle
Source the run filesystem environment file and run command
adop -status
usage
adop -status generates a summary report
adop -status <sessionID> generates a summary report for that session ID
adop -status -detail generates a detailed report
General Problems
A.1 General problems with Finalize phase and Abort command
For phase=finalize issues run the following command and attach the adzdshowlog.out file generated:
sqlplus <apps_schema_name>/<apps_Schema_password> @$AD_TOP/sql/ADZDSHOWLOG.sql
Note: the contents of the table will be truncated every time cleanup/prepare phase is run.
Older log information is stored in adzdshowlog.out
or
select * from ad_zd_logs order by log_sequence desc;
A.2 'Duplicate keys found' during Finalize
ERROR at line 1:
ORA-20001: Error: while calling ad_zd.finalize .ORA-01452: cannot CREATE UNIQUE
INDEX; duplicate keys found
ORA-06512: at line 8
Run the following:
sqlplus <apps_schema_name>/<apps_Schema_password> @$AD_TOP/sql/ADZDSHOWLOG.sql
The output will highlight the unique index which is failing. You can then use the following sql to identify the duplicate key values
select <list of columns for which unique index creation failed>, count(*)
from <schema_name>.<table_name> group by <list of columns again> having count(*)>1
e.g.
select REPRESENTATION_CODE, TRX_NUMBER#2, ORG_ID, count(*) from OKL.OKL_TRX_CONTRACTS_ALL group by REPRESENTATION_CODE, TRX_NUMBER#2, ORG_ID having count(*)>1
A.3 Problems with Online Enablement
For failed attempts at Online Enablement i.e. failures of patch 13543062, gather the following information:
1. Patch log
2. Worker logs
3. Output of
select * from ad_zd_logs order by log_sequence desc;
4. output from:
@$AD_TOP/sql/ADZDSHOWDDLS.sql
If enablement appears to be having performance issues or seems to have hanged you can run the following script to determine if enablement is progressing or has hanged:
select count(1) from ad_zd_logs;
Run the scripts every 5 or 10 minutes. If the count is increasing then enablement is progressing.If enablement is progressing but very slowly ensure the DB initialization parameters are set as per the DB 11.2.0.3 and 12.2 requirements.
A.4 To show which patches were applied in each ADOP_SESSION_ID (patching cycle)
select * from ad_adop_session_patches order by end_date desc;
or
set pagesize 200;
set linesize 160;
column adop_session_id format 999999999999;
column bug_number format a15;
column status format a15;
column applied_file_system_base format a23;
column patch_file_system_base format a23;
column adpatch_options format a15;
column node_name format a15;
column end_date format a15;
column clone_status format a15;
select ADOP_SESSION_ID, BUG_NUMBER, STATUS, APPLIED_FILE_SYSTEM_BASE, PATCH_FILE_SYSTEM_BASE, ADPATCH_OPTIONS, NODE_NAME, END_DATE, CLONE_STATUS
from ad_adop_session_patches
order by end_date desc;
Note: STATUS
N - Not Applied In the current node but applied in other nodes
R - Patch Application is going on.
H - Patch failed in the middle. (Hard Failure)
F - Patch failed in the middle but user tried to skip some failures.
S - Patch Application succeeded after skipping the failed jobs.
Y - Patch Application succeeded.
C - Reserved for clone and config_clone. Indicates clone completed
A.5 Issues with FS_CLONE
a) select ADOP_SESSION_ID,BUG_NUMBER,CLONE_STATUS,STATUS,NODE_NAME from AD_ADOP_SESSION_PATCHES order by ADOP_SESSION_ID;
b) sqlplus <apps_schema_name>/<apps_Schema_password> @$AD_TOP/sql/ADZDSHOWLOG.sql
c) zip of files in fs1/EBSapps/comn/clone/FMW/logs, fs2/EBSapps/comn/clone/FMW/logs. These log files are also useful for clone issues during the prepare phase. ( this captures cloning failures for WLS)
d) zip of directory $APPLRGF/TXK/ohsCloneLog ( this captures cloning failures for OHS )
you can check for fsclone issues in Prepare phase error link
Other key things to consider in multi web node instances.
- SSH should be enabled to allow communication from the primary node to the secondary nodes. Use txkRunSSHSetup.pl to enable SSH.
- If SSH is not enabled pay very special attention to the specific requirements for running adop in this case (see the maintenance guide)
- When patching ensure you copy patches to the patch home directory of all application tiers e.g. fs_ne/EBSapps/patch
- When debugging issues make sure to you review the adop logs on all application tiers
e.g. /u01/PROD/fs_ne/EBSapps/log/adop
Each cycle of adop creates a subdirectory corresponding to the patch session ID, e.g.
/u01/PROD/fs_ne/EBSapps/log/adop/1
/u01/PROD/fs_ne/EBSapps/log/adop/2
etc..etc
When running adop the on-screen terminal output will mention which adop session ID is in use.
e.g. /u01/PROD/fs_ne/EBSapps/log/adop/9/apply_20121011_024437
STEP 1: Obtain all of the log files within the session ID directory. It is best to obtain a zip of the entire directory.
e.g. obtain a zip of /u01/PROD/fs_ne/EBSapps/log/adop/9
The session ID directory will contain:
a) A trace file for each phase;
e.g.
adop_20130316_085026.log
adop_20130316_091340.log
adop_20130316_210950.log
b) Logs grouped in phase directories
e.g.
prepare_20130316_085026
apply_20130316_091340
cutover_20130316_210950
You can see the timestamps match between each log directory and the respective trace file.
prepare_20130316_085026 directory matches adop_20130316_085026.log
apply_20130316_091340 directory matches adop_20130316_091340.log
etc etc
The same goes for fs_clone activities. i.e. you will see something like this
fs_clone_20130319_233614 --> this is a directory
adop_20130319_233614.log
In almost all cases, when debugging an adop failure the patch log directory will contain the information you need to determine root cause
e.g.
/u01/PROD/fs_ne/EBSapps/log/adop/2/cutover_20130316_210950/
In some cases the onscreen error will indicate which subroutine has failed. You will likely find a sub directory containing log files which matches the failing routine. This should be where you focus your attention
e.g.
/u01/PROD/fs_ne/EBSapps/log/adop/2/cutover_20130316_210950/VIS_ufc/TXK_CTRL_forceshutdown_Sat_Mar_16_21_14_02_2013
STEP 2: Run the following SQL statements
This will show you the status for each adop phase along with its corresponding session id. This is effectively a history of online patching in an environment.
a) select ADOP_SESSION_ID,PREPARE_STATUS,APPLY_STATUS,FINALIZE_STATUS,CUTOVER_STATUS,CLEANUP_STATUS,ABORT_STATUS,STATUS,ABANDON_FLAG,NODE_NAME from AD_ADOP_SESSIONS order by ADOP_SESSION_ID;
Note:
Y denotes that the phase is done
N denotes that the phase has not been completed
X denotes that the phase is not applicable
R denotes that the phase is running (in progress)
F denotes that the phase has failed
P (is applicable only to APPLY phase) denotes at least one patch is already applied for the session id
C denotes that the status of this ADOP session has completed
Note: Numerical statuses are only relevant for the cutover phase
STEP 3: Check the current status of the adop cycle
Source the run filesystem environment file and run command
adop -status
usage
adop -status generates a summary report
adop -status <sessionID> generates a summary report for that session ID
adop -status -detail generates a detailed report
General Problems
A.1 General problems with Finalize phase and Abort command
For phase=finalize issues run the following command and attach the adzdshowlog.out file generated:
sqlplus <apps_schema_name>/<apps_Schema_password> @$AD_TOP/sql/ADZDSHOWLOG.sql
Note: the contents of the table will be truncated every time cleanup/prepare phase is run.
Older log information is stored in adzdshowlog.out
or
select * from ad_zd_logs order by log_sequence desc;
A.2 'Duplicate keys found' during Finalize
ERROR at line 1:
ORA-20001: Error: while calling ad_zd.finalize .ORA-01452: cannot CREATE UNIQUE
INDEX; duplicate keys found
ORA-06512: at line 8
Run the following:
sqlplus <apps_schema_name>/<apps_Schema_password> @$AD_TOP/sql/ADZDSHOWLOG.sql
The output will highlight the unique index which is failing. You can then use the following sql to identify the duplicate key values
select <list of columns for which unique index creation failed>, count(*)
from <schema_name>.<table_name> group by <list of columns again> having count(*)>1
e.g.
select REPRESENTATION_CODE, TRX_NUMBER#2, ORG_ID, count(*) from OKL.OKL_TRX_CONTRACTS_ALL group by REPRESENTATION_CODE, TRX_NUMBER#2, ORG_ID having count(*)>1
A.3 Problems with Online Enablement
For failed attempts at Online Enablement i.e. failures of patch 13543062, gather the following information:
1. Patch log
2. Worker logs
3. Output of
select * from ad_zd_logs order by log_sequence desc;
4. output from:
@$AD_TOP/sql/ADZDSHOWDDLS.sql
If enablement appears to be having performance issues or seems to have hanged you can run the following script to determine if enablement is progressing or has hanged:
select count(1) from ad_zd_logs;
Run the scripts every 5 or 10 minutes. If the count is increasing then enablement is progressing.If enablement is progressing but very slowly ensure the DB initialization parameters are set as per the DB 11.2.0.3 and 12.2 requirements.
A.4 To show which patches were applied in each ADOP_SESSION_ID (patching cycle)
select * from ad_adop_session_patches order by end_date desc;
or
set pagesize 200;
set linesize 160;
column adop_session_id format 999999999999;
column bug_number format a15;
column status format a15;
column applied_file_system_base format a23;
column patch_file_system_base format a23;
column adpatch_options format a15;
column node_name format a15;
column end_date format a15;
column clone_status format a15;
select ADOP_SESSION_ID, BUG_NUMBER, STATUS, APPLIED_FILE_SYSTEM_BASE, PATCH_FILE_SYSTEM_BASE, ADPATCH_OPTIONS, NODE_NAME, END_DATE, CLONE_STATUS
from ad_adop_session_patches
order by end_date desc;
Note: STATUS
N - Not Applied In the current node but applied in other nodes
R - Patch Application is going on.
H - Patch failed in the middle. (Hard Failure)
F - Patch failed in the middle but user tried to skip some failures.
S - Patch Application succeeded after skipping the failed jobs.
Y - Patch Application succeeded.
C - Reserved for clone and config_clone. Indicates clone completed
A.5 Issues with FS_CLONE
a) select ADOP_SESSION_ID,BUG_NUMBER,CLONE_STATUS,STATUS,NODE_NAME from AD_ADOP_SESSION_PATCHES order by ADOP_SESSION_ID;
b) sqlplus <apps_schema_name>/<apps_Schema_password> @$AD_TOP/sql/ADZDSHOWLOG.sql
c) zip of files in fs1/EBSapps/comn/clone/FMW/logs, fs2/EBSapps/comn/clone/FMW/logs. These log files are also useful for clone issues during the prepare phase. ( this captures cloning failures for WLS)
d) zip of directory $APPLRGF/TXK/ohsCloneLog ( this captures cloning failures for OHS )
you can check for fsclone issues in Prepare phase error link
Other key things to consider in multi web node instances.
- SSH should be enabled to allow communication from the primary node to the secondary nodes. Use txkRunSSHSetup.pl to enable SSH.
- If SSH is not enabled pay very special attention to the specific requirements for running adop in this case (see the maintenance guide)
- When patching ensure you copy patches to the patch home directory of all application tiers e.g. fs_ne/EBSapps/patch
- When debugging issues make sure to you review the adop logs on all application tiers
Thank you. Helpful particularly as I was running enablement in 13543062.
ReplyDeleteWhat needs to be done when adop apply phase doesn't write anything in the log file location
ReplyDelete