Friday, March 4, 2016

Cluvfy returns "Unsuccessful" for most commands; trace has exectask.sh: "cannot execute" / "permission denied", or scp: "not found" [ID 549667.1]


________________________________________
       
In this Document
  Symptoms
     Examples:
  Changes
  Cause
     Cause #1 - Wrong permissions on exectask
     Cause #2 - scp in wrong location
  Solution
     Solution #1 - Adjust permissions on exectask.
     Solution #2 - Create a symlink to scp in the expected location.
  References
________________________________________


Applies to:
Oracle Server - Standard Edition - Version: 10.2.0.1 to 11.2.0.2 - Release: 10.2 to 11.2
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 11.2.0.2   [Release: 10.2 to 11.2]
Information in this document applies to any platform.
***Checked for relevance on 19-Oct-2010***
Symptoms

The cluster is functioning normally, but every Cluster Verification Utility (cluvfy) command returns "unsuccessful" with no further details, even with the -verbose switch.
Examples:
1.
#cluvfy comp admprv -n mynode1,mynode2 -verbose -o crs_inst
Verifying administrative privileges
Checking user equivalence...
Check: User equivalence for user "root"
  Node Name                             Comment
  ------------------------------------  ------------------------
  mynode1                               passed
  mynode2                               passed
Result: User equivalence check passed for user "root".
Verification of administrative privileges was unsuccessful on all the nodes.

2.
#cluvfy comp nodecon -n mynode1,mynode2 -verbose
Verifying node connectivity
Verification of node connectivity was unsuccessful on all the nodes.

3.
#cluvfy comp crs -n mynode1,mynode2
Verifying CRS integrity
Verification of CRS integrity was unsuccessful on all the nodes.
4.
 ./runcluvfy.sh stage -post crsinst -n mynode1,mynode2 -verbose
Performing post-checks for cluster services setup

Checking node reachability...

Check: Node reachability from node "mynode1"
Destination                          Node Reachable?
------------------------------------ ------------------------
mynode1                              yes
mynode2                              yes
Result: Node reachability check passed from node "mynode1".


Checking user equivalence...

Check: User equivalence for user "oracle"
Node Name                        Comment
-------------------------------- ------------------------
mynode1                         passed
mynode2                         passed
Result: User equivalence check passed for user "oracle".

Post-check for cluster services setup was unsuccessful on all the nodes.
Changes
The issue can occur before, during or after installation.  After installation, the triggering event can be the application of a patch, or a change in the OS environment such as permissions change.  In some cases on 10.2 and 11.1 the issue was simply not noticed until after installation.
Cause
There are two possible causes for this issue. To determine which one applies to you, trace CVU and examine cvutrace.log.

For instructions on tracing cluvfy, please see this note:
Note 316817.1 - CLUSTER VERIFICATION UTILITY FAQ
under the section "How do I turn on tracing?"

If using runcluvfy.sh, see this note for tracing instructions:
Note 986822.1 - How to Collect CVU Trace / Debug Output Generated by RUNCLUVFY.SH
Cause #1 - Wrong permissions on exectask
After turning CVU tracing on, the following message can be seen in the cvutrace.log :
ksh: /tmp/CVU_10.2.0.1.0.1_dba/exectask.sh: cannot execute
Or, in an 11.2 install, in which the installer runs cluvfy, the installActionsDATE.log shows "exectask.sh: Execute permission denied", for example:
sh: /var/tmp/CVU_11.2.0.2.0_grid/exectask.sh: Execute permission denied.
Cause:
a) For already installed clusterware: Some CRS patches may change the permissions on $CRS_HOME/cv/remenv/exectask* .  If $CRS_HOME/cv/remenv/exectask* is not executable, then cluvfy will fail.
b) Previous runs of CVU can leave /tmp/CVU* or /var/tmp/CVU* directories in place; the permissions on exectask* must be correct in these locations too, or cluvfy will fail.
Cause #2 - scp in wrong location
After turning CVU tracing on, the following message can be seen in the cvutrace.log :
exec exception:/usr/local/bin/scp: not found;
Cause: scp is not located in /usr/local/bin/scp but in a different place on the server.
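
To tell the two causes apart quickly, the trace can be searched for the messages quoted above. The following is only a sketch: the function name classify_cvu_trace is made up for illustration, and it assumes the trace (or installer log) is a plain text file containing one of the exact strings shown in this note.

```shell
# Report which known failure signature appears in a CVU trace or installer log.
# classify_cvu_trace is a hypothetical helper name, not part of cluvfy.
classify_cvu_trace() {
    trace_file=$1
    if grep -q -e 'exectask.sh: cannot execute' \
               -e 'exectask.sh: Execute permission denied' "$trace_file"; then
        echo "exectask"   # Cause #1: wrong permissions on exectask
    elif grep -q 'scp: not found' "$trace_file"; then
        echo "scp"        # Cause #2: scp is not where cluvfy expects it
    else
        echo "unknown"    # neither signature found; look elsewhere
    fi
}
```

It could be called as classify_cvu_trace cvutrace.log, using the path where your trace was written.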
Solution
Use either solution 1 or solution 2, depending on which of the above causes applies to your situation.
Solution #1 - Adjust permissions on exectask.

Steps:
1. Check the permissions with:
ls -l $CRS_HOME/cv/remenv/exectask*

2. Ensure that the files are owned by oracle:oinstall and that the permissions are 744 or 755.

3. Check for a /tmp/CVU* or /var/tmp/CVU* directory left behind by a previous, failed run of cluvfy. If such a directory exists and contains exectask* files, make the same change to the permissions on those files as well.
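
The permission part of these steps can be sketched as a small shell function. This is an illustration only: fix_exectask_perms is a hypothetical name, it sets mode 755 (one of the two modes listed above), and it does not touch ownership, which still has to be checked by hand.

```shell
# Make every exectask* file in the given directory executable (mode 755).
# fix_exectask_perms is a hypothetical helper; it does not change ownership.
fix_exectask_perms() {
    dir=$1
    for f in "$dir"/exectask*; do
        [ -e "$f" ] || continue    # the glob matched nothing in this directory
        chmod 755 "$f"
    done
}

# Possible calls, run as the software owner with CRS_HOME set:
#   fix_exectask_perms "$CRS_HOME/cv/remenv"
#   for d in /tmp/CVU* /var/tmp/CVU*; do fix_exectask_perms "$d"; done
```

Re-run ls -l on the same paths afterwards to confirm the result.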
Solution #2 - Create a symlink to scp in the expected location.

Steps:
1. Find scp
2. As the root user, create a symlink to it in /usr/local/bin

For example, if scp is located in /usr/bin, create the symlink as follows:
# ln -s /usr/bin/scp /usr/local/bin/scp
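
A slightly more defensive version of the same step is sketched below. The function name ensure_scp_link is hypothetical; the sketch assumes you pass it the real scp path (for example the output of command -v scp) and the directory where cluvfy looks for it, and it leaves any existing file or link untouched.

```shell
# Create <link_dir>/scp as a symlink to scp_path unless something is already there.
# ensure_scp_link is a hypothetical helper name used for illustration.
ensure_scp_link() {
    scp_path=$1
    link_dir=$2
    # Refuse to link to something that is not an executable file.
    [ -x "$scp_path" ] || { echo "not executable: $scp_path" >&2; return 1; }
    # Leave any existing file or symlink alone.
    if [ -e "$link_dir/scp" ] || [ -L "$link_dir/scp" ]; then
        echo "$link_dir/scp already exists; nothing to do"
        return 0
    fi
    ln -s "$scp_path" "$link_dir/scp"
}

# Possible call, as root:
#   ensure_scp_link "$(command -v scp)" /usr/local/bin
```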

Thursday, June 25, 2015

Error connecting ASM ORA-15055: unable to connect to ASM instance Fatal NI connect error 12547 ORA-12547: TNS:lost contact

TNS-12547: TNS:lost contact
    ns secondary err code: 12560
    nt main err code: 517
ORA-15055: unable to connect to ASM instance
ORA-12547: TNS:lost contact
TNS-12545: Connect failed because target host or object does not exist


Cause:

The permissions on $GRID_HOME/bin/oracle or $ORACLE_HOME/bin/oracle have been changed

ls -al $GRID_HOME/bin/oracle
-rwxr-x--x 1 grid oinstall 200678464 Jun 28 14:54 oracle 

ls -al $ORACLE_HOME/bin/oracle
-rwxr-x--x 1 oracle asmadmin 228886191 Jun 28 15:41 oracle

Solution:

Change permissions as below 

cd $GRID_HOME/bin
chmod 6751 oracle

cd $ORACLE_HOME/bin
chmod 6751 oracle

Once the listings look like the following, the problem is solved:

ls -l $GRID_HOME/bin/oracle
-rwsr-s--x 1 grid oinstall 203974257 Jun 29 09:30 oracle

ls -l $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle oinstall 232399431 Jun 29 13:47 oracle
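
The difference between the broken and the fixed listings is the setuid and setgid bits (the two "s" characters in -rwsr-s--x). A minimal sketch for checking them from a script follows; has_setuid_setgid is a made-up helper name, and the loop in the comment assumes GRID_HOME and ORACLE_HOME are set in your environment.

```shell
# Succeed only if the file has both the setuid and the setgid bit set,
# as in the -rwsr-s--x listings above. Hypothetical helper name.
has_setuid_setgid() {
    [ -u "$1" ] && [ -g "$1" ]
}

# Possible use:
#   for f in "$GRID_HOME/bin/oracle" "$ORACLE_HOME/bin/oracle"; do
#       has_setuid_setgid "$f" || echo "check permissions on $f"
#   done
```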

Monday, June 15, 2015

ORA-02097: parameter cannot be modified because specified value is invalid

Alert log Entry

ORA-02097: parameter cannot be modified because specified value is invalid
ORA-00068: invalid value 4000 for parameter parallel_max_servers, must be between 0 and 3600
CKPT (ospid: 3277292): terminating the instance due to error 2097
Mon Jun 15 12:30:13 

Cause 1
This can be caused by a system-level change to the number of CPUs, which shuts down all the instances on that server.

Solution

Simply start the instances again; no further action is needed.

Cause 2

This is related to parallel_max_servers, which is calculated as

PARALLEL_THREADS_PER_CPU * CPU_COUNT * concurrent_parallel_users * 5

and its value must be between 0 and 3600, as stated in the ORA-00068 message above.

Solution

Set it to a reasonable value within the allowed range using ALTER SYSTEM.
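
To see how the formula can overshoot the limit, here is a small worked example in shell arithmetic. The input values (2 threads per CPU, 100 CPUs, 4 concurrent parallel users) are hypothetical, chosen only because their product matches the 4000 in the alert log entry; the 0 to 3600 range is the one reported in the ORA-00068 message.

```shell
# Compute PARALLEL_THREADS_PER_CPU * CPU_COUNT * concurrent_parallel_users * 5.
# $1 = PARALLEL_THREADS_PER_CPU, $2 = CPU_COUNT, $3 = concurrent_parallel_users
calc_parallel_max_servers() {
    echo $(( $1 * $2 * $3 * 5 ))
}

# Succeed only if the value lies in the range given by ORA-00068 (0 to 3600).
in_allowed_range() {
    [ "$1" -ge 0 ] && [ "$1" -le 3600 ]
}

# Hypothetical inputs: 2 * 100 * 4 * 5 = 4000, which is outside the range.
value=$(calc_parallel_max_servers 2 100 4)
in_allowed_range "$value" || echo "$value is outside 0..3600"
```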

Saturday, June 13, 2015

Sysman password changing in oracle 11g

We may need to change the passwords of different users, including sysman, as a security measure or for some other reason. After the sysman password is changed, however, the Enterprise Manager console may stop working until the password is also updated there, using the following steps.

step 1. Stop the dbconsole

export ORACLE_UNQNAME=DBNAME
emctl stop dbconsole

step 2. Connect as sysdba and change the password of sysman as required

SQL> alter user sysman identified by XXXXX;

step 3. Update the password

emctl setpasswd dbconsole

step 4. Start the dbconsole

export ORACLE_UNQNAME=DBNAME
emctl start dbconsole


Now the enterprise manager works properly

Friday, April 24, 2015

ORA-17628: Oracle error 19505 returned by remote Oracle server


RMAN error

released channel: t1
released channel: t2
released channel: t3
released channel: t4
released channel: t5
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 03/01/2015 09:19:11
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
RMAN-03009: failure of backup command on t1 channel at 03/01/2015 09:19:11
ORA-17628: Oracle error 19505 returned by remote Oracle server



CAUSE

1. Shortage of space in the destination directory, which is accompanied by alert log entries similar to the following:

ORA-19505: failed to identify file "+data"
ORA-17502: ksfdcre:4 Failed to create file +data
ORA-15041: diskgroup "DATA" space exhausted

Solution

Check the destinations for data files and archive files, specified by the DB_CREATE_FILE_DEST and DB_RECOVERY_FILE_DEST parameters respectively. Then make space in the diskgroup that the alert log file reports as exhausted, either by adding disks to it or by deleting unwanted files from it.


2. Missing destination directory for DB_CREATE_FILE_DEST and DB_RECOVERY_FILE_DEST

Solution

Specify appropriate valid destinations for DB_CREATE_FILE_DEST and DB_RECOVERY_FILE_DEST to match with DB_FILE_NAME_CONVERT and LOG_FILE_NAME_CONVERT.

Remember that this error sometimes requires us to create the destinations manually,

for instance:
mkdir ARCHIVELOG/ CONTROLFILE/ DATAFILE/ ONLINELOG/ PARAMETER/ TEMPFILE
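
The directory list above can also be created in one repeatable step. This is a sketch for a file system destination only; make_dest_dirs is a hypothetical helper name, and the base directory you pass in is whatever location your destination parameters point to.

```shell
# Create the standard subdirectories under a base destination directory.
# make_dest_dirs is a hypothetical helper; mkdir -p makes it safe to re-run.
make_dest_dirs() {
    base=$1
    for d in ARCHIVELOG CONTROLFILE DATAFILE ONLINELOG PARAMETER TEMPFILE; do
        mkdir -p "$base/$d" || return 1
    done
}
```

Call it with the destination directory as its only argument.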

Thursday, April 16, 2015

ORA-1652: unable to extend temp segment by 128 in tablespace TEMP

Reference Note 748251.1

Possible Cause 

ISSUE IS LIKELY CAUSED BY 
Bug 5689290 - VIEW V$RMAN_BACKUP_JOB_DETAILS SLOW CAUSING DATABASE CONSOLE LOGIN TIMEOUTS
which identified an issue with excessive temp space usage by emagent.

Bug 5689290 was closed as a duplicate of
Bug 5466436 - VIEW V$RMAN_BACKUP_JOB_DETAILS SLOW CAUSING DATABASE CONSOLE LOGIN TIMEOUTS
which was subsequently closed as a duplicate of:
Bug 8434467 - SLOW PERFORMANCE FOR QUERY ON V$RMAN_BACKUP_JOB_DETAILS

Solution

1.  Upgrade to any fixed version, e.g.
12.1 (Future Release)
11.2.0.2 (Server Patch Set)
11.2.0.1 Patch 4 on Windows Platforms
11.1.0.7 Patch 37 on Windows Platforms
10.2.0.5 Patch 4 on Windows Platforms
10.2.0.4 Patch 41 on Windows Platforms
- OR -

2. Apply Patch 8434467, if available for your platform and Oracle version.
- OR -

3. Use the following workaround to reduce the temp usage by emagent / dbsnmp:
-- delete the statistics on the fixed object
exec dbms_stats.DELETE_TABLE_STATS('SYS','X$KCCRSR');
-- lock that object so that statistics will not be collected in future
exec dbms_stats.LOCK_TABLE_STATS('SYS','X$KCCRSR');
-- this step cannot be skipped
alter system flush shared_pool;

Tuesday, March 10, 2015

ORA-01580: error creating control backup

ERROR

Soon after the duplication or restore is launched, RMAN tries to restore the controlfile before it mounts the auxiliary database. This is not possible because of the RMAN configuration for the snapshot controlfile, which we can see with
RMAN> show all;

or specifically

RMAN> show snapshot controlfile name;

As a result, the RMAN session hangs for some time and then throws the following error

RMAN-00571: ======================================================
RMAN-00569: ======== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ======================================================
RMAN-03002: failure of show command at 02/01/2015 01:35:50
RMAN-03014: implicit resync of recovery catalog failed
RMAN-03009: failure of full resync command on default channel at 11/08/2011 01:35:50
ORA-01580: error creating control backup file /u01/snapcf_ggdb1.f


SOLUTION

RMAN> CONFIGURE SNAPSHOT CONTROLFILE NAME CLEAR;

sqlplus "/ as sysdba"
sys@db1> EXECUTE SYS.DBMS_BACKUP_RESTORE.CFILESETSNAPSHOTNAME('/u01/app/oracle/product/11.2/db/dbs/snapcf_db1.f');
PL/SQL procedure successfully completed.

And then restart the duplication or restore.
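
Before restarting, it can be worth confirming that the new snapshot controlfile location is usable. The sketch below assumes the failure was that the configured location could not be written, which this post does not state explicitly; snapshot_dir_ok is a hypothetical helper name, and the check should be run as the OS user that owns the database.

```shell
# Succeed only if the directory that would hold the snapshot controlfile
# exists and is writable by the current OS user. Hypothetical helper name.
snapshot_dir_ok() {
    dir=$(dirname "$1")
    [ -d "$dir" ] && [ -w "$dir" ]
}

# Possible use, with the path from the solution above:
#   snapshot_dir_ok /u01/app/oracle/product/11.2/db/dbs/snapcf_db1.f \
#       || echo "snapshot controlfile directory is missing or not writable"
```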