Thursday 22 August 2013

ORACLE RAC 11g R2 Commands


This is collection of commands works on 11g R2. I found that lots of command have changed from 11G R1 to 11G R2 so this list can be used to quickly find relevant command.

1. To register a listener in CRS

$srvctl add listener -l LISTENER_TEST1 -p TCP:1521/IPC:LISTENER_TEST1 -o /ora/grid/product/11.2.0.3

2. To start the new listener

$srvctl start listener -l LISTENER_TEST1

3. To register a database with new listener

SQL> sqlplus / as sysdba
connected.
SQL> alter system set local_listener='(DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_TEST1)))' scope=both;

SQL> show parameter local_listener



4. The most used to command "crs_stat -t" is gone (deprecated) on 11gR2. Instead you

crsctl status resource [the output will be status of all resources]

You can still use the "-t" option to get a tabular report though ;)

crsctl status resource -t

Specify resource name "oracle.mydb.vip" to get the status of any particular resource -
crsctl status resource oracle.mydb.vip
 
 

Deprecated Command
Replacement Commands
crs_stat

crsctl check cluster -all
crsctl stat res -t
crs_register



crsctl add resource
crsctl add type
crsctl modify resource
crsctl modify type
crs_unregister
crsctl stop resource

crsctl modify resource resource_name -attr
"AUTO_START=never"
crs_start


crsctl start resource
crsctl start crs
crsctl start cluster
crs_stop


crsctl stop resource
crsctl stop crs
crsctl stop cluster
crs_getperm

crsctl getperm resource
crsctl getperm type
crs_profile





crsctl add resource
crsctl add type
crsctl status resource
crsctl status type
crsctl modify resource
crsctl modify type
crs_relocate
crsctl relocate resource
crs_setperm

crsctl setperm resource
crsctl setperm type
crsctl check crsd
crsctl check crs
crsctl check cssd
crsctl check css
crsctl check evmd
crsctl check evm
crsctl debug res log resource_name:level
crsctl set log
crsctl set css votedisk
crsctl add css votedisk

crsctl delete css votedisk

crsctl query css votedisk

crsctl replace css votedisk
crsctl start resources
crsctl start resource -all
crsctl stop resources
crsctl stop resource -all
 

Wednesday 21 August 2013

To Calculate and find reason of High Redo generation

To know the exact reason for the high redo, we need information about the redo activity and the details of the load. Following information need to be collected for the duration of high redo generation.


1] To know the trend of log switches below queries can be used.

SQL> alter session set NLS_DATE_FORMAT='DD-MON-YYYY HH24:MI:SS';

Session altered.

SQL> select trunc(first_time, 'HH') , count(*)
2 from v$loghist
3 group by trunc(first_time, 'HH')
4 order by trunc(first_time, 'HH');


We can also get the information about the log switches from alert log (by looking at the messages 'Thread 1 advanced to log sequence' and counting them for the duration), AWR report.


a) AWR Report

-- Create an AWR snapshot when you are able to reproduce the issue:
SQL> exec DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();

-- After 30 minutes, create a new snapshot:
SQL> exec DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();

-- Now run
$ORACLE_HOME/rdbms/admin/awrrpt.sql


b) Statspack Report

SQL> connect perfstat/
SQL> execute statspack.snap;

-- After 30 minutes
SQL> execute statspack.snap;
SQL> @?/rdbms/admin/spreport


In the AWR/Statspack report look out for queries with highest gets/execution. You can check in the "load profile" section for "Redo size" and compare it with non-problematic duration.


2] We need to mine the archivelogs generated during the time frame of high redo generation.

-- Use the DBMS_LOGMNR.ADD_LOGFILE procedure to create the list of logs to be analyzed:

SQL> execute DBMS_LOGMNR.ADD_LOGFILE('',options => dbms_logmnr.new);
SQL> execute DBMS_LOGMNR.ADD_LOGFILE('',options => dbms_logmnr.addfile);

-- Start the logminer

SQL> execute DBMS_LOGMNR.START_LOGMNR(OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG);

SQL> select operation,seg_owner,seg_name,count(*) from v$logmnr_contents group by seg_owner,seg_name,operation;


Please refer to below article if there is any problem in using logminer.
Note 62508.1 - The LogMiner Utility

We can not get the Redo Size using Logminer but We can only get user,operation and schema responsible for high redo.


3] Run below query to know the session generating high redo at any specific time.

col program for a10
col username for a10
select to_char(sysdate,'hh24:mi'), username, program , a.sid, a.serial#, b.name, c.value
from v$session a, v$statname b, v$sesstat c
where b.STATISTIC# =c.STATISTIC#
and c.sid=a.sid and b.name like 'redo%'
order by value;

Oracle RAC node killed by CRS

I had faced this error message


[    CSSD]2012-01-12 15:24:19.352 [1199618400] >TRACE:   clssnmWaitThread: thrd(2), timeout(1000), wakeonpost(0)
[    CSSD]2012-01-12 15:24:19.353 [1220598112] >ERROR:   ###################################
[    CSSD]2012-01-12 15:24:19.353 [1220598112] >ERROR:   clssscExit: CSSD aborting from thread clssnmRcfgMgrThread
[    CSSD]2012-01-12 15:24:19.353 [1220598112] >ERROR:   ###################################

First of all location of log file is $CRS_HOME/log//cssd/
and file name is ocssd.log

There could be many reason for this error but in a nutshell CSSD has killed the local host connection to rest of RAC cluster. In this case, you will notice a hint on a line above ERROR which says that timeout is happening. Further investigating the log files I noticed that heartbeat between nodes is not fast enough. 
When I checked interface used by interconnect then noticed that it is running on slow speed.

By Changing speed of network interface resolve this problem.
 

Dropping Disk from ASM Group

Dropping disk in ASM by this method will mark the disk for drop and then start doing rebalance. Once it completes moving data off the ear-marked disks then it will remove that disk from diskgroup.
This is different concept than dropping disks on OS and storage level, as in both disk are unavailable straight away after comman executes.

This also helps to do addition and deletion of disks by doing one rebalance. So if new disks are added and straight after that old disks are dropped in ASM then ASM will only do one rebalance considering both new and removed disks.

Alter diskgroup DATA drop disk '/dev/data_disk1';
Diskgroup altered.



 

Calculate how much time/work is left to finish by RMAN

This command is specific to monitor RMAN activity but can be modified easily to monitor activity by other sessions

select SID, START_TIME,TOTALWORK, sofar, (sofar/totalwork) * 100 done,
sysdate + TIME_REMAINING/3600/24 end_at
from v$session_longops
where totalwork > sofar
AND opname NOT LIKE '%aggregate%'
AND opname like 'RMAN%'
/

 
or use  this

select OPERATION,MBYTES_PROCESSED from v$rman_status where status='RUNNING';