Ajwat's blog on Oracle: August 2013

Thursday, 22 August 2013

ORACLE RAC 11g R2 Commands

This is collection of commands works on 11g R2. I found that lots of command have changed from 11G R1 to 11G R2 so this list can be used to quickly find relevant command.

1. To register a listener in CRS

$srvctl add listener -l LISTENER_TEST1 -p TCP:1521/IPC:LISTENER_TEST1 -o /ora/grid/product/11.2.0.3

2. To start the new listener

$srvctl start listener -l LISTENER_TEST1

3. To register a database with new listener

SQL> sqlplus / as sysdba
connected.
SQL> alter system set local_listener='(DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_TEST1)))' scope=both;

SQL> show parameter local_listener

4. The most used to command "crs_stat -t" is gone (deprecated) on 11gR2. Instead you

crsctl status resource [the output will be status of all resources]

You can still use the "-t" option to get a tabular report though ;)

crsctl status resource -t

Specify resource name "oracle.mydb.vip" to get the status of any particular resource -

crsctl status resource oracle.mydb.vip

Deprecated Command	Replacement Commands
crs_stat	crsctl check cluster -all
	crsctl stat res -t

crs_register	crsctl add resource
	crsctl add type
	crsctl modify resource
	crsctl modify type
crs_unregister	crsctl stop resource
	crsctl modify resource resource_name -attr "AUTO_START=never"
crs_start	crsctl start resource
	crsctl start crs
	crsctl start cluster
crs_stop	crsctl stop resource
	crsctl stop crs
	crsctl stop cluster


crs_getperm	crsctl getperm resource
crs_getperm	crsctl getperm type

crs_profile	crsctl add resource
	crsctl add type
	crsctl status resource
	crsctl status type
	crsctl modify resource
	crsctl modify type

crs_relocate	crsctl relocate resource
crs_setperm	crsctl setperm resource
	crsctl setperm type

crsctl check crsd	crsctl check crs
crsctl check cssd	crsctl check css
crsctl check evmd	crsctl check evm


crsctl debug res log resource_name:level	crsctl set log

crsctl set css votedisk	crsctl add css votedisk
	crsctl delete css votedisk
	crsctl query css votedisk
	crsctl replace css votedisk

crsctl start resources	crsctl start resource -all
crsctl stop resources	crsctl stop resource -all

Wednesday, 21 August 2013

To Calculate and find reason of High Redo generation

To know the exact reason for the high redo, we need information about the redo activity and the details of the load. Following information need to be collected for the duration of high redo generation.

1] To know the trend of log switches below queries can be used.

SQL> alter session set NLS_DATE_FORMAT='DD-MON-YYYY HH24:MI:SS';

Session altered.

SQL> select trunc(first_time, 'HH') , count(*)
2 from v$loghist
3 group by trunc(first_time, 'HH')
4 order by trunc(first_time, 'HH');

We can also get the information about the log switches from alert log (by looking at the messages 'Thread 1 advanced to log sequence' and counting them for the duration), AWR report.

a) AWR Report

-- Create an AWR snapshot when you are able to reproduce the issue:
SQL> exec DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();

-- After 30 minutes, create a new snapshot:
SQL> exec DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();

-- Now run
$ORACLE_HOME/rdbms/admin/awrrpt.sql

b) Statspack Report

SQL> connect perfstat/
SQL> execute statspack.snap;

-- After 30 minutes
SQL> execute statspack.snap;
SQL> @?/rdbms/admin/spreport

In the AWR/Statspack report look out for queries with highest gets/execution. You can check in the "load profile" section for "Redo size" and compare it with non-problematic duration.

2] We need to mine the archivelogs generated during the time frame of high redo generation.

-- Use the DBMS_LOGMNR.ADD_LOGFILE procedure to create the list of logs to be analyzed:

SQL> execute DBMS_LOGMNR.ADD_LOGFILE('',options => dbms_logmnr.new);
SQL> execute DBMS_LOGMNR.ADD_LOGFILE('',options => dbms_logmnr.addfile);

-- Start the logminer

SQL> execute DBMS_LOGMNR.START_LOGMNR(OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG);

SQL> select operation,seg_owner,seg_name,count(*) from v$logmnr_contents group by seg_owner,seg_name,operation;

Please refer to below article if there is any problem in using logminer.
Note 62508.1 - The LogMiner Utility

We can not get the Redo Size using Logminer but We can only get user,operation and schema responsible for high redo.

3] Run below query to know the session generating high redo at any specific time.

col program for a10
col username for a10
select to_char(sysdate,'hh24:mi'), username, program , a.sid, a.serial#, b.name, c.value
from v$session a, v$statname b, v$sesstat c
where b.STATISTIC# =c.STATISTIC#
and c.sid=a.sid and b.name like 'redo%'
order by value;

Oracle RAC node killed by CRS

I had faced this error message

[    CSSD]2012-01-12 15:24:19.352 [1199618400] >TRACE:   clssnmWaitThread: thrd(2), timeout(1000), wakeonpost(0)
[    CSSD]2012-01-12 15:24:19.353 [1220598112] >ERROR:   ###################################
[    CSSD]2012-01-12 15:24:19.353 [1220598112] >ERROR:   clssscExit: CSSD aborting from thread clssnmRcfgMgrThread
[    CSSD]2012-01-12 15:24:19.353 [1220598112] >ERROR:   ###################################

First of all location of log file is $CRS_HOME/log//cssd/
and file name is ocssd.log

There could be many reason for this error but in a nutshell CSSD has killed the local host connection to rest of RAC cluster. In this case, you will notice a hint on a line above ERROR which says that timeout is happening. Further investigating the log files I noticed that heartbeat between nodes is not fast enough.
When I checked interface used by interconnect then noticed that it is running on slow speed.

By Changing speed of network interface resolve this problem.

Dropping Disk from ASM Group

Dropping disk in ASM by this method will mark the disk for drop and then start doing rebalance. Once it completes moving data off the ear-marked disks then it will remove that disk from diskgroup.
This is different concept than dropping disks on OS and storage level, as in both disk are unavailable straight away after comman executes.

This also helps to do addition and deletion of disks by doing one rebalance. So if new disks are added and straight after that old disks are dropped in ASM then ASM will only do one rebalance considering both new and removed disks.

Alter diskgroup DATA drop disk '/dev/data_disk1';
Diskgroup altered.

Calculate how much time/work is left to finish by RMAN

This command is specific to monitor RMAN activity but can be modified easily to monitor activity by other sessions

select SID, START_TIME,TOTALWORK, sofar, (sofar/totalwork) * 100 done,
sysdate + TIME_REMAINING/3600/24 end_at
from v$session_longops
where totalwork > sofar
AND opname NOT LIKE '%aggregate%'
AND opname like 'RMAN%'
/

　
or use this

select OPERATION,MBYTES_PROCESSED from v$rman_status where status='RUNNING';