Thursday, 19 September 2013

OGG-01031 Pump Abending

A fairly common occurance in the Golden Gate world. The pump abends as the target destination becomes full.
In this scenario, I had tried restarting pump process and clearing down space but that didn't resolve the problem.
The pump error message is below


2013-09-19 10:25:42 INFO OGG-00993 Oracle GoldenGate Capture for Oracle, test_ext.prm: EXTRACT TEST_EXT started. 2013-09-19 10:25:42  INFO OGG-01226 Oracle GoldenGate Capture for Oracle, test_ext.prm: Socket buffer size set to 27985 (flush size 27985). 2013-09-19 10:25:42  ERROR OGG-01031 Oracle GoldenGate Capture for Oracle, test_ext.prm: There is a problem in network communication, a remote file problem, encryption keys for target and source do not match (if using ENCRYPT) or an unknown error. (Reply received is Unable to open file "/oracle/ggate/dirdat/rp/rp000213" (error 13, Permission denied)). 2013-09-19 10:25:42  ERROR OGG-01668 Oracle GoldenGate Capture for Oracle, test_ext.prm: PROCESS ABENDING.

In this case we have to manually tell Extract to rollover

alter extract TEST_EXT etrollover
2013-09-19 10:57:19 INFO OGG-01520 Oracle GoldenGate Command Interpreter for Oracle: Rollover performed. For each affected output trail of Version 10 or higher format, after starting the source extract, issue ALTER EXTSEQNO for that trail's reader (either pump EXTRACT or REPLICAT) to move the reader's scan to the new trail file; it will not happen automatically.


Run info pumpname to get the new trail file name.

At the target end, the replicat should have parsed all the data received until the drive became full and data stopped coming through.
To let the target know to switch to the new file, use below command.

alter replicat TEST_EXT ,EXTSEQNO 214 , EXTRBA 0 --- replicat was reading the seqno 213. moved to 214.





Installing Grid Infrastructure for a Standalone Server

There is a very good article with a tutorial to install Grid Infrastructure on a Standalone Server

http://www.oracle.com/webfolder/technetwork/tutorials/obe/db/11g/r2/prod/install/gridinstss/gridinstss.htm

Tuesday, 3 September 2013

DROP DATABASE VIA RMAN

If drop database command is run under RMAN then it will drop datafiles, online redo logs ,controlfiles , backups and also unregister database from recovery catalog. Only thing I noticed that was left unclean was flashback logs that has to be deleted manually.

In order to drop database within RMAN, make sure that database is in mount state and in restrictive mode.

RMAN> CONNECT TARGET /


connected to target database: TESTDB1 (DBID=39523451)

RMAN> STARTUP FORCE MOUNT
RMAN> SQL 'ALTER SYSTEM ENABLE RESTRICTED SESSION';
RMAN> DROP DATABASE INCLUDING BACKUPS NOPROMPT;


NOPROMPT will not ask for confirmation so if you want to be sure then use
 RMAN> DROP DATABASE INCLUDING BACKUPS;

Above command will use controlfile of the database so in order to user catalog use

RMAN > CONNECT TARGET SYS@TESTDB CATALOG RMAN@RMANCAT


 

Monday, 2 September 2013

DATA PUMP COMMAND

Data pump was introduced in oracle 10G and has become very common tool in 11G and 12C databases.

quick reference to use data pump export and import command with sysdba privileges is


For data pump you need to create a directory in database
$sqlplus / as sysdba
sql> create or replace directory DUMP_DIR as '/tmp/data_dump';


$expdp \'/ as sysdba \' dumpfile=exp_test_user.dmp logfile=exp_test_user.log directory=DUMP_DIR schemas=TEST_USER


Similar command for data pump import utility

$impdp \'/ as sysdba \' dumpfile=exp_test_user.dmp logfile=imp_test_user.log directory=DUMP_DIR tables=TEST_USER.TEST


If you just want to gnerate script for object creation then

$expdp \'/ as sysdba \' dumpfile=exp_test_user.dmp logfile=exp_test_user.log directory=DUMP_DIR schemas=TEST_USER content=METADATA_ONLY


$impdp \'/ as sysdba \' dumpfile=exp_test_user.dmp logfile=imp_test_user.log directory=DUMP_DIR tables=TEST_USER.TEST
content=METADATA_ONLY sqlpfile=testscript.sql


 

Golden Gate commands -- Quick Reference

GGCGI is golden gate command line interface which can be invoked by running
./ggsci


TO CHECK PROCESS
1> info all
or
1> show all

TO SEE DETAIL OF SPECIFIC PROCESS
2> info extract ext1


TO CHECK VERSION

to display operating system and database version information. Use DBLOGIN to connect to the database first
GGSCI  11> dblogin userid ggadmin, password gg_pwd
Successfully logged into database.
GGSCI 12> versions
Operating System:
Linux
Version #1 SMP Mon Mar 02  08:19:03 EDT 2012 Release 2.6.18-194.el5PAE
Node: linux01.oncalldba.com
Machine: i686
Database:
Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 – Production
PL/SQL Release 11.1.0.6.0 – Production
CORE 11.1.0.6.0 Production
TNS for Linux: Version 11.1.0.6.0 – Production
NLSRTL Version 11.1.0.6.0 – Production



TO RUN SHELL SCRIPT WITHIN GGSCI

GGSCI  13> shell ls -l /home/oracle/goldengate/dirdat/
total 50588
-rw-rw-r- 1 oracle oinstall     5962 Jul 16 14:57 tst000000
-rw-rw-r- 1 oracle oinstall 9976629 Jul 19 13:09 tst000001

OBEY: to process a file that contains a list of GoldenGate commands. OBEY is useful for executing commands that are frequently used in sequence.
Suppose we want to run some Goldengate commands and execute those from a Unix shell script which can be called via say cron.
So we have a start_goldgate.sh shell script which will use the GGSCI command OBEY to call a text file which has the Goldengate commands which we would like to run in sequence.
[oracle@linux01 goldengate]$ cat start_goldgate.sh
cd /home/oracle/goldengate
./ggsci < EOF
OBEY /home/oracle/goldengate/startup.txt
EOF
[oracle@linux01 goldengate]$ cat startup.txt
START MANAGER
START EXTRACT EXT3
START EXTRACT DPUMP
INFO ALL
[oracle@linux01 goldengate]$ ./start_goldgate.sh
Oracle GoldenGate Command Interpreter for Oracle
Version 10.4.0.19 Build 002
Linux, x86, 32bit (optimized), Oracle 11 on Sep 29 2009 08:50:50

Copyright (C) 1995, 2009, Oracle and/or its affiliates.  All rights reserved.



GGSCI  1>
GGSCI  2> START MANAGER


Manager started.


GGSCI  3> START EXTRACT EXT3

EXTRACT EXT3 is already running.


GGSCI  4> START EXTRACT DPUMP

EXTRACT DPUMP is already running.


GGSCI  5> INFO ALL


Program     Status      Group       Lag           Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     DPUMP       00:00:00      00:00:01
EXTRACT     ABENDED     EXT2        00:00:00      172:36:58
EXTRACT     RUNNING     EXT3        00:00:00      00:00:04
 
 
 

Thursday, 22 August 2013

ORACLE RAC 11g R2 Commands


This is collection of commands works on 11g R2. I found that lots of command have changed from 11G R1 to 11G R2 so this list can be used to quickly find relevant command.

1. To register a listener in CRS

$srvctl add listener -l LISTENER_TEST1 -p TCP:1521/IPC:LISTENER_TEST1 -o /ora/grid/product/11.2.0.3

2. To start the new listener

$srvctl start listener -l LISTENER_TEST1

3. To register a database with new listener

SQL> sqlplus / as sysdba
connected.
SQL> alter system set local_listener='(DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_TEST1)))' scope=both;

SQL> show parameter local_listener



4. The most used to command "crs_stat -t" is gone (deprecated) on 11gR2. Instead you

crsctl status resource [the output will be status of all resources]

You can still use the "-t" option to get a tabular report though ;)

crsctl status resource -t

Specify resource name "oracle.mydb.vip" to get the status of any particular resource -
crsctl status resource oracle.mydb.vip
 
 

Deprecated Command
Replacement Commands
crs_stat

crsctl check cluster -all
crsctl stat res -t
crs_register



crsctl add resource
crsctl add type
crsctl modify resource
crsctl modify type
crs_unregister
crsctl stop resource

crsctl modify resource resource_name -attr
"AUTO_START=never"
crs_start


crsctl start resource
crsctl start crs
crsctl start cluster
crs_stop


crsctl stop resource
crsctl stop crs
crsctl stop cluster
crs_getperm

crsctl getperm resource
crsctl getperm type
crs_profile





crsctl add resource
crsctl add type
crsctl status resource
crsctl status type
crsctl modify resource
crsctl modify type
crs_relocate
crsctl relocate resource
crs_setperm

crsctl setperm resource
crsctl setperm type
crsctl check crsd
crsctl check crs
crsctl check cssd
crsctl check css
crsctl check evmd
crsctl check evm
crsctl debug res log resource_name:level
crsctl set log
crsctl set css votedisk
crsctl add css votedisk

crsctl delete css votedisk

crsctl query css votedisk

crsctl replace css votedisk
crsctl start resources
crsctl start resource -all
crsctl stop resources
crsctl stop resource -all
 

Wednesday, 21 August 2013

To Calculate and find reason of High Redo generation

To know the exact reason for the high redo, we need information about the redo activity and the details of the load. Following information need to be collected for the duration of high redo generation.


1] To know the trend of log switches below queries can be used.

SQL> alter session set NLS_DATE_FORMAT='DD-MON-YYYY HH24:MI:SS';

Session altered.

SQL> select trunc(first_time, 'HH') , count(*)
2 from v$loghist
3 group by trunc(first_time, 'HH')
4 order by trunc(first_time, 'HH');


We can also get the information about the log switches from alert log (by looking at the messages 'Thread 1 advanced to log sequence' and counting them for the duration), AWR report.


a) AWR Report

-- Create an AWR snapshot when you are able to reproduce the issue:
SQL> exec DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();

-- After 30 minutes, create a new snapshot:
SQL> exec DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();

-- Now run
$ORACLE_HOME/rdbms/admin/awrrpt.sql


b) Statspack Report

SQL> connect perfstat/
SQL> execute statspack.snap;

-- After 30 minutes
SQL> execute statspack.snap;
SQL> @?/rdbms/admin/spreport


In the AWR/Statspack report look out for queries with highest gets/execution. You can check in the "load profile" section for "Redo size" and compare it with non-problematic duration.


2] We need to mine the archivelogs generated during the time frame of high redo generation.

-- Use the DBMS_LOGMNR.ADD_LOGFILE procedure to create the list of logs to be analyzed:

SQL> execute DBMS_LOGMNR.ADD_LOGFILE('',options => dbms_logmnr.new);
SQL> execute DBMS_LOGMNR.ADD_LOGFILE('',options => dbms_logmnr.addfile);

-- Start the logminer

SQL> execute DBMS_LOGMNR.START_LOGMNR(OPTIONS => DBMS_LOGMNR.DICT_FROM_ONLINE_CATALOG);

SQL> select operation,seg_owner,seg_name,count(*) from v$logmnr_contents group by seg_owner,seg_name,operation;


Please refer to below article if there is any problem in using logminer.
Note 62508.1 - The LogMiner Utility

We can not get the Redo Size using Logminer but We can only get user,operation and schema responsible for high redo.


3] Run below query to know the session generating high redo at any specific time.

col program for a10
col username for a10
select to_char(sysdate,'hh24:mi'), username, program , a.sid, a.serial#, b.name, c.value
from v$session a, v$statname b, v$sesstat c
where b.STATISTIC# =c.STATISTIC#
and c.sid=a.sid and b.name like 'redo%'
order by value;

Oracle RAC node killed by CRS

I had faced this error message


[    CSSD]2012-01-12 15:24:19.352 [1199618400] >TRACE:   clssnmWaitThread: thrd(2), timeout(1000), wakeonpost(0)
[    CSSD]2012-01-12 15:24:19.353 [1220598112] >ERROR:   ###################################
[    CSSD]2012-01-12 15:24:19.353 [1220598112] >ERROR:   clssscExit: CSSD aborting from thread clssnmRcfgMgrThread
[    CSSD]2012-01-12 15:24:19.353 [1220598112] >ERROR:   ###################################

First of all location of log file is $CRS_HOME/log//cssd/
and file name is ocssd.log

There could be many reason for this error but in a nutshell CSSD has killed the local host connection to rest of RAC cluster. In this case, you will notice a hint on a line above ERROR which says that timeout is happening. Further investigating the log files I noticed that heartbeat between nodes is not fast enough. 
When I checked interface used by interconnect then noticed that it is running on slow speed.

By Changing speed of network interface resolve this problem.
 

Dropping Disk from ASM Group

Dropping disk in ASM by this method will mark the disk for drop and then start doing rebalance. Once it completes moving data off the ear-marked disks then it will remove that disk from diskgroup.
This is different concept than dropping disks on OS and storage level, as in both disk are unavailable straight away after comman executes.

This also helps to do addition and deletion of disks by doing one rebalance. So if new disks are added and straight after that old disks are dropped in ASM then ASM will only do one rebalance considering both new and removed disks.

Alter diskgroup DATA drop disk '/dev/data_disk1';
Diskgroup altered.



 

Calculate how much time/work is left to finish by RMAN

This command is specific to monitor RMAN activity but can be modified easily to monitor activity by other sessions

select SID, START_TIME,TOTALWORK, sofar, (sofar/totalwork) * 100 done,
sysdate + TIME_REMAINING/3600/24 end_at
from v$session_longops
where totalwork > sofar
AND opname NOT LIKE '%aggregate%'
AND opname like 'RMAN%'
/

 
or use  this

select OPERATION,MBYTES_PROCESSED from v$rman_status where status='RUNNING';