See OCFS Oracle Cluster Filesystem, ASM, TNSnames configuration, Oracle Database 11g New Features, Raw devices, Resource Manager, Dbca See http://www.oracle.com/technology/support/metalink/index.html to view certification matrix
This is just a draft of basic RAC 10g administration |
RAC benefits and characteristics
- does not protect from human errors
- increased availability from node/instance failure
- speeds up parallel DSS queries
- no speed-up for parallel OLTP processes
- no availability increase on data failures
- no availability increase on network failures
- no availability increase on release upgrades
- scalability does not increase for application workloads in all cases
RAC tuning - after migration to RAC, test:
- Interconnect latency
- Instance recovery time
- Applications strongly relying on table truncates, full table scans, sequences and non-sequence key generation, global context variables |
RAC specific background processes for the database instance
Cluster Synchronization Service (CSS): ocssd daemon, manages cluster configuration
Cluster Ready Services (CRS): crsd daemon, manages resources (listeners, VIPs, Global Service Daemon GSD, Oracle Notification Service ONS); backs up the OCR every four hours; configuration is stored in OCR
Event Manager (EVM): evmd daemon, publishes events
LMSn: coordinates block updates
LMON: global enqueue for shared locks
LMDn: manages requests for global enqueues
LCK0: handles resources not requiring Cache Fusion
DIAG: collects diagnostic info
GSD 9i is not compatible with 10g |
FAN Fast Application Notification
- Must connect using service
Logged to:
$ORA_CRS_HOME/racg/dump
$ORA_CRS_HOME/log/<nodename>/racg
Event format:
<event_type> VERSION=<n.n> service=<service_name.db_domain_name> [database=<db_unique_name> [instance=<instance_name>]] [host=<hostname>] status=<event_status> reason=<event_reason> [card=<n>] timestamp=<event_date> <event_time>
event_type       Description
SERVICE          Primary application service event
SRV_PRECONNECT   Preconnect application service event (TAF)
SERVICEMEMBER    Application service on a specific instance event
DATABASE         Database event
INSTANCE         Instance event
ASM              ASM instance event
NODE             Cluster node event
#FAN events can control the workload per instance for each service |
Oracle Notification Service ONS
- Transmits FAN events
- For every FAN event status change, all executables in $ORA_CRS_HOME/racg/usrco are launched (callout scripts)
The ONS process is $ORA_CRS_HOME/opmn/bin/ons
Arguments:
-d: Run in daemon mode
-a <command>: <command> can be [ping, shutdown, reload, or debug]
[$ORA_CRS_HOME/opmn/conf/ons.config]
localport=6100
remoteport=6200
loglevel=3
useocr=on
onsctl start/stop/ping/reconfig/debug/detailed
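A callout script placed in $ORA_CRS_HOME/racg/usrco could look like the sketch below. It assumes the event reaches the script as command-line arguments in the FAN format shown earlier (event type first, then key=value pairs). The names fan_field, handle_event and FAN_CALLOUT_LOG are made up for this example and are not part of Oracle Clusterware.

```shell
#!/bin/bash
# Hypothetical FAN callout sketch. Assumption: the event arrives as
# command-line arguments: <event_type> followed by key=value pairs.

# fan_field KEY ARGS... : print the value of the first KEY=value argument.
fan_field() {
    local key="$1" arg
    shift
    for arg in "$@"; do
        case "$arg" in
            "$key"=*) printf '%s\n' "${arg#*=}"; return 0 ;;
        esac
    done
    return 1
}

# handle_event ARGS... : build one summary line for the event.
handle_event() {
    local event_type="$1"
    shift
    printf '%s service=%s status=%s reason=%s\n' \
        "$event_type" \
        "$(fan_field service "$@")" \
        "$(fan_field status "$@")" \
        "$(fan_field reason "$@")"
}

# When executed as a callout, append the summary to a log file
# (FAN_CALLOUT_LOG is an illustrative variable, not an Oracle setting).
if [ "${BASH_SOURCE[0]}" = "$0" ] && [ "$#" -gt 0 ]; then
    handle_event "$@" >> "${FAN_CALLOUT_LOG:-/tmp/fan_callout.log}"
fi
```

Fields that are absent from an event (for example service on a NODE event) simply come out empty in the summary line.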
|
FCF Fast Connection Failover
- A JDBC application configured to use FCF automatically subscribes to FAN events
- A JDBC application must use service names to connect
- A JDBC application must use implicit connection cache
- $ORACLE_HOME/opmn/lib/ons.jar must be in classpath
- -Doracle.ons.oraclehome=<location of oracle home>
or
System.setProperty("oracle.ons.oraclehome", "/u01/app/oracle/product/10.2.0/db_1");
OracleDataSource ods = new OracleDataSource();
ods.setUser("USER1");
ods.setPassword("USER1");
ods.setConnectionCachingEnabled(true);
ods.setFastConnectionFailoverEnabled(true);
ods.setConnectionCacheName("MyCache");
//cp is a java.util.Properties object holding the cache properties
ods.setConnectionCacheProperties(cp);
ods.setURL("jdbc:oracle:thin:@(DESCRIPTION=(LOAD_BALANCE=on)(ADDRESS=(PROTOCOL=TCP)(HOST=london1-vip)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=london2-vip)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=SERVICE1)))");
|
Check for main Clusterware services up
#check Event Manager up
ps -ef | grep evmd
#check Cluster Synchronization Services up
ps -ef | grep ocssd
#check Cluster Ready Services up
ps -ef | grep crsd
#check Oracle Notification Service
ps -ef | grep ons
[/etc/inittab]
...
h1:35:respawn:/etc/init.d/init.evmd run >/dev/null 2>&1 </dev/null
h2:35:respawn:/etc/init.d/init.cssd fatal >/dev/null 2>&1 </dev/null
h3:35:respawn:/etc/init.d/init.crsd run >/dev/null 2>&1 </dev/null |
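The four ps checks above can be folded into one helper. This is only a sketch: it matches on the same daemon names as the grep commands (so the init.* wrapper scripts also count as a match), and the function name missing_daemons is made up for this example.

```shell
#!/bin/bash
# missing_daemons: read `ps -ef` output on stdin and print the name of each
# Clusterware daemon that has no matching process.
# Lines containing "grep" are skipped so the check does not match itself.
missing_daemons() {
    local listing daemon
    listing="$(grep -v grep || true)"
    for daemon in evmd ocssd crsd; do
        if ! printf '%s\n' "$listing" | grep -q "$daemon"; then
            printf '%s\n' "$daemon"
        fi
    done
}

# Typical use on a cluster node (prints nothing when all three are running):
#   ps -ef | missing_daemons
```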
crs_stat
#Tested, as root
#Lists the status of an application profile and resources
#crs_stat [resource_name [...]] [-v] [-l] [-q] [-c cluster_node]
$ORA_CRS_HOME/bin/crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.e2.gsd     application    ONLINE    ONLINE    e2
ora.e2.ons     application    ONLINE    ONLINE    e2
ora.e2.vip     application    ONLINE    ONLINE    e2
VIP Normal
Name           Type           Target    State     Host
------------------------------------------------------------
ora.e2.vip     application    ONLINE    ONLINE    e2
ora.e2.vip     application    ONLINE    ONLINE    e3
VIP Node 2 is down
Name           Type           Target    State     Host
------------------------------------------------------------
ora.e2.vip     application    ONLINE    ONLINE    e2
ora.e2.vip     application    ONLINE    ONLINE    e2
crs_stat -p
...
AUTO_START = #2 CRS will not start after system boot
crs_stat
NAME=ora.RAC.RAC1.inst
TYPE=application
TARGET=ONLINE
STATE=ONLINE on london1
NAME=ora.RAC.SERVICE1.RAC1.srv
TYPE=application
TARGET=OFFLINE
STATE=OFFLINE
#use -v for verbose resource use
#use -p for a lot of details
#use -ls to view resources and relative owners |
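To spot resources that are not where they should be, the default crs_stat output can be filtered for entries whose STATE differs from TARGET. A minimal sketch, assuming one KEY=value pair per line as crs_stat prints without -t; the function name crs_mismatch is made up for this example.

```shell
#!/bin/bash
# crs_mismatch: read default `crs_stat` output (NAME=/TYPE=/TARGET=/STATE=
# stanzas) on stdin and print "name target state" for every resource whose
# current state differs from its target.
crs_mismatch() {
    awk -F= '
        $1 == "NAME"   { name = $2 }
        $1 == "TARGET" { target = $2 }
        $1 == "STATE"  {
            split($2, parts, " ")      # "ONLINE on london1" -> "ONLINE"
            if (parts[1] != target) print name, target, parts[1]
        }
    '
}

# Typical use:
#   $ORA_CRS_HOME/bin/crs_stat | crs_mismatch
```

A service that is deliberately stopped (TARGET=OFFLINE, STATE=OFFLINE) is not reported, only resources that should be up and are not, or the reverse.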
Voting disk
On Shared storage, Used by CSS, contains nodes that are currently available within the cluster
If Voting disks are lost and no backup is available then Oracle Clusterware must be reinstalled
3 way multiplexing is ideal
#backup a voting disk online
dd if=<fname> of=<out_fname>
crsctl
#Tested, as oracle
$ORA_CRS_HOME/bin/crsctl check crs
Cluster Synchronization Services appears healthy
Cluster Ready Services appears healthy
Event Manager appears healthy
#add online a new voting disk(10.2), -force if Oracle Clusterware is not started
crsctl add css votedisk 'new votedisk path' -force
crsctl start/stop/enable/disable crs
#set/unset parameters on OCR
crsctl set/unset <parameter> <value>
You can list the currently configured voting disks:
crsctl query css votedisk
0. 0 /u02/oradata/RAC/CSSFile1
1. 1 /u03/oradata/RAC/CSSFile2
2. 2 /u04/oradata/RAC/CSSFile3
Dynamically add and remove voting disks to an existing Oracle Clusterware installation:
crsctl add/delete css votedisk <path> -force
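The dd backup can be generated for every configured voting disk from the query output. A sketch under the assumption that each disk line ends with an absolute path, as in the listing above; votedisk_backup_cmds and the .bak suffix are made up for this example. The function only prints the commands so they can be reviewed before being run.

```shell
#!/bin/bash
# votedisk_backup_cmds DIR: read `crsctl query css votedisk` output on stdin
# and print one dd command per voting disk, writing into DIR.
# Assumption: each disk line has the path as its last field and starts with /.
votedisk_backup_cmds() {
    local dir="$1"
    awk -v dir="$dir" '
        $NF ~ /^\// {
            n = split($NF, parts, "/")
            printf "dd if=%s of=%s/%s.bak\n", $NF, dir, parts[n]
        }
    '
}

# Review the generated commands first, then pipe them to sh to run them:
#   crsctl query css votedisk | votedisk_backup_cmds /backup/votedisk
```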
CRS log and debug
#as root, enable extra debug for the running CRS daemons as well as those running in future
#enable to inspect system reboots
crsctl debug log crs
#Collect log and traces to upload to Oracle Support
diagcollection.pl
|
OCR - Oracle Cluster Registry
[/etc/oracle/ocr.loc](10g) or [/etc/oracle/srvConfig.loc](9i, still exists in 10g for compatibility)
ocrconfig_loc=/dev/raw/raw1
ocrmirrorconfig_loc=/dev/raw/raw2
local_only=FALSE
OCRCONFIG - Command-line tool for managing Oracle Cluster Registry
#recover OCR logically, must be done on all nodes
ocrconfig -import exp.dmp
#export OCR content logically
ocrconfig -export
#recover OCR from OCR backup
ocrconfig -restore bck.ocr
#show backup status
#the crsd daemon backs up the OCR every four hours, the most recent backup file is backup00.ocr
ocrconfig -showbackup
london1 2005/08/04 11:15:29 /u01/app/oracle/product/10.2.0/crs/cdata/crs
london1 2005/08/03 22:24:32 /u01/app/oracle/product/10.2.0/crs/cdata/crs
#change OCR autobackup location
ocrconfig -backuploc
#must be run on each affected node
ocrconfig -repair ocr <filename>
ocrconfig -repair ocrmirror <filename>
#force Oracle Clusterware to restart on a node, may lose recent OCR updates
ocrconfig -overwrite
CVU - Cluster verification utility to get status of CRS resources
dd : use it safely to backup voting disks when nodes are added/removed
#verify restore
cluvfy comp ocr -n all
ocrcheck
#OCR integrity check, validate the accessibility of the device and its block integrity
log to current dir or to $OCR_HOME/log/<node>/client
ocrdump
#dump the OCR content to a text file; if it succeeds then integrity of backups is verified
OCRDUMP - Identify the interconnect being used
$ORA_CRS_HOME/bin/ocrdump.bin -stdout -keyname SYSTEM.css.misscount -xml
|
Pre install, prerequisite
(./run)cluvfy : run from install media or CRS_HOME, verify prerequisites on all nodes
Post installation
- Backup root.sh
- Set up other user accounts
- Verify Enterprise Manager / Cluster Registry by running srvctl config database -d db_name |
SRVCTL
Stores info in OCR, manages: Database, Instance, Service, Node applications, ASM, Listener
srvctl config database -d <db_name> : Verify Enterprise Manager / Cluster Registry
set SRVM_TRACE=TRUE environment var to create Java based tool trace/debug file for srvctl
#-v to check services
srvctl status database -d RAC -v SERVICE1
srvctl start database -d <name> [-o mount]
srvctl stop database -d <name> [-o stop_options]
#moves parameter file
srvctl modify database -d name -p /u03/oradata/RAC/spfileRAC.ora
srvctl remove database -d TEST
#Verify the OCR configuration
srvctl config database -d TEST
srvctl start instance -d RACDB -i "RAC3,RAC4"
srvctl stop instance -d <orcl> -i "orcl3,orcl4" -o immediate
srvctl add instance -d RACDB -i RAC3 -n londonS
#move the instance to node london4
srvctl modify instance -d RAC -i RAC3 -n london4
#set a dependency of instance RAC3 to +ASM3
srvctl modify instance -d RAC -i RAC3 -s +ASM3
#removes an ASM dependency
srvctl modify instance -d RAC -i RAC3 -r
#stop all applications on node
srvctl stop nodeapps -n london1
#-a display the VIP configuration
srvctl config nodeapps -n london1 -a
srvctl add nodeapps -n london3 -o $ORACLE_HOME -A london3-vip/255.255.0.0/eth0 |
Services
Changes are recorded in OCR only! Must use DBMS_SERVICE to update the dictionary
srvctl start service -d RAC -s "SERVICE1,SERVICE2"
srvctl status service -d RAC -s "SERVICE1,SERVICE2"
srvctl stop service -d RAC -s "SERVICE1,SERVICE2" -f
srvctl disable service -d RAC -s "SERVICE2" -i RAC4
srvctl remove service -d RAC -s "SERVICE2"
#relocate from RAC2 to RAC4
srvctl relocate service -d RAC -s "SERVICE2" -i RAC2 -t RAC4
#preferred RAC1,RAC2 and available RAC3,RAC4
#-P PRECONNECT automatically creates an ERP and ERP_PRECONNECT service to use as BACKUP in tns_names
#See TNSnames configuration
#the service is NOT started, must be started manually (dbca does it automatically)
srvctl add service -d ERP -s SERVICE2 -i "RAC1,RAC2" -a "RAC3,RAC4" -P PRECONNECT
#show configuration, -a shows TAF conf
srvctl config service -d RAC -a
#modify an existing service
srvctl modify service -d RACDB -s "SERVICE1" -i "RAC1,RAC2" -a "RAC3,RAC4"
srvctl stop service -d RACDB -s "SERVICE1"
srvctl start service -d RACDB -s "SERVICE1"
Views
GV$SERVICES
GV$ACTIVE_SERVICES
GV$SERVICEMETRIC
GV$SERVICEMETRIC_HISTORY
GV$SERVICE_WAIT_CLASS
GV$SERVICE_EVENT
GV$SERVICE_STATS
GV$SERV_MOD_ACT_STATS |
SQL for RAC
select * from V$ACTIVE_INSTANCES;
Cache Fusion - GRD Global Resource Directory
GES (Global Enqueue Service)
GCS (Global Cache Service)
Data Guard & RAC
- Configuration files at primary location can be stored in any shared ASM diskgroup, on shared raw devices, or on any shared cluster file system. They simply have to be shared |
VIP virtual IP
- Both application/RAC VIPs fail over if the related application fails, and accept new connections
- Recommended: share the RAC VIP among database instances but not among different applications because...
- ...the VIP fails over if the application fails over
- A failed-over application VIP accepts new connections
- Each VIP requires an unused and resolvable IP address
- VIP address should be registered in DNS
- VIP address should be on the same subnet as the public network
- VIPs are used to prevent connection request timeouts during client connection attempts
Changing a VIP
1- Stop VIP dependent cluster components on one node
2- Make changes on DNS
3- Change VIP using SRVCTL
4- Restart VIP dependent components
5- Repeat above on remaining nodes |
oifcfg
Allocates and deallocates network interfaces, gets values from OCR
To display a list of networks:
oifcfg getif
eth1 192.168.1.0 global cluster_interconnect
eth0 192.168.0.0 global public
To display a list of current subnets:
oifcfg iflist
eth0 147.43.1.0
eth1 192.168.1.0
To include a description of the subnet, specify the -p option:
oifcfg iflist -p
eth0 147.43.1.0 UNKNOWN
eth1 192.168.1.0 PRIVATE
In 10.2 public interfaces are UNKNOWN.
To include the subnet mask, append the -n option to the -p option:
oifcfg iflist -p -n
eth0 147.43.1.0 UNKNOWN 255.255.255.0
eth1 192.168.1.0 PRIVATE 255.255.255.0 |
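To pull only the interconnect interface out of the oifcfg getif listing, the output can be filtered on its last column. A sketch assuming the four-column layout shown above; the function name interconnect_ifs is made up for this example.

```shell
#!/bin/bash
# interconnect_ifs: read `oifcfg getif` output on stdin and print
# "interface subnet" for every line marked cluster_interconnect.
interconnect_ifs() {
    awk '$NF == "cluster_interconnect" { print $1, $2 }'
}

# Typical use:
#   oifcfg getif | interconnect_ifs
```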
Db parameters with SAME VALUE across all instances
active_instance_count
archive_lag_target
compatible
cluster_database                RAC param
cluster_database_instances      RAC param
#Define network interfaces that will be used for interconnect
#it is not a failover but a redistribution. If an address does not work then all stop
#Overrides the OCR
cluster_interconnects           RAC param = 192.168.0.10; 192.168.0.11; ...
control_files
db_block_size
db_domain
db_files
db_name
db_recovery_file_dest
db_recovery_file_dest_size
db_unique_name
dml_locks (when 0)
instance_type (rdbms or asm)
max_commit_propagation_delay    RAC param
parallel_max_servers
remote_login_password_file
trace_enabled
#AUTO and MANUAL cannot be mixed in a RAC
undo_management
Db parameters with INSTANCE specific VALUE across all instances
instance_name
instance_number
thread
undo_tablespace #system param
Listener parameters
local_listener='(ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.13) (PORT = 1521)))'
#allow pmon to register with local listener when not using 1521 port
remote_listener = '(ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.9) (PORT = 1521)) (ADDRESS = (PROTOCOL = TCP)(HOST =192.168.2.10)(PORT = 1521)))'
#make the listener aware of the load of the listeners of other nodes
Important RAC parameters
gc_files_to_locks #a value other than the default disables Cache Fusion
recovery_parallelism #number of redo application server processes in instance or media recovery
RAC and Standby parameters
dg_broker_config_file1 #shared between primary and standby instances
dg_broker_config_file2 #different from dg_broker_config_file1, shared between primary and standby instances |
Shared contents
datafiles, controlfiles, spfiles, redo log
Shared or local?
RAW_Dev File_Syst ASM NFS OCFS
- Datafiles : shared mandatory
- Control files : shared mandatory
- Redo log : shared mandatory
- SPfile : shared mandatory
- OCR and vote : shared mandatory Y Y N
- Archived log : shared not mandatory. N Y N Y
- Undo : local
- Flash Recovery : shared Y Y Y
- Data Guard broker conf.: shared(prim. & stdby) Y Y
|
Adding logfile thread groups for a new instance
#To support a new instance on your RAC
1) alter database add logfile thread 3 group 7;
1) alter database add logfile thread 3 group 8;
#makes the thread available for use by any instance
2) alter database enable thread 3;
#if you want to change a thread already in use
2) alter system set thread=3 scope=spfile sid='RAC01';
3) srvctl stop instance -d RACDB -i RAC01 |
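When several instances are added, the same statements can be generated per thread. A small sketch that only prints the SQL from the steps above for review; the function name redo_thread_sql is made up for this example, and the statements carry no file specification, exactly as in the notes.

```shell
#!/bin/bash
# redo_thread_sql THREAD GROUP... : print the statements that add the given
# log groups to THREAD and then enable the thread.
redo_thread_sql() {
    local thread="$1" group
    shift
    for group in "$@"; do
        printf 'alter database add logfile thread %s group %s;\n' "$thread" "$group"
    done
    printf 'alter database enable thread %s;\n' "$thread"
}

# Typical use: generate, review, then run the result in SQL*Plus.
#   redo_thread_sql 3 7 8 > add_thread3.sql
```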
Views and queries select * from GV$CACHE_TRANSFER |
An instance failed to start, what do we do?
1) Check the instance alert.log
2) Check the Oracle Clusterware software alert.log
3) Check the resource state using CRS_STAT |
See official Note 239998.1 for removing crs installation
See http://startoracle.com/2007/09/30/so-you-want-to-play-with-oracle-11gs-rac-heres-how/ to install 11g RAC on VMware
See http://www.oracle.com/technology/pub/articles/hunter_rac10gr2_iscsi.html to install on Linux with iSCSI disks
See http://www.oracle-base.com/articles/10g/OracleDB10gR2RACInstallationOnCentos4UsingVMware.php to install on VMware
See OCFS Oracle Cluster Filesystem
Prerequisites check
#check node connectivity and Clusterware integrity
./runcluvfy.sh stage -pre dbinst -n all
./runcluvfy.sh stage -post hwos -n "linuxes,linuxes1" -verbose
WARNING:
Package cvuqdisk not installed.
rpm -Uvh clusterware/rpm/cvuqdisk-1.0.1-1.rpm
WARNING:
Unable to determine the sharedness of /dev/sdf on nodes:
linuxes1,linuxes1,linuxes1,linuxes1,linuxes1,linuxes1,linuxes,linuxes,linuxes,linuxes,linuxes,linuxes
Safely ignore this error
./runcluvfy.sh comp peer -n "linuxes,linuxes1" -verbose
./runcluvfy.sh comp nodecon -n "linuxes,linuxes1" -verbose
./runcluvfy.sh comp sys -n "linuxes,linuxes1" -p crs -verbose
./runcluvfy.sh comp admprv -n "linuxes,linuxes1" -verbose -o user_equiv
./runcluvfy.sh stage -pre crsinst -n "linuxes,linuxes1" -r 10gR2
|
Restart installation - Remove from each node
su -c "$ORA_CRS_HOME/install/rootdelete.sh; $ORA_CRS_HOME/install/rootdeinstall.sh"
#oracle user
export DISPLAY=192.168.0.1:0.0
/app/crs/oui/bin/runInstaller -removeHome -noClusterEnabled ORACLE_HOME=/app/crs LOCAL_NODE=linuxes
rm -rf $ORA_CRS_HOME/*
#root
su -c "chown oracle:dba /dev/raw/*; chmod 660 /dev/raw/*; rm -rf /var/tmp/.oracle; rm -rf /tmp/.oracle" |
#Format rawdevices using
dd if=/dev/zero of=/dev/raw/raw6 bs=1M count=250
#If related error message appears during installation, manually launch on related node
/app/crs/oui/bin/runInstaller -attachHome -noClusterEnabled ORACLE_HOME=/app/crs ORACLE_HOME_NAME=OraCrsHome CLUSTER_NODES=linuxes,linuxes1 CRS=true "INVENTORY_LOCATION=/app/oracle/oraInventory" LOCAL_NODE=linuxes
runcluvfy.sh stage -pre crsinst -n linuxes -verbose
/etc/hosts example
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost
147.43.1.101 london1
147.43.1.102 london2
#VIP is usable only after VIPCA utility run,
#should be created on the public interface. Remember that VIPCA is a GUI tool
147.43.1.201 london1-vip
147.43.1.202 london2-vip
192.168.1.1 london1-priv
192.168.1.2 london2-priv |
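Before installing, it can help to confirm that every node has its public, -vip and -priv names in the hosts file. A sketch that assumes the naming convention of the example above (node, node-vip, node-priv); the function name check_hosts is made up for this example.

```shell
#!/bin/bash
# check_hosts FILE NODE... : for each node, verify that FILE contains entries
# for the public name, NODE-vip and NODE-priv. Prints missing names and
# returns non-zero if any are missing.
check_hosts() {
    local file="$1" node name missing=0
    shift
    for node in "$@"; do
        for name in "$node" "$node-vip" "$node-priv"; do
            # Strip comments, then look for the name as a whole field after the IP.
            if ! sed 's/#.*//' "$file" |
                 awk -v n="$name" '{ for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                                   END { exit !found }'; then
                printf 'missing: %s\n' "$name"
                missing=1
            fi
        done
    done
    return "$missing"
}

# Typical use:
#   check_hosts /etc/hosts london1 london2
```

Names are compared as whole fields, so london1 does not accidentally match london1-vip, and commented-out entries are ignored.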
Kernel Parameters (/etc/sysctl.conf)   Recommended Values
kernel.sem (semmsl)                    250
kernel.sem (semmns)                    32000
kernel.sem (semopm)                    100
kernel.sem (semmni)                    128
kernel.shmall                          2097152
kernel.shmmax                          Half the size of physical memory
kernel.shmmni                          4096
fs.file-max                            65536
net.core.rmem_default                  262144
net.core.rmem_max                      262144
net.core.wmem_default                  262144
net.core.wmem_max                      262144
net.ipv4.ip_local_port_range           1024 to 65000 |
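The kernel.shmmax value ("half the size of physical memory") has to be computed per machine. A minimal sketch for Linux, assuming memory is read from the MemTotal line of /proc/meminfo, which is reported in kB; the function name half_mem_bytes is made up for this example.

```shell
#!/bin/bash
# half_mem_bytes KB: print half of KB kilobytes, expressed in bytes.
half_mem_bytes() {
    echo $(( $1 * 1024 / 2 ))
}

# On Linux, MemTotal in /proc/meminfo is reported in kB:
#   half_mem_bytes "$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)"
```

For a node with 4 GB of RAM (4194304 kB) this gives 2147483648, which is the value to place after kernel.shmmax in /etc/sysctl.conf.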
RAC restrictions
- dbms_alert, both publisher and subscriber must be on same instance, AQ is the workaround
- dbms_pipe, only works on the same instance, AQ is the workaround
- UTL_FILE, directories, external tables and BFILEs need to be on shared storage |
Implementing the HA High Availability Framework
Use srvctl to start/stop applications
#Manually create a script that OCR will use to start/stop/status
1. Create an application VIP.
#This command generates an application profile called hafdemovip.cap in the $ORA_CRS_HOME/crs/public directory.
$ORA_CRS_HOME/bin/crs_profile -create hafdemovip -t application -a $ORA_CRS_HOME/bin/usrvip -o oi=eth0,ov=147.43.1.200,on=255.255.0.0
#As the oracle user, register the VIP with Oracle Clusterware:
$ORA_CRS_HOME/bin/crs_register hafdemovip
#As the root user, set the owner of the application VIP to root:
$ORA_CRS_HOME/bin/crs_setperm hafdemovip -o root
#As the root user, grant the oracle user permission to run the script:
$ORA_CRS_HOME/bin/crs_setperm hafdemovip -u user:oracle:r-x
#As the oracle user, start the application VIP:
$ORA_CRS_HOME/bin/crs_start hafdemovip
2. Create an application profile.
$ORA_CRS_HOME/bin/crs_profile -create hafdemo -t application -d "HAF Demo" -r hafdemovip -a /tmp/HAFDemoAction -o ci=5,ra=60
3. Register the application profile with Oracle Clusterware.
$ORA_CRS_HOME/bin/crs_register hafdemo
$ORA_CRS_HOME/bin/crs_start hafdemo |
CRS commands crs_profile crs_register crs_unregister crs_getperm crs_setperm crs_start crs_stop crs_stat crs_relocate |
Server side callouts Oracle instance up(/down?) Service member down(/up?) Shadow application service up(/down?) |
Adding a new node
- Configure hardware and OS
- With NETCA reconfigure listeners and add the new one
- $ORA_CRS_HOME/oui/bin/addnode.sh from one of existing nodes to define the new one to all existing nodes
- $ASM_HOME/oui/bin/addnode.sh from one of existing nodes (if using ASM)
- $ORACLE_HOME/oui/bin/addnode.sh from one of existing nodes
- racgons -add_config to add ONS metadata to OCR from one of existing nodes
Removing a node from a cluster
- Remove node from clusterware
- Check that ONS configuration has been updated on other nodes
- Check that database and instances are terminated on node to remove
- Check that node has been removed from database and ASM repository
- Check that software has been removed from database and ASM homes on node to remove |
RAC contentions
- enq: HW - contention and gc current grant wait events
  Use larger uniform extent size for objects
- enq: TX - index contention
  Re-create the index as a global hash partitioned index.
  Increase the sequence cache size if retaining the sequence.
  Re-create the table using a natural key instead of a surrogate key. |