Monday, March 15, 2010

RAC Services, Load Balancing, TAF and FAN

RAC services , Load Balancing and TAF configuration

RAC Services
Oracle 10g RAC provides feature called Services which can be used to configure different services based on application, departments etc like finance, marketing, HR etc. We can create services and provide resources as per requirement for particular application or department.
For ex: If we create a service called HR then we can define the nodes on which it should run, any load balancing feature and if one node or all nodes are failed where service is running initially then which node service will be failed over.
You can check service status using crs_stat –t which will show you all application and database resources.
ora.myracdb.LBS1.myracdb1.srv ONLINE OFFLINE
ora.myracdb.LBS1.myracdb3.srv ONLINE ONLINE on myrac1
ora.myracdb.LBS1.cs ONLINE ONLINE on myrac1
ora.myracdb.taf_precon.myracdb1.srv ONLINE ONLINE on myrac1
ora.myracdb.taf_precon.cs ONLINE OFFLINE
ora.myracdb.myracdb1.inst ONLINE OFFLINE
ora.myracdb.myracdb2.inst OFFLINE OFFLINE
ora.myracdb.myracdb3.inst ONLINE ONLINE on myrac1
ora.myracdb.myractaf.myracdb3.srv ONLINE ONLINE on myrac1
ora.myracdb.myractaf.cs ONLINE ONLINE on myrac1
ora.myracdb.myractaf_BKP.myracdb3.srv ONLINE ONLINE on myrac1
ora.myracdb.myractaf_BKP.cs ONLINE ONLINE on myrac1
ora.myracdb.db ONLINE ONLINE on myrac1
ora.myracdb.hr.myracdb1.srv ONLINE ONLINE on myrac1
ora.myracdb.hr.cs ONLINE OFFLINE
ora.myracdb.hr_precon.myracdb1.srv ONLINE ONLINE on myrac1
ora.myracdb.hr_precon.cs ONLINE OFFLINE
ora.myracdb.hr_basic.myracdb3.srv ONLINE ONLINE on myrac1
ora.myracdb.hr_basic.cs ONLINE ONLINE on myrac1
ora.myracdb1.ASM3.asm ONLINE ONLINE on myrac1
ora.myracdb1.gsd ONLINE ONLINE on myrac1
ora.myracdb1.ons ONLINE ONLINE on myrac1
ora.myracdb1.vip ONLINE ONLINE on myrac1
ora.myracdb2.ASM1.asm ONLINE OFFLINE
ora.myracdb2.LISTENER_myracdb2.lsnr ONLINE OFFLINE
ora.myracdb2.gsd ONLINE OFFLINE
ora.myracdb2.ons ONLINE OFFLINE
ora.myracdb2.vip ONLINE OFFLINE
ora.myracdb3.ASM2.asm ONLINE OFFLINE
ora.myracdb3.LISTENER_myracdb3.lsnr OFFLINE OFFLINE
ora.myracdb3.gsd ONLINE OFFLINE
ora.myracdb3.ons ONLINE OFFLINE
ora.myracdb3.vip ONLINE ONLINE on myrac1

As we can see from above output that we have a database called myracdb and services are myracdb1, myracdb2, myracdb3 running on myrac1, myrac2 and myrac3 nodes. So whenever we create a database, all 4 default services will be created along with services for Listener, ONS, VIP and ASM (if we are using ASM)for all nodes.
Now if we want to create a non default service called finance then we can use below command
srvctl add service -d MYRACDB -s finance -r myracdb1
-r specifies preferred instances.
Now using below command, we can see the service details.
srvctl config service -d MYRACDB -s finance -S 9
#@=info: operation={config} config={full} ver={10.2.0.0.0}
finance PREF: MYRACDB1 AVAIL:
#@=service[0]: name={finance} enabled={true} pref={MYRACDB1} avail={} disabled_insts={} tafpolicy={NONE} type={user}
#@=endconfig:
To remove service, we can use below command
srvctl remove service –d MYRACDB –s finance

When we specify multiple preferred instances, Server side load balancing is enabled so when users are connecting to the database using this service, it’ll be load balanced by listeners.
srvctl add service -d MYRACDB -s finance -r myracdb1,myracdb2
srvctl config service -d MYRACDB -s finance -S 9
#@=info: operation={config} config={full} ver={10.2.0.0.0}
finance PREF: MYRACDB1 MYRACDB2 AVAIL:
#@=service[0]: name={finance} enabled={true} pref={MYRACDB1, MYRACDB2} avail={} disabled_insts={} tafpolicy={NONE} type={user}
#@=endconfig:
We can also specify available instances in case both preferred instances are down.
srvctl add service -d MYRACDB -s finance -r myracdb1,myracdb2 -a myracdb3
-a specifies available instances (We can specify multiple available instances)
srvctl config service -d MYRACDB -s finance -S 9
#@=info: operation={config} config={full} ver={10.2.0.0.0}
finance PREF: MYRACDB1 MYRACDB2 AVAIL: MYRACDB3
#@=service[0]: name={finance} enabled={true} pref={MYRACDB1, MYRACDB2} avail={MYRACDB3} disabled_insts={} tafpolicy={NONE} type={user}
#@=endconfig:
Please be careful while specifying same instance in preferred and available. See below example.
srvctl modify service -d MYRACDB -s finance -n -i "myracdb1,myracdb2" -a "myracdb1,myracdb3" -f
-f specifies to disconnect current connected sessions.
srvctl config service -d MYRACDB -s finance -S 9
#@=info: operation={config} config={full} ver={10.2.0.0.0}
finance PREF: MYRACDB2 AVAIL: MYRACDB1 MYRACDB3
#@=service[0]: name={finance} enabled={true} pref={MYRACDB2} avail={MYRACDB1, MYRACDB3} disabled_insts={} tafpolicy={NONE} type={user}
#@=endconfig:
So available instance parameter has overwritten preferred parameter value because one instance can be either in preferred or available but it cannot be in both preferred and available list.
Please see another example as below:
srvctl modify service -d MYRACDB -s finance -n -i myracdb1,myracdb2 –f
PRKP-1048 : Failed to change configuration for service finance.
CRS-0211: Resource 'ora.MYRACDB.finance.MYRACDB1.srv' has already been registered.
So did unregister first
crs_unregister ora.MYRACDB.finance.MYRACDB1.srv
srvctl modify service -d MYRACDB -s finance -n -i myracdb1,myracdb2 -f
srvctl config service -d MYRACDB -s finance -S 9
#@=info: operation={config} config={full} ver={10.2.0.0.0}
finance PREF: MYRACDB1 MYRACDB2 AVAIL:
#@=service[0]: name={finance} enabled={true} pref={MYRACDB1, MYRACDB2} avail={} disabled_insts={} tafpolicy={NONE} type={user}
#@=endconfig:
So as we can see that it has been removed from available and moved to preferred however no instance mentioned in available because we skipped –a option of modify service.
So bottom line is that we cannot specify same instance name both in preferred and available.
If you want any other help on modifying services, you can use below help.
srvctl modify service -h
Usage: srvctl modify service -d -s -i -t [-f]
-d Unique name for the database
-s Service name
-i Old instance name
-t New instance name
-f Disconnect all sessions during stop or relocate service operations
Usage: srvctl modify service -d -s -i -r [-f]
-d Unique name for the database
-s Service name
-i Instance name
-r Upgrade instance to preferred
-f Disconnect all sessions during stop or relocate service operations
Usage: srvctl modify service -d -s -n -i [-a ] [-f]
-d Unique name for the database
-s Service name
-n Modify service configuration
-r "" List of preferred instances
-a "" List of available instances
-f Disconnect all sessions during stop or relocate service operations
Usage: srvctl modify service -d -s -P [-f]
-d Unique name for the database
-s Service name
-P TAF policy (NONE, BASIC, or PRECONNECT)
-f Disconnect all sessions during stop or relocate service operations
-h Print usage

For starting and stopping service, we can use below command
srvctl start service –d MYRACDB –s finance
srvctl stop service –d MYRACDB –s finance
By default, it’ll be in offline state after creating so we need to start the service after creating it and it’ll start on preferred instances and all preferred instances are not available then it’ll start on available instances.
Assume that myracdb2 is down and we specified below command.
srvctl modify service -d MYRACDB -s finance -n -i myracdb2 -a myracdb1,myracdb3 –f
srvctl config service -d MYRACDB -s finance -S 9
#@=info: operation={config} config={full} ver={10.2.0.0.0}
finance PREF: MYRACDB2 AVAIL: MYRACDB1 MYRACDB3
#@=service[0]: name={finance} enabled={true} pref={MYRACDB2} avail={MYRACDB1, MYRACDB3} disabled_insts={} tafpolicy={NONE} type={user}
#@=endconfig:
Srvctl start service –d MYRACDB –s finance
ora.MYRACDB.finance.MYRACDB2.srv ONLINE ONLINE on myrac1
ora.MYRACDB.finance.cs ONLINE ONLINE on myrac1
As we can see that it has started on MYRACDB1 instance on MYRAC1 server.

Load Balancing

When we specify multiple preferred instances (server side load balancing) and enable client side load balancing then listener level load balancing is automatically happening. We’ve put local_listener parameter and remote_listener parameters in spfile as shown below.
Both parameters value point to TNSNAMES.ora entry
local_listener=LISTENER_MYRAC1
remote_listener = LISTENERS_MYRACDB
TNSNAMES.ORA entries: Below entries should be there on all nodes tnsnames.ora file.
LISTENERS_MYRACDB =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = myrac1-vip)(PORT = 2042))
(ADDRESS = (PROTOCOL = TCP)(HOST = myrac3-vip)(PORT = 2042))
(ADDRESS = (PROTOCOL = TCP)(HOST = myrac2-vip)(PORT = 2042))
)
LISTENER_MYRAC3 =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = myrac3-vip)(PORT = 2042))
)
LISTENER_MYRAC2 =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = myrac2-vip)(PORT = 2042))
)
LISTENER_MYRAC1 =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = myrac1-vip)(PORT = 2042))
)
Client side load balancing
finance =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = myrac2-vip)(PORT = 2042))
(ADDRESS = (PROTOCOL = TCP)(HOST = myrac1-vip)(PORT = 2042))
(ADDRESS = (PROTOCOL = TCP)(HOST = myrac3-vip)(PORT = 2042))
(LOAD_BALANCE = yes)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = FINANCE)
(FAILOVER_MODE =
(TYPE = SELECT)
(METHOD = BASIC)
(RETRIES = 180)
(DELAY = 5)
)
)
)
Server Side load balancing
In a dedicated server configuration, a listener selects an instance in the following order:
1. Least loaded node
2. Least loaded instance

If a database service has multiple instances on multiple nodes, the listener chooses the least loaded instance on the least loaded node. If shared server is configured, then the least loaded dispatcher of the selected instance is chosen.
This is based on node load so distribution of sessions may not be equal in terms of no. To verify that we need to put event 10257 level 16 and restart the instance and check pmon trace file.
The pmon trace file output could be like below.
kmmgdnu: MYRACDB goodness 27
kmmgdnu: MYRACDB
goodness=27, delta=1,
flags=0x4:unblocked/not overloaded, update=0x6:G/D/-
kmmlrl: 65 processes
kmmlrl: node load 687
kmmlrl: instance load 35
kmmlrl: nsgr update returned 0
flags=0x4:unblocked/not overloaded, update=0x6:G/D/-
kmmgdnu: MYRACDBXDB
goodness=0, delta=1,
flags=0x5:unblocked/not overloaded, update=0x6:G/D/-
kmmgdnu: MYRACDB
goodness=27, delta=1,
flags=0x4:unblocked/not overloaded, update=0x6:G/D/-
kmmlrl: 66 processes
kmmlrl: node load 775
kmmlrl: nsgr update returned 0
kmmgdnu: MYRACDB
goodness=28, delta=1,
flags=0x4:unblocked/not overloaded, update=0x6:G/D/-
kmmlrl: 70 processes

kmmlrl: node load 847
kmmlrl: nsgr update returned 0
*** 2009-08-18 03:47:26.891
err=-319 tbtime=0 etime=319 srvs=1 nreqs=0 sreq
'Node load' gives an indication of how busy the machine is. By default, 'node load' is the primary factor for determining where new connections will go.
Load Balance Advisory
We can configure load balance advisory configure according to our requirement. There are two kinds of goal for a service that can be configured.

Service Time: Attempts to direct work requests to instances according to response time. Load balancing advisory data is based on elapsed time for work done in the service plus available bandwidth to the service.

Throughput: Attempts to direct work requests according to throughput. The load balancing advisory is based on the rate that work is completed in the service plus available bandwidth to the service.
In short, Load balance advisory does below things :
• Calculates goodness locally, forwards to master mmon
• Master mmon builds advisory for distribution of work
• Records advice to SYS$SERVICE_METRICS
• Posts load balancing advice via FAN event to AQ, PMON, ONS
Connection Load Balancing
We can also configure connection load balancing by specifying clb_goal for specific service.
Long: Use the LONG connection load balancing method for applications that have long-lived connections. This is typical for connection pools and SQL*Forms sessions. LONG is the default connection load balancing goal.

Short:Use the SHORT connection load balancing method for applications that have short-lived connections.

TAF configuration
TAF feature provides failover capability for oracle database instances in multi node RAC configuration. Depending on requirement, we need to configure TAF methods for a given service. There are two kinds of methods for server side TAF configuration.
TAF methods
1. Basic: It’s default method when we create a service. Establishes connections at failover time. This option requires almost no work on the backup database server until failover time
2. Pre-establishes connections. This provides faster failover but requires that the backup instance be able to support all connections from every supported instance. It creates the backup connection in alternate instance (available instance of a service).

Server side preconnect TAF is not supported using dbms_service.
SQL> exec dbms_service.modify_service(service_name=>'MYRACTAF_BKP',aq_ha_notifications => TRUE ,failover_method=>DBMS_SERVICE.FAILOVER_METHOD_PRECONNECT);
BEGIN dbms_service.modify_service(service_name=>'TSTRACTAF_BKP',aq_ha_notifications => TRUE ,failover_method=>DBMS_SERVICE.FAILOVER_METHOD_PRECONNECT); END;

*
ERROR at line 1:
ORA-06550: line 1, column 124:
PLS-00302: component 'FAILOVER_METHOD_PRECONNECT' must be declared
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored

However it’ll allow to specify preconnect if we are creating a service through srvctl add

service -d MYRACDB -s finance -r MYRACDB3 -a MYRACDB1 -P preconnect
srvctl config service -d MYRACDB -s finance -S 9
#@=info: operation={config} config={full} ver={10.2.0.0.0}
finance PREF: MYRACDB3 AVAIL: MYRACDB1
#@=service[0]: name={finance} enabled={true} pref={MYRACDB3} avail={MYRACDB1} disabled_insts={} tafpolicy={preconnect} type={user}
#@=endconfig:

So this seems to be a bug in 10.2.0.4 and oracle metalink notes also say that Server side preconnect is not supported so if we want to take advantage of preconnect method then we have to specify available instance different from preferred instance and enable client side TAF with preconnect method as below.

TAF_Precon =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = myrac2-vip)(PORT = 2042))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = TAF_PRECON)
(LOAD_BALANCE = NO)
(FAILOVER = yes)
(FAILOVER_MODE =
(TYPE = SELECT)
(METHOD = PRECONNECT)
(BACKUP = myrac3) -> This is tnsnames entry
(RETRIES = 180)
(DELAY = 5)
)
)
)
myrac3 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = myrac1-vip)(PORT = 2042))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = MYRACTAF_BKP)
(FAILOVER = yes)
(FAILOVER_MODE =
(TYPE = SELECT)
(RETRIES = 180)
(DELAY = 5)
)
)
)
srvctl config service -d MYRACDB -s TAF_PRECON -S 9
#@=info: operation={config} config={full} ver={10.2.0.0.0}
TAF_PRECON PREF: MYRACDB2 AVAIL: TSTRACDB1
#@=service[0]: name={TAF_PRECON} enabled={true} pref={MYRACDB2} avail={MYRACDB1} disabled_insts={} tafpolicy={basic} type={user}
#@=endconfig:

srvctl config service -d MYRACDB -s MYRACTAF_BKP -S 9
#@=info: operation={config} config={full} ver={10.2.0.0.0}
MYRACTAF_BKP PREF: MYRACDB1 AVAIL: MYRACDB2
#@=service[0]: name={MYRACTAF_BKP} enabled={true} pref={MYRACDB1} avail={MYRACDB2} disabled_insts={} tafpolicy={NONE} type={user}
#@=endconfig:

So after above configuration when we connect to database using TAF_Precon , it’ll create one connection on myrac2 and backup connection on myrac1.

sqlplus kmakwana@TAF_PRECON

SQL*Plus: Release 10.2.0.2.0 - Production on Mon Nov 23 22:51:40 2009

Copyright (c) 1982, 2005, Oracle. All Rights Reserved.

Enter password:

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL>
So we can connection both on MYRAC2 and MYRAC1 since we are using preconnect method.
oracle@myrac2:/home/oracle>
KMAKWANA oracle 497 7731 1532148 sqlplus@myapp1 (TNS myapp1 I TAF_PRECON 0:0:33

oracle@myrac1:/home/oracle>
KMAKWANA oracle 489 9376 913578 sqlplus@myapp1 (TNS myapp1 I MYRACTAF_BKP 0:0:25

TAF Types

TAF type can be select or session. It defines what should happen to session or query when failover happens.
a. SELECT failover. With SELECT failover, Oracle Net keeps track of all SELECT statements issued during the transaction, tracking how many rows have been fetched back to the client for each cursor associated with a SELECT statement. If the connection to the instance is lost, Oracle Net establishes a connection to another Oracle RAC node and re-executes the SELECT statements, repositioning the cursors so the client can continue fetching rows as if nothing has happened. The SELECT failover approach is best for data warehouse systems that perform complex and time-consuming transactions.
b. SESSION failover. When the connection to an instance is lost, SESSION failover results only in the establishment of a new connection to another Oracle RAC node; any work in progress is lost. SESSION failover is ideal for online transaction processing (OLTP) systems, where transactions are small.

NONE: No failover for this service.
SESSION: Only reconnect the session; do not resume select statements.
SELECT: Use small amount of additional client memory to store state information necessary to resume non-transactional select statements during a failover. It allows users with open cursors to continue fetching on them after failure. However, this mode involves overhead on the client side in normal select operations. DML Transactions will still break and require a rollback.
TAF can operate in one of two modes, Session Failover and Select Failover. Session Failover will recreate lost connections and sessions. Select Failover will replay queries that were in progress.
In OLTP systems, we can use session type with preconnect method so that backup connections are already created while starting primary connections so only clients which are running queries will get the error and client while are connecting immediately after failover, they can keep continuing data retrieval.
If we use select type with preconnect method (backup connections are already created using MYRACTAF_BKP service in our example) then actual failover takes 4-5 seconds and if we use select type with basic method then it takes around 12-13 seconds to resume query execution assuming service will failover to available instance. however DML is not supported till 10.2.0.4 by Oracle so any DML statements running at the time failover will be rolled back.
If we are using Select type with preconnect method , there’s some overhead involved at the client side mentioned in below table.

The below table discusses about TAF method and TAF type depending on environment (whether it’s OLTP or DWH).
Type of System TAF method TAF type Advantages Disadvantages
OLTP/ Data Ware housing Preconnect
or basic Session Since we’ve already created backup connections, clients can immediately connect after failover and keep selecting data. If not preconnect then clients will fail over to available instance of a service since no backup connections are created. It’s best for OLTP systems. Clients who are running the select queries will get the error. Any DML statements will be rolled back.
OLTP/ Data Ware housing Preconnect or basic Select Using already backup connections, client queries can resume in 4-5 seconds on the backup service connections. If not preconnect, client queries will fail over to available instance of a service and it may take some time as Oracle has to establish connections to available instance and then resume query processing. It took 40 seconds in our environment. It’s best for Data Warehousing environment. Whether or not to use extra client memory to track information including the sql plan hash, SCN and rows fetched for each active SELECT statement so that they can be resumed transparently during a failover. Client side memory requirement can be trivial if all clients are connecting from same machine or one client is connecting using multiple sessions.
(Note: Please remember that the timing specified for failover differs from environment to environment and mentioned timing in this document is tested in our environment not specified by Oracle so please test in your environment before going for production implementation)

FAN – Fast Application notification
This feature will allows application to publish events related to node, instance or any service up/down events and event are published using ONS (RAC component) and Advanced Queues. ONS will send notification to application and Load balance advisory framework to change goodness and delta value for particular instance. FAN cleans up any connections when the failure occurs. It keeps track of service and instance for each connection.
When down event is published by ONS in RAC environment :
a. Routes new requests to remaining instances
b. Throws exceptions if applications are in middle of transactions. As mentioned earlier, DML is not supported for failover so applications should have capability to deal with DML transaction and throw appropriate errors or exceptions to the users.
When up event is published by ONS in RAC environment :
a. Creates new connections to remaining instances
b. Distributes new requests evenly to all instances based on Node load and instance load
c. Updates Load Balance advisory
FAN configuration
a. Configure at application level
For ex: If application is using java then we can configure as below.
OracleDataSource ods = new OracleDataSource()
ods.setUser(username);
ods.setPassword(password);
ods.setConnectionCachingEnabled(True);
ods.setFastConnectionFailoverEnabled(True);
Ods.setONSConfiguration("nodes=myrac1:6300,host2:6300");
ods.setConnectionCacheName(MyCache);
ods.setConnectionCacheProperties(cp);
ods.setURL("jdbc:oracle:thin:@(DESCRIPTION=
(LOAD_BALANCE=on)
(ADDRESS=(PROTOCOL=TCP)(HOST=myracVIP1)(PORT=2042))
(ADDRESS=(PROTOCOL=TCP)(HOST=myracVIP2)(PORT=2042))
(CONNECT_DATA=(SERVICE_NAME=MYSERVICE)))");

or set system properties
-D oracle.jdbc.FastConnectionFailover=true

b. Change below file.
oracle@myrac1:/u01/app/oracle/product/10.2.0/db_1/opmn/conf>cat ons.config
localport=6150
remoteport=6300
loglevel=3
usesharedinstall=true

Please make sure that remoteport value is same on all RAC instances.

c. When starting the application, specify system property and make sure that ons.jar is in CLASSPATH.
-Doracle.ons.oraclehome=

FAN Server side call out
We can configure server side call out for FAN in RAC environment. Server side calls would database instance down/up event, ASM instance down/up event , Clusteware Fan events.
We can download sample code for Clusteware FAN server side callout from below link http://www.oracle.com/technology/sample_code/products/rac/index.html
We can modify the script as per requirement and store at below location only. In the script, we can configure alerts bases on events and send email alerts to DBA so that they can take appropriate actions.
oracle@myrac1:/$ORACLE_CRS_HOME/racg/usrco

1 comment:

  1. Playtech | DrMCD
    The world's leading in 양주 출장안마 gaming information technology has it 계룡 출장샵 was the pioneer of a powerful technology 포천 출장샵 to help players place bets online, at 강릉 출장샵 the 익산 출장마사지 casino ‎Games · ‎Betting & iGaming

    ReplyDelete