Donald K. Burleson

Oracle9i RAC Tips

Overview of Oracle Real Application Clusters

First we will cover the evolution of Oracle Real Application Clusters, from its beginning in Oracle7 as Oracle Parallel Server (OPS) to the current Oracle9i RAC design. In its first incarnation as OPS, the Oracle architecture relied on lock manager processes that coordinated locking between the instances in the parallel server configuration. Locks were escalated or released as required to allow the transfer of data blocks between the instances. This set of lock management processes is illustrated in Figure 4.1.

Figure 4.1 - Oracle7 OPS Processes

Only a single instance was allowed to modify a block at any given point in time. In order to modify a block, the instance desiring to do so had to have ownership of that data block.

To establish ownership of database blocks, the OPS architecture practiced what was known as block pinging. Each OPS instance created locks to manage consistency between multiple instances; these were called Parallel Cache Management (PCM) locks. PCM locks ensured that the instance reading a block got a consistent image of that block.

An instance had to acquire a PCM lock on a block before reading or modifying it, so that other instances could not change the block in the meantime. When an instance requested a block that was currently held by another instance in the OPS database, the holding instance had to write the block back to disk before the requesting instance could read or modify it.

The best way to explain the ping process is through an example. For this example we will look at a two-instance OPS database, and call the instances ops1 and ops2. Instance ops1 reads table emp, row 1, in block 1 for update. Once read, block 1 will be protected by PCM lock p1 in exclusive mode, and placed in the data block cache of instance ops1 while ops1 modifies it. Now ops2 will also see lock p1, but for ops2 the mode will be null, meaning ops2 cannot access the block.

Suppose instance ops2 decides it needs to modify row 10, also in block 1 of table emp. What will happen to the block? Instance ops2 must acquire lock p1 in exclusive mode, but it is unable to do so because ops1 already holds it. Instance ops1 will have to write block 1 back to disk, assuring that ops2 will see the most current version of block 1, and that the ops1 changes are not lost. Instance ops1 will now downgrade lock p1 to null mode, instance ops2 will read the block into its block cache, and put PCM lock p1 into exclusive mode. This entire process is known as a true ping, or just ping. Obviously, frequent pinging, with its required read-write-read I/O sequence to disk, is a real performance killer and one reason why Oracle7 OPS never became very popular. Even more insidious is the concept of a false ping.
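The ops1/ops2 example can be traced with a small simulation. This is only a sketch of the read-write-read sequence described above, not Oracle's actual lock manager: the class and method names (Instance, PcmLock, OpsDatabase, modify) are invented for illustration, and each lock is assumed to protect exactly one block.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Instance:
    name: str
    cache: dict = field(default_factory=dict)  # block id -> {row: value}


@dataclass
class PcmLock:
    holder: Optional[Instance] = None  # None: no instance holds the lock exclusively


class OpsDatabase:
    """Toy model of a shared disk plus one PCM lock per block (an assumption)."""

    def __init__(self):
        self.disk = {}    # block id -> {row: value} as stored on shared disk
        self.locks = {}   # block id -> PcmLock
        self.disk_reads = 0
        self.disk_writes = 0
        self.pings = 0

    def modify(self, inst, block_id, row, value):
        lock = self.locks.setdefault(block_id, PcmLock())
        if lock.holder is not inst:
            if lock.holder is not None:
                # True ping: the holder writes the block to disk and gives up the lock.
                self.disk[block_id] = lock.holder.cache.pop(block_id)
                self.disk_writes += 1
                self.pings += 1
            # The requester reads the current image from disk and takes the lock.
            inst.cache[block_id] = dict(self.disk.get(block_id, {}))
            self.disk_reads += 1
            lock.holder = inst
        inst.cache[block_id][row] = value


db = OpsDatabase()
ops1, ops2 = Instance("ops1"), Instance("ops2")
db.modify(ops1, 1, "row1", "updated by ops1")   # ops1 reads block 1 and locks it
db.modify(ops2, 1, "row10", "updated by ops2")  # forces ops1 to write, then ops2 reads
print(db.pings, db.disk_writes, db.disk_reads)  # prints: 1 1 2
```

The second call is the ping: one disk write by ops1 followed by one disk read by ops2, on top of the original read by ops1.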

A false ping happens when a PCM lock protects more than one block. Instance ops1 may read several blocks rather than just one, and they will all be protected by lock p1. If instance ops2 needs one of the protected blocks, even though it may be different from the specific block required by ops1, then ops1 still has to write block 1 back to disk in order for instance ops2 to acquire PCM lock p1 in the necessary exclusive mode.

Fine grain locking, introduced in later versions of OPS, eliminated false pinging. Fine grain locking places a lock on each individual block. Since any given lock affects only one block, false pings become impossible. In this environment, only true pinging (where the actual ownership of the block is changed through the act of pinging) can take place. Let's look at true pings diagrammatically in Figure 4.2.
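To see why one lock per block rules out false pings, consider the following sketch. It assumes a simple hypothetical mapping in which consecutive blocks share a lock (lock id = block number // blocks_per_lock); real PCM lock assignment was configurable and more involved. The function name count_pings and the access list are made up for the example.

```python
def count_pings(accesses, blocks_per_lock):
    """Count true and false pings for a sequence of (instance, block) requests.

    A ping occurs when the lock covering the block is held by another instance.
    It is counted as true if the holder actually has the requested block in its
    cache, and as false if the holder only has other blocks under the same lock.
    """
    holder = {}  # lock id -> name of the instance holding the lock
    cached = {}  # lock id -> set of blocks the current holder has touched
    true_pings = false_pings = 0
    for inst, block in accesses:
        lock_id = block // blocks_per_lock  # hypothetical block-to-lock mapping
        owner = holder.get(lock_id)
        if owner is not None and owner != inst:
            if block in cached[lock_id]:
                true_pings += 1
            else:
                false_pings += 1
            cached[lock_id] = set()  # the old holder gives up everything under the lock
        holder[lock_id] = inst
        cached.setdefault(lock_id, set()).add(block)
    return true_pings, false_pings


accesses = [("ops1", 0), ("ops2", 1), ("ops1", 0), ("ops2", 0)]
print(count_pings(accesses, blocks_per_lock=4))  # prints: (1, 2)
print(count_pings(accesses, blocks_per_lock=1))  # prints: (1, 0)
```

With four blocks per lock, two of the three ownership changes are false pings. With one block per lock, only the genuine conflict on block 0 remains.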

Figure 4.2 - Block Pinging Example

As you can imagine, all of this locking, unlocking, block reading, writing, and re-reading could be quite costly for performance. To ensure good performance, the data in an OPS database had to be partitioned. Assigning data to the server that handles the part of the application that uses it most reduces both true and false pinging. This is because blocks are transferred between servers far less often than when all servers share all parts, and thus all data blocks, of an application.

Don't get this confused with the concept of table and index partitioning, which partitions an individual table or index based on key data values, and spreads it across multiple disk assets. Among the benefits of table and index partitioning is improved access speed. Data partitioning is quite different, because entire sets of tables and indexes applying to a single part of an application are moved to a separate tablespace.

To practice data partitioning, the database designer segregated the data for each database functional area into separate tablespaces. Once the functional areas of the database were segregated, users were allowed to log in only to the specific OPS instance that performed that particular function. By segregating functional data and functional areas, the pinging caused by other instances needing a given data block could be reduced. The concept of data and functional partitioning is illustrated in Figure 4.3.

Figure 4.3: Example of Data and Functional Partitioning

The initial perceptions of OPS, generated by advertising and, in no small part, by Oracle sales personnel, were that it was the answer to everything that plagued the Oracle database. OPS would allow multiple nodes to use the same database, provide scalable architecture that allowed the simple addition of a server to expand capability, and would solve performance issues by allowing more CPUs to chew away at problems. Unfortunately, the reality of required application changes, data partitioning, and poor performance soon destroyed those early perceptions. They were replaced by the perception that OPS was generally slow, and hard to manage, configure, and maintain. Hopefully, we will be able to dispel many of these negative perceptions of early OPS with a discussion of the new RAC architecture and benefits.

Commercial off-the-shelf (COTS) Oracle applications could not be placed easily into an OPS database. If the database was not designed from the start to be an OPS database, any attempt to shoehorn it into an OPS configuration generally met with disaster. Clearly, the limitations in OPS that required data and functional partitioning had to be eliminated. The block ping had to be erased from the OPS architecture. Unfortunately, until high bandwidth, high speed interconnects between machines became readily available, the elimination of the block ping was not feasible.

Speeds from 100 Mbit/s to 1 Gbit/s are available in modern network and computer interconnect architectures. These high speed interconnects allow memory-to-memory transfer of data at quantities and speeds that were unheard of until recently. This has made the elimination of the block ping possible by allowing inter-instance transfer of data blocks across the high speed interconnect, which completely eliminates the need to use a disk as the transfer medium. This architecture is diagrammed in Figure 4.4.

Figure 4.4: RAC Parallel Architecture

Oracle9i RAC has virtually eliminated both the true and false ping, since data blocks are now transferred across the high speed interconnect and not via the read-write-read ping methodology. Oracle9i RAC also utilizes fine grain locking. As already mentioned, fine grain locking assigns each block a lock within the RAC SGA data block areas. While fine grain locking adds to the memory overhead in the shared pool region (where the locks are tracked), it adds immensely to the performance of RAC by eliminating false pings.
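A back-of-the-envelope comparison shows why moving blocks over the interconnect matters. The sketch below is a rough cost model, not a measurement: the function name and both latency defaults (10 ms per disk I/O, 1 ms per interconnect transfer) are illustrative assumptions, and it ignores lock messaging entirely.

```python
def ownership_transfer_cost_ms(transfers, via_interconnect,
                               disk_io_ms=10.0, interconnect_ms=1.0):
    """Estimate the total time spent moving blocks between instances.

    A ping costs one disk write by the holder plus one disk read by the
    requester. An interconnect transfer costs one memory-to-memory copy.
    The latency defaults are placeholders, not measured values.
    """
    if via_interconnect:
        return transfers * interconnect_ms
    return transfers * 2 * disk_io_ms


print(ownership_transfer_cost_ms(100, via_interconnect=False))  # prints: 2000.0
print(ownership_transfer_cost_ms(100, via_interconnect=True))   # prints: 100.0
```

Under these assumed latencies, 100 ownership changes cost 2000 ms through disk pings but 100 ms over the interconnect. The actual ratio depends on the hardware in use.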

Oracle9i RAC offers Real Application Clusters Guard and the Cluster File System, enhanced support for installation via the Oracle Universal Installer, and migration support in the Database Upgrade Assistant (formerly known as Database Migration Assistant).

Oracle9i RAC also supports dynamic local and remote listener parameters. This means that you can use the alter system set SQL statement to dynamically update the local_listener and remote_listener parameters. In addition, when you add or remove nodes from a Real Application Clusters database, Oracle dynamically updates these parameters. During reconfiguration, the process monitor (PMON) registers reconfiguration information with all the listeners in the cluster for cross-instance listener registration.
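As an illustration of the statement form, the helper below only builds the SQL text; it does not connect to a database. The function name is invented, and the listener alias LISTENER_RAC1 is a hypothetical net service name that would have to exist in your own network configuration.

```python
DYNAMIC_LISTENER_PARAMETERS = ("local_listener", "remote_listener")


def alter_listener_statement(parameter, value):
    """Return an ALTER SYSTEM SET statement for one of the listener parameters."""
    name = parameter.lower()
    if name not in DYNAMIC_LISTENER_PARAMETERS:
        raise ValueError(f"not a listener parameter: {parameter}")
    escaped = value.replace("'", "''")  # double any embedded single quotes
    return f"ALTER SYSTEM SET {name} = '{escaped}'"


# LISTENER_RAC1 is a made-up alias used only for this example.
print(alter_listener_statement("local_listener", "LISTENER_RAC1"))
# prints: ALTER SYSTEM SET local_listener = 'LISTENER_RAC1'
```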

  • Overall system availability is often helped by the reboot.

Linux clusters adopted this method in an earlier period of their growth, when the Linux SCSI reserve/release support was immature and not consistently implemented. Some of the problems with this method include:

  • Potential loss of data integrity due to the forced shutdown of the node.
  • Nodes can shoot each other and shut down the entire cluster.
  • Shot server cannot be used to diagnose problems.

In another example, Sistina GFS supports multiple, cascading I/O fencing methods including manual, network power control, and Fibre Channel switch zone control.

Oracle’s Instance Membership Recovery

We need to note that fault scenarios are more complicated than in a generic cluster system. For example, when an instance dies or crashes, the clusterware may not be aware of it. The quorum mechanism maintained by Oracle helps to manage this. This mechanism is implemented by Instance Membership Recovery (IMR). Each active instance in the cluster writes a bitmap to the control file. This is part of the checkpoint progress record. Thus, every instance has a membership vote. IMR perceives members as expired when they do not provide normal periodic heartbeats to the control file. All instances read the Control File Voting Results Record (CFVRR) to perform arbitration.
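The heartbeat side of this mechanism can be sketched as follows. This is a simplified model and not Oracle's IMR algorithm: the function names, the instance names, the timestamps, and the timeout value are all assumptions, and the bitmap voting and arbitration steps are left out.

```python
def expired_members(heartbeats, now, timeout):
    """Return instances whose last control file heartbeat is older than timeout.

    heartbeats maps an instance name to the time of its last recorded heartbeat.
    """
    return sorted(name for name, last in heartbeats.items() if now - last > timeout)


def surviving_members(heartbeats, now, timeout):
    """Return the instances that still count as cluster members."""
    expired = set(expired_members(heartbeats, now, timeout))
    return sorted(name for name in heartbeats if name not in expired)


# Hypothetical heartbeat times, in seconds.
heartbeats = {"rac1": 100.0, "rac2": 97.0, "rac3": 80.0}
print(expired_members(heartbeats, now=101.0, timeout=10.0))    # prints: ['rac3']
print(surviving_members(heartbeats, now=101.0, timeout=10.0))  # prints: ['rac1', 'rac2']
```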

Cache Coherency and Lock Management

One of the most critical features for a parallel database is its ability to control global concurrency of the data (pages or blocks) located in the individual node’s cache. As each of the nodes has its own local cache containing current data blocks, their status and access need to be controlled globally.

Another node might need to access the same blocks concurrently, so blocks are moved frequently across the nodes when needed. Also, there should be effective and accurate monitoring of the status of blocks in the cache. Lock acquisition, lock release, and lock conversions should be performed at rapid speeds. Low latency and high speed communication between the nodes is an essential requirement.

Since a data block can be present in the database buffers of more than one node, when an update occurs all other buffered copies become obsolete. Global cache control mechanisms invalidate the obsolete data blocks. Another important feature is the way in which the cache reconfiguration occurs when a node fails. To maintain the integrity of the data blocks, a failed instance’s resources need to be taken over or re-mastered by another node’s instance.
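The two behaviors just described, invalidating stale copies on update and re-mastering resources after a failure, can be modeled with a small directory structure. This is a conceptual sketch only: the class GlobalCacheDirectory, its methods, and the rule for choosing a new master (block number modulo the number of survivors) are assumptions made for the example, not how Oracle's global cache services choose masters.

```python
class GlobalCacheDirectory:
    """Toy directory tracking which node masters each block and who caches it."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.master = {}  # block id -> node that masters the block's resource
        self.copies = {}  # block id -> set of nodes holding a current copy

    def read(self, node, block):
        self.master.setdefault(block, node)
        self.copies.setdefault(block, set()).add(node)

    def update(self, node, block):
        """Record an update and return the nodes whose copies became obsolete."""
        self.master.setdefault(block, node)
        stale = self.copies.get(block, set()) - {node}
        self.copies[block] = {node}
        return sorted(stale)

    def fail(self, node):
        """Remove a failed node and re-master its blocks on the survivors."""
        self.nodes.remove(node)
        if not self.nodes:
            raise RuntimeError("no surviving nodes to take over resources")
        remastered = {}
        for block, owner in self.master.items():
            self.copies[block].discard(node)
            if owner == node:
                # Hypothetical placement rule: spread blocks by block number.
                remastered[block] = self.nodes[block % len(self.nodes)]
        self.master.update(remastered)
        return remastered


directory = GlobalCacheDirectory(["rac1", "rac2", "rac3"])
directory.read("rac1", 7)
directory.read("rac2", 7)
print(directory.update("rac3", 7))  # prints: ['rac1', 'rac2']
print(directory.fail("rac1"))       # prints: {7: 'rac3'}
```

After the update, only rac3 holds a current copy of block 7. When rac1 fails, the block it mastered is handed to a surviving node.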

