Top 20 General Faqs: Oracle Fail Safe Frequently Asked Questions
Note:60787.1 26-NOV-2002
Oracle Fail Safe is software that makes it easy to deploy highly available
solutions on Microsoft Windows NT clusters. Oracle Fail Safe 2.1.3 (and earlier
releases) help you to configure highly available single-instance (Oracle7 and
Oracle8) databases. The next release, Oracle Fail Safe 8i, due out shortly, will
add functionality to support highly available Oracle8i databases, Oracle Forms,
and Oracle Reports.
Oracle Fail Safe has two main components: Oracle Fail Safe Manager and
Oracle Fail Safe server. Oracle Fail Safe Manager automates the configuration
and management of Oracle resources on Windows NT clusters from within
Oracle Enterprise Manager through wizards, drag and drop, property sheets,
and dialog boxes. Oracle Fail Safe server works with Microsoft Cluster Server
(MSCS) to ensure fast automatic failover for both planned and unplanned
outages.
When combined with appropriately redundant cluster hardware, Oracle Fail
Safe's features and integrated troubleshooting tools support rapid deployment
of a wide variety of highly available Oracle solutions. No changes are required
to existing applications to access Oracle Fail Safe databases.
Oracle Fail Safe runs under Windows NT Version 4.0 Enterprise Edition on
Intel. (There is another Oracle Fail Safe kit available for Digital Alpha, too.)
Windows NT Enterprise Edition is required because it includes Microsoft Cluster
Server (MSCS) Version 1.0, which is the cluster software that Oracle Fail Safe
uses. Currently, MSCS (and thus, Oracle Fail Safe) supports only two nodes in a
cluster system.
Cluster Nodes:
On both cluster nodes, the following software must be installed on the local
(private) disks:
Client nodes where you will be running Oracle Fail Safe Manager need to be
running Windows 95, Windows 98, or Windows NT Version 4.0 with Service
Pack 3.
Oracle Fail Safe supports Oracle databases in multiple homes under these
conditions:
- Fail Safe can be installed in any Oracle home but only once on each
  node.
- If a database is to be made Fail Safe (added to a Fail Safe group), the
  database must exist on both nodes.
- Any Oracle home that is to be used with Fail Safe must be symmetrical
  on the cluster (i.e. same home name and Oracle version).
- All resources in a Fail Safe group must reside in the same Oracle home.
  You cannot add a database from one home and another database from
  another home into the same Fail Safe group.
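The home-related conditions above can be expressed as a simple check. The
following is only an illustrative sketch, not part of Oracle Fail Safe: the
OracleHome class, the function names, and the sample home names and versions
are all hypothetical, and the sketch covers only the "same home" and
"symmetrical home" rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OracleHome:
    """Hypothetical description of an Oracle home: its name and Oracle version."""
    name: str
    version: str


def home_is_symmetrical(home, homes_by_node):
    """True if every cluster node has a home with the same name and version."""
    return all(home in homes for homes in homes_by_node.values())


def group_is_valid(resource_homes, homes_by_node):
    """Check the home rules for one Fail Safe group.

    resource_homes maps each resource name in the group to its OracleHome.
    homes_by_node maps each node name to the set of OracleHome objects on it.
    """
    homes = set(resource_homes.values())
    if len(homes) != 1:
        # Resources from different homes (or no resources at all).
        return False
    (home,) = homes
    return home_is_symmetrical(home, homes_by_node)


# Hypothetical two-node cluster: HOME2 exists on node1 only.
home1 = OracleHome("HOME1", "8.0.5")
home2 = OracleHome("HOME2", "7.3.4")
cluster = {"node1": {home1, home2}, "node2": {home1}}

print(group_is_valid({"db1": home1}, cluster))                # symmetrical home
print(group_is_valid({"db1": home1, "db2": home2}, cluster))  # mixed homes
print(group_is_valid({"db2": home2}, cluster))                # home missing on node2
```

Only the first group passes: the second mixes two homes, and the third uses a
home that is not present on both nodes.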
Oracle Fail Safe itself is fully Y2K compatible. The only uses of dates are in
various log files, and the year information is displayed using the full 4 digits in
these files. No internal operations rely on any form of date manipulation.
Oracle Fail Safe layers over and works with multiple Oracle and non-Oracle
products. All Oracle components that work with Oracle Fail Safe (e.g. Oracle7,
Oracle8, Oracle Enterprise Manager, the Oracle Developer servers) are Y2K
compliant. However, some third party components may not be, so you must
ensure that all third party components are also Y2K compliant (e.g. hardware,
networking, operating system). For example, if Oracle Fail Safe Manager is
deployed on an older Windows 95 system, update the system software and/or
hardware as described on the Microsoft web page to ensure Y2K compliance.
To ensure uninterrupted service, all components used in an Oracle Fail Safe
high availability solution (hardware and software) must be Y2K compliant.
Oracle Fail Safe has been designed to be easy to install and manage. A
component called Oracle Fail Safe Manager acts as a console, allowing
standard Windows operations such as drag-and-drop capabilities to be used in
management.
There are a number of terms specific to Oracle Fail Safe and cluster systems:
Cluster Alias:
An IP address that is used (by Microsoft Cluster Administrator and Oracle Fail
Safe Manager) to refer to the multiple nodes in the cluster. Each node still has
its own IP address. Clients do not connect to the cluster alias.
Fail Safe Group:
The unit of failover. Minimally, a Fail Safe group consists of a virtual server,
network name (chosen when Oracle Fail Safe is installed) and an IP address,
which is used by clients to access the service provided by the Fail Safe group.
A Fail Safe group runs on only one cluster node at a time. A Fail Safe group
and a virtual server are synonymous.
Failover:
The process by which a Fail Safe group (virtual server) is made available on the
surviving node.
Failover Time:
The total time taken between the database becoming unavailable on the failed
node and available on the surviving node.
Failback:
Virtual Server:
2. Oracle Fail Safe requests MSCS to move the Fail Safe group resources
to the other cluster node.
4. Client applications can then connect to the Oracle Fail Safe database on
the second node, using the same virtual server address.
2. A predetermined failover policy for the Fail Safe group is followed. Often,
one or more attempts will be made to restart the database rather than fail
it over, in case the problem was transitory.
Client applications then can connect to the Oracle Fail Safe database on the
second node, using the same virtual server address.
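The "restart first, then fail over" behavior described above can be sketched
as follows. This is a simplified illustration only: the function name, the
max_restarts parameter, and the callable used to attempt a restart are
hypothetical, and the real policy is whatever has been configured for the
Fail Safe group in the cluster.

```python
def handle_failure(try_restart, max_restarts):
    """Apply a simple failover policy to a failed database.

    try_restart is a callable that attempts a restart on the current node
    and returns True on success. After max_restarts unsuccessful attempts,
    the group is failed over to the other node.
    """
    for attempt in range(1, max_restarts + 1):
        if try_restart():
            # The problem was transitory; no failover is needed.
            return ("restarted", attempt)
    return ("failed_over", max_restarts)


# Hypothetical scenario: the first restart fails, the second succeeds.
outcomes = iter([False, True])
print(handle_failure(lambda: next(outcomes), max_restarts=3))

# Hypothetical scenario: every restart attempt fails.
print(handle_failure(lambda: False, max_restarts=3))
```

In the first scenario the database stays on its original node. In the second,
the group moves to the other node and clients reconnect using the same
virtual server address.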
No. Fail Safe can also work with Oracle applications and, with limited
support, third-party applications. We have worked with vendors to make sure
that third-party applications can work with a Fail Safe database. However, we
don't support configuring the third-party applications.
The next release will provide support for Oracle Forms and Oracle Reports.
Yes. Oracle Fail Safe minimizes the need to have a standby machine that is idle
until a failover occurs. However, care is needed in capacity planning. If both
cluster nodes perform useful work (this is called an active/active configuration),
loading may be such that the surviving node may be unable to cope effectively
with the workload. A compromise would be to plan for both nodes to be, say,
75% utilized under normal operation, so after failover, performance of the
surviving node would be slower but hopefully acceptable.
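The arithmetic behind the 75% example can be made explicit. The sketch below
assumes two identical nodes and assumes that all work from the failed node
moves to the survivor. Both function names are hypothetical, and real
workloads rarely scale this linearly, so treat the result as a rough planning
figure only.

```python
def demand_after_failover(util_a, util_b):
    """Demand on the surviving node, as a fraction of one node's capacity.

    Assumes identical nodes and that all work moves to the survivor.
    """
    return util_a + util_b


def fraction_of_demand_served(util_a, util_b):
    """Fraction of the combined demand that one node can serve at full load."""
    demand = demand_after_failover(util_a, util_b)
    if demand <= 1.0:
        return 1.0  # The survivor has enough capacity for everything.
    return 1.0 / demand


print(demand_after_failover(0.75, 0.75))      # 1.5
print(fraction_of_demand_served(0.75, 0.75))  # about 0.67
print(fraction_of_demand_served(0.5, 0.5))    # 1.0
```

With both nodes at 75%, the survivor is asked to do 150% of its capacity, so
it can keep up with only about two-thirds of the combined demand. With both
nodes at 50%, the survivor can absorb the full workload.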
2. Create a Fail Safe Group using Oracle Fail Safe Manager, supplying a
virtual server network name and IP address.
3. Drag the standalone database to the Fail Safe Group to invoke the Add
Resource to Group Wizard. The wizard will then determine the disks
used by the database, add them to the Fail Safe group, configure
SQL*Net V2 or Net8 files to work with the virtual server, and test that the
Fail Safe database works correctly on each node.
In tests using Oracle databases with SAP R/3, both planned and unplanned
failovers took from about 30 seconds on systems with a few users to about
a minute on systems with 300 users. For database failover, the Oracle7 and
Oracle8 releases allow users to connect before recovery is complete.
Can Oracle Fail Safe support applications that use OCI or ODBC?
Beginning with Oracle8 release 8.0.5, ODBC and OCI applications can
reconnect automatically and restart a query, effectively from the point of failure.
If a user was running an update, the transaction will be rolled back automatically
and a message will be displayed to the user, who can then reissue the update.
This capability is provided by the Oracle8 OCI transparent failover feature. The
Oracle8 ODBC driver is layered over the Oracle8 OCI interface and Net8. Thus,
ODBC clients can take advantage of the Oracle8 features automatically while
OCI clients must do some minimal application coding to achieve the same thing.
This DLL is the heart of Oracle Fail Safe. Its main tasks include the following:
When you configure a Fail Safe group, the most important information that you
supply is the group name and the associated virtual server address. The host
name, database instance, SID entry, and protocol information must match on
both cluster nodes and on each client system that is running Oracle Fail Safe
Manager.
When you add a resource to a Fail Safe group, a TNS listener is created for the
group and the LISTENER.ORA and TNSNAMES.ORA files are modified
automatically by Oracle Fail Safe on the cluster nodes. If you have Oracle Fail
Safe Manager running on client systems, you must manually modify the
TNSNAMES.ORA file on each client system to match the server's host-name
information.
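As an illustration of the client-side change, the sketch below prints the
kind of TNSNAMES.ORA entry that points a client at a virtual server. It
assumes TCP/IP and the common default listener port 1521. The alias, virtual
server name, and SID are hypothetical, and the entries that Oracle Fail Safe
writes on the cluster nodes may differ in detail, so always copy the host,
SID, and protocol values from the server-side files.

```python
def tns_entry(alias, virtual_host, sid, port=1521):
    """Render a basic TNSNAMES.ORA entry that targets a virtual server.

    All arguments are illustrative; take the real values from the
    TNSNAMES.ORA file on the cluster nodes.
    """
    return (
        f"{alias} =\n"
        f"  (DESCRIPTION =\n"
        f"    (ADDRESS = (PROTOCOL = TCP)(HOST = {virtual_host})(PORT = {port}))\n"
        f"    (CONNECT_DATA = (SID = {sid}))\n"
        f"  )\n"
    )


# Hypothetical names: the client addresses the virtual server, not a node.
print(tns_entry("FSDB.WORLD", "fs-vserver1", "FSDB"))
```

The important point is that HOST names the virtual server for the Fail Safe
group rather than either physical node, so the same entry continues to work
after a failover.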
How does Oracle Fail Safe differ from Microsoft Cluster Server?
There are some similarities between Oracle Fail Safe and Oracle Parallel
Server (OPS), but there are a
greater number of differences.
Both products support Oracle databases on more than one node. Oracle Fail
Safe is currently limited to two nodes, whereas OPS can, depending on the
vendor, support larger numbers of nodes. The key differences are in cost and
scalability: Oracle Fail Safe is a much lower-cost solution than OPS, but does
not offer the scalability of OPS for expanding the cluster and its workload as a
business' needs grow.
Other differences lie in the technology used. Oracle Fail Safe uses the MSCS
clustering software, which means the cluster nodes share no resources. OPS
uses shared disks and all nodes have concurrent access to the data on all
disks. Sophisticated lock and cache management technologies, used in
conjunction with partners' disk cluster technologies, allow OPS to offer a
highly available, highly scalable solution for NT clusters.
Fail Safe is suited for easily partitioned workloads and data for customers who:
This depends entirely on when MSCS will support more than two nodes.
Although n-node clusters are planned, a release date is not available.
Oracle Fail Safe Manager also works with Oracle Enterprise Manager; the
needed software is included on the Oracle Fail Safe CD. Oracle Enterprise
Manager is not a requirement for Oracle Fail Safe but you can use Enterprise
Manager for routine database administration tasks (such as database backup
and restore operations, or SGA analysis) on Fail Safe databases.
Integration of Oracle Enterprise Manager with Oracle Fail Safe Manager became
optional starting with the 2.1.3 release. Prior to 2.1.3, Oracle Enterprise
Manager was required in order to
discover any standalone databases running on the cluster nodes. This
discovery is performed directly by Oracle Fail Safe server starting with the 2.1.3
release.
Future releases of Oracle Fail Safe may include their own, standalone configuration and
management tool. However, Enterprise Manager can be used (optionally) to help you
with managing events and jobs in the cluster.