SUMMARY: High Availability

Rusty Rose (rros44@tsg.cbot.com)
Thu, 18 Dec 1997 16:44:51 -0600

Next message: Markus 'FvD' Weber: "SUMMARY: (2) Filesystem full (NOT /!)"
Previous message: Michel Pilon: "SUMMARY: firewall configuration hel"

Note: This is a long summary, but hopefully useful to others
considering this type of problem.

Recap of my question:

>Is it feasible to configure a second server with the same hostname/IP
>as the first as a backup for High Availability, and boot it up when
>needed. I am exploring this from an O.S. standpoint only at this
>point, ignoring database and application issues. Also, can I boot both
>servers from the same root disk (they are connected to a dual ported
>Sparc Storage Array).

Highlights of Answers and Recommendations:

Two things will differ on the second server: the HostID and the
MAC address of the interface(s). Both issues can be addressed in
various ways.

HostID will be different....

Use Forth at the OpenBoot PROM (ok) prompt to fudge HostID

Use a public domain utility to fudge HostID
HostID can be changed by change-sun-hostid at
http://ftp.doc.ic.ac.uk/sun/sunsite-sun-info/admin-tools/

Or, don't worry about it. Usually only license managers care
about the HostID. If that is the case, run multiple redundant
license servers.

MAC address will be different.....

The routers maintain an ARP table, which maps each IP address to
a MAC (hardware) address. These entries expire regularly, and the
router then re-ARPs for the MAC. Once the routers have done this
after failover, your clients will be able to reach the other
server you just brought up.
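
To make the timing concrete, here is a toy model of that ARP
behaviour. It is only a sketch: the 300 second lifetime, the IP
and the MAC addresses are made-up values for illustration, and
real routers use their own timers.

```python
class ArpCache:
    """Toy model of a router's ARP table with a fixed entry lifetime."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.entries = {}  # ip -> (mac, time the entry was learned)

    def resolve(self, ip, now, who_has):
        """Return the MAC for ip, re-ARPing only if the entry has expired."""
        cached = self.entries.get(ip)
        if cached is not None and now - cached[1] < self.ttl:
            return cached[0]      # entry still live; may be stale after failover
        mac = who_has(ip)         # stands in for an ARP request on the wire
        self.entries[ip] = (mac, now)
        return mac


# Hypothetical addresses, for illustration only.
SERVICE_IP = "192.0.2.10"
PRIMARY_MAC = "08:00:20:aa:aa:aa"
BACKUP_MAC = "08:00:20:bb:bb:bb"

owner = {"mac": PRIMARY_MAC}            # which box currently answers for the IP
router = ArpCache(ttl_seconds=300)      # assumed lifetime, not a real default


def lookup(ip):
    return owner["mac"]


before = router.resolve(SERVICE_IP, now=0, who_has=lookup)
owner["mac"] = BACKUP_MAC               # failover: backup takes over the IP
stale = router.resolve(SERVICE_IP, now=60, who_has=lookup)
fresh = router.resolve(SERVICE_IP, now=301, who_has=lookup)
```

In this model the router keeps sending to the primary's MAC at
t=60 and only finds the backup once the old entry has aged out,
which is the window during which clients cannot reach the service.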

Or use ifconfig to set the MAC address of the backup's interface
to match the primary's, and avoid the problem of different MAC
addresses altogether.

Use the same IP address on the backup server....

I thought I would have to have a cold backup server to do this,
but it turns out the server can be up and ready, with the
interface that carries the shared IP simply turned off until
needed (again using ifconfig).

I also learned that one hardware interface can be configured with
multiple IP addresses simultaneously. The device names of these
logical interfaces look something like le0:0, le0:1, etc. Once
again, this is done with ifconfig.
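
As a rough sketch of the ifconfig steps described above, the
helper below only builds the command lines and does not run them.
The function names, the interface le0:1 and the addresses are
hypothetical, and the syntax assumes a Solaris-style ifconfig;
check ifconfig(1M) on your release before relying on it.

```python
def takeover_commands(interface, ip, netmask, mac=None):
    """Build the ifconfig commands a standby might run to take over an address.

    Assumes Solaris-style ifconfig syntax; nothing is executed here.
    """
    commands = []
    if mac is not None:
        # The hardware address belongs to the physical device (le0), not to a
        # logical interface such as le0:1.
        physical = interface.split(":")[0]
        commands.append(f"ifconfig {physical} ether {mac}")
    commands.append(f"ifconfig {interface} {ip} netmask {netmask} up")
    return commands


def release_commands(interface):
    """Build the command that keeps the shared address dormant on the standby."""
    return [f"ifconfig {interface} down"]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for line in takeover_commands("le0:1", "192.0.2.10", "255.255.255.0"):
        print(line)
```

Passing mac= as well covers the option of presenting the primary's
MAC address instead of waiting for ARP entries to expire.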

Another possibility: Instead of using the same IP, use multicast.

Boot both servers from the same root disk.....NOT RECOMMENDED

No one recommended this. For one thing, the hardware configuration
of the servers would have to be IDENTICAL, or else files referring
to the hardware would be wrong (/etc/path_to_inst, /devices, and
others). Also, this would introduce a single point of failure
between the 2 servers. If the first server doesn't shut down
gracefully, it is very likely to leave behind a "dirty" filesystem,
which may or may not be reliable... and you sure don't want to find
out when you are in a failover situation!

Another suggestion was to use a third "proxy" server. Your clients
talk to this box, whose sole duty is to redirect the packets to
whichever "real" server is up and responding to requests.
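
The selection step such a proxy would need can be sketched as
follows. This is not taken from any respondent's setup: the
function names and the TCP-connect health check are assumptions,
and a real proxy would also have to forward the traffic itself.

```python
import socket


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_backend(backends, probe=tcp_probe):
    """Return the first (host, port) pair that answers the probe, else None.

    backends is an ordered list, so the primary should be listed first.
    """
    for host, port in backends:
        if probe(host, port):
            return (host, port)
    return None
```

The probe is a parameter so that the check can be swapped for
something closer to the application, such as a test query.
Note that the proxy box then becomes a single point of failure
in its own right.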

Another good comment (paraphrased here) is.... If you're willing
to invest in a redundant Enterprise 4000 server, go the extra mile
and invest in a 3rd party HA software package to make it useful
as your backup server.

Others commented that all of the above items could probably be
accomplished with custom scripts, but that the most usable solution
would be to invest in an HA software package, which will take
care of most of this for you, as well as provide a heartbeat,
monitor your application processes, etc. The HA software can
also provide automated switchover, whereas the solutions I
was looking at would all require manual intervention.
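
For anyone going the custom-script route, the heartbeat idea can
be sketched roughly like this. It is a minimal illustration and
not how any particular HA package works: the class name, the
interval and the missed-beat threshold are all assumptions.

```python
class HeartbeatMonitor:
    """Track heartbeats from a peer and decide when to fail over."""

    def __init__(self, interval, max_missed):
        self.interval = interval      # seconds between expected heartbeats
        self.max_missed = max_missed  # consecutive misses before failover
        self.last_seen = None         # time of the most recent heartbeat

    def beat(self, now):
        """Record that a heartbeat arrived at time now."""
        self.last_seen = now

    def missed(self, now):
        """Return how many whole intervals have passed since the last beat."""
        if self.last_seen is None:
            return 0                  # no beat seen yet, so nothing to count
        return int((now - self.last_seen) // self.interval)

    def should_fail_over(self, now):
        """Return True once enough heartbeats in a row have been missed."""
        return self.missed(now) >= self.max_missed


# Hypothetical settings: a beat every 5 seconds, fail over after 3 misses.
monitor = HeartbeatMonitor(interval=5, max_missed=3)
monitor.beat(now=0)
early = monitor.should_fail_over(now=10)   # two intervals missed
late = monitor.should_fail_over(now=15)    # three intervals missed
```

Requiring several missed beats rather than one is a guard against
failing over on a brief network glitch while the primary is still
alive, which matters when both boxes can claim the same IP.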

I haven't made a final decision on how we will do this, but have
received nothing short of a wealth of information from those who
responded. In fact it will take me some time to sift through all the
possibilities. Many thanks to all who responded to what really turns
out to be a complex issue with many possible solutions....

Thanks to:

brainard@ihs.com
azhang@ect.enron.com
robertr@nwmarkets.co.jp
Kevin.Sheehan@uniq.com.au
rogerio@bvl.pt
sfrost@mitretek.org
dank@veritas.com
vhoang@lucent.com
steve_turgeon@putnaminv.com
davmaj@flash.net
rsk@itw.com
ian.camm@sse.eu.sony.co.jp

Rusty Rose, Chicago Board of Trade
rros44@tsg.cbot.com
