eCluster™ 3.6 - Implementation and Solution
by AiForce.net on March 14, 2002
Brief:
eCluster™ design background and technical information.
Outline:
The eCluster™
Design Decisions and the Implementation
eCluster™ is a fully distributed cluster with an unrestricted level of
globally distributed redundancy. It offers several advantages over a
device implementation. Device implementations come in two flavors: the DNS
version and the NAT-based version. The DNS device implementation is
constrained by its hardware. The NAT-capable device implementation is
constrained by its hardware as well as by the routing path requirement.
A device implementation is not as flexible as a software implementation,
which may run on any OS and any machine architecture. In comparison, the
software approach is an Open Implementation. Users may choose their own OS,
hardware configuration, and CPU architecture at low cost, and may even
assemble their own hardware. A device implementation is limited to its own
embedded OS, the limited power of the embedded CPU, and the amount of memory
shipped with the device.
The NAT-based device implementation also requires that the device be set up
as a router in the enterprise network routing path, so that all traffic goes
through the device. Only then are load balancing and fail-safe operation
possible. The NAT-capable device has to spend extra CPU time and memory to:
- Inspect each packet going in and out to the Internet.
- Search through its session tables to find a specific entry. This search
time is a linear function of the number of sessions in the network, so the
device takes longer to do the search as it handles more network sessions. In
a large network, there may be millions of sessions to and from various
internal network servers and user workstations. As the number of sessions
increases, the search time increases and the speed at which packets are
switched and routed decreases.
- Modify each packet going in and out of the device.
In the worst case, the switch/router will slow down and eventually stop
working. A broken switch/router is a disastrous event for any enterprise
network.
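The linear search cost described in the list above can be illustrated with a minimal sketch. The `Session` record, the `build_table` helper, and the port numbers are hypothetical and chosen only for illustration; they do not describe any particular device's session table.

```python
from typing import List, NamedTuple, Optional, Tuple


class Session(NamedTuple):
    # Hypothetical NAT session entry: an internal endpoint mapped to an external port.
    internal_ip: str
    internal_port: int
    external_port: int


def find_session_linear(table: List[Session],
                        external_port: int) -> Tuple[Optional[Session], int]:
    """Scan the table entry by entry; return the match and the number of comparisons."""
    comparisons = 0
    for entry in table:
        comparisons += 1
        if entry.external_port == external_port:
            return entry, comparisons
    return None, comparisons


def build_table(n: int) -> List[Session]:
    """Build n made-up sessions with consecutive external ports starting at 20000."""
    return [Session(f"10.0.{(i >> 8) & 255}.{i & 255}", 1024 + i, 20000 + i)
            for i in range(n)]


if __name__ == "__main__":
    # Looking up the last entry costs one comparison per session in the table.
    for n in (1_000, 10_000):
        _, cost = find_session_linear(build_table(n), 20000 + n - 1)
        print(f"{n} sessions -> {cost} comparisons")
```

With ten times as many sessions, the worst-case lookup in this sketch performs ten times as many comparisons, and that cost is paid on every packet that has to be matched against the table.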
The routing path requirement makes the device a bottleneck and a single
point of failure. The extra packet processing load slows down the vital
enterprise Internet link for the whole local network.
The optimal network design frees the switch/router from as much load as
possible and makes it as fast as possible.
eCluster™ General Description
eCluster™ is a highly scalable Smart Load Balance Cluster solution, which can
be scaled up to 1 million Internet Cluster Groups (virtual domains), each
containing up to 1024 Cluster Nodes (1 million x 1024). With the round trip
time load balance and fail safe algorithm, one can set up groups of clusters
for any OS, on any CPU, with an unlimited number of standbys. Load balance
and fail safe clustering ensures uninterrupted and fast eBusiness
transactions for Internet services and SQL database servers, such as Oracle,
MS SQL, and Informix.
eCluster™ Smart load balance algorithms include: CPU load, weighted load, VM
usage, round trip time, CPU usage, traffic allocation, number of user
sessions, etc.
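As a rough illustration of how several of the metrics listed above might be combined into a single node choice, here is a minimal sketch. The weights, the scaling constants, the `NodeStats` fields, and the reading of "VM usage" as virtual memory usage are all assumptions made for the example; the source does not specify how eCluster™ combines its metrics.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeStats:
    name: str
    cpu_usage: float  # fraction of CPU in use, 0.0 to 1.0
    vm_usage: float   # fraction of virtual memory in use, 0.0 to 1.0 (assumed meaning)
    rtt_ms: float     # measured round trip time in milliseconds
    sessions: int     # current number of user sessions


# Hypothetical weights and scales; a real deployment would tune these.
WEIGHTS = {"cpu_usage": 0.4, "vm_usage": 0.2, "rtt": 0.3, "sessions": 0.1}
RTT_SCALE_MS = 200.0
SESSION_SCALE = 1000.0


def load_score(node: NodeStats) -> float:
    """Weighted sum of normalized metrics; a lower score means a less loaded node."""
    return (WEIGHTS["cpu_usage"] * node.cpu_usage
            + WEIGHTS["vm_usage"] * node.vm_usage
            + WEIGHTS["rtt"] * min(node.rtt_ms / RTT_SCALE_MS, 1.0)
            + WEIGHTS["sessions"] * min(node.sessions / SESSION_SCALE, 1.0))


def pick_node(nodes: List[NodeStats]) -> NodeStats:
    """Return the node with the lowest weighted load score."""
    return min(nodes, key=load_score)


if __name__ == "__main__":
    nodes = [NodeStats("a", 0.9, 0.5, 20.0, 100),
             NodeStats("b", 0.2, 0.3, 80.0, 400)]
    print(pick_node(nodes).name)
```

In this sketch node "b" is chosen even though its round trip time is higher, because its CPU and memory load are much lower. Changing the weights changes which metric dominates the decision.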
eCluster™ provides network traffic distribution, network fail-safe operation,
and CFS™ for both the Internet and intranets. Network traffic can be
distributed among a large number of networked heterogeneous servers and
workstations. eCluster™ is able to detect busy servers and crashed network
services. Once a problem is detected, eCluster™ routes traffic to the server
that is not busy and has the most available resources. The email subsystem
reports to administrators in real time if there are any resource shortages or
clustered node malfunctions. The emails can be sent to email-capable pagers
or cell phones. eCluster™ is able to cluster any servers and workstations on
the Internet by grouping them together into virtual cluster groups.
Network traffic distribution manages Internet bandwidth and speeds up
Internet services, such as web and FTP services. Network traffic is routed
only to a server that is functioning well and has the most system resources.
For example, if a server node is short of virtual memory or file system
space, or has too many network connections, Internet traffic is routed to the
node with the most system resources. Should a malfunction occur, an email
alert is sent to a pre-specified email address so that administrators are
notified, and the malfunctioning node is taken offline. eCluster™ provides
better performance and faster response to users with fail safe assurance.
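The routing rule in the paragraph above (skip nodes that are short on virtual memory, file system space, or connection capacity, alert the administrator, and send traffic to the node with the most resources) could be sketched as follows. The thresholds, the `Node` fields, the ranking by free virtual memory first, and the `alert` callback standing in for the email subsystem are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    name: str
    free_vm_mb: int
    free_disk_mb: int
    connections: int
    online: bool = True


# Hypothetical thresholds, for illustration only.
MIN_FREE_VM_MB = 64
MIN_FREE_DISK_MB = 100
MAX_CONNECTIONS = 500


def shortage(node: Node) -> Optional[str]:
    """Return a description of the first resource shortage found, or None."""
    if node.free_vm_mb < MIN_FREE_VM_MB:
        return "virtual memory"
    if node.free_disk_mb < MIN_FREE_DISK_MB:
        return "file system space"
    if node.connections > MAX_CONNECTIONS:
        return "network connections"
    return None


def route(nodes: List[Node], alert: Callable[[str], None]) -> Optional[Node]:
    """Take short-on-resource nodes offline, alert once, and pick the best remaining node."""
    for node in nodes:
        reason = shortage(node)
        if node.online and reason is not None:
            node.online = False
            alert(f"{node.name} taken offline: short on {reason}")
    healthy = [n for n in nodes if n.online]
    if not healthy:
        return None
    # Assumed ranking: most free virtual memory, then disk, then fewest connections.
    return max(healthy, key=lambda n: (n.free_vm_mb, n.free_disk_mb, -n.connections))


if __name__ == "__main__":
    cluster = [Node("web1", 32, 5000, 10), Node("web2", 512, 2000, 100)]
    target = route(cluster, alert=print)
    print("route to:", target.name if target else "no node available")
```

A real deployment would replace the `alert` callback with a function that sends mail to the pre-specified administrator address.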
The eCluster™ fail safe capability allows uninterrupted Internet and intranet
services. Administrators may take systems down temporarily for maintenance
without interrupting Internet or intranet users. eCluster™ can have an
unlimited number of standbys. It is a true High Availability system with
large-scale redundancy.
eCluster™, with CFS™, provides file system clustering across all cluster
nodes of any given virtual cluster group. The Clustered File System Daemon
(CFSD) provides virtual file system replication and synchronization in "real
time" with encryption, through high-speed data communication channels with
7-to-1 compression. CFS™ is able to cluster nodes dispersed across the
Internet and is independent of OS and computer architecture. The CFS™
product is distributed along with the eCluster™ distribution.
The eCluster™ Monitor lets administrators view and monitor all cluster groups
at their fingertips (a single image). Once the network is eClustered,
statistics for each clustered server and workstation can be seen from a
single console. Alert messages inform administrators of resource shortages,
such as memory or hard drive space shortages. Administrators are alerted
periodically for as long as the shortage condition or failure event remains
unfixed. A server or workstation that is short of resources is taken off-line
automatically by the Cluster Master Daemon (CLMD), and Internet requests are
directed to the server or workstation with the most available resources and
the least latency.
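The periodic re-alert behavior described above can be sketched with a small scheduler that repeats an alert until the condition clears. The `AlertScheduler` class, the condition keys, and the 300-second interval are assumptions for illustration; the source does not state how often alerts repeat.

```python
from typing import Dict, List


class AlertScheduler:
    """Decide which open conditions are due for a (repeat) alert."""

    def __init__(self, repeat_interval_s: float) -> None:
        self.repeat_interval_s = repeat_interval_s
        self._last_sent: Dict[str, float] = {}

    def poll(self, now: float, open_conditions: List[str]) -> List[str]:
        """Return the conditions that should be alerted on at time `now` (seconds)."""
        # Forget conditions that have been fixed so a recurrence alerts immediately.
        for key in list(self._last_sent):
            if key not in open_conditions:
                del self._last_sent[key]
        due = []
        for key in open_conditions:
            last = self._last_sent.get(key)
            if last is None or now - last >= self.repeat_interval_s:
                self._last_sent[key] = now
                due.append(key)
        return due


if __name__ == "__main__":
    scheduler = AlertScheduler(repeat_interval_s=300)
    for t in (0, 100, 300, 600):
        print(t, scheduler.poll(t, ["db1: hard drive space shortage"]))
```

Passing the current time in as an argument keeps the sketch deterministic; a running monitor would supply a clock reading on each polling cycle.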
Design a Clustered Network - Network Load Balance and Mirroring
An ideal network is fast and never breaks. But what if an earthquake occurs,
a power generator fails, or some other catastrophic event hits the network?
What if more and more traffic flows in and out of the network until its
bandwidth is nearly depleted, and the network gets slower and then fails?
The answer to a network failure is to have another network on standby. The
answer to a network bandwidth limitation is to have several networks share
their bandwidth as a single giant network. eCluster™ addresses both problems
with Virtual Cluster Groups (VCG). With VCGs set up across several networks,
each network is able to back up the others, forming a detailed network
mirror. In a normal operating environment, eCluster™ manages and distributes
network traffic among the networks. In a catastrophic network event,
eCluster™ continues its load balancing operation among the remaining
networks while taking the failed network off-line.
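The effect of taking a failed network off-line while the others keep sharing the load can be shown with a minimal sketch. Plain round-robin is used here only to keep the example short; it stands in for, and is not, the round trip time algorithm eCluster™ uses. The network names and the `distribute` function are hypothetical.

```python
from typing import Dict


def distribute(requests: int, network_up: Dict[str, bool]) -> Dict[str, int]:
    """Spread requests round-robin over the networks currently marked as up."""
    live = [name for name, up in network_up.items() if up]
    counts = {name: 0 for name in network_up}
    if not live:
        return counts  # nothing can be served while every network is down
    for i in range(requests):
        counts[live[i % len(live)]] += 1
    return counts


if __name__ == "__main__":
    # Normal operation: three networks share the traffic.
    print(distribute(9, {"sf": True, "ny": True, "tokyo": True}))
    # One network has failed: the remaining two absorb its share.
    print(distribute(9, {"sf": True, "ny": False, "tokyo": True}))
```

In the second call the failed network receives no requests, and the same nine requests are split between the two surviving networks.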
What Platform does eCluster™ Support?
The platforms that eCluster™ runs on include:
- SUN SPARC - Solaris 2.6, 2.7, 2.8 and above
- Linux
- FreeBSD
- Windows NT 4/2000/2003/XP
Notes: For the SUN port, the master daemon works only on Solaris 2.7 and 2.8.
The CSD client daemon works on all of the above platforms, including
Windows NT 4/2000/2003/XP.