eCluster™ 3.6 - Implementation and Solution
by AiForce.net on March 14, 2002

Brief: eCluster™ design background and technical information.

The eCluster™ Design Decisions and the Implementation
eCluster™ is a fully distributed cluster with an unrestricted level of globally distributed redundancy.  It offers several advantages over a device implementation.  Device implementations come in two flavors: the DNS-based version and the NAT-based version.

The DNS-based device implementation is constrained by its hardware.  The NAT-based device implementation is constrained by its hardware and, in addition, by its routing path requirement.

A device implementation is not as flexible as a software implementation, which can run on any OS and any machine architecture.  By comparison, the software approach is an Open Implementation: users may choose their own OS, hardware configuration, and CPU architecture at low cost, and may even assemble their own hardware.  A device implementation is limited to its own embedded OS, the limited power of its embedded CPU, and the amount of memory shipped with the device.

The NAT-based device implementation requires that the device be set up as a router in the enterprise network's routing path, so that all traffic passes through the device; only then are load balancing and fail-safe operation possible.  The NAT-capable device must spend extra CPU time and memory to:

  • Inspect each packet going in and out to the Internet.
  • Search through its session tables to find a specific entry.  This search time is a linear function of the number of sessions in the network, so the search takes longer as the device handles more sessions.  A large network may have millions of sessions to and from various internal servers and user workstations; as the number of sessions grows, the search time increases and the speed at which packets are switched and routed decreases.
  • Modify each packet going in and out of the device.
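
The linear search cost described in the list above can be illustrated with a small sketch. It is not the implementation of any particular NAT device; the `Session` record, its fields, and the addresses are hypothetical, and the table is modeled as a plain list that is scanned entry by entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    # Hypothetical session record: the client endpoint and the internal server it maps to.
    client_ip: str
    client_port: int
    server_ip: str


def find_session(table, client_ip, client_port):
    """Linear scan of the session table.

    Returns (session or None, number of entries examined).
    """
    examined = 0
    for session in table:
        examined += 1
        if session.client_ip == client_ip and session.client_port == client_port:
            return session, examined
    return None, examined


def build_table(n):
    """Build n distinct sessions (valid for n up to 2**24 with this addressing)."""
    return [
        Session(f"10.{(i >> 16) & 255}.{(i >> 8) & 255}.{i & 255}", 40000, "192.168.0.10")
        for i in range(n)
    ]


small = build_table(1_000)
large = build_table(100_000)

# Worst case: the wanted entry is the last one in the table.
_, cost_small = find_session(small, small[-1].client_ip, 40000)
_, cost_large = find_session(large, large[-1].client_ip, 40000)
print(cost_small, cost_large)  # 1000 100000
```

With 100 times as many sessions, the worst-case lookup examines 100 times as many entries, and this cost is paid for every packet that crosses the device.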

In the worst case, the switch/router slows down and eventually stops working.  A failed switch/router is a disastrous event for any enterprise network.

The routing path requirement makes the device a bottleneck and a single point of failure.  The extra packet processing load slows down the vital Internet link of the whole local network.  The optimal network design frees the switch/router from as much load as possible so that it runs as fast as possible.

eCluster™ General Description
eCluster™ is a highly scalable Smart Load Balance Cluster solution that can be scaled up to 1 million Internet Cluster Groups (virtual domains), each containing up to 1024 Cluster Nodes (1 million × 1024).  With its round trip time load balance and fail-safe algorithm, one can set up groups of clusters for any OS, on any CPU, with an unlimited number of standbys.  Load balancing and fail-safe clustering ensure uninterrupted, fast eBusiness transactions for Internet services and SQL database servers such as Oracle, MS SQL, and Informix.

eCluster™ Smart Load Balance algorithms include: CPU load, weighted load, virtual memory (VM) usage, round trip time, CPU usage, traffic allocation, number of user sessions, etc.
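
As an illustration of how several of these metrics might be combined into a weighted load, the sketch below scores each node and picks the one with the lowest score. The weights, the normalization limits, the node names, and the scoring formula are all assumptions made for this example; the source does not document how eCluster™ actually combines its metrics.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    name: str
    cpu_usage: float  # fraction of CPU in use, 0.0 to 1.0
    vm_usage: float   # fraction of virtual memory in use, 0.0 to 1.0
    rtt_ms: float     # measured round trip time in milliseconds
    sessions: int     # current number of user sessions


# Hypothetical weights and normalization limits, chosen only for illustration.
WEIGHTS = {"cpu": 0.4, "vm": 0.2, "rtt": 0.2, "sessions": 0.2}
MAX_RTT_MS = 500.0
MAX_SESSIONS = 1000


def load_score(node):
    """Weighted load in the range 0.0 to 1.0; lower means more spare capacity."""
    rtt = min(node.rtt_ms / MAX_RTT_MS, 1.0)
    sessions = min(node.sessions / MAX_SESSIONS, 1.0)
    return (
        WEIGHTS["cpu"] * node.cpu_usage
        + WEIGHTS["vm"] * node.vm_usage
        + WEIGHTS["rtt"] * rtt
        + WEIGHTS["sessions"] * sessions
    )


def pick_node(nodes):
    """Return the node with the lowest weighted load."""
    return min(nodes, key=load_score)


nodes = [
    NodeStats("web-a", cpu_usage=0.9, vm_usage=0.5, rtt_ms=20, sessions=300),
    NodeStats("web-b", cpu_usage=0.3, vm_usage=0.4, rtt_ms=80, sessions=500),
    NodeStats("web-c", cpu_usage=0.2, vm_usage=0.9, rtt_ms=400, sessions=100),
]
print(pick_node(nodes).name)  # web-b
```

Here web-a is penalized for its CPU usage and web-c for its round trip time and virtual memory usage, so web-b receives the next request.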

eCluster™ provides network traffic distribution, network fail-safe operation, and CFS™ for both the Internet and intranets.  Network traffic can be distributed among a large number of networked heterogeneous servers and workstations.  eCluster™ is able to detect busy servers and crashed network services.  Once one is detected, eCluster™ routes traffic to the server that is not busy and has the most available resources.  The email subsystem reports to administrators in real time any resource shortages or cluster node malfunctions.  The emails can be sent to email-capable pagers or cell phones.  eCluster™ is able to cluster any servers and workstations on the Internet by grouping them into virtual cluster groups.

Network traffic distribution manages Internet bandwidth and speeds up Internet services such as web and FTP services.  Network traffic is routed only to a server that is functioning well and has the most system resources.  For example, if a server node is short of virtual memory or file system space, or has too many network connections, Internet traffic is routed to the node with the most system resources.  Should a malfunction occur, an email alert is sent to a pre-specified email address to notify administrators, and the malfunctioning node is taken offline.  eCluster™ thus provides better performance and faster response to users, with fail-safe assurance.
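
The routing rule in the paragraph above can be sketched as follows. This is a minimal model under stated assumptions, not eCluster™ code: the thresholds, the `Node` fields, and the ranking of "most system resources" by free virtual memory first are hypothetical, and alerts are collected in a list rather than emailed.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the source does not state actual limits.
MIN_FREE_VM_MB = 64
MIN_FREE_DISK_MB = 100
MAX_CONNECTIONS = 500


@dataclass
class Node:
    name: str
    free_vm_mb: int
    free_disk_mb: int
    connections: int
    online: bool = True


def shortages(node):
    """List the resource problems found on a node (empty if healthy)."""
    problems = []
    if node.free_vm_mb < MIN_FREE_VM_MB:
        problems.append("virtual memory")
    if node.free_disk_mb < MIN_FREE_DISK_MB:
        problems.append("file system space")
    if node.connections > MAX_CONNECTIONS:
        problems.append("too many connections")
    return problems


def route(nodes, alerts):
    """Take short nodes offline, record alerts, and return the best remaining node."""
    for node in nodes:
        problems = shortages(node)
        if node.online and problems:
            node.online = False
            alerts.append(f"{node.name} taken offline: {', '.join(problems)}")
    candidates = [n for n in nodes if n.online]
    if not candidates:
        return None  # no healthy node is left
    # Assumed ranking: most free virtual memory, then disk, then fewest connections.
    return max(candidates, key=lambda n: (n.free_vm_mb, n.free_disk_mb, -n.connections))


alerts = []
cluster = [
    Node("node-a", free_vm_mb=32, free_disk_mb=500, connections=10),
    Node("node-b", free_vm_mb=256, free_disk_mb=2000, connections=100),
    Node("node-c", free_vm_mb=512, free_disk_mb=1500, connections=600),
]
chosen = route(cluster, alerts)
print(chosen.name)  # node-b
print(alerts)
```

node-a is short of virtual memory and node-c has too many connections, so both are taken offline and node-b is the only candidate left.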

The eCluster™ fail-safe capability allows uninterrupted Internet and intranet services.  Administrators may take systems down temporarily for maintenance without interrupting Internet or intranet users.  eCluster™ can have an unlimited number of standbys.  It is a true High Availability system with large-scale redundancy.

eCluster™, with CFS™, provides file system clustering across all cluster nodes for any given virtual cluster group.  The Clustered File System Daemon (CFSD) provides virtual file system replication and synchronization in "real time" with encryption, through high-speed data communication channels with 7-to-1 compression.  CFS™ is able to cluster nodes dispersed across the Internet and is independent of OS and computer architecture.  The CFS™ product is distributed along with the eCluster™ distribution.

The eCluster™ Monitor lets administrators view and monitor all cluster groups at their fingertips (a single image).  Once the network is eClustered, statistics for each clustered server and workstation can be seen from a single console.  Alert messages inform administrators of resource shortages, such as memory or hard drive space shortages.  Administrators are alerted periodically for as long as the shortage condition or failure event remains unfixed.  A server or workstation that is short of resources is taken offline automatically by the Cluster Master Daemon (CLMD), and Internet requests are directed to the server or workstation with the most available resources and the least latency.
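
The periodic alerting described above can be sketched as a small scheduler that repeats an alert for as long as a condition stays open. The class name, the five-minute interval, and the condition strings are hypothetical; the source does not say how often eCluster™ repeats its alerts.

```python
REALERT_INTERVAL_S = 300  # hypothetical repeat interval, in seconds


class AlertScheduler:
    """Decides when an alert for an open condition should be (re)sent."""

    def __init__(self, interval_s=REALERT_INTERVAL_S):
        self.interval_s = interval_s
        self._last_sent = {}  # condition -> time the last alert was sent

    def due(self, condition, now):
        """Return True if an alert should be sent at time `now` (in seconds)."""
        last = self._last_sent.get(condition)
        if last is None or now - last >= self.interval_s:
            self._last_sent[condition] = now
            return True
        return False

    def resolve(self, condition):
        """Forget a condition once the administrator has fixed it."""
        self._last_sent.pop(condition, None)


scheduler = AlertScheduler()
first = scheduler.due("node-a: disk space", now=0)       # True: first alert
too_soon = scheduler.due("node-a: disk space", now=100)  # False: within the interval
repeat = scheduler.due("node-a: disk space", now=300)    # True: still not fixed
scheduler.resolve("node-a: disk space")
print(first, too_soon, repeat)  # True False True
```

Passing the current time in as an argument keeps the sketch deterministic; a real daemon would read it from the system clock.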

Design a Clustered Network - Network Load Balance and Mirroring
An ideal network is fast and never fails.  But what if an earthquake occurs, a power generator fails, or some other catastrophic event strikes the network?  What if more and more traffic flows in and out of the network until its bandwidth is nearly depleted?  The network gets slower and eventually fails.

The solution to network failure is to have another network on standby.  The solution to limited network bandwidth is to have several networks share their bandwidth as a single giant network.  eCluster™ answers both problems with Virtual Cluster Groups (VCGs).  With VCGs set up across several networks, each network is able to back up the others, and a network mirror is formed.  In a normal operating environment, eCluster™ manages and distributes network traffic among the networks.  In a catastrophic network event, eCluster™ continues its load balancing among the remaining networks while taking the failed network offline.
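
A minimal sketch of this behavior, assuming traffic is shared in proportion to each network's bandwidth: the proportional rule, the site names, and the bandwidth figures are illustrative assumptions, since the source only states that traffic is distributed among the networks and that a failed network is taken offline.

```python
def traffic_shares(networks):
    """Split traffic among the networks that are up, in proportion to bandwidth.

    `networks` maps a network name to (bandwidth_mbps, is_up).
    Returns a dict of name -> fraction of traffic; failed networks get none.
    """
    up = {name: bandwidth for name, (bandwidth, is_up) in networks.items() if is_up}
    total = sum(up.values())
    if total == 0:
        return {}  # every network is down
    return {name: bandwidth / total for name, bandwidth in up.items()}


# Normal operation: three mirrored networks share the load.
networks = {"site-a": (100, True), "site-b": (50, True), "site-c": (50, True)}
normal = traffic_shares(networks)
print(normal)  # {'site-a': 0.5, 'site-b': 0.25, 'site-c': 0.25}

# Catastrophic event at site-a: the remaining networks absorb its traffic.
networks["site-a"] = (100, False)
degraded = traffic_shares(networks)
print(degraded)  # {'site-b': 0.5, 'site-c': 0.5}
```

The failed network simply drops out of the calculation, so load balancing continues over whatever capacity remains.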

What Platforms Does eCluster™ Support?
eCluster™ runs on the following platforms:

  • SUN SPARC - Solaris 2.6, 2.7, 2.8, and above
  • Linux
  • FreeBSD
  • Windows NT 4/2000/2003/XP

Note: For the SUN port, the master daemon works only on Solaris 2.7 and 2.8.  The CSD client daemon works on all of the above platforms, including Windows NT 4/2000/2003/XP.

Copyright © 2002 AiForce.net. All Rights Reserved.