ASCE Networks InChorus
Global Server Load Balancing
Technical White Paper

Introduction

Global server load balancing (GSLB) allows web hosters, portals and enterprises to distribute content and services geographically. Dispersing content and services offers a number of advantages, including:
Allowing users to be automatically directed to content from servers located in their own geographic region thus reducing response times and decreasing the use of expensive international data connections.
Directing users away from congested networks and servers, enhancing the users' experience.
Increasing fault-tolerance and availability by allowing multi-site content and service deployment, guarding against failures in the event of local or regional network outages, power outages or natural disasters.
ASCE Networks Web switches running GSLB direct user requests to the "best site" to service the requests using three criteria:
site health,
site proximity and
response time required to retrieve specified content.
Global server load balancing is explained here in the context of HTTP and the World Wide Web but is by no means limited to HTTP. Any service that can be load balanced with ASCE Networks' Web switches can operate with GSLB.

GSLB Operation Overview

When client Z loads their browser and enters the URL: http://www.site.com (see Figure 1), the system sends a DNS getByHostname query to the client's local DNS server, asking for the IP address that represents www.site.com.
The local DNS server examines its DNS cache to determine if it already knows about this particular domain name and host. If it doesn't, the local DNS server hands the request off to the appropriate upstream DNS server.
The DNS query is either responded to by an upstream DNS server's cache or is passed on until the request arrives at a DNS server embedded in one of the Web switches at site A, B, or C (in this case site A). Which site ultimately receives the request is determined by a myriad of DNS configuration parameters.
The Web switches at sites A, B and C are configured to be "distributed sites" and all can act as Authoritative Name Servers for the domain www.site.com. Each can respond directly to DNS queries with IP addresses that represent that domain.
For example, if site A receives the DNS query, the IP address that site A returns represents a Virtual IP (VIP) address for one of the sites hosting the requested content (in this case site B).

Figure 1 - GSLB Operation

In the example above, assume that site A returns the IP address 172.176.110.20. The client receives the DNS query response from its local DNS server indicating that 172.176.110.20 is the IP address for www.site.com. It then opens a TCP connection on port 80 to 172.176.110.20, the VIP address running at site B. The client is now communicating with the ASCE Networks Web site, with content served from site B.
So, how did site A determine that site B was the right site to handle the client's request? How is site B "better" than the other two possible sites, including the one that responded to the DNS query? With GSLB, three criteria are used to determine to which site DNS will direct the client:
Site health
Geographic location of the client and site(s)
Measured site response time
GSLB develops an ordered list of sites that DNS uses when responding to client requests. The above criteria are used to determine if and where on the list a site appears (this is detailed later).
What happens if the site to which the client has been pointed suddenly experiences a failure or is overloaded? Assuming the Web switch running GSLB and its Internet connection are up, the Web switch issues an HTTP Redirect back to the client, telling it to go to a different site.
This occurs when a VIP no longer has any healthy real IP addresses (RIPs) or when an HTTP request is sent to real servers that have reached their respective maximum connection thresholds.
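The two redirect conditions above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `RealServer` type, its fields and `should_redirect` are hypothetical names chosen for the example, not part of any ASCE Networks interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RealServer:
    """Hypothetical view of one real server (RIP) behind a VIP."""
    healthy: bool
    connections: int
    max_connections: int


def should_redirect(servers: List[RealServer]) -> bool:
    """Return True when the VIP cannot serve a new request locally.

    Covers the two cases named in the text: the VIP has no healthy real
    servers, or every healthy real server has reached its maximum
    connection threshold.
    """
    usable = [s for s in servers if s.healthy]
    if not usable:
        return True
    return all(s.connections >= s.max_connections for s in usable)
```

When the function returns True, the switch in this sketch would answer with an HTTP Redirect to another site instead of forwarding the request to a local server.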

Major Components

GSLB consists of four major components that run on each Web switch in the GSLB group:
Distributed Site Monitoring -- where a Web switch at each site performs Layer 4 health checking (with content verification as an option) on all other peer remote sites. This determines the health and response time of servers and applications at each site.
The Distributed Site State Protocol (DSSP) -- used to exchange health, load, response time and throughput information between sites through both periodic updates during normal operation and triggered updates when a significant event occurs.
Internet Topology Awareness -- where a Web switch acting as an Authoritative Name Server examines DNS requests and considers geography when responding.
A DNS Authoritative Name Server -- responds to DNS requests directed to that site.


Distributed Site Monitoring

A Web switch at each distributed site performs periodic health and response time checks of each defined remote real IP (RIP) address. These remote RIPs (i.e. devices participating in the GSLB operation) typically correspond to VIP addresses running in Web switches at peer sites being load balanced by GSLB. By executing configurable, iterative health checks to each remote RIP, a site learns about its peer sites' server, application and content availability and response time.
Each health check consists of opening and closing a TCP connection for a configured application. For applications/protocols where the Web switch supports content health checking (HTTP, FTP, NNTP, DNS, SMTP and POP3), content access can also be configured as part of the health check. Content is accessed based on the content configuration (URL, filename, etc.) defined by the system administrator.
When content-based health checking is used, response time is defined as the time from when the Web switch issues the request to open the connection to the time it closes the connection, including the time needed to retrieve the content. Without content-based health checking, the time needed to retrieve content is not a factor.
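A connection-only health check of the kind described above might look like the following sketch. It times the interval from the request to open the TCP connection until the connection is closed, and does not attempt content retrieval. The function name `tcp_health_check` and the default timeout are assumptions made for the example; the actual switch implementation is not documented here.

```python
import socket
import time
from typing import Optional, Tuple


def tcp_health_check(host: str, port: int,
                     timeout: float = 2.0) -> Tuple[bool, Optional[float]]:
    """Open and close a TCP connection to (host, port).

    Returns (healthy, response_time_seconds). The response time is None
    when the connection could not be established within the timeout.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass  # connection opened; leaving the block closes it
    except OSError:
        return False, None
    return True, time.monotonic() - start
```

A content-based variant would issue a request (for example an HTTP GET for the configured URL) before closing the connection, so that retrieval time is included in the measurement.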
Each Web switch performs this health and response time check for each defined remote RIP, each corresponding to a VIP address running in a Web switch at another site.
For instance, if site A sees four other sites and there are five VIPs defined on the Web switch at site A (each having corresponding remote RIPs at each site), then the Web switch at site A performs 20 health and response time checks during the health check interval (4 sites times 5 remote RIPs at each site).
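The count in this example can be checked by enumerating one check per (peer site, remote RIP) pair. The site names B through E and the `vip1` to `vip5` labels are placeholders invented for the illustration.

```python
from itertools import product

peer_sites = ["B", "C", "D", "E"]        # the four other sites seen by site A
vips = [f"vip{i}" for i in range(1, 6)]  # five VIPs, each with a remote RIP per site

# One health and response time check per (peer site, remote RIP) pair.
checks = list(product(peer_sites, vips))
print(len(checks))  # 20
```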
It's important to note that remote RIP health checks don't stop at the Web switch hosting the remote VIP. The remote Web switch passes them through to a server or servers behind the Web switch.
If the Web switch is in front of a group of load balanced servers, the health checks are distributed across the servers in accordance with the configured load balancing metric. As a result, remote health checks determine the availability of not only remote Web switches but also the servers, applications and, if configured, the content behind the Web switches.
If a Web switch flags a remote RIP as down because it does not respond to health checks, the Web switch:
No longer considers the site eligible for connection handoffs and stops using the remote Web switch's VIP address as a target for DNS responses.
Notifies all other distributed sites that the site is not responsive. Each distributed site may then test to see if the site is responsive and act accordingly.

Distributed Site State Protocol

The Distributed Site State Protocol (DSSP) is a lightweight protocol used to communicate health and response time information from one distributed site to every other distributed site. Each DSSP packet communicates:
Response times for each peer site as measured by the site transmitting the DSSP packet.
Remaining site capacity (connections available per VIP address) of the transmitting site.
Status of the transmitting site.
Each site uses the information communicated by the DSSP, plus its own response time checking results, to construct a table of response times for all sites as measured by all sites. This information is, in turn, used to calculate the desired relative traffic distribution between the distributed sites, including itself. For example, the sites might determine that:
Site A should receive 20% of all traffic.
Site B should receive 10% of all traffic.
Site C should receive 10% of all traffic.
Site D should receive 20% of all traffic.
Site E should receive 10% of all traffic.
Site F should receive 30% of all traffic.
The DNS authoritative name server in each Web switch uses these percentages to determine how often each site's VIP address should be included in responses it sends to downstream DNS servers.
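One way to picture how such percentages could drive DNS responses is a weighted random choice over the sites. The sketch below uses the example percentages from the list above; `pick_site` is a hypothetical helper, and the real switch may well use a different selection mechanism (for example a deterministic weighted rotation).

```python
import random
from collections import Counter

# Example relative traffic distribution from the text, in percent.
distribution = {"A": 20, "B": 10, "C": 10, "D": 20, "E": 10, "F": 30}


def pick_site(dist, rng=random):
    """Pick one site, with probability proportional to its percentage."""
    sites = list(dist)
    weights = [dist[s] for s in sites]
    return rng.choices(sites, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Over many DNS responses, each site appears roughly as often as its share.
    counts = Counter(pick_site(distribution) for _ in range(10_000))
    for site in sorted(distribution):
        print(site, counts[site] / 10_000)
```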

The advantages to this algorithm include:
Sites that perform the best will generally receive more connections than other sites, but not all of the connections. This prevents traffic spikes from overloading individual sites.
The traffic will be averaged across the top sites, providing consistently good response times and user experiences.
The sites that are seen as poorly performing by all other sites (an indication of a real problem) will tend to receive few or no connections, providing relief while they process their existing load or corrective action is performed.
If every site is performing well (including WAN links, servers, etc.) then it's likely that each site will receive an equal distribution of traffic over time. This ensures that sites don't get overloaded but also perform their share of the work.
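The paper does not give the formula used to turn measured response times into traffic percentages. Purely as an assumption for illustration, the sketch below weights each site by the inverse of its mean response time, which reproduces the properties listed above: faster sites get more traffic but not all of it, unresponsive sites get none, and equally performing sites get equal shares. The function name `traffic_shares` is hypothetical.

```python
def traffic_shares(response_times_ms):
    """Map {site: mean response time in ms, or None if down} to {site: percent}.

    Assumed rule (not from the source): share is proportional to
    1 / response time; sites that are down receive 0 percent.
    """
    inverse = {s: 1.0 / t for s, t in response_times_ms.items() if t}
    total = sum(inverse.values())
    return {
        s: (100.0 * inverse.get(s, 0.0) / total) if total else 0.0
        for s in response_times_ms
    }
```

For instance, with two healthy sites at 50 ms and 100 ms, the faster site receives about two thirds of the traffic and the slower site the remaining third.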

In addition to regular updates, Web switches send DSSP triggered updates under the following exception conditions:
The Web switch is no longer able to communicate with a remote RIP.
The Web switch experiences a local resource constraint, such as all servers reaching their maximum connection limits or no real servers being available for a VIP.

A DSSP triggered update contains all of the information in a regular update.
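The contents of a DSSP update, regular or triggered, can be modeled as a simple record. The wire format of DSSP is not described in this paper, so the class and field names below are hypothetical; they only mirror the three items each packet is said to carry, plus a flag distinguishing triggered updates.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class DsspUpdate:
    """Hypothetical in-memory model of one DSSP update."""
    sender: str                      # transmitting site
    status: str                      # status of the transmitting site
    peer_response_ms: Dict[str, float] = field(default_factory=dict)
    remaining_connections: Dict[str, int] = field(default_factory=dict)  # per VIP
    triggered: bool = False          # True for exception-driven updates


def merge_update(table: Dict[Tuple[str, str], float],
                 update: DsspUpdate) -> Dict[Tuple[str, str], float]:
    """Record the sender's measurements in a table keyed by
    (measuring site, measured site), i.e. response times for all sites
    as measured by all sites."""
    for peer, ms in update.peer_response_ms.items():
        table[(update.sender, peer)] = ms
    return table
```

A triggered update uses the same record with `triggered=True`, consistent with the statement that it carries all of the information in a regular update.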

Internet Topology Awareness

In addition to site health and response time, GSLB takes geographic information into account when determining which distributed site should handle a request. Consider, for instance, five sites that host content for a given host and domain name, one each in San Jose (West-U.S.); Atlanta (East-U.S.); Ecuador (South America); Paris, France; and Tokyo, Japan.
In general, associating users within a geographic domain (country, continent) with servers in that domain optimizes the user's experience (unless the "nearby" site is down or overloaded).
With this in mind, users in Europe will generally be served by the Paris site while users in Chile will be served by the Ecuador site. Having a user in Japan come to the Atlanta site for content would waste expensive international bandwidth and cause unnecessary response delays to the user. Switches at distributed sites also consider geography when responding to DNS requests.
When a Web switch receives a DNS request, it recognizes the geographic source of the request by inspecting the source IP address of the request. It then consults the relative traffic distribution table (described in the Distributed Site State Protocol section) for that geographic area to determine which site within the area the DNS response should indicate.
For example, if the requesting host is located somewhere in the Pacific Rim area, it will be pointed to the server in Tokyo. If the requesting host is located somewhere in the United States, the Web switch will consult the relative traffic distribution table for the U.S. to determine if the host should be pointed to Atlanta or San Jose.
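The source-address lookup can be sketched with a table that maps address blocks to regions. How the Web switch actually associates addresses with geography is not specified here; the region names and the prefixes below (taken from the reserved documentation ranges) are placeholders, and `region_for` is a hypothetical helper.

```python
import ipaddress

# Hypothetical mapping of address blocks to geographic regions.
REGION_PREFIXES = {
    "pacific-rim": [ipaddress.ip_network("203.0.113.0/24")],
    "us": [ipaddress.ip_network("198.51.100.0/24")],
    "europe": [ipaddress.ip_network("192.0.2.0/24")],
}


def region_for(source_ip: str, default: str = "unknown") -> str:
    """Return the region whose prefix list contains the DNS request's source IP."""
    addr = ipaddress.ip_address(source_ip)
    for region, networks in REGION_PREFIXES.items():
        if any(addr in net for net in networks):
            return region
    return default
```

Note that the source address seen by the switch is that of the requesting DNS server, so the lookup reflects the location of the client's resolver rather than the client itself.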

DNS Authoritative Name Server

Ultimately, GSLB is accomplished by the DNS Authoritative Name Server running in the Web switches at distributed sites returning the appropriate IP address to downstream DNS servers.
For example, when a client enters a URL into their browser for a particular hostname (represented by several VIPs scattered throughout the U.S.), their system sends a DNS getByHostname query to their local DNS server, asking for the IP address representing that domain name and host.
The local DNS server then examines its DNS cache to determine if it already knows about this particular domain name and host. If it doesn't know about the hostname, it hands off the request to the next appropriate DNS server. The DNS query is either responded to from that DNS server's cache or is passed on until the request comes to a Web switch at a distributed site running GSLB.
When a Web switch at a distributed site receives a DNS query to resolve a hostname from a downstream DNS server, it determines to which geographic region the requesting host belongs. It then checks to see if any healthy distributed sites are present in that region. If there are none, it looks for healthy distributed sites in other regions.
Otherwise, the Web switch provides a DNS response containing an IP address based on the relative traffic distribution table for that region. The IP address will change from response to response based on the percentages in the relative traffic distribution table.
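The decision described above (prefer healthy sites in the requester's region, fall back to healthy sites elsewhere, then choose according to the traffic percentages) can be sketched as follows. The site records, the `share` field and `choose_vip` are hypothetical, and the weighted random choice is an assumption about how the percentages are applied.

```python
import random


def choose_vip(region, sites, rng=random):
    """Return the VIP to place in the DNS response, or None if no site is healthy.

    sites: list of dicts with keys 'vip', 'region', 'healthy' and 'share'
    (the site's percentage in the relative traffic distribution table).
    """
    healthy = [s for s in sites if s["healthy"]]
    # Prefer healthy sites in the requester's region; otherwise use any healthy site.
    candidates = [s for s in healthy if s["region"] == region] or healthy
    if not candidates:
        return None
    weights = [s["share"] for s in candidates]
    if sum(weights) <= 0:
        weights = [1] * len(candidates)  # no usable shares: treat sites equally
    return rng.choices(candidates, weights=weights, k=1)[0]["vip"]


if __name__ == "__main__":
    sites = [
        {"vip": "192.0.2.10", "region": "us", "healthy": True, "share": 60},
        {"vip": "192.0.2.20", "region": "us", "healthy": True, "share": 40},
        {"vip": "192.0.2.30", "region": "pacific-rim", "healthy": True, "share": 100},
    ]
    print(choose_vip("us", sites))
```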

ASCE NETWORKS' GSLB Advantages

While competitive solutions for distributing load across geographically distributed servers exist, none offer the full range of advantages supported by ASCE Networks' GSLB. These advantages include:
Local server load balancing, global server load balancing, application redirection, Layer 2 and 3 switching within a single platform. This enables additional applications such as Web cache redirection, DNS redirection, firewall load balancing and router load balancing. Today, no competitive product can match this level of integration and flexibility.
GSLB directs users to the best performing sites within a geographic region. Competitive solutions that rely on metrics such as router hops fail to consider important factors such as network congestion and server load when making their load balancing decisions. Only ASCE Networks' GSLB effectively takes these factors into consideration.
Intelligent load distribution funnels most traffic to the best performing sites without overwhelming them. All resources are effectively used, improving the user's experience.
Users are automatically redirected to the next best site when all servers at a site are down or congested, improving service and content availability.


Summary

Global Server Load Balancing (GSLB) allows Web hosters, portals and enterprises to enhance users' Web experiences by reducing response times and increasing service and application availability.
At the same time, organizations deploying GSLB benefit through the decreased use of expensive international data connections. GSLB is an optional software capability on ASCE Networks Web switches such as the InChorus 2, InChorus 3 and ACEswitch 180. GSLB running on these Web switches directs user requests to the "best site" to service the requests. The "best site" is determined by monitoring site health, site proximity and response time.

Copyright © 2008 Asce Networks, Inc. All Rights Reserved.