Proceedings Template word

EXAMPLE MULTILEVEL FAILOVER

Date	17.12.2020

the-akamai-network-a-platform-for-high-performance-internet-applications-technical-publication

8. EXAMPLE MULTILEVEL FAILOVER
As we mentioned in Section 4.3, we take an approach similar to recovery-oriented computing throughout our platform design: we assume that failures are an inevitable part of operation and that the system must be able to operate regardless. We now look briefly at a concrete application of this approach by examining how Akamai content delivery services maintain availability in a scenario of multiple component failures (more details are given in [1], based on an older version of the system).
To understand how this works, we must first look at the basic flow of an HTTP request to the Akamai network. Initially, a DNS lookup is made to resolve the Akamai hostname. The DNS resolution takes several steps. The first request goes to generic TLD servers, which return Akamai Top Level Name Servers (TLNS) as authorities, generally with long DNS TTLs. The Akamai TLNS are globally distributed, using a mixture of IP Anycast and large clusters. The next query, to an Akamai TLNS, returns delegations with shorter DNS TTLs to a number of Akamai Low Level Name Servers (LLNS). The Akamai LLNS are typically located in close network proximity to the resolving name server. The final query, to an Akamai LLNS, returns edge server IP addresses based on both the cluster assignment and the low level map described above. These answers have very short TTLs so that changes to the mapping assignments (such as those made in response to failures or shifts in demand) can be rapidly distributed to end users.
The end user's browser then makes an HTTP request to the edge server IP address to receive the content. If the content is not already in cache, the edge server retrieves it from the origin server and then delivers it to the end user.
Now consider the following types of failure:
Machine failure: Within an edge cluster, Akamai implements high-availability techniques that we have evolved from principles similar to those in TCPHA [42]. This allows for a virtually seamless response to machine failures, as another machine will start responding to the IP address of the failed machine. In addition, the low level map is updated every few seconds, redirecting new requests as appropriate to accommodate the failure.
Cluster failure: When an entire cluster fails or experiences unreliable connectivity, the cluster assignment from mapping is rapidly updated so that it no longer hands out clusters that have failed or are experiencing connectivity issues.
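The two failure levels described so far can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not a description of the actual Akamai implementation: the names `Cluster`, `fail_machine`, `assign_cluster`, and `resolve` are hypothetical, the IP addresses come from ranges reserved for documentation, and cluster preference is reduced to a fixed ordered list rather than a real mapping computation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Cluster:
    """Hypothetical edge cluster used only for illustration."""
    name: str
    machines: Dict[str, bool]          # machine name -> is it healthy?
    reachable: bool = True             # False models a cluster-wide outage
    ip_owner: Dict[str, str] = field(default_factory=dict)  # IP -> machine

    def live_machines(self) -> List[str]:
        return sorted(m for m, ok in self.machines.items() if ok)

    def fail_machine(self, machine: str) -> None:
        """Machine-level failover: a healthy peer starts answering
        for the IP addresses of the failed machine."""
        self.machines[machine] = False
        peers = self.live_machines()
        for ip, owner in list(self.ip_owner.items()):
            if owner == machine and peers:
                self.ip_owner[ip] = peers[0]

    def healthy(self) -> bool:
        return self.reachable and bool(self.live_machines())


def assign_cluster(preferences: List[Cluster]) -> Optional[Cluster]:
    """Cluster-level failover: skip clusters that have failed or are
    unreachable, and hand out the first healthy one."""
    for cluster in preferences:
        if cluster.healthy():
            return cluster
    return None


def resolve(preferences: List[Cluster]) -> List[str]:
    """IPs a low-level answer could contain (short TTL assumed, so a
    changed answer reaches end users quickly)."""
    cluster = assign_cluster(preferences)
    if cluster is None:
        return []
    return sorted(ip for ip, owner in cluster.ip_owner.items()
                  if cluster.machines[owner])


if __name__ == "__main__":
    a = Cluster("cluster-a", {"a1": True, "a2": True},
                ip_owner={"192.0.2.1": "a1", "192.0.2.2": "a2"})
    b = Cluster("cluster-b", {"b1": True},
                ip_owner={"198.51.100.1": "b1"})

    a.fail_machine("a1")
    print(resolve([a, b]))   # ['192.0.2.1', '192.0.2.2'], both served by a2

    a.reachable = False
    print(resolve([a, b]))   # ['198.51.100.1'], traffic moves to cluster-b
```

In this model a machine failure leaves the DNS answer unchanged, because a peer takes over the failed machine's address, while a cluster failure changes the answer itself, which is why the short TTLs on the final DNS response matter.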
