In addition to the robust platform availability that is its direct goal, there are a couple of useful byproducts of the recovery-oriented design philosophy. The first is a significant reduction in the number of operations staff needed to manage the network. Because the network is designed with the assumption that components
are failing at all times, staff do not need to worry about most failures nor rush to address them. Moreover, staff can be aggressive in proactively suspending components if they
have the slightest concern, since doing so will not affect the performance of the overall system. Even though the operations staff is itself distributed across multiple sites, the human staff is not in the critical path for the operation of the network. A second benefit is the ability to roll out software updates in a rapid
and non-disruptive manner, as described in Section 7.3. Again, because the failure of a number of machines or clusters does not affect the overall system, zoned software rollouts can be performed quickly and frequently without
disrupting services to Akamai's customers. Some interesting metrics relating to the two benefits we cite here are presented in [1].