Geographically Distributed Application Environments
Posted by Mike Brunt at 12:25 AM
Categories: ColdFusion | JRun-J2EE
Over 2007 and 2008 I worked with three clients who were looking to create a geographically distributed application environment. Their main reason was maximum redundancy: in the event of a complete data center failure, they would still be covered. Such catastrophic failures are rare, but they do happen. In late 2007 Rackspace had such a failure in Texas; an independent report of what happened can be found here.
In working with clients to analyze the needs and costs of setting up geographically distributed application environments, we found a fairly significant difference in cost between an active/active environment and an active/passive one. In active/passive, only one data center serves traffic while the other stands by. In active/active, both data centers handle traffic at the same time, and this presents two fairly difficult challenges which need costly solutions.
- DNS Clustering: There needs to be some way to direct traffic to one or the other of the active data center locations. This would almost certainly need what is called "stickiness": a user directed to one data center should remain there for some period of time, which may be the duration of a single session or perhaps longer. In the event of a failure at one data center, the user would be failed over to the second active center, and this introduces the second major challenge and cost: replication.
- Replication/Synchronization: In an active/active geographically distributed application environment, content needs to be as close to identical in both data center locations as possible. Stickiness of a user to one location helps a little, in the sense that data can be updated in that local environment and the user will see those changes immediately. However, if the user is failed over mid-session, or if the data they are changing is critical and needs to be available everywhere immediately, then near real-time replication is needed, and this is a great challenge. Typically, web site content can be replicated quickly across some sort of WAN connection because, outside of large media files, the size of the data is relatively small. Database content, however, is usually much larger. It is not uncommon for databases to be multiple gigabytes or even terabytes in size. The cost of providing large enough "pipes" between geographically distributed data centers can be very considerable.
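To make the two challenges a little more concrete, here is a minimal Python sketch. It is an illustration only, not how any particular DNS clustering product works. The `StickyRouter` class, the site names "east" and "west", and the `transfer_hours` helper are all hypothetical names made up for this example. The router keeps a user on one data center until that site is marked down and then fails the user over. The helper estimates how long a given amount of data would take to cross a link, assuming the full line rate is available and ignoring protocol overhead.

```python
from dataclasses import dataclass, field


@dataclass
class StickyRouter:
    """Toy stand-in for a traffic director with session stickiness (hypothetical)."""

    datacenters: list                                 # site names, e.g. ["east", "west"]
    healthy: dict = field(default_factory=dict)       # site -> True/False from health checks
    assignments: dict = field(default_factory=dict)   # user id -> site currently serving them

    def __post_init__(self):
        # Assume every site is healthy until a health check says otherwise.
        for dc in self.datacenters:
            self.healthy.setdefault(dc, True)

    def mark_down(self, dc: str) -> None:
        """Record that a site has failed its health check."""
        self.healthy[dc] = False

    def route(self, user_id: str) -> str:
        """Return the site for this user, staying sticky while that site is up."""
        current = self.assignments.get(user_id)
        if current is not None and self.healthy[current]:
            return current  # stickiness: keep the user where they already are

        live = [dc for dc in self.datacenters if self.healthy[dc]]
        if not live:
            raise RuntimeError("no healthy data center available")

        # New or failed-over user: pick a live site deterministically from the id.
        chosen = live[sum(user_id.encode()) % len(live)]
        self.assignments[user_id] = chosen
        return chosen


def transfer_hours(size_gb: float, link_mbps: float) -> float:
    """Hours to move size_gb (decimal gigabytes) over a link_mbps link at full rate."""
    bits = size_gb * 8e9
    seconds = bits / (link_mbps * 1e6)
    return seconds / 3600


if __name__ == "__main__":
    router = StickyRouter(["east", "west"])
    first = router.route("alice")
    router.mark_down(first)            # simulate a complete site failure
    print(first, "->", router.route("alice"))
    print(round(transfer_hours(1000, 100), 1), "hours")
```

The failover itself is the easy part. What the sketch leaves out is exactly what makes active/active expensive: after the failover, the second site is only useful if it already holds the user's data. As a rough guide, `transfer_hours(1000, 100)` puts a full copy of a 1 TB database over a dedicated 100 Mbit/s link at about 22 hours, which is why near real-time replication tends to call for much larger pipes or for shipping only the changes.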
There will be other blog posts on this subject. Early in the New Year I intend to dig into a method of dealing with distributed data centers. In that piece we will look at what is known as a "Data Grid" concept. It is not a datagrid of the type we might use in Flex, for instance, but rather a way of sharing data between geographically distributed application environments. One such concept, in that space, is Oracle's "Coherence" initiative.