Shared Memory Scoped Variables and Clustering
Several clients and potential clients have asked us recently how to deal with Application-scoped variables in a ColdFusion cluster when a visitor is moved from one cluster instance to another during their time on the web site. Before answering that, I thought it would be worth giving an overview of the three shared memory scopes used in CF, with some comments about their use. I realize that many of us already know what is covered here; having said that, there are many newcomers to CF, so in my opinion it won't hurt to discuss it. Please note that the only edition of ColdFusion which supports clustering is ColdFusion Enterprise.
The three shared memory scopes are as follows:
- Server Scope: This scope is shared by all ColdFusion Applications and all users on a single server. Prior to ColdFusion MX 6, this always meant one physical server. In the J2EE world and in ColdFusion Enterprise, we entered a new era where multiple instances are the norm. An instance is directly equivalent to a physical server in pre-CF-J2EE installs. So if you have multiple ColdFusion Applications on a single instance, each instantiated using the <cfapplication name = "application_name"> tag, all those named ColdFusion Applications have access to the Server Scope as a single shared scope. Adobe will be introducing hooks into the Server Scope from Application.cfc in ColdFusion 9 (codenamed "Centaur"). The Server Scope cannot be replicated natively to other instances by the ColdFusion-J2EE clustering mechanism, so all Server Scope variables should be created and set in Application.cfm or Application.cfc. That way they are available as soon as the first visitor hits the ColdFusion Application, and to all subsequent visitors. The creating and setting of Server Scope variables should be locked using <cflock> to avoid a "race condition", which is roughly equivalent to what can happen in a database when a row that is being added or updated is read at the same time. One other very important point: in clustered environments the code for each instance should be identical. This can be achieved by using either real-time replicated code or a single shared copy of the code on something like Network Attached Storage (NAS) or a Storage Area Network (SAN). Having identical code bases ensures that even if site visitors are moved around instances or servers, the Server Scope will contain identical variables. Just to reiterate: all instantiating of Server Scope variables should be done in the root Application.cfm or Application.cfc and no lower in the code structure.
- Application Scope: The Application Scope is created using the <cfapplication name = "application_name"> tag, which creates a named ColdFusion Application. There can be multiple ColdFusion Applications on a single ColdFusion instance or server; if there are, they must all be uniquely named. Each ColdFusion Application has an Application Scope, which can contain Application Scope variables that are shared by all visitors. The Application Scope cannot be replicated natively by the ColdFusion-J2EE clustering mechanism, so all Application Scope variables should be created and set in Application.cfm or Application.cfc. That way they are available as soon as the first visitor hits the ColdFusion Application, and to all subsequent visitors. The creating and setting of Application Scope variables should also be locked using <cflock> to avoid a "race condition", which is roughly equivalent to what can happen in a database when a row that is being added or updated is read at the same time. One other very important point: in clustered environments the code for each instance should be identical. This can be achieved by using either real-time replicated code or a single shared copy of the code on something like Network Attached Storage (NAS) or a Storage Area Network (SAN). Having identical code bases ensures that even if site visitors are moved around instances or servers, the Application Scope contains identical variables once they have been instantiated. Just to reiterate: all sets of Application Scope variables should be done in the root Application.cfm or Application.cfc and no lower in the code structure. The Application Scope has a timeout value, which is set either in the ColdFusion Administrator GUI or in the <cfapplication name = "application_name"> tag; the tag setting overrides the setting in the ColdFusion Administrator GUI.
One other very important point about the Application Scope: there is another scope, the Request Scope, which is not persisted in memory and which can often be used in place of the Application Scope. During my time troubleshooting ColdFusion applications, some clients have encountered memory leakage problems where a large amount of data is persisted in the Application Scope. My opinion, as a result, is that the Request Scope should be used in preference to the Application Scope wherever possible. This is not practical, though, where frameworks such as ColdBox, Fusebox, Mach-II/Model-Glue, ColdSpring or Reactor/Transfer are used. These frameworks are persisted in the Application Scope, which means that large numbers of objects are created and held there in order to ensure acceptable performance. The Request Scope is not persisted in memory, and variables created there die at the end of each request.
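To make the advice above concrete, here is a minimal sketch of a root Application.cfc that sets Server Scope and Application Scope variables under <cflock> and keeps per-request data in the Request Scope. The application name, the datasource name and the variable names are all hypothetical, and the timeouts are illustrative only; treat this as one possible layout rather than the definitive way to do it.

```cfml
<!--- Application.cfc (root of the site): a sketch, all names below are hypothetical --->
<cfcomponent output="false">

    <cfset this.name = "myClusteredApp">
    <cfset this.applicationTimeout = createTimeSpan(1, 0, 0, 0)>
    <cfset this.sessionManagement = true>

    <!--- Runs on each instance when the first request arrives after the application starts --->
    <cffunction name="onApplicationStart" returntype="boolean" output="false">

        <!--- Application Scope: shared by all visitors of this named application --->
        <cflock scope="application" type="exclusive" timeout="10">
            <cfset application.dsn = "myDatasource">
            <cfset application.startedAt = now()>
        </cflock>

        <!--- Server Scope: shared by every application on this instance, so only set it once --->
        <cflock scope="server" type="exclusive" timeout="10">
            <cfif not structKeyExists(server, "sharedSettings")>
                <cfset server.sharedSettings = structNew()>
                <cfset server.sharedSettings.environment = "production">
            </cfif>
        </cflock>

        <cfreturn true>
    </cffunction>

    <!--- Request Scope: data placed here is discarded when the request ends --->
    <cffunction name="onRequestStart" returntype="boolean" output="false">
        <cfargument name="targetPage" type="string" required="true">
        <cfset request.startTick = getTickCount()>
        <cfreturn true>
    </cffunction>

</cfcomponent>
```

Because every instance in the cluster runs the same code, each one builds the same Server Scope and Application Scope variables for itself on startup, which is what stands in for replication here.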
- Session Scope: The Session Scope is in essence a subset of the Application Scope and contains all variables relating to a single visitor's session. There cannot be a Session Scope unless an Application Scope also exists, and the Session Scope is typically enabled within the <cfapplication> tag. The reason the Session Scope is considered a shared scope is that its variables are shared throughout a single user's session. If a web site has multiple visitors, which most do, there will be multiple sessions existing at any one time: one Session Scope per user. State is tracked either by using a cookie in the user's browser or by passing a user identifier on the URL. Passing this on the URL is not recommended. In the J2EE clustering mechanism, which is utilized by ColdFusion, Session variables can be replicated using J2EE "buddy" replication. J2EE clustering is "peer-to-peer", which means that there is no central controller of the cluster; this is a noteworthy point. All cluster members (ColdFusion instances) must have unique names, and all clusters must also have unique names. The buddy mechanism is activated when we select "Session Replication" in the ColdFusion Administrator when creating or updating a cluster. Once this is successfully activated, all Session data is replicated via the network in real time. Where there is a lot of data per Session and a large number of concurrent users, this creates a large volume of network traffic. I have seen a good number of cases where Session Replication was fraught with difficulties, such as out-of-sync data across cluster members (instances). For this reason I always recommend that the "round-robin" with "sticky sessions" algorithm be selected, so that site visitors are not bounced around a cluster with each request. As with the Application Scope, there is an alternative to the Session Scope in ColdFusion.
This is the Client Scope, and I would always recommend its use over the Session Scope. The one drawback is that there is no native way to store complex data types in the Client Scope. However, I have seen our clients get around this by serializing and deserializing the data, typically employing WDDX to do so. Typically we would persist Client variables in a database; this is configurable in the ColdFusion Administrator GUI. The big advantage of the Client Scope is that we do not need to replicate large amounts of data to each cluster member via the network in real time, and this makes it very easy to grow the cluster by adding new instances.
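The WDDX workaround mentioned above might look something like the following sketch. It assumes client management is enabled for the application and that a client storage datasource has been configured in the ColdFusion Administrator; the cart structure and all variable names are hypothetical.

```cfml
<!--- A hypothetical complex value: a struct describing a shopping cart --->
<cfset cart = structNew()>
<cfset cart.items = arrayNew(1)>
<cfset arrayAppend(cart.items, "SKU-1001")>
<cfset cart.total = 49.95>

<!--- Serialize the struct to a WDDX string, which is a simple value the Client Scope can hold --->
<cfwddx action="cfml2wddx" input="#cart#" output="cartWddx">
<cfset client.cartWddx = cartWddx>

<!--- On a later request, possibly served by a different instance, turn it back into a struct --->
<cfif isDefined("client.cartWddx") and isWDDX(client.cartWddx)>
    <cfwddx action="wddx2cfml" input="#client.cartWddx#" output="restoredCart">
    <cfoutput>Cart total: #restoredCart.total#</cfoutput>
</cfif>
```

Since the serialized string lives in the shared client storage database rather than in any one instance's memory, whichever cluster member handles the next request can read it back without any network replication between instances.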
In this blog piece I have attempted to detail the memory-resident shared scopes and how their use is affected when we move to a clustered environment. If some of you use different methodologies, it would be good to hear about them here.