The Final Nail for ColdFusion Client Variables
A gentle reminder to brace yourself
Back in 2005, I blogged about my loathsome relationship with client variables. I was content to just write a couple points down; it helps relieve my frustration. I was content, at least until a couple months ago when my company's middleware hosting department decided that if we wanted to be hosted on ColdFusion 9, we would have to change all of our session variables into client variables. Frustration was rising. I needed to write more.
First, I wrote a fix in my application. It was easy. Session variables were never disabled; the hosting group wanted us to switch for server failover safety. One tweak and I was good to go.
Second, I wrote a strongly worded letter to our company's internal CF developer email list. My friends asked if I had any more where that came from. I assured them I did.
Third, I wrote a presentation for CF.Objective() 2011 titled "The ColdFusion Client Variable Ultimate Smackdown," which I gave as a lightning talk. Most of the images below come from there.
Finally, I am writing this. The last and final word, the final nail in the coffin for ColdFusion client variables. Let's get started!
A client variable is a key/value simple data variable stored in ColdFusion's client variable scope. This scope was one that Allaire added to ColdFusion (sometime around version 3) that gives you access to variables that are somehow in-between a session and a persistent storage option. The typical life span of a client variable is 90 days, and ColdFusion handles all of the caching, storage and security by itself. Client variables are fetched from storage at the start of every request, and then any changes are posted back at the end of the request. Sounds like a great plan, but in practice, this doesn't work as well as you would hope.
Client variables fall short, and they do so in three categories. There are the common execution mistakes, the implementation complications, and the painful fundamental flaws.
It's easiest to start with those common execution problems; it will hurt more that way.
Common Execution Mistakes
You always do it wrong
These mistakes can be avoided, whether by fixing each issue pointed out here or by disabling client variables altogether. These are the most common mistakes that I see all the time, and you should be able to get around the problems with a little critical thinking and some elbow grease.
In many cases, the client scope has outright replaced the session scope. I was told to use client variables because that will be the only thing safe when my server crashes. While the client vars will live on, I need to ask - did they even need to? Session data tends to be cached credentials and other data that is easier to keep close. This kind of data can be reconstructed in the rare emergency failover (perhaps with a cookie). For other data that cannot be reconstructed, you can weigh the importance of the data against the threat of a server failover. For example, saved search criteria in a session-scoped struct is typically not mission-critical, though a multi-step form may be. Allow the search criteria to be lost, but put the form data into a database table as it comes in.
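To make the "put the form data into a database table as it comes in" advice concrete, here is a minimal sketch. It is written in Python with an in-memory SQLite database purely for illustration; it is not CFML, and the table name, column names and function names are all hypothetical. The idea is only that each form step is written to a regular table the moment it arrives, so a server failover loses nothing that matters.

```python
import sqlite3

# Hypothetical table with regular columns, one row per in-progress form
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE signup_form (
           form_id        TEXT PRIMARY KEY,
           applicant_name TEXT,
           city           TEXT
       )"""
)


def save_step_one(form_id, applicant_name):
    """Create the row as soon as the first step is submitted."""
    conn.execute(
        "INSERT INTO signup_form (form_id, applicant_name) VALUES (?, ?)",
        (form_id, applicant_name),
    )
    conn.commit()


def save_step_two(form_id, city):
    """Add the second step's data to the same row."""
    conn.execute(
        "UPDATE signup_form SET city = ? WHERE form_id = ?",
        (city, form_id),
    )
    conn.commit()


def load_form(form_id):
    """Read the form back, for example on a different server after a failover."""
    return conn.execute(
        "SELECT applicant_name, city FROM signup_form WHERE form_id = ?",
        (form_id,),
    ).fetchone()


save_step_one("f-1", "Pat")
save_step_two("f-1", "Phoenix")
restored = load_form("f-1")
```

The saved search criteria, by contrast, would simply stay in the session and be allowed to disappear.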
Most of us are smart enough to put the client vars in a database (instead of cookies or the Windows registry), but in most cases you will use a shared database - now your app is no longer portable, it's married to that shared database. If you move your application, perhaps to another data center, all your users will lose those client storage relationships. The data might as well have been truncated and lost forever.
Further, this central database is now a single point of absolute failure for every application using it to store client variable data. At the moment this central client storage database becomes unavailable, every application in your cluster using the database will also become unavailable, and often will cause your ColdFusion servers to crash. When the client storage database is unavailable, Java's JDBC thread pool begins to run dry as users wait for their pages to load. Meanwhile, more users on applications across the cluster are added to those waiting for pages to render, eating up the web/application server threads. In most cases, this will topple the ColdFusion servers. As soon as the client storage database becomes available again, it is flooded with connections, and has the potential to slow down or crash the database server.
Both of these problems can be mitigated by storing client variables in a different database for each application. The application's own primary database is the perfect candidate because it solves the portability issue and minimizes the problem of the database becoming unavailable.
Another way client variables are done wrong is when administrators leave Global Request Updates on. It's on by default, but that's no excuse. When it is enabled, every time a browser hits a page, even if the user is not logged in or doing anything important, even if the page doesn't touch any client variables, the client variable tables are read and the Client.HitCount and Client.LastVisit rows are updated. Yes, that's a blocking operation on the one table used on every request by every application in your entire cluster, all to update two variables that will go unused in most applications. Even if it didn't block on every update, it would still saturate the JDBC thread pool and slow down your servers. This is the kind of technology decision that keeps your DBA up at night, crying.
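A back-of-the-envelope model shows the difference the checkbox makes. This is a Python sketch, not a measurement and not ColdFusion code. It assumes one read per request in both cases, treats the HitCount/LastVisit update as a single write, and assumes that with global updates off a write happens only when the application actually changes a client variable. The function name and the traffic numbers are made up.

```python
def database_operations(requests, requests_that_change_data, global_updates):
    """Count client-storage reads and writes for a run of page requests."""
    # The client scope is fetched at the start of every request either way
    reads = requests
    if global_updates:
        # HitCount and LastVisit are rewritten on every single request
        writes = requests
    else:
        # Only requests that really change a client variable post back
        writes = requests_that_change_data
    return reads, writes


# Hypothetical traffic: 1,000 page hits, 50 of which change a client variable
with_updates = database_operations(1000, 50, global_updates=True)
without_updates = database_operations(1000, 50, global_updates=False)
```

Under those assumptions, nearly all of the writes with the setting enabled are bookkeeping that most applications never read.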
Sometimes, developers and server administrators think they can overcome the database problems by forcing client variables into cookies with the setting in the ColdFusion administrator. Don't do this. On most browsers, there are finite limits to the amount of data and number of cookies that can be stored in the browser, per domain. Exceeding the limits can cause very strange errors, and the problems are hard to troubleshoot. These larger cookies will also mean longer load times for your users as the cookies are moved back and forth with each request. The worst part about client variables stored in cookies is that it takes what is expected to be a server-side variable and puts it on the client side, exposing it to manipulation from users by simple cookie editing. There's no help for you if you put any security-related information there, or if you don't double-check every point of data coming back from the client scope. This is an easy way to have your application hacked!
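To get a feel for the size problem, here is a rough Python sketch that measures how many bytes a set of values would occupy as one cookie. The 4096-byte limit is an assumption (a commonly cited per-cookie figure; real limits vary by browser), the cookie name is hypothetical, and JSON is used only as a stand-in encoding. This is not the format ColdFusion itself writes.

```python
import json

COOKIE_BYTE_LIMIT = 4096  # assumed per-cookie limit; actual limits vary by browser


def cookie_size_in_bytes(name, value):
    """Bytes needed for the name=value pair that travels with every request."""
    return len(f"{name}={value}".encode("utf-8"))


# Hypothetical client data an application might be tempted to keep
client_vars = {"lastSearch": "coldfusion client variables", "pageSize": "25"}
payload = json.dumps(client_vars, separators=(",", ":"))

size = cookie_size_in_bytes("CLIENTDATA", payload)
fits = size <= COOKIE_BYTE_LIMIT

# Every request carries the cookie again, so the cost multiplies with traffic
bytes_sent_over_100_requests = size * 100
```

A couple of short strings fit easily. The trouble starts when the scope grows unnoticed, because nothing warns you before a browser starts dropping or truncating cookies.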
Allaire, Macromedia and Adobe have always done it wrong
Adobe's implementation is also to blame for many of the problems with client variables. These are things that could be fixed but have never been addressed by the ColdFusion development team. The last substantial upgrade to client variables came in 1998 with the release of ColdFusion 4 (the addition of cookie and database storage). They've never repented, they've not fixed the design flaws listed here that have plagued us for too long, and they've only patched the most critical errors. Working around these issues requires abandoning client variables in favor of a custom approach.
We are starting this discussion with some of the defaults. If you install ColdFusion in any standard way and then start using client variables, you will run into immediate trouble.
By default, with no configuration, client variables are stored in the registry. Yes, on a Windows server, this is the Windows registry. Yes, this is a HUGE problem! When you overload your registry, the OS becomes unstable to the point of needing a full restore. The registry is quick, but only up to a certain point. Also, storage here is limited to the local machine. You can never move it to another server or back it up without awkward registry tools, and you can never share it between multiple servers. Even Adobe recommends you avoid using the registry, so why is it still the default?
On non-Windows servers, with no registry available, ColdFusion writes to a single file. This client variable file, the "cf.registry" file, has some similar scalability problems. It works as well as a flat file can work, and at least it won't crash your server from a system perspective, though from an I/O perspective, it could turn into a bottleneck.
A simple, obvious default would be a local Derby database. Derby ships with ColdFusion, it would not interfere with the OS stability, it would be portable across servers and operating systems, and it has potential to scale and export to another database engine. Another great option already built into CF is Ehcache with "diskpersistence." Ehcache could even turn out to be much faster and much more scalable. The registry is a dead end.
When you add a client storage database, by default, the option to "Purge data for clients that remain unvisited for x days" is set to 90. This means ColdFusion removes client variables only after they have gone 90 days without being accessed. Many overlook this setting, and shared hosting tends to be the biggest offender, for fear of upsetting any developer who intended to use this default. The number of applications that need client variables to live for 90 days (or longer) is far smaller than this default setting assumes.
In cases where client variables replace the session scope (default timeout of 20 minutes), it would be generous to let them live even one hour. In this case, client variables are stored 2,160x longer than they need to be (90 days is 2,160 hours). If the default were lowered to 30 days, this would reduce the amount of client variable data by about two-thirds, it would increase performance and manageability of the stored data, and it would have a negligible effect on most applications. Even if most server administrators change the setting, 90 days as the default encourages bad data hoarding.
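The arithmetic behind those two figures is short enough to check by hand, or with a few lines of Python. The one simplifying assumption, labeled in the code, is that the amount of stored data is proportional to how long it is kept, which only holds for reasonably steady traffic.

```python
HOURS_PER_DAY = 24

default_purge_days = 90
useful_lifetime_hours = 1  # the "generous" one-hour lifetime from the text

# How many times longer the data is kept than it is useful
retention_multiple = default_purge_days * HOURS_PER_DAY / useful_lifetime_hours

# Assumption: stored volume scales with the retention window (steady traffic)
proposed_purge_days = 30
reduction = 1 - proposed_purge_days / default_purge_days
```

Here `retention_multiple` comes out to 2,160 and `reduction` to roughly two-thirds.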
Also, this setting is global to the server. There is no solution for one application using a shared client variable database that wants to save client variables for a year, while another wants to save them for a day. These applications must specify different client storage databases.
One last note on purging: the timeout span is in days only. There is no option to save client variables for less than one day, so if client vars are used to replace session vars, they will not purge after 20 minutes, or after an hour. The smallest amount of time a client variable can time out is one full day, and the second smallest is twice that amount.
The last default option in the implementation of database client storage that needs to change is "global client variable updates" which is always enabled by default. The problem with this option was discussed previously in the Common Execution Mistakes above (the busy bees). The option should be disabled by default. Some applications may require these updates, and it should be configurable at the application level, but it's not; this is a server-wide setting.
These updates are troublesome on their own, but fixable with a checkbox. One deeper problem is that changes go straight into the database on every request, and that the user's entire client variable scope is selected back out of the database on every request as well. Neither should happen; this is a design flaw.
There must be a way to select client data during onSessionStart and update during onSessionEnd (or live, as the scope is updated, not including global updates). Of course perhaps that would not work for a no-sticky-session clustered round-robin load balancing server architecture. Again, if the option were configurable at the application level, it would allow some applications to be as smart as they need to be and others to work with the (improved) defaults. The lack of intelligence here proves yet again that client variables are not scalable.
This brings up the subject of Ehcache again. My friend Rob Brooks-Bilson has been blogging and presenting on Ehcache with replication across servers as a very capable session and client scope replacement over the past two years - this may truly be the best option. There is nothing to install; it just takes some work to configure.
I also still hear about the client variables database tables not purging, holding far too many records. This happened a few weeks ago at my office where one application was experiencing unknown trouble. When they examined their client variable storage in their development environment, the record count was in the multi-millions. CF was set to purge, but it wasn't happening. It's problematic, and it requires babysitting. In all my experience with variable storage, it always has, and these types of issues are very difficult to debug. Some type of monitoring or error correcting needs to come out of the product box, and the lack thereof shows how client variables are not ready for the enterprise.
Serialization is the process of turning data structures into a format that can be saved or transmitted. Client variable storage holds simple values only - strings, numbers and dates - so every other data type has to be serialized before it can be set. This is different from every other available scope in ColdFusion except cookies, which instead of interacting with the server, are an interaction with the browser.
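For anyone who has not had to do this by hand, the round trip looks like the following. The sketch is in Python with JSON, chosen only because it is compact to show; it is not CFML, and the variable names and data are hypothetical. The point is that a store that accepts only simple values can still hold a struct, as long as the struct is flattened to a string on the way in and rebuilt on the way out.

```python
import json

# A complex value: a struct containing a string, a number and an array
saved_search = {"keywords": "client variables", "page": 3, "tags": ["cfml", "scopes"]}

# Serialize: the struct becomes one plain string a simple-value store can hold
stored_value = json.dumps(saved_search)

# Deserialize: the string becomes a struct again on the way back out
restored = json.loads(stored_value)
```

Two extra steps around every read and write is exactly the kind of busywork the language could be doing for you.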
This is 2011. We are at least 6 versions past the introduction of client variables. The true substance of ColdFusion has always been about making hard things easy. Why, then, is it that this serialization is never done automatically for us? Developers should not have to serialize to set a variable, or deserialize to read one. CF should do it without a thought. This is a big failure.
A pet peeve of mine (and a side note to the discussion) is that the ColdFusion applications I have seen always serialize their complex client data (structs, arrays) with WDDX. WDDX is a verbose XML format that can hold all kinds of data types, just not objects. WDDX was a huge win in the early days of CF, and even today it makes basic data serialization easy, but it falls short when compared to JSON and AMF, whose output is much more compact and which have the potential to store and transmit objects (depending on language and implementation). Support for these serialization methods is built into CF to some degree, but none works as well as WDDX for just serializing some 'stuff', and none can serialize instances of CFCs in CF. There are open source projects that will, however. My friend John Blayter has a project called SessionSwap that will even serialize components like magic and store data in Base64. I'm sure there are many similar projects out there. ColdFusion lacking proper serialization has been a problem for too long.
Storing client variables in a relational database may introduce scalability issues down the road. There are databases and storage options that could make a much better solution. Instead of storage in a standard RDBMS, a document-centric database makes sense here, given that client variable data varies a tremendous amount. Vertical data storage design, such as is used with the client variable tables, has overhead, but storing client variables as an atomic document would make a lot of sense and could give the client scope the potential to scale to mega-website levels.
Finally, there is something I saw when climbing into the network layer to look at the SQL that powers the client façade, and hearing about this communication with the database would make another SQL Server DBA break down in tears. What I saw was a SQL statement for fetching client variables being prepared, executed, and unprepared, then again: prepare, execute, unprepare.
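The verbosity gap is easy to see with even a tiny struct. In this Python sketch the JSON text is generated, while the WDDX text is a packet written by hand in the WDDX format for illustration. It is not output captured from ColdFusion, which may differ in details such as key casing and number formatting.

```python
import json

data = {"city": "Phoenix", "page": 3}

# Compact JSON with no extra whitespace
json_text = json.dumps(data, separators=(",", ":"))

# The same struct as a hand-written WDDX packet (illustrative only)
wddx_text = (
    "<wddxPacket version='1.0'><header/><data><struct>"
    "<var name='city'><string>Phoenix</string></var>"
    "<var name='page'><number>3</number></var>"
    "</struct></data></wddxPacket>"
)

# How many times larger the WDDX text is than the JSON text
ratio = len(wddx_text) / len(json_text)
```

For this example the WDDX packet is several times the size of the JSON, and every byte of that overhead is stored and shipped on every request.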
Preparing a statement is not too expensive; SQL Server makes a plan and caches it once so that it can run over and over in an efficient way. If you execute a prepared statement at least 3 times, you make up the difference from the processing cost that it took to prepare it. When you unprepare the statement, SQL Server removes that cached execution plan. By preparing, running the query only once and immediately unpreparing it, the current client variable implementation is making a mockery of your database. If you care about scalability, care enough to not use client variables in a SQL Server database.
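A toy cost model makes the waste obvious. The numbers below are hypothetical cost units, picked only so that the break-even lands at the three executions mentioned above; they are not SQL Server measurements, and the cost of the unprepare call itself is ignored. The sketch is in Python and the function names are made up.

```python
# Hypothetical cost units, chosen only so the break-even lands at three executions
PREPARE_COST = 3
PREPARED_EXECUTE_COST = 1
AD_HOC_EXECUTE_COST = 2


def prepared_total(executions):
    """Prepare once, then execute the cached plan repeatedly."""
    return PREPARE_COST + executions * PREPARED_EXECUTE_COST


def ad_hoc_total(executions):
    """Send the statement unprepared every time."""
    return executions * AD_HOC_EXECUTE_COST


def prepare_execute_unprepare_total(executions):
    """Pay the prepare cost again on every execution, as described above."""
    return executions * (PREPARE_COST + PREPARED_EXECUTE_COST)


# First execution count at which preparing once is no worse than ad hoc
break_even = next(n for n in range(1, 100) if prepared_total(n) <= ad_hoc_total(n))

# Total cost of 100 fetches under each strategy
costs_for_100 = (
    prepared_total(100),
    ad_hoc_total(100),
    prepare_execute_unprepare_total(100),
)
```

In this model, preparing on every call never breaks even: it pays the most expensive step each time and throws the benefit away.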
The lack of attention given to this technology by Adobe, Macromedia and Allaire keeps client variables in the dark ages of internets gone by. These grievances, which have persisted for at least four major versions, have been ignored. Client variables are dead, in their present form. May they rest in peace.
Everything about client variables is wrong
So far you have seen all the physical problems; now let's attack them at the soul.
When you set your gaze on this technical frustration, you have to conclude that client variables are lazy. You are using them because you don't want to make a database table. You are using them because you don't want to figure out how to overcome session problems in a clustered environment. You are using them because you don't want to think.
While laziness in programming can be a great virtue that brings simple solutions, laziness in thought allows superficial solutions to creep in. It will have the appearance of working, but not the substance.
If you want to hold something a little longer than a session, make your session length a little longer.
If you want something to persist, create a database table. A vertical layout table with random data is difficult to work with, so use a regular table to store regular values. Control the access to this table yourself so that it doesn't update and fetch on every request.
If you want unknown amounts of data to persist, use a document-centric database. MongoDB and CouchDB are perfect for this, and very scalable.
If you want to cheap out and write bad software, stop being a programmer. Please.
Beyond lazy, client variables are a form of sloppy programming. When you put a client variable in your application, you're saying "I don't care." When you leave it in there for the next developer, you're saying "I hate you and everything and everyone!" You're creating careless problems and leaving a mess behind. We developers have to work around these problems with clustering, session scopes, and data persistence. There are better ways to do what is required.
Think harder and think smarter. You are better than client variables. You don't need them.
Client variables illustrate why people hate ColdFusion. CF gets a bad rap for having a lot of terrible applications and spaghetti code. CFML itself is fantastic and has given its developers a tremendous advantage over most, but when the language is abused, it draws criticism that is not deserved. When you add client variables to an application, it gives it that 'code smell.' Client variables add problems to problematic applications. Client variables make ColdFusion applications worse, and they make ColdFusion worse.
In the end, the heart of the matter is that client variables are technical debt. When you write a client variable, you are taking a technical loan against your application and your servers. Failure to pay that loan back will soon cost real money in additional hardware and design time, once you learn how bad a mistake you have made. Design up front is always cheaper. Client variables need to be refactored out of your code.
The best time to remove technical debt is before you start. The second best time is right now!
More than just encouraging you to avoid client variables, I want to encourage you to take a stand against bad programming, and to think about the technology decisions you make.
Still more, I hope that Adobe listens and fixes client variables. They have been a problem for too long, and many of their basic troubles can be easily corrected. We care, and we want Adobe to care, too.
This humongous rant was written by Nathan Strutz in June 2011. Nathan is an IT analyst/developer at The Boeing Company who lives in the Phoenix area and co-manages the Phoenix ColdFusion Users Group. Visit his web site at dopefly.com and follow @nathanstrutz on Twitter.
Huge thanks also go to Rex Aglibot for technical validation and to Alan Rother, Jonah Blossom, Steve Rittler, Dennis Clark and Rob Brooks-Bilson for proofreading, providing input and making this the best angry technology rant possible. You are the people who make ColdFusion and the ColdFusion community great!