Ben Summers’ blog

Why development should look like deployment

It starts so innocently. You create a new web application, and while developing it you run it on a non-privileged port. But over time the differences between development and production mount up, and one day, the system running in development looks nothing like production.

This means the application you’re developing is very different from the one which is running in production, and there’s the potential for things to become interesting when you deploy. I have come to realise that you must make a conscious effort to keep development mode as close to production as possible.

A modern client/server application is a large system with many parts. There are at least three or four processes running (web server, application server, database server, and so on) before you even look at the network and the browser running on the user’s machine.

This means it’s non-trivial to match the development environment to the production environment. But that doesn’t mean we shouldn’t try.

How my life got complicated

As with any project, I started ONEIS very simply. There was just a single application server process, and development mode looked very much like it would when it was deployed.

Fast forward two years, and now I find myself managing

  • multiple application servers
  • cached state within application server processes
  • a notification server to inform the other processes of invalidated cached state
  • several job runner processes, to run async tasks
  • a conversion server, to process files
  • an uploads server, to offload the upload process from the application servers and do stuff like calculating digests and decompressing files during the upload
  • various messaging processes to communicate with the management applications
  • and of course, the database and front end web server

Yet in development, I just ran a single application server process, starting the other components only when I was actually working on them.

So, not only was the port number different, but there was a huge difference in concurrency and the associated potential for hidden interactions between processes and the cached state.

Then, just to make life even more interesting,

  • the code is run through a post-processor before deployment, which rewrites the HTML to be far more compact, and changes the URLs of files
  • static files and per-customer files are served through Apache rather than the web application
  • SSL is used everywhere – and web browsers do different things, especially with caching, when you use SSL (I’m looking at you, IE6).

While it’s possible to manage this complexity by being very careful during development and doing rigorous testing as part of your deployment process, it’s certainly not the best plan.

Minimise the differences

So, what differences do you actually need?

  • Some mechanism to reload code while the application server is running to minimise the time between making a change to the code and seeing the result.
  • Emails probably shouldn’t be sent.
  • It might be useful to have some extra code to check application logic, or do validation on the HTML.

But is any other difference really necessary, and do its benefits outweigh the cost of running different code in development and production?
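
To make the second point concrete, one common approach is to hide the mailer behind an interface and swap in a logging implementation in development, so nothing can ever be sent by accident. This is only a sketch, and the class and method names are hypothetical rather than part of any particular framework:

    // Hypothetical sketch: hide the mailer behind an interface and pick the
    // implementation from a single environment flag, so development never
    // sends real email.
    public class Mailers {
        public interface Mailer {
            void send(String to, String subject, String body);
        }

        // Production implementation would deliver via the real SMTP server.
        static class SmtpMailer implements Mailer {
            public void send(String to, String subject, String body) {
                // ... connect to the SMTP server and deliver the message ...
            }
        }

        // Development implementation only logs; nothing leaves the machine.
        static class LoggingMailer implements Mailer {
            public void send(String to, String subject, String body) {
                System.out.println("[dev mail] to=" + to + " subject=" + subject);
            }
        }

        // Choose the implementation from one environment flag.
        public static Mailer forEnvironment(boolean development) {
            return development ? new LoggingMailer() : new SmtpMailer();
        }
    }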

This includes the operating system. Developing on something like Mac OS X and deploying on Linux is a pretty big difference between development and deployment. With the availability of free virtualisation software there’s no excuse! And ideally you should create the VM using exactly the same procedure as the production servers, using tools like JumpStart, Puppet and so on. Just run a file server such as Samba to allow the host OS to edit files on the VM.

It’s worth considering a slightly less obvious difference between development and production: the amount of data the system is handling. Almost any algorithm is fast when the amount of data is small, so it’s important to get a decent amount of test data into the system to see how it really performs. It’s probably best to avoid using live data (mistakes could be costly, for example accidentally emailing users); instead, use a script to generate sensible test data.
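
The generator doesn’t need to be clever; even a loop inserting a few hundred thousand plausible rows will expose algorithms that only look fast on small data. Here’s a minimal sketch in Java, assuming a JDBC connection and a made-up people table:

    import java.sql.Connection;
    import java.sql.PreparedStatement;

    // Hypothetical sketch: fill a development database with enough generated
    // rows that slow queries show up before deployment, without touching
    // live data. The table and column names are made up for the example.
    public class TestData {
        public static void generate(Connection db, int rows) throws Exception {
            String sql = "INSERT INTO people (name, email) VALUES (?, ?)";
            try (PreparedStatement insert = db.prepareStatement(sql)) {
                for (int i = 0; i < rows; i++) {
                    insert.setString(1, "Test User " + i);
                    insert.setString(2, "user" + i + "@example.com");  // reserved domain, can never be delivered
                    insert.addBatch();
                    if (i % 1000 == 999) { insert.executeBatch(); }    // flush in batches
                }
                insert.executeBatch();
            }
        }
    }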

My solution

I’m currently writing a custom framework for my application which runs on the JVM. I’m taking the opportunity to put everything inside a single Java process, which simplifies things enormously. (Since each customer has their own silo of information, I can easily scale by running more than one process.)

Everything in my list above is automatically started within this process, and stopped when it finishes. This encapsulates the complexity of all the various components, and it’s managed exactly as it is in production.
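
The shape of this, roughly, is that every background component implements a common lifecycle and a single process starts and stops them all, in development and in production alike. A simplified sketch, with stand-in component names rather than my actual framework code:

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical sketch: every background component shares one lifecycle,
    // and a single JVM process starts and stops them all, in dev and in prod.
    interface Component {
        void start() throws Exception;
        void stop();
    }

    // Stub standing in for a real component (job runner, uploads server, ...).
    class StubComponent implements Component {
        private final String name;
        StubComponent(String name) { this.name = name; }
        public void start() { System.out.println("started " + name); }
        public void stop()  { System.out.println("stopped " + name); }
    }

    public class Application {
        public static void main(String[] args) throws Exception {
            List<Component> components = new ArrayList<>();
            components.add(new StubComponent("notification server"));
            components.add(new StubComponent("job runners"));
            components.add(new StubComponent("uploads server"));
            components.add(new StubComponent("embedded web server"));
            for (Component c : components) { c.start(); }
            // Stop everything cleanly when the process exits.
            Runtime.getRuntime().addShutdownHook(new Thread(() -> {
                for (Component c : components) { c.stop(); }
            }));
        }
    }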

I’m not using a separate web server process as a proxy. Instead I use an embedded web server, which listens on port 80 for http and port 443 for https in both development and production. (I run as a normal Solaris user, but have added the ‘net_privaddr’ privilege to allow binding to low ports.)
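
For illustration, binding an embedded server to the production port numbers looks something like the sketch below. It uses the JDK’s built-in com.sun.net.httpserver rather than my framework’s own server, and omits the HTTPS listener for brevity:

    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;

    // Hypothetical sketch: an embedded server bound to port 80, the same port
    // in development and production. Binding to a low port needs a privilege
    // such as Solaris's net_privaddr, or the equivalent on other systems.
    public class EmbeddedHttp {
        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(80), 0);
            server.createContext("/", exchange -> {
                byte[] body = "hello".getBytes();
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) { out.write(body); }
            });
            server.start();   // HTTPS on 443 would use HttpsServer with an SSLContext
        }
    }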

The framework runs multi-threaded in development as well as production, so it can serve multiple requests at the same time as it would in production. To allow code reloading in development mode, every few seconds the source files are checked just before request handling. If any of the code has changed, it’s reloaded while the server is temporarily paused.
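
The reload check itself can be as simple as comparing source file modification times just before handling a request. A rough sketch of that idea, with hypothetical names and the actual reload step left as a comment:

    import java.io.File;

    // Hypothetical sketch: just before handling a request in development,
    // check (at most every few seconds) whether any source file has changed,
    // and if so reload the code while requests are briefly held back.
    public class SourceWatcher {
        private final File sourceRoot;
        private long lastCheck = 0;
        private long lastReload = System.currentTimeMillis();

        public SourceWatcher(File sourceRoot) { this.sourceRoot = sourceRoot; }

        public synchronized void reloadIfChanged() {
            long now = System.currentTimeMillis();
            if (now - lastCheck < 3000) { return; }   // only look every few seconds
            lastCheck = now;
            if (newestModification(sourceRoot) > lastReload) {
                // ... pause request handling and reload the changed code here ...
                lastReload = now;
            }
        }

        // Walk the source tree for the most recent modification time.
        private long newestModification(File file) {
            long newest = file.lastModified();
            File[] children = file.listFiles();
            if (children != null) {
                for (File child : children) {
                    newest = Math.max(newest, newestModification(child));
                }
            }
            return newest;
        }
    }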

Another, more practical, solution

While my solution works well for me, it’s unlikely that it would work for the average web application. Before radically changing the way my application worked, I was using a more generally applicable method.

It was the introduction of the uploads server which forced me to do something. The application server could no longer handle file uploads without assistance, and the front end proxy was needed to send uploads to the uploads server. This required the web proxy to run in the same configuration as it would in production.

So, the obvious solution is to install your application and use parts of the deployed application in development.

In production, you’ll be using some form of process management system to manage the various processes which make up your service. As I’m on Solaris, I use SMF, which is ridiculously wonderful once you’ve got the hang of its amusing XML-based service descriptions. On other operating systems, there are systems like monit and God.

So, you create a new virtual machine which matches your deployment operating system, and install the application as normal. Then, when you’re working on a component, such as the main application server, simply disable the installed service and run it under your normal user account.

This will probably require a few changes, although they’re relatively minor.

  • Change any settings (files, directories, ports etc) in development mode to point to production locations.
  • Adjust file system permissions to allow your normal user to access the production files — ideally scripted.
  • Run the same number of app servers in development as in production, which might require a small script to make it convenient to start and stop them (see the sketch after this list).
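
For the last point, the start/stop script can be very simple. Here’s a sketch in Java using ProcessBuilder, though a shell script would do just as well; the command line, port layout and server count are assumptions for the example:

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical sketch: start the same number of application server
    // processes in development as run in production.
    public class DevCluster {
        public static void main(String[] args) throws Exception {
            int serverCount = 3;   // match the production configuration
            List<Process> servers = new ArrayList<>();
            for (int i = 0; i < serverCount; i++) {
                int port = 8080 + i;   // assumed port layout
                ProcessBuilder builder = new ProcessBuilder(
                    "java", "-jar", "appserver.jar", "--port", Integer.toString(port));
                builder.inheritIO();   // share the terminal for logs
                servers.add(builder.start());
            }
            // Ctrl-C (or process exit) stops every app server we started.
            Runtime.getRuntime().addShutdownHook(new Thread(() -> {
                for (Process p : servers) { p.destroy(); }
            }));
            for (Process p : servers) { p.waitFor(); }
        }
    }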

Once that’s done, you can be pretty confident that during development, you’re seeing your application as it behaves in production. And you don’t have to write any hacks to work around the differences.

Develop as you deploy

To produce high quality code, you need to catch issues quickly and experience your application as it appears to the user. Spending the time to make the development environment accurately reflect the production environment is well worth the effort.

 
