Sunday, March 27, 2016

Why do we need a configuration management tool?

I spent some years developing web applications in Java for the corporate world. The apps always ran in some web container or application server, and we used different staging environments to show our work to testers, managers and customers before we went to production. Unfortunately, there are always some differences between these environments, be it a feature turned off, a database password or a mail server URL. So one needs to provide a configuration file, ideally an external one, for every environment.

The usual way to do that a few years ago was to define whatever configuration was necessary, place it on the server and be done with it. In our case we put a logback.xml and an app-env.properties in the lib folder of Tomcat. Whenever a property was introduced or needed a change, one asked a fellow ops guy and he made the change on the day of the deployment. It worked most of the time, but there are a couple of obvious issues with this approach:
  • Error-prone – typos, missed changes, missing properties.
  • Unknown configuration – in the corporate world only ops have access to production, so a poor developer trying to solve a production issue has no way of knowing the actual content of a configuration file.
  • Full deployment automation is impossible – a manual step is needed from time to time.

We lived with these downsides for a while and suffered some breakdowns of the app or of the RabbitMQ broker (our ops never really embraced the 'infrastructure as code' principle), or a broken database. Ultimately, change came along and we started an internal DevOps pilot. One goal was automated deployments. Now, you can write a script that stops Tomcat, copies over the war and starts it again, and you can use Flyway or another database migration tool and run a command to do the database migration for you, but what about the external per-environment configuration?

Surprisingly, a lot of people I talked to thought that configuration is easy: you just check the configuration changes into the version control system along with the feature. I believe they miss a couple of interesting corner cases. Let's use the introduction of a mail server as an example. A developer implements the first feature that sends emails. He tries it out locally with a Postfix server, adds a mailServer.url property and wants to check it in, but:
  • The value is not known for all environments, as ops may only be able to provide the right mail server URL weeks (and I have seen months :)) after you've finished the ticket.
  • The value may change in the future.
  • Additionally, the value should be present only from the very first version that includes the commit.

If you want to get close to continuous delivery you may want to maintain that any commit that builds successfully can be delivered up to production. But then again, where and how should you store the configuration?

Given the example above, we concluded that configuration files have a different lifecycle from the code: they are changed at different times. Moreover, they are shared between the dev and ops teams, since both maintain certain parts of the configuration.

We decided we needed a configuration management tool with the following properties:
  • The deployment tool can get a zip file from it via an HTTP GET call, providing the project name, environment name and version.
  • The zip contains all files in the correct structure that can be extracted to a designated folder.
    • e.g. /lib/certificates/server.jks
          /lib/logback.xml
          /lib/application-env.properties
          /conf/server.xml
  • Only authenticated users can change files.
  • Every action is audited – no more shrugging and not knowing who changed a value and when. No more wondering which version of the configuration was in place when an issue happened a week ago.
  • (Optional) Support for 'secrets' – e.g. a production database password visible only to a certain group of users (ops).
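
From the deployment tool's side, the first two requirements could look roughly like the sketch below. It is only an illustration: the base URL and the query parameter names (project, environment, version) are hypothetical, not the API of any existing tool.

```python
import io
import urllib.parse
import urllib.request
import zipfile


def config_url(base_url, project, environment, version):
    """Build the GET URL; the query parameter names are hypothetical."""
    query = urllib.parse.urlencode(
        {"project": project, "environment": environment, "version": version}
    )
    return f"{base_url}?{query}"


def extract_config(zip_bytes, target_dir):
    """Unpack the zip into target_dir and return the sorted file names."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(target_dir)
        return sorted(archive.namelist())


def fetch_and_extract(base_url, project, environment, version, target_dir):
    """Download the configuration zip and unpack it into target_dir."""
    url = config_url(base_url, project, environment, version)
    with urllib.request.urlopen(url) as response:
        return extract_config(response.read(), target_dir)
```

A deployment script would then call something like fetch_and_extract("http://config.example/zip", "shop", "staging", "1.0.42", "/opt/tomcat") before starting Tomcat; all of these values are made up for the example.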

At the time of the decision there were no obvious solutions out there, so we developed a simple web application in-house. Obviously it was a web app, since when one has a hammer everything looks like a nail :). The app had a UI for listing files per project and environment. It had some neat syntax highlighting, and it allowed you to add, copy, diff, edit and delete files. It stored the files in a database and offered an HTTP API to get the zip with the configuration. The UI part needed a lot of maintenance and feature work, yet it remained clumsy to work with.

Still, it solved most of our problems. After its roll-out we could finally see how the customer-facing environments were configured. When a new property was introduced we could make the change, commit, take the version of that commit and create the respective versions of the property files in all environments, leaving TODOs where we did not know the right value. No more documentation about everything that needs to be changed with the next deployment. No more figuring out what configuration I need locally to run a certain version of the application.

I will elaborate a bit on how the versioning of files works with the following example. Let's say the app is released in version 1.0. There are two files for each environment, 'logback.xml' and 'app-env.properties'. We decide we want to change logging a bit in version 1.0.10, so we add an updated logback.xml for version 1.0.10 and higher. Then a mail server needs to be added. The feature is implemented in version 1.0.42, so we add an updated app-env.properties for version 1.0.42 and higher. So far our configuration management contains four files for each environment:
  • app-env.properties/1.0.0
  • logback.xml/1.0.0
  • logback.xml/1.0.10
  • app-env.properties/1.0.42
If one needs the app configuration for a given environment for version, say, 1.0.5, he will get the initial two. If one needs the app configuration for 1.0.32, he will get app-env.properties/1.0.0 and logback.xml/1.0.10. For version 2.0 he will get app-env.properties/1.0.42 and logback.xml/1.0.10. So there is a requirement that versions can be ordered lexicographically, component by component (compared as numbers; as plain strings, 1.0.10 would sort before 1.0.5).
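
The lookup rule from the example can be sketched in a few lines of Python. This is a minimal illustration rather than the code of our tool; the function names and the dictionary layout are made up for the example.

```python
def parse_version(version):
    """Turn '1.0.42' into (1, 0, 42) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(stored, requested):
    """Pick, per file, the newest stored version not newer than `requested`.

    `stored` maps a file name to the list of versions it was saved under.
    """
    wanted = parse_version(requested)
    result = {}
    for name, versions in stored.items():
        candidates = [v for v in versions if parse_version(v) <= wanted]
        if candidates:
            result[name] = max(candidates, key=parse_version)
    return result


# The four files from the example above.
stored = {
    "app-env.properties": ["1.0.0", "1.0.42"],
    "logback.xml": ["1.0.0", "1.0.10"],
}

print(resolve(stored, "1.0.32"))
# {'app-env.properties': '1.0.0', 'logback.xml': '1.0.10'}
```

Parsing the version into a tuple of integers is what makes 1.0.10 sort after 1.0.5; comparing the raw strings would get this wrong.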

The approach described above worked fairly well and we were auto-deploying for more than a year with great success. A downtime of 40 seconds and the peace of mind were totally worth it. But in hindsight, we spent one hell of a time developing features of the configuration management tool that could have been covered by another approach which came to our minds. Let me present it to you.

Most of the work on the configuration management tool was spent on the UI and all the file management, syntax highlighting and diffing. Despite all that work it is still not great! Yet there are so many tools around that can do parts of that work much better, namely our beloved IDEs. There are also good file storage solutions out there with auditing, and we happen to call them version control systems :) So why not pick Git with vim or IDEA, create a predefined folder structure and let some daemon loose on top of it to serve the configuration zip file?
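
To give an idea of how little such a daemon would have to do, here is a rough sketch of its core: zipping one environment's folder from a checked-out repository. The layout <repo_root>/<project>/<environment>/ is purely an assumption for this example, and how application versions map onto Git history is left open here.

```python
import io
import zipfile
from pathlib import Path


def zip_environment(repo_root, project, environment):
    """Zip every file under <repo_root>/<project>/<environment> (assumed layout)."""
    env_dir = Path(repo_root) / project / environment
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(env_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the environment folder,
                # e.g. lib/logback.xml, so the zip extracts in place.
                archive.write(path, path.relative_to(env_dir).as_posix())
    return buffer.getvalue()
```

The returned bytes could be sent as the body of the HTTP GET response; editing, diffing, history and auditing would all be handled by Git and the editor of your choice.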

Our plan is to try just that at techdev, starting with shypp.it, and then open source it. We will keep you posted.


Also, if you know of such a tool, please let us know! We don't want to reinvent the wheel all that much.
