
Software configuration – why does this wheel need re-invention?

I have worked on so many software projects that I can’t possibly enumerate them. Most of my contribution to these projects has been on the server side of things. Every one of these projects needed to be configured in some way, shape or form, and I just realized that every one of them had its own configuration subsystem that was implemented from scratch. Many of these configurations could be managed via GUIs and/or CLIs, and others were simply “managed” via vi or emacs. They all share one thing in common, however – they all suck in one way or another. Why? Because configuration subsystems are incredibly difficult to get right.

Building a configuration system on the surface seems boring. If I went and showed the sales guys how cool my configuration system was they would roll their eyes back into their heads. Put some rotating, flashing thing on the GUI and they think you’re the coolest, most creative developer around. The fact is that a good configuration system makes a huge difference to a product. In fact, it can make or break it in some cases.

Indulge me while I share a typical “configuration system lifecycle”. Please tell me if this seems familiar to you. I have personally gone through this many times.

  • Version 1.0 – simple configuration language, usually XML. Why? Because you need to get something up and running quickly. XML has tons of parsers, validators, etc. Users of this early release need to edit the configuration files using a text editor. They need to restart the system every time a change is made. The developer states that this is fine – the product is “not intended for use by people that can’t use an editor”. Fuck ’em.
  • Version 1.5 – The next release has some really complex configuration. However, it’s still only modifiable via a text editor. Maybe flow control is introduced. Changing a configuration in the wrong way causes very bad and very weird things to happen. Customer Support gets lots of calls. There is no way to tell what a customer changed and what the default configuration was supposed to be without comparing the two configuration files side by side.
  • Version 2.0 – We need an administration GUI so people can configure this without having to call support every single time! So a GUI is added. Every administered item is coded into the server and into the GUI because every configuration has different validation, different things to check, etc. The customers are much happier. Until graybeard decides he hates the GUI and insists on using emacs. The GUI and emacs don’t get along very well. Things break again.
  • Version 2.5 – The executives decide that we need a way for “the community” to build widgets that other people can use. They need to package these widgets up in some way that they can be downloaded and added to the system without disturbing local and default configurations. The engineers decide to use layering to separate these 3 things out. But layering in XML is nasty and people will get confused. So out with the XML and in with something “simpler”. Boy did this open a can of worms. All the different parts of the system need to be modified to handle the new configuration syntax. We are just about ready to ship. Boy is this code base different – “Oh SHIT! We forgot we need migration scripts!” So they are frantically built and hastily tested. The product ships. Customers complain. Not only do the migration scripts hork periodically, but the configuration language is new to them.
  • Version 3.0 – The server engineers are adding lots of new features to support customer requirements. Unfortunately, every new feature needs custom GUI and CLI work to handle the administration of that feature. This is simply not sustainable, so it’s been decided to data drive the GUI and CLI from a specification file that describes the syntax, the interdependencies, etc. for each configuration item/file. Furthermore, the community is going gangbusters, but downloading new widgets requires a restart of the server. So do all configuration changes. Once again every part of the system is changed to handle this dynamic configuration. Man is this hard – “What do I do with the data that is already in the queues when the queue is supposed to be shrunk in this re-configuration?” asks one of the brightest engineers. Hmm.

You get the idea.
So here in a nutshell is a list of reasons why configuration systems are so difficult. I’m sure you can add more:

  • They are actually small languages. I have seen XML, simple linear lists of attribute/value pairs, scripting languages with flow control, strange and weird languages like in sendmail, etc.
  • They need validation so they don’t break the system.
  • If there is GUI or CLI access, they need to be dynamically updated.
  • Consistency between updates is critical so that someone editing a config file using vi doesn’t collide with someone using the GUI.
  • They need to be migrated from version to version or need some kind of backward compatibility.
  • They can be layered so local changes override system defaults.
  • They need to be extensible so ultimately 3rd parties can develop configurations that are add-ons.
  • They need solid documentation – ultimately self-generating.
  • They should be data driven such that every time someone invents something that needs new configuration, the GUI and/or CLI doesn’t need new code.
  • They need to support dynamic loading with no system restarting.
  • They may need to support versioning in systems that are composed of modules, each of which may be independently revved.
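
To make a few of these concrete, here is a minimal Python sketch of how validation, layering, self-generating documentation and a data-driven specification might hang off one table. This is not code from any of the projects described above; the names Setting, SPEC, resolve, overrides and describe are all hypothetical, and a real system would also have to deal with migration, concurrent edits and live reloading.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Setting:
    """One entry in a hypothetical specification table."""
    name: str
    kind: type
    default: Any
    doc: str
    check: Callable[[Any], bool] = lambda value: True


# Hypothetical spec. A GUI or CLI could build its forms from this same list,
# so a new setting needs a new row here rather than new GUI/CLI code.
SPEC: List[Setting] = [
    Setting("queue_size", int, 100, "Maximum number of queued events.",
            lambda v: v > 0),
    Setting("log_level", str, "info", "Logging verbosity.",
            lambda v: v in {"debug", "info", "warn", "error"}),
]


def resolve(spec: List[Setting], *layers: Dict[str, Any]) -> Dict[str, Any]:
    """Apply layers (lowest priority first) on top of the spec defaults.

    Every value is validated against the spec before it is accepted.
    """
    by_name = {s.name: s for s in spec}
    config = {s.name: s.default for s in spec}
    for layer in layers:
        for key, value in layer.items():
            setting = by_name.get(key)
            if setting is None:
                raise ValueError(f"unknown setting: {key}")
            if not isinstance(value, setting.kind) or not setting.check(value):
                raise ValueError(f"invalid value for {key}: {value!r}")
            config[key] = value
    return config


def overrides(spec: List[Setting], config: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the settings that differ from their defaults."""
    return {s.name: config[s.name] for s in spec
            if config[s.name] != s.default}


def describe(spec: List[Setting]) -> str:
    """Generate reference documentation straight from the spec."""
    return "\n".join(
        f"{s.name} ({s.kind.__name__}, default {s.default!r}): {s.doc}"
        for s in spec
    )


if __name__ == "__main__":
    community = {"log_level": "debug"}   # e.g. shipped with an add-on widget
    local = {"queue_size": 500}          # e.g. the admin's own edits
    config = resolve(SPEC, community, local)
    print(config)
    print(overrides(SPEC, config))
    print(describe(SPEC))
```

The overrides helper is aimed at the support problem from Version 1.5: it answers “what did the customer change?” without comparing two files side by side. Because defaults, add-on settings and local edits live in separate layers, an upgrade can replace the defaults without trampling anyone’s local changes.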

Conclusion

Configuration systems are often overlooked, but can be the core of an entire system. There is no substitute for a really good one. It’s almost impossible to get it right the first time, but you must really think long and hard about where you want it to go and what you want it to become.

Yes. I copped out. I didn’t tell you how to do these things. I didn’t tell you where you can look on SourceForge to find the ultimate configuration system so you don’t need to re-invent the wheel yet again. That is because there is none – at least not that I know of. I have some ideas on how to build a generic configuration system that, if open-sourced, could save engineers months of time, but that is the topic of a different post.

Posted by Splunk
