This post is not really about anything in particular, but it is one of those thoughts that can only leave my head once it is written down.
Bugs in software are real, and it is easy to rage on Twitter about a piece of software having a bug or not being perfect. It is easier to be cool about bugs when we understand why they happen.
One predictor of how many bugs a program is likely to have is its "cyclomatic complexity" – basically, how many branching paths (if conditions, loops, config options) the program has. Each additional control path increases the number of things that can go wrong.
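As a rough sketch of the idea, here is a McCabe-style branch counter using Python's `ast` module – counting one point per branch construct is a simplification of real complexity tools, and the sample functions are made up for illustration:

```python
import ast

def cyclomatic_complexity(source: str) -> int:
    """Rough McCabe-style count: 1, plus 1 for each branch point found."""
    tree = ast.parse(source)
    branch_nodes = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                    ast.BoolOp, ast.IfExp)
    return 1 + sum(isinstance(node, branch_nodes) for node in ast.walk(tree))

# A straight-line function: one path through it.
simple = "def f(x):\n    return x + 1\n"

# A function with a config option, a loop, and a condition: paths multiply.
branchy = (
    "def g(x, mode, retries):\n"
    "    if mode == 'fast':\n"
    "        return x\n"
    "    for _ in range(retries):\n"
    "        if x > 0:\n"
    "            x -= 1\n"
    "    return x\n"
)

print(cyclomatic_complexity(simple))   # 1
print(cyclomatic_complexity(branchy))  # 4
```

Even this tiny example shows the pattern: every if, loop, or option added to `g` bumps the count, and each bump is another path that needs to be thought about and tested.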
Another factor is how many libraries or external systems the program touches – each of those can break, change its API, or carry its own complexity in how it responds to input and state.
Finally, while I love open source contributors, the more people working on something, the harder it is to communicate, keep the whole system in one mental model, and check on things. And the more contributions arrive, the less time there is for code review and editing.
So what you have here is systems management tech looking like one of the harder areas of software development – far harder than, say, CRUD apps. Web browsers don't have it easy either, and neither do video games of sufficient complexity.
Bugs will happen, but their frequency can be reduced by automated tests – and ONLY if you have a sufficient battery of them. If there is wide variation in code paths or in external systems, this becomes hard. When the systems in question make a lot of permanent changes (like OS installs?) and the result of f(x) is hard to verify, it gets harder still.
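A rough illustration of why "a sufficient battery" gets expensive: test cases multiply across every independent axis of variation. The axes below are hypothetical, but the arithmetic is the point:

```python
import itertools

# Hypothetical axes of variation for a systems management tool.
os_types = ["debian", "rhel", "suse"]
operations = ["install", "upgrade", "remove"]
connections = ["ssh", "local"]

# Full coverage means one test per combination of axes.
cases = list(itertools.product(os_types, operations, connections))
print(len(cases))  # 3 * 3 * 2 = 18 combinations
```

Three small axes already mean 18 cases; add a few more options or target platforms and the product runs into the thousands, which is exactly where "sufficient battery" stops being cheap.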
There is no obvious solution, but we shouldn't be indignant when we discover a problem.
A bug in a bank ATM – which doesn't have a lot of variation in it – is something we can reasonably be shocked to find. But when applications are moving fast in areas that touch hundreds of things moving equally fast, and approach the complexity of programming languages and distributed systems at the same time, there is a lot to manage.
I used to give video game QA a black eye back in the era when studios could just release a lot of patches, but it has gotten a lot better. Partly this is because, with word of mouth and such, it's important to get things mostly right the first time; partly, I believe, it's because platforms have become more standard (consoles, mobile, OS X – fewer permutations), so it's easier to test the variations that remain.
There's no real conclusion here – except maybe that systems management tech is not rocket science, but perhaps *harder* than rocket science. The reason it feels like things are always glued together is that the entire industry is trying to glue a bunch of angry wasps together while teaching them to build a typewriter and then write Shakespeare on it once they are done. Not because there are any revolutionary algorithms or difficult domain knowledge involved, but because the number of branching conditions is so high and it deals with so many permutations, variations, and changing unknowns.
Be cool when you find a bug – because a herculean effort was made to keep away the 10,000 others that don't exist.
There aren't many bugs in, say, a desk or an armchair – but desks don't operate on worlds with multiple types of gravity, or serve creatures with any number of limbs between 0 and 56, and they aren't expected to talk, fly, or process your taxes. If they did, they would have more bugs.