There is an acronym that gets thrown about quite a bit in most agile development circles (and elsewhere): YAGNI. In case you don't know, it means "You Ain't Gonna Need It". The idea is that any time you spend implementing something that you don't end up needing is wasted time, so don't do it, because "You Ain't Gonna Need It"! Now, while this phrase doesn't quite jibe with what I am talking about, I want to introduce a new and related phrase.
This new acronym is IAGW, pronounced aye-ag-wuh. It stands for "It Ain't Gonna Work" and can be applied to virtually every part of application development, but I only want to talk about it in terms of "self-healing software". I really love the term "self-healing software" because it implies that the software is going to do something to actually repair itself without your intervention. A more accurate term would probably be "robust software", but I think that phrase wouldn't create as many PhD candidates or sell as many pieces of software.
So, let me start off by saying that my problem is not with robust software. My problem is that people start ignoring YAGNI when they start thinking about how they are going to make their software robust. In fact, I'm probably going to eat a lot of crap for saying this, but YAGNI and building robust software can sometimes be at odds. A lot of what you see out there passing itself off as robust is just developers trying to anticipate where and when the software is going to break. Don't you love how most of us cannot write a method with more than 20 lines without introducing a bug, but somehow we think that we can predict when a piece of million-line software is going to break? The reality is that you can't. And if you try to, well, IAGW. If you account for one bug, then some other bug will happen. If you try to recover from one failure, then you are going to get bitten when the failure doesn't happen exactly as you expected. How bad would it be if the code that you put in to recover from a problem caused another one, or hid the real problem?
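Here is a minimal sketch, in Python, of recovery code hiding the real problem. The function names and the default port are made up for illustration; the point is only that a guard written for one anticipated failure (a missing value) quietly swallows a different one (a typo).

```python
def parse_port_speculative(raw):
    """Hypothetical 'self-healing' parser: falls back to a default on any error."""
    try:
        return int(raw)
    except Exception:
        # Written with a missing value in mind, but this also swallows
        # every other failure, including malformed input.
        return 8080


def parse_port_plain(raw):
    """No speculative recovery: bad input fails loudly at the point of the mistake."""
    return int(raw)


# A typo in the configuration ("80 80" instead of "8080") is silently
# replaced by the default, so nobody ever learns the input was wrong.
silent_result = parse_port_speculative("80 80")

# The plain version surfaces the real problem immediately.
try:
    parse_port_plain("80 80")
    plain_failed = False
except ValueError:
    plain_failed = True
```

The speculative version never breaks, which is exactly the trouble: the application keeps running on a value nobody chose.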
So am I saying that you shouldn't try to write robust software? Of course not! What I am telling you is that unless you know for a fact that something in your application is subject to breaking (such as a database call or a remote web service call), you shouldn't pile in code to account for it. You should run your application, test it, and prod it to find out where your breaks are going to occur, and then you should fix them. If you run your application in production and it breaks, that is when you need to start putting in code to resolve the problems. One thing to focus your efforts on is instrumentation, so that you can find the bugs that do pop up. You may think that putting in lots of instrumentation technically violates the YAGNI principle, but I'll let you dwell on that technicality when you put an application into production and have no way to figure out why it did something odd.
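As a rough sketch of what instrumentation without speculative recovery could look like, here is a small Python decorator built on the standard `logging` module. The names `instrumented` and `fetch_customer` are hypothetical, and `fetch_customer` is only a stand-in for a real database or web service call. The wrapper records how long the call took and whether it failed, then re-raises the exception untouched instead of guessing at a fix.

```python
import logging
import time
from functools import wraps

logger = logging.getLogger("app.instrumentation")


def instrumented(func):
    """Log the duration and outcome of a call; re-raise any exception unchanged."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception:
            elapsed_ms = (time.perf_counter() - start) * 1000
            # logger.exception records the full traceback. No recovery is attempted.
            logger.exception("%s failed after %.1f ms", func.__name__, elapsed_ms)
            raise
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("%s succeeded in %.1f ms", func.__name__, elapsed_ms)
        return result
    return wrapper


@instrumented
def fetch_customer(customer_id):
    # Hypothetical stand-in for a database or remote service call.
    if customer_id < 0:
        raise ValueError("customer_id must be non-negative")
    return {"id": customer_id, "name": "example"}
```

Once production shows you how this call really fails, you can add targeted handling for that specific failure, and the log will tell you whether the fix worked.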
So get out there, and stop trying to be a psychic, and start being a software developer!