Most companies that begin to formally manage their IT infrastructures do so when their management teams begin to perceive a pervasive state of uncertainty. Not only do various services go offline or perform poorly (due to overloading, data stalls or network problems), but they do so haphazardly, making mitigation difficult to plan and inefficient to implement.
The first reaction is to reduce this uncertainty by implementing increasingly complex monitoring schemes, in the hope that they will provide the missing information. However, this quickly turns out to be insufficient. More data does relieve the immediate uncertainty, since the IT team now knows that a service is down before the first user tries to use it and fails. But simply gathering more data about the infrastructure does not make its behaviour more predictable, nor does it make responding to incidents much easier.
On closer analysis, this is not unexpected. First, what is usually reported and seen as a problem is the failure of a service or business process. Correlating that failure with the infrastructure problem that caused it is not always trivial. In fact, having more data makes it harder, not easier, as the IT team needs to sift through heaps of irrelevant information. Second, as the amount of data increases, the effort of analyzing it increases accordingly: even if the data required to respond to incidents or to improve the system is now available, it becomes impossible to act on it in a timely and efficient manner.
Unfortunately, more often than not, the usual reaction is to insist on gathering even more data. That only strengthens the illusion of control while actually undermining it. Most of our clients realized that, as the number of alerts about various incidents kept rising, their response times became longer, not shorter. It became difficult to prioritize interventions, and teams would wrestle with dozens of related problems because the root cause was buried under hundreds of reports.
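To make the "buried root cause" problem concrete, here is a minimal Python sketch of one possible way to collapse related alerts: each alert is attributed to the furthest upstream dependency that is also failing. The component names and the `DEPENDS_ON` map are purely hypothetical, and the sketch assumes every component has at most one upstream dependency. It illustrates the idea only and is not a description of any particular monitoring product.

```python
from collections import defaultdict

# Hypothetical dependency map (an assumption for illustration):
# each component points to the single component it depends on.
DEPENDS_ON = {
    "webshop": "app-server",
    "app-server": "database",
    "reporting": "database",
    "database": "storage-array",
}


def root_cause(component, failing):
    """Walk up the dependency chain for as long as the upstream
    component is also reported as failing."""
    seen = {component}  # guards against cycles in the map
    while True:
        upstream = DEPENDS_ON.get(component)
        if upstream is None or upstream not in failing or upstream in seen:
            return component
        seen.add(upstream)
        component = upstream


def group_alerts(alerts):
    """Group alerting components under their most likely root cause."""
    failing = set(alerts)
    groups = defaultdict(list)
    for component in alerts:
        groups[root_cause(component, failing)].append(component)
    return dict(groups)


alerts = ["webshop", "app-server", "reporting", "database", "storage-array"]
grouped = group_alerts(alerts)
print(grouped)
```

In this made-up scenario, five separate alerts collapse into a single group keyed by `storage-array`, which is the one item the team would actually need to prioritize.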
Read the complete version of the blog here and learn how you can receive only the relevant information.