SRE Lessons from Sherlock Holmes — Introduction.
Inspired by my recent SRECon2023 Americas talk, I’m going to make a digression from my usual space exploration articles and discuss some lessons we can learn from the master detective Sherlock Holmes.
After all, Holmes solved crimes using tools such as magnifying glasses and chemical reagents, just as we resolve incidents using telemetry and observability tools. The technical details may be different, but the concepts and mindset are the same.
While there are many incarnations of Sherlock Holmes, each with a slightly different personality, strength, or weakness, every one is recognizable as the consulting detective who solves crimes and brings criminals to justice. In short, Holmes is an expert in resolving incidents. Faithful Watson is by his side, documenting and publishing post-incident reviews.
In the first short story published, Holmes famously tells Watson:
“You see, but you do not observe”
In other words, the necessary information may technically be available to you, but you do not know how to benefit from it or place it in the correct context. You may be drowning in data yet not know how to generate insights from it or find the 1% of information that matters.
In the next few articles, I’ll go through a series of lessons that we, as SREs, can take from Holmes’s criminal investigations and apply to our own reliability work.
To keep things in context, I’ll try to map the different lessons to different stages in the famous SRE Service Reliability Hierarchy pyramid.
Please let me know what you think of this idea and whether you have a favourite Sherlock Holmes story.