Better Living Through ChatOps
In previous entries, we’ve discussed creating a virtual war room using ChatOps, some implementation details on creating a Chatbot and even some of the historical precursors to ChatOps, but we haven’t discussed ChatOps directly.
The way I define ChatOps is:
the integration of development tools, operations tools, and processes into a collaboration platform so that teams can efficiently communicate and easily manage the flow of their work.
Collaboration platforms enable groups of people to communicate (primarily through text) either directly or as a group in a named room or channel.
Multiple conversations can be held simultaneously and efficiently. There is no need for anyone to be physically present in the same place or even to be working at the same time as anyone else since the reply can simply wait for the other person to see it.
Examples of collaboration platforms include Slack, Microsoft Teams, Mattermost and others, including older platforms such as IRC.
Many organizations are already using these types of solutions for regular business work and coordination. Adding integrations that enable operational work is a natural evolution.
This kind of solution maintains a timeline of communications that provides a record of decisions made and actions taken. It keeps everyone up-to-date while avoiding information overload.
It also allows for instant collaboration between Humans-and-Humans and Humans-and-Machines through automation (aka bots).
Not every use of a collaboration platform is done for the benefit of operations. ChatOps is the convergence of Operations, Collaboration, and (optionally, but highly recommended) Automation.
The Benefits of ChatOps
ChatOps integrates people, tools and processes so that teams can be more efficient and work more easily.
ChatOps is not a product or a tool; it’s a way of working.
For example, during the troubleshooting of an incident, instead of each person working independently and coordinating by phone/mail/trouble-ticket or having people brought physically together in a war room, they can collaborate using a platform that is designed for both human-to-human and human-to-machine interactions.
The major advantages of ChatOps are that:
- All the people who need to work together are in the same virtual location using a tool that they’re all familiar with and use for other purposes anyway.
- All the necessary tools are at their fingertips and they can now be operated from within the chat windows using bots and other commands instead of opening a dedicated application window or console.
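To make the second point concrete, here is a minimal sketch of how a bot might turn chat messages into tool operations. It is not tied to any particular collaboration platform: the `!` command prefix, the `HANDLERS` registry, and the `status` and `help` commands are all hypothetical names chosen for illustration, and the status reply is a hard-coded placeholder rather than a real monitoring query.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical registry mapping command names to handler functions.
HANDLERS: Dict[str, Callable[[List[str]], str]] = {}


def command(name: str):
    """Decorator that registers a function as the handler for a chat command."""
    def register(func: Callable[[List[str]], str]):
        HANDLERS[name] = func
        return func
    return register


@command("status")
def status(args: List[str]) -> str:
    if not args:
        return "usage: !status <service>"
    # Assumption: a real bot would query a monitoring tool here.
    return f"{args[0]}: OK (placeholder, no real check performed)"


@command("help")
def show_help(args: List[str]) -> str:
    return "commands: " + ", ".join(sorted(HANDLERS))


def handle_message(text: str) -> Optional[str]:
    """Return the bot's reply, or None when the message is ordinary chat."""
    if not text.startswith("!"):
        return None
    parts = text[1:].split()
    if not parts:
        return None
    name, args = parts[0], parts[1:]
    handler = HANDLERS.get(name)
    if handler is None:
        return f"unknown command: {name}"
    return handler(args)
```

In this sketch, ordinary conversation passes through untouched, while a message such as `!status web-01` is routed to the matching handler and the reply is posted back where everyone in the channel can see it.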
Having the work done in a chat has further benefits. Someone who’s just been added to the incident can scroll up and see what people did earlier instead of having to interrupt and get updates.
Instead of the team having to field management questions such as “Who’s working on the problem?” or “Do you think you’ve found the issue?”, leaders can see in real time who’s working on the problem and how confident the participants are.
Of course, it is highly recommended to have enough channels free of management people so that technical people can brainstorm and have open conversations, but a channel dedicated to updates can be used to easily synchronize all the incident participants.
If you’re doing ChatOps properly and using the tools built into the platform, then later, during the incident post-mortem or retrospective, it will be easy to understand who did what and how the problem was solved. Everything is self-documented!
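As a rough illustration of that self-documenting quality, the sketch below pulls the actions out of a channel history to form a post-mortem timeline. Everything here is an assumption made for the example: the `Message` record, the idea that actions are messages starting with a `!` prefix, and the output format. A real platform would supply its own history export.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Message:
    # Hypothetical shape of one exported chat message.
    timestamp: datetime
    author: str
    text: str


def build_timeline(messages: List[Message], prefix: str = "!") -> List[str]:
    """Return one formatted line per action message, oldest first.

    Assumption: an "action" is any message that starts with the bot
    command prefix; ordinary conversation is left out of the timeline.
    """
    actions = [m for m in messages if m.text.startswith(prefix)]
    actions.sort(key=lambda m: m.timestamp)
    return [f"{m.timestamp:%H:%M} {m.author}: {m.text}" for m in actions]
```

Because the chat already records who typed what and when, the timeline is a filter over existing data rather than something a person has to reconstruct from memory.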
If you’re familiar with the Incident Management toolchain, you know that there are a variety of pieces involved in resolving incidents. These range from the monitoring tools that detect the issue and show the current status, through ticketing systems that document the incident and runbook tools that provide automated responses, to notification systems that alert people.
ChatOps doesn’t replace the available tools; it simply makes them more accessible to everyone involved. The benefit of ChatOps comes from reducing the context switches between the tools and allowing you to work with one central tool. Why open a browser when you can pull the dashboard into the chatroom? Why be alerted by (yet another) e-mail when you can join the incident chatroom?
The demonstration that you can work in such a synergetic way is a revelation to everyone I’ve introduced to this concept. One of the reasons it resonates so well with people is that it slots into a natural instinct to reduce waste and needless toil. “Waste” in a service organization (and just about everyone is in the service business in one way or another) is anything we do that does not add value.
Of the 7 types of waste defined in Lean Process Thinking, two are directly addressed by ChatOps:
- Motion (people moving or walking more than is required to perform the processing) is addressed by bringing (the right) people into the same virtual location instead of gathering them physically (e.g. in a war room) or ping-ponging the issue between people (e.g. slow escalation via trouble tickets).
- Transport (moving products that are not actually required to perform the processing) is addressed by bringing technical tools and displays into the chatroom instead of forcing people to context switch between tools.
For operations personas such as Operators, Sysadmins, Site Reliability Engineers, DevOps engineers, Level 1/2/3 support and so on, ChatOps has the benefit of:
- Reduction of “Waste by Motion” — No needless context-switching between tools, easier and closer collaboration between humans.
- Reduction of “Waste by Transport” — No copy/paste between tools, faster access to information.
- More opportunities to learn from others — tasks are more transparent and each person can see what others are doing.
For development or business personas such as Developers, Managers, Line of Business Owners/Application Owners, DevOps engineers, other Subject Matter Experts and so on, ChatOps has the benefit of:
- Availability of operational tools in an environment that’s already familiar to them, improved ramp-up time and reduced friction to adopt new tools.
- Reduced dependence on the traditional operations personas, more collaboration between groups.
- Gaining a dynamic & flexible collaboration platform in parallel to the more traditional process-oriented tooling (ticketing tools, for example) which translates into more transparency.
As a result, ChatOps allows you to work more comfortably and easily.
ChatOps can be used in just about every section of the software lifecycle, but my personal bias is towards incident management (which I often present as reducing Mean-Time-To-Repair) and problem management (which I present as increasing Mean-Time-Between-Failures).
The diagram above demonstrates the many Key Performance Indicators (KPIs) which benefit from the use of ChatOps. MTTD, MTTI, MTTK, MTTR are all reduced while MTTF and MTBF are lengthened.
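For readers who want to track two of these KPIs themselves, here is a small sketch of the arithmetic. It assumes commonly used definitions, which vary between organizations: MTTR as total repair time divided by the number of incidents, and MTBF as operating time (the observation period minus downtime) divided by the number of failures. The function names and the representation of an incident as a (start, end) pair are illustrative choices, not part of any standard.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Assumption: each incident is a (detected_at, resolved_at) pair.
Incident = Tuple[datetime, datetime]


def total_downtime(incidents: List[Incident]) -> timedelta:
    """Sum the duration of every incident."""
    return sum((end - start for start, end in incidents), timedelta())


def mean_time_to_repair(incidents: List[Incident]) -> timedelta:
    """Total repair time divided by the number of incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    return total_downtime(incidents) / len(incidents)


def mean_time_between_failures(
    incidents: List[Incident], period_start: datetime, period_end: datetime
) -> timedelta:
    """Operating time (period minus downtime) divided by the number of failures."""
    if not incidents:
        raise ValueError("at least one incident is required")
    uptime = (period_end - period_start) - total_downtime(incidents)
    return uptime / len(incidents)
```

With two incidents lasting 30 and 90 minutes inside a 48-hour window, these definitions give an MTTR of one hour and an MTBF of 23 hours. Shorter incidents lower the first number, and fewer incidents raise the second.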
In a nutshell, ChatOps will reduce the time it takes to solve problems and make it easier to avoid problems.
In a future entry, we will discuss ways in which organizations adopt ChatOps.