I got to read a lot of great books in 2013, and they shaped a lot of my current thinking in terms of both business and web operations. If you’re looking for something to read, follow along!
Effective Monitoring and Alerting
I’ve developed an interest in improving our monitoring, and this book delivered some good insight. The book is a quick read, and it ends up delivering a great idea for how a monitoring system could be structured, without going too much into implementation details.
There’s a distinct lack of books on this topic, so this one is good to start with.
The Signal and the Noise: Why So Many Predictions Fail — but Some Don’t
From the guy who predicted the 2012 election results, a great introduction to statistics and probability, written in an entertaining style (i.e., not boring you with too much math). It gives lots of examples from Nate Silver’s personal history of getting into numbers and probabilities. A very entertaining read that will definitely make you think about these topics some more.
The Black Swan: The Impact of the Highly Improbable
Nassim Taleb has an interesting writing style, to say the least. Amid plenty of bashing other people and polishing his own ego, you’ll find streaks of genius.
The Black Swan is a great read on managing risk, or the inability to manage certain risks. If you’re running any kind of application in production and are actively involved in operations, this book is a must-read.
The book’s point could be made in a significantly shorter version, but that would of course deprive you of the joys of reading about Taleb’s personal feelings.
This book is a classic, and I only got around to reading it this year. Shame on me.
It’s wonderfully short and to the point, dealing a lot with monitoring and making your capacity planning work in the best ways possible.
If you haven’t read this book yet, start right now. It’s timeless for anyone working in web operations.
Friendly Fire: The Accidental Shootdown of U.S. Black Hawks over Northern Iraq
A book describing in great detail the shootdown of two helicopters over the no-fly zone in Iraq in the early nineties.
Why is this relevant and interesting? It’s pretty terrible that people have died during this incident.
The fascinating parts of this incident and the book are in what happened among all the parties involved and in the organizations as a whole, for years leading up to the event.
This book was an eye-opener for me on how I think about production incidents. It covers in great, great detail the events in a socio-technical organization: the US military.
There’s so much to learn in this book that can be applied to web operations.
The Field Guide to Understanding Human Error
Sidney Dekker is a great writer on a topic that is very relevant to web operations: humans.
Human error is a common excuse for production incidents, all the more so when a plane crashes or other bad things happen.
Dekker puts these things in perspective, and he’s helped shape my thinking a lot. He looks at what a human does in a particular situation, why he does it, and how the organization and the work environment have shaped his understanding of that situation.
A great and highly recommended read that goes nicely with Snook’s “Friendly Fire.”
Managing the Unexpected: Resilient Performance in an Age of Uncertainty
I didn’t know what to expect from this book; I bought it on a hunch.
It turns out it’s an important read when it comes to managing risk and achieving organizational resilience. Fits in perfectly with the books mentioned just before.
It looks at a slew of different industries and at how they handle risk and risky situations: fire departments, air travel, and lots more.
It introduces the concept of a high reliability organization and a lot of great ideas, which are picked up by Dekker in a book introduced below.
Thinking in Systems
The book on complex systems and systems theory.
I was amazed, only in hindsight, how important this book is. It introduces the idea of complex systems and how parts in a complex system interact with each other, how they influence each other, and how even small changes in one can have long-lasting and delayed effects on others.
The book introduces these concepts in ways not directly related to web operations and software, but you’ll notice the relationships right away.
This is the book to read. Read this one first, read it now. The most influential book I’ve read in 2013.
After reading “Thinking in Systems”, you’ll see systems and complex interactions between systems everywhere. You know why? Because there are systems everywhere.
Small Giants: Companies That Chose to Be Great Instead of Big
Running a small business myself, I strive to get as much inspiration and insight as possible on how others run their companies.
This book has been a big inspiration for me. It covers a dozen or so companies that did everything they could to build a balanced workplace with happy employees and a long-lasting legacy rather than go for the big bucks or for outside funding.
Lots of great stories in this book. If you’re running any kind of business, this is a great read.
The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA
This book is so chock full of insight on behaviors and interactions in socio-technical organizations that it kept blowing my mind over and over.
It’s about the Challenger incident in 1986, and everything that contributed to it over the years leading up to it.
The level of detail extracted by Diane Vaughan is amazing. She introduces the term “normalization of deviance” in the book, a term that stuck with me, and that I’m now seeing everywhere. It describes something that happens naturally, whether we want it or not, and that seems unavoidable.
Again, this book isn’t directly related to software, but it’s so very relevant to what we do in web operations. Humans, human interaction, organizations and culture. I found these bits to be the most interesting and fascinating bits when it comes to running anything in production.
If you’re curious about these bits too, then this book is for you. Together with “Friendly Fire”, it provides amazing insights and learnings for running your own operations team or how to improve human interaction and culture at your company overall.
These books are very relevant to the ideas of DevOps.
Drift Into Failure
Now that you’ve read “Thinking in Systems”, “Managing the Unexpected”, “Friendly Fire”, and “The Challenger Launch Decision”, here comes Dekker putting all of them together, quite literally.
I was quite amazed to find references to all of them in this book.
The books left me wondering a bit about how you can detect normalization of deviance, and what you can learn from the Challenger and Iraq incidents and from high reliability organizations.
This book is Dekker helping you find answers.
Read “Thinking in Systems” first, then “Friendly Fire”, “Managing the Unexpected” and “The Challenger Launch Decision”, and then read “Drift Into Failure” to find more practical applications of what’s been introduced in the other books.
I’ve had my mind blown several times by all of them. They’re pretty amazing eye-openers.
I just started reading this, but it seems relevant to the topic of biases. I’ve had it on my reading list for a while now, so it’s about time to read it.
Will have to report back on this soon.
Free Bonus Reads
If you need more business-related reading, I posted a list of the books that influenced how we run our company over at Travis CI.
Here’s to more reading in 2014!