The Andon Cord catapulted Toyota into 40 years of unprecedented quality and market dominance. What is the Andon Cord, and how did Toyota do it?

Production costs in car manufacturing are very high, and they always have been. In 1984, a stopped assembly line was estimated to cost NUMMI, the joint Toyota-GM plant, about $15,000 per minute. That's roughly $42,758 in today's dollars.

This means any issue in production results in great losses. Toyota knew this very well.

Taiichi Ohno, an industrial engineer, took up the task and introduced the Andon Cord in Toyota's manufacturing plants in the mid-1900s.

The problem:

Take the assembly line: each employee has a specific job to do before passing the car to the next person.
However, if the next person finds a defect while doing their job, it takes time to fix the problem. It also stalls the work of everyone further down the line.

This costs a lot of money and also lowers the quality of the cars.

Also, in a situation like this, an employee could end up monkey-patching a quick fix, which leads to poor quality and higher production costs.

The solution:

There had to be a better way. So Taiichi Ohno introduced the Andon Cord, and here's how it worked:

Install a long rope along the assembly line. When any employee finds a defect, they simply pull the rope, which triggers an alert and halts the entire production line. A line manager then works with the employee to understand the defect and either resolve the issue or pull the car out for further inspection.

Seems simple, doesn't it? Every time an employee finds an issue, pull the rope!

In those days it may have seemed like an extreme reaction, but it worked great for Toyota. This rope is the Andon Cord, and for 40 consecutive years this process laid the foundation of Toyota's unprecedented quality and dominance in cars.

Taiichi Ohno later became known as the father of the Toyota Production System.

The Andon Cord got pulled a LOT. Yet, it helped immensely.

It became a signal that highlighted an anomaly and brought potential defects to light. When a defect was suspected, an alert was triggered, a sign board lit up, and the line stopped until the problem was resolved.

Just installing the Andon Cord wasn't enough though. Here are 4 major things Toyota did to get effective results.

1. Pull the cord, throw an alert

Each employee was strongly encouraged to pull the cord. If you don't pull it, you are compromising quality and increasing cost. That was not an option.

There was no criticism for pulling the cord. Even mistaken pulls and false positives were never criticized. The entire team was on board with the idea that a good-quality product needs to be shipped, with no compromise.

Another way to look at it: if you hesitate to pull the cord, you are essentially accepting a compromised car. This was not acceptable. Quality over quantity.

They achieved this by shaping the culture at Toyota.

2. Culture

When the cord was pulled, a manager would walk up to the employee and they would --

  1. Thank the employee for pulling the rope.
  2. Remind them that no defect is small enough to be ignored, and encourage them to pull the cord and create an alert
  3. And say - "Your efforts are appreciated by me and your CEO because you have saved a customer from receiving a defect"

The repetition of this gesture evolved into what we call today "Safety Culture".

The result showed up not just in increased sales and higher quality at Toyota. Each employee eventually lost any fear of speaking up, which brought a lot of transparency.

3. Resolve it now

A lot of focus was on resolving the issue at hand right away. There was no need for a long bureaucratic process. The manager would essentially ask - "How can I help you?"

Help you and help you now. Eliminating long-tail processes means you keep implementing and improving at a high pace.

Learning at scale needs principles, and it needs those principles practiced at all times.

4. Continuous learning

You see, failure is not bad. Look it up in the dictionary: the definition never says that failure is a bad thing. Each failure is an opportunity to learn, and that's exactly how Toyota kept improving.

Faster learning, better quality, lower production costs.

Who is working on these practices now?

Amazon has a Customer Service Andon Cord. This was an established practice of metaphorically pulling an Andon Cord when they noticed a customer was overpaying or had overpaid for a service. They would scan their systems looking for these kinds of potential customer service mismatches. These were considered defects at Amazon because they had a vision of being an organization that was always customer centric.

Netflix also encourages failures and uses them for learning. Its well-known chaos engineering practice randomly brings down production servers, so developers plan for failure and poka-yoke (mistake-proof) their code.

Andon Cord in Software environments

I think the Andon Cord sets a great foundation for introducing Incident Management across engineering teams.

Much like Toyota, we all can learn to embrace incidents and look at them as an opportunity to learn.

We can apply the same principles as Toyota to improve. I mean, they did it in the 1900s, so I am sure we can do it today.

Schedule a team meeting and do this --

  1. Encourage teams to set up and receive alerts
  2. Build a blameless culture. No incident is too small or too big. Thank your teammates for creating an incident or for setting up the integrations that surface incidents
  3. Focus on resolution immediately. A temporary fix is fine too, but make sure to create a ticket in Jira / Linear / ClickUp so the root cause gets looked into later
  4. Learn from your incidents. Ask the responder this - How did you resolve this incident? Ask them to write the answer down in notes so that the next time this incident repeats, other responders know how to resolve it. Keep learning about your own systems.
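
To make the four steps above a little more concrete, here is a minimal Python sketch of a software "Andon Cord". Everything in it is an assumption made for illustration: the names Incident, AndonBoard, pull_cord, line_halted, resolve and past_resolutions are hypothetical, the ticket key is a made-up placeholder, and nothing here is tied to a real incident management tool. It only models the flow: anyone can raise an incident, the "line" counts as halted while any incident is open, resolving requires notes, and those notes can be looked up the next time a similar incident shows up.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Incident:
    """One pull of the cord. All field names are illustrative."""
    title: str
    raised_by: str
    opened_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    resolved_at: Optional[datetime] = None
    resolution_notes: str = ""
    follow_up_ticket: Optional[str] = None  # hypothetical tracker key

    @property
    def is_open(self) -> bool:
        return self.resolved_at is None


class AndonBoard:
    """Hypothetical in-memory board; a real setup would use an incident tool."""

    def __init__(self) -> None:
        self.incidents: List[Incident] = []

    def pull_cord(self, title: str, raised_by: str) -> Incident:
        # Step 1: anyone can raise an alert, and no approval is needed.
        incident = Incident(title=title, raised_by=raised_by)
        self.incidents.append(incident)
        return incident

    def line_halted(self) -> bool:
        # The "line" stays stopped while any incident is still open.
        return any(incident.is_open for incident in self.incidents)

    def resolve(self, incident: Incident, notes: str,
                follow_up_ticket: Optional[str] = None) -> None:
        # Steps 3 and 4: resolve now, and always write down how it was fixed.
        if not notes.strip():
            raise ValueError("resolution notes are required so others can learn")
        incident.resolved_at = datetime.now(timezone.utc)
        incident.resolution_notes = notes
        incident.follow_up_ticket = follow_up_ticket

    def past_resolutions(self, keyword: str) -> List[str]:
        # Step 4: look up how similar incidents were resolved before.
        return [
            incident.resolution_notes
            for incident in self.incidents
            if not incident.is_open and keyword.lower() in incident.title.lower()
        ]


if __name__ == "__main__":
    board = AndonBoard()
    incident = board.pull_cord("Checkout API returning 500s", raised_by="on-call dev")
    print("Line halted:", board.line_halted())
    board.resolve(incident, notes="Rolled back the last deploy",
                  follow_up_ticket="OPS-1")  # made-up ticket key
    print("Line halted:", board.line_halted())
    print("Earlier fixes:", board.past_resolutions("checkout"))
```

Step 2, the blameless culture, is deliberately missing from the code. Note that pull_cord never asks who is at fault or whether the alert is "big enough", which is about as far as code can go. The thank-you still has to come from people.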

I learned about the Andon Cord from "The DevOps Handbook" written by Gene Kim, Jez Humble, Patrick Debois, and John Willis. It's a great book and I highly recommend that everyone on your engineering team read it.