Non-linearity is everywhere. Faults in a machine are not uniformly distributed; they are interdependent, which leads to a power-law or even an exponential relation. This is the case in virtually every system.
The way to address these problems is to prioritise the order in which they are tackled. But what is the best way to do this?
One of the best-regarded approaches takes into account the non-linear nature of problems. Vilfredo Pareto observed the same non-linearity with regard to land ownership in his day: 80% of the land was under the control of 20% of the population. In his honour, Joseph Juran (who, at one point, was an Apple consultant) invented the Pareto chart.
A Pareto chart provides a basis for addressing problems according to how severely they affect a service. An example is illustrated below.
Imagine that the observable issues fall into nine (9) categories with various frequencies. First, we plot the number of issues in each category in descending order, from the largest count to the smallest. In the example above, about 600 of the 1000 observations fall into category D and so on. Next, we calculate the percentage that each count represents and plot the cumulative percentage. The last category should bring the cumulative total to exactly 100%. Finally, we plot a horizontal line at the 80% mark and designate the categories that fall below the line as the vital few: fixing these will produce the biggest gains.
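The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the code from the linked repository: the function name pareto_table, the threshold parameter and the per-category counts are all assumptions made up for this example (only "about 600 of 1000 in D" comes from the text), and a category sitting exactly on the 80% line is treated as one of the vital few.

```python
def pareto_table(counts, threshold=80.0):
    """Return rows of (category, count, cumulative_percent, is_vital).

    `counts` maps a category name to its number of observations.
    Hypothetical helper for illustration; not part of any library.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one observation is required")
    # Sort by descending count; ties are broken by category name.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    rows = []
    running = 0
    for category, count in ordered:
        running += count
        cumulative = 100.0 * running / total
        # Assumption: a category at or below the threshold line is "vital".
        rows.append((category, count, cumulative, cumulative <= threshold))
    return rows


# Illustrative counts only: nine categories summing to 1000 observations.
example_counts = {
    "A": 60, "B": 40, "C": 30, "D": 600, "E": 25,
    "F": 200, "G": 20, "H": 15, "I": 10,
}

table = pareto_table(example_counts)
vital_few = [category for category, _, _, is_vital in table if is_vital]

for category, count, cumulative, is_vital in table:
    marker = "vital" if is_vital else ""
    print(f"{category}: {count:4d}  {cumulative:6.1f}%  {marker}")
```

With these made-up counts, D and F together account for 80% of the observations, so they are the two categories to tackle first.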
Once we have resolved D and F conclusively, we can repeat the exercise using a fresh set of observations until we have satisfactorily raised the quality of the process.
You can find a code example at https://github.com/paulkorir/blog-examples/blob/master/pareto_chart.py. The key functions are:
group_data(data), which takes a list of observations in the N categories and produces a list of tuples, each with a str and an int. This list is the reverse-sorted category counts, e.g. [('C', 70), ('B', 23), ('A', 7)].
pareto_chart(grouped_data), which plots a Pareto chart like the one above.
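For readers who want a feel for these two functions without opening the repository, here is a rough sketch of how functions with the same names and signatures might be written. It is an assumption-laden reimplementation, not the repository's code: it uses collections.Counter for the grouping, and it assumes matplotlib is installed for the plotting part (the import is kept inside the function so the grouping works without it).

```python
from collections import Counter


def group_data(data):
    """Count observations per category, largest count first.

    Sketch only: returns a list of (str, int) tuples as described above.
    """
    return Counter(data).most_common()


def pareto_chart(grouped_data, threshold=80.0):
    """Draw bars for the counts and a line for the cumulative percentage.

    Sketch only; `threshold` is a hypothetical parameter for the 80% line.
    """
    # Imported here so that group_data() works without matplotlib installed.
    import matplotlib.pyplot as plt

    labels = [category for category, _ in grouped_data]
    counts = [count for _, count in grouped_data]
    total = sum(counts)

    # Cumulative percentage after each category, ending at 100.
    cumulative = []
    running = 0
    for count in counts:
        running += count
        cumulative.append(100.0 * running / total)

    fig, ax = plt.subplots()
    ax.bar(labels, counts)
    ax.set_ylabel("Number of issues")

    # Second y-axis sharing the same x-axis for the cumulative curve.
    ax2 = ax.twinx()
    ax2.plot(labels, cumulative, color="black", marker="o")
    ax2.axhline(threshold, color="red", linestyle="--")
    ax2.set_ylim(0, 105)
    ax2.set_ylabel("Cumulative percentage")
    return fig


# Observations matching the example counts given in the text.
observations = ["C"] * 70 + ["B"] * 23 + ["A"] * 7
grouped = group_data(observations)
print(grouped)
```

Calling pareto_chart(grouped) and then saving or showing the returned figure would produce the chart itself.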