‘Noisy’ expert judgements

Economics Nobel Prize winner Daniel Kahneman has released his latest book, Noise: A Flaw in Human Judgment, co-written with Olivier Sibony and Cass Sunstein.

The book is about the enormous variability – ‘noise’ – in experts’ judgements and decisions when they evaluate the same information, which you would otherwise expect to result in the same decisions.

Examples of such noisy decision-making include variability in doctors’ diagnoses of patients presenting with the same condition, differences in the sentences handed down by judges to people who have committed the same crime, and divergences in professors’ grades for the same students’ work.

Noise is present in all judgements. However, most people are oblivious to the inherent randomness in their decision-making and resulting actions.

Although variability in individuals’ judgements, reflecting differences in personal preferences, is to be celebrated in creative or personal circles, such as writing a song or choosing a new coat, too much variability in other settings can be dangerous, especially when the outcome is critical – e.g. patients being misdiagnosed or defendants unfairly sentenced.

Arriving at consistent decisions is especially desirable in the corporate and public sectors, particularly for decisions that are repeated. Doing otherwise – in effect, making decisions arbitrarily or capriciously – is both inefficient (wasteful of resources) and potentially unfair. Decisions should not depend on the ‘good luck’ or ‘bad luck’ associated with who makes the decision.

For example, patients who present with the same symptoms should receive approximately the same diagnosis (and, preferably, the ‘right’ one!). The notion that a given patient’s diagnosis (and treatment) depends on the particular doctor they see, and might have been different had they seen a different doctor, is disturbing.

A practical solution recommended by Kahneman, Sibony and Sunstein to the problems arising from judgemental noise is to specify explicit criteria and weights – i.e. algorithms (e.g. using 1000minds) – to support decision-making, with the objective of making decisions that are more valid and reliable.
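
To illustrate the general idea – not 1000minds’ actual algorithm – here is a minimal sketch in Python, with hypothetical criteria and weights, of how making the criteria and weights explicit removes the dependence on who happens to be making the decision:

```python
# A minimal sketch of a 'points system' with hypothetical criteria and weights -
# purely illustrative, not 1000minds' actual method. The point is that explicit
# criteria and weights make the result independent of who applies them.
WEIGHTS = {"pain_score": 0.5, "stiffness": 0.3, "age_over_45": 0.2}  # hypothetical weights

def priority_score(patient):
    """Weighted sum of the patient's criterion levels (each scaled 0 to 1)."""
    return sum(WEIGHTS[criterion] * level for criterion, level in patient.items())

# The same patient, assessed against the same explicit criteria, gets the same
# score regardless of who performs the assessment
patient = {"pain_score": 0.8, "stiffness": 0.6, "age_over_45": 1.0}
print(f"Priority score: {priority_score(patient):.2f}")  # 0.78, whoever runs it
```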

‘Noise auditing’

Kahneman and his co-authors recommend that decision-makers and organizations conduct ‘noise audits’, with the objective of increasing ‘decision hygiene’.

In a noise audit, case scenarios containing the same information are presented to decision-makers, who are asked to make their judgements individually – e.g. diagnosing patients or sentencing defendants. These judgements are then compared to gauge their variability.
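
As a minimal illustration (with entirely hypothetical data), the following sketch gauges the variability in sentences handed down by several judges who have each reviewed the same case file:

```python
# Hypothetical noise-audit data: sentences (in months) assigned by five judges
# to exactly the same case file. The spread is a simple measure of 'noise'.
from statistics import mean, stdev

sentences = {"Judge A": 24, "Judge B": 36, "Judge C": 18, "Judge D": 60, "Judge E": 30}

values = list(sentences.values())
print(f"Mean sentence: {mean(values):.1f} months")
print(f"Standard deviation (a simple measure of noise): {stdev(values):.1f} months")
```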

An example of a noise audit supported by 1000minds from a recent study is discussed below.

Example of a ‘noise audit’

1000minds has been used in an extraordinarily wide range of applications to reduce judgemental ‘noise’ and increase ‘decision hygiene’. Common examples include prioritizing patients for treatment, crimes for investigation, research questions and grant applications for funding, and project management.

Many applications begin with a ‘noise audit’, whereby decision-makers participate in a 1000minds survey in which they rank case scenarios containing the same information.

This audit/survey usually demonstrates dramatically the need for a new and improved decision-making process based on explicit criteria and weights (as recommended by Kahneman and his co-authors, discussed earlier). Most decision-makers are unaware of how much their judgements can differ – that they are often highly idiosyncratic.

In an example from the field of disease classification, Mahmoudian et al. (2021) report on a 1000minds ranking survey involving 34 experts in symptomatic early-stage knee osteoarthritis (OA). The experts comprised 14 orthopaedic surgeons, 13 rheumatologists, 2 general practitioners, 2 sports medicine specialists and 3 physical therapists.

Each expert was presented, in random order, with 20 patient case scenarios and asked to rank them, based on their clinical experience, according to how likely they would be to classify each patient as having early-stage knee OA: from 1st = most likely to 20th = least likely. The case scenarios included the patients’ clinical signs and symptoms as well as socio-demographic characteristics such as age, gender and social circumstances.

Audit results

The results from the ranking survey are presented in Table 1 and Figure 1 below. In the table, the number in each cell is the number of participants who ranked the patient case scenario (horizontal axis) in each rank position (vertical axis). For example, 11 of the 34 experts ranked ‘Bob’ 1st (i.e. “most likely to have early-stage knee OA”), 5 experts ranked him 2nd, 3 ranked him 3rd, 5 ranked him 4th and 1 ranked him 5th.
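
To show how such a cross-tabulation can be assembled (using hypothetical rankings for just three scenarios, not the study’s actual data), here is a brief sketch:

```python
# Build a Table 1-style cross-tab: for each case scenario (columns), count how
# many participants gave it each rank position (rows). Hypothetical data only.
from collections import Counter

rankings = {
    "Expert 1": {"Bob": 1, "John": 2, "Rose": 3},
    "Expert 2": {"Bob": 1, "John": 3, "Rose": 2},
    "Expert 3": {"Bob": 2, "John": 1, "Rose": 3},
}

scenarios = ["Bob", "John", "Rose"]
counts = {s: Counter(r[s] for r in rankings.values()) for s in scenarios}

print("Rank  " + "  ".join(f"{s:>4}" for s in scenarios))
for position in range(1, len(scenarios) + 1):
    print(f"{position:>4}  " + "  ".join(f"{counts[s][position]:>4}" for s in scenarios))
```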

As can be seen in the table and figure, agreement over the rankings for the two most ‘extreme’ patients, ‘Bob’ and ‘Rose’ – on average, 1st and 20th respectively – is higher than for the ‘middle-ranked’ patients. For example, ‘John’ received an almost full range of rankings, from 2nd to 20th – meaning some experts thought he was very likely to have early-stage knee OA, whereas others thought he was unlikely to.

Overall, the distribution of the experts’ rankings indicates a high degree of disagreement for most cases – in other words, the experts’ judgements are very ‘noisy’.
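
One common way to summarize this kind of disagreement in a single number is Kendall’s coefficient of concordance, W, which equals 1 when all raters produce identical rankings and approaches 0 when their rankings are unrelated. (This sketch, with hypothetical rankings, is offered only as an illustration – it is not necessarily the statistic reported by Mahmoudian et al.)

```python
# Kendall's coefficient of concordance (W) for a panel of raters who each rank
# the same items, assuming no tied ranks. 1 = perfect agreement, ~0 = no agreement.
def kendalls_w(rank_matrix):
    """rank_matrix: one list of ranks per rater, all over the same items."""
    m = len(rank_matrix)                 # number of raters
    n = len(rank_matrix[0])              # number of items (case scenarios)
    totals = [sum(r[i] for r in rank_matrix) for i in range(n)]  # rank sum per item
    mean_total = sum(totals) / n
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical example: three raters ranking four scenarios (1 = most likely)
ratings = [[1, 2, 3, 4],
           [2, 1, 3, 4],
           [4, 3, 2, 1]]
print(f"Kendall's W: {kendalls_w(ratings):.2f}")  # low values indicate 'noisy' judgements
```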

Based on these results, the authors concluded that what is needed – to reduce the noise and increase decision hygiene – are explicit criteria for classifying patients with early-stage knee OA. They are using 1000minds to specify these criteria and determine weights representing their relative importance.

Table 1: Participants’ ranks for the 20 patient case scenarios (n=34)

Figure 1: Participants’ rankings of the 20 patient case scenarios (n=34) – each participant’s ranking is indicated by a colored shape

References

D Kahneman, O Sibony & C Sunstein (2021), Noise: A Flaw in Human Judgment, Little, Brown Spark.

A Mahmoudian, S Lohmander, M Englund, P Hansen, F Luyten & International Early-stage Knee OA Classification Criteria Expert Panel (2021), “Lack of agreement in experts’ classification of patients with early-stage knee osteoarthritis” (abstract), 2021 OARSI Virtual World Congress on Osteoarthritis: OARSI Connect ‘21, Osteoarthritis and Cartilage 29, S299-S300.