Just Emil Kirkegaard Things

Understanding audit studies

The demographic priors cannot be ignored

Emil O. W. Kirkegaard
Jan 11, 2026

In January, I am paywalling every third post as an experiment. In December I paywalled every post. I will keep trying different approaches and hopefully at the end of the year, I can write a post about which approach works best. The reason for the paywalls is that the blog doesn’t make enough money and I’ve got a baby to feed.

John B. Holbein is a mild-mannered economist on X. Two days ago he wrote a long post about Lee Jussim’s new review paper on racial discrimination in hiring. Jussim is the social psychologist known for his work on stereotype accuracy. Holbein says:

Some people argue that discrimination is rampant in our society.

Others argue that few people discriminate.

Who is right?

This article by Lee Jussim takes on this question.

(I found this article to be a fascinating read. It was a little mind-bendy at first, but then it became very intuitive.)

Jussim flags what he calls the “discrimination paradox.”

The apparent paradox is that some high-quality audit studies find large disparities (e.g., in callbacks) against minority applicants, while other equally rigorous studies—lab experiments, field studies of everyday interactions, and platform-based choices—find very few discriminatory acts.

The key point is arithmetic, not a paradox. Outcome gaps in audit studies don’t map cleanly onto the number of discriminatory actors. This is especially true when base rates are low. When positive outcomes are rare (e.g., job callbacks), even a small number of biased decisions can generate large relative disparities in outcomes.

By contrast, many non-audit studies focus directly on individual decisions or acts (e.g. responses in games) and therefore estimate how often discrimination actually occurs at the decision level. Those studies often find discrimination happens infrequently, even though it is systematic.

This helps reconcile findings that otherwise look contradictory: large disparities in outcomes can coexist with discrimination occurring rarely at the level of individual decisions.

So who’s right: those who say discrimination is rampant, or those who say only a few people discriminate?

According to this paper, both are.

It’s this one:

  • Jussim, L. The discrimination paradox. Theor Soc 54, 1083–1102 (2025). https://doi.org/10.1007/s11186-025-09652-0

Rigorous studies published within the past eight years have found diametrically opposed results regarding racial discrimination. Some have found that racial discrimination is very rare; others that racial discrimination is very common. The paradox is that they are all well-conducted studies. In this paper, I show why there is no paradox, and the two sets of findings are completely compatible.

The two lines of evidence that Jussim is trying to reconcile are:

  1. Audit (résumé submission/job application) studies. Typically, researchers make up fake CVs and hold everything constant except the demographic they are interested in. For example, they might apply for junior programmer roles. The CV then has a photo, a name, a university, etc., and the photo shows either a Black/White/whatever person, male/female, or whichever other ‘protected class’ it is currently fashionable to study with regard to discrimination.

  2. Direct action studies. E.g., people take part in mock jury trials, play ultimatum games (player 1 proposes splitting some amount of real money 50-50 or 80-20 etc., and player 2 accepts or not; if player 2 rejects, no one gets anything), or play some other similar game.

The first kind of study usually finds that members of protected classes receive fewer callbacks than Whites. Jussim summarizes one study:

A review and meta-analysis (Quillian et al., 2017) found 21 audit studies of racial discrimination in hiring since 1989 and three additional studies going back to 1972. The studies included over 55,000 applications submitted for over 26,000 jobs. There were two headline findings: (1) On average, White applicants received 36% more callbacks than did Black applicants. (2) This difference did not decline between either 1972 or 1989 and 2015. Indeed, there was weak evidence that it had increased over that time.
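
To make the base-rate arithmetic from Holbein's summary concrete, here is a minimal sketch in Python. It is my own simplified model, not Jussim's calculation: it assumes discrimination only ever removes callbacks (never adds them), so the share of applications affected is just the White callback rate minus the Black callback rate. The 36% figure is the one quoted above; the White callback rates in the loop and the function name `discriminatory_share` are hypothetical, chosen only for illustration.

```python
def discriminatory_share(white_callback_rate: float, ratio: float) -> float:
    """Share of all applications in which a Black applicant is denied a
    callback that an otherwise identical White applicant would have received.

    Simplifying assumption (not from the paper): discrimination only removes
    callbacks, so black_rate = white_rate / ratio and the affected share is
    white_rate - black_rate.
    """
    if not 0 < white_callback_rate <= 1:
        raise ValueError("white_callback_rate must be in (0, 1]")
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    black_callback_rate = white_callback_rate / ratio
    return white_callback_rate - black_callback_rate


# 36% more callbacks for White applicants, as quoted from Quillian et al. (2017).
RATIO = 1.36

# Hypothetical White callback rates, for illustration only.
for rate in (0.05, 0.10, 0.25, 0.50):
    share = discriminatory_share(rate, RATIO)
    print(f"White callback rate {rate:.0%}: {share:.1%} of applications affected")
```

Under these assumptions, a White callback rate of 10% combined with a 36% relative gap implies that only about 2.6% of applications end in a discriminatory non-callback. The same relative gap at a 50% callback rate would require roughly 13%. The lower the base rate, the fewer discriminatory decisions are needed to produce a given relative disparity.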

On the other hand, direct action studies:

In the Peyton and Huber (2021) study, participants played the ultimatum game 25 times with either Black or White partners, so the total number of offers accepted or refused was over 18,000. They operationalized racial discrimination as occurring when White players rejected offers from Black players that would have been accepted had the person offering been White.

The abstract of the paper emphasizes “racial resentment” and “explicit prejudice.” Indeed, the last sentence declares that “explicit prejudice is widespread.” However, to be clear about their main result regarding discrimination, I quote from the paper directly (pp. 30–31):

The first estimate, a 1.3% point decrease (p <.01) in the probability of acceptance, shows that, on average, white responders engaged in anti-Black discrimination by rejecting offers they would otherwise accept if the proposer was white (M1.1).

A 1.3 percentage point decrease sounds rather trivial. Not mentioned by Jussim are mock jury trials. They give results like these:
