Commenter RR cited a paper by Richardson and Norgate (2015) arguing that IQ is not as predictive of job performance as once thought. In table 1 (see below) they summarize the research, showing that newer studies find much lower predictive coefficients than older studies. They list both the corrected and uncorrected correlations between IQ (or some close proxy thereof) and job performance. These correlations have to be corrected for range restriction because jobs sort people so efficiently by IQ that within a given job, IQ differences are too small to predict much. In addition, measures of job performance can be unreliable: one year you can make $1,000 in commission and the next year $10,000, so correcting for good and bad luck can make the correlation more meaningful.
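To make the two corrections concrete, here is a minimal Python sketch. It uses the standard textbook formulas (Thorndike's Case II for range restriction and the classical attenuation formula for an unreliable criterion). Whether the studies in table 1 applied exactly these formulas is an assumption, and the inputs (an observed correlation of 0.25, an SD ratio of 1.5, a criterion reliability of 0.6) are hypothetical values chosen for illustration, not figures from the paper.

```python
import math


def correct_range_restriction(r_restricted: float, sd_ratio: float) -> float:
    """Thorndike Case II correction.

    sd_ratio is the IQ standard deviation in the full applicant pool
    divided by the IQ standard deviation among people in the job.
    """
    numerator = r_restricted * sd_ratio
    denominator = math.sqrt(1 + r_restricted**2 * (sd_ratio**2 - 1))
    return numerator / denominator


def correct_criterion_unreliability(r: float, criterion_reliability: float) -> float:
    """Classical disattenuation for a noisy job-performance measure."""
    return r / math.sqrt(criterion_reliability)


# Hypothetical inputs, for illustration only.
observed_r = 0.25
sd_ratio = 1.5
criterion_reliability = 0.6

step1 = correct_range_restriction(observed_r, sd_ratio)
step2 = correct_criterion_unreliability(step1, criterion_reliability)
print(f"observed r:                   {observed_r:.2f}")
print(f"after range restriction:      {step1:.2f}")
print(f"after criterion unreliability: {step2:.2f}")
```

With these made-up inputs the corrected figure ends up well above the observed one, which is why the size of the assumed SD ratio and reliability matters so much when comparing old and new meta-analyses.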

Now looking at the five older meta-analyses, the mean corrected correlation is about 0.5. The mean corrected correlation of the five newer meta-analyses is 0.2. I have no idea why there’s such a huge discrepancy between the old and new studies. Perhaps in an era of wokeness and snowflake culture, job performance is more about participation than about actually doing a good job, thus lowering its correlation with IQ. Or perhaps researchers have become more aware of the file drawer effect, and journals are demanding that studies be pre-registered to avoid selective publication of only high correlations. Or maybe wokeness has caused a bias in favor of publishing low correlations.

I decided to look at studies that approached the question from a different angle. Instead of just calculating the corrected correlation between IQ and job performance among regular employees, one study asked what happens when a bunch of brilliant people are hired to do a job normally performed by average people.

Smart cops

From the book A Question of Intelligence by Daniel Seligman (a great read for anyone new to the IQ debate):

We begin with a cautionary tale from the files of New York City’s police department. The time is April 1939. The long depression is still very much in place, and good jobs are hard to get. Any jobs are hard to get. So there is a huge turnout when the department announces civil service exams that will result in the hiring of several hundred policemen. More than 29,000 men take the written exam, which is essentially just an intelligence test.

By normal police standards, a sizable number of the testees are absurdly “overqualified.” In the circumstances, the NYPD set its standards high. It announced that the physical exam for cops would be administered only to the top 3,700 scorers on the written test. After the physical tests, there was more winnowing: It resulted in a new list of the top 1,400 prospects (whose rankings reflected a 70 percent weighting for written scores and 30 percent for physical scores). Going down this list, the department next offered patrolmen’s jobs to 350 or so of the top candidates. In the end, 300 of them—roughly 1 percent of those who had been competing for the jobs—ended up in the class of 1940.

The 300 were plainly smart cops. If you assume that the initial 29,000 test takers were roughly representative of the country’s overall IQ distribution, then you could estimate that the average IQ of the 300 was something like 130.

Fifty years later, a group of Harvard psychologists— Prof. Richard J. Herrnstein and two graduate students, Terry Belke and James Taylor—went back to the NYPD records to see what had become of the brainy class of 1940. Questionnaires were sent to the 192 men then still alive, and more than three-quarters of them responded. Analysis of the survey data demonstrate yet again that high-IQ people do well in the world. The group had on average stayed with the police department for 24.7 years and rose high in the ranks: 43 percent reached the rank of lieutenant or captain, and 18 percent became inspectors of one kind or another. The class of 1940 also produced one police commissioner, four police chiefs, four deputy commissioners, one chief inspector, two chiefs of personnel, one director of the city’s Waterfront Commission, one chief assistant district attorney, one director of the New York State Identification and Intelligence System, and one director of the New York Regional Office of the Law Enforcement Assistance Administration.

At first I was really excited about this study, but then I remembered that the NYPD is a huge testocracy, so of course people who did well on written tests got promoted, since you have to take another one every time you apply for promotion (at least below the captain level). Did Herrnstein not know this, was he hoping we wouldn’t know, or was it less of a testocracy in the 1940s? I doubt it, since that was the peak testing era.

Now it’s very likely these smart cops still would have done well even if tests were not used to promote them, since life itself is an IQ test, but I’m pretty sure they were, so these promotion rates are uninformative. It would be like hiring 300 black cops and then claiming many got promoted because melanin enhances productivity, without telling your readers there was an affirmative action policy to promote black cops. That study would never have passed muster with Herrnstein or Seligman, so they should have applied the same skepticism here, though in fairness, I can’t find Herrnstein’s original paper, so maybe he had a rebuttal or maybe the study included other, less circular data.

Project 100,000

Perhaps the single biggest experiment ever done on IQ and job performance was Project 100,000. Normally the U.S. military avoids recruiting anyone with an AFQT score below the 30th percentile (IQ 92; U.S. norms) and is prohibited from recruiting anyone below the 10th percentile (IQ 81; U.S. norms). However, the need for more men during the Vietnam War, combined with President Johnson’s desire to lift the poor into the middle class, resulted in over 300,000 New Standard Men (IQ 82 to 92) being recruited from October 1966 to December 1971.
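The percentile-to-IQ conversions above can be checked with a few lines of Python. This is a minimal sketch assuming IQ is normally distributed with mean 100 and SD 15 on U.S. norms; the helper name percentile_to_iq is made up for illustration.

```python
from statistics import NormalDist

# Assumption: IQ ~ Normal(100, 15) on U.S. norms.
IQ_DISTRIBUTION = NormalDist(mu=100, sigma=15)


def percentile_to_iq(percentile: float) -> float:
    """IQ score sitting at the given percentile (0-100) of the assumed distribution."""
    return IQ_DISTRIBUTION.inv_cdf(percentile / 100)


print(round(percentile_to_iq(30)))  # 92
print(round(percentile_to_iq(10)))  # 81
```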

Sadly, the New Standard Men (NSM) died in war at three times the rate of the regular recruits. Of the NSM entering basic training, 41.6% remained after 23 months vs 68.8% of regular recruits (see figure below from Gottfredson, 2005). By subtracting these numbers from 100%, we see that just keeping your job put you at only the 31.2nd percentile for regular recruits, but at the 58.4th percentile for NSM.

On the bell curve, the difference between these two percentiles is 0.66 standard deviations, suggesting that the job performance curve of the NSM was 0.66 SD to the left of regular recruits. Now assuming the regular recruits average IQ 108 (the approximate average IQ of Americans above IQ 92) and the NSM average IQ 88 (the approximate average IQ of Americans ranging from IQ 81 to 92), the IQ gap between them is 1.33 SD (20 IQ points).
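The figures in this step can be roughly reproduced in Python. The sketch below assumes job performance is normally distributed within each group, that IQ is Normal(100, 15) in the general population, and that each group is a clean slice of that population (above 92 for regular recruits, 81 to 92 for NSM). The function names are invented for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()       # mean 0, SD 1
IQ_MEAN, IQ_SD = 100.0, 15.0    # assumed population IQ scaling


def z_gap(p_low: float, p_high: float) -> float:
    """Distance in SD units between two percentiles given as proportions."""
    return STD_NORMAL.inv_cdf(p_high) - STD_NORMAL.inv_cdf(p_low)


def truncated_mean_iq(low=None, high=None) -> float:
    """Mean IQ of the slice of the assumed population between low and high.

    A bound of None means the slice is open on that side.
    """
    if low is None:
        pdf_lo, cdf_lo = 0.0, 0.0
    else:
        z_lo = (low - IQ_MEAN) / IQ_SD
        pdf_lo, cdf_lo = STD_NORMAL.pdf(z_lo), STD_NORMAL.cdf(z_lo)
    if high is None:
        pdf_hi, cdf_hi = 0.0, 1.0
    else:
        z_hi = (high - IQ_MEAN) / IQ_SD
        pdf_hi, cdf_hi = STD_NORMAL.pdf(z_hi), STD_NORMAL.cdf(z_hi)
    # Standard formula for the mean of a truncated normal distribution.
    return IQ_MEAN + IQ_SD * (pdf_lo - pdf_hi) / (cdf_hi - cdf_lo)


performance_gap_sd = z_gap(0.312, 0.584)
regular_mean = truncated_mean_iq(low=92)
nsm_mean = truncated_mean_iq(low=81, high=92)
iq_gap_sd = (regular_mean - nsm_mean) / IQ_SD

print(f"performance gap: {performance_gap_sd:.2f} SD")
print(f"regular recruits mean IQ: {regular_mean:.1f}")
print(f"NSM mean IQ: {nsm_mean:.1f}")
print(f"IQ gap: {iq_gap_sd:.2f} SD")
```

Under these assumptions the percentile gap comes out near 0.7 SD, slightly above the 0.66 used in the text, and the two group means land within about a point of 108 and 88.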

This suggests that if all American young men had been recruited by the army, the line of best fit on a scatter plot predicting normalized productivity from normalized AFQT scores would have a slope of 0.66/1.33 = 0.50. Assuming a bivariate normal distribution, the slope of the standardized regression line equals the correlation.
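The claim that the standardized slope equals the correlation can be illustrated with a small simulation. This is a sketch only: it draws made-up bivariate normal data with a true correlation of 0.5 (the value and the helper names are assumptions for illustration, not data from Project 100,000), standardizes both variables, and compares the least-squares slope with Pearson's r.

```python
import math
import random


def simulate_bivariate_normal(rho: float, n: int, seed: int = 0):
    """Draw n (x, y) pairs from a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1 - rho**2) * noise)
    return xs, ys


def standardize(values):
    """Convert values to z-scores using the sample's own mean and SD."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]


def ols_slope(xs, ys):
    """Least-squares slope of y on x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


iq_z, performance_z = simulate_bivariate_normal(rho=0.5, n=20_000)
slope = ols_slope(standardize(iq_z), standardize(performance_z))
r = pearson_r(iq_z, performance_z)
print(f"standardized slope: {slope:.3f}, Pearson r: {r:.3f}")

# The arithmetic from the text: performance gap over IQ gap, both in SD units.
print(f"0.66 / 1.33 = {0.66 / 1.33:.2f}")
```

The slope and r agree to floating-point precision because standardizing sets both variances to 1, so the slope formula reduces to the correlation. The bivariate normal assumption matters for a different reason: it guarantees the relationship is linear, so a slope estimated from two group means can stand in for the slope across the whole population.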

And note 0.5 might even be an underestimate because the denominator is likely too high and the numerator is likely too low. The true IQ gap is slightly less than 20 points because (1) some NSM likely faked their low scores to try to evade military service, making the true average IQ of NSM likely higher than 88, and (2) the true IQ of the regular recruits was likely lower than 108 because it did not include the disproportionately high-IQ men who got academic deferments or had powerful parents pulling strings. There also would have been considerable pressure on the military to make the NSM succeed, thus deflating the numerator.

But taking the numbers at face value, and assuming the military is representative of U.S. jobs, at least as recently as the 1960s, the correlation between IQ and job performance was 0.5, consistent with the older studies in table 1. The fact that my novel and indirect calculations confirm the traditional calculations bodes well. When wildly different approaches using massive datasets converge on the same result, you know you’re on the right path.