Converting GMAT scores to IQ

Commenter Deeru asked me to convert GMAT scores to IQ equivalents.  This is always tricky because while IQ tests are normed on the general population of Western countries, college admission tests are normed on only the educated segment of a population, so converting from one type of norming to another requires some assumptions.

According to Wikipedia:

The Graduate Management Admission Test (GMAT (/ˈdʒiːmæt/ (JEE-mat))) is a computer adaptive test (CAT) intended to assess certain analytical, writing, quantitative, verbal, and reading skills in written English for use in admission to a graduate management program, such as an MBA.[3] It requires knowledge of certain grammar and knowledge of certain algebra, geometry, and arithmetic. The GMAT does not measure business knowledge or skill, nor does it measure intelligence.[4] According to the test owning company, the Graduate Management Admission Council (GMAC), the GMAT assesses analytical writing and problem-solving abilities, while also addressing data sufficiency, logic, and critical reasoning skills that it believes to be vital to real-world business and management success.[5] It can be taken up to five times a year. Each attempt must be at least 16 days apart.[6]

It’s ironic how they deny the GMAT is an intelligence test, while in the very next sentence describing it as a test of “problem-solving abilities”, because one of the most common definitions of intelligence is the (cognitive) ability to solve problems.  It’s clear that the people who make the GMAT are trying to have their cake and eat it too.  They want the predictive validity of an IQ-type test, while at the same time wanting to be seen as good liberals who don’t believe in IQ.

Of course any test that measures literacy and numeracy will tend to correlate substantially with g (the general factor of all cognition, measured by IQ tests), whether the test manufacturers intended it to be an IQ test or not, because symbolism itself (words and numbers) is a defining feature of the human intellect.

Here’s some basic GMAT data:


My first observation is that from about 300 to 700, GMAT scores are more or less normally distributed with a mean of 551.94 and a standard deviation (SD) of 120.88.

GMAT scores of 300 to 700

If we assume that the GMAT population is roughly equivalent to the U.S. college graduate population (mean IQ 111, SD = 13.5), compared to the general U.S. population (mean IQ 100, SD = 15), then the following formula equates GMAT scores to IQ equivalents (U.S. norms):

Formula one (for GMAT scores of 300 to 700):

IQ = [(GMAT score – 551.94)/120.88](13.5) + 111

GMAT scores of 700 to 800

However, much like the pre-1995 SAT, GMAT scores seem to become much rarer at the highest levels than the Gaussian curve would predict (perhaps because of ceiling bumping, or even Spearman’s Law of Diminishing Returns reducing the correlation between sub-sections).  For example, a score of 800 is 2.05 SD above the GMAT mean, which on a Gaussian curve predicts roughly one in 50 GMAT testees should score 800.  Instead, Deeru claims only one in 6,667 scores this high!
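As a sanity check on the Gaussian prediction, here's a quick sketch using the mean and SD estimated above (the exact "one in N" figure depends on rounding; it lands near one in 50):

```python
from math import erf, sqrt

def normal_tail(z):
    """Upper-tail probability of a standard normal distribution."""
    return 0.5 * (1 - erf(z / sqrt(2)))

# How rare should a perfect 800 be if GMAT scores were Gaussian?
z = (800 - 551.94) / 120.88   # about 2.05 SD above the GMAT mean
p = normal_tail(z)
print(f"z = {z:.2f}; predicted rarity: about 1 in {1/p:.0f}")
```

Compare that prediction of roughly one in fifty with the observed one in 6,667: a shortfall of over two orders of magnitude.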

But in order to map this to the IQ scale, we need to know how many people would score 800 on the GMAT if all four million 22-year-old Americans took the GMAT every year (including college dropouts, high school dropouts etc).

I begin with the assumption that the higher you would score on a graduate school admission test, the more likely you are to actually take such a test (given the correlation between academic talent and education level), and so roughly 100% of U.S. 22-year-olds who would score perfect on a graduate school admission test actually take such a test, and whatever shortfall there may be is roughly balanced by perfect-scoring foreign test-takers, or test-takers from other age groups.

Thus people who score perfect on graduate school admission tests did not merely score higher than those who are applying to graduate school, but they scored higher than all 22-year-olds in America, if all 22-year-olds took these tests.

So if only 30 people a year score perfect on the GMAT, does that mean that only 30 of the four million 22-year-olds in America each year would score 800 on the GMAT?  No, because there are many genius 22-year-olds who would have scored 800 on the GMAT had they decided to major in business, but instead are busy acing the LSAT or the GRE or the MCAT etc.

Only 23.6% of advanced degrees are in business, thus I estimate that only 23.6% of people who write graduate school admission tests are writing the GMAT.  But if 100% of aspiring grad students wrote it, then the number scoring perfect each year should jump from 30 to roughly 127 (30/0.236 ≈ 127).

So assuming roughly 100% of U.S. 22-year-olds who would have scored perfect on grad school admission tests actually take said tests (and whatever shortfall is roughly balanced by foreigners and other age groups), and assuming only 23.6% of said testees take the GMAT in particular, then:

if all four million U.S. 22-year-olds took the GMAT, only 127 would score 800, which means an 800 is roughly a one in 31,500 level score (4,000,000/127).  This equates to an IQ of 160 (U.S. norms).
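The arithmetic in the last few paragraphs can be sketched in a few lines of Python, using the standard library's NormalDist (Python 3.8+) to invert the normal curve:

```python
from statistics import NormalDist

perfect_scorers = 127               # the 30 GMAT 800s scaled up to all grad-test takers (30/0.236)
cohort = 4_000_000                  # U.S. 22-year-olds per year
tail = perfect_scorers / cohort     # roughly 1 in 31,500
z = NormalDist().inv_cdf(1 - tail)  # about 4.0 SD above the U.S. mean
iq = 100 + 15 * z
print(f"1 in {1/tail:,.0f}; z = {z:.2f}; IQ = {iq:.0f}")
```

Note that a rarity of about one in 31,500 corresponds almost exactly to +4 SD, hence IQ 160 on a 15-point scale.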

So given that:

GMAT 800 = IQ 160

and given that 700 = IQ 128 (per formula one), then the following formula equates high GMAT scores to IQ (U.S. norms):

formula two (for GMAT scores 700 to 800):

IQ = 0.32(GMAT score) – 96
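For convenience, the two formulas can be combined into a small Python helper (a sketch of the conversions above, with the score ranges hard-coded; not an official norm table):

```python
def gmat_to_iq(score):
    """Convert a GMAT score to an IQ equivalent (U.S. norms) using the two formulas above."""
    if 300 <= score <= 700:
        # Formula one: linear mapping anchored to the U.S. college-grad distribution
        return (score - 551.94) / 120.88 * 13.5 + 111
    if 700 < score <= 800:
        # Formula two: steeper line anchored at 700 -> IQ 128 and 800 -> IQ 160
        return 0.32 * score - 96
    raise ValueError("score outside the 300-800 range the formulas cover")

for s in (400, 552, 700, 800):
    print(s, round(gmat_to_iq(s)))
```

The two lines meet at 700 (formula one gives 127.5, formula two gives 128), so the conversion is essentially continuous across the whole range.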

I do not consider the GMAT or any other college admission test to be a particularly good measure of intelligence; however, when scores from actual IQ tests are not known, the above conversions are a useful proxy.


Intelligence and problem solving

Commenter Gypsy recently sent me the following email:

I know it’s a little dumb to mention it again, especially after so much time has elapsed between convos about it, but I feel as though definitions of intelligence posed most commonly (Adaptability, problem solving, even potentially reasoning) lack an intuitive connection to an essential side of intelligence commonly ignored mainly because of our main practise for assessing intelligence: Problem POSING. We discover intelligence broadly by posing questions and assessing the ability of the candidate to deliver the correct answer, but the construction of a sophisticated plan is an essential and actually more used element of intelligence that is not immediately implied by the definitions we propose.

I know that problem proposition is implied by the definition, but the language doesn’t intuitively convey it and it is thus not immediately implied. I think the language used should be as intuitive as possible so as to immediately capture the essence of the thing itself all at once.

Thanks for reading,

If I understand Gypsy’s email correctly, he seems to be saying that the inherent flaw in how we define and measure intelligence is that we only look at the ability to solve problems, when a crucial part of being smart is identifying the problem itself.

Of course I would argue that it’s not our intelligence that identifies the problem, but rather it’s our feelings.  If we feel the slightest bit of discomfort, even if it’s something as trivial as an itch that needs to be scratched, it’s by definition a problem (since it’s bothering us), and our intelligence is just the brain’s problem solving computer that solves whatever problems our feelings identify.

Now we evolved to feel pleasure when we are engaging in behavior that enhances our genetic fitness (surviving, making money, making love, making friends) and feel pain when we are denied these achievements, and so we are generally motivated to use our intelligence to our genetic advantage,  at least to some degree, or it couldn’t have evolved in the first place.

However because everyone’s incentive structure is unique, one man’s problem is another man’s solution, so an IQ test must DECIDE for us what the problem is, so everyone’s problem solving computer (IQ) can be tested by the same standard.

However, where Gypsy makes a very good point (if I understand him) is that the problem solving IQ tests demand is often very one-dimensional, while in the real-life strategic situations Gypsy is interested in, we have problems within problems within problems.

So instead of the problem being clearly defined like it is on most IQ tests (how do I fit the puzzle pieces together to make an animal?) it could be something as complex as “how do I win a war?”  This is such a complex problem that you have to break it down into lots of mini-problems, and solve them in the correct sequence, while at the same time, the problem is constantly changing because your enemy is adapting to each of your moves.

German military strategist Helmuth von Moltke famously stated, “No battle plan survives contact with the enemy.”

I actually don’t think IQ tests do a very good job at capturing this kind of dynamic interactive problem solving because all of the problems on IQ tests are static and simple enough to be solved in a few minutes.  What is needed is not so much an IQ test, but an interactive IQ contest, where people compete in a cognitively demanding zero sum game where one person must outsmart the other.

I used to think chess was the ultimate test of intelligence, but its sensitivity to practice and teaching, and the fact that computers do better than people, dampened my enthusiasm.

What is needed is a version of chess that’s constantly changing, so you can’t practice it or study openings, endgames, and traps; you must constantly invent your own, because one day the board has 64 squares and the next day it has 225.  One day each side has one queen, the next day each side has eight queens, etc.  Perhaps some genius could write a computer chess program where such changes occur randomly, so that whoever had the highest rating on this constantly changing version of chess would be judged the smartest person.

But unfortunately no matter how much you altered the size of the chess board or the number of pieces, computers would probably still beat people, so what is needed is a strategy game that computers can’t outsmart us at, if it’s going to have credibility as a test of intelligence.

Mug of Pee’s IQ

Commenter Philosopher wanted to know what I think Mug of Pee’s IQ is.

On college admission tests, Mug of Pee’s IQ equivalent is as high as 160, but on actual IQ tests like the Wechsler intelligence scale which he took at age nine, I crudely estimate Mug of Pee scored 127 (after adjusting for old norms).

Now, normally when someone has a huge achievement score > IQ score gap, they are described as an overachiever, because they have learned far more than their IQ would predict.  Instead of humbly accepting this label, Mug of Pee has devoted his life to denying any distinction between IQ tests and achievement tests, irrationally claiming the SAT is the best IQ test and pretentiously dismissing actual IQ tests as “soi-disant IQ tests”.

So what is his true IQ?

In order to answer this question, we must first note that IQ tests are largely considered valid to the extent that they measure g (general intelligence).  g is simply whatever causes all cognitive abilities to positively correlate.  The g loadings of the Wechsler and the SAT in the general U.S. population are not known, but by dividing their correlations with the Raven by the Raven’s g loading, it can be estimated that the Wechsler has a g loading of 0.94 and the SAT has a g loading of 0.68.  Further, based on the fact that Dartmouth students, largely selected on SAT scores, regressed nearly halfway to the U.S. mean on the WAIS, I estimate the WAIS and SAT correlate 0.52.

Armed with these three statistics, the following regression equation can be built (hat-tip to a member of Prometheus who suggested such equations to me many years ago):

Expected g Z score = 0.8(Wechsler Z score) + 0.26(college admission test Z score)

So, since I estimate Mug of Pee has a Z score of +1.8 on the Wechsler (IQ 127) and as high as +4 on college admission tests (IQ equivalent of 160), we’ll plug those values into the equation:

Expected g Z score = 0.8(1.8) + 0.26(4)

Expected g Z score = 1.44 + 1.04

Expected g Z score = 2.48

So on a hypothetically perfect measure of psychometric g, Mug of Pee would be expected to score 2.48 standard deviations above the U.S. mean (IQ 137).  We can say with 95% certainty that his true level of g is between 129 and 145.
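For the curious, the 0.8 and 0.26 weights can be reconstructed as standardized two-predictor regression weights from the three correlations estimated above, and a confidence interval follows from the standard error of estimate.  The sketch below assumes that derivation (small rounding differences from the figures in the text are expected):

```python
from math import sqrt

# Correlations estimated in the text
r_wg, r_sg, r_ws = 0.94, 0.68, 0.52   # Wechsler-g, SAT-g, Wechsler-SAT

# Standardized (beta) weights for predicting g from both tests
denom = 1 - r_ws ** 2
b_w = (r_wg - r_sg * r_ws) / denom    # about 0.80
b_s = (r_sg - r_wg * r_ws) / denom    # about 0.26

g_z = b_w * 1.8 + b_s * 4.0           # Mug of Pee's z scores on each test
iq = 100 + 15 * g_z                   # about 137

# 95% interval from the standard error of estimate
r_squared = b_w * r_wg + b_s * r_sg
see = sqrt(1 - r_squared) * 15        # in IQ points
print(f"weights: {b_w:.2f}, {b_s:.2f}; IQ = {iq:.0f} "
      f"(95% CI {iq - 1.96*see:.0f} to {iq + 1.96*see:.0f})")
```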


The SAT is to IQ as shadows are to height


Arthur Jensen noted how measuring a trait indirectly can often lead to misleading conclusions.  He compared it to measuring a person’s height by measuring the height of their shadow.  The correlation between actual height and shadow height could be extremely strong under controlled conditions, but when the position of the sun moves, the measurements become meaningless.

I think giving someone an official IQ test like the Wechsler is somewhat analogous to measuring their height directly with a stadiometer, while giving someone the SAT is like measuring their height from their shadow: you’re not directly observing how fast they can learn, as you do on many Wechsler subtests; you’re indirectly inferring it from how much they know.

Of course shadow measurements can be extremely accurate.  If everyone is measured at the same time of day, shadow height will correlate near perfectly with actual height, and when everyone takes the SAT with a similar academic background, the SAT correlates near perfectly with general intelligence (the g factor), as found in a sample from the University of Texas at San Antonio.

However in America, there’s a strong class divide: you have the upper class, who study AP algebra, geometry, calculus, and Shakespeare, and the lower class, who attend working-class schools and are dissuaded from going to college at all.  The lower class tend not to even take the SAT, but when they do, they tend to score below their genetic potential.  For example, Bill Cosby had an IQ equivalent around 80 on the SAT despite scoring as very intelligent on an official IQ test and being known for his comic wit.  Other quick comic minds from working-class backgrounds who underperformed on the SAT include Rosie O’Donnell and Howard Stern.

A good analogy would be the upper class having their shadow height measured in the morning, when shadows are quite long, and the lower class having theirs measured in the afternoon, when shadows are quite short.  Now within each class, shadow height may correlate near perfectly with stadiometer height, just as within each class, the SAT may correlate near perfectly with official IQ.  But when the ENTIRE population is aggregated, the correlation between shadow height and stadiometer height plummets because of the class inequality, just as the correlation between SATs and official IQ scores plummets.

This explains why people who are 46 IQ points above the U.S. mean on the new SAT regress to only 21 IQ points above the U.S. mean on the Raven IQ test, suggesting the new SAT correlates 21/46 = 0.46 with the Raven in the general U.S. population.  Arthur Jensen noted that the correlation between two tests is a product of their factor loadings, so assuming the only factor the SAT and Raven share is g, then dividing their 0.46 correlation by the 0.68 g loading of the Raven tells us the SAT also has a g loading of 0.68, or roughly 0.7 if you like round numbers.
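The factor-loading arithmetic in that paragraph is just a division, shown here with the rounded 0.46 correlation the text uses:

```python
sat_raven = round(21 / 46, 2)   # 0.46: SAT-Raven correlation implied by the regression
raven_g = 0.68                  # the Raven's assumed g loading
sat_g = sat_raven / raven_g     # about 0.68, assuming g is their only shared factor
print(round(sat_g, 2))
```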

A g loading of 0.7 is not low, and tells us the SAT is a reasonable proxy for g in the general U.S. population, but it’s nowhere near the 0.9 g loading the SAT enjoys in more socioeconomically homogeneous subsets of America, such as students at the University of Texas at San Antonio.  This is because the general U.S. population is analogous to people having their shadow heights measured at different times of day, while the students at a given local university are analogous to people all having their shadow height measured at the same time of day, thus maximizing the correlation between shadow height and real height.

Reaction norms vs independent genetic effects

Commenter Mug of Pee writes:

he rots in the [Northern] california coast.


he sears/is sere and dies in the Saguaro National Park.


so [an HBDer] walks through Redwood National Park. he looks around and sees no cacti. he concludes that cacti are genetically inferior to redwoods.

he walks through Saguaro National Park. he looks around and sees no redwoods. he concludes that redwoods are genetically inferior to cacti.

Mug of Pee has been preaching the gospel of reaction norms for years and insisting that HBDers don’t grasp the concept.  I think I do, though I had never heard of it until Mug of Pee mentioned it.  It was not discussed in any HBD book I read (though I’ve only read a few authors).

The Phenotype = Genotype + Environment model (independent genetic effects)

Of course I was well aware that environment affects IQ (the Flynn effect is proof of that), but my thinking was confined to the Phenotype = Genotype + Environment model, which states that while the same genotype can have a very different phenotype depending on which environment it’s reared in, the rank order of phenotypes will remain largely constant across environments, as long as the genotypes being compared are reared in the same one.

So for example, although environment drastically affects the heights of men and women (both sexes are much taller today than in the 19th century, and both sexes are much taller in the developed World than in the Third World) the male > female height gap remains of similar size and direction across time and place.

Reaction norm model (dependent genetic effects)

The reaction norms model is more subtle, instead arguing that genes for tallness in environment A might be genes for shortness in environment B, so while men might be taller than women in America, if those same genotypes were raised in a different country, the women might be taller than the men.

Of course in reality we don’t see this.  Men are taller than women on average in every country and time period I’m aware of, thus the sex-linked height genes would be what Mug of Pee calls “independent genetic effects”, meaning their effect on the phenotype is independent of the environment (the Phenotype = Genotype + Environment model) because no matter what environment you’re in, having a Y chromosome adds height.

You’ll be much taller if raised in 21st century Western Europe than in 19th century sub-Saharan Africa, but in both times and places, you’ll be much taller with a Y chromosome than without.  Mug of Pee concedes that physical genotypes (i.e. height genes) tend to have independent genetic effects, however he suspects that mental genotypes (IQ genes, personality genes, genes for autism and schizophrenia) have dependent genetic effects because humans are cultural creatures known for our behavioral plasticity.

Mug of Pee feels that most estimates of the heritability of IQ (whether from twin studies or Genome-wide Complex Trait Analysis) are too high because they are limited to people in the same country, and it could be that a certain genotype increases IQ all over that country, thus spuriously increasing heritability, when in another country it may decrease IQ.  To get a better estimate of heritability, Mug of Pee would like to see a study where identical twins reared apart were not just raised in different towns, but in different developed countries (i.e. an American’s identical twin is raised in Japan, a Canadian’s identical twin is raised in Germany).  If this were done, Mug of Pee feels the adult IQ correlation of identical twins raised apart would drop precipitously, because a genotype that intellectually benefits from one country’s language or education system might be stunted by another’s.

In support of the reaction norm model of IQ, Mug of Pee frequently cites Ashkenazi Jews, who were intellectually super-accomplished in the 20th century, but barely a blip on the radar screen in prior centuries.

Ability vs. success

The problem with thinking IQ follows a reaction norm model is that IQ is moderately correlated with physical traits like brain size and highly correlated with physiological abilities like the speed and consistency of complex reaction times, and physical genotypes seem to have independent genetic effects.

However Mug of Pee might be half-right.  While IQ genes probably have independent genetic effects, success genes (i.e. wealth, status, power, eminence) probably follow the reaction norm model to a large degree.  So while Ashkenazi Jews may have had higher IQs than gentiles for many centuries, their achievements only surpassed those of Gentiles in the 20th century because the cultural and economic changes that occurred were both favorable to the Ashkenazi genotype and unfavourable to the Gentile genotype.

So wealth, status, and achievement genotypes may have environment-dependent effects.  Billionaire Warren Buffett has stated that his brain is just wired for a certain type of thinking that happens to be valued in modern markets, but that in other periods of history he would have been a loser.  So while Warren’s mathematical genotype is an independent genetic effect, his wealth genotype may be a dependent genetic effect.  Bill Maher has stated that he’s only rich because it’s a fluke to live in a society where you gain wealth by telling jokes.

So one can see how a reaction norm view of humanity would correlate with Marxism and commenter Mug of Pee is a devout Marxist.

I suspect the heritability of income, power, and status would drop much more when moving from a within country twin study to an international twin study, than the heritability of IQ would because I suspect achievement is much more culture dependent than IQ.  Nonetheless, I feel even the former variables are caused by some independent genetic effects.

Did Neanderthals go extinct because they weren’t smart enough to survive the cold?

Neanderthals had short stocky bodies perfectly suited to the cold.  Modern humans had tall skinny bodies, terribly suited to the cold. Yet despite being at a physical disadvantage, modern humans had the intelligence to adapt the situation to their advantage.  The BBC writes:

…Neanderthals, with their shorter and stockier bodies, were actually better adapted to Europe’s colder weather than modern humans. They came to Europe long before we did, while modern humans spent most of their history in tropical African temperatures.  Paradoxically, the fact that Neanderthals were better adapted to the cold may also have contributed to their downfall.

If that sounds like a contradiction, to some extent it is.

Modern humans have leaner bodies, which were much more vulnerable to the cold. As a result, our ancestors were forced to make additional technological advances. “We developed better clothing to compensate, which ultimately gave us the edge when the climate got extremely cold [about] 30,000 years ago,”…

If this analysis is correct, it provides strong support for the cold winters theory of human population differences in IQ, because it suggests cold climates were so cognitively demanding for hominins, that not even Neanderthals, whose bodies were physically adapted to the cold, could survive when it got really cold.


The brain-size IQ correlation comes roaring back to life!

From the 1990s to the early 2010s, it had become well documented that the brain size-IQ correlation among adults living in developed countries was about 0.4.  Then in 2015, a meta-analysis by Jakob Pietschnig, Lars Penke, Jelte M. Wicherts, Michael Zeiler, and Martin Voracek surfaced, claiming the brain size-IQ correlation was only 0.24!  The paper argued that the 0.4ish figure that was typically cited was inflated by publication bias, which the authors went out of their way to counter.

While much of the HBD-o-sphere and academic community uncritically accepted the results of this meta-analysis and routinely cited it in their articles, I was immediately suspicious and argued that failure to correct for range restriction and other methodological problems had spuriously deflated the correlation, and that the true correlation was much closer to the traditional 0.4 than to the 0.24 Pietschnig et al. had reported.

Now a brand new meta-analysis by Gilles E. Gignac and Timothy C. Bates is being published in the peer-reviewed journal Intelligence showing once again that Pumpkin Person was right!  The authors reviewed the research cited by Pietschnig et al. but corrected for range restriction, test quality, and sample quality, and a 0.4 correlation was found.

Abstract below:


However, even 0.4 might be an underestimate of the within-sex correlation between brain size and IQ, because no correction was made for the fact that some samples combined men and women, which lowers the correlation because men have substantially larger brains than women, yet virtually the same IQs.  The within-sex correlation might be closer to 0.45.

It seems the brain size-IQ correlation is very similar to the height-weight correlation.  In a sample of male university students, the height-weight correlation was about 0.4.  Arguably brain size is to IQ as height is to weight.  A big brain helps make you smarter just as a tall height helps make you bigger, but just as large brains are only one cause of IQ, greater height is only one cause of greater size; it’s possible for small-brained people to be brilliant just as it’s possible for very short people to be huge, and vice versa.

A genetic basis for IQ?

The brain-size IQ correlation is controversial because it suggests IQ is a biological variable, which in turn suggests it’s genetic.  While IQ skeptics have been cheering the failure of genome-wide association studies to identify many genetic variants associated with IQ, they would be wise not to get their hopes up.  Davies et al (2011) genotyped 3,511 unrelated adults and found heritabilities of 0.44 for crystallized intelligence (acquired knowledge) and 0.51 for fluid intelligence (abstract reasoning).  Taking the square root of these heritabilities suggests the IQ phenotype-genotype correlation may exceed 0.7!  It should be noted that unlike traditional twin studies, which yielded even higher numbers, genome-wide complex trait analysis only quantifies the additive portion of heritability, so the full heritability may be higher still.
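The square-root step is shown below, using the Davies et al. estimates quoted above (note the crystallized figure comes out slightly under 0.7; it's the fluid figure that exceeds it):

```python
from math import sqrt

# Davies et al. (2011) GCTA heritability estimates from the text
h2_crystallized, h2_fluid = 0.44, 0.51

# The phenotype-genotype correlation is the square root of heritability
print(round(sqrt(h2_crystallized), 2))  # 0.66
print(round(sqrt(h2_fluid), 2))         # 0.71
```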

Of course as commenter “Mugabe” notes, research is needed across a much wider range of environments to determine whether these are independent genetic effects.

Marching up the evolutionary tree

Scientists commonly assert that evolution is not progressive and that organisms occupying lower branches on the evolutionary tree are not any more primitive or ancestral than organisms occupying higher branches, because all extant life forms are equivalent cases of time-tested evolutionary success.

For example, Harvard biologist Stephen Jay Gould wrote “evolution forms a conspicuously branching bush, not a unilinear progressive sequence…earth worms and crabs are not our ancestors; they are not even ‘lower’ or less complicated than humans in any meaningful sense.”

This web page even displays a helpful diagram of an evolutionary tree to debunk the idea of evolutionary progress:

It’s technically true that no extant species is precisely ancestral to humans.  It’s also true that smart, complex and impressive life forms could theoretically have split off at any point on the evolutionary tree, and that “progress” is a somewhat subjective term.  But if all extant life forms were truly equally evolved, and if lower-branching life forms were in no sense ancestral to higher-branching life forms, there should be zero correlation between position on the tree and “progressive” traits like brain size and encephalization quotient (the ratio of actual brain size to the brain size expected for an animal’s body size).

In order to test this hypothesis, I decided to compare degree of branching on the evolutionary tree (which I defined as the number of splits on the tree before a given taxon splits off) with brain size/encephalization in 1) three major kingdoms of life, 2) four major animal groups, 3) five major higher primate groups, 4) four species of the genus homo, and 5) nine populations of modern humans.

For each of these samples, the Pearson correlation coefficient (r) was computed.  Such correlations can range from -1.0 (an increase in X perfectly predicts a decrease in Y, and vice versa) to +1.0 (an increase in X perfectly predicts an increase in Y, and vice versa).  If evolution is progressive, we’d expect most of the correlations to fall between 0 and +1.0.  If evolution is regressive, we’d expect most of the correlations to fall between 0 and -1.0.  If evolution were neither progressive nor regressive, we’d expect a mix of positive and negative correlations, averaging out to roughly zero.

What I actually found was that all the correlations were positive, ranging from +0.5 to about +1.0.

Correlation between number of splits and encephalization among kingdoms: +0.5

The following tree shows that plants, animals and fungi are all descended from a common ancestor.  That lineage split into plants (on the left) and non-plants (on the right) and then the non-plant branch splits again into animals and fungi. So plants are descended from one split, but animals and fungi are descended from two.


Notice how animals, which are descended from two splits, have a brain (averaging an encephalization quotient of about 0.5), but plants, which are descended from only one split, do not.  This implies a positive correlation between number of splits and intelligence, but since brainless fungi are also descended from two splits, the correlation is only moderate: +0.5.


Correlation between number of splits and encephalization among four major animal groups: +0.9

Now among animals, in the below tree, worms are descended from one split, fish are descended from two, and birds and mammals are descended from three.


Because mammals have an average encephalization quotient (EQ)  of 1.0,  birds average EQs of 0.75, fish have EQs of 0.05, and worms have unknown EQs, but probably about 0.01, the correlation between EQ and number of splits among major animal groups in the above tree is an astonishing +0.9!
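Anyone wanting to check that figure can compute the Pearson correlation from the numbers just given (remember the worm EQ of 0.01 is a guess):

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

splits = [1, 2, 3, 3]              # worms, fish, birds, mammals
eq     = [0.01, 0.05, 0.75, 1.0]   # encephalization quotients from the text
print(round(pearson_r(splits, eq), 2))  # 0.9
```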



Correlation between brain size and number of splits among five major higher primates

Just from looking at the below two images, it should be obvious that there’s a positive correlation between primate brain size and number of splits on the hominoidea evolutionary tree.



As shown below, the correlation between a primate group’s brain size and its branching on the evolutionary tree is +0.67.



Correlation between brain size and number of splits among four species in the genus homo

Wikipedia states that Homo habilis (descended from one split in the below tree) had a brain size of 610 cm³, Homo erectus (descended from two splits) had a brain size of 1,093 cm³, and that modern humans and Neanderthals (both descended from three splits) have brain sizes of 1,497 cm³ and 1,427 cm³ respectively.

This results in an astonishing +0.995 correlation between brain size and number of splits:


Correlation between brain size and number of splits among nine modern human populations: +0.71

Lastly, I decided to explore the correlation between brain size and number of splits among nine modern human populations using brain size data from Richard Lynn, which he adapted from Smith and Beals (1990).


Genetic distance tree by Cavalli-Sforza



From Race Differences in Intelligence (2006) by Richard Lynn

Plotting the average brain size of each “race” as a function of number of splits before it branched off of Cavalli-Sforza’s tree gives a potent +0.71 correlation, suggesting that ancient splitting-off dates explain 50% of the variation in racial brain size.


Interpreting the results

The above correlations between brain size/encephalization and number of splits on the evolutionary tree are all positive, and in some cases extremely strong, suggesting 1) evolution is progressive, 2) some extant organisms are more evolved than others, and 3) organisms that branch off the evolutionary tree prematurely, and don’t do any more branching, tend to resemble the common ancestor of said tree.

Although the preliminary evidence I document here is strong, more research is needed because my choice of trees to analyze was not random, and one can imagine how a different set of trees might not produce such high correlations.  While I tried to find trees that compared taxa of relatively equal rank (i.e. comparing species with species within the same genus, or race with race within the same species), many of the groupings are arbitrary.  The decision to lump or split various groups can result in fewer or more splits in the evolutionary trees, so it’s crucial that these decisions be made on objective criteria.

Nonetheless the fact that trees made by other people, who made them without considering the brains of the taxa, still correlated so consistently with brain size/encephalization, is compelling.

Explaining the trend

But if evolution is progressive, the question is why?  Stephen Jay Gould claimed that any trend towards complexity is merely an artifact of the fact that life started extremely simple, and had nowhere to go but up, so random variation in all directions will be progressive merely because there’s a floor on how simple life can get.  And yet there’s no floor on how small a brain can get, but I have yet to find a single phylogenetic tree where encephalization is negatively correlated with number of splits, and while such trees undoubtedly exist, they are conspicuously rare.  Thus an additional explanation for evolutionary progress is that because intelligence allows organisms to adapt behaviorally instead of genetically, it’s more efficient than evolving new traits every time the environment changes, and thus it tends to be highly favoured by natural selection.

An ancient tradition

Although the notion of evolutionary progress is today often dismissed as pseudoscience, it has a rich intellectual history that predates even the theory of evolution itself by over two millennia.



As J.P. Rushton noted, Aristotle suggested a scala naturae in which animals > plants > inanimate objects, one of the most important ideas in Western thought.  Aristotle viewed higher-ranked organisms as more perfect, God-like and rational.  The great Greek philosopher stated:

Now some simply like plants accomplish their reproduction according to the seasons; others take trouble as well to complete the nourishing of their young, but once accomplished they separate from them and have no further association;  but those that have the understanding and possess some memory continue the association, and have a more social relationship with their offspring.

Over 2000 years later this would come to be known as the r/K scale, where K-selected organisms have lower reproduction rates but higher survival rates, investing more in parenting than in reproducing, while r-selected organisms do the opposite.


Although r/K theory (and its applications to humans) has been severely criticised, it remains undeniable that regardless of its selective agents, there is an evolutionary trade-off between high quantity and high quality offspring and different organisms fall at different points on this continuum.

Modern theories of evolutionary progress

E.O. Wilson, the co-father of the r/K scale, believed evolution was progressive, dividing life’s history into four major stages:

(1) the emergence of life itself in the form of primitive prokaryotes with no nucleus

(2) the emergence of eukaryotes with a nucleus and mitochondria

(3) the evolution of large multicellular organisms that have complex organs like eyes and brains

(4) the emergence of the human mind

Princeton biology professor John Bonner noted that there has been an evolution from primitive bacteria billions of years ago to complex life forms today, that newer animals have bigger brains than older animals, and that it’s perfectly natural to say older life forms are lower than newer life forms, because their fossils are literally found in lower strata. Even plants can be ranked, he argued: angiosperms > slime molds.

Paleontologist Dale Russell noted that the mean encephalization of mammals has tripled in the last 65 million years and that the mean encephalization of dinosaurs steadily increased for over 140 million years.  Extrapolating from the latter trend, Russell argued that had dinosaurs not gone extinct 65 million years ago, they would have eventually evolved into big-brained bipeds.

While the specific humanoid form Russell imagined was highly speculative, the increase in encephalization seems quite plausible.

Inspired by such thinkers, in 1989 J.P. Rushton argued that archaic forms of the three main races (Negroids, Caucasoids, and Mongoloids) differed in antiquity, with newer races being more K selected than older races, though Rushton’s model has excited enormous criticism.

r/K Selection Theory: A Response to Rushton by RaceRealist and Afrosapiens

[Note from PP, June 24, 2017: the following is a guest post and does not necessarily reflect the views of Pumpkin Person.  Out of respect for the authors, please try to keep all comments on topic.  I understand conversations naturally evolve, but at least try to start all discussions with on topic comments]


J. Philippe Rushton (1943-2012) was a British-born Canadian psychologist known for his theories on genetically determined racial differences in cognition and behavior between Africans, Europeans, and East Asians. While marginal among experts, Rushton’s theories are still widely accepted amongst proponents of eugenics and racialism. This article will focus on Rushton’s Differential K theory, which tries to apply the r/K selection model to racial differences in behavioral traits. To be fair, Rushton wasn’t the only one to use r/K selection as an explanation for psychological differences within humanity; for instance, some have associated the continuum with left-wing vs. right-wing ideologies. And although ecologists (the specialists of ecosystems) find applying r/K selection to humans inappropriate, the behavioral sciences have identified life-history patterns that roughly correspond to the colloquial fast vs. slow differences in life history. For that reason, Rushton may have accidentally discussed variables and trends that are largely acknowledged by experts, but his theory rests on a misunderstanding of core principles of the r/K model, as well as on flawed (or non-existent) data.

Agents of selection

To begin, confusion about the modes of selection in an ecological context needs to be cleared up. There are three classes of natural selection in ecological theory to be discussed: r-selection, where the agent of selection acts in a density-independent way; K-selection, where the agent of selection acts in a density-dependent way; and alpha-selection, which is selection for competitive ability (territoriality, aggression). Typical agents of K-selection include food shortage, endemic and infectious disease, and predation. Typical agents of r-selection include temperature extremes, droughts, and natural disasters. Typical agents of alpha-selection are limited resources that can be collected or guarded, such as shelter and food, showing that alpha-selection is closer to K than to r (Anderson, 1991).

As you can see, the third mode of selection in ecological theory is alpha-selection, which Rushton failed to bring up as a mode of selection to explain racial differences in behavior. He didn’t explain why he did not include it, especially since alpha-selection is selection for competitive ability. One may wonder why Rushton never integrated alpha-selection into his theory: either he was ignorant of alpha-selection, or he omitted it because it can occur in numerous ecosystems, whether temperate/cold or tropical. The non-application of alpha-selection throws his theory into disarray and should have one questioning Rushton’s use of ecological theory in application to human races.

The Misuse of r/K Theory



Rushton’s model starts with the erroneous assumption that the populations he describes as humanity’s three main races qualify as ecological populations. When studying the adaptive strategies of organisms, ecologists only consider species within their evolutionary niche—that is, the location in which the adaptation is hypothesized to have occurred. When it comes to humans, this can only be done by studying populations in their ancestral environments. For this reason, Africans, Europeans, Amerindians—any population that is not currently in its ancestral environment—are not suitable populations to study in an evolutionary ecological context. The three populations no longer inhabit the environment in which the selection is hypothesized to have occurred, so any conclusions based on observing modern-day populations must be viewed with extreme caution (Anderson, 1991). Even in the Old World, constant gene flow between ecoregions, as well as alterations of the environment due to agriculture and then industrialization, make such a study virtually impossible, as it would require ecologists to study only hunter-gatherers that have received no admixture from other areas.

Rushton’s next misuse of the theory is not discussing density-dependence and density-independence and how they relate to agents of selection and the r/K model. K-selection works in a density-dependent way while r-selection works in a density-independent way. Thus, K-selection is expected to favor genotypes that persist at high densities (increasing K) whereas r-selection favors genotypes that increase more quickly at low densities (increasing r) (Anderson, 1991). Rushton also failed to speak about alpha-selection. Alpha-selection is selection for competitive ability and, like K-selection, typically occurs at high population densities, though it can also occur at low population densities. Instead of favoring genotypes that increase at high densities, alpha-selection “favours genotypes that, owing to their negative effects on others, often reduce the growth rate and the maximum population size” (Anderson, 1991: 52).

The r/K continuum

The r/K continuum—proposed by Pianka (1970)—has been misused over the decades (Boyce, 1984), and it is from Pianka that Rushton got the continuum he applied to human racial differences. Different agents of r-selection produce different selection pressures, as do different agents of K-selection. However, where Rushton—and most who cite him—go wrong is in completely disregarding the agents of selection, along with perhaps the most critical part: reversing r and K in application to human races (if the model were applicable to human races at all), which will be covered below.

Dobzhansky (1950: 221) notes that “Tropical environments provide more evolutionary challenges than do the environments of temperate and cold lands.” Yet it is erroneously assumed that living in colder temperatures is somehow ‘harder’ than living in Africa. People believe that since food is ‘readily available’ in Africa, it must be ‘harder’ to find food in temperate/Arctic environments, and that therefore selection for high intelligence occurred in Eurasians while Africans have lower intelligence, since it’s so ‘easy’ to live in Africa and other tropical environments.

Africans, furthermore, have been in roughly the same environment since the OoA migration occurred (the Ice Age ‘ended’ about 11,700 ya, although we are still in an Ice Age since the planet’s caps still have ice), so any assumption that it was ‘harder’ for the ancestors of Eurasians to survive and pass on their genes is baseless. Tropical environments provide more evolutionary challenges than temperate and cold lands, whereas the migration out of Africa introduced humans to novel environments. As described above, endemic disease is an agent of K-selection whereas migration to novel environments is an agent of r-selection. Thus, cold temperatures would be an agent of r-selection, not K-selection as is commonly believed, whereas endemic disease would be an agent of K-selection.

Even though neither intelligence nor rule-following was included on the list of variables Pianka (1970) placed on his r/K continuum, Rushton chose to include them anyway, even though selection for intelligence and rule-following can occur under agents of either r- or K-selection (Anderson, 1991: 55; Graves, 2002: 134-144). Pianka (1970) never gave an experimental rationale for why he placed the traits he did on his continuum (Graves, 2002: 135). This is one critical point that makes the theory unacceptable in application to racial differences in behavior. By Rushton’s own interpretation of the r/K model, Africans would be selected for intelligence while Eurasians would be selected to breed more, since novel environments (i.e., colder temperatures) are agents of r-selection, not K. Using the terms r- and K-selection to describe the traits of an organism is inappropriate; Rushton’s application of r/K theory to the traits of the three races, while ignoring that r/K describes a mode of natural selection, “indicates circular reasoning rather than support for Rushton’s hypothesis” (Anderson, 1991: 59).

Reznick et al (2002: 1518) write: “The distinguishing feature of the r- and K-selection paradigm was the focus on density-dependent selection as the important agent of selection on organisms’ life histories. This paradigm was challenged as it became clear that other factors, such as age-specific mortality, could provide a more mechanistic causative link between an environment and an optimal life history (Wilbur et al. 1974, Stearns 1976, 1977). The r- and K-selection paradigm was replaced by a new paradigm that focused on age-specific mortality (Stearns 1976, Charlesworth 1980).” r/K selection theory was dropped for the much stronger life-history approach (Graves, 2002), which retains some elements of r and K, but the terms themselves are no longer used, since other factors matter more as agents of selection than the density dependence and independence that were once thought central.

Simple models?

One of the main reasons that Rushton’s r/K continuum gets pushed is because it’s a ‘simple model’ that so ‘parsimoniously’ explains racial differences. (e.g., cold winters supposedly take more intelligence to survive in and supposedly are an agent of K-selection.) But ecological systems are never simple; there are numerous interactions between the physical environment and the biological system which interact in complex ways.

Rushton’s use of this ‘simple model’—the r/K continuum—and its application to human races is wrong because 1) the three races described are not local populations; 2) the r/K continuum as described by Pianka (1970) is a poor representation of multidimensional ecological processes; and 3) cold weather is normally an agent of r-selection while endemic disease in Africa—as described by Rushton—is an agent of K-selection. Simple models are not always best—especially for organisms as complex as humans—so attempting to reduce complex biological and environmental interactions into a linear continuum is mistaken (Boyce, 1984). The simpler the ecological model, the more ecological sophistication is needed to understand and apply it. So, although Rushton prefers simple models, in this context they are not apt, as complex biological systems interacting with their environments should not be reduced to a ‘simple model’.

Applying r/K to human races

If the r/K model were applicable to humans, then Caucasoids and Mongoloids would be r-selected while Negroids would be K-selected. Endemic and infectious disease—stated by Rushton to be an r-selected pressure—is actually a K-selected pressure. So Negroids would have been subjected to both K-selected pressures (disease) and r-selected pressures (drought). Mongoloids, conversely, migrated into colder temperatures, which act in a density-independent way—hence, cold winters (temperature extremes) are an agent of r-selection.

Pianka’s (1970) r/K continuum “confuses the underlying pattern of life history variation with density-dependence, a process potentially involved to explain the pattern” (Gaillard et al, 2016). Furthermore, one cannot make assumptions about an organism’s traits and the selection pressures that caused them without studying said organism in its natural habitat. This seems to be impossible here, since one would need to study non-admixed hunter-gatherer populations that have received no outside contact.

Gonadotropin levels, testosterone, prostate cancer and r/K theory

Numerous attempts have been made to validate Rushton’s r/K theory. One notable paper by Lynn (1990) attempts to integrate gonadotropin levels and testosterone into Rushton’s r/K continuum. Lynn cites studies showing that blacks have higher testosterone than whites who have higher testosterone than Asians. He then implicates higher levels of both testosterone and gonadotropin levels as the cause for the higher incidence of prostate cancer (PCa) in black Americans.

Lynn (1990) asserts that by having fewer children and showing more care, this is shifting to a K strategy. So, according to Lynn, the best way to achieve this would be a reduction in testosterone. However, there is a fault in his argument.

The study he uses for his assertion is Ross et al (1986). He states that the two groups were both “matched for possible environmental factors which might affect testosterone levels” (Lynn, 1990: 1204). However, this is an erroneous assumption. Ross et al (1986) did control for relevant variables, but made two huge errors. They did not control for waist circumference (WC), and, perhaps most importantly, did not assay the subjects in the morning as close to 8 am as possible.

Testosterone levels are highest at 8 am and lowest at 8 pm. When doing a study like this—especially one meant to identify a cause of a disease with a high mortality rate—all possible confounds must be identified and then controlled for, especially confounds that fluctuate with age. The cohort was assayed between the hours of 10 am and 3 pm. Since testosterone assay time was all over the place for both groups, you cannot draw evolutionary hypotheses from the results. Further, the cohort was a sample of 50 black and white college students—a small sample and a non-representative population. So it’s safe to disregard this hypothesis, given that blacks don’t have significantly higher testosterone levels than whites.

Another correlate used to argue that blacks have higher levels of testosterone is the higher rate of crime they commit. However, physical aggression has a low correlation with testosterone (Archer, 1991; Book et al, 2001) and thus cannot be the cause of crime. Furthermore, even the .14 correlation that Book et al (2001) found was inflated: Archer, Graham-Kevan, and Lowe (2005) showed in a reanalysis that Book et al (2001) included 15 studies that should have been omitted, and removing them reduced the correlation by almost half, to .08.

Other theories centering on testosterone have been developed to attempt to explain the racial crime gap (Ellis, 2017); however, the theory has large flaws, which the author rightly notes. Exposure to high levels of testosterone in utero supposedly causes a low 2d/4d digit ratio, and blacks apparently have the lowest (Manning, 2008). However, larger analyses show that Asians—mainly the Chinese—have a lower digit ratio than other ethnicities (Lippa, 2003; Manning et al, 2007).

Testosterone also does not cause PCa (Stattin et al, 2003; Michaud, Billups, and Partin, 2015). The more likely culprit is diet. Less exposure to sunlight along with low vitamin D intake (Harris, 2006; Rostand, 2010) is a large cause of the prostate cancer discrepancy between the races, since low vitamin D is linked to aggressive prostate cancer.

Even then, if there were, say, a 19 percent difference in testosterone between white and black Americans, as asserted by Rushton and Lynn, it wouldn’t account for the higher rates of crime, nor the higher acquisition of and mortality from PCa. If their three claims are false (higher levels of testosterone in African-Americans, larger penis size, and high levels of testosterone causing PCa), and they are, then this obliterates Rushton’s and Lynn’s theory.

Differential K theory, as noted above, has also been associated with the claim that black males have larger penises than white males, who in turn have larger penises than Asian males (Lynn, 2012). This is not true: there is no reliable data, and the data that does exist provides no evidence for the assertion. Lynn (2012) also used data from a website with unverified and nonexistent sources. In a 2015 presentation, Edward Dutton cites studies showing that, again, Negroids have higher levels of testosterone than Caucasoids, who have higher levels of testosterone than Mongoloids. Nevertheless, the claims by Dutton have been rebutted by Scott McGreal, who showed that population differences in androgen levels don’t mean anything and fail to validate the claims of Lynn and Rushton on racial differences in penis size.

r/K selection theory as an attempt at reviving the scala naturae

Finally, to get to the heart of the matter, Rushton’s erroneous attempt to apply r/K selection theory to the human races is an attempt at reviving the scala naturae concept proposed by Aristotle (Hodos, 2009). The scala naturae organizes living and non-living things on a scale from ‘highest’ to ‘lowest’. However, these assumptions are erroneous and have no place in evolutionary biology (Gould, 1996). Rushton (1997: 293) applied r/K selection theory to human populations to try to revive the concept of the scala naturae, as is clear from reading the very end of Race, Evolution, and Behavior.

This, of course, goes back to Rushton’s erroneous application of r/K selection theory to human races. He (and others) wrongly assert that Mongoloids are more K-selected than Africans, who are more r-selected, while Caucasians are in the middle, with it also being asserted that Mongoloids “are the most K evolved” (Lynn, 2012). However, if r/K selection theory were applicable to humans, Mongoloids would be r and Africans would be K. Rushton further attempts to provide evidence for this ‘evolutionary progress’ by citing Dale Russell (1983; 1989) and his thought-experiment Troodon, which he imagined would eventually have gained human-like bipedalism and a large brain. Rushton, however, doesn’t note that it was only one dinosaur that would supposedly have had human-like intelligence and mobility. Reptile brains, moreover, lie outside of mammalian design (Hopson, 1977: 443; Gould, 1989: 318), and so Russell’s theory is falsified.

This use of r/K selection theory as an attempt at bringing back the scala naturae may seem like an intuitive concept; some races/animals may seem more ‘advanced’ or ‘complex’ than others. However, Rushton’s application of r/K selection theory is incorrect (nor does the theory apply to humans), so any claims that Rushton—or anyone else—makes while invoking the theory can be disregarded, since he misused r- and K-selection.

In an attempt to “[restore] the concept of “progress” to its proper place in evolutionary biology,” Rushton (2004) proposed that g—the general factor of intelligence—sits atop a matrix of correlated traits that he claims shows why evolution is synonymous with ‘progress’, including how and why K organisms are so-called ‘more highly K evolved’—which is a sly attempt to revive the concept of the scala naturae. Rushton’s (2004) paper is largely copied and pasted from his 1997 afterword in Race, Evolution, and Behavior—especially the part about ‘progress in evolution’ (which has been addressed in depth).

As can be seen, Rushton attempted to revive the scala naturae by giving it a new name; his misuse of ecological theory to make it seem that evolution is synonymous with progress and that K organisms are ‘more evolved’ makes no sense in the context of how ecological theory is (or was) applied to organisms. Rushton’s theory would be correct if and only if he had applied r and K correctly to human races. He did not, so his claims, and any that follow from them, are immediately wrong. The claims by Rushton et al showing evolution to be ‘progressive’ have been shown to be demonstrably false, since evolution is local change, not ‘progress’ (Gould, 1989; 1996).


Rushton’s r/K selection theory has enamored many since he proposed it in 1985. He was relentlessly attacked in the media for his proposals about black penis size, testosterone, brain size, sexual frequency, etc. However, his explanation for said racial differences in behavior—his r/K selection theory—has been summarily rebutted for misapplying ecological theory and misunderstanding evolution (Anderson, 1991; Graves, 2002). Even ignoring his racial comparisons, his application of the theory would still be unacceptable, as he recognized neither agents of selection nor alpha-selection.

Rushton is wrong because

(i) he misapplied r/K selection in application to human races (Africans would be K, Mongoloids would be r; rule-following and intelligence can be selected for in either environment/with any of the agents of r- or K-selection),

(ii) he arbitrarily designated Africans as r and Mongoloids as K due to current demographic trends (the true application of r and K is described above, which Rushton showed no understanding of),

(iii) the races do not differ in levels of testosterone nor penis size,

(iv) testosterone does not cause prostate cancer nor does it cause crime, so even if there was a large difference between blacks and whites, it would not explain higher rates of PCa in blacks, nor would it explain higher rates of crime,

(v) the scala naturae is a long-dead concept no longer in use by evolutionary biologists, along with its cousin ‘evolutionary progress’, and Rushton’s use of r/K selection is an attempt at reviving both,

(vi) human races are not local populations, so his application of r/K selection to humans is erroneous.

Rushton was informed numerous times that he wrongly applied ecological theory to human populations. Yes, E.O. Wilson did say that if Rushton had noticed such variation in any other animal, ‘no one would have batted an eye’; however, that does not say a word about Rushton’s incorrect application of r/K selection to human races. No race of humans is more ‘highly evolved’ than another.

Anyone who uses Rushton’s theory as an explanation for observed data is using an incorrect, misapplied theory, and by extension their explanation is wrong as well. Rushton’s r/K theory is wrong, and people need to stop invoking it as an explanation for racial differences in behavior, politics, religion, and any other variable they can think of.

Self-domestication in humans

When animals are directly selected for reduced reactive aggression (domestication), either naturally or artificially, they are indirectly selected for other traits too, like depigmentation, floppy ears, shorter muzzles, smaller teeth, docility, smaller brains, more frequent estrous cycles, juvenile behavior and curly tails.


Some scientists believe that the decrease in human brain size that occurred over the last 10,000 years may have been an indirect effect of domestication, but there are two problems with this theory:

  1. Head size has rapidly rebounded over the 20th century (as has height), suggesting the brain size reduction during the Holocene was perhaps not an evolutionary change, but merely suboptimal nutrition caused by disruption of the healthy hunter-gatherer lifestyle in aspiring agriculturalists and the peoples they colonized.
  2. If humans did domesticate themselves, the evidence suggests it began hundreds of thousands of years ago, not merely in the Holocene, and yet brain size reduction only occurred in the latter.

How might domestication have occurred?  One theory is capital punishment, in which about 15% of the population (usually hyper-aggressive males who were bullying the rest of the tribe) was killed off in a “Revenge of the Nerds” scenario.

The fact that alpha males were such evolutionary losers is very humiliating and painful to commenters like “philosopher” who probably come from a long line of big husky rednecks, so they must convince themselves that nerds were selected for by masters looking for slaves, when in reality, nerds were the authors of their own evolutionary success, and simply murdered the bullies.

Because these alpha-male bullies tend to be very manly men, when their genes are removed, the tribe starts looking less like men and more like little boys.


little boy chimp (left); manly man chimp (right)

This may help explain why early humans looked more like manly chimps while later humans look more like baby chimps.  It may also explain why a lot of nerds act more like little boys than grown men, preferring to play video games or play chess, and watch Star Wars or Star Trek instead of pursuing money and sex.

But in the rare cases where nerds do pursue money (e.g. Bill Gates), they often slaughter the alpha-male competition in record time because they are so much smarter, particularly if they’re self-aware enough to start their own business instead of trying to climb the corporate ladder, which they often lack the charisma to do.

But this leads to a paradox.  If domestication reduces brain size and makes people more nerdy, why are nerds smart, and why is there no evidence of brain size reduction until the Holocene (and even that may simply be malnutrition) when other signs of domestication (facial size reduction) occurred hundreds of thousands of years earlier?

One possibility is that modern humans in general and nerds in particular, were shaped by two evolutionary forces:  One selecting for less reactive aggression (domestication) and the other selecting for intelligence, and the latter prevented brain size from shrinking.

For more information about this topic, please see the following video: