Why does shared environment affect some IQ tests more than others?



Using twin studies, scientists divide phenotypic variation into three categories: DNA variation, shared environmental variation, and unshared environmental variation. Shared environment comprises all the experiences MZ twins reared together have in common (same upbringing, same schools, same womb), while unshared environment comprises all the experiences they don’t share (position within the womb, getting hit on the head, having an inspiring teacher).

The best estimates, based on massive datasets, suggest that within Western democracies, DNA explains 41% of IQ variation at age 9, 55% at age 12, 66% at age 17, and 74% in adulthood. By contrast, shared environment explains 33% at age 9, 18% at age 12, 16% at age 17, and 10% in adulthood (Bouchard 2013, figure 2). That leaves unshared environment explaining 26% of the variation at age 9, 27% at age 12, 19% at age 17, and 16% in adulthood.

You don’t have to believe these associations are causal, but they are real. They’ve been more or less replicated using studies comparing (1) MZ twins with DZ twins, (2) MZ twins raised apart, and (3) unrelated people reared in the same home. Although all of these methods depend on different assumptions, they all converge on the same conclusion: the predictive power of DNA skyrockets from childhood to adulthood while the predictive power of shared environment plummets. The same pattern (known as the Wilson effect) has also been observed for other phenotypes and in other species.
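The twin-study decomposition described above can be sketched with Falconer’s classic formulas (a simplification of the model-fitting modern studies actually use). The correlations below are hypothetical, chosen only so the output roughly matches the adult figures quoted above:

```python
def falconer_ace(r_mz, r_dz):
    """Falconer's estimates from twin correlations: heritability (a2),
    shared environment (c2), and unshared environment plus error (e2)."""
    a2 = 2 * (r_mz - r_dz)   # additive genetic variance
    c2 = 2 * r_dz - r_mz     # shared environmental variance
    e2 = 1 - r_mz            # unshared environment + measurement error
    return a2, c2, e2

# Hypothetical adult IQ twin correlations: MZ = 0.84, DZ = 0.47
a2, c2, e2 = falconer_ace(0.84, 0.47)
print(round(a2, 2), round(c2, 2), round(e2, 2))  # 0.74 0.1 0.16
```

The three estimates sum to 1 by construction, which is why the percentages in any one age group add to 100.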

But why? Shouldn’t environment get more important as we age, since experience has more time to accumulate? One theory is that more and more genes become active as we age. A more popular theory is that we select environments that match our genotype, so environment becomes just a magnifier of genes, not a causal force in its own right: genetically smart people will stay in school, genetically strong people will lift weights and take steroids, and so on. People invest where they’re more likely to be rewarded.

But here’s where things get really interesting. The Wilson effect behaves differently on different types of IQ tests. In his book Does Your Family Make You Smarter?, James Flynn notes that cognitive inequality increases from childhood to later adulthood (because good genes cause good environments and bad genes cause bad environments, the smart get smarter and the dumb get dumber, relative to the average person their age), but this pattern is much more pronounced on some tests than others.

Flynn describes three types of tests:

  • Type 1: Tests that show large family effects (shared environment) that decay slowly. These include tests of vocabulary (define “rudimentary”), general knowledge (how old is the Earth?), verbal abstraction (how are a brain and a computer alike?) and social comprehension (why do you need a passport to travel?).
  • Type 2: Tests that show small family effects that decay fast. These include spatial manipulation (use these two triangles to make a square) and noticing incongruities (what’s missing or absurd in a picture of a common object or scene?).
  • Type 3: Tests that show large family effects that decay fast. These include tests of clerical speed and arithmetic.

Flynn argues that type 1 tests involve skills that children learn from observing their parents talk, hence the large family effect. By contrast he says of type 2 tests:

Aside from the occasional jigsaw puzzle, they have no part in everyday life. Children never see their parents performing these cognitive tasks as part of normal behavior. Family effects are weak, even among preschoolers. Since these subtests match environment with genetic potential so young, they would be an ideal measure (for, say, 5-year-olds) of genes for intelligence.

From pages 53-54 of Does Your Family Make You Smarter? by James Flynn

In other words, type 2 tests measure “novel problem solving”, while type 1 tests measure acquired abilities. A more provocative interpretation is that type 2 tests measure real intelligence, while type 1 tests merely measure knowledge and experience. This is the age-old distinction between aptitude tests vs achievement tests, culture fair vs culture loaded, fluid vs crystallized.

And yet Flynn largely rejects the Cattell-Horn-Carroll theory that fluid ability (novel problem solving) is invested to acquire crystallized ability (accumulated knowledge), writing:

…fluid skill is just as heavily influenced by family environment as the most malleable crystallized skill (vocabulary) and therefore, neither skill deserves to be called an investment and the other a dividend.

From page 132 of Does Your Family Make You Smarter? by James Flynn

Flynn of course is referring to the greatest irony in the history of psychometrics and the biggest mistake of Arthur Jensen’s career: the Raven Progressive Matrices (long worshiped by Jensen and Jensenistas as the most culture fair measure of pure intelligence ever invented) is a type 1 test!

Which of the 8 choices completes the above pattern? Image from Carpenter, P., Just, M., & Shell, P. (1990, July).

But let’s not throw the baby out with the bath water. There’s no need to abandon CHC investment theory just because a major test was mischaracterized. At the same time, it doesn’t feel right to reclassify the Raven as a crystallized test. Research is needed to understand why the Raven is so culturally sensitive when it superficially looks like a measure of novel problem solving. Is it measuring some kind of implicit crystallized knowledge we’re not conscious of, like familiarity with patterns, columns and rows, or reasoning through the process of elimination? Or are the family effects on the non-cognitive part of the test (having the motivation to persist and concentrate on such an abstract task)? Flynn argues that the brain is like a muscle, but if so, the Raven is an exercise most people have never done before, so why isn’t it a type 2 test?

Flynn might argue that if your family helped you with abstract problems in algebra or had philosophical discussions about hypothetical concepts, you’ve been exercising for the Raven all your life, but this seems like a bit of a stretch. All the research shows that cognitive training has narrow transfer (i.e. practicing chess will only make you slightly better at checkers, and not at all better at Scrabble), though perhaps the Raven’s uniquely abstract (general) nature allows it to slightly buck this trend.

The true distribution of intelligence



Modern IQ tests force test scores to fit a normal distribution, but I’ve long suspected the distribution of intelligence is anything but normal. After all, if you look at the distribution of wealth and income (a crude measure of intelligence), you find the richest people are worth millions of times more than the poorest. And that’s not just because they’re not paying their fair share, as Elizabeth Warren would have us believe. We see the same skewed distribution in academic output, with the most productive scientists publishing orders of magnitude more than the least productive.

A member of the Prometheus Society once hypothesized that the human mind works in parallel, so that complex problem solving speed doubles every 10 IQ points (he later suggested 5).
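Taken literally, this hypothesis implies an exponential mapping from IQ to problem-solving speed. A quick sketch (the function and the figures are purely illustrative of the hypothesis, not an endorsement of it):

```python
def speed_ratio(iq_high, iq_low, doubling_interval=10):
    """Relative problem-solving speed implied by the doubling hypothesis:
    speed doubles every `doubling_interval` IQ points."""
    return 2 ** ((iq_high - iq_low) / doubling_interval)

print(speed_ratio(110, 100))     # 2.0 — one doubling per 10 points
print(speed_ratio(110, 100, 5))  # 4.0 — two doublings under the revised 5-point interval
```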

And yet at the same time, Arthur Jensen seemed to believe intelligence was normally distributed, and in support of this he cited cognitive measures that form a natural scale with equal intervals and a true zero point, such as the total number of words a person knows or the number of digits they can repeat after one hearing, both of which were normally distributed. What was missing from Jensen’s examples, however, was complex on-the-spot problem solving.

Thanks to my research on how people today would score on the oldest Wechsler tests (the ancient WBI), I have some novel data on how long it takes people to solve visuo-spatial items. The WBI includes a subtest called Object Assembly where you have to fit a bunch of odd-shaped cardboard cutouts together to make a familiar object, and over the past decade or so, this was administered to a relatively random sample of White young adults (n = 17). One seven-piece item was easy enough that all 17 were able to complete it within the 3 minute time limit, yet hard enough that no one could solve it immediately.

Normally I wouldn’t show items from an actual IQ test, but the WBI is over 80 years old and Object Assembly is no longer part of current Wechsler subtests.

[update april 1, 2020, I decided to remove the photos to be safe, but it’s too bad because those were gorgeous photos I took]

When one test participant saw these cardboard cutouts being placed on the table, he apologized for being unable to contain his laughter. A painful reminder that my life’s work is considered a joke by much of the population. And yet for all their apparent absurdity, these silly little tests remain the crowning achievement of social science, with one’s score being largely destiny.

Even on this one item, there seemed to be a correlation between IQ and occupation/education. The time taken to complete the puzzle ranged from 14 seconds (a professional with a Masters degree in Engineering from a top Canadian university) to 137 seconds (a roofer with only a high school diploma). Below are the times of all 17 participants, ranked from fastest to slowest.

14, 18, 21, 30, 31, 33, 34, 35, 48, 58, 60, 65, 68, 69, 82, 89*, 137

Mean = 52 seconds, Standard Deviation = 30 seconds

In a normal distribution, 68% fall within one standard deviation of the mean. In this distribution, 71% fell within one standard deviation of the mean (22 to 82 seconds), which is pretty damn close. Also, in a normal distribution, 95% fall within two standard deviations (-8 to 112 seconds), and in my sample, 94% did (even closer!).
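These coverage figures are easy to reproduce from the raw times listed above. A quick check (using the population SD, which is what the quoted SD of 30 corresponds to):

```python
import statistics

# Completion times in seconds for the 17 participants listed above
times = [14, 18, 21, 30, 31, 33, 34, 35, 48, 58, 60, 65, 68, 69, 82, 89, 137]

mean = statistics.fmean(times)   # ~52 seconds
sd = statistics.pstdev(times)    # ~30 seconds (population SD)

within_1sd = sum(mean - sd <= t <= mean + sd for t in times)
within_2sd = sum(mean - 2 * sd <= t <= mean + 2 * sd for t in times)

print(round(mean), round(sd))          # 52 30
print(f"{within_1sd / len(times):.0%}")  # 71%
print(f"{within_2sd / len(times):.0%}")  # 94%
```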


So simply by picking an at least moderately g loaded novel problem that is both easy enough that no one gives up, yet hard enough that everyone is forced to think, and measuring performance on a natural scale (time taken in seconds), a normal curve emerges, though a somewhat truncated one (the slowest time is much further from the mean than the fastest, perhaps because human hands can only assemble puzzles so fast, regardless of how quick the mind is).

To convert from time in seconds to IQ, all one needs to do is set the natural mean of 52 seconds equal to the IQ mean of 100, and make each natural standard deviation (30 seconds) faster or slower than 52 seconds equal to 15 IQ points (the IQ standard deviation) above or below 100, respectively.

Thus the elite Masters degree in Engineering professional gets an IQ of 119 (14 seconds) and the high school only roofer gets an IQ of 58 (137 seconds). But note that even though IQ appears to be a true interval scale (meaning an X point gap between any two points on the IQ scale is equivalent), it is not a ratio scale, meaning IQs cannot be meaningfully multiplied. So even though IQ 119 is about twice as high as IQ 58, the difference in actual problem solving speed is about an order of magnitude. This is because, unlike height, weight and time in seconds to solve puzzles (which can be meaningfully multiplied), the IQ scale has no true zero point.
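The conversion just described is a simple linear rescaling. As a sketch:

```python
# Time-to-IQ conversion described above: 52 seconds maps to IQ 100,
# and each 30 seconds faster or slower is worth 15 IQ points.
MEAN_SECONDS = 52
SD_SECONDS = 30

def time_to_iq(seconds):
    """Convert puzzle completion time to an IQ on the normalized scale."""
    return 100 + 15 * (MEAN_SECONDS - seconds) / SD_SECONDS

print(time_to_iq(14))   # 119.0 — the engineer
print(time_to_iq(137))  # 57.5, i.e. about 58 — the roofer
print(137 / 14)         # ~9.8: roughly an order of magnitude in raw speed
```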

Of course the normal curve only applies to the biologically normal population, so it’s interesting to note that it’s now standard procedure to exclude pathological cases from IQ test norming samples. Indeed, one man was excluded from my sample after he told me that years ago he had suffered a stroke (quite unusual for a man in his thirties). This man struggled greatly with the above puzzle, only joining 25% of the cuts within the 3 minute time limit. The only way to estimate what his time would have been had he not given up is to divide 3 minutes by 25%, which gives 12 minutes (720 seconds). This is more than 22 standard deviations slower than the mean of the normal sample, and equivalent to an IQ of -234! Such extreme deviations remind us how sensitive the normal curve is to the normality of the sample.

*one person solved the puzzle in 67 seconds, but the ear pieces were reversed, so only 75% of the cuts were correctly joined. I thus considered this equivalent to a perfect performance at 75% of the speed (67 seconds/0.75 = 89 seconds).

Why didn’t megafauna go extinct in Africa?



Have you ever wondered why we have to go all the way to Africa for a safari? For Africa is the land of 13,000 lb elephants and 18 foot tall giraffes.

But what many do not realize is that 40,000 years ago, the whole world looked like an African safari. Eurasia was home to Pachystruthio dmanisensis, a flightless bird that stood 11.5 feet tall and weighed nearly 1,000 lbs.


North America was also home to the short-faced bear, which stood up to 14 feet tall, weighed about 1,700 lbs, and could run up to 40 miles per hour. And of course who could forget the 13,000 lb mammoth, which lived on every continent except Australia.

Emily Lindsey writes:

Scientists call these giant animals “megafauna” (mega = big, and fauna = animals). We still have megafauna in the world, but there used to be a whole lot more of it. In fact, it appears that having a large number of large-bodied animals in an ecosystem is actually the normal state for our planet, at least for the geologic era we are living in today, the Cenozoic (or “Age of Mammals”). But sometime in the past 50,000 years (very recent geologically), everywhere except for Africa, most of those large animals became extinct. And we still aren’t sure why!

Some scientists think megafauna survived in Africa because humans evolved there, so large animals had more time to adapt to us. However, members of the genus Homo have been living outside Africa for 2 million years, so Eurasian megafauna had time to adapt to us too. Another theory is that megafauna outside Africa were killed off by the extreme climate changes they endured.

But in asking why megafauna went extinct everywhere except Africa, politically correct scientists are forced to ignore the elephant in the room (pun intended): HBD. If Arthur Jensen was correct about the black-white IQ gap being genetic, perhaps Africans simply hadn’t evolved the intelligence to hunt large game.

But that can’t be the whole story. If racial differences in IQ evolved because we needed more intelligence to survive the non-tropics, how were Australian aboriginals (who retain a tropical phenotype) able to kill off 100% of their giant mammals? Migrating from Africa to Australia means their ancestors must have spent some time in the non-tropical ice age Middle East. Was this enough time for them to evolve the intelligence to hunt big game or was the big game in Australia simply easier to hunt because it had not had the time to evolve ways to avoid humans?

If cold climate selected humans were especially evolved for hunting big game, and if the big game on continents where humans had never been were especially bad at evading human predators, then these two factors predict the biggest megafauna massacre of all should have occurred in the Americas where both conditions were met: cold adapted hunters (humans entered the Americas from Siberia) entering a continent where humans had never been.

And indeed that seems to be the case. Paleo-biologist Rebecca Terry at Oregon State University says “pretty advanced weaponry was definitely present, and the extinctions in the New World in North America and South America were really extreme as a result.”

11,000 years ago (shortly after modern humans entered the New World), the average weight of a non-human mammal in North America was about 200 pounds compared to only 15 pounds today.

Northern American IQ: circa 1937 to circa 2014 (2nd edition)



The following article is an updated revision of an article I published in August 2019 about how 21st century Northern Americans score on an IQ test normed before the Second World War. The reason for the update is that in December 2019, the sample size of my study increased by 13% (from n = 15 to n = 17). I had originally hoped to collect more data before publishing an update, but with the uncertainty surrounding the coronavirus crisis, it’s unclear when that will be possible.

The Flynn effect, popularized by James Flynn, refers to the fact that IQ tests supposedly get easier with time. Although by definition the average IQ of American or British (white) people is always 100, the older the IQ test, the easier it is to score 100. Thus to keep the average at 100, tests like the Wechsler must be renormed every 10 years or so, otherwise the average IQ would increase by about 3 points per decade.

Although scholars continue to debate whether the Flynn effect reflects a genuine increase in intelligence (perhaps caused by prenatal nutrition or mental stimulation) or just greater test sophistication caused by modernity, there’s been remarkably little skepticism about the existence of the Flynn effect itself.

Malcolm Gladwell writes:

If an American born in the nineteen-thirties has an I.Q. of 100, the Flynn effect says that his children will have I.Q.s of 108, and his grandchildren I.Q.s of close to 120—more than a standard deviation higher. If we work in the opposite direction, the typical teen-ager of today, with an I.Q. of 100, would have had grandparents with average I.Q.s of 82—seemingly below the threshold necessary to graduate from high school. And, if we go back even farther, the Flynn effect puts the average I.Q.s of the schoolchildren of 1900 at around 70, which is to suggest, bizarrely, that a century ago the United States was populated largely by people who today would be considered mentally retarded.

While few people believe our grandparents were genuinely mentally retarded, it’s taken for granted that they would have scored in the mentally retarded range by today’s standards.
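Gladwell’s numbers follow from a simple back-projection at the canonical rate of about 3 points per decade. A minimal sketch (the rate and the generation gaps are illustrative assumptions, not measured values):

```python
RATE_PER_DECADE = 3  # the canonical Flynn effect rate

def projected_iq(years_before_present, current_mean=100):
    """Mean IQ of an earlier cohort, scored against today's norms,
    assuming a constant Flynn effect."""
    return current_mean - RATE_PER_DECADE * years_before_present / 10

print(projected_iq(60))   # 82.0 — roughly the grandparents' generation
print(projected_iq(100))  # 70.0 — schoolchildren of a century ago
```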

But is this true? I began having doubts over a decade ago when I examined the items on the first Wechsler intelligence scale ever made: the ancient WBI (Wechsler Bellevue intelligence scale). Meticulously normed on New Yorkers in the 1930s, this test remains far and away the most comprehensive look we have at early 20th century white Northern American intelligence, and while some of the subtests looked easy by today’s standards, others, especially vocabulary, looked harder.

The Kaufman effect

What also struck me was how little instruction, probing or coaching people got when taking the ancient WBI, compared to its modern descendant the WAIS-IV. This matters a lot because the way the Flynn effect is calculated on the Wechsler is by giving a new sample of people both the newest Wechsler and its immediate predecessor, in random order to cancel out practice effects, and then seeing which version they score higher on. If they average 3 points lower on the WAIS-IV normed in 2006 than on the WAIS-III normed in 1995, it’s assumed IQ increased by 3 points in 11 years.

The problem with this method (as Alan Kaufman may have discovered before me) is that the subset of the sample that took the newer version first has a huge advantage on the older version compared to the norming sample of the older test (over and above the practice effect which is controlled for), because the norming sample of the older test was never given coaching and probing.

Statistical artifact

A Promethean once said maybe the Flynn effect is just a statistical artifact of some kind. He never told me what he meant, but it got me thinking:

One problem with how the Flynn effect is calculated on the Wechsler is that it’s assumed that gains over time can be added. For example it’s assumed that you can add the supposed 7.8 IQ point gain from WAIS normings 1953.5-1978 to the 4.2 point gain from normings 1978-1995 and the 3.7 point gain from normings 1995-2006, for a grand total of 15.7 IQ points from normings 1953.5-2006.

This would make sense if he were talking about an absolute scale like height, but is problematic when talking about a sliding scale like IQ. For example, suppose the raw number of questions correctly answered in 1953.5 was 20 with an SD of 2. By 1953.5 standards, 20 = IQ 100 and every 2 raw points = 15 IQ points above or below 100. Now suppose in 1978, people averaged 22 with an SD of 1. That’s a gain of 15 IQ points by 1953.5 standards. Now suppose in 1995 people average 23 with an SD of 2. That’s a gain of 15 IQ points by 1978 standards. Adding the two gains together implies a 30 point gain from 1953.5 to 1995, but by both 1953.5 and 1995 standards, the difference is only about 23 points.
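The non-additivity in this example can be made concrete. A small sketch using the hypothetical raw-score norms above (these are not real WAIS figures):

```python
def iq_gain(old_mean, old_sd, new_mean):
    """IQ-point gain of a new cohort scored against an older norm."""
    return 15 * (new_mean - old_mean) / old_sd

g1 = iq_gain(20, 2, 22)    # 1953.5 -> 1978 gain: 15.0 points
g2 = iq_gain(22, 1, 23)    # 1978 -> 1995 gain:   15.0 points
print(g1 + g2)             # 30.0 — the naive sum of the two gains
print(iq_gain(20, 2, 23))  # 22.5 — the actual gain by 1953.5 norms
```

Because the SD shrank between normings, the two standardized gains live on different rulers and cannot simply be added.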

Changing content

Another problem with studying the Flynn effect is that the content of tests like the Wechsler is constantly changing. This is especially problematic when studying long-term trends in general knowledge and vocabulary. If words that were obscure in the 1950s become popular in the 1970s, then people in the 1970s will score high on the 1950s vocabulary test. Meanwhile the 1970s vocabulary test may contain words that don’t become popular until the 1990s. Thus adding the vocabulary gains from the 1950s to the 1970s to the gains from the 1970s to the 1990s might give the false impression that people in the 1990s would do especially well on a 1950s vocabulary test, when in reality many words from the 1950s may have peaked in the 1970s and be even more obscure in the 1990s than they were in the 1950s.

An ambitious study

Given the Kaufman effect, the statistical artifact, and changing content, I realized the only way to truly understand the Flynn effect is to take the oldest quality IQ test I could find and replicate its original norming on a modern sample.

In 2008 I made it my mission to replicate Wechsler’s 1935-1938 norming of the very first Wechsler scale. Ideally I should have flown to New York where Wechsler had normed his original scale, but if Wechsler could use white New Yorkers as representative of all of white America (WWI IQ tests showed white New Yorkers matched the national white average), I could use white Ontarians as representative of all of white Northern America (indeed white Americans and white Canadians have virtually the same IQs). The target age group was 20-34 because this was the reference age group Wechsler had used to norm his subtests.

It took over a decade but I was gradually able to arrange for 17 randomly selected white young adults to take the one hour test. They were non-staff recruited from about half a dozen fast food/ coffeehouse locations in lower to upper middle class urban and suburban Ontario. The final sample ranged in education from 9.5 years (early high school dropout) to 18 years (Masters Degree in Engineering from one of Canada’s top universities). The mean self-reported education level was 12.9 years (SD = 2.12) suggesting that despite the lack of female participants, the sample was fairly representative (the average Canadian over 25 has about 13 years of schooling); in cases where those below the age of 25 were in the process of finishing a degree, they were credited as having it.

Testing conditions were not optimal (environments were sometimes noisy; at least one person had a few beers before testing; another was literally falling asleep during the test) and 17 people is way too small a sample to draw statistically significant conclusions about 11 different subtests. One man with a conspicuously low score was removed from the sample after he stated that he had years ago suffered a stroke.

Nonetheless, the below table shows how whites tested in 2008 to 2019 compared to Wechsler’s 1935-1938 sample, with the last column showing the expected scores of the 21st century sample, extrapolating gains James Flynn calculated from 1953.5 to 2006 (see page 240 of his book Are We Getting SMARTER?) to the current study: circa 1937 to circa 2013.5.

Note: the 11 subtests were scaled to have a mean of 10 and an SD of 3 in the original young adult norming sample, while the verbal, performance and full-scale IQs were scaled to have a mean of 100 and an SD of 15. Note also that vocabulary is an alternate test, not used to calculate either verbal or full-scale IQ on the WBI. One third of my sample did not take Digit Symbol, so for these, Performance and full-scale IQs were calculated via prorating.

| Test | Nationally representative sample of young white adults (NY, 1935 to 1938) | Randomish sample of young white adults (2008 to 2019, ON, Canada) | Expected WBI scores in 2008-2019 based on Flynn’s calculated rate of increase |
|---|---|---|---|
| Information (general knowledge) | 10 (SD 3) | 8.41 (SD 2.55) | 12.3 |
| Similarities (verbal abstract reasoning) | 10 (SD 3) | 13.35 (SD 2.91) | 15.54 |
| Arithmetic (mental math) | 10 (SD 3) | 9.18 (SD 4.34)* | 11.02 |
| Vocabulary | 10 (SD 3) | 9 (SD 2.5) | 14.95 |
| Comprehension (common sense & social judgement) | 10 (SD 3) | 9.47 (SD 2.93) | 13.93 |
| Digit Span (attention & rote memory) | 10 (SD 3) | 9.71 (SD 2.63) | 11.46 |
| Picture Completion (visual alertness) | 10 (SD 3) | 10.71 (SD 3.1) | 14.52 |
| Picture Arrangement (social interpretation) | 10 (SD 3) | 10.24 (SD 2.73) | 13.35 |
| Block Design (spatial organization) | 10 (SD 3) | 13.12 (SD 3.31) | 12.91 |
| Object Assembly (spatial integration) | 10 (SD 3) | 11.82 (SD 1.89) | 14.06 |
| Digit Symbol (rapid eye-hand coordination) | 10 (SD 3) | 11.12 (SD 2.82)** | 14.66 |
| Verbal IQ | 100 (SD 15) | 103.8 (SD 14.73) | |
| Performance IQ | 100 (SD 15) | 109.3 (SD 12.11) | |
| Full-scale IQ | 100 (SD 15) | 106.9 (SD 13.63) | 122 |

*This subtest contained a unit conversion item that seemed biased against Canadians, so for those who advanced far enough to fail this item, scores were prorated; had they not been, the mean would have been 7.53 (SD 3.54).

**Only 12 of the 17 subjects took this subtest.


The Flynn effect is dramatically smaller than we’ve been led to believe, at least on tests of specific information that may become obscure over generations. By contrast, Similarities (abstract reasoning) and Block Design (spatial analysis) have indeed increased by amounts comparable to Flynn’s research. These two abilities may conspire to explain why some of the largest Flynn effects have been claimed on the Raven Progressive Matrices, an abstract reasoning test using a spatial medium.

It’s unclear if these are nutritional gains caused by increasing brain size, neuroplastic gains caused by cultural stimulation, or mere teaching to the test caused by schooling, computers and brain games.

Lynn (1990) argued the Flynn effect was caused by nutrition, citing twin research showing nutrition gains are more pronounced on Performance IQ (consistent with the Flynn effect). Research on identical twins (where one twin gets better prenatal nutrition than the other) has shown that by age 13, the well nourished twin exceeds his less nourished counterpart by about 0.5 SD on both head circumference and Performance IQ, but not at all on verbal IQ. Thus it’s interesting that 21st century young Northern American men exceed their WWII counterparts by about 0.5 SD on both head circumference (22.61″ vs 22.3″) and Performance IQ (109 vs about 100).

One possibility is that Performance IQ gains are entirely caused by improvements in the biological environment (prenatal health and nutrition), while verbal IQ gains are entirely caused by cultural advances (e.g. education), though somewhat negated by knowledge obsolescence.

Coronavirus thoughts

Loved ones keep begging me to go out for a walk and get some fresh air. They say it’s not good for me to be hiding in the house all day and that I need exposure to the sun. I went for a half hour walk yesterday but spent most of my time on the road because I didn’t want to pass other humans on the sidewalk. But even when no one else is around, I’m still not sure it’s safe to walk. What if this virus stays in the air, just waiting for someone to walk through it?

This whole thing could have been minimized if we had just stopped international travel a few weeks earlier. Virtually every case in my city was brought here by someone traveling abroad.

I wonder if our failure to contain this thing is evidence of dysgenics?

It’s during times like this that we see who the truly intelligent are: the people who can turn the situation to our advantage by making masks, test kits and ventilators, developing vaccines, and pushing common sense public policy.

The great writers, artists and philosophers may think they’re the most intelligent, but they’re not the ones who can save us.

Why did civilization occur when & where it did?



Commenter Melo writes:

An ad hoc assertion is essentially a hypothesis that has not been independently verified and is invoked to save a theory from falsification.

You’ve been doing that since I first started commenting here. To the point that the theory itself is just losing any real explanatory power it had. An example would be the statement that cold temperatures select for IQ but if the environment is too inhospitable, civilization will likely not emerge.

Actually it’s been independently verified over and over again and never contradicted. All six independent civilizations were created south of the Caspian Sea by peoples who had spent tens of thousands of years outside the tropics. It’s one of the least ad hoc theories you’ve ever heard of because it has not a single exception.


So even though anatomically modern humans have been around for about 195,000 years, all six civilizations emerged 6,000 to 3,500 years ago. Coincidence? No. We didn’t evolve the cognitive ability to build civilization until we left the tropics and were exposed to the ice age, and even those who had adapted to the ice age could only build civilization if they ended up in bountiful land (i.e. south of the Caspian Sea).

This perfectly explains both why civilization took so long, and why it was only created by certain peoples.

So because the IQ needed to build civilization evolved at high latitudes (cold), while the lands amenable to civilization were at low latitudes (warm), it took our species 189,000 years before we had the right kind of humans on the right kind of land during an interglacial period. These were humans who had migrated North but not too far North (i.e. Middle Easterners), or those who migrated as far North as the arctic circle, only to return to the tropics (i.e. the Mayans).

If cold winters select for high IQ, do they do so directly or indirectly?


In evolutionary biology, a spandrel is a phenotypic characteristic that is a byproduct of the evolution of some other characteristic, rather than a direct product of adaptive selection—–Wikipedia, March 15, 2020

Commenter Mug of Pee writes:

arctic peoples have large brains for a very NOT just so story reason; the exact same adaptation is found in arctic mammals and in ice age humans and it’s very simple, a big round head is more insulating than a small or long head, allen’s rule.

Arthur Jensen agrees writing:

climate also influenced the evolution of brain size apparently indirectly through its direct effect on head size, particularly the shape of the skull. Head size and shape are more related to climate than is the body as a whole. Because the human brain metabolizes 20 percent of the body’s total energy supply, it generates more heat in relation to its size than any other organ. The resting rate of energy output of the average European adult male’s brain is equal to about three-fourths that of a 100-watt light bulb. Because temperature changes in the brain of only four to five degrees Celsius are seriously adverse to the normal functioning of the brain, it must conserve heat (in a cold environment) or dissipate heat (in a hot environment). Simply in terms of solid geometry, a sphere contains a larger volume (or cubic capacity) for its total surface area than does any other shape. Conversely, a given volume can be contained in a sphere that has a smaller surface area than can be contained by a nonspherical shape, and less spherical shapes will lose more heat by radiation. Applying these geometric principles to head size and shape, one would predict that natural selection would favor a smaller head with a less spherical (dolichocephalic) shape because of its better heat dissipation in hot climates, and would favor a more spherical (brachycephalic) shape because of its better heat conservation in cold climates….

From The g Factor page 436

So even if cold climates didn’t require any extra intelligence to survive in, they did require more brain mass just to keep warm, and given the moderate causal correlation between IQ and brain size, they would have selected for intelligence indirectly as a byproduct of thermoregulation.
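The geometric claim here is easy to check numerically: for a fixed volume, a sphere has less surface area (and so radiates less heat) than any other shape. A minimal sketch, comparing a sphere against a cube of equal volume (the 1350 cm³ figure is an illustrative round number for adult brain volume, not a value from the text):

```python
import math

def sphere_area_for_volume(v):
    # Radius of a sphere holding volume v: v = (4/3) * pi * r^3
    r = (3 * v / (4 * math.pi)) ** (1 / 3)
    return 4 * math.pi * r ** 2

def cube_area_for_volume(v):
    # Side length of a cube holding volume v: v = s^3
    s = v ** (1 / 3)
    return 6 * s ** 2

v = 1350.0  # illustrative volume in cubic centimetres
print(f"sphere: {sphere_area_for_volume(v):.0f} cm^2")
print(f"cube:   {cube_area_for_volume(v):.0f} cm^2")
# The sphere exposes less surface for the same volume,
# so it loses less heat by radiation, whatever the volume chosen.
```

The ratio of the two areas works out to (π/6)^(1/3) ≈ 0.81 regardless of the volume, which is why the argument doesn't depend on the particular brain size assumed.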

There is also likely a causal correlation between IQ and brain sphericity (independent of size) because a sphere is the shape that minimizes the distance between neurons and thus presumably maximizes brain efficiency.

So it seems that not only could cold winters have selected for high IQ directly because of the intelligence needed to survive the cold, but they also may have selected indirectly via thermoregulation of brain size and brain shape.

The question for HBDers is: how do we test these three potential causes (direct selection for intelligence, brain size, and brain shape) to determine how big a role, if any, each played in population differences in IQ?

Did cold winters select for higher IQ? A reply to W. Buckner



Cold Winter Theory (CWT) is the theory that population differences in IQ are largely explained by the ancestral climates the peoples evolved in, with colder climates selecting for higher IQs because of the difficulty of figuring out how to build warm shelters, make warm clothes, create fire, get food, etc. Modern CWT can be credited to Richard Lynn, though the idea is so intuitive that it was independently inferred by multiple thinkers over the centuries.

Now if you don’t believe population IQ differences are genetic in origin, then you don’t need an evolutionary theory like CWT to explain the correlation between a population’s ancestral climate and their mean IQ; in theory it might be explained by non-genetic factors like parasite load.

But if you do believe they’re genetic, CWT is the obvious cause: hominoids have spent 25 million years adapting to the tropics, so the tropics may not have required as much novel problem solving as the Arctic, which we only encountered in the last 40,000 years. Among extant hunter-gatherers, the higher the latitude, the more diverse and complex the tool kit. There’s a reason people travel south for vacation, take vacation in the summer, and why many campgrounds close for the winter. It seems cold weather is generally more challenging than warm weather.

Nonetheless I’ve been alerted to yet another attempt to debunk CWT, this time by W. Buckner on a blog called TRADITIONS OF CONFLICT (hat-tip to MeLo & RR). Buckner makes three main arguments against CWT.

Argument 1: Climate can’t explain the low IQ of Bushmen because the Kalahari is sometimes cold.


While it’s true that temperatures can sink as low as 0°C in the Kalahari desert, this is nothing compared to the lows of -30°C in Ukraine, -52°C in Kazakhstan, and -68°C in Russia: three countries that make up the Pontic-Caspian steppe, the homeland of the Indo-Europeans, far and away the most successful language group on the planet, giving rise to nearly half of the world’s population. With their wits perhaps sharpened by millennia of surviving extreme cold, they domesticated the horse and brilliantly exploited the wheel, allowing their chariots to conquer almost everyone from Europe to India in record time.

Argument 2: Cold climates don’t require more intelligence to hunt because tropical people hunt too.

CWT claims that because plant foods are scarce in cold, high latitude places, people needed to be smart enough to cooperatively and strategically hunt large game, while tropical peoples could mindlessly pick berries all day. Buckner debunks this claim by noting that hunter-gatherers of all latitudes depend roughly equally on hunted animals for subsistence.


While Buckner might be correct that today even tropical hunter-gatherers depend as much on hunted animals as their Northern counterparts (at least on land animals; Northern hunter-gatherers do more fishing), this was likely untrue in the Paleolithic, when population differences were evolving.

Smithsonian Magazine writes:

Living in Eurasia 300,000 to 30,000 years ago…in places like the Polar Urals and southern Siberia—not bountiful in the best of times, and certainly not during ice ages. In the heart of a tundra winter, with no fruits and veggies to be found, animal meat—made of fat and protein—was likely the only energy source.

Further evidence that cold-climate Paleolithic peoples were more hunting-dependent than their tropical counterparts is the fact that the former likely drove the mammoth to extinction, while the tropical-dwelling elephant remains extant.

Argument 3: Cold climates don’t require more intelligence to make clothes because tropical tribes can make clothes too.

Part of CWT is that the need for warm clothing as humans migrated North selected for high intelligence because those lacking the cognitive ability to make such clothes quickly froze to death (or their babies did) leaving those with high IQ DNA as the survivors.

To counter this point, Buckner cites the elaborate costumes donned by the Bororo hunter-gatherers of Mato Grosso, Brazil during ceremonies, to argue that tropical people evolved just as much tailoring talent.

Anthropologist Vincent Petrullo is quoted:

The dancer was painted red with urucum and down pasted on his breast. His face was also smeared with urucum. Around his arms were fastened armlets made from strips of burity palm leaf, and his face was covered with a mask made of woman’s hair. The foreskin of the penis was tied with a narrow strip of burity palm leaf, for these men under their tattered European clothing still carry this string. A skirt of palm leaf strips was worn, and a jaguar robe was thrown over his shoulders. The skins of practically every species of snake to be found in the pantanal hung from his head down his back over the jaguar robe, which was worn with the fur on the outside. The inner surface of the hide was painted with geometric patterns, in red and black, but no one could explain the symbolism. A magnificent headdress consisting of many pieces, and containing feathers of many birds of the pantanal completed the costume with the addition of deerhoof rattles worn on the right ankle.

Bororo ceremony from Lowie (1963)


There are three problems with Buckner’s thesis:

Firstly, although the Bororo currently live in the tropics, they are descended from cold adapted people who crossed the Beringia land bridge from Siberia to present-day Alaska during the Ice Age, and then spread southward throughout the Americas over the following generations. Their tailoring skills may have evolved during those ancestral cold journeys.

Secondly, just because some members of the Bororo have elaborate tailoring skills does not mean these people on average have the tailoring skills of high-latitude hunter-gatherers. The existence of a few talented tropical tailors no more debunks the tailoring supremacy of high-latitude people than the existence of a few really tall women debunks the male height advantage.

Lastly, although the Bororo costume is elaborate, it mostly just consists of wearing many skins on top of one another and attaching lots of things to one’s body. While this is impressive, it is nowhere near the proficiency of making body-hugging clothes that cling to one snugly during a fierce winter. It reminds me a bit of cold nights when I throw more and more blankets on myself to feel warm. This never works as well as putting on a pair of tightly knit jogging pants and a figure-hugging sweater.

For all their pomp and circumstance, someone dressed only in ceremonial Bororo costume could expect frostbite in less than five minutes in an ice-age Russian winter. Clearly this is nowhere near solving the problem of warm clothing, and that’s because it makes no use of one of the most revolutionary inventions of all time.

According to journalist Jacob Pagano, “…researchers found that humans developed eyed sewing needles in what is now Siberia and China as early as 45,000 years ago.”

So crucial were eyed sewing needles that they are credited with allowing our species to out-survive the Neanderthals. A 2010 article in the Guardian describes archaeologist Brian Fagan’s view:

While Neanderthals shivered in rags in winter, humans used vegetable fibres and needles – created by using stone awls – to make close-fitting, layered clothing and parkas: the survival of the snuggest, in short.

Of course no one is suggesting that cold climate was the only cause of population IQ gaps (it certainly doesn’t explain the high IQs of Ashkenazi Jews, who largely descend from the warm Middle East). But it may help explain the more ancient differences between macro-level populations like Northeast Asians, West Eurasians, and those from the tropics. I find it interesting that IQ tests of spatial ability show larger gaps between humans from warm and cold regions than tests of verbal skill. This is the opposite of what a culture-bias explanation would predict, but it is consistent with CWT, since natural selection may have favored spatial ability in the cold for sewing, building shelters, making fires, hunting, etc.

Top horror-themed games you can enjoy on your mobile phone



Not written by Pumpkin Person

Horror video games have been scaring players for several decades now, and today that fear factor has moved into people’s pockets! While that might not be the best place for your fears to reside, mobile horror games are ruling the roost, and everyone is coming on board and having a gala time playing them! Furthermore, there is no denying that mobile gaming is on the rise like never before, with more and more people of all age groups becoming proud owners of high-end mobile phones. In this short article, we will acquaint you with some of the top-rated horror-themed games you can easily enjoy on your mobile phone today.


Distraint

Available for both iOS and Android, developed by Jesse Makkonen and priced around $4.99, Distraint takes the concept of a side-scrolling adventure and infuses plenty of new visual elements into it. The dialogue is concise and the puzzles are kept light, but the story is guaranteed to make you go bonkers. What sets this game apart from others in the genre is how creatively it uses sound. Played with headphones, abstract sounds and hissing TV screens move from one ear to the other as the visuals change along with the aural features. You’ll see insects dancing, walls bleeding, and lights flickering, delivering that scary feeling of desolation. The concept is among the smartest around today.

Five Nights at Freddy’s

Scott Cawthon, the developer of this game, has ensured that if creepy animatronic animals and jump scares are your cup of tea, this game will freak you out! Priced at a reasonable $2.99 and available for both iOS and Android, the game’s surreal-looking characters make it a standout in its genre. You control security cameras like a night watchman, which provides the interactivity. The aesthetics are quite like those of Child’s Play and Saw, so they are good enough to keep you on your toes! In fact, Scott has claimed that no one has been able to figure out the game’s real story yet! The fear in this game comes more from suspense than from blood or violence.

Sinister Edge

Developed by Everbyte and offered free of cost, this horror game for iOS and Android is part puzzler and part walking simulator, clearly inspired by Resident Evil and urban legends like Slenderman. The game does its trick by creating a very spooky atmosphere. You watch storms roll across the sky over a well-rendered mansion that screams ‘Don’t Enter’. A spectral masked antagonist pops up every now and then at random places, scaring the wits out of you as you explore the dark corridors in search of keys. It is also one of the best horror-themed games at using motion control in its puzzles; the tension heightens as you frantically twist and turn your device.

Charles Murray’s Philosophically Nonexistent Defense of Race in “Human Diversity”



[The following is a guest article written by RR. It does not necessarily reflect the views of Pumpkin Person]

Charles Murray published his Human Diversity: The Biology of Gender, Race, and Class on 1/28/2020. I have an ongoing thread on Twitter discussing it.

Murray talks of an “orthodoxy” that denies the biology of gender, race, and class. This orthodoxy, Murray says, consists of social constructivists. Murray is here to set the record straight. I will discuss some of Murray’s other arguments in his book later, but for now I will focus on the section on race.

Murray, it seems, has no philosophical grounding for his belief that the clusters identified in these genomic runs are races—he simply assumes that the groups that appear in these analyses are races. That assumption is unfounded, and asserting that the clusters are races without any sound justification actually undermines his claim that races exist. That is one thing that really jumped out at me as I was reading this section of the book. Murray discusses what geneticists say, but he does not discuss what any philosophers of race say. And that is to his downfall.

Murray discusses the program STRUCTURE, in which geneticists specify the number of clusters (K) they want before the DNA is analyzed (see also Hardimon, 2017: chapter 4). Rosenberg et al (2002) sampled 1056 individuals from 52 different populations using 377 microsatellites. They defined the populations by culture, geography, and language, not skin color or race. When K was set to 5, the clusters represented folk concepts of race, corresponding to the Americas, Europe, East Asia, Oceania, and Africa. (See Minimalist Races Exist and are Biologically Real.) Yes, the number of clusters that come out of STRUCTURE is predetermined by the researchers, but the clusters “are genetically structured … which is to say, meaningfully demarcated solely on the basis of genetic markers” (Hardimon, 2017: 88).
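To make the "analyst fixes K in advance" point concrete, here is a loose sketch of the idea. This is not STRUCTURE's actual algorithm (STRUCTURE is a Bayesian model of admixture and allele frequencies); plain k-means on synthetic marker data is used only to illustrate that once K is chosen, the algorithm will partition the sample into exactly K clusters, and well-separated groups fall out cleanly. All names and numbers below are my own invention for the example:

```python
import random

def dist2(a, b):
    # Squared Euclidean distance between two marker-frequency vectors
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(pts):
    # Component-wise mean of a list of vectors
    return tuple(sum(xs) / len(pts) for xs in zip(*pts))

def kmeans(points, k, iters=50, seed=0):
    # Minimal k-means: the analyst chooses k; the data never "vote" on it.
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: dist2(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = mean(members)
    return assign

# Two synthetic "populations" differing in simulated marker frequencies
rng = random.Random(1)
pop_a = [tuple(rng.gauss(0.2, 0.05) for _ in range(20)) for _ in range(40)]
pop_b = [tuple(rng.gauss(0.8, 0.05) for _ in range(20)) for _ in range(40)]
assign = kmeans(pop_a + pop_b, k=2)

# Each synthetic population should land entirely in one cluster
print(len(set(assign[:40])), len(set(assign[40:])))
```

Note that rerunning this with k=3 or k=4 would still produce 3 or 4 clusters of something, which is exactly the point RR presses against Murray: the appearance of K clusters is guaranteed by the method, so further criteria are needed before calling any of them races.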

Races as clusters

Murray then discusses Li et al, who set K to 7, at which point North Africa and the Middle East emerged as new clusters. Murray then provides a graph from Li et al:

So, Murray’s argument seems to be: “(1) If clusters corresponding to concepts of race appear in STRUCTURE and cluster analyses when K is set to 5–7, then (2) race exists. (1). Therefore (2).” Murray is missing a few things here, namely conditions (see below) that would place the clusters into racial categories. His assumption that the clusters are races—although (partly) true—is not bound by any sound reasoning, as can be seen from his partitioning of Middle Easterners and North Africans as separate races. Rosenberg et al (2002) showed the Kalash as a distinct cluster at K = 6; are they a race too?

No, they are not. Just because STRUCTURE identifies a population as genetically distinct does not entail that the population in question is a race, because the Kalash do not fit the criteria for racehood. The fact that the clusters correspond to major geographic areas means that the clusters represent continental-level minimalist races, so races, therefore, exist (Hardimon, 2017: 85-86). But to be counted as a continental-level minimalist race, a group must fit the following conditions (Hardimon, 2017: 31):

(C1) … a group is distinguished from other groups of human beings by patterns of visible physical features
(C2) [the] members are linked by a common ancestry peculiar to members of that group, and
(C3) [they] originate from a distinctive geographic location


…what it is for a group to be a race is not defined in terms of what it is for an individual to be a member of a race. What it means to be an individual member of a minimalist race is defined in terms of what it is for a group to be a race.

Murray (paraphrased): “Cluster analyses/STRUCTURE spit out these continental microsatellite divisions, which correspond to commonsense notions of race.” What is Murray’s logic for assuming that clusters are races? It seems there is no logic behind it—just “commonsense.” (See also Fish, below.) Having found no argument for accepting any particular number of clusters as the races Murray wants, I can only assume that Murray just chose the run that agreed with his notions and used it for his book. (If I am in error and there is an argument in the book, then maybe someone can quote it.) What kind of justification is that?

Compare Hardimon’s argument and definition. A race is:

… a subdivision of Homo sapiens—a group of populations that exhibits a distinctive pattern of genetically transmitted phenotypic characters that corresponds to the group’s geographic ancestry and belongs to a biological line of descent initiated by a geographically separated and reproductively isolated founding population. (Hardimon, 2017: 99)


Step 1. Recognize that there are differences in patterns of visible physical features of human beings that correspond to their differences in geographic ancestry.

Step 2. Observe that these patterns are exhibited by groups (that is, real existing groups).

Step 3. Note that the groups that exhibit these patterns of visible physical features corresponding to differences in geographical ancestry satisfy the conditions of the minimalist concept of race.

Step 4. Infer that minimalist race exists. (Hardimon, 2017: 69)

While Murray is right that clusters corresponding to the folk races appear at K = 5, you can clearly see that Murray assumes that ALL clusters would then be races, and this is where the philosophical emptiness of Murray’s account comes in. Murray has no criteria for his belief that the clusters are races; common sense is not good enough.

Philosophical emptiness

Murray then lambasts the orthodoxy for claiming that race is a social construct.

Advocates of “race is a social construct” have raised a host of methodological and philosophical issues with the cluster analyses. None of the critical articles has published a cluster analysis that does not show the kind of results I’ve shown.

Murray does not, however, discuss a more critical treatment of Rosenberg et al (2002): Mills (2017), Are Clusters Races? A Discussion of the Rhetorical Appropriation of Rosenberg et al’s “Genetic Structure of Human Populations.” Mills (2017) discusses the views of Neven Sesardic (2010)—a philosopher—and Nicholas Wade—a science journalist and author of A Troublesome Inheritance (Wade, 2014). Both Wade and Sesardic are what Kaplan and Winther (2014) term “biological racial realists,” whereas Rosenberg et al (2002), Spencer (2014), and Hardimon (2017) are bio-genomic/cluster realists. Mills (2017) discusses the “misappropriation” of the bio-genomic cluster concept due to the “structuring of figures [and] particular phrasings” found in Rosenberg et al (2002). Wade and Sesardic shifted from bio-genomic cluster realism to their own hereditarian stance (biological racial realism; Kaplan and Winther, 2014). While this is not a blow to the positions of Hardimon and Spencer, it is a blow to Murray et al’s conception of “race.”

Murray (2020: 144)—rightly—disavows the concept of folk races but wrongly accepts the claim that we should dispense with the term “race”:

The orthodoxy is also right in wanting to discard the word race. It’s not just the politically correct who believe that. For example, I have found nothing in the genetics technical literature during the last few decades that uses race except within quotation marks. The reasons are legitimate, not political, and they are both historical and scientific.

Historically, it is incontestably true that the word race has been freighted with cultural baggage that has nothing to do with biological differences. The word carries with it the legacy of nineteenth-century scientific racism combined with Europe’s colonialism and America’s history of slavery and its aftermath.


The combination of historical and scientific reasons makes a compelling case that the word race has outlived its usefulness when discussing genetics. That’s why I adopt contemporary practice in the technical literature, which uses ancestral population or simply population instead of race or ethnicity …

[Murray also writes on pg 166]

The material here does not support the existence of the classically defined races.

(Never mind the fact that Murray’s and Herrnstein’s The Bell Curve was highly responsible for bringing “scientific racism” into the 21st century—despite Murray’s protestations that his work isn’t “scientifically racist.”)

In any case, we do not need to dispense with the term race. We only need to deflate it (Hardimon, 2017; see also Spencer, 2014). Rejecting claims from those termed biological racial realists by Kaplan and Winther (2014), both Hardimon (2017) and Spencer (2014, 2019) deflate the concept of race—that is, their concepts only discuss what we can see, not what we can’t. Their concepts are deflationist in that they take the physical differences from the racialist concept and reject the psychological assumptions. Murray, in fact, is giving in to this “orthodoxy” when he says that we should stop using the term “race.” It’s funny: Murray cites Lewontin (an eliminativist about race) and advocates eliminating the word while still keeping the underlying “guts” of the concept, if you will.

We should take the term “race” out of our vocabulary if, and only if, the concept does not refer. For us to drop “race” from our vocabulary, it would have to refer to nothing. But “race” does refer—to proper names for a set of human population groups, and to social groups too. So why should we get rid of the term? There is absolutely no reason to do so. But we should be eliminativist about the racialist concept of race—which needs to exist if Murray’s concept of race holds.

There is, contra Murray, material that corresponds to the “classically defined races.” This can be seen from Murray’s admission that he read only the “genetics technical literature.” He does not say that he read any philosophy of race on the matter, and it clearly shows.

To quote Hardimon (2017: 97):

Deflationary realism provides a worked-out alternative to racialism—it is a theory that represents race as a genetically grounded, relatively superficial biological reality that is not normatively important in itself. Deflationary realism makes it possible to rethink race. It offers the promise of freeing ourselves, if only imperfectly, from the racialist background conception of race.

Spencer (2014) states that the population clusters found in Rosenberg et al’s (2002) K = 5 run are the referents of the racial terms used by the US Census. “Race terms,” to Spencer (2014: 1025), are “a rigidly designating proper name for a biologically real entity …” Spencer’s (2019b) position is now “radically pluralist.” Spencer (2019a) states that the set of races in OMB (Office of Management and Budget) race talk is one of many forms “race” can take when talking about race in the US; that the set of races in OMB race talk is the set of continental human populations; and that the continental set of human populations is biologically real. So “race” should be understood as a set of proper names—we should only care whether our terms refer or not.

Murray’s philosophy of race is philosophically empty—Murray just uses “common sense” to claim that the clusters found are races, which is clear from his claim that Middle Eastern and North African peoples constitute two more races. This is perhaps better than Rushton’s three-race model, but not by much. In fact, Murray’s defense of race seems almost just like Jensen’s (1998: 425) definition, which Fish (2002: 6) critiqued:

This is an example of the kind of ethnocentric operational definition described earlier. A fair translation is, “As an American, I know that blacks and whites are races, so even though I can’t find any way of making sense of the biological facts, I’ll assign people to my cultural categories, do my statistical tests, and explain the differences in biological terms.” In essence, the process involves a kind of reasoning by converse. Instead of arguing, “If races exist there are genetic differences between them,” the argument is “Genetic differences between groups exist, therefore the groups are races.”

So, even two decades later, hereditarians are STILL just assuming that race exists WITHOUT arguments and definitions/theories of race. Rushton (1997) did not define “race” and just assumed the existence of his three races—Caucasians, Mongoloids, and Negroids; Levin (1997), too, just assumes their existence (Fish, 2002: 5). Lynn (2006: 11) uses a similar argument to Jensen’s (1998: 425). Since the concept of race is so important to the hereditarian research paradigm, why have they not operationalized a definition instead of relying on the assumption that race exists without argument? Murray can now join the list of colleagues who assume the existence of race sans definition/theory.


Hardimon’s and Spencer’s concepts get around Fish’s (2002: 6) objection—but Murray’s doesn’t. Murray simply claims that the clusters are races without really thinking about it or providing justification for his claim. Philosophers of race (Hardimon, 2017; Spencer, 2014, 2019a, b), on the other hand, have provided sound justification for belief in race. Murray is also not fair to the social constructivist position (good accounts can be found in Zack (2002), Hardimon (2017), and Haslanger (2000)). Murray seems to be one of those “Social constructivists say race doesn’t exist!” people, but this is false: social constructs are real, and the social can and does have potent biological effects. Social constructivists are realists about race (Spencer, 2012; Kaplan and Winther, 2014; Hardimon, 2017), contra Helmuth Nyborg.

Murray (2020: 17) asks, “Why me? I am neither a geneticist nor a neuroscientist. What business do I have writing this book?” If you are reading this book for a fair—philosophical—treatment of race, look to actual philosophers of race and not to Murray et al, who, as shown, have no definition of race and just assume its existence. Spencer’s Blumenbachian Partitions and Hardimon’s minimalist races are how we should understand race in American society, not philosophically empty accounts.

Murray is right—race exists. Murray is also wrong—his kinds of races do not exist. Murray is right, but he doesn’t give an argument for his belief. His “orthodoxy” is also right about race—since we should accept pluralism about race, there are many different ways of looking at race, what it is, its influence on society, and how society influences it. I would rather be wrong and have an argument for my belief than be right and appeal to “common sense” without an argument.