The incredible long-term stability of Wechsler IQ

For all the talk we hear about neuroplasticity, it seems IQ, at least as measured by the Wechsler, is incredibly stable.  According to a study by Erik Lykke Mortensen and his colleagues, there was an astonishing 0.89 correlation between WISC full-scale IQ measured at age 9.5, and WAIS full-scale IQ measured at 23.5 in a sample of 26 low birth-weight kids.  That’s absolutely colossal.  To put that in perspective, when a sample of 16-year-olds (n = 80) took the WAIS and WISC within an interval of one to six weeks, the correlation was 0.88 (WAIS-R manual, pg 48).

In other words, WISC IQ measured at age 9.5 predicts a young adult’s current WAIS IQ about as well as his WISC IQ measured a few weeks ago!

Wechsler IQ appears to be much more stable than even height!  For example, the correlation between adult height and height at age 13 was 0.7 in a sample of Copenhagen men.

And given the moderate to high correlation IQ has with everything from lifetime income to occupational status, its long-term stability is even more compelling.  A psychologist can give a 9-year-old an hour’s worth of silly games involving cartoon pictures, jigsaw puzzles, blocks, and funny riddles, and from that predict the trajectory of his life better than his teachers and parents.  They are modern-day prophets.  The cartoon drawings of black children on the WISC-R are their tarot cards.  Indeed the South Asian woman who gave me the WISC-R at age 12 even dressed like a fortune teller.


And yet they’re also scientists.  Intelligence researchers were the ones who invented correlation, factor analysis, and other techniques scientists in all fields depend on, and they invented IQ tests, one of the single most stable, predictive, and fascinating measures science has ever seen.

Of course you can dismiss the study I cited above because the sample was not large or representative enough, but Steve Hsu independently reported similar data from other tests:

From fig 4.7 in Eysenck‘s Structure and Measurement of Intelligence. This is using data in which the IQ was tested *three times* over the interval listed and the results averaged. A single measurement at age 5 would probably do worse than what is listed below. Unfortunately there are only 61 kids in the study.

age range          correlation with adult score

42,48,54 months    .55
5,6,7              .85
8,9,10             .87
11,12,13           .95
14,15,16           .95

The results do suggest that g is fixed pretty early and the challenge is actually in the measuring of it as opposed to secular changes that occur as the child grows up. That is consistent with the Fagan et al. paper cited above. But it doesn’t remove the uncertainty that a parent has over the eventual IQ of their kid when he/she is only 5 years old.

Another study of 141 adults found a near perfect 0.9 correlation between WAIS full-scale IQ measured at age 50 and 70 (a 20-year interval!).

Of course none of this conclusively proves IQ is as immutable as height or as solid as a rock.  It could be that society is forcing stability on the brain by giving mental stimulation only to those who show promise young.



Open thread April 22 to April 28, 2018

Here’s another video about the Ukrainian girl raised by dogs:

An extreme and unethical heritability study would be if dozens of twin pairs were separated at birth, with one twin randomly assigned to be raised by dogs and the other raised by Ivy League billionaires.  Then at age 30, the twins raised by Ivy League billionaires would be tested on the WAIS-IV and the dog-raised twins would be tested on a dog intelligence test (see video below).

The IQ gap between each twin and her co-twin would be colossal but it would be fascinating if a positive correlation was nonetheless observed.   It would also be interesting to see if a human raised by dogs would score substantially higher than dogs on a dog intelligence test.  Obviously they would if they were rescued and socialized, but what about before then?

Speaking of IQ, here’s Jordan Peterson discussing it again:


Genomic predictors & the heritability of height


Thanks to Louis Lello and his colleagues, we can now predict a person’s height just from his DNA, and these height predictors correlate about 0.64 with actual within-sex height.

Of course this correlation is based on a UK sample and, as Mug of Pee has long argued, when the environment is that narrow, you can’t be sure the genome is actually causing the height, or if some genomes just grow tall in particular countries for local reasons but would not have a height advantage elsewhere (see reaction norms vs independent genetic effects).

Thus I was heartened to learn that this genomic predictor was tested in environments as diverse as South Asia, China and Africa.  One of the study’s authors, Steve Hsu, writes:

Note, despite the reduction in power our predictor still captures more height variance than any other existing model for S. Asians, Chinese, Africans, etc.

So the predictive power falls below 0.64 as we move far from the country the predictor was created in, but “still captures more height variance than any other existing model”.

So how well does any other existing model do?  In their paper they write:

Recent studies using data from the interim release of the UKBB reported prediction correlations of about 0.5 for human height using roughly 100K individuals in the training

So this one genomic predictor correlates at least 0.5 with within-sex height all over the world, suggesting a truly causal relationship.  Squaring the correlation tells us that within-sex height heritability (in the most meaningful causal sense of the term) is at least 0.25.

But why only 0.25, when twin studies suggest heritabilities of roughly triple that?

One possibility is all the flaws in twin studies, but another possibility is that common additive genetic variants only account for a fraction of the heritability of height and other complex polygenic traits like IQ.  To find rare genetic variants you must look at the entire genome, but considering how expensive that is and how rare the rare variants are, few are willing to spend the money.  In addition, there may be non-additive gene-gene interactions and if these are sufficiently complex, they may never be found.

How much more heritability is left to be found?

Perhaps a study of cattle might provide a clue:

Common sequence variants captured 83%, 77%, 76% and 84% of the total genetic variance for fat, milk, and protein yields and fertility, respectively

If human height is anything like these cattle phenotypes, then maybe we’ve found about 80% of the total genetic variance, suggesting genomic predictors could go from explaining 25% of the within-sex height variance to about 31% (25% divided by 0.8), implying a predictive correlation of 0.56.
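The arithmetic in the last few paragraphs can be checked with a minimal sketch. The function names are made up for illustration, and the 80% figure is only the rough average of the cattle numbers quoted above, assumed here to carry over to human height.

```python
import math


def variance_explained(r):
    """Share of variance explained by a predictor with correlation r."""
    return r ** 2


def projected_correlation(current_r, fraction_captured):
    """Correlation implied if the current predictor captures only
    `fraction_captured` of the variance that could eventually be found."""
    full_variance = variance_explained(current_r) / fraction_captured
    return math.sqrt(full_variance)


current_r = 0.5            # floor on the predictive correlation used in the post
fraction_captured = 0.80   # assumption borrowed from the cattle study

current_var = variance_explained(current_r)        # 0.25
projected_var = current_var / fraction_captured    # 0.3125, i.e. about 31%
projected_r = projected_correlation(current_r, fraction_captured)

print(f"variance explained now: {current_var:.2f}")
print(f"projected variance explained: {projected_var:.4f}")
print(f"projected correlation: {projected_r:.2f}")
```

Changing `fraction_captured` shows how sensitive the projection is to the cattle analogy: the lower the share already captured, the higher the ceiling on the predictive correlation.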

And if genomic predictions can achieve that much precision for within sex height, they can likely do the same for IQ once they genotype a sufficiently large sample (one million people) taking a sufficiently valid test (the WAIS).

The IQ correlation of MZ twins reared TRULY apart

The famous Bouchard twin study found a potent 0.75 IQ correlation for MZ twins reared apart.  Note that the phenotype correlation for MZ twins reared apart is a direct estimate of broad sense heritability (H^2).

But even MZ twins raised apart may spend their early years together, grow up in similar homes, have contact in later life, and be self-selected for similarity.  For this reason, back in August 2014, commenter Mug of Pee cited a little known critic of twin studies named Susan Farber. Mug of Pee wrote:

Farber investigated and found the right figure for IQ’s h^2 in the US is more like 20%.

Clicking on the above NY Times article, the source for the 20% figure seems to be this paragraph (emphasis mine):

Defining “reared apart” poses another great difficulty for researchers of twins. Different studies have used different criteria – such as age of separation, frequency of encounters between the twins or knowledge of the other twin’s existence. Dr. Farber, in her original and synthesizing role, has turned this confusion into an advantage. She devised a mathematical index with which she could measure the degree of separateness and used this information to correct the correlations found between the I.Q. test scores of twins reared separately. So corrected, the calculated correlation between twins’ I.Q. scores fell from a modest degree of within-pair similarity (accounting for about one-half of the variance) to a much lower degree of similarity (accounting for one-fifth of the variance). In other words, on the average, the more separately the twins were reared, the greater the difference between their I.Q. scores.

Presumably, the statement “one-fifth of the variance” is where Mug of Pee got his 20% heritability statistic, but 20% seems to actually be a squaring of the correlation between MZ twins reared apart (to get the percentage of variance explained).  Taking the square root of 20% suggests that the corrected IQ correlation for MZ twins reared apart is about 0.45.

This is much smaller than the 0.75 heritability found in the Bouchard study, but it’s still pretty high when you consider that heritability itself is a square of the genotype-phenotype correlation.  Thus square rooting 0.45 implies a 0.67 correlation between genotype and IQ (among people raised in random homes).
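The chain of square roots above is easy to lose track of, so here is a minimal sketch of it. It takes the post's reading at face value: the "one-fifth of the variance" is treated as a squared twin correlation, and the twin correlation is treated as a heritability, which is an interpretation rather than something the NY Times paragraph states. The Bouchard line applies the same logic to the 0.75 figure for comparison.

```python
import math

# "One-fifth of the variance" from the quoted paragraph, read as r squared.
variance_share = 0.20

# Corrected IQ correlation for MZ twins reared apart (about 0.45).
twin_r = math.sqrt(variance_share)

# Treating that correlation as heritability, the genotype-IQ correlation
# is its square root (about 0.67).
genotype_iq_r = math.sqrt(twin_r)

# Same step applied to the Bouchard correlation of 0.75 (about 0.87).
bouchard_genotype_iq_r = math.sqrt(0.75)

print(f"corrected twin correlation: {twin_r:.2f}")
print(f"implied genotype-IQ correlation: {genotype_iq_r:.2f}")
print(f"Bouchard-based genotype-IQ correlation: {bouchard_genotype_iq_r:.2f}")
```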

So even after one of the biggest critics of twin studies corrects the data in a very biased way (according to her critics), genotype still predicts IQ about as accurately as SAT scores do (at least in countries like the U.S.).

The case of Isabelle

One of the most extreme cases from the annals of IQ research is Isabelle:

  • Isabelle was discovered living in a darkened room with her deaf-mute mother as her only contact.
  • When Isabelle was discovered she was almost seven years old and had no sense of language.
  • She had been deprived of learning how to speak because of her mother being both deaf and mute.
  • As a result, when authorities found her they believed that she was also deaf and mute like her mother, because she could only make noises.
  • This was proven wrong when she started to speak after receiving intense training.
  • When Isabelle was initially tested, at almost seven years old, her mental age was found to be about 19 months.
  • Within two months of being trained, Isabelle was putting together logical sentences.
  • Within a year she was already learning how to read.
  • While her IQ score was extremely low when she was found, at almost nine years old she was completely caught up with her peers and had a normal IQ.

The case is fascinating because when Isabelle was tested in 1938, being almost seven, she likely had a far bigger brain than the average 19 month old, yet scored the same as the average 19 month old.

Commenter Race Realist argues that IQ tests measure exposure to the culture, and he’s partly right because Isabelle’s lack of culture caused her development to be extremely delayed.

But what’s interesting is that it takes the average baby 19 months of culture to acquire the same level of skill that Isabelle acquired almost instantly.  This shows that IQ tests are not merely measuring cultural exposure, but the brain’s physical development, and being almost seven, Isabelle likely had a much bigger and more complex brain than the average baby.  Having the physical brain of a seven-year-old, Isabelle was able to learn in a few months what the smaller-brained average baby learns in 19 months, and in just 2 years she had acquired 7.5 years of childhood intellectual development.

Once she had caught up to her chronological peers at age 9, her progress became completely average because her neurological development no longer exceeded her intellectual development.

She graduated from high school an average student.

The question is: was Isabelle ever less intelligent than her chronological peers, or was the test simply culturally biased against a girl who had no language or culture for the first 6.5 years of life?  One wonders if on a more culture fair test, like the one the crow below is taking, she would have had an average IQ from the moment she was discovered.

Of course it’s worth noting that Isabelle was discovered young.  Not all cases of extreme deprivation end so well:

Was Simon Baron-Cohen questioned by a Pumpkin Person fan?

I was listening to an autism lecture by the great Simon Baron-Cohen and around the 49 minute mark, someone in the audience asks what Simon thinks of the research showing autism is a slow life history strategy while schizophrenia is a fast life history strategy.  Simon is unfamiliar with the research but agrees it sounds plausible.

The research the questioner is citing sounds like my June 3, 2014 article Autism, schizophrenia and social class, where I wrote:

I’ve come across some fascinating research showing that autism is more common in higher social classes and schizophrenia is more common in lower social classes.  In my opinion, this is because the higher social classes tend to be more nerdy (K selected) and the lower social classes tend to be more cool (r selected).  The higher classes are nerdy in that they are more educated, more monogamous, more scrawny, and less sexually active.  By contrast, the lower classes are “cool” because they are more blue collar, more muscular, more likely to get arrested, more into sex, drugs and rock ‘n’ roll.

Of course I’m not the first to speculate that autism might be linked to slow life history.  There was a 2001 web page arguing that autism may have been inherited from Neanderthals:

Under harsh conditions it’s advantageous to mature and grow slower. This means individuals can survive on fewer resources. A consequence of slower maturing is longer life. Jack Cuozzo shows that Neanderthals matured slower than us, and probably got older. Autistic children often develop according to another slower scheme than other children, and may continue to develop into their 30s. It is also believed that a key factor in ADHD might be slower mental maturation. Similar findings exists for schizophrenia

However this article did not make my point (made by Simon’s questioner) that schizophrenics have fast life histories, instead arguing they have slow life histories.  And I’ve never believed that autism was inherited from Neanderthals, though I have speculated it might partly be an evolutionary adaptation to extreme cold.  It may also be an adaptation to civilization as Philosopher has argued.

Commenters like Race Realist are constantly arguing IQ tests are pseudoscience, but as Jordan Peterson cleverly noted, if you reject IQ, then you have to reject all of psychology, because IQ is the best validated construct psychologists have ever come up with.

Race Realist argues that IQ tests are based on circular logic because tests are constructed so that people considered smart score well.  While that’s partly true, we’re now at the point where IQ tests can be constructed by wholly objective criteria such as the degree to which test items correlate with the general intelligence factor (g) derived from a factor analysis of a large battery of tests.

If Race Realist thinks IQ tests are circular and pseudoscientific, what does he think of AQ questionnaires (Autism Quotient measures)?  These tests have questions like “do you like numbers?” and then report that math majors are more autistic.  This seems much more circular to me than IQ tests.

I think the best way to study autism is to avoid the questionnaires and instead just look at the most extreme cases (like Rain Man) that everyone in every culture could immediately agree are autistic.  If autism is truly linked to STEM talent (as Simon Baron-Cohen argues) or slow life history (as I’ve argued), it should be evident in the non-autistic siblings and parents of people like Rain Man, who would regress to a milder (sub-clinical) variant of the condition.  On the other hand, if Rain Man’s relatives are just as likely to be bartenders as engineers, then autism is simply a disorder, and not a pathological extreme of normal (adaptive) variation.



Another Jordan Peterson video

Talks about how there are no jobs for people below 85 IQ and claims even a lot of lawyers will be out of work soon.

Most interesting thing he says is that people below IQ 80 take tens of hours to learn how to do a job most of his psychology students could learn in 10 minutes.  This is consistent with the theory once proposed by a member of Prometheus that complex learning/problem solving speed doubles every 5 or 10 IQ points.
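As a rough check on whether those two claims fit together, here is a minimal sketch of the implied doubling interval. Every number in it is an assumption made for illustration: "tens of hours" is read as 20 hours, and the psychology students are assumed to average about IQ 120, giving a 40-point gap from IQ 80. The function name is hypothetical.

```python
import math


def iq_points_per_doubling(slow_minutes, fast_minutes, iq_gap):
    """IQ points needed for learning speed to double, given two learning
    times and the IQ gap between the two groups."""
    doublings = math.log2(slow_minutes / fast_minutes)
    return iq_gap / doublings


slow_minutes = 20 * 60   # assumption: "tens of hours" read as 20 hours
fast_minutes = 10        # the 10 minutes quoted for the students
iq_gap = 40              # assumption: students near IQ 120 vs. IQ 80

interval = iq_points_per_doubling(slow_minutes, fast_minutes, iq_gap)
print(f"speed ratio: {slow_minutes / fast_minutes:.0f}x")
print(f"implied doubling interval: {interval:.1f} IQ points")
```

Under these assumptions the interval comes out a little under 6 IQ points, which falls inside the 5 to 10 point range of the Prometheus theory. A smaller assumed gap or a larger assumed time ratio would push it toward 5.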

On the other hand, why does research show that total vocabulary is normally distributed?  High verbal IQ people don’t have vocabs orders of magnitude greater than low verbal IQ people.  Maybe there just aren’t enough common words for such a pattern to emerge?

IQ stability vs neuroplasticity: Is intelligence like height or like muscle?

For years scientists in psychology and neurology have been pushing two rather contradictory ideas:

Psychologists:  Intelligence is like height.  Very stable and genetic, especially after puberty.  Aside from pathological cases like organic dementia or brain damage, the IQ you have in youth, you pretty much die with.  Even when low IQ kids are given the most extreme cultural enrichment imaginable, their real world intelligence doesn’t improve. See comments in below videos from Jordan Peterson:

Neurologists: Intelligence is like a muscle.  The more you exercise it, the stronger it becomes.  The brain is marvellously plastic.  Every time you learn a new skill you alter the chemical and physical structure of the brain.  It’s even possible to largely recover from brain damage.  See video below with Lara Boyd:

How can both of these views be true?

I think overall intelligence (what IQ tests try to measure) is almost as hard to change as height, but all the specific parts of intelligence (mental arithmetic, sense of direction, understanding irony) are like muscles that can be exercised.

Every time you exercise a part of your brain, that region gets bigger, just like every time you exercise a muscle, that muscle gets bigger.  One difference is, muscles are outside the skeleton, so they have room to expand indefinitely, but the brain is inside the skeleton, so its expansion is limited by cranial capacity.

Thus, the only way to make a part of the brain bigger is to make another part smaller.  So while you’re exercising your arithmetic IQ, your sense of direction IQ is slowly atrophying.  Start exercising your sense of direction IQ and your arithmetic IQ decays.

So while very specific parts of intelligence can be greatly improved, overall intelligence is limited by the size of the cranium and many other very finite resources.  So when you get a university degree, learn a new instrument, or acquire a new language, you haven’t actually made yourself much smarter overall, you’ve just reallocated cognitive resources from one ability to another.

So when low IQ children are adopted into extremely enriched environments, their IQs do shoot up, but the gain doesn’t much translate into real world intelligent behavior.  All they have done is invest their brain power in the abilities measured by the test; they haven’t actually increased the amount of brain power.  So when new learning challenges inevitably show up, they’re right back to where they were before the intervention.

Arthur Jensen referred to such IQ gains as “hollow with respect to g”, the general factor of intelligence.  He found for example that adoption into the upper class would improve the IQs of children from lower class homes, but the degree of improvement was uncorrelated with the g loadings of the specific tests.  It was hollow with respect to g, and he predicted that high IQ upper class adopted kids would not do as well in later life as their equally high IQ non-adopted siblings, because the former high IQ was hollow, while the latter was flowing with genetic g.

On the other hand, James Flynn argued g was irrelevant, citing the example of fetal alcohol syndrome (FAS) where IQ is obviously impaired with real world effects, but the degree of impairment is unrelated to g.  However I’d argue FAS is a pathological case, and thus not relevant to normal biological functioning.


