Imagine a pushy mother really wants her average 9.5-year-old son to get into a gifted class. Because IQ tests are normed for age and her son looks young for his age, she decides to tell the psychologist he’s only 6.5, so that he will get a much higher score.

Now assuming the boy is average for a 9.5-year-old in all cognitive domains, he should score an age-ratio IQ of 146 in all domains if the psychologist believes he’s 6.5 (because 9.5 is 146% of 6.5). However, modern IQ tests use the deviation scale, where scores are assigned not by how many years advanced a kid is, but by his rank relative to other American kids in his own age group. Being in the top 99% gives one an IQ of 65, the top 90% an IQ of 80, the top 50% an IQ of 100, the top 10% an IQ of 120, the top 1% an IQ of 135, the top 0.1% an IQ of 146, and so on.
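To make the two scoring methods concrete, here is a minimal Python sketch. It assumes the deviation scale is a normal curve with mean 100 and standard deviation 15 (the usual Wechsler convention); the function names `ratio_iq` and `deviation_iq` are my own labels for illustration, not anything from a test manual.

```python
from statistics import NormalDist


def ratio_iq(mental_age: float, chronological_age: float) -> int:
    """Age-ratio IQ: mental age as a percentage of (claimed) chronological age."""
    return round(100 * mental_age / chronological_age)


def deviation_iq(top_fraction: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """IQ of someone in the top `top_fraction` of his age group.

    Assumes scores are normally distributed with the given mean and SD.
    """
    return NormalDist(mean, sd).inv_cdf(1.0 - top_fraction)


# An average 9.5-year-old scored as if he were 6.5.
print("ratio IQ:", ratio_iq(9.5, 6.5))

# The rank-to-IQ ladder described in the text.
for top in (0.99, 0.90, 0.50, 0.10, 0.01, 0.001):
    print(f"top {top:.1%}: deviation IQ {deviation_iq(top):.0f}")
```

Under that normal-curve assumption, the printed values should land within about a point of the rounded figures quoted above.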

Now obviously the average 9.5-year-old pretending to be a 6.5-year-old will get a high IQ in every domain, but his margin of superiority varies dramatically depending on the test. If he had taken the WISC-R, for example (before the norms expired), here’s how he’d have scored (for the reader’s convenience, I converted subtest scores, normally scaled from 1 to 19, into IQ equivalents):

| Test | IQ based on age ratio method: (MA 9.5 / CA 6.5) × 100 | Deviation IQ | Deviation IQ corrected for reliability |
|---|---|---|---|
| Information (general knowledge test) | 146 | 130 | 137 |
| Similarities (verbal abstract reasoning) | 146 | 120 | 122 |
| Arithmetic (mental math) | 146 | 140 | 145 |
| Vocabulary | 146 | 138 | 144 |
| Comprehension (common sense & social judgement) | 146 | 130 | 136 |
| Digit Span (attention & rote memory) | 146 | 120 | 123 |
| Picture Completion (visual alertness) | 146 | 120 | 122 |
| Picture Arrangement (social interpretation) | 146 | 120 | 123 |
| Block Design (spatial organization) | 146 | 120 | 122 |
| Object Assembly (spatial integration) | 146 | 115 | 117 |
| Mazes (visual planning) | 146 | 120 | 122 |
| **Verbal IQ** | **146** | **141** | **143** |
| **Performance IQ** | **146** | **128** | **129** |
| **Full-scale IQ** | **146** | **139** | **140** |

Because more reliable tests show larger correlations with age, the fourth column corrects deviation IQs for reliability. This was done by dividing the deviation from the mean by the square root of the reliability at the age the boy is claiming to be (6.5). Note that Digit Span and Mazes are optional tests not used to calculate the composite IQs (in bold) unless they are substituted for a core test. Digit Span was not used to calculate any composite score, but Mazes was substituted for one of the core tests (Coding) because Coding is not the same test at all ages.
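The two conversions behind the table can be sketched as follows. The scaled-score step assumes the standard Wechsler subtest metric (mean 10, SD 3). The reliability of 0.66 is purely illustrative: it is back-solved from the Information row (130 becoming 137), not taken from the WISC-R manual, and the function names are my own.

```python
import math


def scaled_to_iq(scaled: float) -> float:
    """Convert a subtest scaled score (mean 10, SD 3) to the IQ metric (mean 100, SD 15)."""
    return 100 + 15 * (scaled - 10) / 3


def correct_for_reliability(iq: float, reliability: float) -> float:
    """Divide the deviation from the mean by the square root of the reliability."""
    return 100 + (iq - 100) / math.sqrt(reliability)


def implied_reliability(iq: float, corrected_iq: float) -> float:
    """Back-solve the reliability that would turn `iq` into `corrected_iq`."""
    return ((iq - 100) / (corrected_iq - 100)) ** 2


# Illustrative only: a scaled score of 16 is 2 SD above the mean, i.e. IQ 130.
iq = scaled_to_iq(16)

# Hypothetical reliability, chosen to reproduce the Information row.
ASSUMED_RELIABILITY = 0.66
print(iq, round(correct_for_reliability(iq, ASSUMED_RELIABILITY)))

# Going the other way: which reliability does the Information row imply?
print(round(implied_reliability(130, 137), 2))
```

Because the correction divides by a number below 1, it always pushes a score further from 100, and it pushes hardest on the least reliable subtests.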

Now after adjusting for test reliability, our gifted 6.5-year-old (who is secretly an average 9.5-year-old) had deviation IQs ranging from 117 (Object Assembly) to 145 (Arithmetic).

Why such a huge discrepancy? The most obvious answer is that 9.5-year-olds have two cognitive advantages over 6.5-year-olds. Not only are their brains bigger and more developed, but they’ve also had three extra years of life experience. On tests that require novel problem solving, their advantage will be modest because it only reflects their neurological superiority. It cannot reflect their life experience advantage because, by definition, novel problems are things we’ve had little experience with.

By contrast, on tests that require acquired knowledge they have two advantages. Not only does the 9.5-year-old have more neurological ability to reason arithmetically, learn and remember facts, and infer the meaning of words, but he’s also had three extra years to acquire number concepts, general knowledge, and vocabulary. Because “crystallized” tests require both neurological ability and experience, they show a much steeper age progression than fluid tests, which require only neurological ability (beyond some basic experience threshold that virtually all Americans reach by age 5 or so). Fluid tests, in turn, show a much steeper decline in old age, though this is confounded with the Flynn effect. Indeed, looking at the age progression is a good way to quantify crystallized vs fluid.
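As a rough illustration of using age progression to place subtests on a crystallized-to-fluid continuum, the sketch below ranks the reliability-corrected scores from the first table by how many points the three extra years bought. The ranking is only as good as the table's rounded values, and `age_advantage` is a name I made up for this example.

```python
# Reliability-corrected deviation IQs for the average 9.5-year-old scored as a 6.5-year-old.
corrected_iq = {
    "Information": 137,
    "Similarities": 122,
    "Arithmetic": 145,
    "Vocabulary": 144,
    "Comprehension": 136,
    "Digit Span": 123,
    "Picture Completion": 122,
    "Picture Arrangement": 123,
    "Block Design": 122,
    "Object Assembly": 117,
    "Mazes": 122,
}


def age_advantage(scores: dict[str, int]) -> list[tuple[str, int]]:
    """Return (subtest, points above 100) pairs, largest advantage first."""
    advantages = ((name, iq - 100) for name, iq in scores.items())
    return sorted(advantages, key=lambda pair: pair[1], reverse=True)


ranking = age_advantage(corrected_iq)
for name, points in ranking:
    print(f"{name:<20} +{points}")
```

On this reading, subtests near the top of the list behave as more crystallized at this age and those near the bottom as more fluid.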

Notice also that the tests that show the biggest age effects also tend to be the ones that show the biggest family effects per James Flynn’s method, because both reflect experience.

Now let’s imagine the pseudo-gifted boy has turned 16.83 and wants to get into Mensa, which is unwilling to accept scores from the past. He is still exactly average in all domains for his true age, but, still looking young, he tells the psychologist he is only 10.5. She doesn’t buy it, but she’s not about to turn down $800 for a few hours’ work, so she plays along.

Here are his results:

| Test | IQ based on age ratio method: (MA 16.83 / CA 10.5) × 100 | Deviation IQ | Deviation IQ corrected for unreliability |
|---|---|---|---|
| Information (general knowledge test) | 160 | 133 | 136 |
| Similarities (verbal abstract reasoning) | 160 | 130 | 134 |
| Arithmetic (mental math) | 160 | 120 | 123 |
| Vocabulary | 160 | 138 | 141 |
| Comprehension (common sense & social judgement) | 160 | 130 | 136 |
| Digit Span (attention & rote memory) | 160 | 110 | 112 |
| Picture Completion (visual alertness) | 160 | 120 | 124 |
| Picture Arrangement (social interpretation) | 160 | 115 | 118 |
| Block Design (spatial organization) | 160 | 125 | 127 |
| Object Assembly (spatial integration) | 160 | 125 | 131 |
| Mazes (visual planning) | 160 | 118 | 122 |
| **Verbal IQ** | **160** | **139** | **141** |
| **Performance IQ** | **160** | **130** | **132** |
| **Full-scale IQ** | **160** | **139** | **140** |

As before, because more reliable tests show larger correlations with age, the fourth column corrects deviation IQs for reliability. This was done by dividing the deviation from the mean by the square root of the reliability at age 10.5.

Once again, his years of extra life experience (relative to the age group he’s pretending to be) gave him a huge advantage on knowledge-based tests like Vocabulary and Information. And once again, novel tasks like Block Design, Picture Arrangement, Mazes, and Digit Span showed less advantage.

But what happened to Arithmetic? His advanced age gave him a huge advantage when he was a 9.5-year-old pretending to be 6.5, but now that he is 16.83 pretending to be 10.5, this subtest is even less age-dependent than Block Design. The likely explanation is that once kids acquire basic number concepts in school (addition, subtraction, multiplication, and division), arithmetic depends less on experience and more on neurology.

This is why one can never say, categorically, that a given test measures fluid or crystallized ability. It depends on the population taking it. For example, Raven’s Progressive Matrices might be a fluid test within generations but a crystallized test between generations. Even something as seemingly crystallized as the math SAT might measure fluid ability in 17-year-olds with at least four years of advanced math. Among all 17-year-olds, it will still correlate with (but not so much be directly caused by) fluid ability, because those with more fluid ability are likely to take advanced math in the first place.

Another interesting case is Similarities, which requires one to infer the link between common things (how are chess and Scrabble alike?). This showed small age effects in early childhood, but large age effects in later childhood. At the lower end this test is just about the ability to see associations, but at the higher end it becomes increasingly dependent on diction and sometimes esoteric concepts, making it more experience-dependent.
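The shifts discussed for Arithmetic and Similarities can be read straight off the two tables. The sketch below subtracts each subtest's reliability-corrected score in the first scenario (9.5 posing as 6.5) from its score in the second (16.83 posing as 10.5). Bear in mind that the two scenarios involve different age gaps, so the differences are only a rough guide, and the variable names are my own.

```python
# Reliability-corrected deviation IQs from the two tables above.
younger = {  # average 9.5-year-old scored as a 6.5-year-old
    "Information": 137, "Similarities": 122, "Arithmetic": 145, "Vocabulary": 144,
    "Comprehension": 136, "Digit Span": 123, "Picture Completion": 122,
    "Picture Arrangement": 123, "Block Design": 122, "Object Assembly": 117,
    "Mazes": 122,
}
older = {  # average 16.83-year-old scored as a 10.5-year-old
    "Information": 136, "Similarities": 134, "Arithmetic": 123, "Vocabulary": 141,
    "Comprehension": 136, "Digit Span": 112, "Picture Completion": 124,
    "Picture Arrangement": 118, "Block Design": 127, "Object Assembly": 131,
    "Mazes": 122,
}

# Positive = the age advantage grew in later childhood; negative = it shrank.
shift = {name: older[name] - younger[name] for name in younger}

for name, delta in sorted(shift.items(), key=lambda pair: pair[1]):
    print(f"{name:<20} {delta:+d}")
```

Arithmetic shows the largest drop and Similarities one of the larger gains, which is the pattern the last few paragraphs describe.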

One question is why, if crystallized tests are more culture dependent, they often load more on psychometric g (the general factor of IQ tests, believed to be a property of the physical brain). Perhaps, as some have suggested, once you control for age, in countries where everyone has opportunity for schooling, knowledge tests measure both the ability to learn over an entire lifetime and the ability to store and retrieve a lifetime of learning. By contrast, fluid tests only measure the ability to learn in the testing room, and not the ability to store and retrieve it years later. One theory is that the more parts of the brain a test samples, the more g loaded it is, which would make sense if g is just overall cognition.