Pumpkin Person

~ The psychology of horror

Tag Archives: James Flynn

Why does shared environment affect some IQ tests more than others?

04 Saturday Apr 2020

Posted by pumpkinperson in Uncategorized

≈ 131 Comments

Tags

Arthur Jensen, Cattel-Horn-Caroll, Does Your Family Make You Smarter?, fluid vs crystallized, heritability, IQ, James Flynn, Raven Progressive Matrices, shared environment

Using twin studies, scientists divide phenotypic variation into three categories: DNA variation, shared environmental variation, and unshared environmental variation. Shared environment is all the experiences that MZ twins reared together have in common (same upbringing, same schools, same womb), while unshared environment is all the experiences they don’t share (position within the womb, getting hit on the head, having an inspiring teacher).

The best estimates from massive datasets suggest that within Western democracies, DNA explains 41% of IQ variation at age 9, 55% at age 12, 66% at age 17, and 74% in adulthood. By contrast, shared environment explains 33% at age 9, 18% at age 12, 16% at age 17, and 10% in adulthood (Bouchard 2013, figure 2). That leaves unshared environment explaining 26% of the variation at age 9, 27% at age 12, 18% at age 17, and 16% in adulthood.
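To make the arithmetic behind that last sentence explicit, here is a minimal sketch that treats the unshared-environment share as whatever is left over once the DNA and shared-environment shares are subtracted from 100%. The dictionary layout and function name are my own for illustration; only the percentages come from the figures quoted above.

```python
# Shares of IQ variance (percent) as quoted above from Bouchard (2013, figure 2).
# The names and layout are illustrative, not taken from the paper.
quoted = {
    "age 9": {"dna": 41, "shared_env": 33},
    "age 12": {"dna": 55, "shared_env": 18},
    "age 17": {"dna": 66, "shared_env": 16},
    "adulthood": {"dna": 74, "shared_env": 10},
}


def unshared_share(dna, shared_env):
    """Residual share: whatever DNA and shared environment leave unexplained."""
    return 100 - dna - shared_env


residuals = {age: unshared_share(**parts) for age, parts in quoted.items()}

for age, rest in residuals.items():
    print(f"{age}: unshared environment ~ {rest}%")
```

Rounding in the source figures can shift any of these residuals by a point, and in twin models the residual also absorbs measurement error, so it should not be read as purely "experience".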

You don’t have to believe these associations are causal, but they are real. They’ve been more or less replicated by studies comparing (1) MZ twins with DZ twins, (2) MZ twins raised apart, and (3) unrelated people reared in the same home. Although these methods depend on different assumptions, they all converge on the same conclusion: the predictive power of DNA skyrockets from childhood to adulthood while the predictive power of shared environment plummets. The same pattern (known as the Wilson effect) has also been observed for other phenotypes and in other species.
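For readers curious how method (1) turns twin correlations into three shares, a common textbook shortcut is Falconer's formula. The sketch below assumes its usual simplifications (purely additive genetic effects, equally similar environments for MZ and DZ pairs, no assortative mating), and the correlations passed in are made-up values chosen only to show the arithmetic, not data from any study cited here.

```python
def falconer(r_mz, r_dz):
    """Split variance into heritability (h2), shared environment (c2) and
    unshared environment (e2) from MZ and DZ twin correlations."""
    h2 = 2 * (r_mz - r_dz)  # MZ pairs share about twice the segregating DNA of DZ pairs
    c2 = 2 * r_dz - r_mz    # equivalently r_mz - h2
    e2 = 1 - r_mz           # whatever makes reared-together MZ twins differ
    return h2, c2, e2


# Hypothetical correlations, for illustration only.
h2, c2, e2 = falconer(r_mz=0.85, r_dz=0.55)
print(f"h2={h2:.2f}, c2={c2:.2f}, e2={e2:.2f}")
```

By construction the three shares sum to 1, which is why a rising DNA share across ages has to come at the expense of one of the environmental shares.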

But why? Shouldn’t environment get more important as we age since experience has increasing time to accumulate? One theory is that more and more genes become active as we age. A more popular theory is that we select environments that maximize our genotype, so environment becomes just a magnifier of genes, not a causal force in its own right. So genetically smart people will stay in school and genetically strong people will lift weights and take steroids etc. People invest in where they’re more likely to be rewarded.

But here’s where things get really interesting. The Wilson effect behaves differently on different types of IQ tests. In his book Does Your Family Make You Smarter?, James Flynn notes that cognitive inequality increases from childhood to later adulthood (because good genes cause good environments and bad genes cause bad environments, the smart get smarter and the dumb get dumber, relative to the average person their age), but this pattern is much more pronounced on some tests than others.

Flynn describes three types of tests:

  • Type 1: Tests that show large family effects (shared environment) that decay slowly. These include tests involving vocabulary (define “rudimentary”), general knowledge (how old is the Earth?), verbal abstraction (how are a brain and a computer the same?) and social comprehension (why do you need a passport to travel?).
  • Type 2: Tests that show small family effects that decay fast. These include spatial manipulation (use these two triangles to make a square) and noticing incongruities (what’s missing or absurd in a picture of a common object or scene).
  • Type 3: Tests that show large family effects that decay fast. These include clerical speed and arithmetic.

Flynn argues that type 1 tests involve skills that children learn from observing their parents talk, hence the large family effect. By contrast he says of type 2 tests:

Aside from the occasional jigsaw puzzle, they have no part in everyday life. Children never see their parents performing these cognitive tasks as part of normal behavior. Family effects are weak, even among preschoolers. Since these subtests match environment with genetic potential so young, they would be an ideal measure (for, say, 5-year-olds) of genes for intelligence.

From pages 53-54 of Does Your Family Make You Smarter? by James Flynn

In other words, type 2 tests measure “novel problem solving”, while type 1 tests measure acquired abilities. A more provocative interpretation is that type 2 tests measure real intelligence, while type 1 tests just measure knowledge and experience. This is the age-old distinction between aptitude tests vs achievement tests, culture fair vs culture loaded, fluid vs crystallized.

And yet Flynn largely rejects the Cattell-Horn-Carroll theory that fluid ability (novel problem solving) is invested to acquire crystallized ability (accumulated knowledge), writing:

…fluid skill is just as heavily influenced by family environment as the most malleable crystallized skill (vocabulary) and therefore, neither skill deserves to be called an investment and the other a dividend.

From page 132 of Does Your Family Make You Smarter? by James Flynn

Flynn of course is referring to the greatest irony in the history of psychometrics and the biggest mistake of Arthur Jensen’s career: the Raven Progressive Matrices (long worshiped by Jensen and Jensenistas as the most culture fair measure of pure intelligence ever invented) is a type 1 test!

Which of the 8 choices completes the above pattern? Image from Carpenter, P., Just, M., & Shell, P. (1990, July)

But let’s not throw the baby out with the bath water. There’s no need to abandon CHC investment theory just because a major test got mischaracterized. But at the same time, it doesn’t feel right to reclassify the Raven as a crystallized test. Research is needed to understand why the Raven is so culturally sensitive when it superficially looks like a measure of novel problem solving. Is it measuring some kind of implicit crystallized knowledge we’re not conscious of, like being familiar with patterns, columns and rows and reasoning through the process of elimination, or are the family effects on the non-cognitive part of the test (having the motivation to persist and concentrate on such an abstract task)? Flynn argues that the brain is like a muscle, but if so, the Raven is an exercise most have never done before, so why isn’t it a type 2 test?

Flynn might argue that if your family helped you with abstract problems in algebra or had philosophical discussions about hypothetical concepts, you’ve been exercising for the Raven all your life, but this seems like a bit of a stretch. All the research shows that cognitive training has narrow transfer (e.g. practicing chess will only make you slightly better at checkers, and not at all better at Scrabble), though perhaps the Raven’s uniquely abstract (general) nature allows it to slightly buck this trend.


Northern American IQ: circa 1937 to circa 2014 (2nd edition)

28 Saturday Mar 2020

Posted by pumpkinperson in Uncategorized

≈ 26 Comments

Tags

Flynn effect, IQ, James Flynn, nutrition, Richard Lynn, Wechsler Bellevue

The following article is an updated revision of an article I published in August 2019 about how 21st century Northern Americans score on an IQ test normed before the Second World War. The reason for the update is that in December 2019, the sample size of my study increased by 13% (from n = 15 to n = 17). I had originally hoped to collect more data before publishing an update, but with the uncertainty surrounding the coronavirus crisis, it’s unclear when that will be.

The Flynn effect, popularized by James Flynn, refers to the fact that IQ tests supposedly get easier with time. Although by definition the average IQ of American or British (white) people is always 100, the older the IQ test, the easier it is to score 100. Thus to keep the average at 100, tests like the Wechsler must be renormed every 10 years or so, otherwise the average IQ would increase by about 3 points per decade.
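As a back-of-the-envelope illustration of the renorming logic, the sketch below assumes a constant gain of 3 points per decade (the approximate figure quoted above) and asks what today's average person would score on norms of a given age. The function name and the linearity assumption are mine; real gains need not be linear.

```python
POINTS_PER_DECADE = 3.0  # approximate rate quoted in the text; assumed constant here


def mean_iq_on_old_norms(years_since_norming, rate=POINTS_PER_DECADE):
    """Average score of a current sample on norms set `years_since_norming`
    years ago, assuming a linear Flynn effect."""
    return 100 + rate * years_since_norming / 10


for years in (0, 10, 20, 30):
    score = mean_iq_on_old_norms(years)
    print(f"norms {years} years old -> average score about {score:.0f}")
```

Under that assumption, a test left un-renormed for three decades would show an average of about 109 rather than 100, which is the drift that periodic renorming is meant to remove.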

Although scholars continue to debate whether the Flynn effect reflects a genuine increase in intelligence (perhaps caused by prenatal nutrition or mental stimulation) or just greater test sophistication caused by modernity, there’s been remarkably little skepticism about the existence of the Flynn effect itself.

Malcolm Gladwell writes:

If an American born in the nineteen-thirties has an I.Q. of 100, the Flynn effect says that his children will have I.Q.s of 108, and his grandchildren I.Q.s of close to 120—more than a standard deviation higher. If we work in the opposite direction, the typical teen-ager of today, with an I.Q. of 100, would have had grandparents with average I.Q.s of 82—seemingly below the threshold necessary to graduate from high school. And, if we go back even farther, the Flynn effect puts the average I.Q.s of the schoolchildren of 1900 at around 70, which is to suggest, bizarrely, that a century ago the United States was populated largely by people who today would be considered mentally retarded.

While few people believe our grandparents were genuinely mentally retarded, it’s taken for granted that they would have scored in the mentally retarded range by today’s standards.

But is this true? I began having doubts over a decade ago when I examined the items on the first Wechsler intelligence scale ever made: the ancient WBI (Wechsler Bellevue intelligence scale). Meticulously normed on New Yorkers in the 1930s, this test remains far and away the most comprehensive look we have at early 20th century white Northern American intelligence, and while some of the subtests looked easy by today’s standards, others, especially vocabulary, looked harder.

The Kaufman effect

What also struck me was how little instruction, probing or coaching people got when taking the ancient WBI, compared to its modern descendant the WAIS-IV. This matters a lot because the way the Flynn effect is calculated on the Wechsler is by giving a new sample of people both the newest Wechsler and its immediate predecessor, in random order to cancel out practice effects, and then seeing which version they score higher on. If they average 3 points lower on the WAIS-IV normed in 2006 than on the WAIS-III normed in 1995, it’s assumed IQ increased by 3 points in 11 years.

The problem with this method (as Alan Kaufman may have discovered before me) is that the subset of the sample that took the newer version first has a huge advantage on the older version compared to the norming sample of the older test (over and above the practice effect which is controlled for), because the norming sample of the older test was never given coaching and probing.

Statistical artifact

A Promethean once said maybe the Flynn effect is just a statistical artifact of some kind. He never told me what he meant, but it got me thinking:

One problem with how the Flynn Effect is calculated on the Wechsler is that it’s assumed that gains over time can be added. For example it’s assumed that you can add the supposed 7.8 IQ gain from WAIS normings 1953.5 -1978 to the 4.2 IQ gain from normings 1978 – 1995 to the 3.7 IQ gain from normings 1995-2006, for a grand total of 15.7 IQ points from normings 1953.5 – 2006.

This would make sense if we were talking about an absolute scale like height, but is problematic when talking about a sliding scale like IQ. For example, suppose the raw number of questions correctly answered in 1953.5 was 20 with an SD of 2. By 1953.5 standards, 20 = IQ 100 and every 2 points = 15 IQ points above or below 100. Now suppose in 1978, people averaged 22 with an SD of 1. That’s a gain of 15 IQ points by 1953.5 standards. Now suppose in 1995 people averaged 23 with an SD of 2. That’s a gain of 15 IQ points by 1978 standards. Adding the two gains together implies a 30 point gain from 1953.5 to 1995, but by both 1953.5 and 1995 standards (where the SD is 2 raw points), the 3-point raw difference is only 22.5 IQ points, or about 23.
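The hypothetical example above can be checked in a few lines. This is only a sketch of the made-up raw scores from that paragraph, not real Wechsler data, and the function and variable names are mine.

```python
# Hypothetical raw-score means and SDs from the example above (not real data).
norms = {
    1953.5: {"mean": 20, "sd": 2},
    1978: {"mean": 22, "sd": 1},
    1995: {"mean": 23, "sd": 2},
}


def iq_gain(later, earlier, yardstick):
    """IQ-point gain between two norming years, using the SD of `yardstick`."""
    diff = norms[later]["mean"] - norms[earlier]["mean"]
    return 15 * diff / norms[yardstick]["sd"]


# Chained method: each gain is expressed on the earlier test's own scale, then summed.
chained = iq_gain(1978, 1953.5, yardstick=1953.5) + iq_gain(1995, 1978, yardstick=1978)

# Direct method: compare the two endpoints on a single scale.
direct_old = iq_gain(1995, 1953.5, yardstick=1953.5)
direct_new = iq_gain(1995, 1953.5, yardstick=1995)

print(chained, direct_old, direct_new)
```

The chained total only matches the direct comparison when the raw-score SD stays the same across normings; whenever the SD shrinks or grows in between, the summed gains drift away from what an endpoint-to-endpoint comparison would show.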

Changing content

Another problem with studying the Flynn effect is that the content of tests like the Wechsler is constantly changing. This is especially problematic when studying long-term trends in general knowledge and vocabulary. If words that are obscure in the 1950s become popular in the 1970s, then people in the 1970s will score high on the 1950s vocabulary test. Meanwhile the 1970s vocabulary test may contain words that don’t become popular until the 1990s. Thus adding the vocabulary gains from the 1950s to the 1970s to the gains from the 1970s to the 1990s might give the false impression that people in the 1990s will do especially well on a 1950s vocabulary test, when in reality, many words from the 1950s may have peaked in the 1970s and be even more obscure in the 1990s than they were in the 1950s.

An ambitious study

Given the Kaufman effect, the statistical artifact, and changing content, I realized the only way to truly understand the Flynn effect is to take the oldest quality IQ test I could find and replicate its original norming on a modern sample.

In 2008 I made it my mission to replicate Wechsler’s 1935-1938 norming of the very first Wechsler scale. Ideally I should have flown to New York where Wechsler had normed his original scale, but if Wechsler could use white New Yorkers as representative of all of white America (WWI IQ tests showed white New Yorkers matched the national white average), I could use white Ontarians as representative of all of white Northern America (indeed white Americans and white Canadians have virtually the same IQs). The target age group was 20-34 because this was the reference age group Wechsler had used to norm his subtests.

It took over a decade, but I was gradually able to arrange for 17 randomly selected white young adults to take the one hour test. They were non-staff recruited from about half a dozen fast food/coffeehouse locations in lower to upper middle class urban and suburban Ontario. The final sample ranged in education from 9.5 years (early high school dropout) to 18 years (master’s degree in engineering from one of Canada’s top universities). The mean self-reported education level was 12.9 years (SD = 2.12), suggesting that despite the lack of female participants, the sample was fairly representative (the average Canadian over 25 has about 13 years of schooling); in cases where those below the age of 25 were in the process of finishing a degree, they were credited as having it.

Testing conditions were not optimal (environments were sometimes noisy, at least one person had a few beers before testing; another was literally falling asleep during the test) and 17 people is way too small a sample to draw statistically significant conclusions about 11 different subtests. One man with a conspicuously low score was removed from the sample after he stated that he had suffered a stroke years ago.

Nonetheless, the below table shows how whites tested in 2008 to 2019 compared to Wechsler’s 1935-1938 sample, with the last column showing the expected scores of the 21st century sample, extrapolating gains James Flynn calculated from 1953.5 to 2006 (see page 240 of his book Are We Getting SMARTER?) to the current study: circa 1937 to circa 2013.5.

Note: the 11 subtests were scaled to have a mean of 10 and an SD of 3 in the original young adult norming sample, while the verbal, performance and full-scale IQs were scaled to have a mean of 100 and an SD of 15. Note also that vocabulary is an alternate test, not used to calculate either verbal or full-scale IQ on the WBI. About one third of my sample did not take Digit Symbol, so for these subjects, Performance and full-scale IQs were calculated via prorating.

| Test | Nationally representative sample of young white adults (NY, 1935 to 1938) | Randomish sample of young white adults (2008 to 2019, ON, Canada) | Expected WBI scores in 2008-2019 based on Flynn’s calculated rate of increase |
| --- | --- | --- | --- |
| Information (general knowledge test) | 10 (SD 3) | 8.41 (SD 2.55) | 12.3 |
| Similarities (verbal abstract reasoning) | 10 (SD 3) | 13.35 (SD 2.91) | 15.54 |
| Arithmetic (mental math) | 10 (SD 3) | 9.18 (SD 4.34)* | 11.02 |
| Vocabulary | 10 (SD 3) | 9 (SD 2.5) | 14.95 |
| Comprehension (common sense & social judgement) | 10 (SD 3) | 9.47 (SD 2.93) | 13.93 |
| Digit Span (attention & rote memory) | 10 (SD 3) | 9.71 (SD 2.63) | 11.46 |
| Picture Completion (visual alertness) | 10 (SD 3) | 10.71 (SD 3.1) | 14.52 |
| Picture Arrangement (social interpretation) | 10 (SD 3) | 10.24 (SD 2.73) | 13.35 |
| Block Design (spatial organization) | 10 (SD 3) | 13.12 (SD 3.31) | 12.91 |
| Object Assembly (spatial integration) | 10 (SD 3) | 11.82 (SD 1.89) | 14.06 |
| Digit Symbol (rapid eye-hand coordination) | 10 (SD 3) | 11.12 (SD 2.82)† | 14.66 |
| Verbal IQ | 100 (SD 15) | 103.8 (SD 14.73) | |
| Performance IQ | 100 (SD 15) | 109.3 (SD 12.11) | |
| Full-scale IQ | 100 (SD 15) | 106.9 (SD 13.63) | 122 |

* This subtest contained a unit conversion item that seemed biased against Canadians, so for those who advanced far enough to fail this item, scores were prorated; had they not been, the mean would have been 7.53 (SD 3.54).
† Only 12 of the 17 subjects took this subtest.
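To compare subtests on a common footing, the sketch below restates the table in SD units of the original norming sample (mean 10, SD 3) and flags which subtests met or beat the Flynn-based expectation. It only rearranges numbers already in the table; the names are mine, and with n = 17 none of these differences should be treated as precise.

```python
# (observed 2008-2019 mean, expected mean from Flynn's rates), copied from the table.
subtests = {
    "Information": (8.41, 12.3),
    "Similarities": (13.35, 15.54),
    "Arithmetic": (9.18, 11.02),
    "Vocabulary": (9.0, 14.95),
    "Comprehension": (9.47, 13.93),
    "Digit Span": (9.71, 11.46),
    "Picture Completion": (10.71, 14.52),
    "Picture Arrangement": (10.24, 13.35),
    "Block Design": (13.12, 12.91),
    "Object Assembly": (11.82, 14.06),
    "Digit Symbol": (11.12, 14.66),
}

NORM_MEAN, NORM_SD = 10, 3  # scaled-score metric of the 1935-1938 norming sample


def gain_in_sd(score):
    """Distance from the original norming mean, in original SD units."""
    return (score - NORM_MEAN) / NORM_SD


rows = [
    (name, gain_in_sd(observed), gain_in_sd(expected))
    for name, (observed, expected) in subtests.items()
]
rows.sort(key=lambda row: row[1], reverse=True)  # largest observed gain first

for name, obs, exp in rows:
    print(f"{name:20s} observed {obs:+.2f} SD   expected {exp:+.2f} SD")

met_or_beat = [name for name, obs, exp in rows if obs >= exp]
print("Met or beat expectation:", met_or_beat)
```

On these figures Similarities shows the largest observed gain, Information the largest loss, and Block Design is the only subtest whose observed mean reaches its expected value.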

Conclusion

The Flynn effect is dramatically smaller than we’ve been led to believe, at least on tests of specific information that may become obscure over generations. By contrast, Similarities (abstract reasoning) and Block Design (spatial analysis) have indeed increased by amounts comparable with Flynn’s research. These two abilities may conspire to explain why some of the largest Flynn effects have been claimed on the Raven Progressive Matrices, an abstract reasoning test using a spatial medium.

It’s unclear if these are nutritional gains caused by increasing brain size, neuroplastic gains caused by cultural stimulation, or mere teaching to the test caused by schooling, computers and brain games.

Lynn (1990) argued the Flynn effect was caused by nutrition, citing a twin study showing that nutrition gains are more pronounced on Performance IQ (consistent with the Flynn effect). Research on identical twins (where one twin gets better prenatal nutrition than the other) has shown that by age 13, the well nourished twin exceeds his less nourished counterpart by about 0.5 SD on both head circumference and Performance IQ, but not at all on verbal IQ. Thus it’s interesting that 21st century young Northern American men exceed their WWII counterparts by about 0.5 SD on both head circumference (22.61″ vs 22.3″) and Performance IQ (109 vs about 100).

One possibility is that Performance IQ gains are entirely caused by improvements in the biological environment (prenatal health and nutrition), while verbal IQ gains are entirely caused by cultural advances (i.e. education), though the latter are somewhat negated by knowledge obsolescence.


contact pumpkinperson at easiestquestion@hotmail.ca

