HBD, slavery & Professor Black Truth


I have always doubted the historical consensus that Africans sold their own people into slavery. The whole point of slavery is that you’re forced to work without pay. If whites were powerful enough to force millions of blacks to work without pay, they would have been powerful enough to take those workers from black countries without having to pay for them in the first place.

Also, if they were going to pay for slaves, then why go all the way to Africa to get them? The whole point of going to a less technologically advanced region to get slaves is that you can take the slaves by force.

Lastly, if you believe in HBD (which many people dismiss as racist pseudoscience), the average white American is about 15 IQ points higher than the average black American and that gap would have perhaps been 20 points before white genes entered the U.S. black gene pool in large numbers. A 20 point IQ gap (if it reflects a genuine gap in intelligence) is so large that whites would have been dominant enough to simply help themselves to whatever resources they wanted in sub-Saharan Africa (including human ones) without having to pay anyone anything. In fact if whites really had to pay for slaves, it calls HBD into question because it implies a business deal among equal races, not an advanced race enslaving a more primitive one.

Ironically, Professor Black Truth agrees that whites stole (rather than paid for) black slaves but he uses it to argue their moral inferiority, not their cognitive superiority:

Social media brutally attacks Bill Gates

One of the dangers of having an IQ way off in the stratosphere is that the average American is literally mentally retarded compared to you, and because they can’t understand how you made all your money, they assume you must be some kind of evil witch, out to do them harm.

If Bill Gates’s IQ is anywhere near as high as the SAT measured it (IQ 170), then Bill Gates is to the U.S. population as the average American is to the Down syndrome population.

If the average American woke up to find the rest of the country had Down syndrome, they would understand what it’s like to be Bill Gates. They would easily become the richest person without even trying, but they would soon encounter jealousy and resentment (as Bill Gates faced when the Justice Department attacked Microsoft in the 1990s) and if they tried to help the masses, it might backfire spectacularly.

The New York Times writes:

Mr. Gates, 64, the Microsoft co-founder turned philanthropist, has now become the star of an explosion of conspiracy theories about the coronavirus outbreak. In posts on YouTube, Facebook and Twitter, he is being falsely portrayed as the creator of Covid-19, as a profiteer from a virus vaccine, and as part of a dastardly plot to use the illness to cull or surveil the global population…

…Misinformation about Mr. Gates is now the most widespread of all coronavirus falsehoods tracked by Zignal Labs, a media analysis company. The misinformation includes more than 16,000 posts on Facebook this year about Mr. Gates and the virus that were liked and commented on nearly 900,000 times, according to a New York Times analysis. On YouTube, the 10 most popular videos spreading lies about Mr. Gates posted in March and April were viewed almost five million times.

…Mr. Gates, who is worth more than $100 billion, has effectively assumed the role occupied by George Soros, the billionaire financier and Democratic donor who has been a villain for the right. That makes Mr. Gates the latest individual — along with Dr. Anthony Fauci, the leading U.S. infectious disease expert — to be ensnared in the flow of right-wing punditry that has denigrated those who appear at odds with Mr. Trump on the virus…

…His disdain for Mr. Trump, whom he has met several times, has also become public. In 2018, footage surfaced of Mr. Gates recounting how Mr. Trump needed help distinguishing H.I.V., which refers to the human immunodeficiency virus and causes AIDS, from HPV, which is the human papillomavirus, a sexually transmitted infection.

“Both times he wanted to know if there was a difference between H.I.V. and HPV, so I was able to explain that those are rarely confused with each other,” Mr. Gates said to laughter in comments to his foundation.

Billy takes the WAIS-IV

Commenter Billy was kind enough to share with us the raw scores he obtained on the WAIS-IV and I have converted these into scaled scores and composite scores. I apologize to the other people who have also asked me to do this for them, but Billy just happened to ask when I had more free time and his demographics make him a unique case study.

Billy is a young American black man born to an upper-class immigrant family. Despite spending his early childhood in sub-Saharan Africa and not learning English until coming to the U.S. at around age eight, Billy obtained one of his highest scores on the culturally loaded Vocabulary subtest.

Below are Billy’s scores. Note that the subtests are scored on a scale from 1 to 19, where 10 is the U.S. mean for one’s age and 3 is the standard deviation. This is comparable to the distribution of male U.S. height, where in the peak age group, the mean is about 10 inches (above five feet) and the standard deviation is about 3 inches.
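As a rough illustration, the subtest scale (mean 10, SD 3) maps linearly onto the familiar IQ scale (mean 100, SD 15). The sketch below assumes exactly that linear relationship; the function name `scaled_to_iq` is made up for this example and is not part of any test manual.

```python
def scaled_to_iq(scaled, mean=10.0, sd=3.0):
    """Convert a subtest scaled score (mean 10, SD 3) to an IQ-equivalent (mean 100, SD 15)."""
    z = (scaled - mean) / sd      # distance from the mean in standard deviations
    return 100.0 + 15.0 * z       # same z-score expressed on the IQ scale


# A few scaled scores spanning the range discussed in this post
for s in (9, 12, 19):
    print(s, scaled_to_iq(s))
```

On this mapping, each scaled-score point is worth five IQ points, so the top scaled score of 19 corresponds to an IQ-equivalent of 145.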

Presumably Billy took the WAIS-IV circa 2020 (14 years after it was normed). Because older tests tend to give inflated results, in the far right column I have adjusted all of the scores for the Flynn effect. The unadjusted scores are probably too high but the adjusted scores are probably too low because the Flynn effect might not be as large as folks think and may have plateaued or even reversed since 2006 (we’ll know better when the WAIS-V comes out).

| Subtest or index | Scores before Flynn effect adjustments | Adjusted for the Flynn effect |
|---|---|---|
| Vocabulary (word knowledge) | 19 | 17.73 |
| Similarities (verbal abstraction & thought organization) | 19+ | 18.11+ |
| Comprehension (socio-understanding & common sense) | 17 | 16.49 |
| Matrices (visual pattern recognition) | 18 | 17.24 |
| Visual Puzzles (spatial reasoning) | 14 | 13.62 |
| Figure Weights (quantitative comparison) | 17 | 16.24 |
| Digit Span (rote memory & attention) | 17 | 16.62 |
| Arithmetic (mental math) | 12 | 12 |
| Coding (rapid eye-hand coordination) | 9 | 8.75 |
| Symbol Search (visual scanning) | 15 | 14.75 |
| Verbal comprehension index | 150+ | 145+ |
| Perceptual Reasoning index | 136 | 133 |
| Working Memory index | 125 | 125 |
| Processing speed index | 111 | 111 |
| Full-scale IQ | 141+ | 138+ |

Adjustments for Flynn effect were made using page 240 of Are We Getting SMARTER? by James Flynn. I assumed that the rate of change that occurred between the norming of the WAIS-III (1995) and the WAIS-IV (2006) has continued to 2020. Flynn had no data for Visual Puzzles, Figure Weights or Symbol Search so rates for Block Design, Matrix Reasoning & Coding were assumed for each of those subtests respectively.
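The adjustment described above amounts to subtracting a per-year gain, multiplied by the years since norming, from each score. The sketch below does not reproduce Flynn's published rates. It only backs out the annual gain implied by the table's own numbers, assuming a constant linear rate over the 14 years from 2006 to 2020. The function names are hypothetical, and Similarities is left out because its score is at the test ceiling (19+).

```python
YEARS_SINCE_NORMING = 2020 - 2006  # 14 years, per the text

# (unadjusted, adjusted) scaled scores copied from the table above
scores = {
    "Vocabulary": (19, 17.73),
    "Comprehension": (17, 16.49),
    "Matrices": (18, 17.24),
    "Visual Puzzles": (14, 13.62),
    "Figure Weights": (17, 16.24),
    "Digit Span": (17, 16.62),
    "Arithmetic": (12, 12.0),
    "Coding": (9, 8.75),
    "Symbol Search": (15, 14.75),
}


def implied_annual_gain(unadjusted, adjusted, years=YEARS_SINCE_NORMING):
    """Scaled-score points per year implied by one row of the table."""
    return (unadjusted - adjusted) / years


def flynn_adjust(unadjusted, annual_gain, years=YEARS_SINCE_NORMING):
    """Deflate a score by an assumed constant annual gain since norming."""
    return unadjusted - annual_gain * years


for name, (raw, adj) in scores.items():
    print(f"{name}: {implied_annual_gain(raw, adj):.4f} scaled points per year")
```

As a consistency check, Matrices and Figure Weights imply the same annual gain, as do Coding and Symbol Search, which matches the statement that rates were borrowed for subtests Flynn had no data on.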

Normally the WAIS-IV includes the subtests Information (general knowledge) and Block Design (spatial analysis), but for whatever reason Billy’s examiner substituted the optional subtests Comprehension and Figure Weights, respectively. Such substitutions are allowed as long as the examiner makes them a priori, and not to help or hurt a particular subject’s scores.

Billy wrote the following in the comment section:

Like I said, it’s unofficial, so I was tested out of convenience; not from a professional. Though my personal reason outside of knowing my IQ is knowing where I’m deficient – or my cognitive profile. Based on my Mensa scores, a wonderlic at 75th percentile while a passing RAIT score just doesn’t hint at a stable profile for me. I seem to have a hard time concentrating on tasks, so I figured it’s probably a working memory and or processing speed issue. I’ve never been tested for a mental illness.

Billy’s profile shows Verbal Comprehension > Perceptual Reasoning > Working Memory > Processing Speed.

Unfortunately the WAIS-IV lumps Visual Abstraction (Matrix Reasoning and Figure Weights) into the same category as Spatial Reasoning (Visual Puzzles) creating a meaningless hybrid known as “Perceptual Reasoning index”. However if we untangle these two abilities, we see that Billy’s Visual Abstraction is almost as high as his Verbal Comprehension (both domains are highly g loaded) and that his Spatial Reasoning is somewhat lower than his Working Memory.

He’s comfortably above average in all domains, but his weak point is Processing Speed, probably because this is the least g loaded domain. It might also imply a weakness in Executive Function, but we’d need neurological testing to infer that. His very superior score in Comprehension implies high social intelligence; on the other hand Comprehension was lower than his other Verbal Comprehension scores. Had the WAIS-IV not foolishly removed the Picture Arrangement subtest, we’d have a fuller look at his social cognition.

Billy’s relatively low scores on Processing Speed and Arithmetic probably explain why he underperformed on the Wonderlic, a speed based test with a lot of number crunching.

Overall Billy is a man of incredibly high intelligence who should succeed at almost anything he sets his mind to. The results may even underestimate his IQ, especially in the verbal comprehension sphere, because of his delayed exposure to English and U.S. culture.

On the other hand, the fact that Billy’s English vocabulary is so high despite such delayed exposure to English is consistent with research showing the impotence of early childhood intervention. For a more extreme example, see The Case of Isabelle.

James Flynn writes:

…Current environment is surprisingly self-contained: it influences one’s current cognitive abilities with very little interference from past environments. Most of us assume that your early family environment leaves some sort of indelible mark on your intelligence throughout life. But the literature shows that this is simply not so.

From page 6 of Does Your Family Make You Smarter? by James Flynn.

Arthur Jensen told racists to BUZZ OFF!


In 1969, Arthur Jensen wrote an article in the prestigious Harvard Educational Review (HER) that transformed him from a highly respected but little-known scholar into one of the most controversial and influential psychologists of all time. So influential was Jensen that a new word entered the English language: Jensenism; and a platoon of famous scholars made a career out of trying to debunk him.

The three tenets of Jensenism are:

  • Compensatory education fails to improve the IQ or scholastic skills of culturally deprived kids.
  • Genetics explains more of the variance in American IQ than culture.
  • Genetics likely explains some part of the 15 point black-white IQ gap in the United States.

So powerful was Jensenism that President Nixon assigned his staff to report to him on Jensen’s HER article. In 1974 Daniel Patrick Moynihan stated, “The winds of Jensenism are blowing through Washington with gale force.”

Frank Miele writes:

According to John Ehrlichman, Richard Nixon told him that he believed America’s Blacks could only marginally benefit from federal programs because they were genetically inferior to Whites. All the federal money and programs we could devise could not change that fact. Though he believed that Blacks could never achieve parity in intelligence, economic success, or social qualities, we should still do what we could for them, within limits, because it was the “right” thing to do.

From page 151 of Intelligence, Race and Genetics by Frank Miele

While Nixon was clearly a Jensenista, Jensen was a self-described liberal, stating:

In fact, I voted for Johnson in the 1964 presidential election. I felt strongly enough about it that I voted by absentee ballot because I was in London on sabbatical leave working as a Guggenheim Fellow in Eysenck’s department.

I believed in the Great Society proposals, particularly with respect to education and Head Start. When I returned to California I gave talks at schools, PTA meetings, and conferences and conventions explaining why these things were important and should be promoted. I have always been opposed to racial segregation and discrimination. They go against everything in my personal philosophy, which includes maximizing individual liberties and regarding every individual in terms of his or her own characteristics rather than the person’s racial or ethnic background. How could I think otherwise when at the time I had been steeped in Gandhian philosophy for over 20 years?

From pages 33-34 of Intelligence, Race and Genetics by Frank Miele

But by 1969 he was clearly less liberal when it came to compensatory education for disadvantaged kids. Was he changing his views to gain political traction in Nixon’s more conservative America? Jensen states:

Absolutely false! That way of thinking is completely foreign to me. I am almost embarrassed by my lack of interest in politics and I was even less interested in those days than I am now. The idea of providing any kind of “ammunition,” scientific or otherwise, to help any political regime promote its political agenda is anathema in my philosophy. One always hopes, of course, that politicians will pay attention to scientific findings and take them into consideration in formulating public policy. But I absolutely condemn the idea of doing science for political reasons.

I have only contempt for people who let their politics or religion influence their science. And I rather dread the approval of people who agree with me only for political reasons.

From page 35 of Intelligence, Race and Genetics by Frank Miele

Nonetheless, some racists reached out to Jensen for help. Jensen states:

After the publicity surrounding the HER article, I did receive a number of letters from so-called citizens’ groups in various Southern states, asking if I would write letters to their local newspapers in support of racial segregation in public schools. I replied that I was, and always have been, absolutely opposed to racial segregation of any kind. One of these people wrote back calling me “just another Berkeley pinko!” He at least gave me the satisfaction of knowing I had angered him.

From page 21 of Intelligence, Race and Genetics by Frank Miele

So after 30 years of arguing that races differ in genetic IQ, what did Jensen think of affirmative action? Jensen states:

When the original concept of Affirmative Action was just catching on in the 1960s it was not a quota system. That only came later. I approved two main facets of its original intent, and I still do: (1) We should make special efforts to ensure that historically underrepresented minorities are fully aware the educational opportunities in colleges and universities, in job training programs, and in employment opportunities are open to all, provided they meet the usual qualifications; and (2) colleges and universities, job training programs, and employers should actively seek out and recruit minority persons who could qualify by the usual standards, including the use of academic talent searches at the high school level, special inducements, and scholarships to encourage academically promising minority students to go on to college.

From page 177 of Intelligence, Race and Genetics by Frank Miele

Arthur Jensen lived from 1923 to 2012.

Professor Black Truth’s IQ


Professor Black Truth is a YouTube personality who reminds me of a black version of “Philosopher” and other alt-right extremists. Just as Philosopher thinks Ashkenazi elites conspire to undermine white interests (by propping up blacks), Professor Black Truth thinks the elite serves white interests and conspires to oppress blacks (by propping up tools for white supremacy).

I estimate Professor Black Truth to be more verbally intelligent than 99.75% of Americans (verbal IQ 142, U.S. norms) but he draws bad conclusions on social issues. He thinks Michael Jackson was innocent and has yet to progress beyond Chomsky talking points when it comes to U.S. foreign policy.

Estimated social IQ? 104 (slightly above the U.S. average)

Estimated overall IQ? About 128 (higher than 97% of America).

Of course that’s very rough because as a listener I can only observe his verbal skills and social understanding, not the many other abilities that are also part of intelligence.
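The percentile figures quoted alongside these estimates follow from treating IQ as normally distributed with mean 100 and SD 15. A minimal sketch of that conversion, using Python's standard-library `statistics.NormalDist` (the helper names are my own):

```python
from statistics import NormalDist

# IQ modeled as a normal distribution with mean 100 and SD 15 (U.S. norms assumed)
IQ_SCALE = NormalDist(mu=100, sigma=15)


def iq_to_percentile(iq):
    """Percentage of the reference population scoring below the given IQ."""
    return 100.0 * IQ_SCALE.cdf(iq)


def percentile_to_iq(percentile):
    """IQ at which the given percentage of the population scores lower."""
    return IQ_SCALE.inv_cdf(percentile / 100.0)


for iq in (104, 128, 142):
    print(iq, round(iq_to_percentile(iq), 2))
```

Under this model an IQ of 142 sits at roughly the 99.7th percentile and 128 at roughly the 97th, in line with the figures above.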

In my opinion he is bitter that his high IQ didn’t take him as far in life as he thought it would and is resentful of the patronizing praise he probably got from much less intelligent white frat boys in college. Unlike Obama (whom he views as a closet homosexual), his high IQ probably made him more of a freak than a star, and so he rationalizes his modest success by viewing Obama and other black elites as tools for white supremacy.

Despite his flawed analysis, he’s an extremely talented broadcaster with a darkly entertaining exaggerated delivery, much like a comic book villain. In this episode he accuses Ocasio-Cortez of being an anti-black bigot:

Why does experience affect some IQ tests more than others?

Imagine a pushy mother really wants her average 9.5-year-old son to get into a gifted class. Because IQ tests are normed for age and her son looks young for his age, she decides to tell the psychologist he’s only 6.5, so that he will get a much higher score.

Now assuming the boy is average for a 9.5-year-old in all cognitive domains, he should score an age-ratio IQ of 146 in all domains if the psychologist believes he’s 6.5 (because 9.5 is 146% of 6.5). However, modern IQ tests use the deviation scale, where scores are assigned not by how many years advanced a kid is, but by one’s rank relative to other American kids in one’s own age group. Being in the top 99% gives one an IQ of 65, the top 90% an IQ of 80, the top 50% an IQ of 100, the top 10% an IQ of 120, the top 1% an IQ of 135, the top 0.1% an IQ of 146, and so on.
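The two scoring methods contrasted above can be sketched side by side. The ratio IQ is simply mental age over chronological age times 100. The deviation IQ below assumes a normal curve with mean 100 and SD 15, and the function names are illustrative only.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, SD 1


def ratio_iq(mental_age, chronological_age):
    """Old-style age-ratio IQ: mental age as a percentage of chronological age."""
    return 100.0 * mental_age / chronological_age


def deviation_iq(top_fraction):
    """Deviation IQ for someone in the top `top_fraction` of their age group."""
    z = STANDARD_NORMAL.inv_cdf(1.0 - top_fraction)
    return 100.0 + 15.0 * z


print(round(ratio_iq(9.5, 6.5)))  # the 9.5-year-old passed off as 6.5
for top in (0.99, 0.90, 0.50, 0.10, 0.01, 0.001):
    print(f"top {top:.1%}: IQ {deviation_iq(top):.1f}")
```

The exact normal-curve values for the top 90% and top 10% come out near 81 and 119, so the 80 and 120 quoted above should be read as rounded figures.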

Now obviously the average 9.5-year-old pretending to be a 6.5-year-old will get a high IQ in every domain, but his margin of superiority varies dramatically depending on the test. If he had taken the WISC-R, for example (before the norms expired), here’s how he’d have scored (for the reader’s convenience, I converted subtest scores, normally scaled from 1 to 19, into IQ equivalents):

| Test | IQ based on age ratio method: MA 9.5/CA 6.5 (×100) | Deviation IQ | Deviation IQ corrected for reliability |
|---|---|---|---|
| Information (general knowledge test) | 146 | 130 | 137 |
| Similarities (verbal abstract reasoning) | 146 | 120 | 122 |
| Arithmetic (mental math) | 146 | 140 | 145 |
| Vocabulary | 146 | 138 | 144 |
| Comprehension (common sense & social judgement) | 146 | 130 | 136 |
| Digit Span (attention & rote memory) | 146 | 120 | 123 |
| Picture Completion (visual alertness) | 146 | 120 | 122 |
| Picture Arrangement (social interpretation) | 146 | 120 | 123 |
| Block Design (spatial organization) | 146 | 120 | 122 |
| Object Assembly (spatial integration) | 146 | 115 | 117 |
| Mazes (visual planning) | 146 | 120 | 122 |
| Verbal IQ | 146 | 141 | 143 |
| Performance IQ | 146 | 128 | 129 |
| Full-scale IQ | 146 | 139 | 140 |

Because more reliable tests show larger correlations with age, the fourth column corrects deviation IQs for reliability. This was done by dividing the deviation from the mean by the square root of the reliability at the age the boy is claiming to be. Note that Digit Span and Mazes are optional tests not used to calculate the composite IQs (the last three rows) unless they are substituted for a core test. Digit Span was not used to calculate any composite score, but Mazes was substituted for one of the core tests (Coding) because Coding is not the same test at all ages.
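The reliability correction described above can be written as a one-line formula: corrected IQ = 100 + (IQ − 100) / √reliability. The sketch below applies it and also inverts it to show what reliability a given pair of table values would imply. The actual WISC-R reliability coefficients are not given in this post, so the implied figure is only an approximation based on rounded table entries, and the function names are hypothetical.

```python
import math


def correct_for_reliability(iq, reliability, mean=100.0):
    """Divide the deviation from the mean by the square root of the test's reliability."""
    return mean + (iq - mean) / math.sqrt(reliability)


def implied_reliability(observed_iq, corrected_iq, mean=100.0):
    """Reliability that would turn `observed_iq` into `corrected_iq` under the formula above."""
    return ((observed_iq - mean) / (corrected_iq - mean)) ** 2


# Information row of the table: deviation IQ 130, corrected IQ 137
r = implied_reliability(130, 137)
print(round(r, 3))
print(round(correct_for_reliability(130, r)))
```

Because the correction only stretches the distance from 100, a score of exactly 100 is left unchanged whatever the reliability.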

Now after adjusting for test reliability, our gifted 6.5-year-old (who is secretly an average 9.5-year-old) had deviation IQs ranging from 117 (object assembly) to 145 (Arithmetic).

Why such a huge discrepancy? The most obvious answer is that 9.5-year-olds have two cognitive advantages over 6.5-year-olds. Not only are their brains bigger and more developed, but they’ve also had three extra years of life experience. On tests that require novel problem solving, their advantage will be modest because it only reflects their neurological superiority. It cannot reflect their life experience advantage because by definition, novel problems are things we’ve had little experience with.

By contrast, on tests that require acquired knowledge they have two advantages. Not only does the 9.5-year-old have more neurological ability to reason arithmetically, learn and remember facts, and infer the meaning of words, but he’s also had three extra years to acquire number concepts, general knowledge, and vocabulary. Because “crystallized” tests require both neurological ability and experience, they show a much steeper age progression than fluid tests (and a much steeper decline in old age, though this is confounded with the Flynn effect), which require only neurological ability (beyond some basic experience threshold that virtually all Americans reach by age 5 or so). Indeed looking at the age progression is a good way to quantify crystallized vs fluid.
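One simple way to act on that suggestion is to rank the subtests by how large the older child's advantage is. The sketch below uses the reliability-corrected deviation IQs from the WISC-R table above as a crude index of age progression; treating that single comparison as a crystallized-versus-fluid ordering is an assumption made for illustration, not an established metric.

```python
# Reliability-corrected deviation IQs from the table above
# (an average 9.5-year-old scored against 6.5-year-old norms)
corrected_iq = {
    "Information": 137,
    "Similarities": 122,
    "Arithmetic": 145,
    "Vocabulary": 144,
    "Comprehension": 136,
    "Digit Span": 123,
    "Picture Completion": 122,
    "Picture Arrangement": 123,
    "Block Design": 122,
    "Object Assembly": 117,
    "Mazes": 122,
}


def rank_by_age_advantage(scores):
    """Sort subtests from largest to smallest advantage for the older child."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


for name, iq in rank_by_age_advantage(corrected_iq):
    print(f"{name}: {iq}")
```

On these numbers the four knowledge-heavy subtests (Arithmetic, Vocabulary, Information, Comprehension) head the list, and Object Assembly comes last.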

Notice also that the tests that show the biggest age effects also tend to be the ones that show the biggest family effects per James Flynn’s method, because both reflect experience.

Now let’s imagine the pseudo-gifted boy has turned 16.83 and wants to get into Mensa, which is unwilling to accept scores from the past. He is still exactly average in all domains for his true age, but still looking young, he tells the psychologist he is only 10.5. She doesn’t buy it, but she’s not about to turn down $800 for a few hours’ work, so she plays along.

Here are his results:

| Test | IQ based on age ratio method: MA 16.83/CA 10.5 (×100) | Deviation IQ | Deviation IQ corrected for unreliability |
|---|---|---|---|
| Information (general knowledge test) | 160 | 133 | 136 |
| Similarities (verbal abstract reasoning) | 160 | 130 | 134 |
| Arithmetic (mental math) | 160 | 120 | 123 |
| Vocabulary | 160 | 138 | 141 |
| Comprehension (common sense & social judgement) | 160 | 130 | 136 |
| Digit Span (attention & rote memory) | 160 | 110 | 112 |
| Picture Completion (visual alertness) | 160 | 120 | 124 |
| Picture Arrangement (social interpretation) | 160 | 115 | 118 |
| Block Design (spatial organization) | 160 | 125 | 127 |
| Object Assembly (spatial integration) | 160 | 125 | 131 |
| Mazes (visual planning) | 160 | 118 | 122 |
| Verbal IQ | 160 | 139 | 141 |
| Performance IQ | 160 | 130 | 132 |
| Full-scale IQ | 160 | 139 | 140 |

Because more reliable tests show larger correlations with age, the fourth column corrects deviation IQs for reliability. This was done by dividing the deviation from the mean by the square root of the reliability at age 10.5.

Once again, his years of extra life experience (relative to the age group he’s pretending to be) gave him a huge advantage on knowledge based tests like Vocabulary and Information. And once again, novel tasks like Block Design, Picture Arrangement, Mazes and Digit Span showed less advantage.

But what happened to Arithmetic? His advanced age gave him a huge advantage when he was a 9.5-year-old pretending to be 6.5, but now that he is 16.83 pretending to be 10.5, this subtest is even less age dependent than Block Design. The likely explanation is that once kids acquire basic number concepts in school (addition, subtraction, multiplication, and division) arithmetic depends less on experience and more on neurology.

This is why one can never say, categorically, that a given test measures fluid or crystallized ability. It depends on the population taking it. For example the Raven Progressive Matrices might be a fluid test within generations but a crystallized test between generations. Even something as seemingly crystallized as the math SAT might measure fluid ability in 17-year-olds with at least four years of advanced math. Among all 17-year-olds, it will still correlate with (but not so much be directly caused by) fluid ability because those with more fluid ability are likely to take advanced math in the first place.

Another interesting case is Similarities, which requires one to infer the link between common things (how are chess and Scrabble alike?). This showed small age effects in early childhood, but large age effects in later childhood. At the lower end this test is just about the ability to see associations, but at the higher end, it becomes increasingly dependent on diction and sometimes esoteric concepts, making it more experience dependent.

One question is why, if crystallized tests are more culture dependent, they often load more on psychometric g (the general factor of IQ tests believed to be a property of the physical brain). Perhaps, as some have suggested, once you control for age, in countries where everyone has opportunity for schooling, knowledge tests measure both the ability to learn over an entire lifetime and the ability to store and retrieve a lifetime of learning. By contrast fluid tests only measure the ability to learn in the testing room, and not the ability to store and retrieve it years later. One theory is that the more parts of the brain a test samples, the more g loaded it is, which would make sense if g is just overall cognition.

Why does shared environment affect some IQ tests more than others?


Using twin studies, scientists divide phenotypic variation into three categories: DNA variation, shared environmental variation, and unshared environmental variation. Shared environment comprises all the experiences MZ twins reared together have in common (same upbringing, same schools, same womb), while unshared environment comprises all the experiences they don’t share (position within the womb, getting hit on the head, having an inspiring teacher).

The best estimates using massive datasets suggest that within Western democracies, DNA explains 41% of IQ variation at age 9, 55% at age 12, 66% at age 17, and 74% in adulthood. By contrast shared environment explains 33% at age 9, 18% at age 12, 16% at age 17, and 10% in adulthood (Bouchard 2013, figure 2). That leaves unshared environment explaining 26% of the variation at age 9, 27% at age 12, 19% at 17, and 16% in adulthood.
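The unshared-environment figures are just what remains after DNA and shared environment are subtracted from 100%. A small sketch of that arithmetic, using only the percentages quoted above (the variable and function names are illustrative):

```python
# (DNA %, shared environment %) at each age, as quoted in the text
estimates = {
    "age 9": (41, 33),
    "age 12": (55, 18),
    "age 17": (66, 16),
    "adulthood": (74, 10),
}


def unshared_environment(dna, shared):
    """Residual share of variance once DNA and shared environment are accounted for."""
    return 100 - dna - shared


residuals = {age: unshared_environment(dna, shared) for age, (dna, shared) in estimates.items()}
for age, pct in residuals.items():
    print(f"{age}: {pct}% unshared environment")
```

The residual at age 17 works out to 18% rather than the quoted 19%, so the three age-17 figures sum to 101. That is presumably a rounding artifact in the source estimates.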

You don’t have to believe these associations are causal, but they are real. They’ve been more or less replicated using studies comparing (1) MZ twins with DZ twins, (2) MZ twins raised apart, (3) unrelated people reared in the same home. Although all of these methods depend on different assumptions, they all converge on the same conclusion: the predictive power of DNA skyrockets from childhood to adulthood while the predictive power of shared environment plummets. The same pattern (known as the Wilson effect) has also been observed for other phenotypes and in other species.

But why? Shouldn’t environment get more important as we age since experience has increasing time to accumulate? One theory is that more and more genes become active as we age. A more popular theory is that we select environments that maximize our genotype, so environment becomes just a magnifier of genes, not a causal force in its own right. So genetically smart people will stay in school and genetically strong people will lift weights and take steroids etc. People invest in where they’re more likely to be rewarded.

But here’s where things get really interesting. The Wilson effect behaves differently on different types of IQ tests. In his book Does Your Family Make You Smarter?, James Flynn notes that cognitive inequality increases from childhood to later adulthood (because good genes cause good environments and bad genes cause bad environments, the smart get smarter and the dumb get dumber, relative to the average person their age) but this pattern is much more pronounced on some tests than others.

Flynn describes three types of tests:

  • Type 1: Tests that show large family effects (shared environment) that decay slowly. These include tests involving vocabulary (define “rudimentary”), general knowledge (How old is the Earth?), verbal abstraction (how are a brain and a computer the same?) and social comprehension (why do you need a passport to travel?).
  • Type 2: Tests that show small family effects that decay fast. These include spatial manipulation (use these two triangles to make a square) and noticing incongruities (what’s missing or absurd in a picture of a common object or scene).
  • Type 3: Tests that show large family effects that decay fast. These tests include clerical speed and arithmetic.

Flynn argues that type 1 tests involve skills that children learn from observing their parents talk, hence the large family effect. By contrast he says of type 2 tests:

Aside from the occasional jigsaw puzzle, they have no part in everyday life. Children never see their parents performing these cognitive tasks as part of normal behavior. Family effects are weak, even among preschoolers. Since these subtests match environment with genetic potential so young, they would be an ideal measure (for, say, 5-year-olds) of genes for intelligence.

From pages 53-54 of Does Your Family Make You Smarter? by James Flynn

In other words, Type 2 tests measure “novel problem solving”, while type 1 tests measure acquired abilities. A more provocative interpretation is type 2 tests measure real intelligence, while type 1 just measure knowledge and experience. This is the age-old distinction between aptitude tests vs achievement tests, culture fair vs culture loaded, fluid vs crystallized.

And yet Flynn largely rejects Cattell-Horn-Carroll’s theory that fluid ability (novel problem solving) is invested to acquire crystallized ability (accumulated knowledge), writing:

…fluid skill is just as heavily influenced by family environment as the most malleable crystallized skill (vocabulary) and therefore, neither skill deserves to be called an investment and the other a dividend.

From page 132 of Does Your Family Make You Smarter? by James Flynn

Flynn of course is referring to the greatest irony in the history of psychometrics and the biggest mistake of Arthur Jensen’s career: the Raven Progressive Matrices (long worshiped by Jensen and Jensenistas as the most culture fair measure of pure intelligence ever invented) is a type 1 test!

Which of the 8 choices completes the above pattern? Image from Carpenter, P., Just, M., & Shell, P. (1990, July)

But let’s not throw the baby out with the bath water. There’s no need to abandon CHC investment theory just because a major test got mischaracterized. But at the same time, it doesn’t feel right to reclassify the Raven as a crystallized test. Research is needed to understand why the Raven is so culturally sensitive when it superficially looks like a measure of novel problem solving. Is it measuring some kind of implicit crystallized knowledge we’re not conscious of, like being familiar with patterns, columns and rows and reasoning through the process of elimination, or are the family effects on the non-cognitive part of the test (having the motivation to persist and concentrate on such an abstract task)? Flynn argues that the brain is like a muscle, but if so, the Raven is an exercise most have never done before, so why isn’t it a type 2 test?

Flynn might argue that if your family helped you with abstract problems in algebra or had philosophical discussions about hypothetical concepts, you’ve been exercising for the Raven all your life, but this seems like a bit of a stretch. All the research shows that cognitive training has narrow transfer (i.e. practicing chess will only make you slightly better at checkers, and not at all better at Scrabble) though perhaps the Raven’s uniquely abstract (general) nature allows it to slightly buck this trend.

The true distribution of intelligence


Modern IQ tests force test scores to fit a normal distribution, but I’ve long suspected the distribution of intelligence is anything but normal. After all, if you look at the distribution of wealth and income (a crude measure of intelligence), you find the richest people are worth millions of times more than the poorest. And that’s not just because they’re not paying their fair share, as Elizabeth Warren would have us believe. We see the same skewed distribution in academic output, with the most productive scientists publishing orders of magnitude more than the least productive.

A member of the Prometheus Society once hypothesized that the human mind works in parallel, so that complex problem solving speed doubles every 10 IQ points (he later suggested 5).

And yet at the same time, Arthur Jensen seemed to believe intelligence was normally distributed and in support of this cited cognitive measures that form a natural scale with equal intervals and a true zero point such as the total number of words a person knows or the number of digits they can repeat after one hearing, both of which were normally distributed. What was missing however from Jensen’s examples was complex on-the-spot problem solving.

Thanks to my research on how people today would score on the oldest Wechsler tests (the ancient WBI) I have some novel data on how long it takes people to solve visuo-spatial items. The WBI includes a subtest called Object Assembly where you have to fit a bunch of odd shaped cardboard cutouts together to make a familiar object, and over the past decade or so, this was administered to a relatively random sample of White young adults (n = 17). One seven piece item was easy enough that all 17 were able to complete it within the 3 minute time limit, yet hard enough that no one could solve it immediately.

Normally I wouldn’t show items from an actual IQ test, but the WBI is over 80 years old and Object Assembly is no longer part of current Wechsler subtests.

[update april 1, 2020, I decided to remove the photos to be safe, but it’s too bad because those were gorgeous photos I took]

When one test participant saw these cardboard cutouts being placed on the table, he apologized for being unable to contain his laughter. A painful reminder that my life’s work is considered a joke by much of the population. And yet for all their apparent absurdity, these silly little tests remain the crowning achievement of social science, with one’s score being largely destiny.

Even on this one item, there seemed to be a correlation between IQ and occupation/education. The time taken to complete the puzzle ranged from 14 seconds (a professional with a Master’s degree in Engineering from a top Canadian university) to 137 seconds (a roofer with only a high school diploma). Below are the times of all 17 participants, in seconds, ranked from fastest to slowest.

14, 18, 21, 30, 31, 33, 34, 35, 48, 58, 60, 65, 68, 69, 82, 89*, 137

Mean = 52 seconds, Standard Deviation = 30 seconds

In a normal distribution, 68% fall within one standard deviation of the mean. In this distribution, 71% fell within one standard deviation of the mean (22 to 82 seconds) which is pretty damn close. Also in a normal distribution, 95% fall within two standard deviations (-8 to 112 seconds) and in my sample, 94% did (even closer!).
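The figures above can be checked with a short sketch. It assumes the population standard deviation (dividing by n) and rounds the mean and SD to whole seconds before counting, which is what the 22 to 82 and -8 to 112 second bounds imply; the names `times` and `share_within` are just illustrative.

```python
from statistics import mean, pstdev

# Completion times in seconds, as listed above (n = 17).
times = [14, 18, 21, 30, 31, 33, 34, 35, 48, 58, 60, 65, 68, 69, 82, 89, 137]

# Assumption: population SD (divide by n), rounded to whole seconds like the mean.
m = round(mean(times))
sd = round(pstdev(times))

def share_within(k):
    """Fraction of times within k SDs of the mean, bounds inclusive."""
    lo, hi = m - k * sd, m + k * sd
    return sum(lo <= t <= hi for t in times) / len(times)

within_1sd = share_within(1)   # 12 of 17 participants
within_2sd = share_within(2)   # 16 of 17 participants

print(m, sd, f"{within_1sd:.0%}", f"{within_2sd:.0%}")
```

With only 17 people, each participant moves these percentages by about 6 points, so the close match to 68% and 95% should be read loosely.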

So simply by picking at least a moderately g loaded novel problem that is both easy enough that no one gives up, yet hard enough that everyone is forced to think, and measuring performance on a natural scale (time taken in seconds) a normal curve emerges, though a somewhat truncated one (the slowest time is much further from the mean than the fastest, perhaps because human hands can only assemble puzzles so fast, regardless of how quick the mind is).

To convert from time in seconds to IQ, all one needs to do is make the natural mean of 52 seconds equal to the IQ mean of 100, and make each natural standard deviation (30 seconds) faster or slower than 52 seconds, equal to 15 IQ points (the IQ standard deviation) above or below 100, respectively.

Thus the elite professional with the Master’s degree in Engineering gets an IQ of 119 (14 seconds) and the roofer with only a high school diploma gets an IQ of 58 (137 seconds). But note that even though IQ appears to be a true interval scale (meaning an X point gap between any two points on the IQ scale is equivalent), it is not a ratio scale, meaning IQs cannot be meaningfully multiplied. So even though IQ 119 is about twice as high as IQ 58, the difference in actual problem solving speed is about an order of magnitude. This is because unlike height, weight and time in seconds to solve puzzles (which can be meaningfully multiplied) the IQ scale has no true zero point.
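As a rough sketch of the conversion just described (the function name `time_to_iq` is made up for illustration), each time is turned into a z-score against the sample mean and SD, with the sign flipped so that faster is higher, and then rescaled to mean 100 and SD 15:

```python
MEAN_SECONDS = 52.0   # sample mean reported above
SD_SECONDS = 30.0     # sample standard deviation reported above

def time_to_iq(seconds):
    """Map a completion time to the IQ scale; faster times give higher scores."""
    z = (MEAN_SECONDS - seconds) / SD_SECONDS
    return 100 + 15 * z

fastest_iq = time_to_iq(14)    # about 119
slowest_iq = time_to_iq(137)   # 57.5, rounded to 58 in the text

# Ratios are meaningful for seconds but not for IQ points.
iq_ratio = fastest_iq / slowest_iq   # about 2
speed_ratio = 137 / 14               # about 9.8

print(round(fastest_iq), slowest_iq, round(iq_ratio, 1), round(speed_ratio, 1))
```

The two ratios disagree because the IQ scale’s zero is an arbitrary point 6.67 SDs below the mean, not an absence of ability.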

Of course the normal curve only applies to the biologically normal population, so it’s interesting to note that it’s now standard procedure to exclude pathological cases from IQ test norming samples. Indeed one man was excluded from my sample after he told me that years ago he had suffered a stroke (quite unusual for a man in his thirties). This man struggled greatly with the above puzzle, only joining 25% of the cuts within the 3 minute time limit. The only way to estimate what his time would have been had he not given up is to divide 3 minutes by 25%, which gives 12 minutes (720 seconds). This is more than 22 standard deviations slower than the mean of the normal sample, and equivalent to an IQ of -234! Such extreme deviations remind us how sensitive the normal curve is to the normality of the sample.
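The extrapolation used here, and in the footnote for the participant who reversed the ear pieces, treats a partial solution as a full solution at proportionally lower speed. A minimal sketch, with `extrapolated_time` as an illustrative name and the linear-progress assumption taken from the text rather than from any test manual:

```python
def extrapolated_time(elapsed_seconds, fraction_joined):
    """Estimate full-completion time, assuming progress is linear in time."""
    return elapsed_seconds / fraction_joined

footnote_case = extrapolated_time(67, 0.75)    # about 89 seconds
stroke_case = extrapolated_time(180, 0.25)     # 720 seconds

# Same conversion as for the normal sample: mean 52 s, SD 30 s.
z = (52 - stroke_case) / 30                    # about -22.3
iq = 100 + 15 * z                              # about -234

print(round(footnote_case), stroke_case, round(z, 1), round(iq))
```

Linear progress is a strong assumption: puzzle pieces usually get easier to place as the picture emerges, so the true completion time could be shorter or, if the person was stuck, effectively unbounded.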

*one person solved the puzzle in 67 seconds, but the ear pieces were reversed, so only 75% of the cuts were correctly joined. I thus considered this equivalent to a perfect performance at 75% of the speed (67 seconds/0.75 = 89 seconds).

Why didn’t megafauna go extinct in Africa?

Have you ever wondered why we have to go all the way to Africa to go on a safari? For Africa is the land of 13,000 lb elephants and 18 foot tall giraffes.

But what many do not realize is that 40,000 years ago, the whole world looked like an African safari. North America and Eurasia were home to Pachystruthio dmanisensis, a flightless bird that stood 11.5 feet tall and weighed nearly 1,000 lbs.

North America was also home to the short-faced bear, which stood up to 14 feet tall, weighed about 1,700 lbs, and could run up to 40 miles per hour. And of course who could forget the 13,000 lb mammoth, which lived on every continent except Australia.

Emily Lindsey writes:

Scientists call these giant animals “megafauna” (mega = big, and fauna = animals). We still have megafauna in the world, but there used to be a whole lot more of it. In fact, it appears that having a large number of large-bodied animals in an ecosystem is actually the normal state for our planet, at least for the geologic era we are living in today, the Cenozoic (or “Age of Mammals”). But sometime in the past 50,000 years (very recent geologically), everywhere except for Africa, most of those large animals became extinct. And we still aren’t sure why!

Some scientists think megafauna survived in Africa because humans evolved there, so large animals had more time to adapt to us. However, members of the genus Homo have been living outside Africa for 2 million years, so Eurasian megafauna had time to adapt to us too. Another theory is that megafauna outside Africa were killed off by the extreme climate changes they endured there.

But in asking why megafauna went extinct everywhere except Africa, politically correct scientists are forced to ignore the elephant in the room (pun intended): HBD. If Arthur Jensen was correct about the black-white IQ gap being genetic, perhaps Africans simply hadn’t evolved the intelligence to hunt large game.

But that can’t be the whole story. If racial differences in IQ evolved because we needed more intelligence to survive the non-tropics, how were Australian aboriginals (who retain a tropical phenotype) able to kill off 100% of their giant mammals? Migrating from Africa to Australia means their ancestors must have spent some time in the non-tropical ice age Middle East. Was this enough time for them to evolve the intelligence to hunt big game or was the big game in Australia simply easier to hunt because it had not had the time to evolve ways to avoid humans?

If cold climate selected humans were especially evolved for hunting big game, and if the big game on continents where humans had never been were especially bad at evading human predators, then these two factors predict the biggest megafauna massacre of all should have occurred in the Americas where both conditions were met: cold adapted hunters (humans entered the Americas from Siberia) entering a continent where humans had never been.

And indeed that seems to be the case. Paleo-biologist Rebecca Terry at Oregon State University says “pretty advanced weaponry was definitely present, and the extinctions in the New World in North America and South America were really extreme as a result.”

11,000 years ago (shortly after modern humans entered the New World), the average weight of a non-human mammal in North America was about 200 pounds compared to only 15 pounds today.

Northern American IQ: circa 1937 to circa 2014 (2nd edition)

The following article is an updated revision of an article I published in August 2019 about how 21st century Northern Americans score on an IQ test normed before the Second World War. The reason for the update is that in December 2019, the sample size of my study increased by 13% (from n = 15 to n = 17). I had originally hoped to collect more data before publishing an update but with the uncertainty surrounding the coronavirus crisis, it’s unclear when that will be.

The Flynn effect, popularized by James Flynn, refers to the fact that IQ tests supposedly get easier with time. Although by definition the average IQ of American or British (white) people is always 100, the older the IQ test, the easier it is to score 100. Thus to keep the average at 100, tests like the Wechsler must be renormed every 10 years or so, otherwise the average IQ would increase by about 3 points per decade.

Although scholars continue to debate whether the Flynn effect reflects a genuine increase in intelligence (perhaps caused by prenatal nutrition or mental stimulation) or just greater test sophistication caused by modernity, there’s been remarkably little skepticism about the existence of the Flynn effect itself.

Malcolm Gladwell writes:

If an American born in the nineteen-thirties has an I.Q. of 100, the Flynn effect says that his children will have I.Q.s of 108, and his grandchildren I.Q.s of close to 120—more than a standard deviation higher. If we work in the opposite direction, the typical teen-ager of today, with an I.Q. of 100, would have had grandparents with average I.Q.s of 82—seemingly below the threshold necessary to graduate from high school. And, if we go back even farther, the Flynn effect puts the average I.Q.s of the schoolchildren of 1900 at around 70, which is to suggest, bizarrely, that a century ago the United States was populated largely by people who today would be considered mentally retarded.

While few people believe our grandparents were genuinely mentally retarded, it’s taken for granted that they would have scored in the mentally retarded range by today’s standards.

But is this true? I began having doubts over a decade ago when I examined the items on the first Wechsler intelligence scale ever made: the ancient WBI (Wechsler Bellevue intelligence scale). Meticulously normed on New Yorkers in the 1930s, this test remains far and away the most comprehensive look we have at early 20th century white Northern American intelligence, and while some of the subtests looked easy by today’s standards, others, especially vocabulary, looked harder.

The Kaufman effect

What also struck me was how little instruction, probing or coaching people got when taking the ancient WBI, compared to its modern descendant the WAIS-IV. This matters a lot because the way the Flynn effect is calculated on the Wechsler is by giving a new sample of people both the newest Wechsler and its immediate predecessor, in random order to cancel out practice effects, and then seeing which version they score higher on. If they average 3 points lower on the WAIS-IV normed in 2006 than on the WAIS-III normed in 1995, it’s assumed IQ increased by 3 points in 11 years.

The problem with this method (as Alan Kaufman may have discovered before me) is that the subset of the sample that took the newer version first has a huge advantage on the older version compared to the norming sample of the older test (over and above the practice effect which is controlled for), because the norming sample of the older test was never given coaching and probing.

Statistical artifact

A Promethean once said maybe the Flynn effect is just a statistical artifact of some kind. He never told me what he meant, but it got me thinking:

One problem with how the Flynn effect is calculated on the Wechsler is that it’s assumed that gains over time can be added. For example it’s assumed that you can add the supposed 7.8 IQ gain from WAIS normings 1953.5 to 1978 to the 4.2 IQ gain from normings 1978 to 1995 to the 3.7 IQ gain from normings 1995 to 2006, for a grand total of 15.7 IQ points from normings 1953.5 to 2006.

This would make sense if we were talking about an absolute scale like height, but is problematic when talking about a sliding scale like IQ. For example, suppose the raw number of questions correctly answered in 1953.5 was 20 with an SD of 2. By 1953.5 standards, 20 = IQ 100 and every 2 points = 15 IQ points above or below 100. Now suppose in 1978, people averaged 22 with an SD of 1. That’s a gain of 15 IQ points by 1953.5 standards. Now suppose in 1995 people average 23 with an SD of 2. That’s a gain of 15 IQ points by 1978 standards. Adding the two gains together implies a 30 point gain from 1953.5 to 1995, but by both 1953.5 and 1995 standards, the difference is only 22.5 points (about 23).
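The hypothetical numbers in this example can be run directly. In the sketch below, `iq_gain` is an illustrative helper that expresses a change in raw mean in IQ points on whichever norms are chosen as the yardstick; the raw means and SDs are the made-up ones from the paragraph above, not real WAIS data.

```python
def iq_gain(ref_mean, ref_sd, new_mean):
    """IQ-point gain of new_mean over ref_mean, measured in the reference SD."""
    return 15 * (new_mean - ref_mean) / ref_sd

# Hypothetical (raw mean, raw SD) for each norming, taken from the example above.
norms = {1953.5: (20, 2), 1978: (22, 1), 1995: (23, 2)}

gain_1953_to_1978 = iq_gain(*norms[1953.5], norms[1978][0])   # on 1953.5 norms
gain_1978_to_1995 = iq_gain(*norms[1978], norms[1995][0])     # on 1978 norms
chained = gain_1953_to_1978 + gain_1978_to_1995

# Measuring the whole interval on a single set of norms instead.
direct_on_1953 = iq_gain(*norms[1953.5], norms[1995][0])
direct_on_1995 = 15 * (norms[1995][0] - norms[1953.5][0]) / norms[1995][1]

print(chained, direct_on_1953, direct_on_1995)
```

The chained figure is inflated because the middle norming had a narrower raw-score spread, so one raw point counted for twice as many IQ points in the second step as in the first.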

Changing content

Another problem with studying the Flynn effect is the content of tests like the Wechsler is constantly changing. This is especially problematic when studying long-term trends in general knowledge and vocabulary. If words that are obscure in the 1950s become popular in the 1970s, then people in the 1970s will score high on the 1950s vocabulary test. Meanwhile the 1970s vocabulary test may contain words that don’t become popular until the 1990s. Thus adding the vocabulary gains from the 1950s to the 1970s to the gains from the 1970s to the 1990s might give the false impression that people in the 1990s will do especially well on a 1950s vocabulary test, when in reality, many words from the 1950s may have peaked in the 1970s and are even more obscure in the 1990s than they were in the 1950s.

An ambitious study

Given the Kaufman effect, the statistical artifact, and changing content, I realized the only way to truly understand the Flynn effect is to take the oldest quality IQ test I could find and replicate its original norming on a modern sample.

In 2008 I made it my mission to replicate Wechsler’s 1935-1938 norming of the very first Wechsler scale. Ideally I should have flown to New York where Wechsler had normed his original scale, but if Wechsler could use white New Yorkers as representative of all of white America (WWI IQ tests showed white New Yorkers matched the national white average), I could use white Ontarians as representative of all of white Northern America (indeed white Americans and white Canadians have virtually the same IQs). The target age group was 20-34 because this was the reference age group Wechsler had used to norm his subtests.

It took over a decade but I was gradually able to arrange for 17 randomly selected white young adults to take the one hour test. They were non-staff recruited from about half a dozen fast food/coffeehouse locations in lower to upper middle class urban and suburban Ontario. The final sample ranged in education from 9.5 years (early high school dropout) to 18 years (Master’s degree in Engineering from one of Canada’s top universities). The mean self-reported education level was 12.9 years (SD = 2.12) suggesting that despite the lack of female participants, the sample was fairly representative (the average Canadian over 25 has about 13 years of schooling); in cases where those below the age of 25 were in the process of finishing a degree, they were credited as having it.

Testing conditions were not optimal (environments were sometimes noisy, at least one person had a few beers before testing; another was literally falling asleep during the test) and 17 people is way too small a sample to draw statistically significant conclusions about 11 different subtests. One man with a conspicuously low score was removed from the sample after he stated that he had years ago suffered a stroke.

Nonetheless, the below table shows how whites tested in 2008 to 2019 compared to Wechsler’s 1935-1938 sample, with the last column showing the expected scores of the 21st century sample, extrapolating gains James Flynn calculated from 1953.5 to 2006 (see page 240 of his book Are We Getting SMARTER?) to the current study: circa 1937 to circa 2013.5.

Note: the 11 subtests were scaled to have a mean of 10 and an SD of 3 in the original young adult norming sample, while the verbal, performance and full-scale IQs were scaled to have a mean of 100 and an SD of 15. Note also that vocabulary is an alternate test, not used to calculate either verbal or full-scale IQ on the WBI. About a third of my sample (5 of 17) did not take Digit Symbol, so for these, Performance and full-scale IQs were calculated via prorating.

| Test | Nationally representative sample of young white adults (NY, 1935 to 1938) | Randomish sample of young white adults (2008 to 2019, ON, Canada) | Expected WBI scores in 2008-2019 based on Flynn’s calculated rate of increase |
| --- | --- | --- | --- |
| Information (general knowledge test) | 10 (SD 3) | 8.41 (SD 2.55) | 12.3 |
| Similarities (verbal abstract reasoning) | 10 (SD 3) | 13.35 (SD 2.91) | 15.54 |
| Arithmetic (mental math) | 10 (SD 3) | 9.18 (SD 4.34) (a) | 11.02 |
| Vocabulary | 10 (SD 3) | 9 (SD 2.5) | 14.95 |
| Comprehension (common sense & social judgement) | 10 (SD 3) | 9.47 (SD 2.93) | 13.93 |
| Digit Span (attention & rote memory) | 10 (SD 3) | 9.71 (SD 2.63) | 11.46 |
| Picture Completion (visual alertness) | 10 (SD 3) | 10.71 (SD 3.1) | 14.52 |
| Picture Arrangement (social interpretation) | 10 (SD 3) | 10.24 (SD 2.73) | 13.35 |
| Block Design (spatial organization) | 10 (SD 3) | 13.12 (SD 3.31) | 12.91 |
| Object Assembly (spatial integration) | 10 (SD 3) | 11.82 (SD 1.89) | 14.06 |
| Digit Symbol (rapid eye-hand coordination) | 10 (SD 3) | 11.12 (SD 2.82) (b) | 14.66 |
| Verbal IQ | 100 (SD 15) | 103.8 (SD 14.73) | |
| Performance IQ | 100 (SD 15) | 109.3 (SD 12.11) | |
| Full-scale IQ | 100 (SD 15) | 106.9 (SD 13.63) | 122 |

(a) This subtest contained a unit conversion item that seemed biased against Canadians, so for those who advanced far enough to fail this item, scores were prorated; had they not been, the mean would have been 7.53 (SD 3.54).

(b) Only 12 of the 17 subjects took this subtest.
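To make the comparison in the table easier to scan, the sketch below restates each subtest’s observed and expected change in SD units against the 1930s norms (mean 10, SD 3). The numbers are copied from the table; the variable names are illustrative, and with n = 17 none of the individual differences should be taken as statistically established.

```python
NORM_MEAN, NORM_SD = 10, 3   # WBI scaled-score norms, 1935 to 1938

# subtest: (observed 2008-2019 mean, expected mean under Flynn's rate)
scores = {
    "Information": (8.41, 12.3),
    "Similarities": (13.35, 15.54),
    "Arithmetic": (9.18, 11.02),
    "Vocabulary": (9.0, 14.95),
    "Comprehension": (9.47, 13.93),
    "Digit Span": (9.71, 11.46),
    "Picture Completion": (10.71, 14.52),
    "Picture Arrangement": (10.24, 13.35),
    "Block Design": (13.12, 12.91),
    "Object Assembly": (11.82, 14.06),
    "Digit Symbol": (11.12, 14.66),
}

for name, (observed, expected) in scores.items():
    obs_gain = (observed - NORM_MEAN) / NORM_SD
    exp_gain = (expected - NORM_MEAN) / NORM_SD
    print(f"{name:20s} observed {obs_gain:+.2f} SD, expected {exp_gain:+.2f} SD")

# Subtests that met or beat the extrapolated score, and those below the old norm.
met_expectation = [n for n, (o, e) in scores.items() if o >= e]
below_old_norm = [n for n, (o, _) in scores.items() if o < NORM_MEAN]
print(met_expectation, below_old_norm)
```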

Conclusion

The Flynn effect is dramatically smaller than we’ve been led to believe, at least on tests of specific information that may become obscure over generations. By contrast Similarities (abstract reasoning) and Block Design (spatial analysis) have indeed increased by amounts comparable with Flynn’s research. These two abilities may conspire to explain why some of the largest Flynn effects have been claimed on the Raven’s Progressive Matrices, an abstract reasoning test using a spatial medium.

It’s unclear if these are nutritional gains caused by increasing brain size, neuroplastic gains caused by cultural stimulation, or mere teaching to the test caused by schooling, computers and brain games.

Lynn (1990) argued the Flynn effect was caused by nutrition, citing a twin study proving nutrition gains are more pronounced on Performance IQ (consistent with the Flynn effect). Research on identical twins (where one twin gets better prenatal nutrition than the other) has shown that by age 13, the well nourished twin exceeds his less nourished counterpart by about 0.5 SD on both head circumference and Performance IQ, but not at all on verbal IQ. Thus it’s interesting that 21st century young Northern American men today exceed their WWII counterparts by about 0.5 SD on both head circumference (22.61″ vs 22.3″) and Performance IQ (109 vs about 100).

One possibility is that Performance IQ gains are entirely caused by improvements in the biological environment (prenatal health and nutrition), while verbal IQ gains are entirely caused by cultural advances (i.e. education), though somewhat negated by knowledge obsolescence.