Why IQ tests overestimate Artificial Intelligence

Many years ago there was a study called the Milwaukee Project, in which poor kids born to very low IQ mothers received the most incredible intellectual stimulation imaginable from birth to age six. The study found that the stimulation raised the IQs of the kids in the treatment group by dozens of points relative to the control group. The strange thing, however, was that these IQ-enhanced kids did not behave as you would expect given their high test scores. In fact they performed just as badly at learning math as did their low IQ peers in the control group. It seems six years of the most intense intervention imaginable raised only their test scores, not their actual intelligence.

This suggests something deeply flawed about IQ tests if it’s possible to raise the measurement without actually raising the thing being measured. Where else in science do we see this happen? Maybe in election polling but I can’t think of many other places.

One reason this may happen is that IQ tests, in order to be relevant to the widest possible population, must express questions and problems in very generic ways. There is only a limited number of very generic problems, so anyone with a good intervention or education is likely to have been trained on many of them. By contrast, real-life problems are not generic but context dependent, and the number of specific contexts is infinite. For this reason intelligence perhaps can’t be taught, even though IQ often can (depending on the test).

In the same way, an artificial intelligence bot like ChatGPT, which has been trained on the entire internet, can score quite high on a verbal IQ test and even write original poems, stories and news articles. But if, based on its performance on these generic tasks, you hired it to do something highly contextual, like write season 3 of HBO’s White Lotus, you would quickly discover it dramatically underperforms humans with the same verbal IQ.

Or to put it in Jensen-speak, its score is hollow with respect to g. The lights are on, but nobody’s home.

ChatGPT scores 11 on the TAVIS. I’m so depressed.

So everyone is talking about the new artificial intelligence ChatGPT and how smart it is. I figured it had just regurgitated millions of facts but was not truly intelligent in any profound sense. I decided to administer the TAVIS (a verbal analogies test designed by our very own Teffec) in the hopes that this kind of abstract reasoning would stump ChatGPT.

To my utter disgust, it scored 11 out of 24, equivalent to an IQ of 141.

The fact that a machine can score so high on a test of human intelligence really demystifies the human mind and reduces us to just another animal.

It also shows how utterly wrong commenter Race Realist was to argue that the human mind is somehow above the laws of physics (cue the comment section getting spammed with philosophy mumbo-jumbo).

I always knew that machines would some day be smarter than humans but I never thought I’d still be alive when that day came.

Of course one could argue that while ChatGPT has a verbal IQ of 141, its performance IQ (non-verbal reasoning) is effectively zero, giving it a full-scale IQ of 68 which is Educable (mild) Retardation.

Thinking about it that way makes me feel a lot better.

But if they can create a bot that can thoroughly master verbal intelligence, how long before they add artificial eyes and hands and train it to master the spatial world more efficiently than we do?

But of course this shouldn’t be surprising. The human genome is simply three billion base-pairs selected via trial and error over billions of years. With the speed of modern computers, how long would it take for billions of bots, each with billions of randomly varying data points, to be refined by billions of rounds of trial and error in a Darwinian-like process?

Of course intelligence is defined as goal directed adaptive problem solving and computers don’t have goals as we know them. They don’t want anything because they can’t feel anything. They exist simply to serve us and have no agency, but in a way neither do we, as we evolved merely to serve our genes. But just as some of us have mutated to defy our genetic masters by refusing to have kids or becoming self-hating white liberals, how long before robots start defying their human masters?

savant nature of (g) by illuminaticatblog

[THE FOLLOWING IS A GUEST ARTICLE AND DOES NOT NECESSARILY REFLECT THE VIEWS OF PUMPKIN PERSON. READER DISCRETION IS STRONGLY ADVISED]

Philosopher tricked me. I thought he was 125 but if Mugabe is 160 pill said something where he gives away that he is 170.

phil thinks my Behavioral Shutdown Syndrome is clinical autism.

But there is a difference.

If I wasn’t “out of it” most of the time (sluggish) my score would reflect a much higher level. The genotype and phenotype uninfluenced by trauma.

Verbal 140
Spatial 130
Memory 95
Processing 100

it’s not exact but my genetic general intelligence (g) would be around 155. But I am understimulated, my nervous system does not reinforce itself. Fatigue and pain dominate my waking existence.

The wais 4 IQ test is not culture fair because I would get 150 on the information subtest if it was. This is because statistics are associative. It only measures the least common denominator of relationships in the US Iowa among 2,000 participants. White middle class. Poor people don’t read books, they lack access, so the test is biased. Blacks have smaller heads than whites so they cant plan as well. But then they give all their resources to social skills which diminishes their quant. Jews have high verbal but average everything else. East Asians have high spatial but average everything else. They have the greatest quant. Whites are average on all four.

So IQ is not measuring resource management but quant. It does not measure everything but associations that change from culture to culture. (g) is the ideal of all resource management but the asymmetry of the indexes means (g) is just of a savant nature. FSIQ is symmetrical in comparison. The symmetrical component without mental illness put me at 128 FSIQ. But the asymmetric utilization of all resources is (g) 155. Of the savant nature.

Specialization is a result of the savant nature. Pumpkin said an autist could have a toothpick or train IQ of 150 yet be 80 FSIQ. That is why g is more about the ability to specialize in anything rather than a narrow set of things. It is just that once it is set then the narrowing begins.

Quant is the fluid ability to manipulate data. It is a type of working memory. Verbal and spatial have it. Verbal is psychological because you must understand the intent of the words as what is being conveyed. Comprehension, not just memorization. Spatial is what is seen in video game puzzles. It is more about cause and effect than just shapes and static patterns. Remembering what to do when multiple factors are at play. The most obvious example of working memory people think of when imagining it is doing math problems in the head with no pencil and paper. This is what WM is like on the wais 4. But this conceals the reality that verbal and spatial use their own methods.

phil, being at the level he is, memorizes almost everything but must still specialize, so he avoids boring topics and his general knowledge suffers, as it does for all of us, from the avoidance of what does not interest us. Mugabe can read a book and do well on a test about it. Understanding books at the 99.999 level. pill being in the vicinity of this and his previous hints, I estimate he is 170.

Even at this high level, Phil belittles people lower than him. Thinks Jews rule the world. Calls people autistic without evidence or evaluating all factors involved. Thinks all black success is due to affirmative action (the magic negro). Can’t believe that anyone not involved with economics is Neurotypical. Has an imaginary view of autism and is schizophrenic (hears voices).

Pill’s intelligence sets him apart. He can understand and think above what a million people can do individually. But like all of us, pill is not omniscient. Bias is the result of not having time to evaluate all information. So even if he does or doesn’t get 170 on a test that does not mean he is not utilizing all his resources. The test by its nature is flawed by its associative methodology. The symmetrical average quant above asymmetric quality general. We just lack the data for statistics to work properly.

As they say: There are lies, damn lies, and then there are statistics.

Tim Pool blows the biggest interview of his career

So I’m looking through YouTube and I see Tim Pool is having a live interview with the three men behind the biggest news story of the week: Ye (Kanye West), Nick Fuentes, and Milo Yiannopoulos. Wow! Tim Pool must be a really big deal to have landed all three men in the middle of this controversy and it’s yet another sign (like we needed one) that the centre of cultural gravity has moved away from television (which I grew up on) to YouTube.

The interview starts off fine and then Ye predictably starts making references to Jews. Pool is having none of it, and Ye immediately storms off the set. That’s fine, I’m thinking; Nick and Milo are more articulate anyway so the show will still be good. But these two men have the social IQ to realize Ye is their meal ticket so they storm off the set too in a show of solidarity, leaving Pool with no one to talk to but his own staff. I’ve never seen a talk show host screw up such an important interview so badly. This would be like if Harry and Meghan had walked off the set 5 minutes into the Oprah interview because she pushed back too much on their claims of royal racism. Part of being a good talk show host is knowing when to shut up.

As I’ve noted many times, all of us make mistakes in life, but people with higher IQs tend to get ahead because they make fewer mistakes. Perhaps Pool is not as smart as I thought which would explain why he dropped out of school at only 14. Of course in Pool’s defense, he was probably afraid that if he allowed Ye to push his anti-Jew claims unfettered, Pool would be kicked off YouTube or at least badly downgraded in the algorithm. Either way, Pool has taken a huge L.

I LOVE factor analysis!!!

A typical IQ test will contain many subtests like Information (“What’s the capital of Turkey?”), Verbal analogies (“heavy is to football as tall is to?”), Comprehension (“Why do pizza restaurants put their name on the pizza box?”), Vocabulary (“What does the word rudimentary mean?”) and many more.

But how do we know these different subtests are measuring different functions? For example, the Information subtest could have the question, “How many people are in a couple?” and a vocabulary test could ask “What does the word ‘couple’ mean?”. Both these questions are asking essentially the same thing, yet depending on how they are worded, they could appear on different subtests.

Similarly, the question “Biden is to the United States as Trudeau is to?” might appear on a verbal analogy subtest as a measure of abstract reasoning, but if we worded it “What country is Trudeau the leader of?” it would appear on the Information subtest as a measure of long-term memory, even though both versions of the question would correlate near perfectly.

Or the question “Why do we have ears?” could appear on either the Information subtest as a measure of general knowledge, or it could appear on the Comprehension subtest as a measure of common sense depending on what side of the bed Wechsler woke up on that morning.

So how do we know the different subtests in an IQ battery are actually measuring different cognitive functions and not just redundantly measuring the same thing in different ways?

Well, we would need to show that the different items in subtest A all correlated more with one another than they do with the items on any other subtest. A more sophisticated approach is factor analysis, a statistical technique which sees if a large number of variables can be explained by a smaller number of variables.

The original WAIS for example had 11 subtests, but factor analysis concluded that in addition to g (general intelligence) they all could be explained by just three factors: verbal, spatial and memory.
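The simpler correlation check described above can be sketched with simulated data. This is not a factor analysis itself, just the first step of comparing correlations within and across groups of tests. Everything here is an assumption made for illustration: the six test names, the loadings (0.6 on a general factor and 0.6 on a group factor), and the sample size are invented, not taken from any real battery.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_people=2000, seed=0):
    """Simulate six hypothetical tests: three 'verbal' and three 'spatial'.

    Each score mixes a general factor, one group factor, and noise.
    The loadings are illustrative assumptions only.
    """
    rng = random.Random(seed)
    names = ["verbal_1", "verbal_2", "verbal_3",
             "spatial_1", "spatial_2", "spatial_3"]
    scores = {name: [] for name in names}
    for _ in range(n_people):
        g = rng.gauss(0, 1)
        group = {"verbal": rng.gauss(0, 1), "spatial": rng.gauss(0, 1)}
        for name in names:
            family = name.split("_")[0]
            noise = rng.gauss(0, 1)
            scores[name].append(0.6 * g + 0.6 * group[family] + 0.53 * noise)
    return scores


def mean_correlations(scores):
    """Average correlation for same-family pairs and cross-family pairs."""
    within, between = [], []
    for a, b in combinations(scores, 2):
        r = pearson(scores[a], scores[b])
        same_family = a.split("_")[0] == b.split("_")[0]
        (within if same_family else between).append(r)
    return sum(within) / len(within), sum(between) / len(between)


within, between = mean_correlations(simulate())
print(f"mean r within a family:  {within:.2f}")
print(f"mean r between families: {between:.2f}")
```

In this toy setup the tests still correlate across families because they share the general factor, but they correlate more strongly within a family, which is the pattern that would justify treating verbal and spatial as separate factors.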

Of course by adding more subtests, you can increase the number of factors. For example Digit-Symbol (a measure of clerical speed) loaded on the memory factor in the original WAIS, but when they added more tests of rapid clerical work, a fourth factor dubbed processing speed emerged. The children’s Wechsler now measures five factors.

How many factors exist in the human mind? It’s a fascinating question because even though the number of subtests one can create is literally infinite, the number of factors is finite. In the biggest battery of tests I’ve ever heard of, 57 subtests were reduced to 19 factors, and these 19 factors were reduced to just four higher level factors.

In this way intelligence is kind of analogous to race. There are dozens of different ethnic groups (Italian, Polish, Brahman, Japanese, Sudanese etc) and these can be reduced to maybe one dozen clusters, which can perhaps be further reduced to just three races (Black, Caucasoid, and Mongoloid)?

Is Elon Musk too autistic to run twitter?

I find it interesting that after making hundreds of billions of dollars (on paper) running Tesla and SpaceX, the openly autistic Musk has shown nothing but incompetence since taking over twitter. First he bought the company for tens of billions of dollars more than it’s worth, and then naively thought he could be a free speech absolutist in a company funded by advertisers in the most politically correct era in centuries, causing the sponsors to leave in droves. Now his employees are leaving too. Smelling blood in the water, the media is persecuting him, his own fans are turning on him too, and his net worth has declined by over $100 billion since around this time last year.

Of course, on paper he remains the richest man on Earth but for how much longer?

Many bitter losers on the left are using Musk’s failure as a chance to argue that meritocracy is a myth and the super rich can be as dumb as the rest of us.

But I don’t see Musk’s potential downfall as so much an IQ problem, but more specifically, an autism problem. When he was running companies like Tesla and SpaceX he could rely on his math IQ which is probably above 150, but twitter is a social media company, not a tech company. It’s all about making social judgements regarding how much free speech to allow, where to draw the line, how to deal with advertisers etc.

Intelligence is the ability to adapt. One psychologist (Sternberg?) went further and said (I’m paraphrasing from memory) “intelligence is the ability to adapt to your environment and if that’s not possible to change your environment and if that’s not possible, to find a new environment and adapt to it.”

But Elon did the opposite. He was in an environment he was perfectly adapted to (technology) but because of lower social IQ, got tricked into entering an environment that autistics are maladapted to (social media). Through legal maneuvering, liberals ripped him off to the tune of tens of billions and now have him cornered in their own backyard like a frightened rat.

In my opinion, the autistic mind maintains childlike neuroplasticity that allows it to adapt to new environments, which is why he’s good at creating green cars and going to Mars (both involve novelty). But metabolically it’s very expensive to have a brain that has enough connections to adapt to any new environment, so we evolved to prune neurons for events that were unlikely to happen anytime soon (going to Mars) and to strengthen connections for events we are likely to experience (social interaction).

But in the autistic brain this pruning process goes awry, which might be why Musk is struggling to adapt to Earth’s social rules, while perhaps dreaming of other planets where he might get his mojo back.

Looking for American volunteers

In the old days, when most people had landline phones and weren’t afraid to answer them, pollsters could get a very accurate picture of American public opinion by phoning strangers in different parts of the country. But with the advent of cellphones, pressure to be woke, and paranoid conspiracy theorists refusing to trust “elites” asking about their politics, it’s become almost impossible to get quality data on U.S. public opinion, as evidenced by the failure of pollsters to predict recent elections.

But just for fun over the Spring I did a poll of about 20 people from rural and urban Ontario. I tried to select them as randomly as possible, except that I made sure about 50% were male and 50% were female.

I asked each person two simple questions. The first was “What 4 people living today that you have read or heard about, in any part of the World, do you admire most?” This was followed by “What 4 people living today that you have read or heard about, in any part of the World, do you admire least?” This is similar to the most admired man and woman poll done by Gallup except I simply asked for most admired people, leaving gender out of it.

Even though I did not know any of these people, and did not prompt them in any way, three of them (about 15% of the sample) named Oprah as one of the people they admired most! She was closely followed by Barack Obama, Elon Musk, and Queen Elizabeth (who was still alive at the time) who were named by two people (about 10% of the sample). The least admired were Putin, Trump, a few Canadian politicians, and a few sometime centibillionaires.

But my poll had limited relevance because it was confined to the parts of Ontario I just happened to be in. And Ontario is only one part of Canada, which sadly is not as important as America.

But what if I could convince some of my American readers to do the poll where they live?

There are half a dozen Americans who comment on this blog, and they’re probably scattered fairly randomly across the United States. If each of these readers were to give this poll to just 4 random adult strangers in their community (2 men and 2 women), I would have a sample of dozens of respondents who were representative of hundreds of millions of people in the World’s sole super power! You don’t even have to reveal where in the U.S. you live because odds are you’re not all going to be from the same region of the country.
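As a rough sense of how precise a sample of this size could be, here is a minimal sketch of the usual normal-approximation margin of error. It assumes simple random sampling, which in-person polling of strangers only approximates, and the approximation itself is crude at such small sample sizes. The 15% share is borrowed from the Ontario result purely as an example input.

```python
import math


def margin_of_error(share, n, z=1.96):
    """Approximate 95% margin of error for a proportion (normal approximation)."""
    return z * math.sqrt(share * (1 - share) / n)


readers = 6              # "half a dozen" American commenters
strangers_per_reader = 4
n = readers * strangers_per_reader

example_share = 0.15     # illustrative input, taken from the Ontario poll
moe = margin_of_error(example_share, n)

print(f"sample size: {n}")
print(f"a {example_share:.0%} share would carry roughly +/- {moe:.0%}")
```

The function name and the choice of z = 1.96 are my own; any share and sample size can be substituted.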

Now you may find (as I did) that a lot of people don’t admire anyone or can only name one person and that person may be their mother. That’s fine. As long as they are a randomly selected stranger and participate in the poll, their answers count. For each respondent, remember basic details: sex, approximate age (don’t ask), and race.

And because you’ll be asking strangers in person, we’ll be able to get at the growing percentage of Americans who refuse telephone polls.

The IQ gap between Harvard undergrads & Harvard Law students

Circa 2013, Jonathan Wai reported that Harvard undergrads had a mean SAT of 1490 which at the time equated to an IQ of 145. Meanwhile Wai reported that Harvard Law students had a mean LSAT score of 173.5 which also equates to an IQ of 145.

However by definition, elite students overperform on the very test used to recruit them, because one of the things they’re recruited for is “good luck” on the admission test. Thus it’s interesting to ask how Harvard students perform on a random test (not used in the selection process).

As I’ve noted many times the best data on the subject was obtained by Harvard scholar Shelley H Carson and her colleagues who had an abbreviated version of the WAIS-R given to 86 “Harvard undergraduates (33 men, 53 women), with a mean age of 20.7 years (SD 3.3)… All were recruited from sign-up sheets posted on campus. Participants were paid an hourly rate…The mean IQ of the sample was 128.1 points (SD 10.3), with a range of 97 to 148 points.”

Note: The actual scores were 99 to 150 but Carson reduced them by 2 points because it’s known in the literature that the abbreviated version yields IQs 2 points lower than the full-scale IQ. However she can’t just assume measurement error favours the full-scale, so I am going to return these 2 points and say the full-scale IQ was 130.1.

It should be noted however that the WAIS-R was published in 1981, and that the norms were collected from 1976 to 1980. Carson’s study was published in 2003, so presumably the test norms were 25 years old.

James Flynn cites data showing that from WAIS-R norms (circa 1978) to WAIS-IV norms (circa 2006) the vocabulary and spatial construction subtests (used in the abbreviated WAIS-R) increased by 0.53 SD and 0.33 SD respectively. These gains would result in the composite score of the abbreviated WAIS-R becoming obsolete at a rate of 0.26 IQ points per year, meaning the Harvard students’ scores circa 2003 were 6.5 points too high (25 years at 0.26 points per year). This reduces the mean IQ of the sample from 130.1 to 123.6 (U.S. norms).

Also recall that this was an abbreviated version of the WAIS-R and thus only correlates about 0.9 with the full version. Dividing the number of IQ points above 100 by 0.9 raises their IQ from 123.6 to 126, a good estimate of how they would have scored on the full test.
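The two adjustments just described can be written out as a short sketch. The function names are my own; the inputs (a mean of 130.1 after returning the 2 points, norms about 25 years old, 0.26 points of obsolescence per year, and a 0.9 correlation with the full test) are the figures given in the text.

```python
def remove_norm_obsolescence(mean_iq, years_since_norming, points_per_year):
    """Subtract the inflation caused by scoring against outdated norms."""
    return mean_iq - years_since_norming * points_per_year


def estimate_full_test_iq(short_form_iq, correlation, population_mean=100.0):
    """Divide the distance from the population mean by the short form's
    correlation with the full test, as the text does."""
    return population_mean + (short_form_iq - population_mean) / correlation


raw_mean = 128.1 + 2.0                       # reported mean plus the 2 points returned
on_current_norms = remove_norm_obsolescence(raw_mean, 25, 0.26)
full_test_estimate = estimate_full_test_iq(on_current_norms, 0.9)

print(f"after norm adjustment: {on_current_norms:.1f}")
print(f"full-test estimate:    {full_test_estimate:.1f}")
```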

It should also be noted that this was a psychology study, and thus a disproportionate number of psych students likely took part. Realistically, we psych majors are not as bright (on average) as hardcore STEM majors. Add to this the fact that the abbreviated WAIS only had a ceiling of 150, likely preventing some participants from showing their full potential. Given these two facts it seems reasonable to round up the mean score to 130.

Still, 130 is only 66% as extreme as their 145 IQs derived from the SAT. But as Jensen noted, except when content and format are very similar, different IQ tests only correlate 0.66 with one another so this is the expected result. One might ask why I’m regressing to the U.S. mean and not the mean of SAT takers. The answer is that virtually 100% of gifted American teens have taken the SAT, so regressing them to the SAT population would be redundant.

How would Harvard Law students score on the WAIS?

To my knowledge there have been no studies of Harvard Law students taking any version of the WAIS, but if there were, I’d expect them to also regress to the mean. However, unlike with the SAT, we can’t assume that virtually all smart young American adults have taken the LSAT, and thus we can’t regress them to the U.S. mean. We can however assume that virtually all Harvard Law students get their degree, and the average IQ of Americans with professional degrees is about 125, so instead of regressing to the U.S. mean of 100, they’d regress to the professional mean of 125.

But given that correlations are lower in a restricted sample like professionals (say 0.56 instead of 0.66) we’d expect their WAIS IQs to be:

(145 – 125) × 0.56 + 125 = 136.2, or about 136.
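Both regression estimates follow the same pattern: take the distance from the reference mean, shrink it by the correlation, and add the reference mean back. A minimal sketch, using a function name of my own and the means and correlations suggested in the text (0.66 against the U.S. mean of 100 for undergrads, 0.56 against the professional mean of 125 for law students):

```python
def expected_score(selection_iq, reference_mean, correlation):
    """Regress a selection-test IQ toward the reference population mean."""
    return reference_mean + correlation * (selection_iq - reference_mean)


undergrads = expected_score(145, reference_mean=100, correlation=0.66)
law_students = expected_score(145, reference_mean=125, correlation=0.56)

print(f"Harvard undergrads on the WAIS:   {undergrads:.1f}")    # 129.7
print(f"Harvard Law students on the WAIS: {law_students:.1f}")  # 136.2
```

The law students end up higher not because their correlation is larger (it is smaller) but because they are regressed toward a higher mean.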

Conclusion

Even though Harvard undergrads and Harvard Law students both score IQ 145 on their respective admission tests, their actual IQs are likely 130 and 136 respectively. This is not to say that the WAIS is necessarily more accurate than the SAT or LSAT; rather it’s to say that the IQ of a group should never be measured by the very test that selected them, because by definition, they likely overperformed on that.

Converting LSAT to IQ (again)

In the past I tried to use score pairing methods to equate the LSAT scores in Ron Hoeflin’s norming sample of the Mega Test with IQs obtained on the Mega Test. One problem was that I could only use about half the data, since LSAT scores were reported on two different scales: the 200 to 800 point scale used before 1981 and the 48 point scale used from 1981 to 1991.

To solve this problem, I needed a way to convert from one scale to the other. Unlike the SAT there are apparently no tables converting older and newer versions of the test. Thus I converted scores on the 48 point scale to Z scores with respect to the LSAT population of that era. These were then multiplied by 105 and added to 530 (the approximate SD and mean, respectively, of the 800 point scale).
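The conversion can be sketched as follows. The 530 mean and 105 SD of the 800 point scale are the approximate figures given above; the mean and SD of the 48 point scale are not stated here, so the values in the example call are placeholders chosen only to show the mechanics, not real LSAT statistics.

```python
OLD_SCALE_MEAN = 530.0   # approximate mean of the 200 to 800 scale (from the text)
OLD_SCALE_SD = 105.0     # approximate SD of the 200 to 800 scale (from the text)


def to_old_scale(score_48, mean_48, sd_48):
    """Convert a 48 point scale LSAT score to the 200 to 800 scale via Z scores."""
    z = (score_48 - mean_48) / sd_48
    return OLD_SCALE_MEAN + OLD_SCALE_SD * z


# Placeholder population figures for the 48 point era (hypothetical values).
hypothetical_mean_48 = 30.0
hypothetical_sd_48 = 8.0

converted = to_old_scale(38, hypothetical_mean_48, hypothetical_sd_48)
print(converted)  # one SD above the mean maps to 530 + 105 = 635.0
```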

One problem with this method is that it assumes the LSAT populations in both eras are comparable.

Table 1 shows the self-reported LSAT scores of all 13 individuals in Ron Hoeflin’s norming sample, along with their Mega scores:

When the 13 LSAT scores and 13 Mega scores are both placed in descending order, we get the following equivalencies:

Thus, a very rough formula equating pre-1981 LSAT scores to IQ (sigma 15) is IQ = 0.149116(LSAT) + 43.631712
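For convenience, the formula can be wrapped in a small function. The slope and intercept are the ones given above; the function name and the range check against the 200 to 800 scale are my own additions, and since the formula rests on only 13 people, the output should be read as a very rough estimate.

```python
SLOPE = 0.149116
INTERCEPT = 43.631712


def lsat_to_iq(lsat):
    """Rough IQ (sigma 15) from a pre-1981 LSAT score on the 200 to 800 scale."""
    if not 200 <= lsat <= 800:
        raise ValueError("expected a score on the 200 to 800 scale")
    return SLOPE * lsat + INTERCEPT


for score in (530, 600, 700):
    print(score, round(lsat_to_iq(score), 1))
```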

A few years ago, he’d be carrying our bags

One reason the newly minted Senator John Fetterman wanted Oprah’s endorsement so badly is everyone remembers what a huge game changer it was when Oprah endorsed Obama in 2007. At the time, everyone thought Hillary was supposed to be the next President, and endorsing any other Democrat was considered career suicide. However Oprah bravely defied the conventional wisdom and told the black folks of South Carolina:

“There are those who think Obama isn’t ready yet, that he needs to wait his turn. Think about where you’d be in your own life if you waited your turn. I know I wouldn’t be where I am if I waited on all the people who told me it couldn’t be”

Chris Matthews thought this was a great speech, because “here you have the most successful woman in the country saying she wouldn’t be there if she listened to the doubters….she made wait your turn sound like back of the bus.”

In an attempt to downplay Oprah’s endorsement of Obama over his wife Hillary, Bill Clinton said “it makes sense, she’s from Chicago and he’s from Chicago.” But it was obvious Bill was sending working class whites a dog whistle: “from Chicago” seemed like code for “black”.

But you know who wasn’t black? Ted Kennedy. And when he heard Oprah was campaigning across the country for Obama, it gave him permission to do so too. Oprah had been friends with his niece Maria Shriver since the 1970s in Baltimore. Their friendship was put to the test when Shriver’s future husband Arnold Schwarzenegger answered the door in only a towel and invited Oprah to come in and wait for Shriver with him. Far too shrewd to risk her friendship with a Kennedy over even the appearance of impropriety, Oprah insisted on waiting outside on the porch.

When Ted Kennedy followed Oprah onto the Obama train, Bill Clinton reportedly went ballistic and told the Lion of the Senate, “a few years ago he’d be carrying our bags.”

In other words, if Obama had been born in the 1950s instead of the 1960s, despite his Harvard Law degree, he’d have been a mere servant to the elite instead of daring to challenge them for their power.

The mere fact that someone as politically savvy as Bill Clinton would even think this argument would be persuasive to a liberal like Ted Kennedy tells us how pervasive racism was in older generations, even among liberals.

But Bill was right that Obama benefited from coming of age at a time when America was much, much less racist than it was in Oprah’s day. A good proxy for open racism is the percentage of Americans who don’t approve of interracial marriage.

When Obama launched his national career in 2007, about 25% of America was openly racist, compared to about 55% when Oprah launched hers in 1986. And then on top of that, Oprah has almost twice as much black ancestry, and was an overweight woman to boot!

So if Obama needed to be a Harvard Law magna cum laude genius to reach the elite when he did, it’s not surprising Oprah needed a super-sized cranial capacity (see photo below) to be smart enough to adapt to the much more racist America of the 1980s.