New version of the aliens test

Preliminary evidence suggests that the aliens test has little, if any, correlation with IQ, at least among readers of this blog. The most likely reason for this is that the test so overloaded working memory that people were essentially just guessing.

As a result I have substantially revised the test in the hopes of reducing the demands on working memory and attention to detail, since the point of the test was to get at higher level, more conceptual ability.

I have reduced the number of aliens you have to compare from 12 to nine and also reduced the number of ways people can be tricked into miscategorizing the aliens.

This is because the point of this test is not just to measure your intelligence but, more importantly, to show that there is a correct way to organize the natural world, whether we’re dividing categories as broad as plants and animals or as narrow as racial groups.

[Take the new version of the aliens test here. You can register with a fake name; email optional]

Take the Aliens test (culture reduced)

Like the Pairs test, Aliens is a very culture reduced test of conceptual reasoning. It is culture reduced because it’s not about America or any other culture on Earth, but rather about 12 “men” on an imaginary planet. Because no one has ever been to this planet, it hopefully measures largely novel problem solving or fluid ability as it is sometimes called.

Like the Pairs test, this test measures the ability to categorize but unlike the Pairs test, one must also identify the number of categories and even subcategories, a skill I’ve never before seen measured on an IQ test, but one that is dear to the author’s heart.

You can take the test here. Feel free to register with a fake name (email optional).

Norms for the Pairs test

Over 60 people have taken the new version of the Pairs test, and of these, nine reported scores on the cubes test. The mean Pairs score of this sub-group was 7.6 (SD = 1.74) and the mean cubes IQ was 129 (SD = 16; U.S. norms). By equating the mean and SD, I was able to create a rough equivalency between Pairs raw score and IQ (see chart below).
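Equating the mean and SD amounts to giving each raw score the IQ that sits at the same z-score. Here is a minimal sketch in Python, assuming the conversion is a plain linear mapping; the function name `equate` is hypothetical, and rounding may land a point away from the chart in places.

```python
def equate(raw, raw_mean, raw_sd, iq_mean, iq_sd):
    """Map a raw score onto the IQ scale by matching z-scores."""
    z = (raw - raw_mean) / raw_sd
    return iq_mean + iq_sd * z

# Statistics reported for the nine people who took both tests.
PAIRS_MEAN, PAIRS_SD = 7.6, 1.74
CUBES_IQ_MEAN, CUBES_IQ_SD = 129, 16

for raw in (5, 7, 9):
    iq = equate(raw, PAIRS_MEAN, PAIRS_SD, CUBES_IQ_MEAN, CUBES_IQ_SD)
    print(raw, round(iq))
```

Because the mapping is linear, every extra raw point is worth the same number of IQ points (here 16 / 1.74, or roughly 9).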

A random sample of all people who got valid scores on the new version of the Pairs test (excluding those who clearly misunderstood the instructions or quit after less than about a minute) showed their scores to be slightly lower and more variable than the subset who reported cubes scores. By assuming they had the same IQ distribution as all who took the defunct version of the Pairs test, I was able to create an equivalency between both versions which allowed me to equate the old version to IQ too.

The old version has a lower floor and much higher ceiling but this came at the cost of having invalid items (hard to hit the ceiling when certain questions don’t have a valid answer). But even after revising the invalid items, the ceiling remains ridiculously high.

While I’d like to think this test can discriminate above the one in 30 million level, realistically the high ceiling is likely an artifact of low reliability. When test items are flawed, they tend to correlate weakly making it freakishly rare for anyone to hit the ceiling.

Another red flag is the test only correlated 0.24 with self-reported scores on the cubes test though given the small sample size (n = 9) it isn’t quite panic time. On the bright side, the old version of the Pairs correlated perfectly with self-reported PATMA scores (n = 4).
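To see how little a correlation from nine people pins down, here is a rough sketch of a 95% confidence interval using the standard Fisher z transformation. It assumes the scores are roughly bivariate normal, and the function name `fisher_ci` is hypothetical.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson r via Fisher's z."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

low, high = fisher_ci(0.24, 9)
print(f"{low:.2f} to {high:.2f}")
```

With n = 9 the interval runs from about -0.5 to about 0.8, so the observed 0.24 is compatible with almost any true correlation.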

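| Pairs raw score | Old version (taken before 6 PM Eastern Time, April 10, 2024); IQ equivalent (U.S. norms) | New version (taken after 6 PM Eastern Time, April 10, 2024); IQ equivalent (U.S. norms) |
|---|---|---|
| 0 | 50 | 59 |
| 1 | 61 | 68 |
| 2 | 73 | 77 |
| 3 | 84 | 87 |
| 4 | 96 | 96 |
| 5 | 107 | 105 |
| 6 | 118 | 114 |
| 7 | 130 | 123 |
| 8 | 141 | 133 |
| 9 | 152 | 142 |
| 10 | 164 | 151 |
| 11 | 175 | 160 |
| 12 | 187 | 169 |
| 13 | 187+ | 178 |
| 14 | 187+ | 188 |
| 15 | 187+ | 188+ |
| 16 | 187+ | 188+ |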

Take the Pairs test. A culture reduced measure of conceptual ability

[UPDATE: 8:09 PM EASTERN STANDARD TIME: After correcting some errors on this test and article discovered by Teffec and Melo, it is now back online. If you did not get a chance to take it, you can do so now. You can register with a fake name if you want (email optional)]

I created the Pairs test because of a void I had noticed in the field of psychometrics. Although there are tests that are very culture reduced (performance subtests on the Wechsler), most of these seemed to load heavily on spatial ability. What if you’re testing someone who doesn’t speak English or may have never learned to read in any language, but who nonetheless has a naturally high verbal IQ?

What is needed, it seemed to me, is a test that gets at verbal, semantic, or symbolic modes of thinking but without using language. I immediately began thinking of some of the more fluid tests of verbal ability and wondered: what if we could recreate these tests with pictures instead of words?

At first I began thinking of the odd man out test (which of these is not like the others: yellow, red, blue, seven, green?) and thought, this could easily be made more culture fair by translating it into picture form. Indeed there are IQ tests like this discussed in one of Jensen’s books, but I quickly became discouraged by the error introduced by guessing. To solve this problem, I created the Pairs test where instead of guessing which of the 5 pictures is not like the others (1 in 5 chance of guessing right), you have to guess which pair of the 5 pictures ARE alike (1 in 10 chance of guessing right).
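The 1 in 10 figure comes from counting the possible pairs among five pictures. A quick check in Python, assuming each item has exactly one correct pair and a guesser picks a pair at random:

```python
from math import comb

n_pictures = 5
p_odd_one_out = 1 / n_pictures       # guess the single odd picture
n_pairs = comb(n_pictures, 2)        # number of distinct pairs: C(5, 2)
p_pair = 1 / n_pairs                 # guess the one matching pair

print(n_pairs, p_odd_one_out, p_pair)  # 10 0.2 0.1
```

Halving the chance of a lucky guess per item means less noise from guessing in the total score.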

It’s unclear whether this test should be considered a measure of Verbal or Performance IQ since the medium is visual but the type of thinking required is more semantic. Either way it nicely complements the Information subtest, which is arguably the most crystallized subtest on the PAIS, while Pairs is the most fluid. A large Pairs > Information gap is probably indicative of cultural or educational deprivation while a large Information > Pairs gap might suggest schizophrenia, autism, dementia or brain damage, though far more research is needed to validate such speculation.

The test is by no means 100% culture fair. You couldn’t give it to a hunter-gatherer. But it’s arguably culture fair enough to give to someone who never attended school, as long as they grew up in a city.

As mentioned, you can take the test here.

Second norming of the cubes test

Of the 118 or so readers who have taken the cubes test, 14 have reported credible Wechsler full-scale IQs in Canada/the U.S./the U.K. within the last decade. These are their scores:

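| cubes raw score | Wechsler IQ | Age when taking the cubes test |
|---|---|---|
| 10 | 144 | Thirties |
| 6 | 120 | Twenties |
| 9 | 133 | Late teens |
| 7 | 133 | Twenties |
| 8 | 137 | Fifties |
| 11 | 149 | Twenties |
| 10 | 160+ | Twenties |
| 17 | 160+ | Twenties |
| 8 | 93 | Below 16 |
| 5 | 143 | Undisclosed |
| 2 | 103 | Thirties |
| 7 | 120 | Twenties |
| 9 | 142 | Twenties |
| 11 | 124 | Twenties |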

The mean cube score of the sample is 8.57 (SD = 3.46) and the mean Wechsler IQ is 133 (SD = 19.5). By equating these statistics, cube scores can be converted to IQ equivalents:

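| cubes raw score | IQ equivalent (U.S. norms) |
|---|---|
| 17 | 181 |
| 16 | 175 |
| 15 | 169 |
| 14 | 163 |
| 13 | 158 |
| 12 | 152 |
| 11 | 147 |
| 10 | 141 |
| 9 | 135 |
| 8 | 130 |
| 7 | 124 |
| 6 | 119 |
| 5 | 113 |
| 4 | 107 |
| 3 | 102 |
| 2 | 96 |
| 1 | 91 |
| 0 | 85 |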
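The sample statistics and the conversion can be recomputed from the 14 reported score pairs. This is a sketch under two assumptions: scores reported as "160+" are treated as exactly 160, and the SDs are sample SDs (n - 1). The helper name `cubes_to_iq` is hypothetical.

```python
from statistics import mean, stdev

# (cubes raw score, Wechsler IQ) for the 14 readers in the sample.
# Assumption: "160+" is entered as 160.
data = [(10, 144), (6, 120), (9, 133), (7, 133), (8, 137), (11, 149), (10, 160),
        (17, 160), (8, 93), (5, 143), (2, 103), (7, 120), (9, 142), (11, 124)]

cubes = [c for c, _ in data]
iqs = [q for _, q in data]

cube_mean, cube_sd = mean(cubes), stdev(cubes)   # sample SD (n - 1)
iq_mean, iq_sd = mean(iqs), stdev(iqs)

def cubes_to_iq(raw):
    """Linear equating: same z-score on both scales."""
    return iq_mean + iq_sd * (raw - cube_mean) / cube_sd

print(round(cube_mean, 2), round(cube_sd, 2), round(iq_mean), round(iq_sd, 1))
print(round(cubes_to_iq(9)), round(cubes_to_iq(0)))
```

Under those assumptions the recomputed means and SDs agree with the figures quoted above, and the converted scores fall within about a point of the chart (small differences come from rounding the inputs).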

When I ran a regression equation, I found that each year older than 25 decreased cubes IQ by 0.5 points holding Wechsler IQ constant; thus I tentatively suggest older people add a bonus of 0.5 IQ points for each year above 25. However, this suggests either massive age-related decline and/or the Flynn effect, and more research with older subjects is needed because this bonus seems a little generous.
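Applied to a score, the tentative bonus looks like this. The function name `age_adjusted_iq` is hypothetical, and leaving scores unchanged at age 25 and below is an assumption, since the suggestion only covers older test takers.

```python
def age_adjusted_iq(cubes_iq, age, threshold=25, points_per_year=0.5):
    """Add the tentative 0.5-point bonus for each year of age above 25."""
    return cubes_iq + points_per_year * max(0, age - threshold)

print(age_adjusted_iq(130, 45))  # 140.0
print(age_adjusted_iq(130, 22))  # 130.0
```

A 45-year-old would gain 10 points, which illustrates why the bonus looks generous.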

The correlation between cubes and self-reported Wechsler is 0.61; however, this needs to be corrected for extended range (since the IQs of the sample are more variable than the general U.S. population).

Grady Towers supplied a formula for correcting for range restriction, which I assume I can use since extended range is simply negative range restriction:

Correcting for extended range reduces the correlation from 0.61 to 0.52. However we should also correct for the fact that the Wechsler IQs were taken as much as a decade ago and thus would only correlate perhaps 0.8 with what their Wechsler IQ would have been on the day they took the cubes test. Dividing 0.52 by 0.8 gives 0.65 which is probably a good estimate for the cube test’s g loading.
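One widely used correction of this kind is Thorndike's Case II formula, sketched below. It may or may not be the exact formula Towers supplied. The sketch assumes a population SD of 15 and the sample SD of 19.5 reported above; the function name `correct_for_range` is hypothetical.

```python
import math

def correct_for_range(r, sample_sd, population_sd):
    """Thorndike Case II: rescale r from the sample's SD to the population's SD."""
    k = population_sd / sample_sd
    return r * k / math.sqrt(1 - r**2 + (r * k) ** 2)

r_observed = 0.61
r_corrected = correct_for_range(r_observed, sample_sd=19.5, population_sd=15)
g_loading = r_corrected / 0.8   # same division by 0.8 as in the text

print(round(r_corrected, 2), round(g_loading, 2))
```

With these inputs the corrected correlation comes out at about 0.51 and the final estimate at about 0.64, close to the 0.52 and 0.65 above. When the sample SD is smaller than the population SD, the same function raises the correlation instead of lowering it.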

Even among Ivy League grads, high IQ increases the odds of getting a very high income job

So what the above chart is showing us is that Ivy League students in general have about a 12% chance of getting a very high paying job after college, however for the ones with high IQ, the odds jump to 16%. Surprisingly being a legacy or having strong non-academic traits (charisma) doesn’t help any. Being an athlete provides a small benefit, probably because it makes you better looking and more energetic, but IQ is a better predictor than all three other factors combined.

This shows that even among Ivy League grads, smart people are getting ahead naturally instead of just artificially getting propped up by the colleges they attend.

Genetic IQ in Europe increased 9 points over 15,000 years?

Emil Kirkegaard has an interesting blog post showing the polygenic education scores (MTAGeduPGS) of different U.S. ethnic groups. Whites 0.47, Native Americans -0.37 and U.S. blacks -1.37. If we set the white polygenic score to equal IQ 100 and the black one to equal IQ 85, then:

Genetic IQ = 8.15(MTAG eduPGS) + 96.17

Applying this formula to Native Americans gives a genetic IQ of 93.

But since both U.S. blacks and Native Americans are about 25% white, correcting for this suggests “pure” blacks and Native Americans would be 80 and 91 respectively.
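The formula and the admixture correction can be reproduced in a few lines. This is a sketch assuming a straight line through the two anchor points and simple linear mixing of ancestries; the helper names are hypothetical.

```python
def linear_map(x1, y1, x2, y2):
    """Return slope and intercept of the line through (x1, y1) and (x2, y2)."""
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1

# Anchor points from the text: white PGS 0.47 -> IQ 100, black PGS -1.37 -> IQ 85.
slope, intercept = linear_map(0.47, 100, -1.37, 85)

def genetic_iq(pgs):
    return slope * pgs + intercept

def remove_admixture(observed, white_fraction=0.25, white_iq=100):
    """Solve observed = (1 - f) * pure + f * white_iq for pure."""
    return (observed - white_fraction * white_iq) / (1 - white_fraction)

native = genetic_iq(-0.37)
print(round(slope, 2), round(intercept, 2), round(native))          # 8.15 96.17 93
print(round(remove_admixture(85)), round(remove_admixture(native)))  # 80 91
```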

From this we might infer that before modern humans left Africa 70,000 years ago, our average genetic IQ was no higher than 80, but by the time we reached the arctic 40,000 years ago, some races were averaging 91.

But it was only after the neolithic transition that triple digit IQ races began to appear. Indeed a recent study by Davide Piffer and Emil Kirkegaard looked at 2,625 European genomes and found that polygenic scores for education (a proxy for genetic IQ) gradually increased over the last 15,000 years.

Assuming Upper Paleolithic Europeans had IQs like Native Americans (both cold climate hunter-gatherers), this suggests genetic IQ in Europe has been increasing by 0.0006 points per year since near the end of the ice age, culminating in the Industrial Revolution a few hundred years ago.
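The rate is simple arithmetic, assuming a rise from 91 (the Native American estimate) to 100 (the white reference value) spread evenly over 15,000 years:

```latex
\frac{100 - 91}{15{,}000\ \text{years}} = \frac{9}{15{,}000\ \text{years}} = 0.0006\ \text{IQ points per year}
```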

Piffer and Kirkegaard believe the increase was caused by higher IQ farmers from the Middle East replacing the indigenous hunter-gatherers of Europe but Peter Frost argues they were not necessarily replaced by farmers, but may have evolved into them, writing:

There is thus an inevitable confound between hunter-gatherer ancestry and natural selection due to hunting and gathering. If we look at alleles that seem to indicate native hunter-gatherer ancestry, we are excluding alleles from hunter-gatherers who successfully adapted to farming and who thus acquired a genetic profile that converges, to some extent, on that of Anatolian farmers.

What is Tyler Perry’s IQ?

Commenter Vegan DHA asks, “Random, but what might be Tyler Perry’s IQ? I guess something between 125 and 135.”

I’ve always liked Tyler Perry. I’ve never seen any of his movies but his Madea character looks hilarious so I think I would enjoy them. Perry’s childhood was a living hell; brutally beaten by his father and raped by men AND women (“so disgusting” he would later tell Oprah).

After a childhood of abuse, he entered high school but was kicked out (though he would later return for his GED) and was on track to become America’s worst nightmare: the big bad angry black man. Fortunately he turned on The Oprah Winfrey Show and was just blown away to see someone who looked like him but had made it to the absolute top of America! He heard her say “if you write things down, it’s cathartic”. At the time he didn’t know what cathartic meant but he used his IQ to infer it was good and he began writing and never stopped. Still, making money as a playwright was no mean feat, and four years after being homeless (except for his car) he got to meet the woman who changed his life.

So what is his IQ?

We begin with the fact that as of 2023, Forbes estimated Perry had a net worth of $1 billion, making him the 4th or 5th richest black man of his generation. Given there are about 4.65 million black men in Generation X, that puts Perry around the one in a million level for this demographic. If there were a perfect correlation between IQ and money among black men, you’d expect Perry’s IQ to be 71 points above the U.S. black mean, but since the correlation is actually only 0.42 (somewhat lower than for white men), we’d expect him to be 71(0.42) = 30 points above the black mean of 89 (U.S. norms; 85 on white norms), giving Perry an expected IQ of 30 + 89 = 119 (U.S. norms).

However Perry is not just rich, he’s also tall. At a height of 196 cm, Perry is at least 3.06 standard deviations taller than the average black man of his generation. If there were a perfect correlation between IQ and height, we’d expect him to be 46 IQ points above the U.S. black mean, but since the correlation is only 0.25, his expected IQ is 46(0.25) + 89 = 101 (U.S. norms).
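Both single-predictor estimates follow the same regression-to-the-mean recipe: convert the rarity to a z-score, shrink it by the correlation, and add it to the group mean. This is a sketch assuming a normal distribution and an IQ SD of 15 within the group; the function name `expected_iq` is hypothetical.

```python
from statistics import NormalDist

BLACK_MEAN_IQ = 89   # U.S. norms, as stated in the text
SD_IQ = 15           # assumed SD on the IQ scale

def expected_iq(z_predictor, r):
    """Regression toward the mean: expected IQ given a predictor z-score."""
    return BLACK_MEAN_IQ + r * z_predictor * SD_IQ

# One-in-a-million rarity converted to a z-score under a normal curve.
z_wealth = NormalDist().inv_cdf(1 - 1e-6)
z_height = 3.06      # given in the text

print(round(z_wealth, 2))
print(round(expected_iq(z_wealth, 0.42), 1))
print(round(expected_iq(z_height, 0.25), 1))
```

The one-in-a-million z-score comes out near 4.75 (about 71 IQ points), and the two estimates land at roughly 119 and just over 100, matching the figures above to within rounding.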

What happens when we put these two predictors together? Using math that a Promethean once explained to me, since the correlation between money and height is only 0.13, both variables maintain most of their predictive power:

Perry’s expected IQ = 0.37(Perry’s lifetime income) + 0.22(Perry’s height)

Perry’s expected IQ = 0.37(+4.73 SD above the U.S. black male mean) + 0.22(+3.06 SD above the U.S. black male mean)

Perry’s expected IQ = 1.75 SD above the U.S. black male mean + 0.67 SD above the U.S. black male mean

Perry’s expected IQ = 2.42 SD above the U.S. black male mean

Perry’s expected IQ = 2.42(15) + 89 = 125 (U.S. norms)
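For comparison, here is the textbook formula for standardized regression weights with two correlated predictors, applied to the three correlations quoted above (0.42, 0.25 and 0.13). It may not be exactly the math the Promethean described, and the function name `standardized_weights` is hypothetical.

```python
def standardized_weights(r1, r2, r12):
    """Beta weights for two correlated predictors of one outcome."""
    denom = 1 - r12 ** 2
    return (r1 - r2 * r12) / denom, (r2 - r1 * r12) / denom

# Correlations quoted in the text: IQ-income, IQ-height, income-height.
b_income, b_height = standardized_weights(0.42, 0.25, 0.13)

z_iq = b_income * 4.73 + b_height * 3.06   # predictor z-scores from the text
iq = 89 + 15 * z_iq

print(round(b_income, 2), round(b_height, 2), round(z_iq, 2), round(iq))
```

With those inputs the weights come out near 0.39 and 0.20 instead of 0.37 and 0.22, so the weights above may rest on slightly different correlations or a different method. Either way the estimate lands within about a point of 125.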

So 125 would be my first guess for Perry’s IQ, but given that income is only moderately correlated with IQ and height is weakly correlated with IQ, the standard error around this estimate is quite large so think of it as just an educated guess.

Correct me if I’m wrong, but this is kind of a very simplified version of how ChatGPT and machine learning in general work: instead of using statistical predictors to guess IQs, ChatGPT uses them to predict what humans would say when asked different questions.

Preliminary norms for the cubes test

It seems people are scared to take the cubes test. You should be. Preliminary norms suggest the test is VERY VERY VERY hard. Of the readers who have taken the test so far, nine of you reported scores on either the SAT or the Wechsler. SAT scores (reading + math) were converted to IQ (U.S. norms) depending on the time period they were taken. If someone reported scores on both the SAT and the Wechsler, I took the average of both.

I found a 0.48 correlation between raw score on cubes (out of 17) and reported Wechsler/SAT score. After correcting for range restriction, the correlation increased to 0.58. By lining up the cube raw scores from highest to lowest beside the Wechsler/SAT scores from highest to lowest, I found the following rough equivalencies. Do not extrapolate beyond these norms as the relationship is anything but linear (as commenter Fraz predicted).

| cubes score (out of 17) | IQ equivalent (U.S. norms) |
|---|---|
| 11 | 144 |
| 8 | 136 |
| 7 | 130 |
| 6 | 128 |
| 4 | 127 |
| 2 | 103 |

Take the cubes test

[You can take the cubes test here. Email optional. You can register with a fake name]

As I analyze the data from the general knowledge test, I thought I would post the cubes test. Tests of cube analysis date back to World War I and perhaps before. The subtest was supposed to be part of the Wechsler intelligence scales but they decided to drop it. Wechsler (1958) wrote:

The Cube Analysis test was discarded after being given to over 1000 subjects because it showed large sex differences, proved difficult to get across to subjects of inferior intelligence and because it tapered off abruptly at the upper levels…Apparently others have had less discouraging results with the test; it was included in the Army GCT (World War II). We still think that the test has serious shortcomings.

I’m a bit surprised that Wechsler found the test hard to administer to low grade people as part of what attracted me to the test was its utter simplicity, however perhaps duller subjects don’t grasp what a cube is or have problems perceiving drawings of them.

Despite such warnings, I decided to include the cube test in my battery. For starters, excluding a test because of sex differences seems unscientific because it assumes a priori that the sexes have equal intelligence and there’s no reason to think that. Wechsler didn’t apply that procedure when it comes to race, so why apply it when it comes to sex? Perhaps because the Binet test found little or no sex differences and that was seen as the gold standard before the Wechsler was developed, and perhaps because if he did apply that standard to race he’d have virtually no subtests left.

My second reason for including this in the PAIS is I wanted at least one test involving blocks since these hold nostalgic value from being tested as a child.

Finally, I needed more spatially loaded content.

As already mentioned, you can take the test here.