
A reader wrote:


My first question: although IQ tests purport to be designed empirically, it feels like the weighting of speed vs. accuracy is completely arbitrary. What’s up with that?

The way I see it, IQ tests are just a sample of all your cognitive abilities. But because no one knows the nature and number of every cognitive ability in the human mind, all psychologists can do is select an arbitrary sample that is as large and diverse as possible. Luckily, all cognitive abilities appear to be positively correlated, and by prioritizing abilities that correlate well with every other known cognitive ability, the total score presumably predicts unidentified cognitive abilities as well.
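To make that logic concrete, here’s a toy simulation (the structure and all the numbers are invented for illustration): model each ability as a shared general factor plus independent noise, and a composite of the sampled abilities ends up predicting an ability the battery never sampled at all.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

g = rng.standard_normal(n)                                  # shared source of variance
subtests = [g + rng.standard_normal(n) for _ in range(8)]   # 8 sampled abilities, each g + noise
unmeasured = g + rng.standard_normal(n)                     # an ability the battery never sampled

composite = np.mean(subtests, axis=0)                       # the "total score"
print(np.corrcoef(composite, unmeasured)[0, 1])             # ~0.67: predicts the unsampled ability
```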

The reader continues:

I don’t think any IQ tests score you on speed, but most of them have a time limit that’s not long enough for the average person to complete it, giving people who can finish it faster an advantage. While there’s certainly a correlation between one’s speed of reasoning and quality of reasoning, they seem to me like ultimately separate qualities, yet IQ tests tend to lump them into one. For example, who is smarter: a person who finishes the test in time with 60% accuracy and is confident he got everything right, or a person who finishes half the test before the time runs out, so gets a 50%, but could have gotten 100% if given twice the time?

Some IQ tests do provide subscores for the so-called speed factor (for example, the Processing Speed index on the WAIS-IV), but most timed IQ measures use speed as a convenient way of increasing the test’s difficulty, not because they’re trying to measure speed per se.

For example on a Wechsler spatial subtest, 15% of 16-year-olds are capable of solving every item within the time limit (which is a few minutes for the hardest items), but by giving bonus points to people who can solve the easy items within 10 seconds and the hard items within 30 seconds, only 0.1% can get a perfect score. So the use of time bonuses increases the test’s ceiling by two whole standard deviations without going to the trouble of creating more difficult items that would make the test too long.
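If you’re wondering where the two standard deviations come from, it’s just the normal curve: convert the two pass rates into z-scores and take the difference. A quick check (Python with scipy assumed):

```python
from scipy.stats import norm

z_no_bonus   = norm.ppf(1 - 0.15)   # ~1.04 SD: cutoff when 15% can get a perfect score
z_with_bonus = norm.ppf(1 - 0.001)  # ~3.09 SD: cutoff when only 0.1% can

print(z_with_bonus - z_no_bonus)    # ~2.05, i.e. roughly two standard deviations of added ceiling
```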

When time bonuses are not given, a lot more people get perfect scores, but the rank order of examinees remains virtually identical, especially in the oldest age band (85 to 90.9), where the correlation is 0.99 (WAIS-IV Technical and Interpretive Manual, Appendix A). The correlation is slightly lower at younger ages (but still 0.93+) because of all the ceiling bumping when no time bonus is given. Such absurdly high correlations prove that when used judiciously, time bonuses merely add ceiling without changing the nature of what is being tested.
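The ceiling-bumping effect itself is easy to reproduce in a toy simulation (all parameters invented): cap one of two otherwise near-identical scores and the correlation between them drops, even though the underlying abilities haven’t changed at all.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
latent = rng.standard_normal(n)                  # true ability in SD units

timed = latent + 0.1 * rng.standard_normal(n)    # bonus version: plenty of ceiling
raw = latent + 0.1 * rng.standard_normal(n)      # same test scored without bonuses...
untimed = np.minimum(raw, 1.0)                   # ...so everyone above +1 SD hits the ceiling

print(np.corrcoef(timed, raw)[0, 1])             # ~0.99 before imposing the ceiling
print(np.corrcoef(timed, untimed)[0, 1])         # lower: the pile-up at the cap costs correlation
```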

On group-administered tests, the time limits not only don’t typically affect the rank order of scores much, they don’t even increase the ceiling much. Arthur Jensen reported that when the Otis verbal’s time limit was increased by 50% (45 minutes instead of 30), the average score increased by only 1.5%. When the Otis non-verbal’s time limit was increased by 50% (30 minutes instead of 20), the average score increased by only 1.7%, and when the Henmon-Nelson’s time limit was increased by 67% (50 minutes instead of 30), scores increased by 6.3% (Bias in Mental Testing, 1981).

The notion of quick superficial thinkers vs. slow profound thinkers is probably fallacious. People who do well on the Wechsler Processing Speed index actually have slower brains than people who do well on the untimed Raven Advanced Progressive Matrices. Once you control for general cognitive ability (the g factor), psychometric speed has no correlation with reaction time (The g Factor by Arthur Jensen).

The reader continues:


My second question, kind of related to the first: what actually is the g-factor? The idea is that g is a construct that links the performance of all cognitive tasks, but how can you actually calculate such a number? It makes sense to me to, say, measure the g-loading of a sub-test with respect to a full IQ test, but how can you measure the g-loading of an entire IQ test? Is it just the test’s correlation with all IQ tests? Wouldn’t that just measure how close the test is to the average IQ test? Also, the idea of a g-factor would seem to require a definition of what’s “general”, and that doesn’t seem like something that can be done empirically. Like if we lived in a society entirely based on music, would the g-loading of piano skill tests be higher than that of math tests? And do tests of speed or tests of quality have higher g-loadings? Then again, I haven’t read up on much of the literature, so I could have some major misunderstandings.

In theory, g is the source of variation that all cognitive abilities have in common, so the larger and more diverse the battery of subtests from which g is factor-analytically extracted, the more likely it is that g reflects something real (as opposed to an artifact of test construction). If we lived in a society based on music, everyone might reach their biological potential for piano playing, while math might be esoteric trivia, so the former may indeed become more g-loaded than the latter in that context. However, novel tasks that are practiced in neither our society nor the musical one should have similar g loadings in both.
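As for how g is actually calculated: in practice it is extracted from the correlation matrix of the subtests. Here’s a minimal sketch using an invented four-subtest correlation matrix and the first principal component as a quick stand-in for g (real analyses use proper factor-extraction routines rather than this shortcut):

```python
import numpy as np

# Invented correlation matrix for four subtests: the all-positive
# correlations (the "positive manifold") are what make g extractable
R = np.array([
    [1.00, 0.60, 0.55, 0.45],
    [0.60, 1.00, 0.50, 0.40],
    [0.55, 0.50, 1.00, 0.35],
    [0.45, 0.40, 0.35, 1.00],
])

eigvals, eigvecs = np.linalg.eigh(R)     # eigenvalues in ascending order
v = np.abs(eigvecs[:, -1])               # eigenvector of the largest eigenvalue
g_loadings = np.sqrt(eigvals[-1]) * v    # each subtest's correlation with the extracted factor

print(g_loadings)                        # higher loading = more g-saturated subtest
```

The g-loading of an entire test can be estimated the same way: include its total score in a large, diverse battery and see how strongly it correlates with the factor extracted from the whole set.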