IQ, board games & test construction

Saturday, 31 Aug 2019

Posted by pumpkinperson in Uncategorized

Tags

board games, construct validity, David Wechsler, g factor, IQ, predictive validity, test construction

IQ tests are considered valid because they have both construct validity and predictive validity. They have construct validity because a statistical technique called factor analysis shows that all mental tests are influenced by a general factor dubbed g, and that IQ tests load highly on g. They have predictive validity because IQ tests better predict important life outcomes like education, income and occupational status than any other single measurable trait.

However, IQ skeptics like our very own Race Realist (RR) argue that this validity is just an artifact of test construction. In other words, all IQ tests correlate with g, not because they measure a factor common to all mental abilities, but rather because only tests that are positively correlated with one another are included in test batteries.

Similarly, he’d argue that IQ tests predict success in school and life, not because high IQ people learn faster, make wiser life choices and are more productive, but rather because only test items that good students did well on were included in the tests.

There’s some truth to RR’s view. Tests of general knowledge came to be used in IQ tests only after WWI testing found them to correlate highly with the total score of other subtests. In addition, David Wechsler would present potential test items to people of “known intelligence,” and primarily those items that discriminated well between people with known Binet IQs of different levels were included. A question about a non-Christian religion was included in the general knowledge subtest after Wechsler found that it distinguished Americans with superior IQs from those with average IQs.

Given that IQ tests owe at least a small part of their validity to test construction, it’s interesting to ask whether IQ tests would still have the same construct and predictive validity if we remove the selective bias in picking subtests and test items.

What is needed is a random sample of mental abilities, not one pre-selected by psychologists. The closest thing we have to such a sample is board games. Thus, instead of the 10 subtests Wechsler arbitrarily chose for his original scale, we could simply pick the 10 “most popular” board games of all time:

chess
checkers
backgammon
Scrabble
Monopoly
Clue
Othello
Trivial Pursuit
Pictionary
Risk

An ideal study would be to take 2,000 randomly selected teenagers with very little experience playing any of these games, and send them to a summer camp where all they did was play these 10 games every day (though no special strategies would be taught), culminating in a tournament where all 2,000 were ranked on each game, and then given an overall ranking that reflected their combined performance across all ten games. This combined ranking would be converted into full-scale IQ, such that the best overall player (out of 2,000) would be assigned a full-scale IQ of 150 and the worst would be assigned a full-scale IQ of 50.
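To see where figures like 150 and 50 come from, here is a minimal sketch of one way to turn an overall rank into an IQ. It assumes things the post does not specify: that scores are forced onto a normal curve with a mean of 100 and a standard deviation of 15, and that a rank is converted to a percentile as (n - rank + 1) / (n + 1). The function name `rank_to_iq` is just an illustrative label.

```python
from statistics import NormalDist

def rank_to_iq(rank, n, mean=100.0, sd=15.0):
    """Map an overall rank (1 = best of n) to a normalized IQ.

    Assumptions (not from the post): normal curve, mean 100, SD 15,
    and percentile = (n - rank + 1) / (n + 1) so it is never 0 or 1.
    """
    percentile = (n - rank + 1) / (n + 1)
    z = NormalDist().inv_cdf(percentile)  # standard normal z-score
    return mean + sd * z

n = 2000
for rank in (1, 500, 1000, 1500, 2000):
    print(rank, round(rank_to_iq(rank, n), 1))
```

Under these assumptions the best of 2,000 players lands at about 149 and the worst at about 51, which is roughly the 150 and 50 mentioned above. A different percentile convention or standard deviation would shift the extremes by a few points.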

If there really is a g factor, we should expect that while some people are great at chess but terrible at Pictionary, in general, people who are good or bad at one game would tend to be good or bad at the others.
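What that prediction would look like in the data can be sketched with a toy simulation. Everything here is hypothetical: the scores are randomly generated, not real results, and the one-factor model (each game score = loading × g + independent noise) and the loading of 0.6 are assumptions picked only for illustration. The point is simply that a shared factor makes every pair of games correlate positively, while no shared factor leaves the correlations hovering around zero.

```python
import random
from itertools import combinations
from math import sqrt

GAMES = ["chess", "checkers", "backgammon", "Scrabble", "Monopoly",
         "Clue", "Othello", "Trivial Pursuit", "Pictionary", "Risk"]

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def simulate_scores(n_players, loading, rng):
    """Simulated (not real) scores: loading * g + independent noise per game."""
    noise_sd = sqrt(1 - loading ** 2)
    scores = {game: [] for game in GAMES}
    for _ in range(n_players):
        g = rng.gauss(0, 1)  # hypothetical general ability of this player
        for game in GAMES:
            scores[game].append(loading * g + noise_sd * rng.gauss(0, 1))
    return scores

def pairwise_correlations(scores):
    """Correlation for every pair of games (45 pairs for 10 games)."""
    return {(a, b): pearson(scores[a], scores[b])
            for a, b in combinations(scores, 2)}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
with_g = pairwise_correlations(simulate_scores(2000, 0.6, rng))
without_g = pairwise_correlations(simulate_scores(2000, 0.0, rng))

mean_with = sum(with_g.values()) / len(with_g)
mean_without = sum(without_g.values()) / len(without_g)
print(f"mean correlation with a general factor:    {mean_with:.2f}")
print(f"mean correlation without a general factor: {mean_without:.2f}")
print("all pairs positive with g:", all(r > 0 for r in with_g.values()))
```

In a real tournament the question would be which of these two patterns the 45 game-to-game correlations resemble. A full factor analysis would go further and estimate each game's loading on g, which this sketch does not attempt.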

Further, if g has predictive validity, we should find that 30 years later, those with the highest full-scale IQs would have more education, more prestigious occupations, and higher incomes (even after controlling for family background) than those who performed poorly.
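One simple way to "control for family background" is a partial correlation, which asks how strongly two variables are related once a third has been statistically removed from both. The sketch below uses the standard first-order partial correlation formula. The three input correlations are made-up numbers chosen only to show the arithmetic, not findings from any study, and `partial_correlation` is just an illustrative name.

```python
from math import sqrt

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z from both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical correlations, chosen only to illustrate the arithmetic.
r_iq_income = 0.40      # board-game IQ vs. adult income
r_iq_family = 0.30      # board-game IQ vs. family background
r_income_family = 0.30  # adult income vs. family background

r_adjusted = partial_correlation(r_iq_income, r_iq_family, r_income_family)
print(f"IQ-income correlation after controlling for family: {r_adjusted:.2f}")
```

With these invented inputs the correlation drops from 0.40 to about 0.34. If g has predictive validity in the sense described above, the adjusted figure should stay clearly above zero. If family background explained everything, it would shrink toward zero.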