Arthur Jensen noted how measuring a trait indirectly can often lead to misleading conclusions.  He compared it to measuring a person’s height by measuring the height of their shadow.  The correlation between actual height and shadow height could be extremely strong under controlled conditions, but when the position of the sun moves, the measurements become meaningless.

I think giving someone an official IQ test like the Wechsler is somewhat analogous to measuring their height directly on a stadiometer, while giving someone the SAT is like measuring height from their shadow.  On the SAT you're not directly observing how fast someone can learn, as you do on many Wechsler subtests; you're indirectly inferring it from how much they know.

Of course, shadow measurements can be extremely accurate.  If everyone is measured at the same time of day, shadow height will correlate near perfectly with actual height.  Likewise, when everyone takes the SAT with a similar academic background, the SAT correlates near perfectly with general intelligence (the g factor), as found in a sample from the University of Texas at San Antonio.

However, in America there's a strong class divide, so you have the upper class, who study AP algebra, geometry, calculus and Shakespeare, and then you have the lower class, who attend working class schools and are dissuaded from going to college at all.  The lower class tend not to even take the SAT, but when they do, they tend to score below their genetic potential.  For example, Bill Cosby had an IQ equivalent of around 80 on the SAT despite scoring as very intelligent on an official IQ test and being known for his comic wit.  Other quick comic minds from working class backgrounds who underperformed on the SAT include Rosie O’Donnell and Howard Stern.

A good analogy would be the upper class having their shadow height measured in the morning, when shadows are quite long, and the lower class having their shadow height measured in the afternoon, when shadows are quite short.  Now within each class, shadow height may correlate near perfectly with stadiometer height, just as within each class, the SAT may correlate near perfectly with official IQ.  But when the ENTIRE population is aggregated, the correlation between shadow height and stadiometer height plummets because of the class inequality, just as the correlation between SAT scores and official IQ scores plummets.
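
The shadow analogy can be sketched as a small simulation.  Everything in it is an illustrative assumption rather than real data: the height distribution, the two shadow factors (3.0 for the "morning" group, 1.0 for the "afternoon" group), the noise level, and the helper names `pearson` and `simulate_group`.  It only shows the mechanism, namely that two groups can each have a near perfect within-group correlation while the pooled correlation is far lower.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_group(rng, n, shadow_factor):
    """Hypothetical group: true heights plus shadows measured at one time of day.

    shadow_factor is an assumed multiplier standing in for the sun's position.
    """
    heights = [rng.gauss(170.0, 10.0) for _ in range(n)]
    # Small measurement noise, scaled by the same factor as the shadow itself.
    shadows = [shadow_factor * (h + rng.gauss(0.0, 2.0)) for h in heights]
    return heights, shadows


rng = random.Random(0)  # fixed seed so the run is repeatable
morning_h, morning_s = simulate_group(rng, 2000, 3.0)      # long shadows
afternoon_h, afternoon_s = simulate_group(rng, 2000, 1.0)  # short shadows

r_morning = pearson(morning_h, morning_s)
r_afternoon = pearson(afternoon_h, afternoon_s)
r_pooled = pearson(morning_h + afternoon_h, morning_s + afternoon_s)

print(f"within morning group:   r = {r_morning:.2f}")
print(f"within afternoon group: r = {r_afternoon:.2f}")
print(f"pooled population:      r = {r_pooled:.2f}")
```

How far the pooled correlation falls depends entirely on the assumed gap between the two factors, so the exact figure printed here says nothing about the real SAT.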

This explains why people who are 46 IQ points above the U.S. mean on the new SAT regress to only 21 IQ points above the U.S. mean on the Raven IQ test, suggesting the new SAT correlates 21/46 = 0.46 with the Raven in the general U.S. population.  Arthur Jensen noted that the correlation between two tests is the product of their loadings on the factors they share, so assuming the only factor the SAT and Raven share is g, dividing their 0.46 correlation by the 0.68 g loading of the Raven tells us the SAT also has a g loading of 0.46/0.68 = 0.68, or roughly 0.7 if you like round numbers.
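
The arithmetic above can be checked with a short sketch.  The function names `implied_correlation` and `implied_g_loading` are made up for illustration, and the calculation rests on the same assumption as the paragraph: that g is the only factor the two tests share, so their correlation equals the product of their g loadings.

```python
def implied_correlation(predictor_points_above_mean, outcome_points_above_mean):
    """Correlation implied by regression to the mean.

    If a group selected at +X on one test averages +Y on another
    (both on the same IQ scale), the implied correlation is Y / X.
    """
    return outcome_points_above_mean / predictor_points_above_mean


def implied_g_loading(correlation, other_g_loading):
    """Solve r = loading_a * loading_b for loading_a.

    Only valid under the assumption that g is the sole shared factor.
    """
    return correlation / other_g_loading


RAVEN_G_LOADING = 0.68  # figure quoted in the text

# 46 points above the mean on the SAT, 21 points above on the Raven.
# Rounded to two decimals, as in the text.
sat_raven_r = round(implied_correlation(46, 21), 2)
sat_g_loading = implied_g_loading(sat_raven_r, RAVEN_G_LOADING)

print(f"SAT-Raven correlation: {sat_raven_r:.2f}")
print(f"Implied SAT g loading: {sat_g_loading:.2f}")
```

Without the intermediate rounding, 21/46 divided by 0.68 comes out at about 0.67 rather than 0.68, which still rounds to 0.7.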

A g loading of 0.7 is not low, and tells us the SAT is a reasonable proxy for g in the general U.S. population, but it’s nowhere near the 0.9 g loading the SAT enjoys in more socioeconomically homogeneous subsets of America such as students at the University of Texas at San Antonio.  This is because the general U.S. population is analogous to people having their shadow heights measured at different times of day, while the students at a given local university are analogous to people all having their shadow height measured at the same time of day, thus maximizing the correlation between shadow height and real height.