Commenter “Callitwhatyoulike” wrote:
The primary problem with using SAT as an estimator of IQ is that there are a multitude of reasons why someone may have scored poorly on SAT, yet have an outlying IQ. SAT (arguably) requires years of *learning* (knowledge), *in conjunction* with outlying intelligence, to score highly on. Obviously, the advantage of IQ testing is that knowledge (minus basic literacy) isn’t a factor. I scored under 1200 on SAT (post-’95), yet my IQ is well above the 99th percentile (indicative of a “should have been” SAT score of at least 1400). Why is this a problem? Employers, apparently attempting to substitute banned IQ testing with an “equal” measure of capability, frequently ask for SAT scores. Without giving them my life story, I can’t very well detail why my SAT score was at least 200 points beneath “should have been,” per historical SAT/IQ correlation. My SAT score implies that I’m barely above average, though the IQ scores I provide them with (very specifically) go entirely ignored (or worse, generally speaking, are seen as simple “boasting,” etc., and are used against me).
(If employers knew how to hire people [rest assured that they do not], they’d find ways to determine candidates’ IQs, immediately cease “interviewing” people [rolling eyes], immediately cease looking at photographs of people [rolling eyes], and recognize that, as was proven long ago, virtually nothing but raw intelligence [IQ] is valid as a predictor of potential job performance.)
An important point to remember is that both SAT scores and official IQ scores are approximations for a mysterious trait called “general intelligence” (g), or if you don’t believe in g, “overall intelligence.”
The correlation between the SAT and g is about 0.8, which is extremely high considering it’s not marketed as an IQ test, but it’s not as high as the very best official IQ tests (the Wechsler intelligence scales) which have a g loading of about 0.9.
Neither of these tests measures g directly the way a tape measure measures height. Rather, ranking everyone’s g with their score on the Wechsler is like ranking everyone’s height by their shoe size. The correlation between height and shoe size is about 0.9, much like the correlation between g and Wechsler IQ is about 0.9.
Meanwhile ranking everyone’s g by ranking their SAT scores is like ranking everyone’s height by ranking their basketball performance. The correlation between height and basketball performance among young American men is about 0.8, just like the correlation between g and SAT scores among young adults is about 0.8. If you attended a high school where basketball was really important or if you practiced basketball a lot in your free time, basketball performance might overestimate your height. Conversely, if you attended a school that had no basketball team, basketball performance might underestimate your height.
Analogously, if you attended a school where doing well on the SATs was really important, or if you spent much of your free time practicing math problems and memorizing vocabulary lists, the SAT might overestimate your g. Conversely, if you attended a school where no one was expected to go to college, the SAT might underestimate your g.
So the SAT is somewhat less accurate at measuring g than the very best official IQ tests, but it’s still as good as, if not better than, most official IQ tests. But even the very best official IQ tests are indirect measures of g, in the sense that they don’t directly measure the biological properties of the brain that collectively cause people to differ on mental tests (assuming you believe that).
If I knew someone had a large gap between their SAT scores and their Wechsler IQ scores, I would give more weight to their Wechsler scores, but I would not automatically assume their SAT scores were completely wrong.
For example if I knew someone sucked at basketball despite having a huge shoe size, my guess would be that they were tall, but lacked skill (analogous to the high official IQ person who flunks the SAT).
If I knew someone had a tiny shoe size but was in the NBA (analogous to someone with a low official IQ score who aced the SAT), I would not, however, just assume they were a short person with incredible skill. I might instead assume shoe size underestimated their height for whatever reason (analogous to an official IQ test being wrong).
As for employers asking about SAT scores, I’ve said before that it would be interesting to do a study correlating intelligence (as measured by both the SAT and official IQ tests) and success (as measured by both income and status). If what you’re saying is a major issue for many people, the correlation between intelligence and success should be higher when intelligence is measured by the SAT than by official IQ tests, since the former are known by gatekeeping colleges and some employers. However, if the correlation between intelligence and success were as high when intelligence was measured by official IQ tests, it would be strong evidence that smart people tend to get to the top naturally, and not as a self-fulfilling prophecy of high SAT scores.
Analogously, if you attended a school where doing well on the SATs was really important, or if you spent much of your free time practicing math problems and memorizing vocabulary lists, the SAT might overestimate your g. Conversely, if you attended a school where no one was expected to go to college, the SAT might underestimate your g.
this is a common misunderstanding.
it is true that studying for a particular test, or particular subtests of a battery, has been shown many times to increase scores, but…
studying for such and such and not studying for such and such is not really an either/or.
that is, what one would find if he cared to look is that those who score high on whatever without preparation have been preparing all the same…
by pursuing their interests…doing what they like…doing what they don’t like but which is demanded of them…or doing what they don’t like but demand of themselves.
for example,
if a testee likes reading shakespeare and other “difficult” literature, he will do much better on vocab, just as if he had actually memorized lists of words.
the idea that the test-naive environment is the same for everyone is naive. some people will already have without effort or will make the effort to have…environments which are likely to increase their scores on one subtest or another, or on every one, the whole battery.
and of course peepee makes her usual mistake of assuming that “your IQ” is something which some test can suss out, rather than a measure of your affinity for and inclusion in the dominant intellectual culture.
she identifies IQ, a test score, with innate ability, whatever that is..
smart and intelligence are adjective and noun respectively.
but what separates the smart and the dumb is having “smarted” or having “dumbed” over years, decades.
smart and dumb are things one becomes, not things one just is.
there really are such things as:
1. curiosity
2. intellectual laziness
3. the “rage to master”
think about your own experience.
how many times a day or week does it happen that you are presented with some intellectual challenge and you think, i’ll push through or i’m tired…fuck it!
smart and dumb is an accumulation over years of these decisions.
the smart have repeatedly pushed through.
runners know. at first it feels like death. after a while running miles is like walking miles used to be.
IQ seems valid at a very large scale because errors tend to cancel each other out
IQ is valid.
i’m an anti-hereditist, not an anti-test person.
all of the smartest people i’ve known had high test scores.
they were a sine qua non.
the question as to the thing-ness of IQ is…
is it a social thing…
or is it a biological thing…
and if it is a social thing, how similar is this thing across cultures past, present, actual, and possible?
but, at the same time, social things aren’t the sort of things which those who claim to be scientists ever identify as their things.
the question as to the thing-ness of IQ is…
is it a social thing…
or is it a biological thing
Jensen felt IQ had both a cultural and biological component, but that the preponderance of evidence suggested g was a wholly physiological variable not amenable to psychological intervention.
So adoption or schooling raised IQ in Jensen’s eyes, but these gains were hollow with respect to g (and hollow with respect to biology) and so people with culturally induced IQ gains (fake IQ gains?) would not be as successful in school and life as their IQs predict.
BUT THE BIGGER OF THE TWO BY FAR, ACCORDING TO THE TWIN STUDIES, IS…
…or will make the effort to have…environments…
that is…and the jews should be praised for their seeing this much more clearly than goyim…
being smart is largely the result of moral decisions.
forcing oneself to understand. forcing oneself to not give up whenever understanding is the prize…
makes a smarter person.
and a split infinitive for the school-marms.
Wrong, actually, because recent evidence suggests that non-shared environment is probably not “environmental” at all, but de novo mutation in neurons.
Scott Alexander (http://slatestarcodex.com/2015/10/19/links-1015-take-back-your-link/) writes:
“Something I should have realized a long time ago: cells sometimes pick up a few extra mutations when they divide, but it doesn’t matter because throughout the zillions of cells in the body they all even out. Unless we’re talking about the first division of the fertilized zygote, or the first few divisions in the neural crest which is about to become the embryonic brain, or anything like that. Now scientists find these crucial developmental mutations lead to large populations of genetically different neurons in the adult brain. This ought to increase (by how much? I don’t know) our estimate of how much interpersonal variation is genetic. Even identical twins will have different post-fertilization mutations, so the old maxim that all differences between identical twins are non-genetic doesn’t really hold; since identical twins are the yardstick by which we judge everyone else, that means we have to revise those estimates as well. In other words, these sorts of mutations could make up part of what we previously called ‘non-shared environment’.”
The link to the paper: http://hms.harvard.edu/news/natural-history-neurons
again carl demonstrates the very low social iq of hbders.
what is called “intelligence” in human beings is always…
always…
mediated by their interaction with other human beings…
always…and over years and decades.
all iq tests are achievement tests whether they want to be or not.
even reaction time is an achievement test.
if einstein had been raised in a barrel…
think about it fucktard.
RIIIIIIIIIIIIIIIIIIIIIIIIIIIGHT!!!
you’re a moron carl.
scott alexander?
seriously?
that guy is a self-described [redacted] with diarrhea of the mouth. he’s a total fucktard. and he lives in detroit. yuck!
if you put a gun to someone’s head for decades and say, “be smart!”
guess what.
they WILL score higher at the end of that.
some people put a gun to their own head.
smart isn’t ever something one just is. it’s something one becomes by turns over years not days or months.
raise a perfectly normal newborn in a barrel…never speak to him, just feed him…
he’ll be just as tall as he would have been, but he’ll be retarded. though this will likely be remediable…over years of intensive education.
hbder = retard.
[pumpkin person: Oct 20, 2015, redacted one word from this comment]
I redacted the part about scott alexander being a self-described whatever, unless you have a link.
a link to his own blog peepee-tard?
scott alexander is the PSEUDONYM behind slate star codex.
he says on his own blog he’s a “polyamorist”.
fucking canadians.
you persecuted ernst zundel.
you don’t get that censors are always worse than whatever anyone might say.
AND THIS IS THE TRUTH FUCKTARD. IT’S NO SECRET.
I thought of this today because a bunch of people have accosted me about the article There’s A Big Problem With Polyamory That Nobody’s Talking About. “Scott, you’re polyamorous! What do you think of this?”
http://slatestarcodex.com/2015/02/11/black-people-less-likely/
peepee is a stalinist cunt, yet worships rich people.
her head should explode.
hahaha…sure, don’t even read and bullshit more. Here, watch Plomin.
http://www.bbc.co.uk/programmes/b06j1qts
I assume since your IQ is pretty low, you’ll find it easier to watch the video above than burn out trying to understand behavior genetic and neuroscientific concepts
people with low IQs both:
1. overestimate their IQs.
2. think that their betters have lower IQs than they do.
carl,
get into the BGI study.
then we can talk.
watch Plomin?
what’re you fucking retarded?
Plomin is a JOKE!
He is the most pre-eminent behavior geneticist in the whole world
Employers should be able to use both IQ and SAT scores in ascertaining intelligence and competence. Generally, the correlation holds pretty well, but there are always some outliers. A 99th percentile IQ and an SAT score of 1,200 is still better than 95% or so of the population, and I’m sure employers would take that into consideration.
even if they could, employers wouldn’t use IQ tests…in america or canada.
jew of the blogosphere has commented on this many times.
it’s only delusional prole conservatards who think the problem is Griggs v Duke Power.
employers do use test scores elsewhere, but not just because they can.
those with the power to hire and fire in america and canada have benefited from the system which doesn’t select for iq…they’re not that bright…and they know it.
hiring managers hire people who are most like themselves when they’re given the freedom and fire people who are least like themselves. but they also tend to have zero introspection, so they don’t see themselves doing this, they think they’re just hiring and firing the “best” and the “worst” respectively.
basically those with a conservatard ideology are socially retarded.
LOTB also believes employers care very much about whether you went to an elite school. Seems like a contradiction to eschew testing, yet select people from schools that select based largely on testing.
FOR ONCE PEEPEE IS EXACTLY RIGHT.
IT IS A CONTRADICTION.
JUST ONE OF MANY EXAMPLES OF DOUBLE THINK IN LES ETATS UNIS MERDEUX.
EVEN STEVE SHOE HAS POSTED ON THIS…WITH QUOTES FROM ELITE FIRMS’ HIRING MANAGERS.
PEEPEE,
AT MOST HALF OF ELITE AMERICAN SCHOOLS’ STUDENTS HAVE BEEN SELECTED FOR THEIR IQ…EVEN INDIRECTLY.
IT’S NOT ME PONTIFICATING PEEPEE.
THE STUDIES HAVE BEEN DONE. THE BOTTOM LINE IN CONTEMPORARY AMERICA-STAN IS THAT…
USING TEST SCORES…
SMART POOR KIDS ARE MUCH LESS LIKELY TO GRADUATE FROM ANY COLLEGE THAN DUMB RICH KIDS.
THANKFULLY CANADA THREW OUT ITS WHITE TRASH PM YESTERDAY.
BUT UN-THANKFULLY REPLACED HIM WITH A RICH KID DIM WIT.
CANADA NEEDS TO STOP TRYING TO IMITATE AMERICA. AMERICA IS SHIT.
CANADA NEEDS A NEW DEMOCRAT WHO’LL ALSO LIMIT IMMIGRATION.
this is clear even in the example of prof steve shoe, a great booster for what he thinks is american meritocracy.
look at shoe’s collaborators on both his physics papers and his behavior genetics research. it’s not just that the bgi is chinese. shoe’s american collaborators are also overwhelmingly chinese americans.
i mean fucking grow up.
take your head out of your ass.
stop licking rich men’s assholes.
If you think intelligence is overall adaptive ability, the WAIS is for you. It is based on that concept. But the (old) SAT – at least the Verbal Reasoning test – is definitely more g-saturated than the WAIS. The WAIS is clinically useful just because so many of its tests reflect factors other than g. Specifically, it includes measures of processing speed – less g-saturated than reasoning, comprehension, or vocabulary – and of general visualization capacity. It is true that the SAT involves items somewhat improved by education, but the same is true for parts of the WAIS – such as vocabulary, information, and arithmetic – and those items happen to have the highest g loadings and the best predictive validity.
Remember, also, most of the evidence for the significance of IQ comes from the Armed Services tests, which resemble the SAT much more than the WAIS. The SAT (of old) and the Armed Services tests have been subjected to far greater refinement than the tradition-bound WAIS.
The WAIS is for diagnosis more than precise g estimation, particularly for very-superior adults. (But of the individual tests, the Woodcock-Johnson is unequivocally better than the WAIS.) The WAIS is a rather easy test, and the SAT is ordinarily better for determining capacity above 130. (A 1200 SAT is about 125.) Today, individual IQ tests give the tester some discretion to use one or another composite as the best estimate of cognitive capacity. Perhaps that improves the g estimate (but surely not for high IQ subjects).
[ I wonder if the appearance that the WAIS reflects intelligence better than the SAT might derive from the SAT’s attenuated range.]
I don’t know why you think the SAT is more g loaded than the WAIS. The WAIS has a g loading of 0.9 and the reason it’s so g loaded is that it measures many different kinds of g loaded abilities. The extreme diversity of the WAIS allows specific strengths and weaknesses to cancel out leaving the aggregate score especially g loaded.
By contrast the SAT measures mostly reading and math skills, both of which are highly g loaded, but a high g person who attended a crappy high school or isn’t very academic is pretty much screwed on the SAT.
The old SAT did have a high ceiling, but a super genius I once knew felt it didn’t measure g above 1400.
I was perplexed by .9. What is this g-loading estimate based on? It’s equal to the test-retest reliability of the WAIS. (And g is a hypothetical construct, after all.)
If the WAIS truly were composed of diverse tests, it would allow specific strengths to cancel out, although at the expense of error variance. But it isn’t – instead it reflects certain clusters disproportionately: visuo-spatial ability and processing speed.
The best tests for _high-level_ intelligence don’t, however, use the strategy of extreme diversity. Terman’s Concept Mastery Test comprised vocabulary and verbal-analogy items. Diversity is advantageous, but the diverse components should each carry a large g loading, and in practice, it is problematic to measure g at high levels using performance items.
The SAT-V has deteriorated, as you probably know. The old SAT-V contained four kinds of verbal reasoning items: vocabulary, analogies, paragraph comprehension, and sentence completion. Sentence completion is probably the best single g measure (Ebbinghaus’s verbal cloze test).
But do you really think including measurements for digit span forward and digit-symbol substitution is conducive to measuring g in high-g subjects? When the test has a ceiling of 130 to begin with, such WAIS subtests detract from estimating levels of superior intelligence.
I agree the SAT (old version) doesn’t substantially measure g after 1400. (I would prefer to say 700 Verbal Reasoning.) The GRE-V today measures a bit higher, as does the Miller Analogies Test, the successor to Terman’s Concept Mastery Test.
I was personally told by a reputable intelligence researcher that the WAIS has a g loading of 0.9. I’m not sure where he got the figure, but it was some peer reviewed paper.
In addition, one way of estimating g loading is that the correlation between tests is a product of their factor loading, assuming two tests share little variance except g (Jensen 1998)
The WAIS correlates about 0.72 with the Raven which is known to have a g loading of 0.8. This implies the WAIS has a g loading of 0.72/0.8 = 0.9
By contrast the new SAT correlates only 0.65 with the Raven, and I’m skeptical about the new SAT being noticeably less g loaded than the old.
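As a quick illustration of the estimation method just described, here is a minimal sketch, assuming (per Jensen 1998) that the two tests share essentially no variance except g; the function name is hypothetical and the figures plugged in are simply the ones quoted in these comments:

```python
# Sketch of the Jensen-style inference used above: if g is the only source of
# shared variance between two tests, their correlation equals the product of
# their g loadings (r_AB = gA * gB), so one loading can be inferred from the
# other:  gA = r_AB / gB.  Figures below are the ones quoted in the comments.

def implied_g_loading(r_with_reference: float, reference_g_loading: float) -> float:
    """Infer a test's g loading from its correlation with a reference test,
    assuming g is the only shared variance."""
    return r_with_reference / reference_g_loading

RAVEN_G = 0.8  # g loading attributed to the Raven above

print(implied_g_loading(0.72, RAVEN_G))  # WAIS:    0.72 / 0.8 = 0.9
print(implied_g_loading(0.65, RAVEN_G))  # new SAT: 0.65 / 0.8 ≈ 0.81
```

Note that the same arithmetic applied to the 0.65 SAT–Raven correlation lands near the 0.8 figure cited for the SAT earlier in the thread.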
The WAIS has a ceiling approaching 160 in young adults and as high as 190 in the elderly (though its functional ceiling is debatable).
The newest version measures at least five broad cognitive abilities: verbal comprehension, working memory, abstract reasoning, spatial reasoning, and processing speed.
By contrast the SAT measures mostly just verbal comprehension and working memory.
The main problem with the SAT is that it’s too influenced by what you studied in high school and socioeconomic background:
https://pumpkinperson.com/2015/04/08/charles-murray-is-wrong-the-sat-is-a-student-affluence-test/
Btw, none of this is meant as a criticism of the SAT. The SAT makers do not want it to be considered an IQ test & I don’t really want the SAT to be measuring IQ.
I want to live in a meritocracy, but only if it occurs naturally, without the aid of IQ testing.
Having recently taken the SAT, I can attest that it actually measures (to some degree or another) all five of the cognitive abilities the WAIS professes to measure:
verbal comprehension — the entire Reading section is devoted to this.
working memory — required in both the Reading and Math sections; more specifically in algebraic problems that force the test taker to work with simultaneous equations or analyze large amounts of data; also required in compare/contrast questions on the verbal section.
abstract reasoning — while the test loads less on abstract reasoning than on verbal comprehension and working memory, some of the word problems on the math section can get pretty abstract.
spatial reasoning — required in geometry problems, especially the more complicated word problems that don’t include diagrams.
processing speed — each test is timed, and tightly so.
On a side note, the addition of a non-calculator section in the forthcoming rendition of the test should make the math section a little more g-loaded, and less about what calculator tricks you’ve memorized.
Closing thought: Why does the WAIS measure processing speed in the first place, considering it’s been measured to have as low as 0.07 correlation with Gf? (source: http://tinyurl.com/oeu6rjh)
I suppose you could argue almost any test measures all five abilities of the Wechsler. Take the Raven for example:
Verbal comprehension: talking your way through the problems
Spatial: the problems are visual
Working memory: keeping multiple patterns in mind at once
Abstract: the problems are analogical
Processing speed: Some versions of the Raven have time limits
The point is, if the SAT and the Raven were included in a large factor analysis with all the subtests of the Wechsler, what we would find is probably the Raven loading primarily on abstract reasoning and all the subtests of the SAT loading primarily on verbal comprehension or working memory.
Why does the WAIS measure processing speed in the first place, considering it’s been measured to have as low as 0.07 correlation with Gf? (source: http://tinyurl.com/oeu6rjh)
I agree that the WAIS should dump processing speed to make room for more important abilities (e.g., Theory of Mind, Executive Function)…but a low g loading should not be a disqualifying factor because intelligence is about more than g. Any cognitive function that helps one adapt to a wide range of situations should be welcome on an IQ test, regardless of its g loading. It just so happens that when you have many different kinds of cognitive functions measured on a single scale, the aggregate score becomes an excellent measure of g (as non-g variance cancels out).
I would say the sat is probably a better test than iq tests
One strong factor is motivation
Most of the people taking the sat are highly motivated, after all it is a test that will determine the rest of your life.
Doing well on the sat could easily offer tens of thousands of dollars in scholarships and probably hundreds of thousands or millions in future career earnings.
Meanwhile for iq tests most people are paid barely any money maybe $50-100 to take it and it has nowhere near the stakes that the sat has.
I have been vocal that Langan and Rick Rosner are frauds who most likely gamed the hell out of whatever iq tests they took; in addition, their scores are probably inflated due to the general population simply not caring.
If these two dumbasses were as brilliant as they claimed they wouldn’t be broke and barely employed losers.
If they truly had the sat scores that they claimed to have they would have been given merit scholarships and honors programs to top schools.
And yes, gaming iq tests is very possible, since you have an apathetic population, and if you gave an iq test to a psychologist who studied iq tests he would easily score in the genius range.
While for the sat many people game it so that gaming no longer has an effect
The test questions are all judged for fairness and changed every year, and there is little to no selection effect on who takes it.
In addition, formal education is not required, since many of the math questions are designed to be solved multiple ways. You have extremely bright 8-year-olds doing well on the math portion of the sat despite not having formal knowledge. They were able to get the answers to mathematical problems they had not been taught to solve, simply by using mathematical intuition.
“While for the sat many people game it so that gaming no longer has an effect ”
This is a common assumption, but I don’t think it true. A test measures different abilities when it is gamed than not. Essentially, a test gamed by all is less a measure of g: it becomes less cognitively complex.
What I meant is that if a majority of the test takers prep, it simply moves the average raw score a bit higher.
But the sat scores are percentile scores so getting a 700+ in a section means that you did better than 95 percent of the people taking the test.
If everyone was gaming, the high iq guy that games would score in the 95th percentile and his raw score would be something like 98 percent of the questions correct; if nobody games, the high iq guy would still get the 95th percentile and probably 87% of the questions correct.
Some of the not so smart guys purposefully take the sat at not so popular times to avoid competition
There is a hard limit to preparation, shown in several studies; in particular, for the math section, questions are rated easy, medium, hard, and very hard.
Prepping usually allows an individual to get all the easy and medium questions right.
“The WAIS correlates about 0.72 with the Raven which is known to have a g loading of 0.8. This implies the WAIS has a g loading of 0.72/0.8 = 0.9”
NO!!! Think about what you are saying: then, the lower the g content of the Raven, the higher the g loading of something that correlates with it. But it’s the other way around, as we’re looking for common variance: .72 * .8 = .58.
Think about what you are saying: then, the lower the g content of the Raven, the higher the g loading of something that correlates with it.
Correct. If two variables only correlate strongly because of their shared loading on g, then a low g loading on variable A means variable B must compensate with a stronger g loading. If both A and B have low g loadings, then they can’t correlate strongly if g is their only link.
In any event, do you agree your computation was erroneous?
Your formula would be correct, I should add, if variance of g predicted by the Ravens were independent of that portion predicted by the WAIS. But that isn’t a reasonable assumption.
Your formula would be correct, I should add, if variance of g predicted by the Ravens were independent of that portion predicted by the WAIS.
Which is why I said “one way of estimating g loading is that the correlation between tests is a product of their factor loading, assuming two tests share little variance except g (Jensen 1998)” and then said it again: “If two variables only correlate strongly because their shared loading on g”
But that isn’t a reasonable assumption.
It’s a debatable point
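For readers following this exchange, here is the algebra behind the quotient, written out under the same debatable assumption (that g is the only shared source of variance); the single-common-factor setup and symbols are an illustrative sketch, not something either commenter spelled out:

```latex
% Illustrative single-common-factor sketch: standardized scores A (WAIS) and
% B (Raven), g loadings g_A and g_B, independent non-g parts e_A and e_B.
A = g_A\,g + e_A, \qquad B = g_B\,g + e_B
\;\Longrightarrow\; \operatorname{corr}(A,B) = g_A\,g_B
\;\Longrightarrow\; g_A = \frac{\operatorname{corr}(A,B)}{g_B} = \frac{0.72}{0.8} = 0.9 .
```

Multiplying instead (.72 × .8 ≈ .58) gives the correlation you would predict between two tests whose loadings are already known to be .72 and .8, which is a different question.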
On the “ceiling” of the WAIS – The best measure of g on the WAIS is Vocabulary. This may be disputed because g is theoretical. But of all subtests, vocabulary correlates best with full scale IQ. And vocabulary is the most heritable of the subtests, which is a clue to its loading with g, since g is the most heritable component of mental ability. Yet someone with a 130 IQ can max the vocabulary subtest. Thus the best subtests have low ceilings, making the WAIS unsuitable as a high-level test. (Similarly for Information, the second best subtest, as judged by heritability.)
The correlation between WAIS and Ravens (larger than with SATs) is due to the attenuated range of the SAT. Moreover, the Ravens is vastly overestimated as a measure of g. This is proven by the Flynn effect (unless you really believe we’re all getting smarter just because we’ve collectively learned to solve a certain kind of test question).
Note that I’m talking about high-level IQ. The WAIS is better for below average IQ; I won’t bet on which is better for near-average.
“The SAT makers do not want it to be considered an IQ test & I don’t really want the SAT to be measuring IQ.”
“I want to live in a meritocracy, but only if it occurs naturally, without the aid of IQ testing.”
1. The fact that you have strong wants on this subject counts against rather than for your conclusion!
2. The SAT makers have their own reasons not to want it considered an IQ test. Maybe it isn’t any more. It’s now all paragraph comprehension (which is not the most valid of its previous subtests; analogies and cloze were excluded because they are too IQ-like).
On the “ceiling” of the WAIS – The best measure of g on the WAIS is Vocabulary. This may be disputed because g is theoretical. But of all subtests, vocabulary correlates best with full scale IQ. And vocabulary is the most heritable of the subtests, which is a clue to its loading with g, since g is the most heritable component of mental ability. Yet someone with a 130 IQ can max the vocabulary subtest. Thus the best subtests have low ceilings, making the WAIS unsuitable as a high-level test. (Similarly for Information, the second best subtest, as judged by heritability.)
2% of the population has an IQ of 130+. Only 0.1% of the population hits the ceiling on vocabulary and only 0.1% of the population hits the ceiling on information
The correlation between WAIS and Ravens (larger than with SATs) is due to the attenuated range of the SAT.
No, if everyone in the U.S. took the SAT and Raven, the correlation would be 0.65. In the restricted samples that take the SAT it’s lower.
Moreover, the Ravens is vastly overestimated as a measure of g. This is proven by the Flynn effect (unless you really believe we’re all getting smarter just because we’ve collectively learned to solve a certain kind of test question).
The Raven has a g loading of 0.8 which means 64% of the variance is g. 36% is not g and that helps explain the Flynn effect, though we are partly getting smarter because of better nutrition.
I understand that the WAIS IV differs from WAIS III (which I’m much more familiar with) by greater accuracy at the extremes. I can’t readily find stats, but I’m guessing that Vocabulary’s ceiling was lower on the WAIS III.
The basic point I’m making, however, is the same as that by Jayman when he wrote of scores that are tilted. Feynman and Shockley might well have received (as claimed) relatively mediocre IQ tests (for the reason Jayman provided) – whereas it is inconceivable that they would not have received exceptional scores on the GRE or old SAT. Do you agree?
But do you have a source for the statistics? I have trouble believing that only one in a thousand maxes out the vocabulary section. The test would need to use extremely obscure words, which would obviate their value for IQ.
You still haven’t explained how you can offer “g-loadings” of tests, when g is a hypothetical construct. How could you compute it? (There are ways, but they involve theoretical assumptions which produce different answers.) Evidence that the g loading of the Raven is much lower than .8 is that Ravens scores are less heritable than verbal scores and that a Vocabulary test has much greater predictive validity for job performance than a Ravens. (Which is presumably why the Armed Forces Tests and the SAT/GRE tests don’t use nonverbal measures. And – I would speculate – it is the reason that the Analytic Reasoning test was abandoned on the GRE.)
Your basic reservation about the academic aptitude tests as being g measures (compared to the WAIS) is that it is highly saturated with material taught in school. This, I agree, must be a bias when it comes to estimating g. But the evidence of the whole history of ability testing is that it is much less a bias than we intuitively assume. Test items don’t have to look like what they’re testing! On the other hand, Ravens is heavily biased by the general visualization factor.
Last point – I take your point that the .65 estimate took account of attenuation. An alternative point: the Ravens correlates with the WAIS because the Ravens is a nonverbal test and the WAIS includes a performance section (while there’s nothing of the sort on the SAT).
On pumpkinperson’s point about the test-specific variance of Ravens being sufficient to account for the Flynn Effect: this is the explanation, but even if .2 provides enough room for the observed effect, the nature of the effect is such as to lower the test’s g-saturation itself by making the items less complex for more sophisticated test takers.
The basic point I’m making, however, is the same as that by Jayman when he wrote of scores that are tilted. Feynman and Shockley might well have received (as claimed) relatively mediocre IQ tests (for the reason Jayman provided) – whereas it is inconceivable that they would not have received exceptional scores on the GRE or old SAT. Do you agree?
My guess is Feynman would have scored a 545 on the verbal section of the old SAT (equivalent to his reported score on a likely verbal IQ test) and an 800 on the math section.
But do you have a source for the statistics? I have trouble believing that only one in a thousand maxes out the vocabulary section.
The source is the WAIS-IV manual which equates the highest vocab score to a scaled score of 19 or 19+ at every age. A scaled score of 19 is +3 SD; technically that’s about the 99.865th percentile, but it gets rounded to the 99.9th.
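A quick way to check that percentile arithmetic, as a minimal sketch (scipy is just a convenient tool here, not anything referenced in the thread): Wechsler scaled scores have a mean of 10 and an SD of 3, so a 19 sits at +3 SD.

```python
# Check the percentile of a +3 SD score under a normal distribution:
# Wechsler scaled scores have mean 10 and SD 3, so 19 = 10 + 3*3 = +3 SD.
from scipy.stats import norm

z = (19 - 10) / 3  # = 3.0
print(norm.cdf(z))  # ~0.99865, i.e. about the 99.865th percentile, rounded to 99.9
```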
The test would need to use extremely obscure words, which would obviate their value for IQ.
No, the test uses common words that most people have heard, but that are often misunderstood
Evidence that the g loading of the Raven is much lower than .8 is that Ravens scores are less heritable than verbal scores and that a Vocabulary test has much greater predictive validity for job performance than a Ravens. (Which is presumably why the Armed Forces Tests and the SAT/GRE tests don’t use nonverbal measures. And – I would speculate – it is the reason that the Analytic Reasoning test was abandoned on the GRE.)
Raven is less heritable than which verbal scores? And predictive validity can be gained from non-g variance like acquired knowledge, special talents, and achievement motivation, all of which might be correlated with verbal scores independently of g.
The Raven probably cannot have a g loading much below 0.8 because the Matrix reasoning subtest on the WAIS-IV, which is a short version of the Raven, has a g loading around 0.73 as documented by factor analysis studies. If this short, abbreviated, less reliable ripoff of the Raven has a g loading of almost 0.8, then the actual Raven must be at least 0.8.
Your basic reservation about the academic aptitude tests as being g measures (compared to the WAIS) is that it is highly saturated with material taught in school. This, I agree, must be a bias when it comes to estimating g. But the evidence of the whole history of ability testing is that it is much less a bias than we intuitively assume. Test items don’t have to look like what they’re testing! On the other hand, Ravens is heavily biased by the general visualization factor.
The Raven is a relatively weak measure of visualization ability. It’s mostly about conceptual abstract reasoning
Last point – I take your point that the .65 estimate took account of attenuation. An alternative point: the Ravens correlates with the WAIS because the Ravens is a nonverbal test and the WAIS includes a performance section (while there’s nothing of the sort on the SAT).
That’s possible but Raven is probably no more a Performance subtest than the geometry problems on the SAT. Indeed once Matrix reasoning type items were added to revisions of the WAIS, they abandoned the idea of a Performance IQ, and on the WISC-V, performance IQ has been replaced by spatial IQ, processing speed IQ, and fluid reasoning IQ (Matrix type items) because Matrix type problems simply don’t cluster that well with traditional Performance tests
There is hardly any specific variance in the FR cluster. I don’t think Gf is a terribly useful concept.
What studies are you thinking of when you say MR has a g loading of .73? I believe the WAIS-IV states .70, uncorrected.
It might be 0.7 not 0.73…at least according to this source:
https://books.google.ca/books?id=vMR9b7dshrQC&pg=PA49&lpg=PA49&dq=matrix+reasoning+g+loading&source=bl&ots=7EmkBQ1C0p&sig=3QhDSYI5eyF1b4L_QKjljuC1KXA&hl=en&sa=X&ved=0CCIQ6AEwAWoVChMI74GDx7vPyAIVhf0eCh2Zbg6Q#v=onepage&q=matrix%20reasoning%20g%20loading&f=false
But the point still stands. If a short little subtest like Matrix reasoning has a g loading of 0.7…the longer more reliable version of the same tasks, Raven, should be around 0.8
This is an interesting topic for me because my situation is nearly identical to the original comment. I took my SAT in the late 90s and scored under 1200, lower than my IQ would predict.
The one thing that may be of interest in this discussion is that I scored 200 points higher in verbal than in math.
I can explain the disparity by offering that as a kid I found math uninteresting, I had a bad attitude about it, didn’t do the homework and never really learned it. I suppose one could counter that a high IQ person should be able to grasp math easily and still score well.