Transcript: Zach Hambrick on Psychometrics and the Science of Expertise – #28

Steve: Thanks for joining us. I’m Steve Hsu.

Corey: I’m Corey Washington and we’re your hosts for Manifold.

Steve: Well, Corey, we’ve got a special treat today. Our guest is professor Zach Hambrick of the Michigan State University psychology department. Because he’s from Michigan State, we’ve got him right here in the room with us, so no staring into a video monitor and weird audio. Got him right here. Zach got his PhD in experimental psychology at Georgia Tech in 2000, and his expertise is individual differences. Zach, maybe you can just explain to our listeners what is meant by that specialization within psychology.

Zach: Well, I study how people differ, particularly in cognitive abilities and also skill in complex domains. Individual differences research looks at sources of between-subject, or between-person, variability in intelligence and more specific cognitive abilities.

Steve: Do you do personality as well?

Zach: A bit. The main focus is ability, but we’ve also looked, in various studies, at personality traits, for example, in predicting outcomes like learning a complex task, and we look at whether personality also predicts variation in outcomes like that. Again, the main focus is on general intelligence and also more specific facets of cognitive ability like working memory.

Steve: Right before we started, I was asking you how your field relates to this other field in psychology called personnel selection, and I think you explained that personnel selection is an application of what you do. For the listeners, personnel selection is really the science of how you… For example, you have a certain job category that you want to fill. What are the best predictors, the measurables or facts about the applicants, that you can use in order to fill those positions with the most capable people?

Zach: Right. That is one of the major applications of this work. For example, we have a grant from the Office of Naval Research, and we’re looking at ways to improve the ASVAB. The ASVAB is the Armed Services Vocational Aptitude Battery. It’s basically a cognitive ability test that is used for personnel selection and classification, whether someone is eligible to enlist and then what sort of job they’re placed in. The ASVAB is a quite old test, and what we’re looking at is whether we can measure new cognitive constructs and see whether those improve prediction of outcomes above and beyond the ASVAB.

Steve: Maybe we could talk a little bit about the military applications of cognitive testing. I think a lot of people misunderstand this, but my understanding is that literally every recruit into the military, so we’re talking about millions of people, and then over time probably 100 million people have been tested with these instruments, right? Or tens of millions, at least?

Zach: Yeah, that’s exactly right.

Steve: Right. How you score really has an important impact as to where you’re placed in the military, right?

Zach: It does. I don’t know the specific cutoffs, but it also determines whether you can enlist in the first place, and then, in turn, what sort of job you’re placed in.

Steve: Right. I think that depending on whether there’s a war on and what the economy is like and what their recruiting goals are, the lower cutoff for who they will accept in the military fluctuates around, I think, the IQ equivalent of about 90, something like this. If you have trouble hitting that threshold, you have trouble actually getting into the military, even if you want to.

Zach: Right. The cutoffs, I believe, I don’t have the exact numbers again, but vary depending on the branch.

Steve: Right. I think I recall like, okay, say you want to be in the communication specialty or electronic specialty, generally that would have a higher threshold. On the other hand, if you’re maybe infantry or some support function, the cutoff might be lower.

Zach: Right. They’re different. The ASVAB is a test battery, and it includes sub-tests to measure verbal ability and mathematical ability. It also has sub-tests to measure knowledge in particular areas like electronics information and mechanical comprehension, this sort of thing. The main score that’s used for enlistment is called the AFQT, the Armed Forces Qualification Test. It basically combines verbal ability and mathematical ability, a measure of general cognitive ability, at least crystallized cognitive ability. The other sub-tests can then be used for classification purposes.

Corey: I have a question, Zach. They give a fairly generic test to these initial recruits, at least; everyone is given it. Then when someone wants to go into a field, say, communications, are they given a secondary test that specializes in that particular subject matter?

Zach: Actually, what I was-

Corey: Or are they actually weeded out purely on the basis of the initial test?

Zach: Right. The AFQT score, it captures performance on a few of the sub-tests on the ASVAB. Then the other sub-tests are used for more classification purposes. There may well be additional tests that people are given once they enlist. I’m actually not sure on that.

Corey: Because it would just seem to me kind of odd if you had a more or less generic IQ test and that said you can’t be in communications, but were this person to take a specific communications test, they might actually be pretty damn good, although they basically flunked the initial IQ test.

Zach: Well, again, the presumption is that, for a job like that, you have to have a certain level of cognitive ability, and that will tend to be true. There may be people out there who have a high level of skill in some particular domain, but-

Corey: I’m just curious as to whether the armed forces realizes that, that there’s somebody who might not be terribly good at a general test.

Steve: I think that this… We’re going to get into this topic a little bit if Zach wants to talk about it. I could talk about it a little bit. But there was a very famous situation under Robert McNamara during the Vietnam War. You’re familiar with this, right?

Zach: Sure. Go ahead.

Steve: Well, so they kind of tested your hypothesis, Corey. What they did is they admitted a bunch of people with lower scores than they ordinarily would have into the military and into certain specialties, and then measured very carefully. I think it was 100,000 people. It was not a small study, statistically. Then, for example, so you might say, “Oh, well, we used to have this cutoff of 110 to get into artillery. Well, we’ll just let some people in who don’t qualify under our old rubric and see how they perform.”

Corey: That’s not what I’m suggesting, actually. I’m saying, look, you might have some people who have actually pretty narrow abilities, maybe almost savants in a very narrow area, or they do pretty badly in general, and you don’t want to lose these people, so give them two tests, a general test, where they might do well, might not do so well, but then you might actually uncover a pretty specific ability that you might want to exploit later on. To weed these people out on the basis of the initial test strikes me as not very smart.

Steve: Right. Well, I think you’re making an empirical claim there. It may turn out that it is useful to relax the raw ability measurement in favor of the narrow skill measurement, but you’ve got to remember these guys have to do a bunch of other stuff. As well as run the cannon, they have to actually interact with other people and do some more general tasks. There’s probably some very nuanced theory among psychologists about which is the better predictor. Even if the guy is very high on some narrow skill, if he’s sufficiently low on overall aptitude, there still may be a problem.

Corey: Yeah. It also depends, of course, on the final job the guy is going to be doing and what his life is going to be like in the military.

Steve: Yeah. I’m happy to get into more detail about the military stuff, but I think the one point I wanted to make, and I’ll let Zach take the lead on this, but the main point I wanted to make is that since the SAT and other aptitude tests are very controversial right now in the academy, in university admissions and graduate school admissions, one of the things that most people seem unaware of is that there’s, in a sense, an equally large set of results in the military, where they’ve come to the conclusion that it does make sense to have pretty hard thresholds for letting people into certain areas of operation. I think the data are very strong. Maybe, Zach, you want to say something.

Zach: Right. Well, a lot of this work has been done by people in organizational psychology and industrial organizational psychology. If you ask a person in this area what the single-best predictor of job performance is, what they’ll tell you is general cognitive ability. This is a conclusion based on huge data sets, many from the military. Frank Schmidt at the University of Iowa has been one of the leading people in this area for many decades. I think that they’ve looked at a wide range of jobs, from really simple jobs to complex jobs, and the validities, which is to say the correlations between general cognitive ability, or some other predictor, and performance, they tend to be higher for more complex jobs than less complex jobs, but they’re nontrivial even for low-complexity jobs, which gets a little bit at your point. Whatever the job is, general ability is still going to be a factor that-

Corey: I’m not arguing it’s not. I’m asking whether there’s another dimension, whether some narrow abilities might actually, in a sense, trump the general ability if there’s a conflict.

Zach: Yeah, this has been a huge debate in this area. The question is, is it all just G, general intelligence? If you look at job performance, you can improve prediction a little bit with measures of more specific abilities, but it’s not much compared to the contribution of general cognitive ability. That’s the conclusion from this work that Steve was referring to.
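What Zach describes, improving prediction "a little bit" beyond G, is what psychometricians call incremental validity: compare the variance explained by a model with G alone against a model with G plus the specific ability. Here is a minimal simulation sketch; all effect sizes are invented for illustration, not drawn from the literature.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical effect sizes, assumed for this sketch: performance is
# driven mostly by g, plus a small independent specific ability.
g = rng.normal(size=n)
specific = rng.normal(size=n)
performance = 0.5 * g + 0.15 * specific + rng.normal(size=n)

def r_squared(predictors, y):
    """Variance in y explained by a least-squares fit on the predictors."""
    X = np.column_stack([np.ones(len(y))] + predictors)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_g = r_squared([g], performance)
r2_both = r_squared([g, specific], performance)
print(f"R^2, g alone:          {r2_g:.3f}")
print(f"R^2, g + specific:     {r2_both:.3f}")
print(f"incremental validity:  {r2_both - r2_g:.3f}")
```

With these assumed coefficients the specific ability adds only a couple of percentage points of explained variance on top of G, which is the shape of the finding Zach summarizes.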

Steve: Yeah. I think if you look at Schmidt’s papers or other papers in this area, which, again, are based on huge statistics, it’s generally not even close.

Zach: That’s right.

Steve: The first factor is general cognitive ability, which is basically kind of IQ. Then if you look at the next biggest factor that they can find, which has to be something that you can measure about the candidate beforehand, nothing comes even close. It could be five times less variance accounted for, or something like this.

Zach: That’s correct. That’s also true for personality, say. Conscientiousness is one of the so-called Big Five personality traits, and it correlates positively with job performance, but it’s nowhere near, in terms of its predictive power, it’s nowhere near general cognitive ability, as a general statement, across a wide range of jobs.
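To unpack "five times less variance": a validity coefficient r accounts for r² of the variance in the outcome. A quick back-of-the-envelope check, using the roughly 0.5 figure for general ability mentioned in the conversation and an assumed 0.22 for conscientiousness (an illustrative value, not a citation):

```python
# A validity coefficient r accounts for r**2 of the outcome variance.
# Both r values below are illustrative assumptions for this sketch.
predictors = {"general cognitive ability": 0.50, "conscientiousness": 0.22}
for name, r in predictors.items():
    print(f"{name}: r = {r:.2f} -> variance explained = {r**2:.1%}")

ratio = predictors["general cognitive ability"] ** 2 / predictors["conscientiousness"] ** 2
print(f"ratio of variance explained: {ratio:.1f}x")
```

Under those assumed values, ability explains about 25% of the variance and conscientiousness about 5%, roughly the five-to-one gap Steve alludes to.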

Steve: Let me say something which I think is a folk statement about the kind of stuff that you do. You can react to it. Some people might say, “Well, we kind of know… You just stated the big result. We kind of know that general cognitive ability works well, and nothing else really comes close. In a way, the field is, you could regard this as a positive thing or a negative thing, it’s reassuringly boring, in a sense, that this main result just keeps getting replicated again and again, and nobody can find another thing that you could measure about the person that is nearly as good as cognitive ability.” But the general public does not want to accept this is true, and so there’s just this constant barrage of other results which may not actually… I’m using air quotes when I say results… which may not replicate. But as soon as somebody proposes one, whether it’s Malcolm Gladwell or, I forgot, is it Amy Duckworth or Susan-

Zach: Angela Duckworth.

Steve: Angela Duckworth. Somebody proposes some other measurement thing which isn’t cognitive ability, but which is awesome. Everybody seizes on that because they just like that feel-good story. They don’t want to believe that there’s just one thing that really is so impactful. Is that how the world looks to you?

Zach: Yes. That’s exactly it. For many decades now, people have been looking for something that is better than G, and it’s really hard to find. When people make claims that there is, at least to this point, those claims tend to be overblown. One important point to make here is that G is far from perfect as a predictor of job performance or any other outcome, so there’s value in searching for additional predictors. That could include, certainly, specific experience and specific skills, which may themselves be correlated with G, so there’s that issue. You find people who have really high knowledge in some specific domain, well, that might be a stand-in for or reflect their high level of cognitive ability, but yeah, that’s certainly been… This work on general intelligence and, particularly, job performance, it is kind of reassuringly boring, but I think that’s what we should actually want in a science of psychology.

Steve: Right. A stable truth.

Zach: Stable truth. Another stable truth, in fact, maybe the most stable truth in psychological science, is G itself. G, it’s the general factor of intelligence. If you give a bunch of people a bunch of cognitive ability tests, scores on those tests will tend to correlate positively with each other. A person who does well on one will tend to do better on all the others. 150 years ago, people didn’t know this. This is not a foregone conclusion. You can think of models of the brain and cognition that would predict different outcomes, but it turns out that’s the way it is. G is a take-it-to-the-bank finding, and probably the most replicated finding in all of psychology.
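The pattern Zach describes, all cognitive tests correlating positively, is called the positive manifold, and it is what lets a single general factor be extracted from a battery of tests. A minimal one-factor simulation, with assumed loadings chosen purely for illustration, shows both pieces: every pairwise correlation comes out positive, and a single component captures a large share of the total variance.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# One-factor ("g") simulation: every test score loads on a shared
# factor plus test-specific noise. Loadings are assumed, illustration only.
g = rng.normal(size=n)
loadings = [0.8, 0.7, 0.6, 0.5, 0.4]
tests = np.column_stack(
    [lam * g + np.sqrt(1 - lam**2) * rng.normal(size=n) for lam in loadings]
)

corr = np.corrcoef(tests, rowvar=False)
off_diag = corr[np.triu_indices(len(loadings), k=1)]
print("all pairwise correlations positive:", bool((off_diag > 0).all()))

# The largest eigenvalue of the correlation matrix indicates how much
# of the total variance a single general factor captures.
eigvals = np.linalg.eigvalsh(corr)[::-1]
print(f"first component explains {eigvals[0] / len(loadings):.0%} of total variance")
```

A brain with fully independent modules would instead predict a correlation matrix near the identity, with no dominant first factor; as Zach notes, that is not what the data look like.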

Steve: It’s funny, if you look at that early data, they even looked at things which people might not regard as cognitive traits, like ability to detect musical tones or reaction time or things like this, and they all turned out to be positively correlated with G. The strong interpretation of this, which I think is still controversial, but some of my friends who are cognitive scientists believe in this, is that it is basically a measure of the just general goodness of functioning of your brain, and that’s why it’s positively correlated with almost any task that you give somebody.

Zach: Yeah, that’s exactly right. We still don’t know exactly what G is. That is an ongoing area of research and debate. People have said, “Well, it’s working memory, the ability to hold information in the focus of attention.” Other people have said it’s speed of basic mental processes. We still don’t know exactly what G is. But what we do know, as well as we know anything, is that, A, G exists, and, B, it predicts important outcomes: job performance, mortality.

Corey: What’s the correlation with… I guess there are many different types of jobs that we can go down. In my head, I’m having skepticism about whether an Air Force pilot, whether G is the best predictor of performance in combat versus reaction time, other kinds of possible cognitive measures.

Steve: Yeah, the claim would be, just generally in this field, is that you want to select a bunch of pilots and you can make a bunch of measurements on them, and there’s no single measurement that accounts for more than-

Corey: Sure, but I’m wondering about how close are the other measures and how important, say, is simple reaction time. You can’t get into the Air Force unless you have good vision, so that’s a seriously hard cutoff for being an Air Force pilot. They clearly think that that’s extremely important.

Steve: I’m not sure anymore how important vision is because mostly you’re using a heads-up display and stuff. We probably don’t have that much data on that, as you say, because they didn’t let any nearsighted people like me fly planes, so they don’t really know how big of a handicap it is to be staring out through corrective lenses during flight. But my understanding is from across a broad variety of jobs, there isn’t anything close to G, typically, for screening candidates.

Zach: That’s true. Again, this includes things like personality. It includes interview performance. It includes motivation. The average correlation across a wide range of jobs between G and performance is somewhere in the neighborhood of 0.5.

Corey: Again, I’m thinking about the military, and I’m thinking about infantrymen in the military. Now, once you cross that basic threshold maybe above 90, I’m thinking about job performance, and I’m really skeptical about whether G is going to predict job performance in the infantry over physical fitness, over conscientiousness, etc. I can imagine if you have exceedingly low IQ, you may not do well in the infantry, but once you have a basic level of IQ, I would find it surprising, actually, to think that that’s the best predictor. I’d be a disastrous infantryman, I’m pretty certain. I score pretty high on IQ tests. But I’m open to being corrected.

Steve: Well, I think you could give a simple counterexample, like, “Find me a center for an NBA team.” Really, are you going to use G as the main criterion? No, you’re going to look at a whole bunch of other things before you look at G.

Corey: Okay, but there’s a lot more people that can qualify for the infantry than can qualify to be centers for NBA teams.

Steve: Right, so somewhere between professional athlete and foot soldier, at some point you can find other things that are more important. Sure, if the guy can’t walk but has off-the-charts G, probably, yeah, not the infantry.

Corey: Not just that, but there has to be a certain ability to bond with your fellow soldiers, a certain willingness to just work incredibly hard under unpleasant circumstances, etc.

Zach: Strength.

Corey: Strength. Exactly.

Steve: I think your model of reality on those cognitive things, like being able to work well with others or whatever, I think my intuitive picture is similar to yours. The only issue is how well can you measure those in the recruits. Now, physical strength is one that you could measure. If you say, “Hey, you have to be able to at least walk six miles without dying,” that probably is pretty good.

Zach: Basic training no doubt includes physical fitness components.

Corey: You have to run two miles in 12 minutes or something like that.

Steve: Yeah. The data is a little corrupted in the sense that you don’t get the performance data on people across the G range until they’ve already passed through basic training, which means they had some basic level of fitness.

Corey: I’m also curious as to how you assess job performance in these areas, because I’m curious, do we tend to find that generals are very, very high-IQ people, or are they people with these insane work habits, extremely straight arrows, maybe a very good judgment about people, etc., but are they the people who are scoring generally the highest on IQ tests?

Zach: Well, I can’t think of a specific study off the top of my head, but if we looked at average scores on the ASVAB or some other type of ability test, yeah, I think on average we would find that the generals and high-ranking officers are higher on average than, say, infantry.

Corey: I would agree with that, but I would say that my guess is other characteristics take over to move you through the ranks that may be more important than G. Maybe-

Zach: No, this is a great… We’ve got some data that are directly relevant to this. There are two points I wanted to make in response to what you were saying. Okay, so one is the idea that… It’s been called a threshold hypothesis concerning cognitive ability. It’s the idea that you only have to have a certain level of cognitive ability, and that beyond that, G loses its predictive-

Corey: For these kind of jobs, especially.

Zach: Right. This has been looked at in a number of data sets, and what you tend to find is that the relationship is linear.

Corey: For all jobs?

Zach: At least the jobs that have been looked at.

Corey: Which ones are those?

Zach: Everything from tank crew person to infantryman, a lot of different jobs. Okay, this is counter to the idea that beyond a certain level, cognitive ability loses its predictive power. You can look in the classroom, too: the correlations between ACT and college performance are linear. The higher, the better.
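One way the threshold hypothesis can be probed is by re-estimating the ability-performance slope within a restricted high-ability range. A minimal sketch, with an assumed true slope of 0.5: if the relationship really is linear, the slope survives selecting only the top scorers (the correlation shrinks from range restriction, but the slope does not); under a true threshold, the slope among high scorers would collapse toward zero.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000

# Assumed linear world: performance = 0.5 * ability + noise.
ability = rng.normal(size=n)
performance = 0.5 * ability + rng.normal(size=n)

def slope(x, y):
    """Least-squares slope of y on x."""
    return np.polyfit(x, y, 1)[0]

# Re-fit using only the top half of the ability distribution. Under
# linearity the slope is essentially unchanged.
top = ability > np.median(ability)
print(f"slope, full sample:   {slope(ability, performance):.2f}")
print(f"slope, top half only: {slope(ability[top], performance[top]):.2f}")
```

Finding roughly the same slope above the supposed cutoff is the signature of the linear relationship Zach describes in the military data.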

Corey: That’s quite different than job performance, which calls on-

Zach: Yes. I was just noting that you see the same in that realm as well.

Zach: Okay, so another related point is this question of whether, as a function of experience in a job, well, maybe after a certain amount of experience, cognitive ability loses its predictive power, so the validities decrease. This is an idea that has been promoted especially in the literature on expertise, which is another area that I publish in. They say, well, yeah, cognitive ability, talent, if you will, intellectual talent is important initially, but when you continue to practice in a domain and acquire domain-specific skills, it loses its predictive power.

Zach: I call this the vanishing validity myth because there’s really not very much evidence for it. If you look in large military samples… I got my hands on a data set with about 11,000 enlisted personnel, all sorts of jobs, radio operator, artillery, all sorts of stuff, 33 different jobs I think, and they had ASVAB scores and also job experience, how long people had been in the job. You see a little decrease initially in the correlation between AFQT score, which is G basically, and job performance, as measured with hands-on job tests. But even after 10 years of experience in these jobs, and maybe even more, a lot of job experience, the correlations are still large enough to be meaningful. They’re statistically significant, but they’re also practically significant. There’s a little bit of a decrease, but it’s not the case that the validities go to zero.

Corey: Take a characteristic like ability to conform. It would seem to me that that would be pretty important to success in the military. Nonconformists are generally not going to do well, high-IQ nonconformists probably especially badly because they’re going to be rebellious. I’m curious, have people looked at characteristics like that, examining-

Zach: Well, and I would just add that they’re probably not going to enlist in the first place.

Corey: Maybe you’re right. They may get weeded out. I again think of having my time in the corporate world a little bit. A bunch of people came in, and I have to say the people who lasted the longest, from my personal point of view, did not seem to be the smartest. They’re people willing to conform to the culture. The nonconformist got in and then various problems started happening right off the bat.

Steve: Corey, let me try to illustrate, I think, the point that you’re wrestling with. No one is saying that G accounts for all of the variance. You don’t get a rank ordering of career performance-

Corey: Yeah, I’m saying, does something surpass G?

Steve: Right. If you, say, looked at… You go back to your example of generals in the military. I think you would find them uniformly above average compared to enlisted men, but you would also find them, for example, probably extremely self-disciplined and driven. Now-

Corey: And probably much further above average on those counts, I would guess.

Steve: Possibly even more above average on those factors. The issue is, how well can you measure and predict those aspects of that person?

Corey: That’s one of the issues. That’s one of the issues.

Steve: Well, in their research, it is, because you always have to deal with a measurement at the beginning and ask, how well does it correlate with outcome? And if you can’t measure it well at the beginning, you have difficulty predicting.

Corey: Good point.

Steve: One of the issues with G is that it’s pretty measurable. If he measures the AFQT of one of these generals and comes back a year later and measures the guy again, the correlation between those two scores could be like 0.95. He doesn’t have many other instruments like that, other than maybe their height. Even their body weight could fluctuate more than their G score.

Zach: Another point to make here is, yes, G is robust. It doesn’t matter what test battery you use. There are all sorts of cognitive test batteries. And if you give people five different IQ tests, their scores correlate about 0.95 or higher across those batteries. Like Steve says, it’s something that we can measure reliably and validly, which is no small thing in psychology.

Corey: I understand you can’t give somebody a test for discipline, but I’d be surprised if, say, over the course of time at a military school like West Point, you couldn’t figure out after four years who was pretty disciplined. It may take a longer course to figure that out, but I think my question is, suppose you gave a rank ordering of people as regards their discipline at military school, then later looked down the road to see how those people did, the contra-hypothesis would be that as you went up the ladder, discipline, beyond a certain G threshold, trumped G.

Steve: You’ve described an interesting research project for psychology. In his field, one of the main things, you would become famous if you developed a good method or instrument for measuring a personality quantity which you could show is somewhat independent of, say, cognitive ability, but is stable, so when you measure the person again you get roughly the same score, and predictive of some outcome. That’s the whole ballgame in his field, trying to do that stuff.

Zach: That’s right. Exactly.

Steve: Let me go back to our friend Angela Duckworth. There’s a woman who… I think she won a MacArthur prize for this.

Zach: That’s right.

Steve: I interacted with her when I was at the University of… She was not at Oregon, but her research rose to prominence when I was back at Oregon. I was interested in these things. She developed a grit inventory. It’s basically a self-survey, a kind of survey instrument where you fill it out and it purports to try to estimate your level of grit. One question is, is grit anything but conscientiousness, which is a previously existing personality construct?

Corey: Seems quite different intuitively.

Steve: Okay. So she had this thing, and it would measure things about-

Corey: How much crap are you willing to go through to get something done.

Steve: Yeah, exactly. She claimed that grit did a better job of predicting school success, for example, than IQ, and she got a ton of play.

Corey: Can you measure grit?

Steve: I think grit is probably only as stable as conscientiousness, roughly, as a construct. I contacted her because I read about her work and we were doing some work at Oregon with student SAT scores, and I said, “I would love to be able to use your grit survey in my classes and see if I can develop it as a separate, additional, independent predictor, which I could then combine with SAT to predict how well students would do,” but I could not replicate any of her results. I could not replicate any of her results.

Corey: Did you publish?

Steve: Of course not.

Zach: If you look in meta-analyses and subsequent studies, yeah, the correlations between grit and outcomes like academic performance and job performance, they’re positive, but they’re not as large as G. They’re not as large as G. Then there’s also the issue, as Steve alluded to, with whether or not grit is old wine in new bottles. I’ve collected data on grit and also conscientiousness, and they correlate really highly. A distinguishing aspect of grit is that it has a temporal component, so persistence towards very long-term goals.

Steve: I think the story that we like to tell ourselves, and I think it’s probably true, that there is some really determined person who really wanted to be an Apache copter pilot, and she just drove herself, and maybe she didn’t have the best other skills, but it was her grit that got her there. That seems like a plausible description of reality. It’s just that we haven’t figured out how to measure that properly maybe in that person.

Zach: Well, yeah, and another thing to point out here is that what we’ve been talking about is the high correlation of G relative to other predictors. This doesn’t preclude an individual… We’re talking about group-level data across many people. We can make predictions about what individuals can do, but it’s possible that an individual can, in a sense, defy that trend and reach a high level of performance in becoming a helicopter pilot or something through just doggedness.

Corey: I have a friend. He’s incredibly self-conscious about the fact that he’s gotten ahead with, as he perceives, very little G. He had a theory, what he calls contact intelligence. His idea is that if you grow up in a certain academic area, an academic town, you actually don’t have to be very smart; you just have to interact with smart people, and it rubs off on you. You pick up enough stuff, being an incredibly lazy person, just being observant and having high metacognition abilities, being self-aware, watching other people. He thinks this has gotten him very far ahead. Although, he worked fairly hard, but not very hard, and he doesn’t think he’s very smart. He thinks he’s more self-aware and observant than most people, and he’s charming. He’s incredibly charming. He’s also good-looking, which helps, and he’s tall. But he’s pretty convinced that that’s allowed him to get by way past vastly smarter people he sees strewn by the side of the road, having blown themselves up by doing really dumb, self-destructive… He ridicules academics as being Asperger-y, unaware, doing really compartmentalized stuff.

Steve: He may not be wrong. There are plenty of people… Other than a few narrow specialties like intellectual property law or quant trading, most of the people who make a lot of money in business are very good at sales and marketing. They’re just charming. They can get their ideas across. They’re brighter than average, but they’re not super bright. Your friend might not be completely wrong.

Corey: Does the G hypothesis apply in these areas?

Zach: Well, one thing I would say is that it’s useful to make a distinction here between fluid intelligence and crystallized intelligence. This is a classical distinction in the intelligence literature. Fluid intelligence is the ability to solve novel problems, adapt to new situations. Crystallized intelligence is your knowledge, your expertise. I think that interacting with smart people, what that affects is your crystallized intelligence. He might be right in thinking that he’s not smart. He’s probably thinking about his analytical abilities, his analytical reasoning and so on. He’s probably not thinking of the knowledge that he gains through interactions with these people as being part of his intelligence. But an argument can be made, I think, that it is.

Corey: It’s interesting, because his father was an engineering prof, very, very smart guy. But his father saw this. At one point in time, my friend… His father was pushing him to be a chem major. My friend failed all the classes that semester. He comes to his father crying. His dad said, “You’re the one of my kids I don’t worry about. I have no worries about you.” And he went on to make far more money than anybody else in the family.

Zach: It also depends on whether or not he’s right, in his assessment of his intellect, who his reference group is, right?

Corey: Yes, definitely. He came from a very intellectual town. He had a lot of smart friends.

Steve: I want to switch gears to expertise and the 10,000-hour rule and becoming a master of a particular skill. But before I do that, I want to just dwell for a second, since we’re all in higher ed, on something called the Collegiate Learning Assessment. Are you familiar with this at all?

Zach: No.

Steve: Let me tell you the story of this. This is an assessment that they designed to give to graduating seniors, and it was I think mainly designed for schools that are not on the R1 prestigious side of things, but maybe less well-known schools. It was designed in conjunction with employers. The employers would say, “Well, I want the person to be able to read a paragraph and look at a graph and then answer some questions about what our sales budget should be.” They defined it in terms of really very practical real-world, but white-collarish tasks that they wanted the students to be able to perform. If you then took this College Learning Assessment in your senior year and you did really well, even if you were at some maybe not so prestigious directional state college, the employers could say, “This kid really learned what he was supposed to learn in college, and he’s very useful,” etc. It was designed as a very practical test.

Steve: Now, you as a psychometrician can probably just guess that the thing ends up being a measure of G. It’s called the CLA, and there was a very big assessment of it, I think by RAND. We’re not that familiar with it because, again, if you’re at an R1, not very many of your students are taking the CLA. But I think at many other schools, a lot of kids are taking it, like maybe our kids might take the GRE to get into graduate school. These kids are taking the CLA to validate that they really completed a solid college education and should be hired by a good company.

Steve: RAND did this big study of the CLA, and lo and behold, it turned out you could predict the CLA of the student based on their ACT or SAT score when they entered college. Now the question is, what did the kid get out of college? You can actually now look at freshmen taking the CLA versus seniors taking the CLA, and how much of a gap is there between, yeah-

Zach: Well, what I would say is that the CLA is capturing knowledge and skill, the acquisition of which is influenced by cognitive ability as assessed by the SAT and ACT. This instrument may well assess knowledge and skill that’s useful, and necessary even, for success in some job. That’s what they’re getting out of their education. They’re better prepared than a college freshman would be. But yeah, what’s driving individual differences in the CLA is what was assessed by the SAT and the ACT.

Steve: What’s interesting about these results, and I’ll send you a link to them because there are huge studies now. In economic terms, this is a big test. I don’t remember the exact numbers, but it’s roughly like this. Imagine two kids: one is a senior, and one is a freshman. On average, seniors have a higher CLA score than freshmen. But if the freshman had a higher SAT score than the senior, how much higher would it have to be before the freshman is already doing better than the senior on the CLA? It’s only about half a standard deviation of SAT score. You could almost say that everything you got from your college education, as tested by this incredibly practical, employer-dictated test, is worth only about half an SD of SAT. That’s what you took out $100,000 in loans for and spent four years of your life on.

Zach: Right, which would say that you’d actually be better off with a super bright freshman-

Steve: Definitely, yeah.

Zach: … than the-

Steve: In the validation study they did of this, the participating schools included MIT and the University of Minnesota, so they had the whole range: the high end, R1-type schools, but also directional state universities and maybe even community colleges. A very large range. They could even define a capital G, the school-level performance on the CLA, and it turned out basically just to be the average SAT score at the school.

Zach: Here’s a thought experiment. In sports like, well, basketball, at least, you can draft players right out of high school, or they go to college. What would happen if corporations like Microsoft and Google-

Corey: Microsoft did this actually 20 years ago. Microsoft started doing this when I was in Seattle in the ’90s. They were hiring 17-year-olds.

Zach: Okay, there you go.

Corey: I think they found that performance was… I can’t remember the exact result, but pretty close to what they were getting from seniors graduating from college. And they were training these kids up. These kids were good programmers, and Microsoft has some kind of internal training program. They did quite well.

Steve: Gates and Allen are both IQ nuts. Actually, there’s a famous quote in like Forbes or Fortune of Bill Gates where he says, “Microsoft is in the IQ business. That’s it. We have to win the IQ war or we’re going to get destroyed.”

Zach: With himself as Exhibit A.

Steve: Yeah, Exhibit A. Anyway, when I read about these CLA studies, I just felt like, wow, had these guys taken your class on G or psychometrics, they could’ve predicted the outcome of this huge exercise where employers designed tests of office skills and strategic decision-making in the corporate environment, and it just turns out to be basically a test of G.

Corey: Granted, you may be right in general. I’m thinking about an engineering degree or a math degree. Now, I was a math major in college, and there’s simply no way I, as a freshman, could’ve beaten my senior self, or even come close, on a high-level math test, because I simply didn’t know as much coming in as I did by the time I graduated.

Steve: No, that’s a great point. On the GRE subject test for chemistry or math, no high-SAT freshman generally can beat the seniors because the seniors have learned so much during the four years-

Corey: So if you’re going to specialized jobs, you think there’s no doubt that a college education is helpful.

Steve: Right. CLA was specifically-

Corey: I get that. CLA, it’s a general test of basic office skills.

Steve: But in the minds of the designers, who were people from corporate America, they thought they were testing the seniors on a set of special skills that were really important for corporate America, skills those kids had gone to college to learn. But I think those people were a little deluded about exactly what they were doing.

Zach: Yeah. You could get a super bright freshman who could do better than the average senior who had been through four years of education with a $200,000 price tag.

Steve: By the way, I guess your example, Corey, you’re kind of on the high end. At MSU, I believe we eliminated the algebra requirement for graduation. It used to be a real problem; some kids who entered MSU had trouble with Algebra II or something, so we’ve actually changed the math requirement. If you looked at some of our seniors and asked, “Can these seniors, after four years at MSU, do math better than a bright high school senior?” it could be the opposite. The high school senior could do calculus, and the guy graduating with a non-technical degree here can’t actually do algebra very well. In that case, it might not turn out so well.

Corey: Yeah, I guess I’m talking about people who take specialized math.

Steve: Specialized major.

Corey: Yeah, a specialized major. Look, I think that’s where… There’s a lot of argument about the value of college, but I think there’s really no doubt that if you’re actually learning a specialized discipline, where knowledge has been built up over centuries and there are specialized things you’ve got to learn, you can’t get away from college in that case.

Zach: Nobody is born with that knowledge. Nobody is born with it.

Corey: That’s right.

Steve: Right. Nobody doubts that in your mechanical engineering courses or your physical chemistry courses, that you’re learning something very specialized and difficult, etc. It takes years to learn. But if I’m the dean of a certain college here at MSU and I say, “Well, I’m preparing my kids for critical thinking and the business world and the ability to process…”

Zach: Generalists.

Steve: Yeah, generalist roles. That’s what these CLA guys are trying to measure, and then they realize, “No, actually, we’re not measuring anything that the SAT doesn’t measure.”

Corey: So they effectively had a test that was an SAT. Maybe that’s all they were familiar with, and so they designed the test to do that.

Steve: That’s exactly right.

Corey: That’s a badly designed test.

Steve: It’s a little bit like instead of measuring you on your mile run time and pull-ups and push-ups, I designed some really complicated thing that you have to do with your body, but then I realize I can predict how you do on that more complicated test just by making you run the mile and the 40-yard dash and doing some pull-ups and push-ups. It’s kind of like that.

Corey: Yeah, if you generalize it across physical abilities, you’ll get the same phenomenon. Of course, people who run the mile often can’t bench press-

Steve: I’d take a composite.

Corey: … 70 pounds.

Steve: No, I’d take a composite. Anyway, okay, let’s leave that. I want to talk about your colleague Anders Ericsson, who I actually met at a conference I was invited to on the idea of genius. We had some interesting conversations. Malcolm Gladwell popularized it, but Ericsson did research which suggested that there was some kind of rough rule that you needed 10,000 hours of deliberate practice to become an expert.

Zach: Yeah, this idea of a minimum amount of intensive training in a domain. It was called the 10-year rule. People looked at lots of different domains, athletics and composing and all sorts of things, and Herbert Simon, a Nobel laureate, and his colleague Bill Chase talked about it in their classic research on chess. Ericsson came out of this tradition of work, and he did a study in the early ’90s looking at musicians. He found that the elite-level music students in that study had, on average, by early adulthood, accumulated about 10,000 hours of what he and his colleagues called deliberate practice, while the less accomplished musicians had thousands of hours less on average. Gladwell took his inspiration from this study and wrote in Outliers that 10,000 hours is the magic number of true expertise. That’s where the 10,000-hour rule, as Gladwell formulated it, came from.

Zach: But what is very clear if you look at studies from chess to music to sports is that among elite performers, there’s a tremendous amount of variability in the amount of time they estimate they’ve spent training to reach an elite level. I have a colleague in Australia who did what I think is really one of the first serious challenges to Ericsson’s viewpoint. He studied members of a Buenos Aires chess club, including very highly ranked players, and he found that the amount of so-called deliberate practice that they estimated before reaching the master level ranged from about 3,000 hours to 24,000 hours, and there were some people with over 20,000 hours of practice who had still not reached master level. There’s no minimum amount of… Well, yeah-

Corey: There must be.

Zach: There must be. Yes, of course, because you’re not born with the… But it’s not 10,000. It’s not 10,000 hours.

Corey: How many chess players reach grandmaster with 500 hours or less?

Zach: I don’t know the exact answer to that, but what I will say-

Corey: I’d estimate zero, but-

Zach: Yeah. In their study, they found, with masters at least, the range from about 3,000 hours to 24,000 or maybe it was 26,000.

Corey: It’s within-

Steve: Order of magnitude.

Corey: Definitely within an order of magnitude, yeah, and more than half an order of magnitude.

Zach: That’s kind of way off.

Corey: But your field is not physics, right?

Zach: That’s true. That’s right.

Steve: Well, if you just say practice helps, that’s pretty weak, right?

Zach: That’s pretty weak. We know that because-

Corey: We’re not talking hundreds of hours. We’re not talking-

Zach: Yeah.

Corey: I can’t sit around now and practice for a couple hundred hours and expect to become a chess master.

Zach: Right. Here’s a critical point. We can talk about two levels of analysis here, or two types of variability. First, variability within a person: what explains someone’s increase in skill at playing chess? Well, it’s training and what they acquire through training, because we’re not born knowing the Queen’s Gambit or something. You have to engage in some type of training activity to acquire that knowledge. The other level of analysis, or the other type of variability, is between-person variability. The claim that we’ve been focused on in our research is the claim by Ericsson and colleagues that you can largely account for differences across people in terms of their accumulated amount of training.

Corey: Was that his claim? Because I’m familiar with… The 10,000-hour rule doesn’t say that anybody can become a grandmaster with 10,000 hours. It suggests that those who become grandmasters have put in 10,000 hours.

Zach: Well, the testable claim that we focused on from his famous 1993 paper, which coincidentally has been cited now just over 10,000 times.

Steve: This is Ericsson, not Gladwell.

Zach: Right, this is Ericsson.

Corey: His readership is smaller than Gladwell’s.

Zach: Is that you can largely account for individual differences in performance in terms of accumulated amount of practice. That is the claim that we focused on. We don’t deny, and no one would deny, that you have to practice in order to become an elite performer; it’s impossible otherwise. But that has no direct bearing on the magnitude of the correlation across people between performance and practice. We can think of a plausible scenario where people with less practice have actually reached a higher level than people with more-

Corey: Obviously, that will happen, right? That’s called talent. If he’s denying the existence of talent, then that’s a shocking claim.

Steve: That’s how it was interpreted initially.

Zach: What he claims is that there’s no evidence that is convincing enough for him to believe that talent plays a major role in achieving elite-level performance. He hasn’t seen the evidence that would convince him of that at this point. That’s his-

Steve: I find that almost not defensible.

Zach: Well, whether or not that is defensible, that’s an important question, but we focused on the narrower, testable claim about the role of practice in accounting for individual differences.

Corey: It just seems that if you look at sports… I assume he’s not talking about sports, right? Because-

Steve: No, he is.

Zach: He is talking about sports.

Corey: Because in sports, you’ve got a set of guys who’ve basically played exactly the same number of hours with massive variations in ability by the time they get out of college. There’s some elite quarterbacks, some not very good quarterbacks. All these guys have been playing since they were six years old.

Steve: Having actually argued with Ericsson about some of this stuff, he has a hidden variable in his pocket. “Well, how effectively or deliberately did your guys practice? See, the guys who were loafing…”

Corey: Oh, deliberate practice. I didn’t catch that qualifier.

Steve: Yeah. They go to practice, but they don’t know what they’re doing. The guys who really focus on improving their skills, that’s why they’re the superstar.

Corey: You can’t measure that, actually.

Steve: It’s difficult-

Corey: It’s impossible to measure.

Steve: It’s hidden.

Zach: That’s a real problem.

Corey: Of course, yeah. It makes the theory unfalsifiable.

Zach: Well, that’s, in fact, exactly what my colleagues and I have suggested in a recent presentation. Ericsson and colleagues stipulate that full concentration is necessary for deliberate practice. Well, how do you know whether somebody is concentrating fully? Can anybody ever concentrate fully on something? Does that mean 100% of your attentional resources? How would you ever know that?

Corey: It’s interesting. Take the case of the Patriots’ Tom Brady, who people thought was a pretty good college quarterback; he went in the sixth round. On Ericsson’s theory, this guy was practicing phenomenally deliberately all through college, but nobody noticed except maybe Bill Belichick, who still only drafted him in the sixth round. Everyone else thought he was just like everybody else. A whole set of people focused on talent, and none of them, presumably, detected that this one guy was practicing maybe a little harder than everybody else. On the face of it, it makes the theory look-

Steve: He almost just replaces our notion of talent with a notion of practicing effectively, right? Now-

Zach: Which begs the question of, well, why do some people practice effectively? If some people are able to get more out of practice than others, that sounds a whole lot like ability.

Steve: I suspect that if you really wanted to crush the guy, you could say, “Well, let’s look at VO2 max before the guy starts training in cycling, and then I’ll predict which guy practices effectively, because he’ll have a hell of a high VO2 max that I measured well before he started to learn how to cycle.”

Corey: Remember the behaviorist theory of motivated action? Skinner would claim that we did things because we were reinforced, but some things were self-reinforcing, and Chomsky is like-

Steve: What?

Corey: “What?” He’s like, “How would you actually measure that?” Maybe the guy just really enjoys it, is kind of interested. But you couldn’t use concepts like interest or motivation; instead, you had to have this other kind of concept, which, again, effectively couldn’t be tested and-

Zach: Ericsson will concede that height and body size are genetically prescribed characteristics that bear on people’s success in various things like basketball-

Steve: Right. But what about balance and coordination and explosiveness?

Zach: What he would say is that these things can be influenced by training and practice. My response is “Well, yeah. That doesn’t mean that those things are not also influenced by genetics.” I think that there’s this notion or definition of talent which is one that… It doesn’t make much sense in terms of what we know about genetics. Just because something is influenced by genes doesn’t mean that it’s not modifiable through experience.

Corey: He would, presumably, expand beyond the concept of height and strength and so forth to all other physical characteristics, right? Like VO2 max or-

Zach: Right. Sure. In fact, he’s-

Corey: … weight to VO2 max ratio, etc.

Zach: Bone diameter. He’ll refer to evidence showing that you can change these things through experience, but again, that’s a straw man.

Steve: I would like to show Ericsson some very recent results that take the genomic-based prediction of cognitive ability and use it to predict which kids, out of a cohort of, say, 3,000 kids who have been tracked through high school, will manage to take calculus and do well in math, and how many will terminate at algebra or geometry. There’s clearly very strong predictive power in this, and I think he would have trouble with that.

Zach: Well, what he would argue is that here we’re not talking about true expert performance. We’re talking about a lower level of skill. We’re not talking about predicting world-class mathematicians. He’s argued that basic abilities and capacities, general cognitive ability is predictive of performance differences early, but then with a lot of training in the acquisition of domain-specific skill, that drops out. We did what I think was a pretty exhaustive review of the literature concerning that question, and boy, evidence for that notion of a vanishing validity is mixed at best. On balance, the evidence suggests that basic ability, cognitive ability is still predictive of a lot of things, even after a lot of training at relatively high levels of skill.

Corey: It seems like the training hypothesis gives rise to another one, which is that no one is going to train unless they’re massively motivated. You’re not going to put in 10,000 hours without that. If there’s a correlation between training and performance, you’d probably find a correlation between motivation and performance too, as measured basically by how many hours you’re willing to put in. I had a saxophone teacher who suggested to me I should be practicing 10 to 12 hours a day and-

Zach: It’s a chicken and egg thing. Why do people persist in something for-

Corey: I had zero interest. That’s probably why.

Steve: But I think the early data that Ericsson looked at showed a correlation between number of hours practiced and ability, and I thought you might just have the causality wrong, in the sense that people that are good are going to practice more because they can see that they’re good and getting better.

Corey: And interested. A huge number of people can just drop out of stuff. They’re just not interested.

Zach: That certainly is the case. You think about people who have persisted in something like music and they’ve accumulated 10,000 hours of practice. You can’t infer from that finding that it’s just practice that’s propelled them, because why did they keep doing it? Why didn’t they drop out?

Steve: Yeah. Survivor bias.

Zach: Survivor bias, exactly. Their studies, including the 1993 study that has gotten so much attention, cannot rule out that possibility. They’re correlational.

Corey: What do we know about the link between practice and performance? How much can you improve your ability at something you’re, on the surface, not great at? I decide I want to be a great piano player-

Zach: Right. Here’s one very, I think, important and positive thing that’s come out of this discussion about deliberate practice. That is that, through training, people can improve their performance more than they probably thought possible, or more than might even seem possible. You can show this in laboratory tasks like remembering digits. They did these studies back in the early ’80s where they gave people training in remembering random strings of digits, and college students, after a lot of training, can remember 70, 80 digits.

Steve: Can I ask-

Zach: Yes.

Steve: In that example, the kids who got up to 70 or 80, did they develop specialized strategies to do it, or did they just suddenly actually just get better short-term memory?

Zach: They developed specialized strategies to do it, mnemonics. One example was a college track runner. He had extensive knowledge of running times, and he learned how to recode strings of digits into larger chunks and developed these so-called retrieval structures for storing them. I think that’s a valuable insight, but we can’t then say that we can completely, or even largely, explain differences across people in terms of how much training they’ve engaged in. Everybody, barring some disability that really curtails engaging in training, can benefit from good training. But that’s not to say that everyone can reach an elite level-

Steve: Can I ask you-

Zach: … through that training.

Steve: … to give us some examples that maybe you’re familiar with of talent selection in our society and some examples where you think people are really super efficient and they are good at trying to figure out who should be the first-round draft pick versus the tenth round, and then other situations where the way they select talent or personnel is just completely inefficient?

Zach: Well, for one, I think ACT and SAT scores measure what I would feel comfortable calling intellectual talent. There’s a whole lot of discussion and controversy over the use of SAT and ACT scores. People say high school grades are a better predictor. No, actually, if you look at the validities of high school grades and ACT and SAT scores, they’re about the same, and the best formula for prediction includes both. I think that’s a success story. Now, are there issues with using the SAT and the ACT, like group differences in test scores? That’s a problem. That’s a challenge. But that doesn’t mean that these test scores are worthless for their intended purpose. In fact, we know they’re not.

Corey: How much do training courses improve ACT and SAT scores?

Zach: If you look at… There’s been a good deal of research on this, and the gains tend to be quite small in terms of number of points. I’m thinking of the old SAT scoring system of 1600 being the top score. I believe that the gains are something on the order of 50 points, or maybe not even that. 20 points comes to mind-

Steve: Yeah, I think that’s right. It’s a fraction of a standard deviation.

Zach: It’s a fraction of the standard deviation. Now, could somebody practice the SAT for months and months, learning the vocabulary terms and the mathematical skills that are necessary to do well on the test? Yes, okay. But people don’t tend to do that. If someone practiced intensively for a year, I would be really surprised if their improvement wasn’t more than 20 points.

Steve: I think we’re close to being out of time. Do you want to throw out one more interesting research result that you’d like to share with our audience?

Zach: This is not a research result, but one thing that I think is an important point to make in discussions of the sort we’re having: people have strong preexisting beliefs about all of this stuff. I’ve written in both the scientific literature and places like The New York Times about some of the things we’ve talked about, and people have these knee-jerk reactions. “Oh, you’re saying that practice isn’t important to becoming an expert?” No. No, we’re not saying that at all.

Steve: Something as simple as a two-factor model, like, “Well, A and B contribute to your success in different proportions,” people can’t even wrap their heads around that.

Corey: Well, I think they can when they’re not emotionally attached to a certain outcome, right?

Zach: Exactly. I think it’s important for people to be cognizant of how their preexisting beliefs about the origins of individual differences in intelligence and skill impact how they interpret evidence and how they think about these issues. It’s hard even for scientific researchers to set all of this aside, but that’s not to say that we shouldn’t try, because if we want an accurate understanding of this, we have to try.

Corey: I think in general the ideal is that you seal your beliefs off from your investigative mind. You almost have what Julian Jaynes used to call the bicameral mind.

Zach: That’s right. Very tough.

Corey: One mind just has your values, your beliefs, things you want to be true about the world. The other, essentially, has your methodology and carries things out. The fact is that people generally can’t do this when they’re doing these things on subjects where there’s lots at stake for them emotionally, politically. This applies across the board. I found very few people, in my experience, who have strong views on a topic where that doesn’t seriously color their interpretation of evidence. I’ve found very few cases where someone will say, “I believe strongly this,” and they’ll actually seek out information fairly and interpret conflicting information. That’s very rare. It makes it very hard for humans to conduct science.

Zach: Here’s one, actually a research finding, that I will mention. My graduate student, who actually just yesterday or the day before defended his PhD, has been doing work on mindset. This is another construct in this area that’s-

Steve: Carol Dweck?

Zach: Carol Dweck, right, at Stanford. It has become very popular, this idea of a growth mindset: the belief that your abilities are malleable and can be changed through effort and training. This is huge stuff. Schools are using mindset interventions, and millions of dollars of research funding have gone into it. My graduate student, Alex Burgoyne, and one of my colleagues at Case Western Reserve did a meta-analysis of this, and basically they found that the average correlation between having a growth mindset and academic performance is about 0.1. For interventions, the average effects are very small. It’s not nothing, but it’s not huge either. This is-

Corey: How many studies was this analyzing?

Zach: I don’t know the exact… It was a lot, hundreds in their meta-analysis.

Corey: Overall sample size?

Zach: Tens of thousands.

Steve: Does that undermine-

Corey: How do you measure growth mindset?

Zach: You measure it with a questionnaire with items like “I can change my intelligence through effort.”

Steve: Does that undermine Dweck’s original results or-

Zach: Well, it’s not a good thing.

Steve: What was her estimate of the-

Zach: I don’t know offhand. Larger than that.

Corey: It seems like the survey would measure whether you think you have a growth mindset; whether you actually have a growth mindset…

Steve: Yeah, that’s the thing. It could be better than… Because it’s so tough to measure the actual variable, right?

Zach: Right. Although, that’s the instrument that they’ve used.

Steve: Okay.

Steve: Our guest today has been MSU psychology professor David Z. Hambrick. You can find out more about his work at a website called scienceofexpertise, all one word, .com. Thanks a lot, Zach.

Zach: Thank you.