SATs are on their way out, but new tests aren’t quite ready.
Jon Boeckenstedt devours data. As DePaul University’s associate vice president for enrollment management, he studies how the institution’s 16,000 undergraduates are doing, trying to forecast their performance. Many in his position would turn to standardized tests like the SAT (Scholastic Assessment Test) and the ACT (American College Testing). But Boeckenstedt believes the tests carry too much weight in college admissions. “We know there are students for whom the tests don’t represent their true ability,” he says. Today more than 800 four-year colleges and universities in the United States no longer require standardized tests as part of their admissions process—that’s about 20 percent of the total. In 2011, DePaul became the largest private nonprofit among these.
The flaws in standardized testing are well documented at this point. The tests punish disadvantaged students and minorities and entrench class lines, and their predictive powers forecast a student’s progress only as far as the first semester of freshman year. The University of California, Berkeley economist Jesse M. Rothstein has found that the combination of a student’s high school grades and demographic information predicts first-year grades in college about as well as her high school grades and SAT scores do. Based on his experience evaluating undergraduate performance, Boeckenstedt agrees. “It’s double counting,” he says.
As colleges de-emphasize test scores for applicants, they are turning to research showing that a student’s potential relies on more than cognition. Traits such as optimism, curiosity, resilience, and “grit” may actually play a stronger role in determining a student’s long-term success.
Alongside the growing agreement that these so-called “soft skills” are important stands a question that remains stubbornly unanswered: How can they be measured consistently and fairly? Boeckenstedt has often heard admissions officers say, “you can’t measure heart.” The expression rings true. But is it?
In theory, at least, standardized testing was supposed to deliver a class-neutral measure of a student’s innate ability. Colleges could use the tests as an apples-to-apples selection aid, putting a student from a small private school in Manhattan, N.Y., on the same playing field as a student from a large public school in Manhattan, Kan. The first SAT was administered in 1926, and colleges rapidly adopted it and other standardized tests as a way to assess a large number of applicants efficiently.
But it is clear now that most standardized tests used today are far from class-neutral. Even as the College Board announced in March 2014 that it will overhaul the SAT by making the essay optional, cutting obscure vocabulary words, and sharpening the focus of the math section, skepticism abounds. The SAT has undergone many changes before (the much-maligned analogies section was retired in 2005), but SAT scores have continued to reflect socioeconomic disparities.
The economists Anthony P. Carnevale and Jeff Strohl have found that disadvantaged students (who are disproportionately black and low-income, with parents who dropped out of high school) tend to score 784 points lower on the SAT’s math and verbal sections combined than more affluent students do. If the SAT were a 100-yard dash, they write, disadvantaged students start “65 yards behind.” In addition, colleges began to question whether pure brainpower was all they wanted. Ivy League schools began moving away from purely test-driven admissions as early as the 1920s through the use of in-person interviews (though this may have been motivated as much by racial bias as anything else).
Admissions officers today can draw on a wealth of research describing soft skills. One notable example is the work of University of Pennsylvania psychologist and MacArthur “Genius” award-winner Angela Duckworth. She has studied success through longitudinal studies of graduation rates at Chicago public schools, performance at the U.S. Military Academy at West Point, and winners of the Scripps National Spelling Bee. Her conclusion is that “grit,” which she defines as the ability to sustain interest in and effort toward long-term goals, predicts success over and beyond conventional measures of talent, such as standardized test scores.
Despite this and other research findings like it, however, college admissions officers do not have a standardized test that can reliably evaluate non-cognitive skills. Until such tests are developed, many admissions offices use personal essays, interviews, lists of extracurricular activities, and letters of recommendation to get a holistic view of applicants.
But the way admissions offices evaluate each of these is as idiosyncratic as the essays and letters themselves. “They are Rorschach tests, so to speak,” says Patrick Kyllonen, senior research director at the Educational Testing Service, which administers the SATs. “It’s hard to turn them into numbers.”
In 2010 the College Board, which owns the SAT, tried its hand at making numbers out of soft skills. Together with his colleagues, organizational psychologist Neal Schmitt, who is professor emeritus at Michigan State University, identified 12 dimensions of success that 100 colleges described as important, which fit into three categories: cognitive/intellectual (knowledge and mastery of general principles), interpersonal (such as curiosity and appreciation for diversity), and intrapersonal (including adaptability and perseverance).
Schmitt and his team asked students to complete assessments about their background and experiences, and to hypothesize how they’d act in given situations. For example, one scenario read, “You are assigned to a group to work on a particular project. When you sit down together as a group, no one says anything.” The results showed that non-cognitive traits like curiosity, appreciation for diversity, adaptability, and perseverance correlated with academic performance over time. They also found that there were only small differences in performance among ethnic subgroups on the two non-cognitive assessments, in contrast to much larger gaps seen on cognitive tests. Using these assessments, the researchers concluded, could help colleges enroll classes with more diversity with little or no decline in student performance.
Motivated by such findings, the Educational Testing Service developed an online rating tool called the Personal Potential Index. Designed to quantify what’s conveyed in a recommendation, it asks past instructors to rate students on a five-point scale in six categories: communication skills, ethics and integrity, knowledge and creativity, planning and organization, resilience, and teamwork. To gauge resilience, for instance, respondents are asked to what extent a student “accepts feedback without getting defensive; works well under stress; can overcome challenges and setbacks; works extremely hard.” Recommenders can type in comments to elaborate on their ratings, if they choose.
Notre Dame Business School and the American Dental Association are among the first to use the Personal Potential Index in their admissions process. Kyllonen expects that the results of an ongoing large-scale study will validate the new tool as a predictor of student success.
DePaul is implementing its own tests for non-cognitive skills, with a series of essay questions. For the entering class of 2012, about 10 percent of applicants (or about 5 percent of the freshman class) chose not to send ACT or SAT scores. Instead they completed four short-answer questions, designed to measure their leadership skills and their ability to meet long-term goals. Systematically scoring the responses to those questions, DePaul reported that the freshman-to-sophomore retention rate was almost identical for those who submitted standardized test scores (85 percent) and those who did not (84 percent). Boeckenstedt is encouraged by these preliminary results.
However, even as schools make progress in quantifying non-cognitive skills, there is also worry about the assessments they are building. Non-cognitive skills are often measured through self-ratings, which means respondents can fake their answers. This is partly why Brandeis University did not add non-cognitive assessments when it dropped its testing requirements recently. “Once you introduce these measurements into your system, you introduce the ability to game those measurements, especially if students know they are being tested for an opportunity,” says Andrew Flagel, senior vice president for students and enrollment at Brandeis. “With most of these questions, it’s awfully hard to frame them in a way where one couldn’t intuit the best answer.”
Instead, starting this fall, Brandeis applicants who do not wish to submit ACT or SAT scores have the option of sending in a graded paper and a second letter of recommendation. “The reality is every time we talk about holistic review,” Flagel says, “we are in some ways touching on the inclusion of some non-cognitive factors.”
However the debate proceeds, perhaps most interesting of all will be the impact it has on what many Americans consider to be a core value of education: evaluating a student based on his or her innate potential, independent of the circumstances of his or her life. In his book, The Big Test: The Secret History of the American Meritocracy, Nicholas Lemann writes that cognitive testing was supposed to link real power to brainpower: “The new elite’s essential quality, the factor that would make its power deserved where the old elite’s had been merely inherited, would be brains.” As standardized testing changes and is replaced, then, ideas like this will have to be reimagined. As for Boeckenstedt, while he is heartened by the early returns on DePaul’s experiment with non-cognitive tests, he believes that the long search for the elusive “it,” the mesh of attributes that provides a window into an applicant’s future success, might never be over.