EDITOR'S NOTE:

The following article is reprinted in its entirety from the April 1981 MEG. Readers should use this 2nd printing, which corrects production errors contained in the April article, as the definitive version. We regret any inconvenience caused by this inadvertence.

Criticisms of practices in educational and psychological measurement and evaluation are certainly not anything new. Indeed, the history of these practices shows that such criticisms have been present from the onset of the profession. Professionals and lay persons alike have raised innumerable concerns and issues, in manners ranging from simple questioning to blanket condemnation. In the main, professionals favoring "testing" have accepted these criticisms as challenges and have sought to continue to improve practices so as to assuage criticisms. And so it went, loud attacks and quiet responses.

It now seems that quiet responses are no longer the best tack. Opponents of testing capitalize on modern technology, easily swayed public sentiment, and media prominence to make their criticisms more pungent than ever before. A case in point is the Nairn/Nader report on the practices and policies of Educational Testing Service. Clearly this report demands that alternative perspectives be provided. To ignore this report is to do a grave injustice to those who strive to provide effective educational and psychological measurement and evaluation practices.

For these reasons, I asked William Mehrens to provide an evaluation of the Nairn/Nader report. Bill is well qualified for the task: He has served as president and editor of this journal for AMEG and has authored numerous books and articles on educational and psychological measurement and evaluation. It should be noted that neither he nor I nor AMEG feels any great need to defend ETS; they can do that on their own. We do, however, feel obligated to provide an alternative professional perspective. It is in that vein that I commend Bill's article to you. It is important and vital reading. I hope you enjoy it and are stimulated to thought and action.

Larry Loesch

ETS Versus Nairn/Nader: Who Reigns?

WILLIAM MEHRENS

As most MEG readers are probably aware, the 554-page Ralph Nader report on the Educational Testing Service authored by Nairn has been published under the title The Reign of ETS: The Corporation That Makes Up Minds (1980) (hereinafter referred to as the Nairn/Nader report). The Educational Testing Service (ETS) has responded to portions of this document with two brief reports: Test Scores and Family Income (1980a) and Test Use and Validity (1980b). In this article, I examine some of the issues, charges, and countercharges in those documents.

If you dislike testing and ETS and enjoy one-sided reporting, you will like the Nairn/Nader report. If you are a fair-minded scholar who feels positively toward testing and ETS, you will surely be dismayed, as I am, at its biased treatment. Other combinations of characteristics may place you at different places on the pro-con Nairn/Nader report continuum. For example, if you generally believe organizations that are large and successful are also evil, you will view the report differently than if you think largeness and success are characteristics to be admired.

Even if you had heard none of the hype concerning the report and did not pick up a bit of the flavor from the title, the first two sentences in the preface of the Nairn/Nader report (written by Nader) would provide early insight into the direction the report takes: "The conception for this report on the Educational Testing Service began with the victims of standardized testing. Some of these students would come up to me at colleges and universities around the country to express a feeling that they had been unjustly judged by a three hour exam" (italics added) (p. ix).

The sentences remind me of a quote by Barclay (1968). He did a delightful job in pointing out that many popular issues are frequently discussed by inappropriate generalizing. He told, for example, of an article written by a Mr. Pulling:

Mr. Pulling rambled on about testing being similar to phrenology. He said that this article was occasioned by a desultory contact with some child who took some test in some place and was not rated too bright in mathematics. Nevertheless, this particular individual went to some college somewhere and somehow succeeded, all of which prove beyond the shadow of a doubt in Mr. Pulling's logic that all testing in all places and on all levels is similar to the cephalic index of the phrenologists. (Barclay, 1968, p. 4)

The Nairn/Nader report is divided into nine chapters (398 pages), 110 pages of footnotes, and 45 pages of appendixes. It gives the appearance of being a scholarly document, but the title, the preface, the chapter titles, and the one-sided set of references soon dispel any such notion. The sections that follow (except for the last two) are responses to the major themes of the nine chapters of the report. In the last two sections I comment on the ETS responses and provide a summary.

PAINFUL UPHEAVALS?

Chapter 1, "Hope. . . Will Be Kept Within Reasonable Bounds," starts out with five case studies of students who supposedly suffered "painful upheavals" caused by ETS. But, the names and identifying information have been changed. Even after changing the identifying information, however, the arguments include some contradictions. For example, on page 5, we read that Gary Vladnik "knew that the SAT score meant the end of his college hopes." Then, on page 8, we read that Gary "was lucky enough to live near a good, affordable college which did not require an ETS test." He went there for two years, transferred to the school that had rejected him, and graduated. It is certainly hard to see how this "painful upheaval" resulted in any great harm to Gary, and clearly his ETS score did not mean "the end of his college hopes."

This example illustrates two of the underlying themes of the report that disturb me. First, there is a basic theme that ETS has more control over college admissions than it, in fact, has. ETS does not have any power over the selection decisions made by the colleges. Colleges do not need to use ETS tests. Many do not. For those that do, ETS does not suggest minimum cutting scores. ETS does not tell colleges how much to weigh the SAT scores in the decision-making process and expressly warns them against placing too much emphasis on the SAT or any other single indicator. ETS simply serves as the contractor of the College Boards to build, score, and report the scores on a set of tests. For example, a 1978-79 College Board survey (see Lerner, 1980, p. 128) reported that only 30% of 2,600 colleges surveyed who used the SAT set minimum cut-off scores on the SAT. As mentioned, many other colleges do not even require the SAT. As Lerner points out, "The inevitable overall result is that virtually all literate and numerate students and many semi-literate or even illiterate ones can find some college which will accept them, if they can somehow arrange to pay the fees" (1980, p. 128). Certainly, the evidence suggests that the majority of students are admitted to the college of their first choice. (This would not be quite as true of certain professional schools, such as law schools.)

MEASUREMENT AND EVALUATION IN GUIDANCE VOL. 14 NO. 2 JULY 1981

Second, there is an underlying theme in the report that if students do not get into college, it is unfair and a detriment, both to the individual and to society. At the undergraduate level it is particularly doubtful that not admitting a student to a selective college is of any long-range harm to the individual or society. There are plenty of colleges that will admit anyone with a high school diploma, even if the student is functionally illiterate. If the decision made by a college with admission standards to reject the student is wrong and the student is capable and motivated, he or she will likely do just what Gary Vladnik did: be an academic success. If the decision is correct (and even Nairn/Nader admit the chances are better than 50-50 that it is), the student likely will be better off at a less competitive college.

At the professional school level, such as in law schools, it may be true that qualified candidates get turned away; but with the current surplus of lawyers, that does not harm society. Whether or not it harms the individual is more debatable. If there is a surplus of attorneys, would it be good for an individual to be admitted and graduate from law school if he or she has less aptitude for law than most other lawyers? As Brown (1980) points out:

To really gauge the importance of the test, it might... be useful to ask what happens to people who apply to professional schools and are turned down... turned down by the law schools, does the candidate go on to pursue a Ph.D. in history? What percentage of the rejected applicants continue to apply until they are finally selected? What happens to the rest? When all is said and done, how significantly did the test affect the life-chances of individuals who already have bachelor's degrees and are motivated enough to want to go further? (Brown, 1980, pp. 50-51)

A third underlying theme of the report that comes through in the first chapter (although not perhaps illustrated in the Gary Vladnik example) is that somehow rejection for reasons of test scores causes a more painful upheaval than rejection for reasons of low interview ratings, low high school grades, insufficient letters of recommendation from influential advisors, or poor letter-writing skills of the student's high school counselor. The point is, if rejection decisions are made, some individuals will be disappointed. Decisions based on test scores should be no more painful than decisions based on other reasons.

ETS LARGE AND EFFICIENT

Chapter 2 of the report, "Rosedale: Power and Privilege at ETS," is devoted to condemning ETS for its size, efficiency, and concern for quality.

From around the world, answer sheets are returned to headquarters in Princeton, New Jersey. Propelled by jets of compressed air, these are passed single-file through the entry chute of an electronic scoring machine—the custom-built Westinghouse MRC mark-sensitive scanner. Scanning at the rate of over three sheets per second-24,000 answer sheets per hour-the MRC scores pencil marks representing the answers to millions of multiple choice questions each year. The scores are printed out and forwarded to the mail room where 230,000 letters are processed in an average day. (Nairn, 1980, pp. 32-33)

Taken out of context, we might think the above quote an accolade; but in context, it is obvious Nairn says all this with disapproval.

At another point in the chapter, ETS is criticized for requiring all candidates for the LSAT "to present a photo ID at the test center, give a handwriting sample, sign a pledge that they were, indeed, who they said they were, and submit to thumbprints" (Nairn, 1980, p. 31). Why did ETS do this terrible thing? It was a response "to reports that candidates were hiring impersonators to boost their LSAT scores" (p. 31). Again, it seems most people would praise ETS for this response to curtail cheating. Nairn/Nader condemn them for it!

PREDICTIVE VALIDITY

In Chapter 3, "Five Percent of Nothing: Aptitude Testing, The Respectable Fraud" (and in the accompanying footnotes at the end of the book), Nairn makes some obvious statistical mistakes. The most obvious one is the averaging of the separate validities of the two parts of the SAT, rather than using the validity of the whole test (see footnote 15 to Chapter 3, p. 417). Thus, he averages .37, a validity coefficient for the SAT-V, and .32 for the SAT-M and squares this number to get an incorrect coefficient of determination of 11.9. He then incorrectly interprets this as a percentage of perfect prediction and suggests that the SAT predicts grades only 11.9% better than random prediction with a pair of dice (p. 59).
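The arithmetic behind the averaging step can be reproduced in a short sketch. The two validity coefficients (.37 and .32) come from the text; everything else is an assumption made for illustration. In particular, the function `composite_validity` treats the whole test as an equally weighted sum of two standardized section scores, and the correlation between the sections (`r12`) is not reported in this article, so the values tried below are hypothetical.

```python
import math

# Validity coefficients quoted in the text for the two SAT sections.
R_VERBAL = 0.37
R_MATH = 0.32


def averaged_validity(r1, r2):
    """Simple mean of two separate validity coefficients (the step criticized above)."""
    return (r1 + r2) / 2.0


def composite_validity(r1, r2, r12):
    """Validity of an equally weighted sum of two standardized scores.

    r12 is the correlation between the two sections. It is NOT given in the
    article; any value passed here is a hypothetical illustration.
    """
    return (r1 + r2) / math.sqrt(2.0 + 2.0 * r12)


avg = averaged_validity(R_VERBAL, R_MATH)
print(f"averaged validity: {avg:.3f}")
print(f"squared average:   {avg ** 2:.3f}")

for r12 in (0.3, 0.6, 0.9):  # hypothetical section intercorrelations
    whole = composite_validity(R_VERBAL, R_MATH, r12)
    print(f"r12 = {r12:.1f} -> composite validity {whole:.3f}")
```

Under these assumptions the averaged figure is .345, its square is about .119, and the composite validity is higher than the average for any section intercorrelation below 1.0, which is why averaging understates the validity of the whole test.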

Later in the chapter, Nairn reports that "inclusion of SAT scores in the predictive process improves the prediction of college grades by an average of only five percent or less" (p. 66). He arrives at this figure by computing the coefficient of forecasting efficiency ($1 - \sqrt{1 - r^2}$). For example, for 1974, the index for high school grades alone was 13.4, and for tests and grades combined was 18.5. The difference is, of course, 5.1, but as the ETS response points out, this is a percent improvement of 38% (5.1/13.4) (ETS, 1980b, p. 19). Yet, Nairn has made much of his "5% of nothing" claim. Further, as the ETS response points out, in quoting Cronbach and Gleser (1965), the index of forecasting efficiency is not an index that should be used in evaluating tests for selection purposes.
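The two ways of reading the 1974 figures can be checked with a brief sketch. The index values (13.4 and 18.5) are taken from the text and the formula is the one given in the parentheses above; the function names are illustrative only, and the correlations recovered by `r_from_efficiency` are back-calculated here rather than quoted from either report.

```python
import math


def forecasting_efficiency(r):
    """Index of forecasting efficiency, 1 - sqrt(1 - r^2), as a proportion."""
    return 1.0 - math.sqrt(1.0 - r ** 2)


def r_from_efficiency(e):
    """Invert the index to recover the correlation that would produce it."""
    return math.sqrt(1.0 - (1.0 - e) ** 2)


# 1974 index values reported in the text, in percent.
GRADES_ONLY = 13.4
GRADES_PLUS_TESTS = 18.5

absolute_gain = GRADES_PLUS_TESTS - GRADES_ONLY  # difference in index points
relative_gain = absolute_gain / GRADES_ONLY      # gain relative to grades alone

print(f"absolute gain: {absolute_gain:.1f} points")
print(f"relative gain: {relative_gain:.0%}")

# Back-calculated correlations implied by the two index values (an inference).
print(f"implied r, grades only:       {r_from_efficiency(GRADES_ONLY / 100):.2f}")
print(f"implied r, grades plus tests: {r_from_efficiency(GRADES_PLUS_TESTS / 100):.2f}")
```

The same pair of numbers yields a gain of 5.1 index points or a 38% improvement over grades alone, depending on which comparison is made.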

Elsewhere in Chapter 3, Nairn suggests that extreme degrees of anxiety are likely to interfere with test performance (p. 86); that the multiple-choice format could favor certain kinds of personalities (p. 88); that paid coaching may improve a person's score and that this raises the question of equity (p. 97); that if a cut-off score is used, a single point can be the difference between acceptance and rejection (p. 156); and that such large decisions about a person's future should not be based on such small samples of performance (p. 159).

Why all these points are made about tests but not interviews or previous grades is puzzling. Certainly, extreme anxiety is likely to interfere with interview performance; an interview may favor certain kinds of personalities; people can be coached to do well on interviews; if a cut-off score is used from any type of data, a single point could make a difference between acceptance and rejection; and an interview is a smaller sample of performance (in the sense of time) than a test. But, Nairn prefers not to make these points. Basically, the whole report ignores the limitations of other sources of data for decision-making as well as the extent to which these other measures are currently being used.

HISTORY OF TESTING

Chapter 4, "The Worth of Other Men: The Science of Mental Measurement and the Test of Time," is a somewhat biased review of the history of testing in the United States. Much is made, for example, of the eugenics position ascribed to Galton and Terman, and the use of test data in immigration decisions, quoting Kamin (1974) as follows about the 1924 Immigration Act:

The law, for which the science of mental testing may claim substantial credit, resulted in the deaths of literally hundreds of thousands of victims of the Nazi biological theorists. (p. 27)

One searches in vain for any evidence that Nairn has read any of the more balanced views of the history of measurement such as that written by Cronbach (1975), who points out that "proponents of testing, from Thomas Jefferson onward, have wanted to open doors for the talented poor, in a system in which doors often are opened by parental wealth and status" (1975, p. 1).

Nairn quotes Thorndike in a disapproving manner when Thorndike suggested that "the able and good should acquire power. In order to support the truth, defend justice and restrain folly, superior men should acquire power" (Nairn, p. 193). The writing does not make it totally clear whether Nairn is opposed to the able and good acquiring power or opposed to the use of test data to help determine the able and good. I rather hope the latter. If the former, then would we want the incompetent and evil to acquire power? I would not. In fact, I would abhor such an occurrence so much that I would want to use as much data as are available (including test data) to decrease the odds of such an occurrence.

CLASS VERSUS MERIT?

Chapter 5, "Class In the Guise of Merit," presents the argument that because SAT scores are related to income, they must not be related to merit, a strange argument. Evidently, Nairn has not heard of sociological studies such as those reported by Havighurst and Neugarten (1975), which show considerable within-lifetime and cross-generation changes in social class. Waller (1971) found, for example, that sons who rise above their parents' socioeconomic status (SES) score, on average, better on intelligence tests than sons whose SES is lower than that of their parents. The United States does have a somewhat permeable social class system where financial advancement is based, at least in part, on merit. Thus, one would expect that if the SAT measures merit, and if merit is related to social class, then a correlation between SAT and income should exist. Such a correlation is another bit of data in the network that would support the construct validity of the SAT. To find no correlation between the SAT and family income would be evidence against the construct validity of the SAT.

Another incorrect implication of the Nairn/Nader report is that tests are perpetuating social classes by keeping poor people out of college. This is simply not true. Fricke (1975, p. 110) demonstrated, for example, that if admission to the
