The Educational Testing Act of 1981

We spent a lot of money to develop what we thought was a much more effective test, a test which gave much better information to those who took it and those who were using the test results in any way. We think that has been accomplished and we want to continue to make the point that we are different.

The circumstances under which the tests are administered and the ways in which test scores are used were found to be variable. In short, what may have appeared appropriate and equitable to require of one program was seen to be inappropriate and destructive for another. Our Medical College Admission Test is a case in point. We have argued that the MCAT is a highly specialized achievement examination with specifications drawn very tightly to ensure maximum relevance to medical education and to provide maximum guidance to the candidate for his or her preparation for the test. We did disclose the nature of those restrictions for any legislation that was introduced in California or New York or proposed in the national Congress.

The MCAT manual, which the students can obtain at a very low cost, provides them with very explicit instructions on the areas from which the test items are drawn. Second, it explains how the individual types of questions are developed. It also provides them with a test which they can use themselves in trying to determine their preparation, their readiness to take a test of this kind. In developing the questions, data were collected from a cross-section of approximately 150 medical faculty to determine what they considered relevant to the study and practice of medicine. Testimony was heard last summer from an MCAT examinee, Ms. Carolyn Bennett, that she and her colleagues considered the MCAT a fair representation of what students should know to get into medical school.

All of this is now in serious jeopardy as a result of this bill. Disclosure of test questions and answers as required by the New York law and by H.R. 1662 would destroy the MCAT since under these conditions there simply is not a sufficient supply of questions that would meet the specifications of the test. Disclosure is also incompatible with the conditions under which the MCAT must be administered with the result that our ability to write and test new questions and equate new forms would be seriously compromised.

These kinds of considerations were apparently persuasive in several States since we were advised a number of times that should legislation prove feasible, the MCAT would be excluded. The same issues, by the way, were fundamental in our pleadings before the Federal court in the northern district of New York that led to our being awarded a preliminary injunction preventing New York from enforcing the disclosure provisions of its law against the MCAT.

The third issue raised dealt with the accountability of test sponsors. I have explained in previous testimony how the MCAT program is directly accountable to the 126 medical schools of the United States, their faculty, students, and administration and how we provide for the continuing counsel of the candidate's premedical faculty. Proponents not only direct the issue of accountability to test contents but also propose extending disclosure to all studies, data analyses, reports, and so forth, as though test sponsors control all research and evaluation associated with the test and have no reason to feel accountable to the public which they serve. On the occasion of our last appearance in this forum, we submitted three early studies conducted independently of the AAMC as a demonstration that the test sponsor is not the sole source of information about a test. We have now counted 21 independently prepared studies relating to the interpretation and use of the new MCAT that have appeared in the literature. Appendix A includes 21 of these citations for your perusal. The list cites studies conducted at 15 different schools. In addition, the AAMC has developed a cooperative arrangement with other schools to stimulate their interest in conducting local studies that will contribute at the same time to nationally aggregated information. The result is that over 40 schools are actively engaged in studying the effectiveness of the MCAT and will be preparing articles for publication that the test sponsor has no opportunity to control. Contrary to the claims of the bill's proponents, it seems that a high degree of accountability exists in this situation where approximately one-third of the institutional users of a test are actively involved in the evaluation of its effectiveness.

The fourth reason cited suggested that test disclosure was necessary to detect and prevent scoring errors. The term "scoring errors" as we have understood it has been used in at least two ways: One, to refer to an incorrectly calculated score, and two, to refer to a question with ambiguities in the solution that has been identified as correct. It is our contention that the exposure of test questions and answers as required by H.R. 1662 is a classic case of killing fleas with a sledgehammer. If a serious problem exists in this regard, other review mechanisms preserving the security of the examination materials are certainly available to provide the necessary assurances.

In the MCAT program we have instituted a series of checks and balances both before and after the administration of every test in order to achieve the highest probability of detection of an error of either kind. We have detailed in previous testimony all of the steps taken in the development and testing of each individual question. In addition to these safeguards common to the industry as a whole, we have special field tryouts before a question is ever used, where samples comparable to our examinee population are engaged in a dialog seeking a variety of reactions to questions under development. And that sample does include minorities among others. In addition, after each administration we have an elaborate formal score verification process that is designed to be a final check on possible problems in the scoring of an answer sheet, questions with potential ambiguities in the keying of the correct answer, and potential errors in the determination of scaled scores and their equivalency with previous forms. These checks employ visual inspection of answer sheets, comments from examinees who have taken the test (those comments are immediately forwarded to the test developers and are acted upon), a review of item characteristics, and sophisticated statistical techniques. Though we are confident that we have instituted every reasonable precaution against errors, we continue to evaluate our system through the ongoing use of teams of external consultants who are recognized as leaders in their field.

And with your permission I would just like to give their qualifications so that you can evaluate the kind of people that we're using. Dr. Robert Linn, professor of education, University of Illinois, who is president of the National Council on Measurement in Education. Dr. Lorrie Shepard, professor of education, University of Colorado, president-elect of the National Council on Measurement in Education, former editor of the Journal of Educational Measurement. Dr. Richard Jaeger, professor of education, the University of North Carolina at Greensboro, former editor of the Journal of Educational Measurement.

The question then is what would the exposure of all test questions and answers contribute? The answer is that it would not only destroy the current test specifications, at least for the MCAT, but would result in the need to produce greater quantities of material with a corresponding loss in available resources to maintain the current checks on question quality. In short, the proponents of this legislation are in one sense in an enviable position. For test sponsors, on the other hand, it is a Catch-22. If disclosure is mandated, more and poorer questions will result and more scoring errors will be identified, justifying, of course, the imposition of the law to begin with and the intrusion of the Federal Government into educational matters that are by long tradition the responsibility of the private sector.

A fifth reason we have heard cited in support of testing legislation is the need to equalize access to coaching courses. This seems to be a major justification since considerable attention is planned for the subject tomorrow at hearings here. Several subquestions have been raised or implied in the discussion of coaching courses. First, can they affect test performance, how much, and is that bad? Second, if some can increase performance, should the U.S. Congress attempt to equalize access to them? Third, how does H.R. 1662 address the issue?

Concerning the impact that test preparation courses can have on test performance, we have been trying to collect some empirical data about participation in such experiences and their association with changes in performance on the MCAT. The deeper we probe the area, the more complex and the more contaminated with uncontrollable factors the question becomes, and the more hazardous any useful generalizations seem. For example, our data are volunteered by the candidates; we do not have access to the list of people that are taking the coaching courses. The data reflect a wide disparity of review experiences, the participants have differing levels of motivation, they start with different backgrounds, et cetera. With these caveats, we are tentatively concluding that though on the average examinees acknowledging participation in commercial review courses exhibit gains in performance on a second administration of the MCAT, the gain is only slightly greater than that observed for those repeating the test after the same interval but not reporting participation in a formal course. The magnitude of the average incremental gain is on the order of one-tenth to one-half of a scaled score point. The magnitude of the change associated with the group not reporting participation in a commercial review course is on the order of one-fourth to one scaled score point. To report more specific figures at this time, we think, would imply a stability and a confidence in these results that we have not yet been able to establish.

We did find consistently that the higher gains were for the content specific sections of the test, that is, biology, chemistry, and physics, and that the gains observed at the lower ends of the ranges were associated with the Skills Analysis: Reading and Quantitative subtests. Relatedly, examinees scoring initially at the lower levels on the Skills Analysis: Reading and Quantitative subtests gained least from the typical review course experience. Since these test preparation experiences tend to be heavily content oriented rather than emphasizing the development of basic thinking skills, such a test preparation experience is likely to have relatively little payoff for examinees with less well developed thinking skills.

If as it now appears there may be an effect on MCAT scores associated with participation in a review course, is that bad? Changes in performance on the MCAT as a result of a test preparation course are not a threat to the validity of our test. In fact, to the extent that a test preparation course or any other effort that the student uses to increase and gain knowledge enhances the candidate's knowledge or skills measured by the test, the MCAT would be invalid if it did not reflect that improvement. In this regard, it should be understood that we regularly monitor the leakage of test materials from the program to assure ourselves that materials reused on subsequent forms have not been compromised.

In summary, none of these data have caused us to harbor concerns about the validity of the test. The magnitude of the changes is within reasonable bounds, and the type of change and the interaction among types of performance even offer evidence for the construct validity of the test, that is, the diagnostic value of the skills subtests. We are continuing our research and will report specific findings as we establish the stability of our earlier indications.

The second issue related to coaching was whether the U.S. Congress should try to equalize access to test preparation courses or, for that matter, to any resource that might be effective in improving performance. Proponents imply that it should, but the issue has not been directly confronted. Should the answer be yes, then new legislation will be required.

This brings us to the last issue which is whether H.R. 1662 addresses the concern about equalizing access. Presumably the link is in the increased availability of test questions and answers resulting from mandatory disclosure. In the comments accompanying the introduction of H.R. 1662, this statement is made and I quote:

And it (the testing legislation) would lessen inequities among students created by expensive coaching schools by giving everyone equal access to information about the test and the questions themselves--not widen the gap between students.

At best, this is an unproven assertion. At worst, testing legislation will exacerbate the problem in the sense of the above quote in that it will widen the gap.

First, from the experience of the College Board in New York, we know that of the small numbers requesting their questions and answers (fewer than 5 percent), disproportionately more are from the higher socio-economic groups and from those already achieving high scores on the test. If there is a benefit to having access to questions and answers, whom then will it benefit most? Clearly, not the disadvantaged. If the test preparation experiences continue to be expensive and therefore access for the disadvantaged continues to be limited, we see nothing in H.R. 1662 that will change that. And if there is any value to such experiences, advantaged groups in the population will continue to be the primary beneficiaries. It also seems quite reasonable that putting more information into the hands of the operators of such courses will only enhance their potential to offer effective programs of preparation. It is predictable also that test preparation courses may be made more attractive by their being perceived as the collector and organizer of the flood of questions and answers that are suddenly available. Unexplained, uninterpreted, unorganized bursts of questions and answers will only succeed in increasing dependence on external support while satisfying the resource requirements of the operators in the test preparation business to make money.

We fail to see how these arguments advanced in support of H.R. 1662 survive the careful scrutiny of evidence or logic. This intervention would be contrary to the clearly expressed views of the people that Federal presence should be reduced. With the severe restriction in available resources, it does not seem logical to spend the public purse on unnecessary regulatory initiatives. In our view these hearings have amply demonstrated all the fundamental flaws of H.R. 1662 and we urge that the subcommittee not report this measure to the full committee. Thank you very much for your kind attention.

[Prepared statement of John Cooper follows:]
