
VII

References

Alderman, D. L., & Powers, D. E. The effects of special preparation on SAT-verbal scores (CB RDR 78-79, No. 4 and ETS RR 79-1). Princeton, NJ: Educational Testing Service, 1979. (American Educational Research Journal, 1980, 17, 239-251.)

Belson, W. A. A technique for studying the effects of a television broadcast. Applied Statistics, 1956, 5, 195-202.

Breland, H. M. Population validity and college entrance measures (Research Monograph No. 8). New York: The College Board, 1979.

Bryk, A. S., & Weisberg, H. I. Value-added analysis: A dynamic approach to the estimation of treatment effects. Journal of Educational Statistics, 1976, 1, 127-155.

Bryk, A. S., & Weisberg, H. I. Use of the nonequivalent control group design when subjects are growing. Psychological Bulletin, 1977, 84, 950-962.

Campbell, D. T. Reforms as experiments. American Psychologist, 1969, 24, 409-429.

Campbell, D. T., & Stanley, J. C. Experimental and quasi-experimental designs for research on teaching. In N. L. Gage (Ed.), Handbook of Research on Teaching. Chicago: Rand McNally, 1963.

Cochran, W. G. The use of covariance in observational studies. Applied Statistics, 1968, 17, 270-275.

Coffman, W. E., & Parry, M. E. Effects of an accelerated reading course on SAT-V scores. Personnel and Guidance Journal, 1967, 46, 292-296.

Dear, R. E. The effect of a program of intensive coaching on SAT scores (ETS RB 58-5). Princeton, NJ: Educational Testing Service, 1958. (Reported in French, J. W., & Dear, R. E. Effect of coaching on an aptitude test. Educational and Psychological Measurement, 1959, 19, 319-330.)

Dixon, W. J., & Massey, F. J., Jr. Introduction to statistical analysis. New York: McGraw-Hill, 1951.

Donlon, T. F. The effects of special preparation for tests: The ETS experience. Paper in preparation.

Dyer, H. S. Does coaching help? College Board Review, 1953, 19, 331-335. (Reported in French, J. W., & Dear, R. E. Effect of coaching on an aptitude test. Educational and Psychological Measurement, 1959, 19, 319-330.)

Evans, F. R., & Pike, L. W. The effects of instruction for three mathematics item formats. Journal of Educational Measurement, 1973, 10, 257-272.

Federal Trade Commission, Boston Regional Office. Staff memorandum of the Boston Regional Office of the Federal Trade Commission: The effects of coaching on standardized admission examinations. Boston, MA: Federal Trade Commission, Boston Regional Office, September 1978.

Federal Trade Commission, Bureau of Consumer Protection. Effects of coaching on standardized admission examinations: Revised statistical analyses of data gathered by Boston Regional Office of the Federal Trade Commission. Washington, DC: Federal Trade Commission, Bureau of Consumer Protection, March 1979.

Frankel, E. Effects of growth, practice, and coaching on Scholastic Aptitude Test scores. Personnel and Guidance Journal, 1960, 38, 713-719.

French, J. W. The coachability of the SAT in public schools (ETS RB 55-26). Princeton, NJ: Educational Testing Service, 1955.

French, J. W., & Dear, R. E. Effect of coaching on an aptitude test. Educational and Psychological Measurement, 1959, 19, 319-330.

Lass, A. H. Unpublished study. Brooklyn, NY: Abraham Lincoln High School, 1958.

Marron, J. E. Preparatory school test preparation: Special test preparation, its effect on College Board scores and the relationship of affected scores to subsequent college performance. West Point, NY: Research Division, Office of the Director of Admissions and Registrar, United States Military Academy, 1965.

Pallone, N. J. Effects of short-term and long-term developmental reading courses upon S.A.T. verbal scores. Personnel and Guidance Journal, 1961, 39, 654-657.

Pike, L. W. Short-term instruction, testwiseness, and the Scholastic Aptitude Test: A literature review with research recommendations (CB RDR 77-78, No. 2 and ETS RB 78-2). Princeton, NJ: Educational Testing Service, 1978.

Pike, L. W. Implicit guessing strategies of GRE-aptitude examinees classified by ethnic group and sex (GRE Board Professional Report GREB No. 75-10P). Princeton, NJ: Educational Testing Service, 1980.

Pike, L. W., & Evans, F. R. The effects of special instruction for three kinds of mathematics aptitude items (CB RDR 71-72, No. 7 and ETS RB 72-19). Princeton, NJ: Educational Testing Service, 1972.

Powers, D. E., & Alderman, D. L. The use, acceptance, and impact of Taking the SAT, a test familiarization booklet (CB RDR 78-79, No. 6 and ETS RR 79-3). Princeton, NJ: Educational Testing Service, 1979.

Roberts, S. O., & Oppenheim, D. B. The effect of special instruction upon test performance of high school students in Tennessee (CB RDR 66-7, No. 1 and ETS RB 66-36). Princeton, NJ: Educational Testing Service, 1966.

Rock, D. A., & Werts, C. E. Construct validity of the SAT across populations: An empirical confirmatory study (CB RDR 78-79, No. 5 and ETS RR 79-2). Princeton, NJ: Educational Testing Service, 1979.

Slack, W. V., & Porter, D. The Scholastic Aptitude Test: A critical appraisal. Harvard Educational Review, 1980, 50, 154-175.

Stern, J. Personal communication, 1975.

Vernon, P. E. The determinants of reading comprehension. Educational and Psychological Measurement, 1962, 22, 269-286.

Whitla, D. K. Effect of tutoring on Scholastic Aptitude Test scores. Personnel and Guidance Journal, 1962, 41, 32-37.

VIII

Appendices
Appendix 1

Critical Notes on the FTC Coaching Study

DONALD L. ALDERMAN, November 16, 1979

Staff Memorandum of the Boston Regional Office (BRO)

The disclaimers and reservations attached to the Boston memorandum make it quite clear that the Federal Trade Commission (FTC) itself found serious flaws in this staff report. Yet it may be useful to enumerate those criticisms, as well as to offer further comments on specific details, since the "revised statistical analyses" conducted by the FTC's Bureau of Consumer Protection (BCP) in Washington accept the data set and repeat several assumptions of the Boston memorandum.

The note on the cover of the Boston memorandum goes further than the usual disclaimers in stating that "the Commission specifically believes that some of the conclusions in the study are not supported by the evidence obtained in the investigation." This strong disavowal of certain conclusions may be attributed to "several major flaws in the data analysis," quoting from the BCP's notice to recipients of the Boston memorandum. The notice cites four specific flaws: (1) comparisons are made "for groups of coached and uncoached students without controlling for differences which may exist in personal and demographic characteristics of the students in the two groups;" (2) failure "to provide tests of statistical significance which are necessary to interpret the results" (i.e., whether differences may simply be due to chance occurrences rather than treatment effects); (3) defects that concern "the method used to present the findings from the data analysis" (e.g., discussion of results in terms of a nonexistent subpopulation of students); and (4) "all the limitations associated with [a nonexperimental design]."
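Flaws (1) and (2) lend themselves to a concrete illustration. The following sketch, in Python with entirely simulated data and hypothetical variable names (the FTC data themselves are not reproduced here), contrasts the raw coached-versus-uncoached difference with a covariate-adjusted regression estimate that carries a test of statistical significance:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 500

    # Hypothetical demographic covariates (stand-ins for an SDQ-style profile).
    income = rng.normal(50.0, 15.0, n)   # parental income, thousands of dollars
    gpa = rng.normal(2.8, 0.5, n)        # prior grade-point average

    # Self-selection into coaching: better-situated students enroll more often.
    logit = 0.03 * (income - 50.0) + 1.0 * (gpa - 2.8)
    coached = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

    # Scores depend on the covariates plus a modest true coaching effect.
    score = (400.0 + 2.0 * income + 80.0 * gpa + 20.0 * coached
             + rng.normal(0.0, 60.0, n))

    # Flaw (1): a raw group comparison confounds coaching with selection.
    print("raw difference:",
          score[coached == 1].mean() - score[coached == 0].mean())

    # Covariate adjustment yields an estimate with a p-value (flaws 1 and 2).
    X = sm.add_constant(np.column_stack([coached, income, gpa]))
    fit = sm.OLS(score, X).fit()
    print("adjusted estimate:", fit.params[1], "p-value:", fit.pvalues[1])

In this simulation the raw difference overstates the 20-point effect built into the data because better-situated students select themselves into coaching, while the adjusted estimate recovers it; with the actual FTC data, the size and even the direction of such selection bias are unknown.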

These points stressed by the FTC focus on the statistical analysis and give inadequate attention to the data editing that necessarily took place before any statistical analysis. Moreover, the subsequent BCP report repeats several weak assumptions made in the Boston memorandum.

Assumptions and Data Analysis

These additional comments concern some of the BRO's assumptions and some of the steps taken in constructing the final data base. But the design of the study would itself preclude any definitive conclusions. Numerous alternative explanations exist for any differences in the test performance of the two comparison groups (i.e., "coached" and "uncoached") since students in these groups differ markedly on key demographic characteristics (see Table 1 of the BCP's report, pp. 8-11). And the statistical analyses undertaken in the BRO memorandum exacerbate these initial group differences by ignoring them. Nevertheless, I offer a few additional comments:

• Scope of inferences. The BRO memorandum assumes that "valid inferences about the coachability of other examinations can be drawn from the specific results we obtain for the SAT and LSAT." Given the pre-existing differences evident in the comparison groups, there is not even a strong basis for inferences about the SAT and LSAT, let alone any other examinations. Indeed, the results themselves show inconsistencies across test administrations, commercial schools, and examinations.

• Representative sample. The enumeration of the study's assumptions (BRO memorandum, p. 49) begins with a statement concerning the sample's representativeness of the entire SAT and LSAT candidate populations. Although the control group was a random sample drawn from history files, the "coached" group is obviously very different from the population of candidates (see Table 1 of BCP's report). The release of BRO's technical appendix "SDQ," which gives demographic profiles by comparison group, should confirm the weakness of this assumption.

• Consistency of treatment effects. Another assumption was that the "coaching school [effect] is consistent during the study period (p. 49)." Certainly the consistency of treatment effects across schools or test administrations should be an open question subject to investigation rather than conjecture or assumption.

• Treatment self-selection and control contamination. The strangest assumption made in the BRO memorandum is that "the effects of enrollee self-selection, if any, and of coaching of presumably uncoached students offset one another (p. 49)." There is no way to estimate the extent of treatment contamination.
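A small numeric sketch, with all values hypothetical, shows why this offsetting assumption is untestable: the upward bias from self-selection and the downward dilution from coached students hidden in the "uncoached" group cancel only at one particular contamination rate, and nothing in the study design estimates that rate.

    # All values hypothetical, chosen only for illustration.
    true_effect = 25.0      # assumed true coaching gain, in score points
    selection_bias = 15.0   # assumed head start of self-selected enrollees
    base = 450.0            # mean score of genuinely uncoached students

    for contamination in (0.0, 0.2, 0.4, 0.6, 0.8):
        # Coached students leaking into the control group raise its mean,
        # so the observed difference matches the true effect only when
        # contamination * true_effect happens to equal the selection bias.
        control_mean = base + contamination * true_effect
        coached_mean = base + selection_bias + true_effect
        observed = coached_mean - control_mean
        print(f"contamination {contamination:.0%}: "
              f"observed {observed:.1f}, true {true_effect:.1f}")

Under these assumed values the two biases offset exactly only at a 60 percent contamination rate; at any other rate the observed difference misstates the true effect, and the BRO memorandum offers no way to learn which case obtains.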
