
cluded both PSAT-V and -M in predicting coaching effects for each area on the SAT. This analysis is preferable not only because it is more precisely controlled through the inclusion of additional covariates, but because it more appropriately contrasts the performance of coached students with predicted levels based on uncoached students rather than a mixture of the two. This approach results in valid estimates with fewer assumptions and enables examination of interactions in a straightforward manner (Cochran, 1968).
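The contrast described above (coached students' observed scores against levels predicted from uncoached students alone) can be sketched in a few lines. This is only an illustration under stated assumptions: the scores are invented, the two covariates stand in for PSAT-V and PSAT-M, and the function names (`fit_ols`, `effect_vs_uncoached`) are hypothetical rather than anything taken from the reanalysis itself.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, row):
    """Predicted score for one student from the fitted coefficients."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def effect_vs_uncoached(uncoached_x, uncoached_y, coached_x, coached_y):
    """Mean of (observed - predicted) for coached students, where the
    prediction equation is fitted on uncoached students only."""
    coef = fit_ols(uncoached_x, uncoached_y)
    residuals = [y - predict(coef, x) for x, y in zip(coached_x, coached_y)]
    return sum(residuals) / len(residuals)


# Invented data: (PSAT-V, PSAT-M) pairs and SAT-V scores.
# Uncoached scores follow 30 + 8*V + 1*M exactly; coached scores add 25 points.
uncoached_x = [(40, 45), (50, 42), (55, 60), (45, 52), (60, 48), (38, 41)]
uncoached_y = [30 + 8 * v + 1 * m for v, m in uncoached_x]
coached_x = [(52, 50), (47, 58), (58, 44)]
coached_y = [30 + 8 * v + 1 * m + 25 for v, m in coached_x]

effect = effect_vs_uncoached(uncoached_x, uncoached_y, coached_x, coached_y)
print(round(effect, 6))
```

Because the prediction equation never sees the coached students, the estimated difference is not pulled toward a mixture of the two groups. As in the text, the number it returns would still be a combined coaching/self-selection effect, not a pure coaching effect.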

This reanalysis indicates that, given their background characteristics and pretest levels on the PSAT, students enrolled in one of the three coaching schools studied obtained significantly higher SAT scores than did uncoached students by about 20 to 35 points in both Verbal and Math, the same neighborhood as the FTC estimates. The estimated effects of coaching for the other two schools were not statistically significant. These are estimated combined effects due to coaching and self-selection (self-selection in terms of students who seek and complete commercial coaching programs as opposed to those students who do not), since it is not possible to estimate coaching and self-selection effects separately with these data. No interactions were uncovered at Coaching School A for either SAT-V or SAT-M. At Coaching School B, however, statistically significant and independent interactive effects were obtained on SAT-V for race and self-reported parental income. On the average, even though their number was quite small (N = 13), black students at School B exhibited significantly larger coaching/self-selection effects on SAT-V than nonblacks, and students reporting low family income exhibited significantly larger coaching/self-selection Verbal effects than those reporting high family income. A detailed summary of the Stroud reanalysis follows in Section IV, and the complete report appears in Appendix 2.

A second study of the data was undertaken by Donald A. Rock, Senior Research Psychologist at ETS. Since we know that gains in SAT scores can be expected from the junior year to the senior year of high school, Dr. Rock applied a statistical model incorporating growth effects. For the treatment or experimental group, this study included only those students at the largest coaching school (School A) for whom three sets of test scores were available, a PSAT and two administrations of the SAT. The control group included only those uncoached students for whom these same three sets of test scores were also available. The coached students, who were labeled "underachievers" in the FTC report, performed better than the uncoached students on the PSAT and were also higher in high school rank-in-class and family income than were the uncoached students. As would be expected, the coached students also scored higher than their uncoached cohorts on the first SAT administration, but in the case of the Verbal area scored differentially higher. That is, during the period from taking the PSAT to taking the initial SAT, prior to attendance at the coaching school, the verbal skills of the coached students appeared to grow more rapidly than those of the uncoached students. In Math, however, both the coached and uncoached students showed similar group growth rates during this preintervention period. When the confounding effects of differential group growth rates are controlled for, the estimated coaching effect on the Verbal score (about 17 points) is substantially smaller than that for the Math score (about 30 points). The fact that these two effects differ is consistent with the results of earlier studies and with expectations that Math, being generally more curriculum-related than Verbal, might be more responsive to special preparation. A detailed summary of the Rock study follows in Section V, and the complete report appears in Appendix 3.

Most of the criticism in this section points up the limitations of the FTC data and the limits on inferences that can be drawn from such data. Within these limits, further analysis and interpretation may still prove to be illuminating. In this spirit, we next attempt to estimate the combined coaching/self-selection effects using more refined methods and to investigate the possibility of interactions between size of effects and student background characteristics. We also attempt to adjust statistically for that portion of self-selection effects that is embodied in differential group growth rates.

IV

Estimation of Combined Coaching / Self-Selection Effects in the FTC Study: Detailed Summary and Elaboration of the Stroud Reanalyses

In an effort to obtain more precise estimates of the combined coaching/self-selection effects in the FTC data, Dr. T.W.F. Stroud of Queen's University, Kingston, Ontario, at the request of ETS, designed and conducted additional analyses introducing several refinements over the procedures used by the FTC. It will be recalled that the FTC study was based on six subsamples: (1) high school juniors taking the SAT for the first time in April 1975 or (2) in April 1976; (3) high school seniors taking the SAT for the second time in November 1975 or (4) in November 1976; and (5) all high school students taking the SAT for the first time on any test date over the three-year period 1974-1977 or (6) taking the SAT for the second time on any test date during that period. Data for students at two coaching schools in the metropolitan New York area were included, while data from a third coaching school were left unanalyzed because the number of students was considered too small. A control sample of uncoached students consisted of every 150th individual in the ETS files who took the SAT during the given three-year period in the same greater New York area.

The following background variables, roughly in decreasing order of their importance, were controlled for in the FTC regression analyses: pretest score [PSAT-V when the first SAT-V (SAT1-V) was being predicted, SAT1-V when the second SAT-V (SAT2-V) was being predicted, and similarly PSAT-M when predicting SAT1-M and SAT1-M when predicting SAT2-M], self-reported grade in English or Math, self-reported rank-in-class, self-reported years of English or Math taken, self-reported parental income, sex, race, high school type (public or nonpublic), and number of PSATs taken. Time between pretest and posttest was also included as a covariate, but it did not improve prediction significantly. The FTC regression analyses were based only on students with complete data on all of these variables. For each of the six subsamples, separate regression equations predicting SAT-V and SAT-M were computed for pooled coached and noncoached students, using dummy variables to represent attendance at Coaching School A and Coaching School B.
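The pooled approach with dummy variables can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are invented, a single pretest stands in for the full covariate list, and the helper names are hypothetical. It is not a reproduction of the FTC computations.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Invented pooled sample: (pretest score, school), school in {"none", "A", "B"}.
students = [
    (40, "none"), (50, "none"), (60, "none"), (45, "none"),
    (55, "A"), (62, "A"), (48, "A"),
    (50, "B"), (58, "B"),
]

# Design row: pretest plus one 0/1 dummy per coaching school.
rows = [(p, 1.0 if s == "A" else 0.0, 1.0 if s == "B" else 0.0)
        for p, s in students]

# Invented outcomes: 60 + 9*pretest, plus 25 points at School A and 3 at School B.
scores = [60 + 9 * p + 25 * a + 3 * b for p, a, b in rows]

intercept, pretest_slope, school_a_effect, school_b_effect = fit_ols(rows, scores)
print(round(school_a_effect, 6), round(school_b_effect, 6))
```

Each dummy coefficient is the adjusted mean difference between that school's students and the noncoached students at the same pretest level. Note that the slope on the pretest is estimated from coached and noncoached students together, which is the feature of the pooled design that the reanalyses depart from.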

Overall, the FTC analyses showed that students at Coaching School A, on the average, scored significantly higher on the SAT than noncoached students. The amount of advantage on a 200- to 800-point score scale falls somewhere between 14 and 38 points, which are 95% confidence limits for the differences in adjusted means (the median lower confidence limit and the median upper confidence limit over the 12 analyses, for SAT-V and -M in the 6 subsamples). Students at Coaching School B did not do significantly better than noncoached students; their score effect falls somewhere between median confidence limits of -12 and +19 points. In any event, since the coached students in this nonrandomized study differed significantly from the noncoached students in a number of ways that could have influenced both their decision to attend coaching school and their performance on the SAT (such as having higher rank-in-class and higher parental income), these obtained score differences between coached and uncoached groups represent a confounding of coaching effects and personal factors that cannot be disentangled with the available data. Among these personal factors influencing attendance at commercial coaching schools, for example, are motivation to earn a higher test score and financial means. The FTC data set includes a rough proxy for financial means in the form of self-reported parental income, but it includes no proxy for motivation or for a host of other important ways in which coached and noncoached students are likely, on the average, to differ, such as in career aspirations or in level of parental education. Attempts were made in the regression analyses to control for student differences in reported income, but there is no way that statistical adjustments can take unmeasured influences into account. As a consequence, the score effects reported in the FTC study must be interpreted as combined coaching/self-selection effects, as must the findings of the reanalyses that follow.

The reanalyses undertaken here differ from the FTC analyses in a number of respects:

In the FTC analyses, verbal pretests (PSAT-V and SAT1-V) and verbal background variables (e.g., grades in English, years of English) were used to predict SAT-Verbal scores, and quantitative pretests and background variables were used to predict SAT-Math scores. However, since the inclusion of both verbal and quantitative variables improved prediction of each score, both verbal and quantitative pretests and background variables were used in the current reanalyses to predict both SAT1-V and -M for juniors and both SAT2-V and -M for seniors. The only exception was "number of years of English," which was dropped from the analyses because it did not add to prediction in any regression equation when the other variables were already entered.

The current reanalyses dealt only with the peak test dates in 1975 and 1976 (subsamples 1, 2, 3, and 4 in the FTC study). Subsamples 5 and 6 for the pooled time periods were omitted because of their heterogeneity.

Only students with complete data were included in the FTC analyses. In contrast, Stroud used missing-value techniques so that students not reporting parental income, race, or rank-in-class could nevertheless be meaningfully included in the analyses.

In the FTC study, students enrolled in coaching school who did not receive coaching prior to the SAT administration in question were added to the uncoached group. Since the representativeness of the control sample is thereby eroded, these students were excluded from the present analyses.

All three coaching schools in the FTC data set were included in the present analyses, and effects and standard errors are estimated for each school separately. In addition, smoothed estimates are provided for the three schools which utilize the empirical Bayes concept of "borrowing strength" across samples and which allow, under certain assumptions, for the possibility of predicting coaching/self-selection effects in the same schools in future years.

In a key departure from the FTC methodology, a multiple regression equation predicting each dependent variable (SAT1-V and -M for juniors and SAT2-V and -M for seniors) is constructed in the present approach on the respective junior and senior samples of uncoached students rather than on a pooled sample of coached and uncoached students as in the FTC study (Belson, 1956). This procedure yields an unbiased estimate of the treatment effect in the presence of interactions between the size of effects and values of a covariate; it is also the recommended procedure when the control sample is much larger than the treatment sample (provided that the relative contribution of sampling errors in the control-group regression line to the variance of the effect estimate is negligible) (Cochran, 1968). These regression equations are then applied to the coached students to predict the SAT scores they would have expected had they been uncoached students with their same values on predictor variables. Since we wish to assess
