Yes, it’s that time again – I am now into the research cycle of the M.Ed., which includes the course I’m taking this semester. Our first assignment was to reflect on the ethics of a case study conducted in the USA in the 1960s.
Robert Rosenthal and Lenore F. Jacobson’s late-1960s experiment on self-fulfilling prophecies led educators to reflect on their “attitudes and behaviour towards students,” and inspired further research into the impact of teacher attitude and the concept of the self-fulfilling prophecy. Four decades later, however, such an experiment might not get past an academic ethics committee, despite what appear to be significant and desirable effects in the field.
Rosenthal and Jacobson (1968) hypothesized that, based on the concept of the self-fulfilling prophecy, teacher expectations of student achievement would in fact affect that achievement. In order to explore this hypothesis, the researchers conducted a study of teachers and students in an elementary school in San Francisco; teachers were given false results of an intelligence test, and told that certain students had been identified as “potential academic spurters” (66). Subsequent intelligence tests supported the original hypothesis, in that “children from whom teachers expected greater intellectual gains showed such gains” (67).
Several aspects of the study might now be considered unethical. Clearly, the nature of the study demands a certain level of deception; the researchers needed the teachers to be unaware of the purpose of the study, particularly since the teachers themselves were the true object of the study. However, one might argue that the deception went beyond what was required. For instance, the researchers state that the intelligence test administered was “fairly new and therefore unfamiliar to the teachers” (66), but that they had “special covers” made with the “high-sounding title ‘Tests of Inflected Acquisition’” (66). A 21st-century academic ethics committee might question the need to add a layer of deception to a test that was already unknown. Even if it had been a recognized test, there does not seem to be any real justification for the “special” retitling. Given that the researchers later consider the possibility that the teachers might be influenced by the Hawthorne effect, in which subjects behave differently simply because they are the subject of a “high-sounding” study, creating a “special” test cover might in fact undermine the results.
Furthermore, the “deliberately casual” (66) mention of students’ names, at the end of a staff meeting, seems questionable and unnecessary. Why must the delivery of this information be casual? If anything, this approach seems to run counter to the first notion, that the researchers wanted the teachers to consider the test and the observation of students to be more formal and important. As well, this casual mention in no way guarantees that teachers actually took note of each potential spurter. Finally, this casual mention implies that the ‘results’ of the intelligence test were not officially presented to the teachers, and there is no indication in the research report that teachers were asked not to communicate the results to students and parents. If the study is meant to observe the effects of teacher expectations, researchers must take pains to ensure that neither student nor parent expectations become a factor.
Another questionable aspect of the study lies in the researchers’ justification for identifying only potential spurters. Although Rosenthal and Jacobson claim that they want to “avoid the dangers of letting it be thought that some children could be expected to perform poorly” (65), clearly this expectation is inevitable if certain children are identified as expected to perform well. Furthermore, in choosing a sample of students who were already streamed into three levels (66), the researchers run the risk that teachers already have preconceived notions about student potential – a risk that is borne out in the results, which show that students previously identified as “below average” (66) were not as well regarded as their counterparts in the two other levels, regardless of whether or not they had been identified as potential spurters (67).
Although the consequences of Rosenthal and Jacobson’s study can be considered positive, their conclusion reveals a possibly controversial stance: that federal funding established in the mid-1960s to help “disadvantaged” students was money ill-spent. The researchers conclude that “more attention in educational research should be focused on the teacher” (69), but ignore the potentially important findings in terms of streaming students. Furthermore, there is no indication in the report that the researchers spent any subsequent time with the teachers and students to discuss the implications of their research, or even really considered whether or not teachers or students were harmed through their unwitting participation in the study. To my mind, the most important of these issues is streaming. It seems, in retrospect, that the hypothesis of the research should have considered that teacher expectations are naturally influenced by streaming, regardless of the results (false or otherwise) of an intelligence test. Even if this possibility was not considered in the initial hypothesis, the results, particularly those related to the lower stream classes, should have been emphasised in the conclusion. In short, rather than imply that educational funding had been misdirected – opening the door to conservative machinations to reduce such funding – the researchers should have called for the abolition of streaming, with additional funding given to support teachers with ‘mixed’ classes.
Perhaps the significance of the study lies not in the researchers’ actual conclusions, but in the observed effects on educators. If teachers become more conscious of how they communicate their expectations to students, and how those expectations directly affect their students, presumably we will strive first to reconsider our expectations and second to approach all students with an open mind. Ethically speaking, though, we need to weigh that outcome against the potentially negative impact on the school in question – clearly, several teachers and students were affected over a long period, without, it seems, any follow-up on the part of the researchers to minimize harm. Questions might also be asked about the consequences for the student participants – for instance, what are the effects on students who were falsely identified as potential spurters? They may have, as a group, performed well during the period of the study, but presumably were subsequently re-identified as “less” than they had been. Furthermore, how do the participant teachers deal with each other and their students once the real aim of the study and its results are revealed? Is it possible that teachers will feel a level of resentment towards the misidentified students? What about subsequent research? Such a deceptive study would certainly influence the attitude of teachers and administrators towards future research, not only in the school in question, but in all schools. In fact, one might even argue that a certain part of the perceived positive shift in teacher attitudes stems from a paranoia that we might be under scrutiny. In short, without significant reconstruction of the proposed methodology, this study should not be considered viable by today’s academic ethics committees.