The Plowden Report (1967)
A Report of the Central Advisory Council for Education (England) London: Her Majesty's Stationery Office 1967
Volume 2 Appendix 7
by GF Peaker

I: Progress since the war

1. Since the end of the war there has been a remarkable improvement in standards of reading. In 1964 boys and girls aged eleven reached on the average the standard of pupils 17 months older in 1948. If the 17 months is expressed as a percentage of 72 months between the age of beginning school and the age of 11, this improvement can be stated as an increase of 24 per cent in the pace of learning. It is also the case that the standard reached or exceeded by half the boys and girls in 1948 was reached or exceeded by three quarters of their successors in 1964. There has been a corresponding advance among boys and girls aged 15.

II: The surveys

2. This is a striking record of progress. What is the evidence for it? The evidence is to be found in the results of national surveys (1) (2) that took place in 1948, 1952, 1956, 1961 and 1964. The 1948 survey was carried out by the Inter-departmental Illiteracy Committee, and the subsequent surveys by Her Majesty's Inspectors. In all of them attention has been concentrated on two ages, namely 11 and 15. In 1952 and 1956 each group covered an age range of three months, centred on 15.0 and 11.0. The boys and girls surveyed in 1961 were fourth year pupils whose average age was somewhat below 15, and the juniors surveyed in 1964 were somewhat older than their predecessors. But it is easy to calculate from the results an appropriate age allowance at each age. For juniors it is in fact five months for each point in the test score, and seven for seniors. This enables the scores to be adjusted to a common basis, and this has been done throughout the tables and diagrams that follow. The 1961 survey was carried out for the Newsom Committee and was restricted to comprehensive and modern schools. The 1964 survey was confined to primary schools.

3. The same reading comprehension test has been used for all the surveys in the chain. It was devised by Dr AF Watts HMI, and Professor PE Vernon. It occupies both sides of a foolscap sheet, contains 35 questions and has a time limit of 10 minutes. For each question the pupil has to select the right answer from five given words. The questions become progressively harder. The first questions could be read and answered by an intelligent child in the infant school. The later questions need the same sort of vocabulary and understanding as the leading article in a good newspaper. This long range is ample for the juniors but does not completely extend the best seniors. If 10 harder questions were added to the test many of the best seniors, but hardly any of the juniors, would be able to deal with some of them. This is the explanation of what will be noticed in the tables, that for seniors the median score is higher than the mean, and for juniors vice versa.

4. The confidence that can be placed in the results of the surveys clearly depends on the one hand on the appropriateness of the test, and on the other on the accuracy of the sampling. These questions are considered in Sections III and IV.
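The age allowance described in paragraph 2 can be illustrated with a short sketch. It assumes the adjustment is linear, as the text implies, at five months per point for juniors and seven for seniors. The function name and the example figures are hypothetical and are not taken from the survey tables.

```python
# Months of age equivalent to one point of test score (paragraph 2).
MONTHS_PER_POINT = {"junior": 5.0, "senior": 7.0}


def adjust_score(mean_score, mean_age_months, target_age_months, group):
    """Shift a group's mean score to what it would be at the target age.

    Assumes a linear allowance: a group older than the target age has
    points deducted, and a younger group has points added.
    """
    age_gap = mean_age_months - target_age_months
    return mean_score - age_gap / MONTHS_PER_POINT[group]


# Hypothetical juniors averaging 11 years 2 months (134 months) with a raw
# mean of 15.0, adjusted to the common basis of 11.0 (132 months).
junior_adjusted = adjust_score(15.0, 134, 132, "junior")

# Hypothetical seniors averaging 14 years 5 months (173 months) with a raw
# mean of 20.0, adjusted to the common basis of 15.0 (180 months).
senior_adjusted = adjust_score(20.0, 173, 180, "senior")

print(round(junior_adjusted, 2), round(senior_adjusted, 2))
```

Two months of extra age costs the juniors two fifths of a point, while seven months of missing age credits the seniors with one point.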
5. Table 1 and Diagram 1 show the mean and percentile scores obtained by juniors in the four years when they were surveyed. The progress from 1948 to 1964 can be most easily seen from the diagram. It is apparent that the curve for 1948 has to be pushed forward by about three and a half points to cover the curve for 1964. The two curves are much the same shape; in other words there has been progress in all parts of the range. By reading the diagram horizontally, the increase in score for any percentile can be seen. By reading it vertically, the percentile ranks corresponding to the same score at different epochs can be seen. Percentile for percentile there has been an advance of about three and a half points, and score for score a fall of about 20 percentile ranks, so that a score that would have gained a good rank in 1948 would by no means suffice to do so in 1964. At the foot of the scale the tenth percentile was 3.9 in 1948, but had risen to 7.5 in 1964. In 1948 five per cent of the juniors were scoring two points or fewer, in 1964 there were hardly any.

6. It is apparent from the diagram that for most of the range a point of score is equivalent to about six percentile ranks. This is one useful figure to keep in mind in considering what is meant by an average gain of three and a half points. Another is that a point of score is equivalent to five months of age, for juniors. This cannot be seen from the diagram; it is the gradient obtained by considering the average scores for different (chronological) ages. As the age increases, the gradient gradually declines, so that for seniors, aged 15, it is not five months to the point, but seven months.

7. It will be seen that the statement in the opening section, so far as it relates to juniors, is simply the result of translating the average gains shown in points of score in Table 1 into gains in reading age, at five months to the point, and gains in percentile ranks, at six ranks to the point.

8. The complete account of the survey shows a considerable overlap between the two age groups. Four per cent of the children of 11 are in advance of 50 per cent of children of 15.

Diagram 1: Scores in the reading tests, 1948-1964, pupils aged eleven

III: The appropriateness of the test

9. The strength of the evidence in favour of the remarkable progress illustrated by Table 1 can be considered under two heads. These are the appropriateness of the test and the accuracy of the sampling. The questions are practically independent. The accuracy of the sampling would need to be demonstrated whatever the test, and the appropriateness of the tests would need to be considered whatever the accuracy of the sampling, and indeed if every child in the whole country had been tested.

10. What grounds are there for thinking the test appropriate? It would be easier to discuss this question if the test could be printed on the opposite page. But to publish the test would be to endanger its usefulness for future surveys, since it would not be clear how much of apparent future improvements was due merely to familiarity with the test itself. The fact that it has already been used for five surveys, and has not been published, constitutes its main claim to usefulness in the future. The first use of a new test can only form a link in a chain of surveys by means of a dubious and difficult process of calibration, unless it has been given to a national sample at the same time as the old test, and this is a strong argument for preserving a well tried instrument until it begins to show signs of obsolescence.

11. The general nature of the test is that it consists of 35 questions increasing rapidly in difficulty. It is called a test of reading comprehension, and it seems likely that the early questions test mainly reading, and the later ones comprehension.
The early questions are so simple that almost any pupil could answer them if they are put to him orally, so that if he fails to answer them it is reasonable to think that this is because he cannot read them. On the other hand a pupil may have mastered the mechanics of reading and still be quite unable to obtain a high score because he lacks the vocabulary, the general knowledge, and the understanding needed to grasp the meaning and give the answer when he is confronted by the later questions. At one end the test answers the question 'Can he read at all?', and at the other end the question 'Can he read to some purpose, like an educated man?'. It would be hard to ask for more in the course of 10 minutes.

12. If the test were to be used as the sole measure of the ability of a particular child it could very properly be objected that however much the authors were guided by previous trial in their choice of questions they must perforce in the last resort select one question and reject another, and that in so doing they are making a random distribution of good and bad fortune among the children who will subsequently take the test. But it is the essence of randomness that it tends to cancel out over large numbers, and the object of the surveys is not to make judgements about individuals, but to assess the progress of populations, by means of samples large enough for the good and bad luck to cancel out. Even where the distribution of good and bad fortune is not random, the effects are eliminated provided that the proportions remain the same. This is strikingly illustrated by the constancy of the bias of the test in favour of boys, which has been very steady at about a point from the beginning. Analysis has shown that this bias lies almost entirely in nine of the questions, with one of them accounting for more than a fifth of the total. This is a case where the distribution of luck implied by the choice of question is known not to be random, but to favour one sex. But because the favour is constant it does not invalidate the comparisons of one year with another.

13. These arguments go some way to suggest that different methods of assessing average progress will lead to much the same conclusions for a moderately long period. They are not, however, demonstrative. Many definitions of reading ability and many methods of assessment have been proposed, and experiment has shown only moderate correlations between them for individual pupils. Although it seems likely that there would be much closer agreement between them over long term changes in averages there is no conclusive evidence of this. Perhaps the strongest arguments in favour of the present test are first that it implies a definition of reading ability that is in accordance with common sense, and secondly that it takes no more than 10 minutes of the pupil's time. This economy in time is valuable, since the business of the school is not to test but to teach. Moreover it is not 10 minutes of every pupil's time that is needed. It is shown in the next section that with careful sampling design reliable estimates of national averages for an age group can be obtained from samples containing only one pupil in 400.

IV: Are the samples fair and accurate?

14. The essence of fairness in sampling is to make the draw by giving each member of the target population a specifiable chance of appearing in the sample. The chances need not be equal, but if they are unequal differential weighting is needed in compensation. Provided that the draw is made in this way the accuracy or representativeness of the sample can be assessed from the internal evidence that it contains. This can be done by working out the standard error of any estimate needed from the sample. In doing this account must be taken of the structure of the sample, and in particular of the number of stages in which it is drawn.
In all except the first of the surveys in this series (for which the samples were judgement samples) the draw has been made in either two or three stages. When it was desirable to localise the survey, to enable ancillary work to be more easily done, three stages were used, and two were used where this was not the case. The three stages consisted first of the selection of local education authority areas, secondly of the selection of schools from within selected areas, and finally of pupils from selected schools. Although beginning with the selection of areas is a great convenience it is fairly expensive in terms of increasing the standard errors, or, what amounts to the same thing, increasing the number of schools and children needed for standard errors of a given size. Experience in designing these and other surveys has shown that a good rough preliminary rule, at the design stage, is to assume that when areas and schools are stratified about three per cent of the variation will lie between areas, seven per cent between schools within areas, and the remainder between pupils within schools. In two stage sampling this reduces to 10 per cent between schools and 90 per cent between pupils within schools. These rules enable one to make preliminary estimates, for any allocation of the sample, either of the standard errors or of the simple equivalent sample - that is, a single stage sample of pupils that would give standard errors of the same size. Thus if a sample of 2,000 pupils comes from 100 schools in 20 local authority areas the simple equivalent sample is 380 pupils. If the 100 schools came from only five areas the simple equivalent sample would fall to 140. If it were a two stage sample with the 100 schools selected directly from the whole country the simple equivalent sample would be 690, while if the 2,000 were drawn from 200 schools instead of 100 the simple equivalent sample would be 1,050. These simple illustrations show the importance of having enough schools, as well as pupils, in the samples, and of having enough areas if the sampling is three stage.

15. The preliminary calculations cannot, of course, be left at that. After the event it is necessary to find out what the standard errors actually are, as distinct from what it was hoped, from the preliminary rule, that they would be. Owing to various complications, such as the fact that even after stratification schools vary a good deal in size, the posterior calculations may be fairly lengthy. But in fact the results have seldom differed much from those forecast by the preliminary rule. Indeed, for two stage sampling the rule of 90 for the pupil and 10 for the school has recently proved useful over a dozen countries.

16. Table 2 gives the standard errors for the mean scores in the various surveys.
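The simple equivalent sample figures in paragraph 14 can be reproduced with a short sketch. It assumes that the variance of the sample mean, as a fraction of the total variance, is the sum of each stage's share divided by the number of units drawn at that stage. The text does not state this formula, but it does yield the quoted figures. The function names are illustrative, and the sketch ignores complications such as unequal school sizes.

```python
import math

# Rough design-stage shares of the total variation (paragraph 14).
THREE_STAGE = {"area": 0.03, "school": 0.07, "pupil": 0.90}
TWO_STAGE = {"school": 0.10, "pupil": 0.90}


def variance_factor(pupils, schools, areas=None):
    """Variance of the sample mean as a fraction of the total variance."""
    if areas is None:
        # Two stage: schools drawn directly from the whole country.
        return TWO_STAGE["school"] / schools + TWO_STAGE["pupil"] / pupils
    return (THREE_STAGE["area"] / areas
            + THREE_STAGE["school"] / schools
            + THREE_STAGE["pupil"] / pupils)


def simple_equivalent_sample(pupils, schools, areas=None):
    """Size of the single stage sample with the same standard error."""
    return 1.0 / variance_factor(pupils, schools, areas)


# The four illustrations from paragraph 14 (the text rounds to the nearest 10).
print(round(simple_equivalent_sample(2000, 100, areas=20)))  # text quotes 380
print(round(simple_equivalent_sample(2000, 100, areas=5)))   # text quotes 140
print(round(simple_equivalent_sample(2000, 100)))            # text quotes 690
print(round(simple_equivalent_sample(2000, 200)))            # text quotes 1,050

# Cross-check of the two initial estimates quoted in paragraph 18: the
# standard deviation implied by 0.59 for the 1948 allocation, applied to
# the 1956 allocation, should give roughly the quoted 0.31.
implied_sd = 0.59 / math.sqrt(variance_factor(2800, 80, areas=4))
se_1956 = implied_sd * math.sqrt(variance_factor(1374, 138, areas=23))
print(round(se_1956, 2))
```

The area term dominates whenever few areas are drawn, which is why cutting from 20 areas to five costs far more than any plausible change in the number of pupils.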
17. The 1948 sample has been treated as if it were a probability sample, though it was in fact a judgement sample. The reason for its large standard error is that it was a three stage sample with only a small number (four) of local authorities. The 1952 sample was based on 15 local authorities, and the 1956 on 23. The sample of 1964 was a two stage sample with schools drawn direct from the whole country.

18. The simple rule of three per cent for the area, seven per cent for the school, and 90 per cent for the pupil gives an initial estimate of 0.59 for the standard error when applied to the four areas, 80 schools and 2,800 pupils of the 1948 sample. Applied to the 23 areas, 138 schools and 1,374 pupils of the 1956 sample it gives an initial estimate of 0.31. It will be seen that both these initial estimates are remarkably close to the final estimates given in Table 2 above. This illustrates the value of the rough rule for sampling design.

19. Combining the evidence gives the standard error of the gain in mean score as 18 per cent of the gain, if we begin with the rather weak estimate for 1948. Over the probability samples proper from 1952-1964 the standard error of the gain is 14 per cent. The gain itself is 3.4 points over the 16 years, and it is remarkably steady, since it is made up of 0.8 and 0.9 for the first two periods of four years, together with 1.7 for the final period of eight years. This suggests that it is likely to continue at the same rate for some time.

20. If we convert the gain from points to months we get 17 ± 3.0 as the gain in months from 1948. In the opening paragraph the gain in months of reading age was converted to an increase in the pace of learning by dividing it by the preceding length of school life. Taking account of the standard error gives (24 ± 4.2) per cent as the increase in the pace of learning. Allowing two standard errors each way to cover the luck of the draw in sampling would give upper and lower limits of 32 and 16 per cent, but this interval is probably unduly wide, since the size of the standard error depends mainly on the weak determination of 1948 and the estimates, including the 1948 estimate, are remarkably consistent. This makes it reasonable to believe that the gain is in fact very close to 24 per cent.
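The arithmetic of paragraphs 19 and 20 can be traced step by step in a short sketch. It takes the quoted figures as inputs and rounds at the same points as the text appears to, which is an assumption; the variable names are illustrative.

```python
POINT_GAIN = 3.4            # gain in mean score, 1948 to 1964 (paragraph 19)
MONTHS_PER_POINT = 5        # age allowance for juniors
SE_MONTHS = 3.0             # standard error of the gain, in months (paragraph 20)
SCHOOL_LIFE_MONTHS = 72     # from beginning school to the age of 11

# Gains per period (points, years) as quoted in paragraph 19.
periods = [(0.8, 4), (0.9, 4), (1.7, 8)]
total_points = round(sum(points for points, _ in periods), 1)
rates_per_year = [points / years for points, years in periods]

# Points to months of reading age.
gain_months = POINT_GAIN * MONTHS_PER_POINT
relative_se = SE_MONTHS / gain_months      # the "18 per cent" of paragraph 19

# Months to percentage increase in the pace of learning.
pace_pct = round(100 * gain_months / SCHOOL_LIFE_MONTHS)
pace_se_pct = round(100 * SE_MONTHS / SCHOOL_LIFE_MONTHS, 1)

# Two standard errors each way, starting from the rounded figures.
lower = round(pace_pct - 2 * pace_se_pct)
upper = round(pace_pct + 2 * pace_se_pct)

print(total_points, [round(r, 3) for r in rates_per_year])
print(round(gain_months), round(100 * relative_se))
print(pace_pct, pace_se_pct, lower, upper)
```

The yearly rates for the three periods lie close together, which is the steadiness the text refers to.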
References

(1) Standards of Reading 1948-1956, HMSO 1957.
(2) Progress in Reading 1948-64, HMSO 1966 (which incorporates this account).