Grade retention and educational attainment, M Belot, V Vandenberghe

Content: Grade retention and educational attainment Exploiting the 2001 Reform by the French-Speaking Community of Belgium and Synthetic Control Methods M. Belot and V. Vandenberghe Discussion Paper 2009-22
Michèle Belot, Oxford University, Nuffield Centre for Experimental Social Sciences (CESS), Nuffield College
Vincent Vandenberghe, Université catholique de Louvain (UCL), Economics School of Louvain (ESL)
Abstract

This paper evaluates the effects of grade retention on attainment by exploiting a reform introduced in 2001 in the French-Speaking Community of Belgium whereby the possibility of grade retention in grade 7 was reintroduced. It uses the Synthetic Control Method to identify the best possible pre-treatment control. Data come from three waves of the PISA study (corresponding to periods before and after the reform) that contain test scores of representative samples of 15-year-olds. These are used essentially to answer two questions. First, has the 2001 grade repetition reform at least succeeded at filtering out weaker pupils, i.e. pupils who would presumably be disadvantaged by being promoted directly to higher grades? This is a minimum condition for grade retention to be justifiable. Second, do these treated students achieve better/worse when they repeat (and attend a lower grade) than when they are socially promoted (and attend the age 15 reference grade 10)? We find significant evidence of positive screening, but we fail to demonstrate that those filtered out perform differently under the grade repetition regime than under the social promotion regime.

JEL: I20, I28, H52
Keywords: Grade retention, educational attainment, synthetic control method
1. Introduction
Grade retention (or repetition) is the object of an ongoing debate in many developed countries. Some countries privilege a system of social promotion, which allows pupils to be promoted to higher grades independently of their performance, while other countries have instituted more or less strict policies of grade retention, conditioning promotion to higher grades on educational achievements. As a consequence, there is a considerable variation in grade retention rates1 across OECD countries (Figure 1). Countries/entities like the Netherlands, Austria, Portugal and the French-Speaking Community of Belgium have relatively high rates of grade retention (going up to 50% of pupils having repeated a year or more by the time they reach the end of compulsory schooling); while countries like Denmark, Sweden, Japan, Norway and the UK have no grade retention at all.
Grade retention imposes a cost on society, both in terms of the opportunity costs borne by pupils who are forced to repeat a year and in terms of teaching resources. Indeed, grade retention often implies larger class sizes and more pressure on the (limited) teaching resources. Overall, pedagogues are generally sceptical about the effectiveness of grade retention and the ensuing differences in the grade attended by pupils (McCoy and Reynolds, 1999). They argue that grade retention has negative effects on self-esteem and academic performance, and even on non-academic outcomes such as crime and teenage pregnancy. On the other hand, the proponents of grade retention argue that it may have motivational effects on pupils, with the threat of being retained playing the role of a stick.
There is a large amount of evidence showing a negative association between grade retention and educational outcomes. Holmes (1989), in a large meta-analysis, finds that, on average, later test scores of children retained in lower grades are 0.19 to 0.31 standard deviations lower than those of similar children progressing normally through school. The same negative results are reported in a subsequent meta-analysis by Jimerson (2001). There is also a large amount of evidence of a negative relationship between retention and high school (i.e. upper secondary) dropout (e.g. Grissom and Shepard, 1989; Roderick, 1994; Jimerson, 1999). The challenge of this literature is of course that grade retention and educational outcomes are likely to be simultaneously determined, which often compromises the identification of a causal effect.

1 Defined as the share of pupils aged 15 attending a below-reference grade.
There are a few studies providing quasi-experimental evidence on the effects of grade retention. Eide and Showalter (2001) use the variation in the age of entry into kindergarten across US states as an instrument for retention. They find that for white students, grade retention may have some benefit by both lowering dropout rates and raising labour market earnings, although the IV estimates tend to be statistically indistinguishable from zero. Three studies (Jacob and Lefgren, 2004, 2009; Roderick and Nagaoka, 2005) exploit a discontinuity in the retention decision under Chicago's high-stakes testing policy2 introduced in 1996-97. The policy created a discontinuity in the relation between scores in a single standardised test (hence the label "high stakes") and the probability of grade retention. Using a regression discontinuity design, these studies evaluate the effects of grade retention on pupil performance at different points in time. Jacob and Lefgren (2004) find no systematic differences in performance between retained and promoted students in the short run. Roderick and Nagaoka (2005) show that third-grade students who were retained did not achieve higher language test scores two years after the retention, and that retained sixth graders had lower achievement growth. Jacob and Lefgren (2009) find that grade retention leads to a modest increase in the probability of dropping out for older students, but has no significant effect on younger students. Finally, Manacorda (2008) exploits a discontinuity induced by a rule in Uruguayan junior high schools establishing automatic grade retention for students missing more than 25 days, and shows that grade retention leads to a substantial increase in drop-out and lower educational attainment even 4 or 5 years later.
In this paper we exploit a reform in the French-Speaking Community of Belgium in 2001 that (re)introduced the possibility of grade retention at the end of both grade 7 and grade 8. Before then3, grade retention was not allowed at the end of grade 7. The reintroduction of grade retention in 2001 provides a natural experiment to evaluate the effects of grade retention. We use information from the PISA study, measuring performance in a standardised test across OECD countries in Maths, Reading and Science at the age of 15. Pupils who have not repeated a year should then be in grade 10, thus three grades further than the one affected by the reform. We are able to compare results before the reform (PISA 2000 and 2003) and after the reform (PISA 2006), which is a major advantage in comparison to existing studies. This enables us to compare two different regimes, with and without grade retention.

2 In the mid-1990s, the Chicago Public Schools declared an end to social promotion (i.e. no grade repetition sanctions) and instituted promotional requirements based on standardised test scores.
3 More precisely, in the period 1995-2001.
We first find that the 2001 decision did lead to a statistically significant change in how 15-year-olds are assigned to grades. The reform led to a reduction in the likelihood of reaching grade 10 at the age of 15 (i.e. no grade retention record), and symmetrically, to an increase in the likelihood of attending lower grades (i.e. grade 9, 8 or 7).
Compared with many studies, ours also presents the advantage of assessing the medium-term effects of grade retention. The reform we examine has (exogenously) changed the likelihood of grade repetition in grade 7 at the age of 12, and we examine the effect of this reform when students are aged 15. However, since these pupils have not yet reached the end of compulsory schooling, we cannot assess the effects of the reform on final educational achievements. Comparing same-age (retained vs promoted) pupils in the medium run remains problematic, because they are by definition at different stages of the curriculum.4
However, we can test for one necessary condition for grade retention to be justifiable, which is that it should at least succeed in filtering out weaker students from passing to higher grades. That is, in order to provide any grounds for grade retention, one should at least be able to show that, at grade 10, the distribution of scores under a grade retention regime is better than under a social promotion regime.5 We show below that there is supporting evidence for such a filtering-out effect of the reform.
The data also allow us to compare the attainment of those filtered out under the grade repetition regime vs. the social promotion regime. This allows us to shed some light on two conflicting forces affecting grade repeaters: i) a (negative) curriculum effect, as repeating a grade means being exposed to a poorer/less demanding curriculum than the one taught in the (higher) reference grade6; and ii) a (positive) lower-ability/less-demanding curriculum matching effect. The latter effect directly echoes the argument of the proponents of grade repetition: weaker pupils should benefit from being exposed longer to a simpler curriculum that better matches their ability and/or attainment.

4 They attend different grades, as can be seen in Table 2 for instance.
5 Synonymous with no grade-repetition sanctions.
6 Grade 10 in Belgium at the age of 15.

As to the methodology used in this paper, it is important to stress that the main results are based on the synthetic control (SC) method (Abadie, Diamond, Hainmueller, 2007), which uses data-driven procedures to construct an adequate comparison group/counterfactual. In practice, it is difficult to find a single unexposed unit (here an educational system) that approximates the most relevant characteristics of the French-Speaking Community of Belgium's education system and would provide a counterfactual. The idea behind the synthetic control approach implemented here is that a combination of countries (a synthetic control) offers a better comparison than any single country/entity alone (say the Flemish-Speaking Community of Belgium, France, Germany or the Netherlands).

The remainder of this paper is organised as follows. Section 2 is introductory and mainly consists of stylised facts. It documents the international evidence on retention rates and overall PISA scores. It essentially shows that there is no correlation between cross-country variance in grade assignment of 15-year-olds and (1) their average score and (2) the dispersion of their scores. Section 3 presents the 2001 reform in the French-Speaking Community of Belgium and documents its impact on the incidence of grade retention using both administrative data and various waves of the PISA survey. It then examines the relationship between (more) grade retention and PISA scores in the French-Speaking Community of Belgium, using the SC method to generate the best possible counterfactual. The plausibility of a filtering-out assumption is examined first. Second, the paper looks at how the score of filtered-out students compares under the two regimes. Section 4 concludes.

2. Grade assignment and grade retention: the international evidence

The different OECD countries that participated in the three waves of PISA (2000, 2003, 2006) provide a relatively large source of variance as to the incidence of grade retention (see Annex 1 to 3). Using country-level aggregate data, it is easy, in Figure 1, to see how the share of pupils attending the grade of reference7 (our proxy for the intensity of grade retention)8 relates to score. Figure 1 basically suggests an absence of correlation between the importance of grade retention (i.e. a leftward shift) and average score in math. Similar results are obtained for reading and science scores. Note incidentally that Figure 2 conveys the same information about the relationship between grade retention and the standard deviation of PISA scores.

However, country-specific unknown factors may be systematically correlated with i) the (varying) propensity of countries to resort to grade retention and ii) scores. Under these circumstances the results of an analysis exploiting the inter-country variance are bound to be biased. This is why it is worth focusing solely on the intra- (or within-) country variance. This is made possible by the availability of three consecutive waves of the PISA survey (2000, 2003 and 2006). By exploiting the country-level panel structure of PISA, it is thus possible to re-examine the relationship we are interested in. Descriptive results are displayed in Figure 3 and Figure 4. They tend to confirm the absence of a relationship between the (within-country) evolution of score from 2000 to 2006 and the changing proportion of pupils who attend Grade 10 at the age of 15.

The descriptive results on display in Figures 3 and 4 are confirmed by the OLS estimation of equation [1] (Table 1). The latter uses the same data aggregated at country level. It includes country fixed effects to retain the within-country part of the variance. The list of controls includes a year trend, which captures changes that are common to the whole group of countries sampled, and a vector of socio-economic background variables (Table 1). Note finally that this model is estimated separately for each of the topics covered by PISA (Math, Science and Reading literacy).
7 Grade 10 in most countries, grade 9 otherwise. The grade of reference is identified as the most attended one among 15-year-olds who participated in PISA.
8 Of course, differences could also be due to differences in school entry ages. In the case of Belgium, with rare exceptions, pupils enter grade 1 during the calendar year they turn 6. The exact cut-off date is the 1st of January. All the pupils that have reached the age of 6 before that date must start grade 1 during the calendar year that ends on the 1st of January.
$$Y_{i,t} = \alpha + \beta\, SREFG_{i,t} + Z'_{i,t}\gamma + \delta\, YEAR_t + \eta_i + \varepsilon_{i,t} \qquad [1]$$

with $i = 1, \dots, J$ and $t = 2000, 2003, 2006$,

where
- $Y_{i,t}$ is the average PISA score of country i during year t;
- $SREFG_{i,t}$ is the share of pupils attending the reference grade9 in country i during year t;
- $YEAR_t$ is the year of observation, capturing a trend that would be common to all countries;
- $Z'_{i,t}$ is a vector of controls that includes the average parental socio-economic background index and educational attainment;
- $\eta_i$ is the country i fixed effect;
- and $\varepsilon_{i,t}$ is a random error term centred on zero.
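To make the role of the country fixed effects in equation [1] concrete, the following minimal sketch computes a within (fixed-effects) slope for a single regressor. It is an illustration only: it omits the controls Z and the year trend, the function name `within_slope` is hypothetical, and the six observations are made up rather than taken from PISA.

```python
from collections import defaultdict

def within_slope(rows):
    """Fixed-effects (within) slope of y on x.

    rows: iterable of (country, x, y) tuples. Both variables are demeaned
    by country, which removes any country-specific constant; the slope is
    then the OLS slope computed on the demeaned data.
    """
    by_country = defaultdict(list)
    for country, x, y in rows:
        by_country[country].append((x, y))
    sxy = sxx = 0.0
    for obs in by_country.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    return sxy / sxx

# Made-up panel: (country, share in reference grade, mean score).
# Country B scores about 100 points higher for reasons unrelated to x.
toy_panel = [
    ("A", 0.60, 480.0), ("A", 0.65, 481.0), ("A", 0.70, 482.0),
    ("B", 0.90, 580.0), ("B", 0.85, 579.0), ("B", 0.80, 578.0),
]
beta_within = within_slope(toy_panel)
```

In this toy panel the level difference between the two countries plays no role in the estimate, because only deviations from each country's own mean enter the slope.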
A major limitation however is that a within (country) restriction, as we imposed in the previous section (Figure 3 & 4 or equation [1]), could prove insufficient to properly identify the effect on scores of the grade-assignment regime. Indeed, changes observed within a country over time may be driven by unobserved confounding factors that are correlated with scores, like a better economic environment (insufficiently or inadequately captured by the observables Z). Thus, ideally the identification of the effects of grade retention requires not only an exogenous change in grade repetition, but also the existence of a counterfactual for comparison. This is why we now propose an analysis comparing the changes observed in the French Community to the changes observed in a control group.
9 Grade 10 in the French-Speaking Community of Belgium, as in most of the other countries considered here.
Figure 1 - Average score in math and share of pupils aged 15 attending reference grade (a). Year 2006
[Scatter plot of countries, labelled by country code. Vertical axis: PISA mean score (350 to 600). Horizontal axis: share of pupils attending reference grade (0 to 1).]
a) Grade 10 in most countries, grade 9 otherwise. The grade of reference is identified as the most attended grade among 15-year-olds who participated in PISA.
ARG: Argentina; AUS: Australia; AUT: Austria; AZE: Azerbaijan; BFR: French-Speaking Community of Belgium; BFL: Flemish-Speaking Community of Belgium; BGR: Bulgaria; BRA: Brazil; CAN: Canada; CHE: Switzerland; CHL: Chile; COL: Colombia; CZE: Czech Republic; DEU: Germany; DNK: Denmark; ESP: Spain; EST: Estonia; FIN: Finland; FRA: France; GBR: United Kingdom; GRC: Greece; HKG: Hong Kong-China; HRV: Croatia; HUN: Hungary; IDN: Indonesia; IRL: Ireland; ISL: Iceland; ISR: Israel; ITA: Italy; JOR: Jordan; JPN: Japan; KGZ: Kyrgyzstan; KOR: Korea; LIE: Liechtenstein; LTU: Lithuania; LUX: Luxembourg; LVA: Latvia; MAC: Macao-China; MEX: Mexico; MNE: Montenegro; NLD: Netherlands; NOR: Norway; NZL: New Zealand; POL: Poland; PRT: Portugal; QAT: Qatar; ROU: Romania; RUS: Russian Federation; SRB: Serbia; SVK: Slovak Republic; SVN: Slovenia; SWE: Sweden; TAP: Chinese Taipei; THA: Thailand; TUN: Tunisia; TUR: Turkey; URY: Uruguay; USA: United States.
Source: PISA 2006
Figure 2 - Standard deviation of score in math and share of pupils aged 15 attending reference grade (a). Year 2006
[Scatter plot of countries, labelled by country code. Vertical axis: PISA standard deviation of score (60 to 110). Horizontal axis: share of pupils attending reference grade (0 to 1).]
a) Grade 10 in most countries, grade 9 otherwise. The grade of reference is identified as the most attended grade among 15-year-olds who participated in PISA.
Source: PISA 2006
Figure 3 - Within-country change of the share of age 15 pupils attending reference grade and change of average score. Math.
[Scatter plot. Vertical axis: change in PISA score mean (-40 to 40). Horizontal axis: change of the share of pupils attending reference grade (-0.3 to 0.4).]
Source: PISA 2000, 2003 and 2006
Figure 4 - Within-country (statistically significant) change of the share of age 15 pupils attending reference grade and change of standard deviation of score. Math.
[Scatter plot. Vertical axis: change of PISA score standard deviation (-20 to 20). Horizontal axis: change of the share of pupils attending reference grade (-0.3 to 0.4).]
Source: PISA 2000, 2003 and 2006
Table 1 - Shares of pupils in reference grade and PISA scores (OLS coefficients). Within analysis.

| Variable | Country average score: math | Country average score: read | Country average score: scie | Country standard deviation: math | Country standard deviation: read | Country standard deviation: scie |
|---|---|---|---|---|---|---|
| Share of pupils attending reference grade (a) | 0.15 | -0.10 | -0.12 | -0.23 | -0.09 | -0.16 |
| (p-value) | (-0.460) | (0.792) | (0.575) | (0.003) | (0.469) | (0.038) |
| R2 | 0.97 | 0.86 | 0.92 | 0.78 | 0.62 | 0.74 |
| N obs | 132 | 129 | 132 | 132 | 129 | 132 |

P-values are in brackets. Controls include country fixed effects, year, average highest parental socio-economic index, average highest degree of father, average highest degree of mother.
a) Grade 10 in most countries, grade 9 otherwise. The grade of reference is identified as the most attended grade among 15-year-olds who participated in PISA.
Source: PISA 2000, 2003 and 2006
3. Exploiting the French-Speaking Community reform

3.1. The 2001 reform in the French-Speaking Community of Belgium: a source of exogenous variation of grade retention
Grade repetition/retention and different-grade assignment of same-age pupils have existed for a long time in Belgium, and are particularly frequent in the French-Speaking Community (Figure 1).10 The retention decision is based on the teachers' assessment of the pupil's ability to move up to a higher grade. There is no standardised test used across schools, nor is there a clearly defined threshold to determine whether a pupil should be retained or not. All pupils do take exams at the end of the school year, for each subject, and the retention decision is made after these exams have been taken.
Opponents of grade retention succeeded in 1995 in abolishing grade retention at the end of grade 7 (1st year of secondary education). From 1995 to 2001 no grade retention was allowed at the end of grade 7, a decision that translated into a sharp fall in the number of repeaters (Figure 5). During that period, grade retention sanctions could only be imposed at the end of grade 8. Pupils could repeat grade 7 only upon agreement between parents and teachers. This is why Figure 5 shows some persistence of grade retention at the end of grade 7 during the 1995-2001 period.
The proponents of grade retention made a successful comeback in September 2001, when the decision was taken11 to re-establish the possibility to retain weak students in grade 7. In a few words, the 2001 reform was such that after the school year 2001-02 it became possible to repeat grade 7 or grade 8, although not both.12 Administrative data (Figure 5) show that the number of pupils repeating grade 7 consequently rose sharply from the school year 2002-03 onwards. The same data also show that the total number of students repeating grade 7 or grade 8 is substantially higher after 2001, meaning that the 2001 reform generated an overall increase in the risk of being retained in a lower grade.

10 Belgium is a federal state where educational policy is split along linguistic lines. Each linguistic community is in charge of its educational system. Only minor aspects of educational policy (like the age of compulsory education) remain under federal jurisdiction.
11 Décret relatif à l'organisation du premier degré de l'enseignement secondaire, D. 19-07-2001, M.B. 23-08-2001.
12 Formally, the legislator insists on the fact that the reform's aim was not exactly to force the pupils to repeat the year, but to channel weaker students (who did not achieve satisfactory results at the end of grade 7 or at the end of grade 8) towards a complementary year. In practice, however, it amounts to imposing that these students take more time before moving to the upper grade.
Thus, the 2001 reform enables us to compare a system with grade retention with a system with (almost) no grade retention in grade 7. Hereafter, we exploit the 2001 reform and investigate the medium-term13 (causal) effects of the reform on the PISA scores.
Figure 5 - Incidence of grade retention at Grade 7 and Grade 8. School year 1992-93 to 2003-04
[Chart titled "Grade repetition in French Community". Vertical axis: number of pupils repeating a year (0 to 7000). Horizontal axis: school years 1992-93 to 2003-04. Series: grade 7, grade 8.]
Source: French-Speaking Community of Belgium, Ministry of Education.
13 Remember that we look at age 15 scores to identify the effect of a decision that affected pupils when they were aged 12-13.
3.2. Using the Synthetic Control Method to generate a counterfactual

To assess the effects of the reform, we use a synthetic country (SC) as a control (Abadie, Diamond, Hainmueller, 2007). The method generalizes the commonly used difference-in-differences model. The SC method a priori uses all countries other than French-Speaking Belgium that participated in PISA as potential controls. The key idea is to identify a linear combination of the other i = 2 to J countries, $W = (w_2, \dots, w_J)$ such that $w_i \geq 0$ and $w_2 + \dots + w_J = 1$, that best reproduces the French-Speaking Community of Belgium (i.e. the treated entity) during the pre-reform period (i.e. 2000 and 2003), both in terms of average attainment Y and a list of observed controls Z that potentially affect attainment.14 The identification of the effect of the reform is achieved by comparing the post-reform observed average attainment of pupils in i) the French-Speaking Community of Belgium, $Y_1$, and ii) its synthetic equivalent, $Y_{SC} = \sum_{i=2}^{J} w_i Y_i$. Annex 1 explains how this is done analytically and why the post-treatment (i.e. 2006) first difference between the treated and the synthetic control entities properly identifies the effect of treatment in the presence of unobserved time effects and country effects that are not randomly distributed.
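The weight search described above can be sketched as a small constrained least-squares problem. The sketch below is an assumption-laden illustration, not the authors' implementation: it weights all predictors equally (the SC method of Abadie et al. also chooses predictor weights), it uses projected gradient descent as a convenient solver, and the three donor units, their predictor values and their post-reform scores are made up. The function names are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_j >= 0, sum_j w_j = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    threshold = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            threshold = candidate  # keep the threshold of the last valid k
    return [max(x - threshold, 0.0) for x in v]

def synthetic_control_weights(treated, donors, n_iter=2000):
    """Weights minimising sum_k (treated[k] - sum_j w[j] * donors[j][k]) ** 2
    subject to w_j >= 0 and sum_j w_j = 1, by projected gradient descent."""
    n_donors, n_features = len(donors), len(treated)
    # Conservative step size based on the squared Frobenius norm of the donors.
    step = 1.0 / (2.0 * sum(x * x for row in donors for x in row))
    w = [1.0 / n_donors] * n_donors
    for _ in range(n_iter):
        fitted = [sum(w[j] * donors[j][k] for j in range(n_donors))
                  for k in range(n_features)]
        residual = [treated[k] - fitted[k] for k in range(n_features)]
        grad = [-2.0 * sum(residual[k] * donors[j][k] for k in range(n_features))
                for j in range(n_donors)]
        w = project_to_simplex([w[j] - step * grad[j] for j in range(n_donors)])
    return w

# Made-up, already standardised pre-reform predictors (one row per donor).
donors_pre = [[1.0, 0.2, 0.0], [0.0, 1.0, 0.3], [0.4, 0.0, 1.0]]
treated_pre = [0.58, 0.40, 0.29]   # equals 0.5, 0.3, 0.2 mix of the donors
weights = synthetic_control_weights(treated_pre, donors_pre)

# Made-up post-reform average scores; the gap is the estimated effect.
donors_post = [510.0, 495.0, 520.0]
treated_post = 505.0
synthetic_post = sum(w_j * y_j for w_j, y_j in zip(weights, donors_post))
gap = treated_post - synthetic_post
```

Because the toy treated unit is an exact convex combination of the donors, the recovered weights are close to (0.5, 0.3, 0.2); with real data the pre-reform fit is only approximate.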
i) PISA Evidence of more grade retention as a consequence of the 2001 reform
Before turning to the implementation of this evaluation strategy, we need to complement the information highlighted in Figure 5 and check that the PISA data used here also contain robust evidence that the reform has generated some change in the French-Speaking community of Belgium in the likelihood of experiencing grade repetition.
Table 2 reports the distribution of pupils aged 15 according to their grade in French-Speaking Belgium and in the synthetic control entity. We see that in the French-Speaking Community of Belgium, fewer pupils aged 15 reached grade 10 in 2006 (i.e. after the reform) than in 2003 or 2000 (before the 2001 reform), and, symmetrically, that more pupils were below grade 10.

14 Our list of controls/predictors includes the student/teacher ratio, the ratio of computers to school size, the % of teachers with proper certification, mother's education, father's education, the highest parental socio-economic index (HISEI), and the share of pupils attending the reference grade prior to the reform.
The frequencies reported in Table 2 are a direct sign that more grade retention (with lasting effects) occurred in the French-Speaking Community of Belgium, as the only way to be in grade 10 at age 15 is to have a no-grade-retention record. In short, all this accords with the grade-retention regime change introduced in the French-Speaking Community of Belgium in September 2001.
Table 2 - Share of pupils aged 15 attending grade 10 vs. grade < 10 (%) in the French-Speaking Community of Belgium

| Year | French-Speaking Community of Belgium | Notation |
|---|---|---|
| 2000 | 0.59 | $\pi_{00}$ |
| 2003 | 0.59 | $\pi_{03}$ |
| 2006 | 0.55 | $\pi_{06}$ |

Source: PISA 2000, 2003 and 2006
ii) The screening-out test

To be justifiable, grade retention should at least succeed in filtering out weaker students from passing to higher grades. That is, in order to provide any grounds for grade retention, one should at least be able to show that, conditional on grade, the distribution of scores under a grade retention regime is on average better than the distribution of scores under a social promotion regime.15 We focus here on Grade 10 and use the SC method to generate the counterfactual. Estimated weights, for each of the models estimated here using the SC method, are reported in the first table of Annex 9.
15 Those who make it to grade 10, for instance, should be, ceteris paribus, better than under the less selective regime.
Assume that there are (potentially) two categories of students attending grade 10. First, the non-delayed students, unaffected16 by the grade-repetition regime change. These always attend grade 10 at the age of 15. Let us denote their average score $Y^{ND}_{10}$. The second group consists of the $0 < \mu < 1$ students directly affected by the reform (the treated students hereafter). Their average score in 2006 (in grade 10) is $Y^{T}_{10}$.

Assume further that $Y^{1}_{10}$ is the post-reform observed score average17 of grade 10 students from the French-Speaking Community of Belgium. It is solely driven by the attainment of non-delayed students:

$$Y^{1}_{10} \equiv Y^{ND}_{10} \qquad [2]$$

By comparison, the grade 10 score average in the synthetic French-Speaking Belgium (estimated using the data-driven SC procedure set out above) should be a linear combination of i) the score of non-delayed/non-treated students and ii) that of treated students (those who reached grade 10 at the age of 15 thanks to a less stringent grade-repetition regime before the implementation of the 2001 reform). Note that $\lambda$, the fraction of the cohort that has been treated normalised by the size of grade 10 in 2006 ($\pi_{06}$) so as to properly capture the weight of the treated in the 2006 grade 10 average, can be estimated using the Table 2 figures: (58.68-55.32)/55.32 = 0.0608, i.e. about 6.1%.

$$Y^{SC}_{10} \equiv (1-\lambda)\, Y^{ND}_{10} + \lambda\, Y^{T}_{10} \qquad [3]$$

where $\lambda = \mu/\pi_{06}$ and $\mu = \pi_{03} - \pi_{06}$, as reported in Table 2.
Now, turning to grade 10 score average differences, using [2] and [3] we get:

$$Y^{1}_{10} - Y^{SC}_{10} = Y^{ND}_{10} - (1-\lambda)\, Y^{ND}_{10} - \lambda\, Y^{T}_{10} \qquad [4]$$

or equivalently,

$$Y^{1}_{10} - Y^{SC}_{10} = \lambda\, (Y^{ND}_{10} - Y^{T}_{10}), \quad \text{where } \lambda = \mu/\pi_{06} \qquad [5]$$

16 We leave aside spillover effects.
17 All results presented hereafter use students' average scores (i.e. the unweighted average of their math, science and reading PISA scores).
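The step from [4] to [5] only collects the terms in the non-delayed students' score. Spelled out, writing $\lambda$ for the share of treated students normalised by the size of grade 10 in 2006 (the Greek symbols here are a notational assumption, since they are not legible in every copy of the text):

```latex
\begin{aligned}
Y^{1}_{10} - Y^{SC}_{10}
  &= Y^{ND}_{10} - (1-\lambda)\,Y^{ND}_{10} - \lambda\,Y^{T}_{10} \\
  &= \lambda\,Y^{ND}_{10} - \lambda\,Y^{T}_{10} \\
  &= \lambda\,\bigl(Y^{ND}_{10} - Y^{T}_{10}\bigr)
\end{aligned}
```

With $\lambda$ of about 0.061, an observed grade 10 difference of one point therefore corresponds to an underlying gap between non-delayed and treated students of roughly 16 points.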
The estimation of the left-hand side of expression [5] gives a direct indication of the score gap between the treated and the non-treated students. If $Y^{1}_{10} - Y^{SC}_{10}$ is significantly positive, then one can conclude that the score of the treated/socially promoted students was below that of those who usually attend grade 10 at the age of 15. In that case, it can be inferred that the reform properly managed to filter out weaker students (those who presumably could benefit from a less demanding curriculum or extra time).

An important restriction is that [5] provides an estimate of the $\lambda$-weighted relative performance of treated students within grade 10. Hence, it is likely that the 1.73 gap reported on the first line of Table 3 underestimates the actual score gap. A short-cut strategy to cope with this bias consists of dividing $Y^{1}_{10} - Y^{SC}_{10}$ by $\lambda = 0.061$. This rapid transformation suggests a positive score gap of 28 points.18 But the 1.73-point estimate on the first line of Table 3 is so uncertain (p-value = 0.41) that there could be no effect at all, or even a positive one.

Further econometric analysis is highly desirable to test the plausibility of that result. Our strategy in that respect is simple. It consists of incrementally eliminating the upper percentiles of the grade 10 distribution of scores (Figure 6) to increase the (relative) weight of treated students in the comparison. The crucial assumption is that treated students must be concentrated at the bottom of the grade 10 distribution. Formally, we estimate:

$$\left( Y^{1}_{10} - Y^{SC}_{10} \right) \,\big|\, \left( Y_{10} \leq \theta\text{-th perc.} \right) = \lambda(\theta)\, \left( Y^{ND}_{10} - Y^{T}_{10} \right) \qquad [6]$$

with
- $\theta$ ranging from 90 to 10;
- $\lambda(\theta) = \mu / (\theta\, \pi_{06})$, with $\theta$ expressed as a fraction, assuming all treated pupils belong to the retained percentiles.

Results are reported in Table 3. They contain statistically significant evidence19 that $Y^{ND}_{10} > Y^{T}_{10}$ and thus that more grade repetition after the September 2001 reform has primarily led to the retention of students who had PISA scores below the grade 10 average.

Figure 6 - Increasing the chances of identifying the sorted-out students: eliminating the upper end of the Grade 10 score distribution.20
[Figure: distribution of Grade 10 PISA scores.]

18 PISA scores have an (international) average of 500 and a standard deviation of 100.
19 See Annex 5 for a presentation of inference analysis/hypothesis testing with SC.
20 The actual score distribution is reported in Annex 6.
Table 3 - Grade 10 change of attainment between French-Speaking Belgium and synthetic French-Speaking Belgium as an estimate of the treated vs non-treated student gap. Year 2006$

Sample    Y French-Speaking Belgium   Y synthetic   Y_dif      p-value   Treated share (µ)   Retained share   µ / retained share   Y_dif / (µ / retained share)
All obs   541.66                      539.93        1.73       0.4084    0.061               1.00             0.061                28.48
<=p90     528.76                      528.02        0.74       0.5431                        0.90             0.068                10.98
<=p80     517.50                      507.20        10.30***   0.0000                        0.80             0.076                135.41
<=p70     506.19                      502.26        3.93**     0.0200                        0.70             0.087                45.21
<=p60     494.15                      486.87        7.28***    0.0001                        0.60             0.101                71.79
<=p50     481.38                      477.18        4.20**     0.0132                        0.50             0.122                34.52
<=p40     466.97                      456.29        10.68**    0.0000                        0.40             0.152                70.21
<=p30     449.92                      438.77        11.15***   0.0000                        0.30             0.203                54.97
<=p20     427.71                      422.00        5.70**     0.0127                        0.20             0.304                18.74
<=p10     394.68                      397.97        -3.29      0.3477                        0.10             0.608                -5.40

Source: PISA 2000, 2003 and 2006
*** Significant at 1%, ** significant at 5%, * significant at 10%
$ Score comparisons for the years 2000 to 2006 for a selection of estimated models are on display in Annex 7, whereas Annex 8 displays the comparison of predictor/control variables.
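The rescaling behind the last two columns of Table 3 can be illustrated with a few lines of Python. This is only a sketch: the function names are hypothetical, the rows are a subset copied from the table, and µ is taken at its rounded value of 0.061, so the output may differ slightly from the printed figures.

```python
# Illustrative re-computation of the rescaled gaps of Table 3 (not the authors' code).
MU = 0.061  # rounded share of treated students within grade 10, as quoted in the text

# (retained share of the grade 10 distribution, Y_dif) copied from a few rows of Table 3
ROWS = [(1.00, 1.73), (0.90, 0.74), (0.80, 10.30), (0.50, 4.20), (0.10, -3.29)]


def adjusted_share(mu, retained):
    """Share of treated pupils among retained pupils, assuming no treated pupil is dropped."""
    return mu / retained


def scaled_gap(y_dif, mu, retained):
    """Score gap per treated pupil implied by the diluted difference y_dif."""
    return y_dif / adjusted_share(mu, retained)


for retained, y_dif in ROWS:
    print(f"retained={retained:.2f}  "
          f"adjusted share={adjusted_share(MU, retained):.3f}  "
          f"scaled gap={scaled_gap(y_dif, MU, retained):.2f}")
```

The smaller the retained share, the larger the weight assumed for treated pupils and the smaller the factor by which the raw difference is inflated.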
iii) How do filtered-out students fare?
Do the filtered out students achieve better/worse when they repeat (and attend a lower grade) than when they are socially promoted (and attend the reference grade 10)? An answer to that question can be provided by comparing French-Speaking Belgium's overall (i.e. all grades pooled) score average to its synthetic counterfactual.21
Assume now that there are three categories of students making up the population of both grade 10 and grade <10. First, the non-delayed students, unaffected22 by the grade repetition regime change. These always attend grade 10 at the age of 15. We keep denoting their average score \(Y^{ND}_{10}\). Another group -- also unaffected by the regime change -- consists of the delayed students, who always attend grade <10. Their score is denoted \(Y^{D}_{<10}\). The third group consists of the students directly affected by the reform (again, the treated students). Their average score in grade <10 in 2006, after the reform, is \(Y^{T}_{<10}\).
21 The computed weights used to build synthetic controls are presented in Annex 9.
22 Again, we leave aside spillover effects.
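Assume further that \(Y^{1}\) represents the post-treatment (i.e. 2006) observed overall score average in the French-Speaking Community of Belgium:
\[ Y^{1} \equiv \pi_{0}\,Y^{ND}_{10} + (1-\pi)\,Y^{D}_{<10} + \tau\,Y^{T}_{<10} \tag{7} \]
with \(\pi_{0} + \tau = \pi\), where \(\pi_{0}\) is the share of non-delayed students, \(1-\pi\) the share of delayed students and \(\tau\) the share of treated students.
The synthetic counterfactual -- corresponding to the case where, due to the absence of the equivalent of the 2001 reform, treated students are in grade 10 -- writes:
\[ Y^{1}_{SC} \equiv \pi_{0}\,Y^{ND}_{10} + \tau\,Y^{T}_{10} + (1-\pi)\,Y^{D}_{<10} \tag{8} \]
with \(\pi_{0} + \tau = \pi\).
The difference between these two averages [7], [8] is equal to
\[ Y^{1} - Y^{1}_{SC} = \tau\,(Y^{T}_{<10} - Y^{T}_{10}) \tag{9} \]
with \(0 < \tau \ll 1\).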
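The algebra leading from [7] and [8] to [9] can be checked numerically. In the sketch below the shares 0.586 and 0.034 are the values quoted in the text, while every score and all variable names are hypothetical and serve only to show that the two overall averages differ by the treated share times the change in the treated students' score.

```python
# Numerical check of the decomposition in [7]-[9]; all scores are hypothetical.
def overall_mean(share_nd, share_d, share_t, y_nd, y_d, y_t):
    """Weighted average of the three groups' scores; the shares must sum to one."""
    assert abs(share_nd + share_d + share_t - 1.0) < 1e-12
    return share_nd * y_nd + share_d * y_d + share_t * y_t


share_grade10_no_reform = 0.586  # non-delayed plus treated students (from the text)
tau = 0.034                      # treated share (from the text, rounded)
share_nd = share_grade10_no_reform - tau
share_d = 1.0 - share_grade10_no_reform

y_nd_10 = 540.0    # hypothetical score of non-delayed students in grade 10
y_d_lt10 = 430.0   # hypothetical score of delayed students in grade <10
y_t_lt10 = 470.0   # hypothetical score of treated students when retained
y_t_10 = 480.0     # hypothetical score of treated students when promoted

y_observed = overall_mean(share_nd, share_d, tau, y_nd_10, y_d_lt10, y_t_lt10)
y_synthetic = overall_mean(share_nd, share_d, tau, y_nd_10, y_d_lt10, y_t_10)
difference = y_observed - y_synthetic

print(difference, tau * (y_t_lt10 - y_t_10))
```

With these made-up scores a 10-point loss for treated students moves the overall average by only a third of a point, which is why the raw difference has to be rescaled by the treated share.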
The first line of Table 4 reports estimates of [9]. They suggest a minor attainment decrease (Y_dif) of about 0.29 points, which is far from statistically significant (p-value = 0.85). But again, these results consist of averages that are computed with the scores of all pupils. They implicitly (and wrongly) assume that all pupils have been "treated".
Turning back to the frequencies of Table 2, it is more likely that only a small fraction of the cohort has been directly23 treated: the treated share, i.e. the drop in the share of pupils attending grade 10 (0.586 - 0.553 = 0.0336), is about 3.4%. Hence, average-based comparisons of the kind reported in Table 3 -- and also in previous sections -- are unlikely to properly reveal the true magnitude of treatment on those who have really been treated.

23 We leave aside spillover effects.
To cope with this problem, we follow a strategy that is similar to the one used just above. It consists of incrementally eliminating the upper (and lower) percentiles of the distribution of scores within each grade (Figure 7)24 to lift the (relative) weight of treated students in the averages that are compared with the SC method. The crucial assumption is now that the treated students must be concentrated above and below the grade <10/grade 10 cut-off zone.

Figure 7 - Increasing the chances of identifying the treated students: eliminating upper and lower parts of the grade-specific score distribution
[Figure 7: score distributions for Grade <10 and Grade 10 with the p-th percentile cut-offs marked; axis: PISA score]
If for instance one eliminates from the SC computation the students that are above the 90th percentile of grade 10 and those that are below the 10th percentile of grade <10 (meaning that we retain 90% of the initial overall sample), we should a priori increase the weight of the treated students in the comparison of averages.
\[ Y^{1} - Y^{SC} \,\big|\, (y_{<10} > 10\text{th perc. or } y_{10} \le 90\text{th perc.}) = \tau(0.9)\,(Y^{T}_{<10} - Y^{T}_{10}) \tag{10} \]
where \(\tau(0.9) = \tau \cdot 1/0.9\), with \(\tau\) the share of treated students, assuming all treated pupils belong to the kept percentiles.
The second line of Table 4 contains the results when one estimates the left-hand part of [10]. They suggest a -1.02 (non-weighted) effect that is not statistically significant. When divided by the weighting factor (0.037 when 90% of the sample is retained), the estimate is -27 points, which suggests that the reform has led to lower scores for the treated students. But again this result is not statistically significant.

24 Actual score distribution by grade is in Annex 6.
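The rest of Table 4 presents our estimates when one eliminates 20%, 30%, 40% ... up to 90% of the initial sample to focus on the observations concentrated below and above the cut-off point where, presumably, treated students should be concentrated.
\[ Y^{1} - Y^{SC} \,\big|\, (y_{<10} > (100-p)\text{-th perc. or } y_{10} \le p\text{-th perc.}) = \tau(p)\,(Y^{T}_{<10} - Y^{T}_{10}) \tag{11} \]
where \(\tau(p) = \tau \cdot 1/(p/100)\), with \(\tau\) the share of treated students and \(p/100\) the retained share of the sample, assuming all treated pupils belong to the retained percentiles.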
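The trimming step can be sketched as follows for a single country's sample. This is a minimal illustration under stated assumptions: the helper names are hypothetical, a nearest-rank percentile is assumed (the paper does not say which percentile definition is used), and the scores are invented.

```python
import math


def percentile(values, q):
    """Nearest-rank q-th percentile (0 < q <= 100) of a list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def trim_sample(scores_below10, scores_grade10, p):
    """Keep grade <10 scores above the (100-p)-th percentile of their own grade
    and grade 10 scores at or below the p-th percentile of their own grade."""
    low_cut = percentile(scores_below10, 100 - p) if p < 100 else float("-inf")
    high_cut = percentile(scores_grade10, p)
    kept_below10 = [s for s in scores_below10 if s > low_cut]
    kept_grade10 = [s for s in scores_grade10 if s <= high_cut]
    return kept_below10, kept_grade10


# Hypothetical scores: ten pupils per grade group.
below10 = list(range(300, 400, 10))   # 300, 310, ..., 390
grade10 = list(range(450, 550, 10))   # 450, 460, ..., 540

kept_below10, kept_grade10 = trim_sample(below10, grade10, p=90)
retained_share = (len(kept_below10) + len(kept_grade10)) / (len(below10) + len(grade10))
print(retained_share)
```

Lowering p narrows the sample towards the grade <10/grade 10 cut-off zone, which is where the treated pupils are assumed to sit.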
Basically, results remain unchanged. The sign of the estimates varies between negative and positive across subsamples, and all estimates are statistically non-significant at the 5% level. The tentative conclusion is thus that the 2001 reform has had no effect on the score of filtered-out students.
This is not necessarily surprising. Recall that there are two opposite mechanisms which could affect the score of students when they are moved from grade 10 to grade <10 (or vice versa). First, a (negative) curriculum effect, implying that the grade <10 curriculum is poorer than the one taught in grade 10. Second, a (positive) ability/curriculum matching effect: the weakest students attending grade 10 before the 2001 reform could be those who struggle to grasp the material of a more advanced curriculum, and who therefore benefit from being retained in grade <10, where they are exposed to a curriculum that better matches their capabilities.
Since the reform of 2001 led to a reallocation of pupils from grade 10 to lower grades, one would expect a negative difference in means due to a curriculum effect. But our results suggest that this has probably been compensated by improvements due to a better ability/curriculum match.
Table 4 - All grades pooled. Change of attainment between French-Speaking Belgium and synthetic French-Speaking Belgium as an estimate of treated students' attainment change. Year 2006$

Sample      Y French-Speaking Belgium   Y synthetic   Y_dif    p-value   Treated share   Retained share   Treated share / retained share   Y_dif / (treated share / retained share)
All obs.    493.40                      493.70        -0.29    0.8537    0.034           1                0.034                            -8.67
>p10<=p90   499.97                      500.98        -1.02    0.5056                    0.9              0.037                            -27.18
>p20<=p80   500.48                      497.94        2.54     0.0915                    0.8              0.042                            60.31
>p30<=p70   500.59                      502.29        -1.70    0.3039                    0.7              0.048                            -35.33
>p40<=p60   500.42                      503.02        -2.59    0.1583                    0.6              0.056                            -46.18
>p50<=p50   500.61                      503.51        -2.90    0.1692                    0.5              0.067                            -43.01
>p60<=p40   501.34                      503.09        -1.75    0.5013                    0.4              0.084                            -20.81
>p70<=p30   503.03                      504.21        -1.18    0.7101                    0.3              0.112                            -10.52
>p80<=p20   507.91                      504.32        3.59     0.3660                    0.2              0.168                            21.34
>p90<=p10   526.04                      515.72        10.32    0.1533                    0.1              0.337                            30.65

Source: PISA 2000, 2003 and 2006
*** Significant at 1%, ** significant at 5%, * significant at 10%
$ Score comparisons for the years 2000 to 2006 for a selection of estimated models are on display in Annex 7, whereas Annex 8 displays the comparison of predictor/control variables.
4. Conclusion
This paper exploits a reform in the French-Speaking Community of Belgium (re)introducing the possibility to impose grade retention at the end of both grade 7 and grade 8, to evaluate the effects of grade retention. The reform constitutes a natural experiment introducing an exogenous variation in the assignment of pupils to grade. Indeed, the reform led to a reduction in the likelihood of reaching grade 10 at the age of 15 (i.e. no grade retention record) and, symmetrically, to an increase in the likelihood of attending lower grades. Using a synthetic control (SC) method to generate a post-reform French-Speaking-Belgium counterfactual, we are able to address two issues. First, whether a grade retention regime does at least succeed in filtering out weaker students. Second, whether the weaker pupils who end up being retained in lower grades under a grade repetition regime perform worse/better than under a social promotion regime. We find statistically significant evidence in support of the screening-out effect of grade retention. But we fail to demonstrate that filtered-out students perform differently under the grade repetition regime than under the social promotion regime. Our results suggest that the negative curriculum effect repeaters traditionally suffer from may have been compensated by a better ability/curriculum match.

A limitation of the paper -- that uses same-age score data -- is that it cannot assess the effects of the reform on final educational achievements. Comparing retained and promoted pupils at the age of 15 is problematic as, by definition, they are at different stages of the curriculum. In particular, those who are forced to repeat a grade and who suffer from a negative curriculum effect should normally benefit from a richer curriculum when -- eventually -- they get promoted to the higher grade. The long-run balance could then perhaps be that grade repetition has a positive effect. However, a proper long-run cost-benefit analysis of grade repetition should then also account for the large costs of grade retention, particularly in terms of opportunity costs for the pupils (each grade repetition means that one year is lost), but also in terms of teaching resources (each grade repetition means one extra year of funding).

Bibliography
Abadie, A., A. Diamond and J. Hainmueller (2007), Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco Control Program, NBER Working Paper, No. 12831, Ma.
Eide, E.R. and M.H. Showalter (2001), The Effect of Grade Retention on Educational and Labour Market Outcomes, Economics of Education Review, 20(6), pp. 563-576.
Grissom, J. and L. Shepard (1989), Repeating and Dropping out of School, in L. Shepard and M. Smith (Eds.), Flunking Grades: Research and Policies on Retention, London: The Falmer Press.
Holmes, C.T. (1989), Grade Level Retention Effects: A Meta-analysis of Research Studies, in L.A. Shepard and M.L. Smith (Eds.), Flunking Grades: Research and Policies on Retention, London: The Falmer Press, pp. 16-33.
Jacob, B.A. and L. Lefgren (2004), Remedial Education and Student Achievement: A Regression-Discontinuity Analysis, Review of Economics and Statistics, LXXXVI(1), pp. 226-244.
Jacob, B.A. and L. Lefgren (2009), The Effect of Grade Retention on High School Completion, forthcoming in American Economic Journal - Applied Economics.
Jimerson, S.R. (1999), On the Failure of Failure, Journal of School Psychology, 37(3), pp. 243-272.
Jimerson, S.R. (2001), Meta-analysis of Grade Retention Research: Implications for Practice in the 21st Century, School Psychology Review, 30(3), pp. 420-437.
Manacorda, M. (2008), The Cost of Grade Retention, CEP Discussion Papers 0878, Centre for Economic Performance, LSE.
McCoy, A.R. and A.J. Reynolds (1998), Grade Retention and School Performance: An Extended Investigation, Institute for Research on Poverty, Discussion Paper no. 1167-98, University of Wisconsin-Madison.
Roderick, M. (1994), Grade Retention and School Dropout: Investigating the Association, American Educational Research Journal, 31, pp. 729-759.
Roderick, M. and J. Nagaoka (2005), Retention under Chicago's High-stakes Testing Program: Helpful, Harmful, or Harmless?, Educational Evaluation and Policy Analysis, 27(4), pp. 309-340.
Annex 1 ­ PISA 2000, descriptive statistics
Country Australia Austria Belgium( (Fl) Belgium (Fr) Brazil Canada Switzerland Czech Republic Germany Denmark Spain Finland France United Kingdom Greece Hungary Ireland Iceland Italy Japan Korea Liechtenstein Luxembourg Latvia Mexico Netherlands Norway New Zealand Poland Portugal Russian Federation Sweden United States
Nobs 2859 2640 2211 1573 2717 16701 5456 3066 2830 2395 3428 2703 3861 54627 2605 3491 2128 6424 4413 2940 2769 175 4483 2719 2567 1382 2307 2048 1976 2545 3719 2464 3010
Share of pupils
attending ref. Parental
grade SES index
0.92
52.64
0.50
49.02
0.74
48.33
0.59
50.67
0.41
42.77
0.80
51.27
0.87
47.73
0.57
48.32
0.84
49.75
0.92
49.82
0.72
45.02
0.89
50.07
0.60
47.89
0.54
44.34
0.95
47.89
0.95
48.57
0.96
48.23
1.00
54.11
0.83
46.79
1.00
50.37
0.99
42.41
0.81
46.73
0.79
43.84
0.51
48.83
0.56
43.22
0.48
51.59
0.98
53.95
0.92
52.09
1.00
44.72
0.55
44.59
0.73
49.72
0.97
50.38
0.59
52.50
Mean
Mother degree 4.07 3.41 4.33 4.05 3.11 4.64 3.41 4.13 3.60 4.26 3.20 3.68 3.94 4.06 3.84 4.17 4.03 3.83 3.84
Father degree 4.09 3.49 4.43 4.20 3.13 4.48 3.61 3.99 3.88 3.98 3.36 3.48 3.84 3.74 3.79 3.92 3.72 3.93 3.82
3.65
3.85
3.15
3.66
3.34
3.51
4.73
4.61
2.78
3.03
3.76
3.98
4.20
4.14
4.02
3.97
4.25
4.02
3.14
3.19
4.80
4.69
4.40
4.29
4.65
4.66
Score in Math 530.33 506.86 546.16 493.60 320.05 524.98 531.36 499.27 500.03 516.02 478.79 537.03 523.24 519.82 446.42 492.52 502.92 526.54 460.25 560.07 541.47 513.85 442.23 451.53 394.19 573.72 498.75 536.51 460.09 458.85 478.71 509.90 492.56
Score in reading 529.94 508.16 493.55 505.25 549.54 524.63 477.07 495.16 433.52 446.48 521.75 405.06 504.78 494.52 491.02 490.00 510.17 518.18 492.97 528.00 505.93 522.41 481.94 490.40 540.09 538.46 488.76 508.79 487.21 489.84 493.53 504.44 484.51
Score in Science 523.43 512.59 524.44 452.32 388.10 515.59 503.18 494.82 497.15 492.27 491.11 527.14 497.60 526.15 472.65 506.26 512.89 521.76 481.98 522.15 526.14 451.99 487.94 444.13 454.30 508.79 507.14 522.00 474.39 463.23 464.20 501.87 483.13
Share of pupils attending ref. grade 0.26 0.50 0.44 0.49 0.49 0.40 0.34 0.50 0.37 0.27 0.45 0.31 0.49 0.50 0.21 0.22 0.19 0.07 0.38 0.00 0.11 0.39 0.41 0.50 0.50 0.50 0.13 0.27 0.00 0.50 0.44 0.17 0.49
Parental SES index 16.75 14.03 16.29 17.21 17.23 16.37 15.37 13.70 15.80 15.86 16.40 16.40 17.79 18.51 18.07 15.77 15.22 15.38 15.49 15.49 14.24 15.31 17.55 18.19 17.10 16.28 15.32 16.90 15.09 16.09 17.05 16.16 16.27
Standard deviation
Mother degree 1.07 0.95 1.09 1.27 1.40 0.86 1.02 1.03 1.00 1.03 1.38 1.19 1.15 1.01 1.30 1.01 1.23 1.24 1.16 1.24 0.81 1.41 0.73 1.33 1.19 1.10 1.12 0.97 1.26 0.62 0.99 0.85
Father degree 1.07 0.96 1.03 1.20 1.43 1.00 1.10 1.03 1.04 1.06 1.42 1.15 1.15 1.11 1.31 1.01 1.31 1.20 1.14 1.22 1.11 1.41 0.87 1.40 1.18 1.11 1.15 1.01 1.29 0.75 1.05 0.87
Score in Math 89.26 89.89 92.95 105.10 93.77 79.69 85.23 93.16 95.81 80.49 84.85 74.56 87.23 107.19 100.67 88.02 78.77 78.24 82.82 81.13 79.79 91.63 84.11 100.27 78.28 84.05 87.08 94.52 96.19 85.61 98.14 88.61 96.10
Score in reading 99.58 109.02 101.03 78.80 82.66 94.46 92.00 108.89 98.92 109.92 82.15 101.55 92.90 97.29 93.20 86.80 88.76 90.72 87.12 83.42 93.79 87.40 101.81 98.91 89.78 88.56 101.65 98.96 94.25 94.61 97.59 83.09 99.17
Score in Science 95.68 89.37 87.83 115.24 103.52 85.38 95.52 101.84 96.26 96.58 90.61 86.99 93.72 110.34 91.77 92.92 94.95 80.15 94.86 98.35 87.89 94.75 93.19 98.92 88.49 96.62 93.61 96.17 91.66 88.79 90.91 89.80 91.62
Annex 2 ­ PISA 2003, descriptive statistics
Country Australia Austria Belgium( (Fl) Belgium (Fr) Brazil Canada Switzerland Czech Republic Germany Denmark Spain Finland France United Kingdom Greece Hong Kong-China Hungary Indonesia Ireland Iceland Italy Japan Korea Liechtenstein Luxembourg Latvia Macao-China Mexico Netherlands Norway New Zealand Poland Portugal Russian Federation Slovak Republic Sweden Thailand Tunisia Turkey Uruguay United States Yougoslavia
Nobs 12551 4597 5059 3737 4452 27953 8420 6320 4660 4218 10791 5796 4300 9535 4627 4478 4765 10761 3880 3350 11639 4707 5444 332 3923 4627 1250 29983 3992 4064 4511 4383 4608 5974 7346 4624 5236 4721 4855 5835 5456 4405
Share of
pupils
attending ref. Parental
grade SES index
0.91
52.61
0.52
47.42
0.73
50.81
0.59
50.56
0.66
40.53
0.80
50.75
0.83
48.09
0.54
51.62
0.82
49.60
0.91
49.13
0.74
44.92
0.87
50.76
0.63
49.02
0.84
49.54
0.90
46.38
0.60
41.16
0.92
48.33
0.87
35.14
0.97
48.49
1.00
53.63
0.84
47.54
1.00
49.84
0.99
46.09
0.79
50.80
0.85
48.18
0.81
50.74
0.56
39.88
0.76
41.73
0.51
51.48
0.99
54.68
0.93
51.62
0.96
44.77
0.64
42.95
0.70
50.22
0.62
49.60
0.97
50.71
0.57
37.19
0.38
37.49
0.94
41.90
0.58
46.10
0.68
54.19
1.00
48.30
Mean
Mother degree 3.49 3.25 3.59 3.72 2.80 3.83 3.02 3.36 3.08 3.76 2.81 3.92 3.17 3.48 3.02 1.95 3.33 2.03 3.22 3.14 3.04 3.78 2.94 2.90 3.22 4.09 1.80 2.25 3.17 3.88 3.44 3.23 2.05 3.67 3.29 3.83 1.86 1.46 1.63 3.03 3.62 3.60
Father degree 3.54 3.65 3.65 3.70 2.83 3.65 3.39 3.46 3.41 3.51 2.95 3.70 3.25 3.32 3.10 2.16 3.35 2.38 3.10 3.33 3.04 3.67 3.33 3.51 3.42 3.95 1.92 2.51 3.46 3.85 3.26 3.12 2.06 3.59 3.38 3.57 1.99 2.06 2.36 2.98 3.53 3.71
Score in Math 522.33 511.86 552.56 506.97 360.41 521.40 518.24 534.95 508.41 513.69 494.78 542.81 514.73 514.44 440.88 555.86 488.59 361.51 504.68 515.05 496.00 533.51 540.60 536.46 493.48 486.17 522.79 405.40 542.12 495.64 525.62 489.00 465.23 472.44 504.12 507.95 422.73 359.34 426.72 412.99 481.47 436.36
Score in reading 523.85 497.09 528.99 486.94 406.90 516.18 491.76 505.64 497.12 491.32 489.91 541.60 500.04 512.24 468.10 513.87 480.66 383.97 517.21 491.78 500.99 497.36 532.85 525.66 479.78 493.02 493.66 421.72 516.89 499.68 523.40 495.19 476.10 446.89 475.22 513.12 426.33 375.24 443.52 422.68 494.09 411.01
Score in Science 522.78 497.37 528.02 490.24 393.06 508.65 502.81 541.17 508.23 474.54 490.43 544.49 515.89 519.38 477.49 544.61 501.54 397.19 507.12 494.79 515.11 546.98 536.84 525.81 483.07 491.39 521.21 421.79 528.71 484.63 523.03 496.26 466.71 493.71 501.58 505.00 435.52 385.33 436.14 429.23 490.01 436.08
Share of pupils attending ref. grade 0.28 0.50 0.45 0.49 0.47 0.40 0.38 0.50 0.39 0.29 0.44 0.33 0.48 0.36 0.30 0.49 0.28 0.34 0.17 0.00 0.37 0.00 0.12 0.41 0.36 0.39 0.50 0.43 0.50 0.08 0.25 0.19 0.48 0.46 0.48 0.16 0.50 0.49 0.24 0.49 0.47 0.00
Parental SES index 16.04 16.08 16.76 16.85 15.96 15.97 15.83 14.72 16.26 15.48 16.84 16.93 16.74 16.56 16.86 13.44 15.26 18.16 15.81 16.72 16.29 14.74 13.45 14.97 16.60 16.52 12.64 18.76 15.87 15.38 16.41 14.87 15.98 16.76 16.21 16.16 16.20 17.80 15.33 18.17 16.38 16.81
Standard deviation
Mother degree 1.38 1.03 1.36 1.46 1.78 1.17 1.26 0.88 1.32 1.34 1.64 1.33 1.33 1.24 1.38 1.24 1.05 1.51 1.29 1.27 1.27 1.21 1.33 1.21 1.71 1.04 1.32 1.73 1.35 1.17 1.38 0.90 1.82 0.97 0.89 1.35 1.39 1.45 1.39 1.72 1.19 1.21
Father degree 1.37 1.14 1.33 1.43 1.79 1.25 1.38 0.91 1.43 1.29 1.70 1.43 1.35 1.29 1.50 1.28 0.96 1.55 1.40 1.21 1.27 1.34 1.41 1.36 1.62 1.07 1.28 1.77 1.43 1.20 1.37 0.98 1.77 0.95 0.93 1.44 1.39 1.49 1.53 1.72 1.22 1.17
Score in Math 94.01 88.47 101.97 101.17 91.67 84.03 91.50 95.58 97.16 87.10 82.15 79.50 87.03 88.08 89.83 94.18 89.66 73.05 81.76 86.39 89.79 96.71 89.23 95.28 88.01 83.35 84.57 74.47 89.80 88.35 95.03 86.42 83.98 88.04 88.91 90.97 80.80 77.24 97.81 99.08 90.38 81.33
Score in reading 93.16 94.67 94.44 102.22 97.37 84.49 86.07 88.83 100.74 81.02 86.52 73.65 88.77 87.81 95.80 76.87 84.97 64.99 81.51 91.13 90.84 98.60 76.91 83.80 93.51 82.24 64.51 77.67 80.60 95.18 98.82 88.88 86.68 83.10 84.69 89.01 72.73 84.58 84.79 116.26 94.29 74.49
Score in Science 98.19 90.21 93.88 100.22 85.53 92.66 96.61 96.23 103.44 93.98 89.55 83.32 101.88 97.03 90.87 85.75 89.84 56.94 86.68 88.31 96.56 102.43 93.98 96.65 96.07 84.33 81.35 72.07 93.86 95.94 97.74 94.04 86.26 88.95 92.89 98.30 75.58 76.00 85.89 101.95 93.96 74.67
Annex 3 ­ PISA 2006, descriptive statistics
Country Argentina Australia Austria Azerbaijan Belgium( (Fl) Belgium (Fr) Bulgaria Brazil Canada Switzerland Chile Colombia Czech Republic Germany Denmark Spain Estonia Finland France United Kingdom Greece Hong Kong-China Croatia Hungary Indonesia Ireland Iceland Israel Italy Jordan Japan Kyrgyzstan Korea Liechtenstein Lithuania Luxembourg Latvia Macao-China Mexico Montenegro Netherlands Norway New Zealand Poland Portugal Qatar Romania Russian Federation Serbia Slovak Republic Slovenia Sweden Chinese Taipei Thailand Tunisia Turkey Uruguay United States
Nobs 4339 14170 4927 5184 5124 3733 4498 9295 22646 12192 5233 4478 5932 4891 4532 19604 4865 4714 4716 13152 4873 4645 5213 4490 10647 4585 3789 4584 21773 6509 5952 5904 5176 339 4744 4567 4719 4760 30971 4455 4871 4692 4823 5547 5109 6265 5118 5799 4798 4731 6595 4443 8815 6192 4640 4942 4839 5611
Share of pupils
attending ref. Parental
grade
SES index
0.71
45.98
0.91
52.97
0.52
48.13
0.94
50.32
0.75
49.80
0.55
50.62
0.96
48.03
0.59
42.52
0.84
52.60
0.84
48.53
0.78
41.27
0.61
43.04
0.50
50.76
0.84
49.19
0.89
49.25
0.69
46.32
0.73
50.81
0.89
48.87
0.60
48.78
0.97
50.06
0.95
48.92
0.64
42.77
1.00
46.65
0.96
48.20
0.89
37.96
0.97
49.22
1.00
54.01
0.86
53.27
0.82
46.38
0.92
51.71
1.00
50.35
0.93
47.09
0.99
49.99
0.83
51.16
0.88
49.59
0.88
47.69
0.82
49.34
0.37
41.91
0.78
43.58
1.00
48.89
0.51
52.05
0.99
53.11
0.94
51.79
0.97
45.32
0.54
42.01
0.77
61.67
0.95
43.39
0.66
51.33
0.99
48.51
0.61
47.50
0.99
47.83
0.98
50.70
0.69
49.55
0.67
38.90
0.49
37.85
0.57
39.83
0.70
45.82
0.88
52.46
Mean
Mother degree 2.88 3.55 3.39 3.77 3.73 3.64 3.48 2.56 3.99 3.18 2.82 2.61 3.52 3.25 3.91 2.94 3.82 4.19 3.24 3.69 3.24 2.16 3.46 3.41 1.98 3.40 3.33 3.88 2.91 3.02 3.91 4.22 3.26 3.13 3.99 2.94 3.87 1.84 2.43 3.83 3.41 4.00 3.60 3.25 2.01 3.24 3.65 3.65 3.56 3.34 3.29 4.03 2.80 1.90 1.84 1.55 3.14 3.78
Father degree 2.72 3.50 3.77 3.95 3.67 3.66 3.38 2.45 3.76 3.52 2.91 2.69 3.55 3.50 3.64 2.99 3.66 3.91 3.25 3.52 3.27 2.26 3.52 3.37 2.28 3.29 3.51 3.94 2.89 3.32 3.96 4.27 3.60 3.58 3.70 3.15 3.64 1.99 2.66 3.93 3.59 3.90 3.46 3.18 1.93 3.60 3.65 3.54 3.64 3.36 3.23 3.74 2.97 2.07 2.43 2.22 3.02 3.67
Score in Math 388.12 516.26 509.51 476.76 545.82 500.99 417.41 365.57 517.42 528.29 417.08 373.83 536.03 504.32 512.23 501.65 516.77 548.99 496.43 497.27 462.04 551.39 467.32 496.18 380.69 502.34 505.59 443.32 473.63 389.18 525.55 315.90 547.17 524.86 485.61 490.49 491.24 524.41 420.70 395.84 537.41 489.84 523.77 500.95 470.94 317.74 414.97 478.66 436.64 495.10 482.21 503.23 562.75 425.47 363.91 428.25 435.47 474.72
Score in reading 383.93 508.69 493.95 354.98 524.34 483.55 406.83 389.18 512.32 496.60 447.86 390.31 509.64 496.53 493.80 479.52 502.38 547.08 488.66 495.64 461.91 538.95 477.55 488.10 383.92 518.65 484.99 441.30 477.01 409.49 500.21 290.54 556.06 510.74 469.33 480.07 484.86 490.64 427.36 388.23 513.91 484.37 522.74 512.63 476.84 312.51 391.97 442.37 402.86 470.55 468.58 508.99 506.68 425.19 378.96 452.92 424.68
Score in Science 398.33 523.13 513.86 385.35 531.35 495.68 439.05 385.25 522.52 508.02 443.11 391.86 537.61 516.21 494.72 504.51 533.73 563.38 496.12 514.27 476.64 546.09 493.65 508.72 384.76 509.49 490.95 455.63 487.15 427.10 533.72 326.33 521.92 522.25 486.52 486.85 493.78 509.46 422.64 408.79 530.76 486.93 532.68 503.29 478.97 349.08 416.61 481.50 436.93 491.22 494.19 504.23 543.71 429.73 384.19 427.61 437.68 488.29
Share of pupils attending ref. grade 0.45 0.29 0.50 0.23 0.44 0.50 0.21 0.49 0.37 0.36 0.41 0.49 0.50 0.37 0.32 0.46 0.45 0.31 0.49 0.18 0.21 0.48 0.03 0.20 0.32 0.16 0.04 0.35 0.38 0.28 0.00 0.26 0.10 0.37 0.33 0.33 0.38 0.48 0.42 0.04 0.50 0.07 0.24 0.18 0.50 0.42 0.22 0.47 0.10 0.49 0.08 0.14 0.46 0.47 0.50 0.50 0.46 0.32
Parental SES index 16.95 16.40 16.60 18.71 16.26 17.01 16.26 18.42 15.75 15.84 16.76 17.36 14.70 16.35 17.10 17.34 16.56 16.98 16.60 16.23 16.82 13.57 15.09 15.16 15.66 16.37 16.99 15.99 16.32 17.19 14.70 18.02 13.42 15.50 17.87 16.62 16.63 13.93 18.57 16.14 15.68 15.34 15.97 15.33 16.30 12.97 16.25 17.09 16.28 15.80 15.71 15.86 15.91 16.65 18.74 15.47 18.66 16.78
Standard deviation
Mother degree 1.87 1.31 1.09 1.20 1.32 1.43 1.11 1.82 1.18 1.35 1.49 1.90 0.99 1.27 1.32 1.52 1.05 1.29 1.32 1.19 1.37 1.27 1.12 1.09 1.46 1.27 1.38 1.35 1.14 1.64 1.06 1.15 1.15 1.27 1.05 1.77 1.07 1.30 1.78 1.21 1.37 1.17 1.38 0.84 1.78 1.79 1.22 0.97 1.18 0.89 1.07 1.29 1.00 1.47 1.60 1.27 1.81 1.28
Father degree 1.84 1.34 1.18 1.25 1.32 1.43 1.02 1.85 1.26 1.44 1.50 1.94 0.99 1.34 1.29 1.58 1.08 1.43 1.37 1.25 1.47 1.34 1.09 1.02 1.51 1.36 1.31 1.36 1.17 1.63 1.10 1.12 1.26 1.41 1.06 1.69 1.06 1.30 1.83 1.16 1.39 1.21 1.33 0.79 1.76 1.65 1.18 0.96 1.15 0.91 1.00 1.40 1.11 1.50 1.63 1.45 1.84 1.31
Score in Math 90.14 85.86 91.52 44.35 93.18 97.97 93.71 87.27 82.21 90.62 82.41 82.77 103.75 95.49 80.01 83.89 75.96 76.38 91.38 84.01 86.51 88.43 79.02 85.38 69.47 77.67 83.64 102.41 92.29 75.92 86.29 80.05 88.78 88.50 85.35 88.86 77.00 79.68 72.03 79.63 83.69 86.71 88.49 84.35 85.43 83.61 79.79 84.46 86.28 89.62 84.70 85.46 94.56 81.29 85.47 89.33 93.53 85.46
Score in reading 110.10 92.75 101.68 66.03 98.87 102.14 109.81 94.26 95.24 87.57 94.79 96.91 108.19 107.99 84.90 82.58 81.39 76.77 98.95 96.01 97.01 76.99 84.93 87.53 64.73 87.34 92.19 113.34 102.63 85.85 96.87 94.27 84.79 91.50 91.38 95.43 84.48 72.31 81.48 84.98 90.09 99.50 99.79 95.36 93.14 101.25 88.77 85.79 86.04 98.83 88.96 93.35 76.92 78.79 87.79 83.21 111.99
Score in Science 92.48 99.70 93.29 53.27 88.91 98.50 100.77 85.97 92.37 93.68 87.38 80.85 99.67 97.09 89.69 84.08 80.32 82.19 98.24 102.80 87.94 87.43 82.26 83.67 59.36 90.86 93.50 107.85 93.27 83.72 96.05 77.78 87.02 94.20 87.00 93.60 80.31 75.56 71.97 75.79 91.00 91.99 103.74 87.82 84.17 78.49 77.84 85.86 80.93 89.57 93.79 91.31 88.44 79.22 77.22 79.55 91.32 102.37
Annex 4 - Synthetic control as an identifying strategy
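Suppose we observe i = 1 to J educational systems during T periods. Suppose that the first one (i.e. the French-Speaking Community of Belgium) is exposed to the intervention/policy change of interest at time \(T_0\).
Let \(Y_{it}\) be the outcome that could be observed for system i at time t:
\[ Y_{it} = \delta_t + \alpha_{it} D_{it} + \nu_{it}, \qquad \nu_{it} \equiv \theta_t Z_i + \lambda_t \eta_i + \varepsilon_{it} \tag{1} \]
with \(D_{it} = 1\) if i = 1 and \(t > T_0\), and \(D_{it} = 0\) otherwise, where \(\delta_t\) is a common time period effect, \(Z_i\) is a vector of observed covariates that potentially influence the outcome (with coefficients \(\theta_t\)), \(\eta_i\) is an unobserved system-specific effect, \(\lambda_t\) is an unknown common factor, and \(\varepsilon_{it}\) are unobserved transitory shocks at the system level with zero mean for all i.
We aim at estimating, for \(t > T_0\) (i.e. after the intervention), \(\alpha_{1t} = Y_{1t} - Y^{N}_{1t}\). Because \(Y_{1t}\) is observed, we only need to estimate its counterfactual \(Y^{N}_{1t}\). Consider a \((J-1) \times 1\) vector of weights \(W = (w_2, \dots, w_J)\) such that \(w_i \ge 0\) and \(w_2 + \dots + w_J = 1\). Each particular vector W represents a potential synthetic control (SC), that is, a particular weighted average of control systems.
Consider an arbitrary linear combination K of all pre-intervention outcomes, \(Y^{K}_{i} \equiv \sum_{s=1}^{T_0} k_s Y_{is}\). Then
\[ \sum_{i=2}^{J} w_i Y^{K}_{i} - Y^{K}_{1} = \sum_{i=2}^{J} w_i \sum_{s=1}^{T_0} k_s Y_{is} - \sum_{s=1}^{T_0} k_s Y_{1s} \tag{2} \]
Using [1], this can be written
\[ \sum_{i=2}^{J} w_i Y^{K}_{i} - Y^{K}_{1} = \sum_{s=1}^{T_0} k_s \delta_s + \sum_{i=2}^{J} w_i \sum_{s=1}^{T_0} k_s \nu_{is} - \sum_{s=1}^{T_0} k_s \delta_s - \sum_{s=1}^{T_0} k_s \nu_{1s} \tag{3} \]
or equivalently, exploiting the fact that \(\sum_{i=2}^{J} w_i = 1\),
\[ \sum_{i=2}^{J} w_i Y^{K}_{i} - Y^{K}_{1} = \sum_{i=2}^{J} w_i \sum_{s=1}^{T_0} k_s (\nu_{is} - \nu_{1s}) \tag{4} \]
Using the definition of \(\nu_{is}\) and \(\nu_{1s}\) in [1], the expression becomes
\[ \sum_{i=2}^{J} w_i Y^{K}_{i} - Y^{K}_{1} = \Big(\sum_{i=2}^{J} w_i Z_i - Z_1\Big)\Big(\sum_{s=1}^{T_0} k_s \theta_s\Big) + \Big(\sum_{i=2}^{J} w_i \eta_i - \eta_1\Big)\Big(\sum_{s=1}^{T_0} k_s \lambda_s\Big) + \sum_{i=2}^{J} w_i \sum_{s=1}^{T_0} k_s (\varepsilon_{is} - \varepsilon_{1s}) \tag{5} \]
In addition (for any t),
\[ \sum_{i=2}^{J} w_i Y_{it} - Y^{N}_{1t} = \Big(\sum_{i=2}^{J} w_i Z_i - Z_1\Big)\theta_t + \Big(\sum_{i=2}^{J} w_i \eta_i - \eta_1\Big)\lambda_t + \sum_{i=2}^{J} w_i (\varepsilon_{it} - \varepsilon_{1t}) \tag{6} \]
Suppose that we choose \(W^* = (w^*_2, \dots, w^*_J)\) such that
\[ \sum_{i=2}^{J} w^*_i Y^{K}_{i} - Y^{K}_{1} = 0 \quad \text{and} \quad \sum_{i=2}^{J} w^*_i Z_i - Z_1 = 0 \]
Then, the left-hand term of [5] as well as the first term of the right-hand part of [5] and [6] disappear. What is more, if \(\sum_{s=1}^{T_0} k_s \lambda_s \neq 0\), we obtain from [5] that
\[ \sum_{i=2}^{J} w^*_i \eta_i - \eta_1 = -\frac{1}{\sum_{s=1}^{T_0} k_s \lambda_s} \sum_{i=2}^{J} w^*_i \sum_{s=1}^{T_0} k_s (\varepsilon_{is} - \varepsilon_{1s}) \tag{7} \]
Hence [6] becomes a function of the random error terms exclusively, with an expected value equal to zero:
\[ \sum_{i=2}^{J} w^*_i Y_{it} - Y^{N}_{1t} = -\frac{\lambda_t}{\sum_{s=1}^{T_0} k_s \lambda_s} \sum_{i=2}^{J} w^*_i \sum_{s=1}^{T_0} k_s (\varepsilon_{is} - \varepsilon_{1s}) + \sum_{i=2}^{J} w^*_i (\varepsilon_{it} - \varepsilon_{1t}) \tag{8} \]
Therefore, for \(t > T_0\), \(\sum_{i=2}^{J} w^*_i Y_{it}\) equates, in expectation, the (unobserved) counterfactual \(Y^{N}_{1t}\). Hence,
\[ \alpha_{1t} = Y_{1t} - \sum_{i=2}^{J} w^*_i Y_{it} \]
The computation of \(W^*\) is done by minimizing the pseudo-distance \(\lVert X_1 - X_{SC} W \rVert\) subject to the condition that \(w_i \ge 0\) and \(w_2 + \dots + w_J = 1\), where \(X_1 = (Z_1, Y^{K_1}_{1}, \dots, Y^{K_M}_{1})\) is the vector of pre-treatment characteristics that comprises the observable controls \(Z_1\) and \(Y^{K_1}_{1}, \dots, Y^{K_M}_{1}\), i.e. M linear combinations of the PISA scores. The similar set of vectors for the non-treated countries is \(X_{SC}\).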
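The constrained minimisation that defines the weights can be sketched in a few lines of plain Python. This is not the Stata synth routine: it assumes an unweighted Euclidean distance (synth also weights the predictors), uses projected gradient descent as a stand-in solver, and runs on invented predictor vectors; all names are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}."""
    ordered = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(ordered, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def synthetic_weights(x_treated, x_controls, n_iter=5000):
    """Minimise ||x_treated - sum_j w_j * x_controls[j]||^2 over the simplex
    by projected gradient descent (illustrative solver)."""
    n_controls, n_pred = len(x_controls), len(x_treated)
    w = [1.0 / n_controls] * n_controls
    # Conservative step: 2 * (sum of squared entries) bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * sum(x * x for col in x_controls for x in col))
    for _ in range(n_iter):
        fitted = [sum(w[j] * x_controls[j][m] for j in range(n_controls))
                  for m in range(n_pred)]
        residual = [x_treated[m] - fitted[m] for m in range(n_pred)]
        grad = [-2.0 * sum(residual[m] * x_controls[j][m] for m in range(n_pred))
                for j in range(n_controls)]
        w = project_to_simplex([w[j] - step * grad[j] for j in range(n_controls)])
    return w


# Hypothetical pre-treatment predictor vectors for three control systems.
controls = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [1.0, 1.0, 2.0]]
# Treated unit built as 0.3 * first control + 0.7 * second control.
treated = [0.3, 0.7, 0.5]

weights = synthetic_weights(treated, controls)
print([round(x, 4) for x in weights])
```

In this toy case the treated unit lies inside the convex hull of the controls, so the fit is exact; with real data the minimised distance is generally positive and the weights concentrate on a few countries.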
Annex 5 - Inference analysis with Synthetic control

Unlike Abadie, Diamond and Hainmueller (2007), we have access to individual data within each country. Like them, we run the Stata synth procedure, using data aggregated at the country level (Y, Z). But we rely on the (numerous) individual data to do hypothesis testing and to compute the results reported in Tables 3 and 4.
The statistics we aim at are standard t-tests gauging the plausibility that two means (the post-treatment mean for the treated country and that of its synthetic control) are statistically different:
\[ t = \frac{Y^{1} - Y^{SC}}{\left[ Var_{2}\,(1/N_1 + 1/N_{SC}) \right]^{1/2}} \tag{1} \]
with \(Var_2\) the pooled sample variance, equal to
\[ Var_2 = \frac{Var_1 (N_1 - 1) + Var_{SC} (N_{SC} - 1)}{N_1 + N_{SC} - 2} \]
where \(Var_1\) is the variance characterising the treated entity post treatment (here the French-Speaking Community of Belgium in 2006) and \(Var_{SC}\) the variance characterising its synthetic equivalent. It is important to stress how the latter is computed.
Assume we have \(N_1, N_2, \dots, N_J\) students l in each of the J countries that participated in PISA, with \(N_1\) designating the sample size for the treated country (i.e. the French-Speaking Community of Belgium).
The synthetic control score for the post-treatment period computed by the Stata code developed by Abadie, Diamond and Hainmueller (2007) uses country-level averages \(Y_i = (1/N_i) \sum_{l=1}^{N_i} Y_{il}\). The delivered score is equal to
\[ Y^{SC} = \sum_{i=2}^{J} w^*_i Y_i = \sum_{i=2}^{J} w^*_i \Big( \frac{1}{N_i} \sum_{l=1}^{N_i} Y_{il} \Big) \tag{2} \]
The point is that the latter can be replicated using individual/disaggregated data by i) applying the estimated weights \(W^*\) to the entire sample \(N_{\$} = N_2 + \dots + N_J\) of individuals forming the synthetic control entity, ii) provided the individual values are weighted by \(N_{\$}/N_i\):
\[ Y^{SCR} = \frac{1}{N_{\$}} \sum_{i=2}^{J} \sum_{l=1}^{N_i} \omega^*_i Y_{il} \quad \text{with} \quad \omega^*_i \equiv \frac{N_{\$}}{N_i}\, w^*_i \tag{3} \]
It is immediate to show that [3] is equal to [2]. What is more, the same weighting strategy can be used to compute from individual data the variance characterising the synthetic control entity:
\[ Var_{SC} = \frac{1}{N_{\$}} \sum_{i=2}^{J} \sum_{l=1}^{N_i} (\omega^*_i Y_{il} - Y^{SCR})^2 \tag{4} \]
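The equivalence between the country-level score [2] and its individual-level replication [3], together with the pooled-variance t-statistic [1], can be checked on toy data. The sketch below uses hypothetical function names and invented scores and weights; it only illustrates the formulas of this annex.

```python
import math


def synthetic_mean_from_averages(weights, samples):
    """Equation [2]: weighted sum of country-level average scores."""
    return sum(w * sum(s) / len(s) for w, s in zip(weights, samples))


def synthetic_mean_from_individuals(weights, samples):
    """Equation [3]: pooled average with individual weights (N_total / N_i) * w_i."""
    n_total = sum(len(s) for s in samples)
    total = sum((n_total / len(s)) * w * y
                for w, s in zip(weights, samples) for y in s)
    return total / n_total


def pooled_t(mean_1, var_1, n_1, mean_sc, var_sc, n_sc):
    """Equation [1]: two-sample t-statistic based on the pooled variance."""
    pooled_var = (var_1 * (n_1 - 1) + var_sc * (n_sc - 1)) / (n_1 + n_sc - 2)
    return (mean_1 - mean_sc) / math.sqrt(pooled_var * (1 / n_1 + 1 / n_sc))


# Hypothetical control countries with unequal sample sizes and made-up weights.
control_samples = [[500.0, 520.0, 540.0], [480.0, 500.0]]
control_weights = [0.25, 0.75]

from_averages = synthetic_mean_from_averages(control_weights, control_samples)
from_individuals = synthetic_mean_from_individuals(control_weights, control_samples)
print(from_averages, from_individuals)
print(pooled_t(10.0, 4.0, 10, 8.0, 4.0, 10))
```

The reweighting by the ratio of the pooled sample size to each country's sample size is what keeps large national samples from dominating the synthetic average.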
Annex 6 - PISA score distribution by grade (on the vertical axis, 0 = below grade 10 and 1 = grade 10)
Annex 7 - PISA average score25 in 2000, 2003 (before treatment) vs 2006 (after treatment). Comparison between the French-Speaking Community of Belgium and its synthetic control.
[Four panels, each plotting French-Speaking Belgium against its synthetic control for 2000, 2003 and 2006 (vertical axis: PISA score). Panels: Grade 10 only; All grades pooled; Grade 10 (<70 th perc); All grades pooled (>70 th perc <=30 th perc)]

25 Country/entity averages, based on individual unweighted average score in math, science and reading PISA scores.
Annex 8 - Control/predictor variables26. Comparison between the French-Speaking Community of Belgium and its synthetic control (all grades pooled).

Synthetic control (all grades pooled)
Year   Share of pupils attending the reference grade   Student-teacher ratio   Ratio of computers to school size   Share of teachers with proper certification   Highest parental socio-economic index (HISEI)   Mother education   Father education
2000   0.63                                            13.90                   0.09                                0.81                                          48.92                                           4.41               4.31
2003   0.67                                            12.61                   0.10                                0.90                                          50.40                                           3.51               3.50
2006   0.64                                            13.12                   0.09                                0.91                                          50.27                                           3.52               3.45

French-Speaking Community of Belgium
Year   Share of pupils attending the reference grade   Student-teacher ratio   Ratio of computers to school size   Share of teachers with proper certification   Highest parental socio-economic index (HISEI)   Mother education   Father education
2000   0.59                                            10.06                   0.07                                0.77                                          50.67                                           4.05               4.20
2003   0.59                                            10.14                   0.09                                0.86                                          50.56                                           3.72               3.70
2006   0.55                                            9.90                    0.11                                0.78                                          50.62                                           3.64               3.66

26 Country/entity averages, based on individual unweighted average score in math, science and reading PISA scores.
Annex 9 ­ Country weights forming the synthetic French-Speaking Community of Belgium.
Country AUS AUT BFL BRA CAN CHE CZE DEU DNK ESP FIN FRA GBR GRC HUN IRL ISL ITA JPN KOR LIE LUX LVA MEX NLD NOR NZL POL PRT RUS SWE USA Sum
all data 0.253 0.526 0.221 1
<=p10 0.124 0.226 0.425 0.226 1
<=p20 0.346 0.444 0.140 0.700 1
Analysis of Grade 10 scores <=p30 <=p40 <=p50 <=p60 0.455 0.330
0.491
0.240
0.100
0.850 0.570 0.480
0.451 0.580
0.210 0.331 0.400
0.430
1
1
1
1
<=p70 0.439 0.244 0.318 1
<=p80 0.140 0.116 0.260 0.479 0.366 1
<=p90 0.138 0.763 0.100 1
Country AUS AUT BFL BRA CAN CHE CZE DEU DNK ESP FIN FRA GBR GRC HUN IRL ISL ITA JPN KOR LIE LUX LVA MEX NLD NOR NZL POL PRT RUS SWE USA Sum
all data 0.330 0.270 0.169 0.210 0.670 0.386 1
>p10 <=p90 0.363 0.300 0.231 0.500 0.530 0.299 1
Analysis of overall (all grades pooled) scores >p20 >p30 >p40 >p50 >p60 >p70 <=p80 <=p70 <=p60 <=p50 <=p40 <=p30 0.386 0.481 0.533 0.532 0.534 0.510
0.250 0.290
0.323 0.460
0.235 0.300
0.280 0.580 0.100
0.280 0.184
0.990
0.500 0.220
0.600 0.610 0.170 0.650 0.590 0.670 0.160 0.192 0.364 0.191 0.380 0.359
1
1
1
1
1
1
>p80 <=p20 0.438 0.500 0.530 0.190 0.349 1
>p90 <=p10 0.345 0.584 0.710 1
ISSN 1379-244X D/2009/3082/022

