Early school entrance and middle‐run academic performance in Mexico: evidence for 15‐year‐old students from the PISA test

Aguayo-Téllez and Martínez-Rodríguez Large-scale Assess Educ (2020) 8:11

Introduction
Every year more than 120 million children in the world begin their formal education by entering the first grade of elementary school, each with the sincere hope of succeeding and of being able, in the future, to lead a productive life. They will have to overcome a long series of obstacles and difficulties, and their likelihood of success will depend on different characteristics and circumstances. One of them is revealed on the first day of classes and may be independent of all others: their relative age at the time of beginning school. Relative age with respect to classmates may have an impact on academic performance and school attendance in the middle and long run and, ultimately, on adulthood outcomes.
In Mexico, as in most countries of the world, formal education starts with the first grade of elementary school at the age of six. However, although being 6 years old on the first day of the school year is the rule, it does not guarantee that the child is ready to go to school. Such "readiness" for school may depend, at least partially, on the level of emotional maturity of the child (Woodhead and Moss 2007). Although it may seem irrelevant, diverse studies in developed countries agree that differences in children's maturity at the moment of starting first grade have consequences for their academic performance, with the youngest being the most disadvantaged (Bedard and Dhuey 2006; Datar 2006; Puhani and Weber 2007; Elder and Lubotsky 2009; Crawford et al. 2010; Sprietsma 2010; Grenet 2011; Kawaguchi 2011; Nam 2014; Cascio and Schanzenbach 2016).
How good (or bad), then, is it to allow children younger than the required age to start the first grade of elementary school? If these academic disadvantages are present only during a limited period of time (the short run), there should be no reason to worry. However, if such differences persist over a longer period of time (affecting, for example, high school dropout, college attendance, or labor productivity and wages), then it becomes important to enforce school entry age policies for children. It may also lead researchers and policy makers to consider the possibility of raising the legal age of entry to first grade.
Estimating the effects of relative age on academic performance and on long-run labor outcomes can be challenging because a student's age within a class can be manipulated by the parents and is therefore likely to be correlated with other student and family characteristics. Parents can voluntarily hold their children out of school (a practice known as redshirting) or, in other cases, may prefer certain seasons of birth for their children. This manipulation of a student's relative age could bias Ordinary Least Squares estimates.
Many studies have attempted to estimate the effect of relative age on academic performance and on labor outcomes. Many of them address the potential biases in different ways, using controls for season of birth (Dhuey and Lipscomb 2008; Lawlor et al. 2006; Robertson 2011), or using data from jurisdictions where redshirting is not permitted (Kawaguchi 2011). However, studies for developing countries, including Mexico, are almost nonexistent (Peña 2017).
Taking advantage of the unanticipated shift in the cutoff date for school eligibility implemented in Mexico in 2006, the objective of this paper is to measure the effect of the relative age of entry to first grade on the school performance of 15-year-old students in Mexico. More precisely, we measure the effect of entering first grade before reaching 6 years of age on (a) the probability of having failed at least one school year during the student's academic life, and (b) the scores on the PISA tests (mathematics, science and reading). To do that, we use data from the PISA 2018 survey applied to 15-year-old students in Mexico, those who entered first grade in 2006.
It is important to note that obtaining unbiased estimators would require having all 15-year-olds in the sample, both students and non-students. However, because the survey was conducted through schools, only students were interviewed. Although more than 85% of all 15-year-old individuals in Mexico are still studying (ENOE 2018), results may be underestimated, given that individuals with low academic performance are expected to be more likely to drop out of school and therefore not to be in the sample.
A first general approximation assumes that the probability of having failed at least one school year is an indicator of middle-run academic performance. This approximation uses a Probit model and considers a dummy variable that indicates whether or not the student entered first grade before reaching 6 years of age. Different student, family, and school characteristics are included as controls. The exercise finds that the probability

Background and theoretical model
In Mexico, the school year begins in the last week of August and, up to 2005, the official age to enter first grade was 6 years old by August 31st of the school year. However, in June 2006 an unanticipated modification of Article 65 of the General Education Law (DOF 2006) shifted the official age to enter first grade to 6 years old by December 31st of the school year. This shift increased the share of children younger than 6 years enrolled in the first year of elementary school from 17% in 2004 to 31% in 2006 (PISA 2006, 2018).
In the end, the shift in the cutoff date for school eligibility in 2006 brought so many difficulties to the Mexican education system that, by 2018, the education authorities of most Mexican states had shifted the official age back to 6 years old as of August 31st, although the Federal Authority still recognizes December 31st as the official date.
Early studies that considered "relative age" at school, such as Armstrong (1966) and Freyman (1965), pointed out that, although it is necessary for administrative processes to establish age cohorts to enter school, age differences between students within the same cohort may benefit the older students and may harm the younger ones. Even though such studies did not find enough evidence to support this hypothesis, they opened a line of research that has made important contributions to our understanding of early childhood and primary education.
After those early papers, the effects of relative age at school have been widely studied in developed economies. Some researchers have focused on the effects of early entrance to school on different academic indicators (such as school attainment, knowledge test scores, or college admission), 1 and some others have focused on the effects of early entrance to first grade in different adulthood outcomes (such as wages, employment status, marital status, or house ownership). 2 In general, short-run academic outcomes are more often statistically validated than long-run academic and labor outcomes.
For developing economies, there are very few studies. For Chile, McEwan and Shapiro (2008) found that a 1-year delay in school enrollment decreases the probability of repeating first grade by two percentage points and increases fourth and eighth grade test scores by more than 0.3 standard deviations; for Mexico, Peña (2017) shows that 1 year of additional age confers an advantage of 0.3 standard deviations in third, ninth, and twelfth grade test scores. Recently, in Mexico, the National Institute for the Evaluation of Education (INEE) produced a series of reports on the academic performance of third, sixth and ninth grade students through standardized tests. They found that younger and older third and sixth grade students tend to obtain lower test scores than typical-age students in the same grade. This result does not hold for younger students in ninth grade (INEE 2018a, b, 2019). The INEE argues that the lower performance of older students within the same school grade is a consequence of late entry, temporary dropout, or grade repetition, situations related to unfavorable economic conditions of the children.
Allen and Barnsley (1993) argue that when children enter first grade, teachers are not always able to differentiate between the level of maturity of a child and their ability to learn, so that teachers may end up thinking that younger students are less intelligent. This perception may create a difference between older and younger children, producing an even larger gap in academic performance between them. The authors mention that there is qualitative evidence to believe that the expectations and objectivity of teachers are important determinants of their students' test results.
Age differences are not taken lightly in many developed countries. In Sweden and Denmark, parents often delay their children's entrance to school even though there is an "official age" to do so, waiting for a higher level of maturity in their children and trying to ensure they will be "ready for school" (Fredericksson and Ockert 2013). Also, in some states of the United States, school readiness tests are applied to children before they enter school. If a child is not mature enough, parents can (if they agree) delay their child's entrance to school by one year (May and Kundert 1997).
Why, then, is the relative age of children important at the time of entering primary school? Following Woodhead and Moss (2007), age is a proxy for children's readiness for school. This readiness may depend on many different factors, some of them intrinsic to the child's personality, and others related to family characteristics and demographics, cultural backgrounds, economic conditions, or the capacity of teachers and academic systems to help children transition smoothly to academic life (US National Education Goals Panel 1997). For this reason, there exist different views about the ideal age to start formal elementary education and, depending on the country, the cutoff age for school eligibility ranges from 5 to 7 years.
All different educational structures and cutoffs are importantly related to relative age because skill-based curriculum usually begins during the first school grades when relative maturity is likely to play a large part in determining skill differences between young and old classmates, and hence, affect skill accumulation throughout the whole educational process (Bedard and Dhuey 2006).
Following Elder and Lubotsky (2009), who depart from a simple model of children's human capital accumulation, once children begin school, differences in school achievement between children with different entrance ages may depend on (a) differences in preschool skills, (b) differences in contemporaneous parental [and school] investments in the child's human capital, and (c) differences in the return to schooling. Hence, the human capital of a child at age $a$, $h_a$, can be measured as follows:
$$h_a = \beta h_{a-1} + I_a(Y) + \theta_a(a),$$
where $(1 - \beta)$ is the rate of depreciation of skills, $I_a(Y)$ is the parents' [and school's] investment in the child's human capital at age $a$ as a function of parent [and school] resources, $Y$, and $\theta_a(a)$ is the contribution of a year of schooling to human capital for a child who entered first grade at age $a$. Subsequently, human capital after $k$ years of schooling can be computed as
$$h_{a+k} = \beta^{k} h_a + \sum_{j=1}^{k} \beta^{k-j}\left[I_{a+j}(Y) + \theta_{a+j}(a)\right],$$
and, holding investments fixed, the effect of a 1-year increase in age at the moment of entering first grade on human capital $k$ years after school entry becomes
$$\frac{\partial h_{a+k}}{\partial a} = \beta^{k}\frac{\partial h_a}{\partial a} + \sum_{j=1}^{k} \beta^{k-j}\frac{\partial \theta_{a+j}(a)}{\partial a}.$$
Hence, after controlling for family [and school] characteristics, age at the moment of entering first grade has a lasting effect on human capital for two reasons: (a) the effect of skills acquired prior to entering elementary school, and (b) the ability of the child to learn once she/he is in school.
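The accumulation model above can be checked numerically. The following is a minimal sketch, not the authors' code: it iterates the recursion year by year, compares it with the closed-form sum, and splits the gap between two children into a preschool-skill part and a learning part. All parameter values (the value of beta, the investments, the returns, and the entry skills) are hypothetical and chosen only for illustration.

```python
def human_capital(h_entry, beta, investments, returns):
    """Iterate h_t = beta * h_{t-1} + I_t + theta_t for each school year."""
    h = h_entry
    for inv, ret in zip(investments, returns):
        h = beta * h + inv + ret
    return h


def closed_form(h_entry, beta, investments, returns):
    """Closed form: beta^k * h_entry + sum_j beta^(k-j) * (I_j + theta_j)."""
    k = len(investments)
    total = beta ** k * h_entry
    for j, (inv, ret) in enumerate(zip(investments, returns), start=1):
        total += beta ** (k - j) * (inv + ret)
    return total


# Hypothetical parameters (assumptions, not estimates from the paper).
beta = 0.9                      # 1 - beta is the depreciation rate of skills
investments = [1.0, 1.0, 1.0]   # same parental/school investment for both children
h_entry_regular, returns_regular = 10.0, [2.0, 2.0, 2.0]
h_entry_early, returns_early = 9.0, [1.8, 1.8, 1.8]

k = len(investments)
h_regular = human_capital(h_entry_regular, beta, investments, returns_regular)
h_early = human_capital(h_entry_early, beta, investments, returns_early)
gap = h_regular - h_early

# Decomposition of the gap into the two channels named in the text.
skill_part = beta ** k * (h_entry_regular - h_entry_early)
learning_part = sum(
    beta ** (k - j) * (r - e)
    for j, (r, e) in enumerate(zip(returns_regular, returns_early), start=1)
)

print(f"gap after {k} years: {gap:.3f}")
print(f"preschool-skill part: {skill_part:.3f}, learning part: {learning_part:.3f}")
```

Because investments are identical for both children, they cancel out of the gap, which mirrors the role of the family and school controls in the regressions.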

Data and the PISA for development project
This paper uses the Programme for International Student Assessment (PISA) survey applied in 2018 in Mexico. The PISA survey is conducted every 3 years in the OECD and many developing countries, and it aims to evaluate education systems by testing the skills and knowledge of 15-year-old students. The PISA test evaluates competencies in mathematics, science and reading by means of 81 questions in mathematics, 184 in science and 103 in reading. 3 Some of these questions are open-ended and others are multiple-choice and, in order to determine a score for every area of knowledge, every question of the exam has a pre-established weight. In addition to the academic test, the students and their school principals also answer questionnaires to provide information about the students' backgrounds, schools and learning experiences and about the broader school system and learning environment. In some countries, including Mexico, optional questionnaires are distributed to parents, who are asked to provide information about their perception of their child's school, their support for learning, and their child's career expectations. 4
It is worth noting that, given the way the PISA test is designed, the scores have neither a minimum nor a maximum value. Students take different combinations of test items, and PISA records both the difficulty of questions and the proficiency of test-takers on a single continuous scale. PISA scores are set in relation to the variation in the results observed across all test participants; the results are scaled to fit approximately normal distributions, with means around 500 score points and standard deviations around 100 score points. Technically, a 10-point difference on the PISA scale corresponds to a Cohen's d effect size of 0.10. To help users interpret what students' scores mean in substantive terms, PISA scores are divided into "proficiency levels", going from 1 to 6. Lower proficiency levels represent lower probabilities of being able to correctly perform a series of competences in mathematics, reading or science. Each proficiency level corresponds to a range of about 80 score points.
According to the PISA 2018 Results (OECD 2019a), about 600,000 students completed the questionnaires in the schools of 79 countries and economies, representing more than 32 million 15-year-old students around the world. A representative random sample of between 3300 and 35,000 15-year-old students was drawn for every country or economy, and at least 150 schools were selected in each country. In the case of Mexico, 7299 students and 293 schools were surveyed, representing a total of 1.7 million students.
The variables used in the regression exercises come from the three academic tests (mathematics, science and reading) as well as from the three additional questionnaires (students, parents and school principals) applied in Mexico. Table 1 presents basic descriptive statistics for 15-year-old students in Mexico, divided by age of entry to first grade.
The variable early is a dummy variable that takes the value of one if the student entered first grade before turning 6 years old, and zero otherwise. The PISA survey asks the students directly how old they were at the moment of entering first grade; however, one in four students did not answer that question. Furthermore, this answer depends on the student's memory of when she was only 5 or 6 years old. To avoid this problem and to recover missing values, we constructed the variable early from the student's date of birth, her current school year, and the number of times she has repeated a school year. Following this procedure, we were able to keep 6199 valid observations (85% of the total), of which 31% (1934 obs.) entered first grade without having reached 6 years of age. 5
Repeat is a dummy variable that takes the value of 1 if the student failed at least one school year during her or his academic life, and zero otherwise. Almost 11% of all 15-year-old students have repeated at least one school year. As we can see in Table 1, this prevalence is higher for students who entered first grade without having reached 6 years of age (16.6%). To simplify reading, henceforth, students who entered first grade at 6 years of age or more will be called "regular" students and students who entered first grade without having reached 6 years of age will be called "early" students.
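The construction of early from date of birth, current grade, and number of repetitions can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' procedure: the function names, the reference date (September 1st of the entry year), and the example years are all hypothetical. The idea is to back out the year of entry to first grade and then compute the student's age at that moment.

```python
from datetime import date


def first_grade_entry_year(current_year_start, current_grade, times_repeated):
    """Calendar year in which the student started first grade.

    Assumes one grade per school year plus one extra year per repetition.
    """
    return current_year_start - (current_grade - 1) - times_repeated


def age_on(birth, reference):
    """Completed years of age on the reference date."""
    years = reference.year - birth.year
    if (reference.month, reference.day) < (birth.month, birth.day):
        years -= 1
    return years


def entered_early(birth, current_year_start, current_grade, times_repeated,
                  start_month=9, start_day=1):
    """True if the student was younger than 6 when first grade started.

    The start date of the school year is an assumption of this sketch.
    """
    entry_year = first_grade_entry_year(
        current_year_start, current_grade, times_repeated
    )
    entry_date = date(entry_year, start_month, start_day)
    return age_on(birth, entry_date) < 6


# Hypothetical students observed in a school year that starts in 2017.
autumn_born = entered_early(date(2002, 10, 15), 2017, 10, 0)   # 10th grade, no repeats
spring_born = entered_early(date(2002, 3, 10), 2017, 10, 0)    # 10th grade, no repeats
repeater = entered_early(date(2002, 10, 15), 2017, 9, 1)       # 9th grade, one repeat
late_entrant = entered_early(date(2002, 3, 10), 2017, 9, 0)    # 9th grade, no repeats

print(autumn_born, spring_born, repeater, late_entrant)
```

The repeater and the autumn-born student share the same entry year, which is why the number of repetitions is needed to avoid misclassifying early entrants who later repeated a grade as regular or late entrants.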
With respect to family characteristics, students with mothers with college education are less likely to enter first grade early while students whose mothers work are more likely to enter first grade early. Also, students who have a computer at home are less likely to enter first grade early.
With respect to school characteristics, on average 12.7% of all students study in private schools; however, early students are less likely to study in such schools (11.1%). Also on average, there is one computer for every four students; however, early students are more represented in schools with fewer computers. Other characteristics of the schools, such as the percentage of teachers with a master's degree, the number of students per teacher, or the average class size, do not differ between regular and early students. Shortage of material and shortage of staff are two index variables constructed by PISA based on school principals' responses about their perceptions of educational resources in their school. Higher values of the indexes indicate greater shortages of educational material or staff at school. 6 On average, early students are more represented in schools with greater shortages of educational material and staff. In general, these numbers show an important correlation between the economic conditions of the family and the school and the parents' decision to enroll their children in school prematurely.

PISA plausible values
Plausible values for the estimation of the academic proficiency of students began to be generalized in 1996 after the Third International Mathematics and Science Study (TIMSS) conducted by the International Association for the Evaluation of Educational Achievement (IEA). Wu and Adams (2002) define plausible values as "a representation of the range of abilities that a student may reasonably have. (…) Instead of directly estimating the student's ability θ; a probability distribution of a student's θ, is estimated. That is, instead of obtaining a point estimate for θ, a range of possible values for θ, with an associated probability for each of these values is estimated." In the case of the PISA 2018 tests, ten plausible values, and their corresponding associated probabilities, were estimated for each one of the three areas of knowledge (reading, mathematics and science).
The use of plausible values allows us to obtain unbiased estimators for all population parameters estimated in the regressions, including academic performance and bivariate and multivariate parameters of the relation between academic performance and different student and school characteristics. Plausible values are scaled using the Rasch model (OECD 2009).
Table 2 shows average PISA scores in mathematics, science and reading in 2018 for selected countries and economies. The highest scores in mathematics, science and reading were obtained by China (Beijing, Shanghai, Jiangsu and Zhejiang) with 591, 590 and 555 score points, respectively. Mexico is far below, ranked 61 out of 78 in mathematics with 409 score points, 57 in science with 419 score points and 53 in reading with 420 score points; not only below the OECD average but also below the total sample average.

A brief look at PISA results for Mexico
In terms of proficiency levels, Table 3 shows the percentage of Mexican students in each proficiency level for mathematics, science, and reading. Students classified in levels 4, 5 and 6 have the potential to perform activities of high cognitive complexity. In Mexico, less than four percent of students in mathematics, science and reading were classified in levels 4, 5 and 6. On the other hand, students classified in levels 1a and below have insufficient knowledge to access higher education and to carry out the minimum activities required by life in the knowledge society. In Mexico, 56% of students in mathematics, 47% in science and 50% in reading were classified in levels 1a and below. 7 For comparison, the percentages of students who scored at or below level 1a for the average of OECD countries are 24 in mathematics, 22 in science and 23 in reading. Results are not very encouraging; around one in two 15-year-old students in Mexico does not have enough knowledge to meet the most basic requirements of contemporary society.
Table 4 presents average PISA scores for mathematics, science and reading for 15-year-old students in Mexico, divided by age of entry to first grade. Regular students obtained an average score of 415 in mathematics while early students obtained an average score of 407; that is, early students obtained an average of 8.3 points less than regular students. Similarly, early students obtained an average of 11.6 points less in science and 11.7 points less in reading than regular students. Mean differences are statistically significant for the three areas of knowledge. 8
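To put the raw gaps from Table 4 in perspective, they can be expressed in the units mentioned earlier in the text: a standard deviation of about 100 score points and a proficiency level width of about 80 score points. The sketch below is only a unit conversion; the helper names are hypothetical, and the 100 and 80 point figures are the approximate values quoted in the paper rather than exact scale parameters.

```python
PISA_SD = 100.0       # approximate standard deviation of the PISA scale
LEVEL_WIDTH = 80.0    # approximate width of one proficiency level


def cohens_d(score_gap, sd=PISA_SD):
    """Score gap expressed in standard deviations (Cohen's d)."""
    return score_gap / sd


def share_of_level(score_gap, width=LEVEL_WIDTH):
    """Score gap expressed as a fraction of one proficiency level."""
    return score_gap / width


# Raw mean differences between regular and early students (Table 4).
raw_gaps = {"mathematics": 8.3, "science": 11.6, "reading": 11.7}

effect_sizes = {area: round(cohens_d(gap), 3) for area, gap in raw_gaps.items()}
level_shares = {area: round(share_of_level(gap), 3) for area, gap in raw_gaps.items()}

for area in raw_gaps:
    print(f"{area}: d = {effect_sizes[area]}, share of a level = {level_shares[area]}")
```

These are unconditional gaps; the regression results later in the paper report the gaps that remain after controlling for student, family and school characteristics.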

Methodology
Following the framework of Elder and Lubotsky (2009), this paper performs two empirical exercises to determine the effects of early entrance to first grade on middle-run academic performance in Mexico. One exercise measures the effects of early school entrance on the probability of having repeated at least one school year by the age of 15. The other investigates the effect of such early entrance to school on the academic performance (measured through PISA's plausible values) of 15-year-old students. For the first exercise, the dependent variable is a dichotomous variable that takes the value of one if the student has repeated at least one school year during his life, and zero otherwise. For the second exercise, PISA's plausible values for each of the three knowledge areas of the test are used as dependent variables. Independent variables for both exercises comprise a vector of student characteristics (X), which includes the variable early, a vector of family and household characteristics (Y), and a vector of school characteristics (Z).
For the probability of having repeated at least one school year, we use a Probit model of the following form:
$$\Pr(\text{repeat}_i = 1) = \Phi\left(\alpha + X_i \beta + Y_i \gamma + Z_i \delta\right),$$
and for the PISA plausible values (PV) exercise we use the following OLS model:
$$PV_{i,k} = \alpha_k + X_i \beta_k + Y_i \gamma_k + Z_i \delta_k + \varepsilon_z + \varepsilon_i,$$
where $\Phi$ is the standard normal cumulative distribution function, $k$ is a specific area of knowledge (reading, mathematics or science), $\varepsilon_z$ is an error component common to all students from the same education institution $z$, and $\varepsilon_i$ is the idiosyncratic, independent and identically distributed error component. 9
For both the Probit and OLS exercises, we run different model specifications with different combinations of independent variables, which include other characteristics of the student (X), characteristics of the family (Y), and/or characteristics of the school (Z), in addition to the variable early. Carrying out these different specifications allows us to evaluate how the estimated coefficient of the variable early reacts to the inclusion (or non-inclusion) of different combinations of independent variables (and to the corresponding loss of observations due to the inclusion of such independent variables).
It is important to note that 85% of all 15-year-old students in the database are in 10th grade (both regular and early students); however, students who entered first grade late and students who repeated at least one school year are in 9th grade or lower. Hence, in order to compare academic knowledge among students fairly, the plausible values regressions should only compare plausible values of students in the same school grade. However, given that early students may be more likely to repeat school grades than regular ones, if we drop all students who are in grades other than 10th grade we will drop relatively more early students, resulting in a biased sample that would overestimate the performance of early students. To avoid this bias and, at the same time, to be able in some way to compare academic knowledge among students from different grades, instead of dropping observations, in the plausible values regressions we include a 9th grade dummy variable and an early*9th grade interaction variable.
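For a dummy regressor such as early, the effect of a Probit coefficient on the probability of repetition is usually read as a discrete change in the predicted probability rather than as a derivative. The sketch below illustrates that calculation under stated assumptions: the baseline index and the coefficient are hypothetical numbers, not estimates from this paper, and the function names are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def probit_probability(index):
    """Pr(repeat = 1) implied by a Probit index (the linear combination of regressors)."""
    return STD_NORMAL.cdf(index)


def dummy_marginal_effect(baseline_index, dummy_coefficient):
    """Change in probability when a dummy switches from 0 to 1, other regressors fixed."""
    return (probit_probability(baseline_index + dummy_coefficient)
            - probit_probability(baseline_index))


def continuous_marginal_effect(index, coefficient):
    """Derivative of the probability with respect to a continuous regressor."""
    return STD_NORMAL.pdf(index) * coefficient


# Hypothetical values: index of a regular student at the sample means,
# and a hypothetical Probit coefficient for the dummy early.
baseline_index = -1.3
early_coefficient = 0.3

effect = dummy_marginal_effect(baseline_index, early_coefficient)
print(f"baseline probability: {probit_probability(baseline_index):.3f}")
print(f"discrete effect of early: {effect:.3f}")
```

Because the normal density is not constant, the same coefficient implies a different change in probability at different baseline values, which is why marginal effects have to be reported at a stated evaluation point.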
Following the OECD (2009), for the PISA plausible values exercise, we ran ten independent regressions for each area of knowledge, one regression for each of the ten plausible values of each area. This procedure generates ten estimated coefficients for every independent variable for each area of knowledge. The final or reported estimated coefficient of an independent variable for an area of knowledge is just the simple average of the estimated coefficients of the ten regressions: $\hat{\beta}_k = \frac{1}{10}\sum_{pv=1}^{10} \hat{\beta}_{pv,k}$. Following the OECD (2009) it is also possible to calculate standard errors, t-values, and significance tests for each estimated coefficient. 10
Table 5 presents the marginal effects of the estimated Probit model for seven different specifications. The specification in column (1) considers only the variable early; column (2) adds male and four city size dummies; columns (3) to (7) add two characteristics of the mother (six dummies for education of the mother and a dummy indicating if the mother does not work), two characteristics of the household (computer at home, and internet at home), and nine characteristics of the school (private school, computer/student ratio, percentage of teachers with a master's degree, students/teacher ratio, class size, the shortage of educational material index and the shortage of staff index, and two dummies for school size). All marginal effects are evaluated at the mean values of the independent variables, with the exception of the dummy variables.
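The plausible-value averaging described in the methodology, together with replicate-weight standard errors, can be sketched as follows. This is a hedged illustration, not the authors' Stata code. The point estimate is the simple average stated in the text; the variance formulas (Fay-adjusted balanced repeated replication with a factor of 0.5, and the (1 + 1/M) correction for the variance across plausible values) are assumptions based on our reading of the OECD (2009) manual and are not spelled out in this paper. The toy inputs use 3 plausible values and 4 replicates instead of 10 and 80.

```python
from statistics import mean, variance


def brr_sampling_variance(estimate, replicate_estimates, fay_k=0.5):
    """Replicate-weight variance of one estimate (fay_k = 0.5 is an assumption)."""
    g = len(replicate_estimates)
    squared_deviations = sum((rep - estimate) ** 2 for rep in replicate_estimates)
    return squared_deviations / (g * (1.0 - fay_k) ** 2)


def combine_plausible_values(pv_estimates, pv_replicates, fay_k=0.5):
    """Return (point estimate, standard error) across plausible values.

    pv_estimates: one coefficient per plausible value (full-sample weight).
    pv_replicates: for each plausible value, the same coefficient re-estimated
    with each replicate weight.
    """
    m = len(pv_estimates)
    point_estimate = mean(pv_estimates)          # simple average, as in the text
    sampling_variance = mean([
        brr_sampling_variance(est, reps, fay_k)
        for est, reps in zip(pv_estimates, pv_replicates)
    ])
    imputation_variance = variance(pv_estimates)  # divides by m - 1
    total_variance = sampling_variance + (1.0 + 1.0 / m) * imputation_variance
    return point_estimate, total_variance ** 0.5


# Toy inputs (hypothetical coefficients of the variable early).
toy_estimates = [-5.0, -6.0, -7.0]
toy_replicates = [
    [-4.0, -6.0, -4.0, -6.0],
    [-5.0, -7.0, -5.0, -7.0],
    [-6.0, -8.0, -6.0, -8.0],
]

coef, std_err = combine_plausible_values(toy_estimates, toy_replicates)
print(f"coefficient: {coef:.2f}, standard error: {std_err:.2f}, t: {coef / std_err:.2f}")
```

In practice each entry of the two inputs would come from a separate weighted regression, so a full run with ten plausible values and 80 replicate weights involves 10 × 81 regressions per specification and area of knowledge.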

Repeating at least one academic year
Showing a robust result, the variable early is positive and strongly significant in all model specifications; i.e., after controlling for individual, family and/or school characteristics, students who enter first grade before reaching 6 years of age have around 7 percentage points higher probability of having repeated at least one school year than students who entered first grade at the age of six. This seems to be a first hint that entering first grade before reaching the age of six causes an academic disadvantage in the middle run. As noted before, given that it was not possible to reconstruct the age of entry to first grade for 15% of the observations, and given that this proportion of missing values is slightly higher for students born in the last third of the year, results may be underestimated and must be taken with caution.
Additional results from Table 5 indicate that boys have between 3.2 and 4.7 percentage points (pp) higher probability of having repeated a school year than girls. In order to explore the possibility of different early coefficients for boys and girls, we run a regression with the early*boy interaction as well as separate regressions for boys and girls. We did not find statistical evidence to claim that the effect of early entrance on grade repetition is different for girls and boys. 11
With respect to family characteristics, as expected, mother's schooling reduces the probability of repeating at least one school year; however, the working status of the mother does not. Having a computer at home reduces the probability of repeating at least one school year, but the effect of having internet access at home is ambiguous. After controlling for school characteristics, having internet at home increases the probability of repeating at least one school year. The city size dummies (small town, town, city, and large city) were not statistically significant; however, they are included to control for other environment or neighborhood characteristics. 12
With respect to school characteristics, we find that currently studying in a private school 13 reduces the probability of having repeated at least one school year by around 12 pp, while being enrolled in a large school reduces the probability of having repeated at least 1 year by between 18 and 19 pp. The variables computers/student ratio, proportion of teachers with a master's degree, students/teacher ratio, class size and shortage of educational material and staff are included as an attempt to control for the enormous heterogeneity of education quality within Mexico. Coefficients were statistically significant only for computers/student ratio, students/teacher ratio, and class size. Smaller classes are often perceived as allowing teachers to focus more on the needs of individual students; however, there is only weak evidence that smaller classes may benefit specific groups of students, such as those from disadvantaged backgrounds (Krueger 2002). Unfortunately, the interaction between early and class size in our regressions was not statistically significant, so it does not add to the evidence that, at the age of 15 years, smaller classes benefit younger students.
Table 6 shows the OLS regression results for the three areas of knowledge considered by the PISA 2018 test (mathematics, reading, and science) for two different specifications: a specification that includes student and family characteristics, and a specification that adds school characteristics. These two specifications are similar to the specifications in columns (5) and (7) of the Probit exercise. 14 As mentioned, we calculated average estimated coefficients and their corresponding valid standard errors following the instructions specified in the OECD (2009).
10 To get valid standard errors, we follow the OECD (2009) and use the replicate weights provided in the database. If conventional standard errors are used, the hypothesis tests may yield acceptable values, but they are not formally valid. The PISA database provides a total of 80 replicate weights that allow us to obtain more adequate standard errors. Similarly, all regressions are performed using Stata's survey data analysis facilities. For detailed information please refer to the OECD (2009).

Academic performance measured through PISA scores (plausible values)
The variable early is negative and statistically significant for the three areas of knowledge, indicating that on average, after controlling for other student, family and school characteristics, 15-year-old students who started first grade before reaching 6 years of age obtain 5.6 score points less in mathematics, 9.6 score points less in reading, and 9.7 score points less in science than students who entered first grade at 6 years of age. Recall that, given the way the PISA test is designed (a mean of around 500 score points and a standard deviation of around 100 score points), a 10-point difference corresponds to a Cohen's d effect size of 0.10 (0.1 standard deviations). Hence, the results here correspond to a Cohen's d effect size of 0.06 in mathematics and 0.10 in reading and science. 15 The dummy variable 9th grade is also negative and statistically significant, indicating that on average, in the three areas of knowledge, students in 9th grade (both early and regular) obtain scores between 30 and 40 points lower than students in 10th grade. The interaction between early and 9th grade allows us to know whether the score difference between early and regular students in 9th grade (early + early*9th grade) is different from the score difference between early and regular students in 10th grade (early). 16 The estimated coefficients of the early*9th grade interaction variable were not statistically significant in any of the specifications for the three areas of knowledge, meaning that on average, the score difference between early and regular students is the same for 9th and 10th grades.
12 To examine whether the effect of early differs across city sizes, we ran a regression including interactions of early and the city size dummies. We did not find evidence that the effect of starting school early on the probability of repeating at least one school year differs across city sizes. Estimated coefficients from these additional regressions are not presented here but can be provided upon request.
13 Unfortunately, there is no information about the characteristics of the students' previous schools. Hence, it is not possible to know whether the student repeated a school year in a private or in a public school, or whether the student changed schools after that. Endogeneity may also be involved, and the estimated coefficients of the private/public school variable should be taken with caution.
14 For space reasons, Table 6 does not present the estimated coefficients of all seven specifications as in the Probit exercise, but they can be provided by the authors upon request.
15 Also recall that these results could be underestimated because they only consider 15-year-olds who stayed in school. Although we could not corroborate it (given our limited information), it is possible that early students drop out of school more often than regular students. If this were true, early students would be underrepresented in our sample and our estimated coefficients for the variable early would be underestimated.
16 The average score of an early student in 9th grade is (constant + early + 9th grade + early*9th grade), and the average score of a regular student in 9th grade is (constant + 9th grade); hence, the difference in average scores between early and regular students in 9th grade is (early + early*9th grade). Similarly, the average score of an early student in 10th grade is (constant + early), and the average score of a regular student in 10th grade is (constant); hence, the difference in average scores between early and regular students in 10th grade is (early). Therefore, the difference between both differences is just the early*9th grade interaction.
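The effect-size conversion and the reading of the interaction term can be checked with a few lines of arithmetic. The sketch below is illustrative only. The score gaps for early are the point estimates quoted in the text, while the constant, the 9th grade coefficient, and the interaction coefficient are hypothetical placeholders chosen just to show the algebra. All function and variable names are assumptions made for this example.

```python
PISA_SD = 100.0  # approximate standard deviation of PISA scores, as stated in the text


def cohens_d(score_gap, sd=PISA_SD):
    """Express a score gap in standard-deviation units."""
    return score_gap / sd


# Magnitudes of the estimated early coefficients reported in the text.
early_gaps = {"mathematics": 5.6, "reading": 9.6, "science": 9.7}
effect_sizes = {area: round(cohens_d(gap), 2) for area, gap in early_gaps.items()}


def predicted_score(b, is_early, in_grade9):
    """Average score implied by a model with an early*9th grade interaction."""
    return (b["constant"]
            + b["early"] * is_early
            + b["grade9"] * in_grade9
            + b["early_x_grade9"] * is_early * in_grade9)


# Hypothetical coefficients (not taken from Table 6).
b = {"constant": 450.0, "early": -9.6, "grade9": -35.0, "early_x_grade9": 2.0}

# Early-minus-regular gap within each grade.
gap_9th = predicted_score(b, 1, 1) - predicted_score(b, 0, 1)    # early + interaction
gap_10th = predicted_score(b, 1, 0) - predicted_score(b, 0, 0)   # early only

print(effect_sizes)
print(gap_9th - gap_10th)  # equals the interaction coefficient, up to rounding
```

Because the constant and the 9th grade coefficient cancel within each grade, the difference between the two gaps depends only on the interaction coefficient, which is why a statistically insignificant interaction is read as an equal gap in both grades.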
As previous literature has pointed out (Robinson and Lubienzki 2011; AAUW 2008), gender also explains differences in academic scores. In this exercise, boys show academic advantages in mathematics and science (18 score points in mathematics and 12 score points in science) and girls show academic advantages in reading (8 score points). As in the Probit model, we explore the possibility of having different early coefficients for boys and girls by adding an early*male interaction variable to the OLS regressions. We did not find statistical evidence that the effect of early entrance on PISA scores is different for girls and boys. 17,18 With respect to family characteristics, the education of the mother is positive and statistically significant, but only for those students whose mothers had a college education. Also, students whose mothers do not work obtained between 4.3 and 9.7 score points less than students whose mothers participate in the labor market. This result is consistent with a large body of literature that asserts that the labor participation of the mother motivates children to achieve better academic performance and better labor outcomes in the future (Stevens and Boyd 1980; Couch and Dunn 1997; Johnston et al. 2014).
With respect to school characteristics, private school was not statistically significant. This result is not surprising given the wide range of quality levels within private schools in Mexico (Rossi and Rosati 2007; Agüero and Beleche 2013). To try to control for such differences in quality levels, we included additional school variables such as computers/student ratio, percentage of teachers with master's degrees, students/teacher ratio, class size, shortage of educational material or staff, and school size. The percentage of teachers with master's degrees showed considerably large estimated coefficients. We must keep in mind that the main reason for including school variables in the regressions is to try to control for the wide differences in the quality of education in Mexico. After including all school variables, the estimated coefficients of the variable early decreased only slightly.

Conclusions
Throughout this paper, it has been pointed out that relative age at the moment of beginning formal education (first year of elementary school) may have an effect on the middle-run academic performance of children. Using the 2018 PISA database for Mexico and two different econometric approaches, it is possible to conclude that being relatively younger at the moment of entering first grade does have a negative effect on the academic performance of students at the age of 15. Specifically, students who start first grade before reaching 6 years of age are 7 percentage points more likely to have repeated at least one school year during their academic life and obtain on average 5.6 PISA score points less in mathematics, 9.6 PISA score points less in reading, and 9.7 PISA score points less in science than students who entered first grade at 6 years of age. These results are consistent with the explanation provided by previous literature (Bedard and Dhuey 2006; Allen and Barnsley 1993; Peña 2017; among others) about the importance of considering the level of maturity of children when entering first grade, because it may have an impact on their level of competences, even in the long run. Parents and teachers must be cautious about the decision to register children in elementary school before they reach the required age; rushing their entry does affect their performance in the middle run and may also harm their adulthood outcomes.
17 For space reasons, the estimated coefficients of the early*male interaction regressions are not presented here but can be provided upon request. The estimated coefficients of the other variables are very similar to the ones presented in Table 6.
18 A variable "years of preschool" was originally included in the regressions; however, it was never statistically different from zero and, due to its large number of missing observations, we decided not to include it in the final regressions. Even though an important body of literature claims that attending preschool or kindergarten is important for increasing children's academic performance (Magnuson et al. 2007; Ramey and Ramey 2004), we did not find evidence to support this claim. It is possible that the estimated coefficients were not statistically significant because 98.5% of all students in the sample reported having attended at least 1 year of schooling before first grade, 81% at least 2 years, and 44% at least 3 years.
It would be wrong to generalize, since there may be children who at 5 years old are mature enough to enter first grade, and perhaps others who at 6 years old are not. Nevertheless, this aspect deserves attention not only from parents and teachers, but also from those responsible for making the law. Ensuring that each child is mature enough to thrive in school will help to avoid gaps that affect children in the middle and long run.
Given the existence of middle-run effects, and their possible repercussions in the long run, it is important to search for solutions, especially in developing countries. One proposal could be the use of maturity tests, such as the ones applied in some developed countries (or in some private Mexican elementary schools). Although not all of the literature agrees on their effectiveness (Shepard 1998), an advantage of their use is that they may help parents to make more informed decisions. And although enrolling their children would remain at the discretion of parents, such tests could help teachers to detect which children will require more attention.
Finally, the modification of the Mexican Education Law in 2006, which unexpectedly shifted the cutoff date for elementary school eligibility by 4 months, seems to have negatively affected the middle-run academic performance of almost one third of all students of that generation and the following ones.

Code availability
All econometric exercises were carried out using STATA. The code can be provided by the authors upon request.

Funding
Not applicable.

Availability of data and materials
All data used in this research are publicly available at www.oecd.org/pisa/.

Ethics approval and consent to participate
Not applicable.

Informed consent
Not applicable.