
Boys’ underachievement in mathematics and science: An analysis of national and international assessment data from the Kingdom of Saudi Arabia

Abstract

Boys in the Kingdom of Saudi Arabia consistently and significantly underperform compared to girls across different grades and subjects, forming one of the largest gender gaps in student achievement in the world. Saudi Arabia offers a unique setting in which boys and girls attend separate schools on a universal basis starting from grade 1. This means that boys and girls are educated only by male and female teachers, respectively, in effect inhabiting parallel education systems. In this context, this study examines the factors that are associated with student achievement in mathematics and science in grades 4 and 8 and the extent to which these associations are different for boys and girls, in an effort to gain insights into boys’ underachievement in mathematics and science in Saudi Arabia. The paper employs data from two recent large-scale assessments of education: Trends in International Mathematics and Science Study (TIMSS) 2019 and Saudi Arabia’s National Assessment of Learning Outcomes (NALO) 2018. A series of hierarchical two-level linear regression models showed that in grade 4, school climate was more strongly associated with boys’ than with girls’ achievement in both mathematics and science, with boys attending schools of poorer school climate having a considerably lower performance compared with girls attending such schools. The findings also indicated that although greater literacy and numeracy readiness was linked with higher science achievement among boys and girls, grade 4 boys tended to benefit more from this readiness than girls. In addition, the results show that student absenteeism in grade 4 is particularly strongly associated with decreases in mathematics achievement among boys. In grade 8, significant interactions between gender and the extent to which students feel confident in science, the degree of schools’ emphasis on academic success, and teachers’ age are observed. The paper concludes by discussing some of the implications of these findings for educators and policy makers in Saudi Arabia.

Background

The Kingdom of Saudi Arabia is among the countries with the largest gender gaps in student achievement in the world. Boys in Saudi Arabia consistently and significantly underperform compared to girls across different grades and subjects. For example, in the Trends in International Mathematics and Science Study (TIMSS) 2019, grade 4 boys in Saudi Arabia scored below girls by approximately 26 points in mathematics and 60 points in science (Mullis et al., 2020). A similar gender gap exists in grade 8, where boys underperformed girls by approximately 17 points in mathematics and 47 points in science. Furthermore, in grade 4, 53% of boys failed to achieve minimum proficiency in mathematics, compared to 45% of girls. Similarly, in grade 8, 58% of boys did not achieve minimum proficiency in mathematics, compared to 49% of girls. Data from another assessment, the Programme for International Student Assessment (PISA) 2018, show a similar pattern: almost twice as many 15-year-old boys (65%) as girls (38%) failed to achieve minimum proficiency in reading (OECD, 2019).

Although the magnitude of the gender gap in Saudi Arabia is among the largest in the world, significant differences in achievement between boys and girls are also observed in many other countries—sometimes in favor of boys, and sometimes in favor of girls (Mullis et al., 2017, 2020). A large body of research has explored the factors contributing to the gender gap in student performance internationally (Autor et al., 2016, 2019; Bertrand & Pan, 2013; Buchmann & DiPrete, 2006; DiPrete & Buchmann, 2013; DiPrete & Jennings, 2012; Fortin et al., 2015; Jha & Pouezevara, 2016; Legewie & DiPrete, 2012; OECD, 2021). Evidence from this research suggests that social norms, school characteristics, students’ social and behavioral skills, and family background are the main factors associated with the achievement gap between boys and girls. Prior research indicates that, compared to boys, girls tend to have better noncognitive skills, such as self-regulation and persistence, and spend more time doing assignments and homework (Buchmann & DiPrete, 2006; Cornwell et al., 2013; DiPrete & Jennings, 2012; Downey & Vogt Yuan, 2005; OECD, 2021). Other studies have found that school characteristics such as school quality, disciplinary practices, and school climate—the institutional norms, practices, structures, values, and relationships underpinning a student’s experience of school—can affect boys and girls differently.

An important study combined birth records with school administrative data from the US state of Florida to identify the effects of school quality (defined as school-level gains in mathematics and reading scores) on the gender achievement gap between opposite-gender siblings who attend the same sets of schools (Autor et al., 2016). This study shows that boys benefitted more than girls from studying in higher-quality schools. Similarly, another recent study (OECD, 2021) based on data from two large-scale international assessments shows that school discipline problems affect boys more negatively than girls. Combining data from the Teaching and Learning International Survey (TALIS) 2018 and PISA 2018, the study demonstrates that increases in teachers’ perceptions of classroom discipline problems were associated with an increase in the achievement gender gap (OECD, 2021). Other school organizational issues, such as poor learning conditions and organizational problems, are also found to exacerbate the gender gap.

Such findings indicate that boys’ achievement tends to be negatively impacted by challenging learning conditions to a greater extent than girls’ achievement, and are consistent with previous work on gender gaps in attitudes showing that girls, in general, tend to report more positive attitudes toward learning (DiPrete & Buchmann, 2013). Previous research also shows that social norms, gender stereotypes, and teacher and school expectations contribute to the gender gap in performance (Jha & Pouezevara, 2016; Page & Jha, 2009; Stromquist, 2007; Younger & Cobbett, 2014). In other words, teachers and schools contribute to developing and reinforcing different expectations of appropriate behavior for boys and girls, which, in turn, may hinder boys’ performance.

Another group of studies has investigated the relationship between the gender composition of schools and/or classes and students’ outcomes. The findings from this research show weak associations between single-sex schooling and academic achievement, with some exceptions in certain grades and among certain subpopulations of students. Pahlke et al. (2014), for example, synthesized available literature on the effects of single-sex compared with co-educational schooling on a wide range of student outcomes, such as mathematics and science achievement, gender stereotypes, interpersonal relations, and students’ perceptions. After reviewing more than 400 studies, the authors conclude that “there is little evidence of an advantage of SS [single-sex] schooling for girls or boys for any of the outcomes” (p. 1064). Similarly, using data from the Secondary Entrance Assessment and the Caribbean Secondary Education Certification (CSEC) examination in Trinidad and Tobago, Jackson (2012) shows that students attending single-sex secondary schools perform no better than those attending co-educational secondary schools, except for female students with strong preferences for single-sex education, who perform better on the CSEC examination (at grade 10). Pahlke et al. (2013), using the random assignment of students into single-sex and co-educational schools in Korea to study the effect of single-sex education on student achievement in mathematics and science, found no association between school gender and student achievement. Other studies, however, show some positive effects of single-sex education, especially among females. Following an approach similar to Pahlke et al. (2013), Eisenkopf et al. (2015) used the random assignment of female students into single-sex and co-educational secondary schools in Switzerland to examine the effect of single-sex education on students’ academic performance. Their results suggest that single-sex education improves females’ performance in mathematics and their self-confidence. Booth and Nolen (2012) also found that girls in single-sex schools are likely to be more competitive than girls attending co-educational schools, but the same relationship has not been found for boys.

The current study

Building on the existing research, this study examines (i) the contribution of a range of student, family, class, and school variables in predicting overall Saudi student achievement in mathematics and science at grades 4 (primary school) and 8 (intermediate/middle school), and (ii) the extent to which these variables contribute to the large observed gender gap in Saudi student performance in these subjects at grades 4 and 8. Data from both TIMSS 2019 and Saudi Arabia’s National Assessment of Learning Outcomes (NALO) 2018 were employed. Both TIMSS and NALO assess fourth- and eighth-grade students’ mathematics and science achievement, as well as collecting a wide range of contextual information about students, their families, teachers, and schools.

This study contributes to the existing literature in two main ways. First, it addresses a need for more evidence about the factors associated with the achievement gap between boys and girls in the Middle East and North Africa (MENA) region, and especially in Saudi Arabia. In an effort to provide this evidence, this study uses two large-scale assessments that have a shared focus on the same domains of study and the same grade levels in a complementary fashion, with data from NALO used to complement findings arising from the analysis of TIMSS data. It, therefore, provides important insights for policy makers, both in Saudi Arabia and in other countries in the MENA region, regarding factors that may contribute to the observed gender gaps in mathematics and science achievement.

Second, Saudi Arabia offers a unique setting in which boys and girls attend separate schools on a universal basis starting from grade 1. This means that boys and girls are educated only by male and female teachers, respectively, so that, in effect, they inhabit parallel education systems. This is displayed in Fig. 1, which shows that Saudi Arabia is the only country among all countries participating in TIMSS 2019 to have a completely gender-segregated education system (Footnote 1). Although gender-segregated schools are not uncommon in the MENA region, students do not usually attend single-gender schools until the end of primary education. The unique structure of the Saudi education system, therefore, provides an opportunity to examine, in a multilevel framework, how variance in system-level factors applying only to boys or to girls contributes to the observed individual differences in achievement. This analysis exploits the existence of parallel gender-segregated school environments that operate within a shared overarching cultural context, where expectations and practices outside school also vary significantly between boys and girls.

Fig. 1 Distribution of students in single-gender or mixed education among countries participating in TIMSS 2019

While this paper exploits this feature of the Saudi education system in its analysis, it also acknowledges potential difficulties in interpreting findings due to this extreme degree of separation, as gender differences signify differences between schools attended only by boys and schools only attended by girls, and not differences among individual students. Additionally, teacher and school characteristics are confounded with gender differences in learning outcomes. For example, any differences between girls’ and boys’ educational environments seen in these data are inseparable from the fact that the (male) teachers of boys have been trained and work in an environment that is completely separate from the training and work environment of the (female) teachers of girls.

Education system in Saudi Arabia

Preuniversity education in Saudi Arabia is divided into four levels: preprimary, elementary, intermediate, and secondary education. Preprimary education includes three years starting at age 3; elementary education starts normally at age 6 and includes grades 1 through 6; intermediate education comprises grades 7 to 9, and secondary education consists of grades 10 through 12. Students in Saudi Arabia attend single-gender schools, except in preprimary. Boys and girls are separated from grade 1 onwards and are taught by teachers of the same gender. A recent reform, though still on a limited scale, has allowed boys in grades 1 through 3 to enroll in girls’ elementary schools, and, as a result, boys may be taught by female teachers but in separate classes.

Saudi Arabia’s K–12 education system includes more than 5.5 million students and more than 450,000 teachers, and is administered through 47 education directorates and 383 education offices within directorates. The Saudi Ministry of Education plays a central role in setting the policies and regulations for schools across the country including curriculum, teacher hiring and promotion, and student assessment. Directorates and offices are responsible for implementing the directives of the Ministry and tend to have a similar structure across the country (OECD, 2020).

Over the last few decades, Saudi Arabia has achieved substantial progress in improving access to education. For example, the gross enrollment ratio (GER) in primary education – the total enrollment in primary school expressed as a percentage of the total primary school-aged population – increased from 58% in 1979 to 101% in 2019. During the same period, the GER in secondary education increased from 27% to 112% (Footnote 2).
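As a rough illustration of the definition above, the ratio can be written as follows. The symbols E and P are introduced here only for this sketch and do not come from the data sources cited.

$$\mathrm{GER}=\frac{E}{P}\times 100$$

where E is total enrollment at a given level of education, regardless of age, and P is the population in the official age group for that level. Because E also counts over-age and under-age students (for example, grade repeaters or late entrants), the ratio can exceed 100%, as in the 2019 figures reported above.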

Although increased access to education is a positive development, these large gains in access have not been accompanied by similar improvements in students’ learning outcomes. Overall, learning outcomes remain below expectations in Saudi Arabia. Data from TIMSS 2019 show that, in mathematics, Saudi Arabia ranks 53rd of 58 countries in grade 4 and 37th of 39 countries in grade 8. In PISA 2018, less than half (48%) of 15-year-old students in Saudi Arabia achieved minimum proficiency in reading and almost no student was a top performer (i.e., achieving proficiency levels 5 or 6; OECD, 2019). Additionally, only 27% of Saudi students in the same age group achieved at least minimum proficiency in mathematics, compared to an Organization for Economic Co-operation and Development (OECD) average of 75%. Notably, learning outcomes in Saudi Arabia are low relative to the country’s level of wealth, and Saudi Arabia has been identified as an outlier when examining Harmonized Learning Outcomes (Footnote 3) relative to gross domestic product (GDP) worldwide (Patrinos & Angrist, 2019). Finally, as noted above, the gender differences in achievement observed in Saudi Arabia are consistently among the largest in the world, with girls showing a consistent advantage over boys across all grade levels and subject areas.

Methods

Data

The analyses described in this paper are drawn from two recent large-scale assessments of education: TIMSS 2019 and Saudi Arabia’s NALO 2018. TIMSS provides a robust and high-quality nationally representative sample, while NALO was conducted to provide regionally representative information within Saudi Arabia (requiring a much larger sample size) as well as national-level data. Although the paper focuses on the national level in the analyses described below, it should be noted that subsequent analyses at the regional level would be possible using NALO data (Footnote 4).

There is a substantial degree of overlap between the content covered by the TIMSS and NALO contextual questionnaires, although some variables appear in one study but not the other, or they are presented in slightly different formats. The primary analysis reported in this paper is conducted using TIMSS 2019 data. Given the high degree of overlap in the two studies’ focus on mathematics and science, at the same two grade levels (grades 4 and 8), data from NALO 2018 are used to supplement this primary analysis by drawing on variables of particular interest to Saudi Arabia that have no equivalents in TIMSS. In this way, NALO 2018 data are used as supplementary information to shed additional light on questions arising from the multilevel analysis of TIMSS data.

TIMSS 2019

TIMSS is a study of the International Association for the Evaluation of Educational Achievement (IEA). It assesses mathematics and science achievement at two grade levels, grades 4 and 8. TIMSS has been carried out every four years since 1995. In 2019, 64 countries participated in TIMSS. In addition to providing countries with robust data on mathematics and science achievement, TIMSS collects a wealth of contextual data from students, parents, teachers, and school principals.

In Saudi Arabia, 5,453 grade 4 students (mean age 9.9 years; 49.6% male) across 220 public and private schools and 5,680 grade 8 students (mean age 13.9 years; 49.2% male) across 209 schools took part in TIMSS 2019. Data were collected using a stratified two-stage cluster sample design, with a sample of schools selected randomly at the first stage and one or more classes of students selected from each of the sampled schools at the second stage (LaRoche et al., 2020). The Saudi sample of schools was drawn systematically so that the sampled schools would represent the populations of grade 4 and grade 8 students nationally, with representation from 13 regions and a balance between male and female schools. Implicit stratification methods were used to ensure representation of the various school types (such as public versus private schools).

The IEA requires high participation rates and adherence to standardized administration procedures for participating countries to be included in the international results. The IEA calculates and provides sampling weights (to ensure that the final sample of participating students can be generalized to the national populations of grade 4 and grade 8 students) and plausible values (Footnote 5) for mathematics and science scores to ensure accurate population-level estimates of achievement and facilitate appropriate analyses taking the complex nature of the data into account.

NALO 2018

Saudi Arabia’s NALO is administered by the Education and Training Evaluation Commission (ETEC), an independent government agency responsible for school evaluation, accreditation, and assessment, among other responsibilities. In 2018, the domains assessed by NALO were mathematics and science in grades 4 and 8, which means that the data from NALO 2018 are closely aligned to TIMSS 2019 both in terms of the target domains and grade levels. Following a similar approach to TIMSS, NALO also collects contextual information through student, parent, teacher, and school questionnaires.

In NALO 2018, 27,985 grade 4 students (50.2% male) across 964 government, private, and Quran schools completed tests of mathematics and science, as did 30,157 grade 8 students (49.6% male) across 939 schools. The schools that took part in NALO were sampled using procedures similar to those used in TIMSS. Also following the procedures in TIMSS, sampling weights are provided by ETEC to ensure that the NALO database is weighted to represent the national population, and plausible values and replicate weights are provided for appropriate calculation of achievement data.

Measures and variables

In this study, data from the TIMSS mathematics and science tests as well as the TIMSS student, parent, teacher, and school questionnaires were used for the primary analysis, and data from the NALO mathematics and science tests and the student, teacher, and school questionnaires were used to complement the primary analysis.

Outcome variables

Students’ mathematics and science achievement constituted the outcome variables. In TIMSS, the scores for each student across the two subjects and grades are reported on scales with international centrepoints set at 500 and standard deviations (SD) at 100, with most scores falling within the 300–700 band. Following a similar scaling approach to TIMSS, NALO scores for each student across the two subjects and grades are reported on scales with national averages set at 500 and SD at 100.

Predictor variables

A range of predictor variables related to student, teacher, and school demographics and home background, student engagement and attitudes, school climate, teacher qualifications and practices, and school leadership and resources were included in the analysis. Information about these variables can be found in Additional file 1: Table S1.

Statistical analysis

Prior to the main analysis, descriptive statistics were computed, and a series of statistical tests were conducted to provide a comprehensive overview of the gender differences across the contextual variables of interest to this study. The levels of statistical significance along with the relevant effect sizes for each of these differences are reported. For categorical contextual variables, phi (φ) was used as the effect size measure for 2 × 2 contingency tables and Cramér’s V (φc) for contingency tables larger than 2 × 2. Cohen’s d was used as the effect size measure for continuous contextual variables (Fritz et al., 2012). Cohen’s (1988) guidelines were used in conjunction with Hattie’s (2009) guidelines for the interpretation of effect sizes. The IEA International Database Analyzer (IDB Analyzer) (IEA, 2021) and the Statistical Package for the Social Sciences (SPSS) (IBM Corporation, 2020) were used to compute estimates for TIMSS and NALO, respectively.
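For readers less familiar with these three effect size measures, the following minimal Python sketch shows their textbook definitions. It is not the authors’ procedure: the estimates in the paper were produced with the IDB Analyzer and SPSS using sampling weights, whereas this sketch is unweighted, and all function names are hypothetical.

```python
import math


def phi_coefficient(table):
    """Phi for a 2 x 2 contingency table given as [[a, b], [c, d]] (unweighted counts)."""
    (a, b), (c, d) = table
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom


def cramers_v(table):
    """Cramer's V for an r x c contingency table (list of rows of unweighted counts)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0]))  # smaller of the number of rows and columns
    return math.sqrt(chi2 / (n * (k - 1)))


def cohens_d(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Cohen's d using the pooled standard deviation of the two groups."""
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / math.sqrt(pooled_var)
```

For a 2 × 2 table, Cramér’s V equals the absolute value of phi, which is why the two measures can be reported side by side for tables of different sizes.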

Next, hierarchical two-level linear regression analysis was conducted to explore the different factors that are associated with all students’ achievement in mathematics and science in grades 4 and 8 more generally as well as boys’ underperformance in Saudi Arabia. The regression models draw on the TIMSS 2019 data. Four models were constructed: (i) grade 4 mathematics, (ii) grade 4 science, (iii) grade 8 mathematics, and (iv) grade 8 science. Students were the unit of analysis at level 1 and classes were the unit of analysis at level 2. However, as the number of intact classes sampled from each school varied, with some schools having one class and other schools having two classes sampled, the class level is confounded with the school level; in other words, for schools where only one intact class is available, the class level is identical to the school level. For this reason, two-level rather than three-level analysis was conducted, with classes constituting the level-2 unit of the analysis. However, the possible confounding of class and school levels was taken into account in conducting the analysis (e.g., calculation of the sampling weights) and in the interpretation of the results.

Along with mathematics and science achievement scores, variables included in the multilevel models are drawn from the student and home questionnaires (level 1 of the analysis) and teacher and school principal questionnaires (level 2 of the analysis). Variables were selected for inclusion a priori, based on previous literature on student achievement and the gender gap in education in particular, as discussed earlier, in addition to the expected theoretical or policy relevance of these variables to the question of gender differences in the Saudi context. As far as possible, each model was constructed using the same set of variables, notwithstanding some slight differences arising from the selection of variables related specifically to mathematics or science instruction and some differences between the grade 4 and 8 questionnaires.

A hierarchical approach was followed, whereby conceptually similar variables were entered into each step of each model in blocks (Table 1). The first step of each model included gender only (Footnote 6). By entering the gender variable into the model alone, the difference in achievement between boys and girls, after controlling for the clustering of the data within classes/schools, could be observed. Next, different blocks of variables were entered into the model one by one to allow for the examination of their contribution in predicting overall achievement and in explaining the difference in achievement between boys and girls. In the final step of each model, the statistical significance of the interactions between gender and each of the predictor variables in predicting achievement was explored. Each interaction term was entered into the model individually and all the statistically significant interaction terms were entered into the final model. To facilitate interpretation, the statistically significant interaction terms were plotted using the predicted values based on the last step (step 7) of each model. Hence, the interaction plots shown below present the predicted, rather than the raw, gender differences in each variable in terms of mathematics and science achievement, after accounting for a range of student- and class/school-level predictor variables.

Table 1 Steps in building the hierarchical two-level linear regression models

Equation 1 represents the random intercept null multilevel models (models with no predictor variables), which were applied to the TIMSS data, controlling for their clustering and allowing for the estimation of the proportions of the total variance in the outcome variable that is attributable within and between clusters (i.e., intraclass correlation [ICC] coefficients).

$$y_{ij}=\beta_{0}+u_{0j}+e_{ij} \tag{1}$$

where \({y}_{ij}\) is the outcome variable (e.g., mathematics achievement) of student i in class/school j, \({\beta }_{0}\) is the mean intercept, \({u}_{0j}\) is the variation of class/school j from the mean intercept, and \({e}_{ij}\) is the student-level residual error term.
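The ICC mentioned above is conventionally obtained from the two variance components of the null model in Equation 1. In the sketch below, \(\sigma_{u0}^{2}\) and \(\sigma_{e}^{2}\) are notation assumed here for the variances of \({u}_{0j}\) and \({e}_{ij}\); they are not symbols used in the paper.

$$\mathrm{ICC}=\frac{\sigma_{u0}^{2}}{\sigma_{u0}^{2}+\sigma_{e}^{2}}$$

With purely hypothetical values of \(\sigma_{u0}^{2}=3000\) and \(\sigma_{e}^{2}=7000\), the ICC would be 0.30, meaning that 30% of the total variance in achievement would lie between classes/schools and 70% within them.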

Equation 2 represents the random intercept multilevel models, which were applied to the TIMSS data, including ν predictor variables at the student level (ij) or the class/school level (j), while controlling for the clustering of the data.

$$y_{ij}=\beta_{0}+u_{0j}+\beta_{1}x_{1ij/j}+\dots+\beta_{\nu}x_{\nu ij/j}+e_{ij} \tag{2}$$

where \({y}_{ij}\) is the outcome variable (e.g., mathematics achievement) of student i in class/school j, \({\beta }_{0}\) is the mean intercept, \({u}_{0j}\) is the variation of class/school j from the mean intercept, \({\beta }_{1}\) is the regression slope for the predictor variable \({x}_{1}\), measured either for student i in class/school j or for class/school j as a whole, \({\beta }_{\nu }\) is the corresponding regression slope for the predictor variable \({x}_{\nu }\), and \({e}_{ij}\) is the student-level residual error term. Additional file 1: Table S3 presents detailed equations for each step of the models.
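To illustrate how a gender interaction enters a model of this form, a single-predictor sketch is given below. The gender indicator \(g\), its 0/1 coding, and the coefficient labels are assumptions made for this illustration only; the authors’ full specifications are those in Additional file 1: Table S3.

$$y_{ij}=\beta_{0}+u_{0j}+\beta_{1}g_{ij}+\beta_{2}x_{ij/j}+\beta_{3}\,(g_{ij}\times x_{ij/j})+e_{ij}$$

Under this coding, the slope of \(x\) is \(\beta_{2}\) for the group coded 0 and \(\beta_{2}+\beta_{3}\) for the group coded 1, so a statistically significant \(\beta_{3}\) would indicate that the association between \(x\) and achievement differs between boys and girls.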

All five plausible values of achievement and the sampling weights were used in all the analyses, following the guidelines of von Davier et al. (2009) and Rutkowski et al. (2010), respectively. Given the imputation methodology of plausible values used to scale achievement data in TIMSS, there were no missing values in the achievement-related variables (i.e., mathematics and science achievement). Missing value analysis of the contextual variables showed that the average proportion of missing values across the variables included in the analysis at both grade 4 and grade 8 was 6.2% and that data were not missing at random; given this relatively low proportion and the non-random pattern of missingness, missing values were not imputed. Assumptions necessary for conducting the multilevel linear regression analysis (i.e., linearity, homogeneity of variance, normality of errors) were checked and met. Parameters for the models were estimated using maximum likelihood estimation with robust standard errors (SEs), the default estimator for multilevel models with continuous outcomes in Mplus, which provides estimates that are robust to the non-independence of observations (which, in this study, results from the clustering of students within classes/schools) (Muthén & Muthén, 1998–2017). Multilevel linear regression analysis was performed using Mplus 8 (Muthén & Muthén, 1998–2017).
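The general logic of analysing all five plausible values can be sketched as follows: the same model is estimated once per plausible value, and the resulting estimates are then combined. The Python sketch below uses a standard multiple-imputation combining rule (often called Rubin’s rules). It is an illustration under that assumption, not the authors’ Mplus setup, and the function name and inputs are hypothetical.

```python
import statistics


def pool_plausible_values(estimates, sampling_variances):
    """Combine one estimate per plausible value into a pooled estimate and SE.

    estimates: the same statistic (e.g., a regression coefficient) computed
        separately with each plausible value.
    sampling_variances: the squared standard error of each of those estimates.
    """
    m = len(estimates)
    pooled = statistics.fmean(estimates)            # average of the per-PV estimates
    within = statistics.fmean(sampling_variances)   # average sampling variance
    between = statistics.variance(estimates)        # variance across PVs (divisor m - 1)
    total_variance = within + (1 + 1 / m) * between
    return pooled, total_variance ** 0.5
```

For example, `pool_plausible_values([500, 502, 498, 501, 499], [4, 4, 4, 4, 4])` combines five hypothetical estimates that each have a squared SE of 4. The pooled SE is larger than any single SE because it also reflects the variation between plausible values.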

Reported statistics for each of the models include: proportions of variance (R2; expressed as a percentage of the total variance) in achievement explained at each level and step; intercepts with their SEs; unstandardized coefficients (Bs) and standardized coefficients (βs) each accompanied by their SEs for each predictor variable; fit statistics (Loglikelihood (H0), Akaike Information Criterion (AIC), and Bayesian Information Criterion (BIC)) for each step, including the null model; and ICC coefficients for the null model. Bs are expressed in the original unit of each of the predictor variables, while βs can be used to compare the relative strength of each predictor variable in predicting achievement (i.e., to find the most robust predictors of achievement) in each model.
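One common way to express variance explained at a given level of a multilevel model is the proportional reduction in that level’s variance component relative to the null model. The paper does not state which formula underlies its R2 values, so the sketch below is only an assumption-laden illustration of that convention, with a hypothetical function name.

```python
def percent_variance_explained(null_variance, model_variance):
    """Percentage reduction in a variance component relative to the null model.

    null_variance: variance component (student or class/school level) in the null model.
    model_variance: the same component after predictors have been entered.
    """
    if null_variance <= 0:
        raise ValueError("null_variance must be positive")
    return 100.0 * (null_variance - model_variance) / null_variance
```

With hypothetical class/school-level variances of 4000 in the null model and 3000 after a block of predictors is entered, the function returns 25.0, that is, 25% of the between-class/school variance would be accounted for by that step.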

Although fit indices are not intrinsically interpretable (i.e., their values cannot be interpreted as being large or small in themselves), they can be compared across different steps of each model to check whether changes in the model lead to better fit. For the AIC and BIC, smaller values indicate better model fit regardless of their absolute size; for the log-likelihood, higher (i.e., less negative) values indicate better fit.
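For reference, the two information criteria are simple functions of the log-likelihood, the number of estimated parameters, and (for the BIC) the sample size. The Python sketch below uses the standard definitions. The function and argument names are hypothetical, and treating the sample size as the number of observations is an assumption of this sketch rather than a statement about how Mplus was configured.

```python
import math


def aic(loglik, n_params):
    """Akaike Information Criterion: -2 * log-likelihood + 2 * number of parameters."""
    return -2 * loglik + 2 * n_params


def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion: -2 * log-likelihood + parameters * ln(sample size)."""
    return -2 * loglik + n_params * math.log(n_obs)
```

Both criteria penalize additional parameters, so a later step in a model improves on an earlier one only if the gain in log-likelihood outweighs the penalty for the predictors added.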

Results

Descriptive statistics

Descriptive statistics for outcome variables

Table 2 shows mean scores in mathematics and science by gender in the TIMSS 2019 and NALO 2018 assessments. Results from both datasets show that boys (or, given the gender-segregated school system in Saudi Arabia, schools attended by boys) consistently underperform compared to girls (or schools attended by girls) in both subject domains and at both grade levels. The differences are larger for science than for mathematics, at both grade levels, in both TIMSS and NALO.

Table 2 Mean mathematics and science achievement in Saudi Arabia, by grade level and gender, TIMSS 2019 and NALO 2018

Descriptive statistics for predictor variables – TIMSS 2019

Table 3 shows the percentages of boys and girls in grades 4 and 8 in Saudi Arabia across the TIMSS 2019 contextual categorical variables of interest to this paper (i.e., categorical variables that were included in the models). Table 4 shows the means (m) and SDs for boys and girls in grades 4 and 8 in Saudi Arabia across the TIMSS contextual continuous variables of interest to this study (i.e., continuous variables that were included in the models). The tables also include the effect sizes (φ/φc and d) for the gender differences in each of the contextual variables. As some of the estimates presented in Table 4 are not intrinsically interpretable (i.e., their values cannot be interpreted as being large or small per se), thresholds and their corresponding interpretation along the continuum of each of these variables, as set by TIMSS, are provided in Additional file 1: Table S2.

Table 3 Contextual categorical variables by gender, TIMSS 2019
Table 4 Contextual continuous variables by gender, TIMSS 2019

Across most of the TIMSS 2019 contextual variables, the differences between boys and girls are statistically significant. However, most of the differences yielded small to moderate effect sizes. Among these differences, the most noticeable at grade 4 is in teachers’ absenteeism: girls are more likely to attend schools where teacher absenteeism is a minor problem, whereas boys are more likely to attend schools where it is either not a problem or a moderate to serious problem. Another considerable difference can be found in teachers’ major area of study: more teachers in boys’ schools have education and mathematics or science as their major area of study, whereas teachers in girls’ schools tend to primarily have mathematics or science but not education as their major area of study. Differences in teachers’ professional development are also noticeable, with teachers in boys’ schools having attended fewer hours of professional development on mathematics and science compared to teachers in girls’ schools. In comparison to grade 4 girls, grade 4 boys report less positive attitudes (liking or feeling confident) toward mathematics and science, a lower sense of school belonging, and more frequent bullying. Additionally, boys’ schools tend to be less safe and orderly, and teachers in boys’ schools tend to report lower levels of job satisfaction.

At grade 8, both mathematics and science teachers are younger, on average, in boys’ schools. In line with findings at grade 4, more grade 8 teachers in boys’ schools have education and mathematics or science as their major area of study, compared to teachers in girls’ schools, who tend to primarily have mathematics or science but not education as their major area of study. Grade 8 teachers in boys’ schools report attending fewer hours of professional development on mathematics and science compared to teachers in girls’ schools. Additionally, grade 8 boys’ schools tend to be less safe and orderly, with lower levels of discipline and more frequent bullying among students, compared to girls’ schools, while teachers in boys’ schools tend to also report lower levels of job satisfaction (Footnote 7).

Additional contextual information for grades 4 and 8 – NALO 2018

This section draws on data from NALO questions that were not available in TIMSS to enrich the description of students’ experience of education in Saudi Arabia. All of the differences between boys and girls noted in this section are statistically significant at the 0.001 level. However, the effect sizes associated with these differences were generally small.

NALO data indicate that 14.5% of grade 4 boys had repeated a year at school because of poor academic performance, compared to 6.6% of grade 4 girls (Table 5; Footnote 8). In addition, grade 4 boys reported engaging in a lower level of reading than grade 4 girls. For example, 22.6% of boys reported never having read a book, compared to 12.0% of girls. Conversely, 33.8% of boys had read more than 10 books, compared to 40.4% of girls. However, when asked whether mathematics is important in life, grade 4 boys and girls provided broadly similar levels of agreement (Table 5). Among grade 4 immigrant students (those reported by their parents to have been born outside Saudi Arabia), boys were more likely to attend a private school (13.9%) than girls (2.6%) (Table 6). However, the vast majority of both boys (85.0%) and girls (95.2%) attended government schools.

Table 5 Student reports of selected variables, by school gender, NALO 2018
Table 6 Percentages of students not born in Saudi Arabia attending schools of various types (parent reports), by school gender, NALO 2018

NALO data also show differences in perceptions among teachers, school principals, and parents in boys’ and girls’ schools. Teachers of grade 4 boys were substantially less likely to agree that parents have a good understanding of their child’s current academic level, suggesting a greater misalignment between student performance and parental understanding in boys’ schools compared to girls’ schools. This difference was associated with the largest effect size observed among all the selected NALO variables (φ = 0.302; Table 7). Teachers in boys’ schools were also less likely than teachers in girls’ schools to report that high-achieving students are respected among their peers (Table 7).

Table 7 Percentages of teachers ‘always’ agreeing with selected statements, by school gender, NALO 2018

School principals’ reports correspond with those of their teachers in relation to parental support for learning. Principals of girls’ schools report a higher degree of parental support for learning than those in boys’ schools, and also a higher degree of satisfaction among parents with their child’s educational progress (Table 8). More grade 4 boys (67.3%) than grade 4 girls (48.5%) have access to a school library from which they can borrow books, despite girls themselves reporting reading more books, as noted above.

Table 8 School principals’ reports of selected variables, by school gender, NALO 2018

Boys in grade 8 were more likely (6.9%) than girls (4.5%) to report having repeated a year at school because of poor academic performance (Table 5). However, both in absolute and in relative terms, the differences are smaller at grade 8 than grade 4. In terms of reading behavior, grade 8 boys report a more nuanced pattern than seen at grade 4. As at the lower grade, more boys (30.6%) than girls (19.3%) report never having read a book. However, boys and girls are equally likely to report having read more than six or more than 10 books. Another difference from students’ responses at grade 4 is that, in grade 8, boys agree more strongly that mathematics is important in life relative to their female peers (47.6% of boys agreeing a lot, compared to 36.3% of girls). Among immigrant students, grade 8 boys were much more likely to attend a private school (11.1%) than grade 8 girls (2.7%) (Table 6). Nonetheless, as at grade 4, most boys (87.7%) and girls (96.3%) attended government schools.

Again, similar to grade 4, NALO data for grade 8 show the differences in perceptions among teachers, school principals, and parents in boys’ and girls’ schools. A substantially lower proportion of teachers of grade 8 boys agree that parents have a good understanding of their child’s current academic level (associated with the second-largest effect size observed: φ = 0.236; Table 7). As at grade 4, this suggests greater misalignment between student performance and parental understanding among parents in boys’ schools.

A lower level of respect for students who achieve at a high academic level is reported in boys’ schools, although teachers in boys’ schools are slightly more likely to view their grade 8 students as always being keen to excel academically. School principals report weaker parental support for learning for grade 8 boys than for grade 8 girls (Table 8), and that boys’ parents’ expectations are being met to a lesser extent. There was little difference in grade 8 boys’ and girls’ access to a school library from which they can borrow books.

Overall findings from multilevel models

Table 9 summarizes the main findings from the multilevel models. As discussed above, a series of hierarchical two-level linear regression models were constructed, starting with a simple model that includes gender but no other predictor variables. Then, with each step, a further block of predictors was added (student demographics and home background, student engagement and attitudes, school climate, teacher qualifications, and school leadership and resources), and the resulting change in the achievement gap between girls and boys was examined. Because the school system in Saudi Arabia is gender-segregated, this gap can equivalently be read as the gap between schools attended by girls and schools attended by boys. Table 9 shows the coefficient for the gender difference. Regression estimates for the other predictor variables in steps 1–7 of each model are presented in Additional file 1: Tables S3–S7, while steps 6 and 7 are also presented and discussed in detail in the Findings for grade 4 and Findings for grade 8 sections below.
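For orientation, a generic two-level random-intercept specification of the kind described above can be written as follows. This is a sketch under stated assumptions rather than the authors' exact model: the symbols $y_{ij}$, $G_j$, $x_{pij}$, and $w_{qj}$ are introduced here for illustration only, and estimation details such as sampling weights and plausible values are omitted.

```latex
\begin{align}
  y_{ij} &= \beta_{0j} + \sum_{p} \beta_{p}\, x_{pij} + e_{ij},
    & e_{ij} &\sim N(0, \sigma^{2}_{e}) \\
  \beta_{0j} &= \gamma_{0} + \gamma_{G}\, G_{j} + \sum_{q} \gamma_{q}\, w_{qj} + u_{0j},
    & u_{0j} &\sim N(0, \sigma^{2}_{u})
\end{align}
```

Here $y_{ij}$ is the achievement of student $i$ in class/school $j$, $G_j$ is the gender indicator (which varies only between schools in a gender-segregated system), $x_{pij}$ are student-level predictors, and $w_{qj}$ are class/school-level predictors. In this notation, step 1 corresponds to a model containing only $G_j$, later steps add blocks of $x_{pij}$ and $w_{qj}$, and the interaction step adds product terms such as $G_j\, x_{pij}$ and $G_j\, w_{qj}$. The coefficient $\gamma_{G}$ is the adjusted gender gap tracked across steps.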

Table 9 Summary of main findings from the hierarchical two-level linear regression models

As shown in Table 9, boys underperformed relative to girls in Saudi Arabia across both grades and subjects (step 1). The achievement gap between boys and girls is greater in science than in mathematics across both grades. For example, boys in grade 4 underperformed girls by 53 points in science compared to 20 points in mathematics. The results also show that, in grade 4, controlling for student, home, teacher, and school characteristics accounts for the entire gender gap in mathematics and more than half of the gap in science. Specifically, when student-level predictor variables such as student demographics, home resources for learning, and literacy and numeracy readiness are taken into account, the gap in mathematics achievement drops from 20 to 8 points and is no longer statistically significant (step 2).

However, in grade 8, controlling for a wide range of characteristics from the student, parent, teacher, and principal questionnaires explains a relatively small portion of the achievement gap. As shown in Table 9, the gender gap between grade 8 boys and girls declines by 4 points in mathematics and 11 points in science once all the predictor variables are included (step 6). Nevertheless, a significant unexplained gender gap favoring girls still exists in both subjects in grade 8. Estimates from Table 9 show that boys underperformed girls by 16 points in mathematics and 40 points in science, even after controlling for all observed characteristics (Footnote 9).
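The contrast between the two grades is easier to see when the reduction in the gap is expressed as a share of the initial gap. The short sketch below does this arithmetic for the gap values quoted in the text (absolute magnitudes, in score points). The function name is hypothetical, and the calculation is a simple descriptive ratio, not a formal decomposition of the gap.

```python
def share_explained(initial_gap: float, adjusted_gap: float) -> float:
    """Fraction of the initial gender gap that disappears after adding controls."""
    return (initial_gap - adjusted_gap) / initial_gap


# Gap magnitudes as quoted in the text: (initial gap, adjusted gap).
reported_gaps = {
    "grade 4 mathematics (step 1 vs step 2)": (20.0, 8.0),
    "grade 4 science (step 1 vs step 6)": (53.0, 21.5),
    "grade 8 mathematics (step 1 vs step 6)": (20.8, 16.3),
}

for label, (initial, adjusted) in reported_gaps.items():
    print(f"{label}: {share_explained(initial, adjusted):.0%} of the gap accounted for")
```

On these figures, roughly three-fifths of the grade 4 gaps are accounted for by the modeled variables, compared with roughly one-fifth of the grade 8 mathematics gap.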

To explore the extent to which predictors in the models have different effects on boys’ and girls’ performance, the statistical significance of the interactions between gender and each of the predictor variables was examined in step 7 of each model. The results from these interactions are presented in the final step (step 7) of each model, shown in Tables 10 and 11 below. Overall, results for the examined interaction terms show that, in grade 4, school climate, student absenteeism, and early numeracy and literacy skills contribute to the achievement gap between girls and boys in Saudi Arabia. A safe and orderly school climate is more strongly associated with improvements in boys’ mathematics and science achievement than girls’ achievement (in both subjects). The findings also indicate that boys’ mathematics achievement decreases to a greater degree than girls’ achievement with more frequent student absenteeism. In addition, the results suggest that, even though greater literacy and numeracy readiness was linked with improvements in science achievement of both boys and girls, boys tended to benefit more from this readiness than girls. For grade 8, boys’ mathematics achievement increases to a greater degree in schools with stronger emphasis on academic success than girls’ achievement. Feeling more confident in science was also associated with greater achievement gains in the subject among boys compared to girls.

Table 10 Steps 6 and 7 from the hierarchical two-level linear regression models for mathematics and science achievement, grade 4, TIMSS 2019
Table 11 Steps 6 and 7 from the hierarchical two-level linear regression models for mathematics and science achievement, grade 8, TIMSS 2019

Findings for grade 4

The results from the final two steps (steps 6 and 7) for the grade 4 mathematics and science models are presented in this section. Results for all steps are shown in Additional file 1: Tables S3–S7.

Mathematics

Table 10 provides the coefficients and model statistics for grade 4 mathematics from steps 6 and 7. Step 7, the final model, explains a substantial proportion of the observed variance in grade 4 mathematics achievement: 26% at level 1 (student level) and 71% at level 2 (class/school level), or 40% of the total observed variance.

As shown in step 6, the coefficient for gender is negative and not statistically significant. This indicates that, when controlling for student, home, teacher, and school characteristics, the achievement gap between grade 4 boys and girls in mathematics (equivalently, given the gender-segregated school system in Saudi Arabia, the gap between schools attended by boys and schools attended by girls) is no longer statistically significant. Findings from step 6 also suggest that students’ home resources for learning, early literacy and numeracy skills, absenteeism, bullying, attitudes toward mathematics, and sense of school belonging (Footnote 10) are significantly associated with student achievement, after holding other variables constant. For instance, students who are absent once a week tend to underperform students who are never or almost never absent by 24 points, which is equivalent to 24% of an SD. Also, students with stronger early literacy and numeracy skills (B = 7.6, p < 0.001) and those who reported being bullied less frequently (B = 4.9, p < 0.001) achieved higher mathematics scores relative to other students. At the class/school level, school location, poor teacher timekeeping, teacher experience, and professional development are significantly associated with student performance. Surprisingly, after holding other variables constant, students in schools located in small towns or remote areas perform better in mathematics than students in urban areas (B = 54.9, p < 0.05), while with each additional year of teacher experience, students score 1.8 points higher in mathematics (p < 0.01).

In step 7, the interactions between gender and each of the predictors were examined, but only the results for interaction terms that are statistically significant are reported here. Overall, poor school climate tends to be associated with lower achievement more strongly for boys than for girls. As shown in Fig. 2, the achievement gap between boys and girls is greater in schools with poor school climate relative to other schools. In schools with a safe and orderly school climate, boys and girls tend to perform similarly.
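To make the logic of such an interaction concrete, the sketch below computes predicted scores from a linear model with a gender × school climate product term. All coefficients are hypothetical values chosen for illustration and are not taken from Table 10; treating school climate as a standardized continuous score is likewise an assumption of this sketch.

```python
# Hypothetical coefficients for illustration only (not taken from Table 10).
INTERCEPT = 500.0        # predicted score for girls at average school climate
B_BOY = -20.0            # gender main effect (boys relative to girls)
B_CLIMATE = 5.0          # climate slope for girls (per SD of climate score)
B_BOY_X_CLIMATE = 10.0   # additional climate slope for boys (interaction term)


def predict(boy: int, climate_z: float) -> float:
    """Predicted achievement from a linear model with a gender x climate interaction."""
    return (INTERCEPT
            + B_BOY * boy
            + B_CLIMATE * climate_z
            + B_BOY_X_CLIMATE * boy * climate_z)


def gender_gap(climate_z: float) -> float:
    """Girls' minus boys' predicted score at a given school climate level."""
    return predict(0, climate_z) - predict(1, climate_z)


for z in (-2.0, 0.0, 2.0):
    print(f"climate z = {z:+.0f}: girls {predict(0, z):.0f}, "
          f"boys {predict(1, z):.0f}, gap {gender_gap(z):.0f}")
```

With a positive interaction coefficient, the predicted gap is widest where school climate is poorest and narrows as climate improves, which is the qualitative pattern described in the text.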

Fig. 2 Interaction between gender and safe and orderly school climate on mathematics achievement, grade 4. The plot presents the predicted values from step 7 of the mathematics model for grade 4 (Table 10)

As shown in Figs. 3 and 4, a number of factors are associated with the underperformance of grade 4 boys in comparison to girls in school. One such factor is being absent from school once a week, which is more strongly associated with decreases in boys’ mathematics achievement compared to girls’. This means that boys’ mathematics achievement appears to suffer more from frequent absences from school compared to girls’ achievement (Fig. 3). Teacher age is also associated differently with the achievement of grade 4 boys and girls in Saudi Arabia. Having a younger teacher (29 years or younger), rather than an older teacher (40 years or older), is more strongly associated with lower mathematics achievement among boys compared to girls (Fig. 4).

Fig. 3 Interaction between gender and frequency of student absenteeism on mathematics achievement, grade 4. The plot presents the predicted values from step 7 of the mathematics model for grade 4 (Table 10). Only the reference category and the category for which a statistically significant interaction with gender was found are presented

Fig. 4 Interaction between gender and teacher age on mathematics achievement, grade 4. The plot presents the predicted values from step 7 of the mathematics model for grade 4 (Table 10). Only the reference category and the category for which a statistically significant interaction with gender was found are presented

Science

Table 10 also presents the coefficients and model statistics for grade 4 science. Similar to grade 4 mathematics, the results from steps 6 and 7 are presented. The final model explains a substantial proportion of the observed variance in grade 4 science achievement: 27% at level 1 and 65% at level 2, or 39% of the total observed variance. As shown in step 6, the coefficient for gender is positive and statistically significant (B = 21.5, p < 0.05). This coefficient is much smaller in magnitude compared to the basic model (step 1 in Table 9), which suggests that controlling for student, home, teacher, and school characteristics reduces the science achievement gap between grade 4 boys and girls (equivalently, given the gender-segregated school system in Saudi Arabia, the gap between schools attended by boys and schools attended by girls) by more than half. There is still, however, a significant unexplained gap between boys and girls in grade 4 science. Students’ immigration status, early literacy and numeracy skills, absenteeism, bullying, attitudes toward science, and sense of school belonging are significantly associated with student achievement in science. For example, students who are absent once a week tend to underperform students who are never or almost never absent by 21 points. Also, students with stronger early literacy and numeracy skills tend to perform better in grade 4 science compared to other students (B = 7.6, p < 0.001).

School location, school mean of home resources, teacher experience, age, and professional development are the class/school-level variables that were significantly associated with fourth-grade students’ performance in science. After holding other variables constant, students in schools located in small towns or remote areas perform better in science than students in urban areas (B = 60.8, p < 0.01). Students of younger teachers scored lower in science than students of older teachers. Relative to students of teachers who are 40 years or older, students whose teachers are between 30 and 39 years old scored 29 points lower, and students whose teachers are 29 years old or younger scored 44 points lower. Also, students whose teachers have not participated in any professional development training on science (i.e., completed zero hours in professional development) tend to score much lower than students whose teachers have completed more than 35 hours of professional development in science (B = -35.3, p < 0.05). Consistent with grade 4 mathematics, results in science from step 7 show that poor school climate is associated with lower achievement more strongly for boys than for girls (Fig. 5). The achievement gap between boys and girls is greater in schools with poor school climate relative to other schools. In addition, boys tend to benefit more from literacy and numeracy readiness than girls (Fig. 6), and girls in urban and suburban areas outperform boys (Fig. 7).

Fig. 5 Interaction between gender and safe and orderly school climate on science achievement, grade 4. The plot presents the predicted values from step 7 of the science model for grade 4 (Table 10)

Fig. 6 Interaction between gender and literacy and numeracy readiness for school on science achievement, grade 4. The plot presents the predicted values from step 7 of the science model for grade 4 (Table 10)

Fig. 7 Interaction between gender and school location on science achievement, grade 4. The plot presents the predicted values from step 7 of the science model for grade 4 (Table 10). Only the reference category and the category for which a statistically significant interaction with gender was found are presented

Findings for grade 8

The results from the final two steps (steps 6 and 7) for the grade 8 mathematics and science models are presented in this section. Results for all steps are shown in Additional file 1: Tables S3–S7.

Mathematics

Table 11 presents the coefficients and model statistics for grade 8 mathematics. The gender difference in mathematics achievement remains statistically significant and only slightly smaller in magnitude (B = 16.3, p < 0.05) than the gender difference recorded in step 1 (B = 20.8, Table 9), even after the addition of the selected conceptual blocks of predictor variables. Despite the relatively small effect of the predictor variables on the extent of the gender difference, the final model, including interactions, explains a substantial proportion of the observed variance in grade 8 mathematics achievement: 27% at level 1 and 74% at level 2, or 39% of the total observed variance.

Student-level factors that were significantly associated with higher mathematics achievement among both boys and girls were: first-generation or second-generation immigrant status, greater access to home learning resources, infrequent absence from school (no more than once every two months), feeling more confident in mathematics, lower liking of mathematics, and taking less time to complete mathematics homework (15 min or less). Class/school-level factors that were significantly associated with higher mathematics achievement among both boys and girls were: a higher school-average level of home resources for learning across the student body, and mathematics teachers having a postgraduate qualification (master’s or doctorate) rather than a lower qualification.

One statistically significant interaction with gender was observed for grade 8 mathematics. This interaction, involving schools’ emphasis on academic success, is illustrated in Fig. 8. The interaction term indicates that boys’ mathematics achievement increases to a greater degree than girls’ achievement in schools with stronger emphasis on academic success, relative to schools with a weaker emphasis on academic success. However, though statistically significant, the magnitude of the interaction is small, as shown in Fig. 8. Given that this difference was not substantial, this finding should be interpreted with caution.

Fig. 8 Interaction between gender and school emphasis on academic success on mathematics achievement, grade 8. The plot presents the predicted values from step 7 of the mathematics model for grade 8 (Table 11)

Science

Table 11 also presents the coefficients and model statistics for grade 8 science. The gender difference in science achievement remains statistically significant and substantial (B = 39.9, p < 0.001) after the addition of the selected conceptual blocks of predictor variables. Nonetheless, the final model, including interactions, explains a substantial proportion of the observed variance in grade 8 science achievement: 27% at level 1 and 79% at level 2, or 41% of the total observed variance.

Student-level factors that were significantly associated with higher science achievement among both boys and girls were: first-generation or second-generation immigrant status, greater access to home learning resources, infrequent absence from school (no more than once every two months), feeling more confident in science, and taking less time to complete science homework (15 min or less). Class/school-level factors that were significantly associated with higher science achievement among both boys and girls were: a higher school-average level of home resources for learning across the student body, infrequent teacher absenteeism (regarded by principals as not a problem), and science teachers whose qualification was in an area other than science or science education.

Two significant interactions with gender were observed. These interactions, involving the extent to which students feel confident in science and teachers’ age, are illustrated in Figs. 9 and 10. The first interaction term indicates that reporting feeling more confident in science is linked with greater gains in science achievement among grade 8 boys relative to grade 8 girls. The second interaction term indicates that boys’ achievement in grade 8 science is higher when taught by older teachers, whereas girls’ achievement is higher in classes taught by younger teachers (ages 30–39 years old) than in classes taught by older teachers.

Fig. 9 Interaction between gender and student confidence in science on science achievement, grade 8. The plot presents the predicted values from step 7 of the science model for grade 8 (Table 11)

Fig. 10 Interaction between gender and teacher age on science achievement, grade 8. The plot presents the predicted values from step 7 of the science model for grade 8 (Table 11). Only the reference category and the category for which a statistically significant interaction with gender was found are presented

Discussion

The results presented in this paper shed new light on the factors that are associated with mathematics and science achievement in Saudi Arabia, and on the factors that contribute to the large differences in achievement in these subjects between boys and girls (equivalently, given the gender-segregated school system in Saudi Arabia, between schools attended by boys and schools attended by girls). Although there was variation across the four sets of multilevel models in terms of which variables were associated with student achievement when considered simultaneously, some consistency was also evident. Such consistency should help to identify key factors that can be considered as part of educators’ and policy makers’ efforts to raise achievement in elementary and intermediate schools in Saudi Arabia. The most important findings of these analyses, and some of the broader issues arising from them, are therefore summarized below, with particular attention to the most robust findings with the clearest implications for educators and policy makers in Saudi Arabia.

Summary of main findings

The results of the analysis described in this paper suggest that, at the elementary level, early literacy and numeracy skills, student absenteeism, and school climate contribute to the observed gender gap in student performance in Saudi Arabia. Overall, several of the variables examined were found to be significantly associated with both mathematics and science achievement in grade 4, for both boys and girls. Higher scores in mathematics and science were associated with several factors, including students’: (a) having stronger early literacy and numeracy skills upon starting primary school, (b) liking mathematics or science, (c) being confident in mathematics or science, (d) being present at school more regularly, and (e) experiencing a lower frequency of bullying. The set of factors most consistently associated with achievement at grade 4 are predominantly at the student level and drawn mostly from the first two conceptual blocks entered into the models: the home background and student engagement and attitudes.

Similarly, several variables were found to be significantly associated with both mathematics and science achievement in grade 8. However, the pattern of common variables is somewhat different between the two grade levels. Among grade 8 students, higher scores in mathematics and science were associated with students’: (a) immigration status, (b) access to more learning resources at home, (c) feeling confident in mathematics or science, (d) regular presence in school, and (e) enrollment in a school where students have a higher average level of home learning resources. As at grade 4, each of these variables was part of the first two conceptual blocks in the models (the home background and student engagement and attitudes), with four of the five being student-level factors.

Accounting for observed gender differences in achievement in Saudi Arabia

Gender remained a significant predictor of science achievement in grade 4, and both mathematics and science achievement in grade 8, even after accounting for the other predictors. Although the gender difference in achievement is partially accounted for by the modeled variables—leading to a reduction in the gender difference in all models—grade 4 mathematics was the only one of the four sets of models where the final gender difference was no longer statistically significant. This implies that other factors, not examined in the models, contribute to the substantial residual gender difference in grade 4 science and grade 8 mathematics and science.

One possibility is that selection effects could be driving the observed differences—that is, if only high-achieving girls attend school or sit for assessments, but most boys do so, there is a possibility of bias such that girls’ average achievement would appear inflated. However, as enrollment in primary and intermediate education in Saudi Arabia is almost universal among both genders, selection effects are unlikely to be playing a role in this analysis.

Given evidence from other studies, it is likely that differences in reading proficiency play a role in explaining at least part of the remaining gender differences. This is particularly so in relation to science achievement, where test items are by necessity embedded in a context that often requires a greater degree of reading comprehension. Differences in reading proficiency between boys and girls might also contribute to explaining the remaining gender differences observed here in mathematics achievement in grade 8 as test items on certain areas may have a high reading load (i.e., require a higher volume of reading or more complex reading skills). For example, test items assessing applied reasoning or problem-solving skills in grade 8 are more likely to be embedded in a short scenario requiring some level of reading.

International assessments have consistently shown that reading achievement is, on average, substantially lower in Saudi Arabia than in many other countries, both at grade 4 (Mullis et al., 2012, 2017) and among 15-year-old students (OECD, 2019). Moreover, in Saudi Arabia, the reading achievement gap in which girls outperform boys is among the largest gender differences in the world. Differences between boys and girls in reading proficiency have been found to exceed half an SD in both the PIRLS and PISA studies (Mullis et al., 2017; OECD, 2019). Previous research on TIMSS mathematics and science items has shown that items with a higher reading load tend to be more difficult for students to answer correctly than items with a lower reading load, and also that weaker readers are disproportionately disadvantaged by a higher reading load (Mullis et al., 2013). For this reason, the magnitude and consistency of Saudi boys’ relative disadvantage in reading, seen across various studies, seems likely to play a role in contributing to their poorer results found here in mathematics and science achievement even after accounting for a range of contextual variables. The NALO 2018 results provide further support for this view, with boys at both grade levels being more likely than girls to report not having read a book (although it should be noted that this was the case even for a substantial proportion of girls).

The proposed importance of reading skills in underpinning mathematical and scientific achievement is consistent with the pattern of residual variance reported in the models, which indicates a role for other factors operating largely at the student level. After accounting for a range of other student- and class/school-level factors, the majority (approximately three-quarters) of class/school-level variance was explained in the models, whereas a majority of student-level variance remained unexplained. This suggests that residual gender differences in achievement are likely to be associated more strongly with student-level factors, such as reading skills, social and behavioral skills, or aspects of the home background, than with additional class/school-level factors.

The finding from NALO 2018 that more boys than girls have repeated a grade at school because of poor academic performance is worth noting in this regard. Similar data on the extent of grade repetition are available from PISA 2018 (OECD, 2019), where 13.0% of 15-year-old boys in Saudi Arabia reported repeating at least one grade, compared to 9.8% of girls. These findings hint at the likelihood that early disadvantages and difficulties with learning in the early grades may compound over time, and that there is a need for stronger learning supports for students with special educational needs and those who are struggling to enable progression through the education system. Although this issue affects both boys and girls in Saudi Arabia, the figures from NALO and PISA indicate that such compounding educational disadvantage is more clearly apparent among boys.

Teaching quality is another factor that may be associated with gender differences in achievement in Saudi Arabia. Female entrants to the teaching profession in Saudi Arabia tend to score higher than their male counterparts on the teacher licensure examination. This is consistent with evidence from other countries, which shows that the teaching profession attracts more high-ability female teachers than male teachers (Carroll et al., 2021; Corcoran et al., 2004). Differences in abilities between female and male teachers could be explained, in part, by the gender differences in returns to education across occupations (Cortes & Pan, 2018; World Bank, 2012). Research on teacher labor markets has shown that the opportunity cost of becoming a teacher is lower for women than men, due primarily to the more limited occupational opportunities for women outside the field of education (Carroll et al., 2021). Additionally, teaching is traditionally seen as a preferable profession for women in Saudi Arabia, which means that the competition among female graduates for teaching jobs is much higher than the competition among males. From the demand side, this implies a higher probability of selecting cognitively talented teachers from female graduates than from male graduates.

The analyses presented here have accounted for a high proportion of the observed variance in mathematics and science achievement in Saudi Arabia (ranging from 39 to 41% across subject domains and grade levels). Notably, these models largely account for the portion of variance in achievement that can be attributed to the class/school level. This suggests that policy makers may reasonably hope that focusing their attention on improving the class/school-level issues identified here (e.g., safe and orderly school climate, support for academic achievement, teacher attendance and timekeeping) would contribute to creating an education system that promotes higher levels of student achievement for all students. However, the fact that the majority of variance in achievement (70–75%) is attributable to student-level factors means that policy makers will also have to look at the home environment and broader society, as well as the school environment, in order to raise levels of achievement and close the (currently very wide) gaps in achievement between boys and girls in Saudi Arabia.
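As a rough illustration of how such variance shares are obtained in a two-level model, the sketch below compares the variance components of a null (intercept-only) model with those of a full model. The function name and all numbers are hypothetical, chosen only to mirror the general pattern described above; they are not the estimates from the TIMSS or NALO models.

```python
def variance_shares(null_between, null_within, full_between, full_within):
    """Summarise variance components from a null and a full two-level model.

    null_*: variance components from the intercept-only model.
    full_*: residual variance components after adding predictors.
    """
    total_null = null_between + null_within
    return {
        # Share of total variance lying between classes/schools (intraclass correlation).
        "icc": null_between / total_null,
        # Proportional reduction in variance at each level.
        "explained_between": 1 - full_between / null_between,
        "explained_within": 1 - full_within / null_within,
        # Proportional reduction in total variance.
        "explained_total": 1 - (full_between + full_within) / total_null,
    }


# Hypothetical variance components (score points squared), for illustration only.
shares = variance_shares(
    null_between=2500.0, null_within=7500.0,
    full_between=500.0, full_within=5500.0,
)
for name, value in shares.items():
    print(f"{name}: {value:.2f}")
```

In this invented example, most of the class/school-level variance is accounted for, while most of the student-level variance remains unexplained, which is the situation that motivates looking beyond the school environment.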

Limitations

The conclusions that can be drawn from the results reported in this paper are limited to being correlational in nature, as TIMSS and NALO are both cross-sectional studies. It would be incorrect to claim on the basis of these results alone that changes in any of the included variables will lead to corresponding changes in student achievement. In some cases, the findings presented here are clearly consistent with theoretical expectations and evidence from other settings—for example, promoting more regular attendance at a school with a learning-supportive climate may reasonably be expected to have positive implications for student learning. Nonetheless, readers should be aware that the model results need to be interpreted cautiously and with due regard to the wider theoretical and empirical literature. Informed decisions should be based on a broad reading of the literature and the evidence base, including the new results presented in this paper, rather than on any single study.

The results of the models hint at the importance of teachers and teaching quality as contributing factors to student outcomes. However, the strength of any conclusions related to teaching is constrained by limitations in the available data. For example, the TIMSS variable describing teachers’ qualifications at grade 4 was omitted from analysis due to an error identified by the authors in the Saudi Arabia dataset for TIMSS 2019. TIMSS collects some data related to teachers and classroom practices, but more detailed analysis of teacher quality would be possible with other studies explicitly focused on this topic.

Finally, although the highly gender-segregated structure of the education system in Saudi Arabia presents an opportunity to examine the educational environments experienced by boys and girls in relative isolation, this same feature also imposes analytic constraints. As there are no cases in the available data of boys and girls taught in the same classes, boys taught by female teachers, or girls taught by male teachers, it is impossible to disentangle gendered differences in learning outcomes from other factors that covary completely with students’ gender. In these datasets, boys are universally taught by male teachers, who, in turn, received their education and teaching qualifications from all-male institutions, which may differ in important ways from the institutions attended by girls and female teachers. The ongoing rollout of a scheme to assign female teachers to boys in the early grades, as described earlier, will provide opportunities in future to reexamine outcomes among boys and girls in Saudi Arabia while controlling for teacher characteristics to a greater degree.
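The constraint described above can be shown with a small sketch. When student gender and teacher gender coincide for every student, the two indicator columns of a regression design matrix are identical, the cross-product matrix X'X is singular, and separate coefficients for the two variables cannot be estimated. The tiny datasets and helper functions below are hypothetical and serve only to illustrate the point.

```python
def gram(rows):
    """Return X'X for a design with columns (intercept, boy, male_teacher)."""
    design = [(1, boy, male_teacher) for boy, male_teacher in rows]
    return [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix; zero means the columns are linearly dependent."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Hypothetical fully segregated data: every boy has a male teacher, every girl a female teacher.
segregated = [(1, 1), (1, 1), (1, 1), (0, 0), (0, 0), (0, 0)]

# Hypothetical data with some cross-gender teaching assignments.
mixed = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1)]

det_segregated = det3(gram(segregated))
det_mixed = det3(gram(mixed))
print(det_segregated, det_mixed)
```

A zero determinant for the segregated data means that any split of the observed gap between "student gender" and "teacher gender" fits the data equally well. Only data containing cross-gender assignments make the two effects separately estimable.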

Conclusions and implications

The findings of this study point to the relevance of the school climate in understanding gender differences in achievement observed in Saudi Arabia. Although previous research points to the value to students of a stable and supportive school climate in general (Nilsen et al., 2016; Reynolds et al., 2014), the results of this study indicate that boys in Saudi Arabia may be especially impacted by a negative or unstable school environment. Most notably, a safe and orderly school climate in grade 4 and a school climate supportive of academic success in grade 8 are particularly strongly associated with higher achievement for boys, relative to boys attending less orderly or less supportive schools.

School principals, teachers, and other educators should be cognizant of the importance of these factors and should take active steps to build and maintain positive school and classroom environments where students feel safe, connected, and positively challenged to learn and think. Where these conditions are not present, student learning is likely to be impeded. This is especially the case for boys, who may require a greater degree of behavioral support and guidance from adults to engage fully with schoolwork in a structured classroom setting in a single-gender school environment. Where such support and guidance are lacking, boys appear to fall behind in their learning and are at risk of being held back for a year to a greater degree than girls who similarly lack a positive school climate. This may be related to gendered differences in societal expectations (Ridge & Jeon, 2020) and, as indicated by the NALO data, greater support for learning for girls at home (Ridge & Jeon, 2020).

Teachers can help to create positive learning environments and encourage active student participation in their learning by, for example, integrating students’ interests into their lesson material where possible while remaining alert to the effects of stereotypes (such as boys being more suited than girls to science and mathematics, or girls being more suited to reading) on how students engage with lessons and how teachers communicate with their students (Brozo et al., 2014; OECD, 2015). It is also important that lessons are challenging but at a level that students can realistically engage with and understand. Where basic prerequisite learning has not been solidified, teachers are likely to find themselves covering more advanced topics with limited student engagement or understanding (Niemiec & Ryan, 2009). Other practices that teachers can integrate into their teaching in order to create a positive learning environment include offering students choices, providing rationales for decisions made or, where choices cannot be offered, encouraging students to ask questions and to offer their perspectives, listening to and acknowledging students’ contributions, and offering constructive feedback on how students can improve (Teixeira et al., 2020).

A supportive school environment is important for boys’ learning, but support for learning in the home is also crucial. By the time students begin attending school, they have been growing, developing, and learning at home and in the community for several years already. The TIMSS data show that boys in Saudi Arabia begin school with weaker early literacy and numeracy skills than girls. Moreover, the models indicate that boys’ science achievement is more strongly associated with their early literacy and numeracy skills compared to girls’. Boys who begin school with weak early literacy and numeracy skills tend to have considerably lower science achievement than their female counterparts with equivalent early literacy and numeracy skills by grade 4, while science achievement of boys and girls with stronger early literacy and numeracy skills tends to be similar. In other words, boys who begin school at an early learning disadvantage to their peers are further disadvantaged as they progress through the education system and appear to be at more risk of falling behind than girls who begin school with weaker early skills. This can also be seen in students’ reports, in NALO, that boys are more likely to repeat a year in school because of poor academic performance.
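The pattern described above, a gender gap that is wide at low readiness and narrows at high readiness, is what a positive gender-by-readiness interaction term produces. The following sketch uses invented coefficients and a readiness index scaled from 0 (weak) to 1 (strong). Neither the coefficients nor the scale are taken from the models in this paper.

```python
def predicted_score(boy, readiness,
                    b0=400.0, b_boy=-40.0, b_ready=30.0, b_inter=40.0):
    """Predicted score from a linear model with a boy x readiness interaction.

    All coefficients are hypothetical and used for illustration only.
    """
    return b0 + b_boy * boy + b_ready * readiness + b_inter * boy * readiness


def gender_gap(readiness):
    """Predicted boys-minus-girls difference at a given readiness level."""
    return predicted_score(1, readiness) - predicted_score(0, readiness)


for level in (0.0, 0.5, 1.0):
    print(f"readiness={level:.1f}  gap={gender_gap(level):+.1f}")
```

With these invented values, the readiness slope is steeper for boys than for girls, so the predicted gap shrinks as readiness increases and disappears at the top of the scale.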

It is important that parents are aware that early childhood development lays a foundation for future education, health, well-being, and economic success. Public health and education agencies should promote awareness among parents and provide guidance and resources to encourage greater engagement in early learning in the family. For example, simple activities that can contribute to a child’s early literacy and numeracy development could include reading together, describing a scene in everyday life, counting everyday objects or singing counting songs, and using mathematical and spatial language while playing with shapes or other objects (e.g., “behind”, “above”, “beside”, “straight”, “curved”, “double”). Data from TIMSS 2019 show that parents in Saudi Arabia report engaging in activities of these types less frequently with young boys than young girls. Taking steps to increase the level of support for early childhood learning at home for boys, in particular, would likely lead to a stronger foundation in the future for boys starting school and to greater progress in learning among boys. Cultural and social barriers present in Saudi Arabia that contribute to low enrollment of young children in kindergarten—for example, social expectations relating to motherhood and childrearing at home—also need to be considered in this respect.

It is noteworthy that, despite the differences seen across the four models, two variables were found to be significantly associated with both mathematics and science achievement at both grade levels. These were students’ feeling confident in mathematics or science (positively associated with achievement in all cases) and students’ reported levels of absenteeism (more frequent absences being negatively associated with achievement in all cases). The consistency of these findings demands attention from Saudi Arabia’s education community.

In particular, student absenteeism, as an issue that is likely more responsive to policy making than many others, should be considered carefully. The analyses presented here have shown that student absenteeism in Saudi Arabia is widespread, frequent, and consistently associated with achievement in at least two key areas of study (mathematics and science), at both elementary and intermediate school levels. In many countries, student absenteeism is relatively rare, and structures are in place to monitor and promote regular attendance at school. These structures can encompass both informal channels (between the teacher or principal and the child’s parents) and formal channels (formal communication between the school and the parents or, in more extreme cases, a state agency tasked to ensure minimal levels of attendance at school). The frequency of absenteeism for many students in Saudi Arabia, coupled with the likely negative implications of regular absenteeism for achievement, suggests that Saudi Arabia’s policy makers should study efforts in other countries to combat absenteeism (e.g., Knoster, 2016; Rogers & Vegas, 2009) and consider how similar approaches could be usefully adapted to the local context.

A similar problem is apparent with the teaching workforce in Saudi Arabia’s schools. A substantial proportion of school principals, at both elementary and intermediate levels, indicated that teacher absenteeism and poor teacher timekeeping (teachers arriving late to school or leaving early) are problems in their schools. This is consistent with previous research indicating that teachers in Saudi Arabia’s schools often lack enthusiasm for the profession and are poorly motivated (OECD, 2020). Without taking steps to ensure that teachers are both highly skilled and present and engaged in teaching during scheduled working hours, students will continue to be at risk of failing to reach their full potential as a result of failures in school management practices. Other initiatives that may be taken to, for example, build supportive school climates, are likely to be limited as long as they are undermined by poor teacher attendance at school and lack of teacher enthusiasm (in itself, a contributory factor to a school environment that is not conducive to student learning).

Teacher training represents another area for improvement. Results from this paper show that despite male teachers’ greater exposure to education during initial training and their higher qualifications compared to female teachers (see Note 11), boys in Saudi schools achieve much poorer outcomes than girls in both mathematics and science. Although holding higher qualifications does not necessarily imply a higher standard of teaching (Harris & Sass, 2011), especially when the focus of the qualification is unknown, these patterns may signal the poor quality of teacher education and training. Further study of these dynamics as they relate to student outcomes would be useful.

In general, efforts to raise educational achievement in Saudi Arabia require taking a broader view beyond the necessary focus on schools and teachers. As noted above, early child development (physical, cognitive, social) and early learning provide foundations for achievement in elementary and intermediate school, and beyond. Ongoing support for learning at home throughout childhood is also crucial, including modeling of positive behaviors (e.g., reading) and involvement in children’s education by their parents.

Suggestions for further research

Future efforts to explain the observed differences between boys’ and girls’ achievement in Saudi Arabia should, if possible, seek to include a broader range of out-of-school factors in the analysis than was possible with the TIMSS 2019 dataset. For example, some variables that are not available in TIMSS 2019 or NALO 2018 but that could be usefully considered in a future analysis of gender differences in achievement include students’ reading proficiency, students’ engagement in reading for leisure, and the (gendered) nature of parents’ expectations and aspirations for their child’s education, qualifications, and future careers. In particular, considering the importance of literacy as a foundational skill (Gregory et al., 2021), the inclusion of an indicator of reading achievement would help to control for gender differences relating to literacy and would allow more fine-grained examination of mathematical and scientific proficiency. Among international assessments, data from PISA or from a joint TIMSS and PIRLS assessment (such as TIMSS/PIRLS 2011) could be used for this purpose. At the national level, an administration of NALO that assessed reading as well as mathematics or science for the same students could also be used. Given that the majority of unexplained variance in the models presented in this paper was at the student level, extending future analyses in this way should provide further useful insights.

As noted above, a focused examination of teaching quality in Saudi Arabia—incorporating teacher characteristics, quality of teacher education, professional development, availability and use of resources, classroom management, professional collaboration, and pedagogy—would shed further light on some of the points raised in this paper. In particular, differences between the classroom environments of boys and girls, given the gender-segregated structure of the education system, merit closer inspection.

Finally, it would be useful to extend the work presented in this paper by drawing on data from other countries. In the first instance, subsequent research could focus on countries with similar cultural contexts such as other countries with comparable international data within the Gulf or MENA regions. Such research could examine (a) the extent of similarity between observed gender differences in Saudi Arabia compared to other countries, and (b) similarities and differences in the factors associated with student outcomes in each national context. Further work could also examine factors associated with gender differences in single-gender compared to mixed-gender schools (see Fig. 1), particularly in, but not limited to, the MENA region.

Availability of data and materials

The TIMSS datasets analysed in this study are publicly available on the TIMSS website (https://timssandpirls.bc.edu/timss-landing.html). NALO data are not publicly available. Permission to use the NALO data was granted to the authors by the Education and Training Evaluation Commission (ETEC), Saudi Arabia. Requests to access NALO data should be made to ETEC.

Notes

  1. In 2019, Saudi Arabia announced that boys would start to be educated by female teachers in grades 1 through 3. These boys’ classes are kept separated from the girls’ classes. Currently, there are few girls’ schools offering boys’ classes with female teachers, though the intention is that this number will increase in the coming years.

  2. GER can exceed 100% as it may include students who are younger or older than the official age cohort for primary or secondary school.

  3. A composite indicator of learning outcomes at the country level, based on data from large-scale assessments such as TIMSS, PISA, Progress in International Reading Literacy Study (PIRLS), and early-grade reading or mathematics assessments (Angrist et al., 2021).

  4. Representative regional-level data from NALO could be exploited to examine the varying availability of resources and variability in practices across the different regions of Saudi Arabia, and how region-level differences are related to differences in achievement and the gender gap.

  5. Plausible values are generated by imputing a set of values (five values in the case of TIMSS) representing ‘plausible’ estimates of student achievement based on student responses to the assessment and background variables. Plausible values are not suitable for reporting individual-level results, but at the population level, the use of plausible values facilitates the calculation of appropriate standard errors for complex survey designs such as those used by TIMSS where each student is administered only a small subset of the items in the assessment.

  6. While gender information in TIMSS is collected at the student level, in the case of Saudi Arabia, where schools are gender-segregated, it also reflects the school gender type.

  7. The TIMSS variable describing teachers’ qualifications for grade 4 shows that the majority of teachers in Saudi Arabia have secondary education. This is not consistent with official data from the Ministry of Education or other available data sources, which indicate that most teachers in the country have at least a bachelor’s degree. Due to this inconsistency, the authors of the study decided to omit this variable from the analysis.

  8. The relatively large proportion of grade 4 boys reporting having repeated a year at school because of poor academic performance is not consistent with existing data in Saudi Arabia. Given this inconsistency, and given that information on grade repetition is self-reported by students and is therefore subject to recall and other measurement errors, especially at the primary level, these data should be interpreted with caution.

  9. The coefficients of the gender achievement gap in the grade 4 science, grade 8 mathematics, and grade 8 science models increase in step 6. When the interactions between gender and the variables within the School leadership and resources block were examined, none of them were statistically significant. Hence, this increase could not be attributed to an interaction between gender and these variables. A potential explanation for this increase is that the inclusion of additional variables in each of the models introduced new missing cases due to listwise deletion, which, in turn, may have had an impact on the gender coefficient.

  10. The coefficient for sense of school belonging is negative, suggesting a negative correlation between student achievement and sense of school belonging. One potential explanation is that high-achieving students in Saudi Arabia may feel alienated within schools or not appreciated/respected by their peers, which is consistent with some existing literature (e.g., Jha & Pouezevara, 2016).

  11. Male teachers are more likely than female teachers to report holding a master’s or doctorate-level qualification, as are principals of boys’ schools compared to principals of girls’ schools.
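Note 5 describes the role of plausible values in producing appropriate standard errors. A common way to use them is to estimate a statistic once per plausible value and then pool the results with Rubin's combining rules, sketched below. This is a generic illustration with invented numbers and a hypothetical function name; it is not necessarily the exact procedure implemented in the software used for this study.

```python
from math import sqrt
from statistics import mean, variance


def pool_plausible_values(estimates, sampling_variances):
    """Combine one statistic estimated separately on each plausible value.

    estimates: the statistic computed once per plausible value.
    sampling_variances: the squared standard error from each of those runs.
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(sampling_variances)   # average sampling variance
    between = variance(estimates)       # variance across plausible values (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical results from five plausible values, for illustration only.
pv_estimates = [10.0, 12.0, 11.0, 9.0, 13.0]
pv_sampling_variances = [4.0, 4.0, 4.0, 4.0, 4.0]
pooled_estimate, pooled_se = pool_plausible_values(pv_estimates, pv_sampling_variances)
print(pooled_estimate, round(pooled_se, 3))
```

The between-value term inflates the standard error to reflect measurement uncertainty. That uncertainty would be ignored if a single plausible value, or the average of the five, were treated as an observed score.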

Abbreviations

CSEC: Caribbean Secondary Education Certification

ETEC: Education and Training Evaluation Commission

GDP: Gross Domestic Product

GER: Gross Enrollment Ratio

IEA: International Association for the Evaluation of Educational Achievement

IDB: International Database Analyzer

MENA: Middle East and North Africa

NALO: National Assessment of Learning Outcomes

OECD: Organisation for Economic Co-operation and Development

PIRLS: Progress in International Reading Literacy Study

PISA: Programme for International Student Assessment

SPSS: Statistical Package for the Social Sciences

TALIS: Teaching and Learning International Survey

TIMSS: Trends in International Mathematics and Science Study

References

  • Angrist, N., Djankov, S., Goldberg, P. K., & Patrinos, H. A. (2021). Measuring human capital using global learning data. Nature, 592, 403–408. https://doi.org/10.1038/s41586-021-03323-7

  • Autor, D., Figlio, D., Karbownik, K., Roth, J., & Wasserman, M. (2016). School quality and the gender gap in educational achievement. American Economic Review, 106(5), 289–295. https://doi.org/10.1257/aer.p20161074

  • Autor, D., Figlio, D., Karbownik, K., Roth, J., & Wasserman, M. (2019). Family disadvantage and the gender gap in behavioral and educational outcomes. American Economic Journal: Applied Economics, 11(3), 338–381. https://doi.org/10.1257/app.20170571

  • Bertrand, M., & Pan, J. (2013). The trouble with boys: Social influences and the gender gap in disruptive behavior. American Economic Journal: Applied Economics, 5(1), 32–64. https://doi.org/10.1257/app.5.1.32

  • Booth, A. L., & Nolen, P. (2012). Gender differences in risk behaviour: Does nurture matter? The Economic Journal, 122(558), F56–F78. https://www.jstor.org/stable/41418970

  • Brozo, W. G., Sulkunen, S., Shiel, G., Garbe, C., Pandian, A., & Valtin, R. (2014). Reading, gender, and engagement. Journal of Adolescent & Adult Literacy, 57(7), 584–593. https://doi.org/10.1002/jaal.291

  • Buchmann, C., & DiPrete, T. A. (2006). The growing female advantage in college completion: The role of family background and academic achievement. American Sociological Review, 71(4), 515–541. https://doi.org/10.1177/000312240607100401

  • Carroll, D., Parasnis, J., & Tani, M. (2021). Why do women become teachers while men don’t? The B.E. Journal of Economic Analysis & Policy, 21(2), 793–823. https://doi.org/10.1515/bejeap-2020-0236

  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.

  • Corcoran, S. P., Evans, W. N., & Schwab, R. M. (2004). Women, the labor market, and the declining relative quality of teachers. Journal of Policy Analysis and Management, 23(3), 449–470. https://www.jstor.org/stable/3326261

  • Cornwell, C., Mustard, D. B., & Van Parys, J. (2013). Noncognitive skills and the gender disparities in test scores and teacher assessments: Evidence from primary school. Journal of Human Resources, 48(1), 236–264. https://doi.org/10.3368/jhr.48.1.236

  • Cortes, P., & Pan, J. (2018). Occupation and gender. In S. L. Averett, L. M. Argys, & S. D. Hoffman (Eds.), The Oxford Handbook of Women and the Economy (pp. 424–452). Oxford University Press. https://doi.org/10.1093/oxfordhb/9780190628963.013.12

  • DiPrete, T. A., & Buchmann, C. (2013). The rise of women: The growing gender gap in education and what it means for American schools. Russell Sage Foundation.

  • DiPrete, T. A., & Jennings, J. L. (2012). Social and behavioral skills and the gender gap in early educational achievement. Social Science Research, 41(1), 1–15. https://doi.org/10.1016/j.ssresearch.2011.09.001

  • Downey, D. B., & Vogt Yuan, A. S. (2005). Sex differences in school performance during high school: Puzzling patterns and possible explanations. The Sociological Quarterly, 46(2), 299–321. https://doi.org/10.1111/j.1533-8525.2005.00014.x

  • Eisenkopf, G., Hessami, Z., Fischbacher, U., & Ursprung, H. W. (2015). Academic performance and single-sex schooling: Evidence from a natural experiment in Switzerland. Journal of Economic Behavior & Organization, 115, 123–143. https://doi.org/10.1016/j.jebo.2014.08.004

  • Fortin, N. M., Oreopoulos, P., & Phipps, S. (2015). Leaving boys behind—Gender disparities in high academic achievement. Journal of Human Resources, 50(3), 549–579. https://doi.org/10.3368/jhr.50.3.549

  • Fritz, C. O., Morris, P. E., & Richler, J. J. (2012). Effect size estimates: Current use, calculations, and interpretation. Journal of Experimental Psychology: General, 141(1), 2–18. https://doi.org/10.1037/a0024338

  • Gregory, L., Taha-Thomure, H., Kazem, A., Boni, A., Elsayed, M. A. A., & Taibah, N. (2021). Advancing Arabic Language teaching and learning: A path to reducing learning poverty in the Middle East and North Africa. World Bank. https://openknowledge.worldbank.org/handle/10986/35917

  • Harris, D. N., & Sass, T. R. (2011). Teacher training, teacher quality and student achievement. Journal of Public Economics, 95(7–8), 798–812. https://doi.org/10.1016/j.jpubeco.2010.11.009

  • Hattie, J. A. C. (2009). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. Routledge.

  • IBM Corporation. (2020). IBM SPSS Statistics for Windows (No. 27). IBM Corporation.

  • IEA. (2021). Help manual for the IEA IDB Analyzer (Version 4.0). https://www.iea.nl

  • Jackson, C. K. (2012). Single-sex schools, student achievement, and course selection: Evidence from rule-based student assignments in Trinidad and Tobago. Journal of Public Economics, 96(1–2), 173–187. https://doi.org/10.1016/j.jpubeco.2011.09.002

  • Jha, J., & Pouezevara, S. (2016). Measurement and research support to education - Strategy goal I - Boys’ underachievement in education: A review of the literature with a focus on reading in the early years. United States Agency for International Development (USAID).

  • Knoster, K. C. (2016). Strategies for addressing student and teacher absenteeism: A literature review. U.S. Department of Education, North Central Comprehensive Center.

  • LaRoche, S., Joncas, M., & Foy, P. (2020). Sample design in TIMSS 2019. In M. O. Martin, M. von Davier, & I. V. S. Mullis (Eds.), Methods and procedures: TIMSS 2019 technical report (pp. 3.1–3.33). TIMSS & PIRLS International Study Center, Lynch School of Education and Human Development, Boston College, and International Association for the Evaluation of Educational Achievement (IEA). https://timssandpirls.bc.edu/timss2019/methods/chapter-3.html

  • Legewie, J., & DiPrete, T. A. (2012). School context and the gender gap in educational achievement. American Sociological Review, 77(3), 463–485. https://doi.org/10.1177/0003122412440802

  • Mullis, I. V. S., Martin, M. O., Foy, P., & Drucker, K. T. (2012). PIRLS 2011 international results in reading. TIMSS & PIRLS International Study Center, Lynch School of Education, Boston College, and International Association for the Evaluation of Educational Achievement (IEA).

  • Mullis, I. V. S., Martin, M. O., & Foy, P. (2013). The impact of reading ability on TIMSS mathematics and science achievement at the fourth grade: An analysis by item reading demands. In M. O. Martin & I. V. S. Mullis (Eds.), TIMSS and PIRLS 2011: Relationships among reading, mathematics, and science achievement at the fourth grade —Implications for early learning (pp. 67–108). TIMSS & PIRLS International Study Center, Lynch School of Education, Boston College, and International Association for the Evaluation of Educational Achievement (IEA). https://timssandpirls.bc.edu/timsspirls2011/downloads/TP11_Chapter_2.pdf

  • Mullis, I. V. S., Martin, M. O., Foy, P., & Hooper, M. (2017). PIRLS 2016 international results in reading. TIMSS & PIRLS International Study Center, Lynch School of Education, Boston College, and International Association for the Evaluation of Educational Achievement (IEA).

  • Mullis, I. V. S., Martin, M. O., Foy, P., Kelly, D. L., & Fishbein, B. (2020). TIMSS 2019 international results in mathematics and science. TIMSS & PIRLS International Study Center, Lynch School of Education and Human Development, Boston College, and International Association for the Evaluation of Educational Achievement (IEA).

  • Muthén, L. K., & Muthén, B. O. (1998–2017). Mplus user’s guide (8th ed.). Muthén & Muthén.

  • Niemiec, C. P., & Ryan, R. M. (2009). Autonomy, competence, and relatedness in the classroom: Applying self-determination theory to educational practice. Theory and Research in Education, 7(2), 133–144. https://doi.org/10.1177/1477878509104318

  • Nilsen, T., Blömeke, S., Hansen, K. Y., & Gustafsson, J.-E. (2016). Are school characteristics related to equity? The answer may depend on a country’s developmental level (Policy Brief No. 10). International Association for the Evaluation of Educational Achievement (IEA).

  • OECD. (2015). The ABC of gender equality in education: Aptitude, behaviour, confidence. PISA, OECD Publishing. https://doi.org/10.1787/9789264229945-en

  • OECD. (2019). PISA 2018 results (volume I): What students know and can do. PISA, OECD Publishing. https://doi.org/10.1787/5f07c754-en

  • OECD. (2020). Education in Saudi Arabia (Reviews of National Policies for Education). OECD Publishing. https://doi.org/10.1787/76df15a2-en

  • OECD. (2021). Positive, high-achieving students? What schools and teachers can do. TALIS, OECD Publishing. https://doi.org/10.1787/3b9551db-en

  • Page, E., & Jha, J. (Eds.). (2009). Exploring the bias: Gender and stereotyping in secondary schools. Commonwealth Secretariat. https://doi.org/10.14217/9781848590427-en

  • Pahlke, E., Hyde, J. S., & Allison, C. M. (2014). The effects of single-sex compared with coeducational schooling on students’ performance and attitudes: A meta-analysis. Psychological Bulletin, 140(4), 1042–1072. https://doi.org/10.1037/a0035740

  • Pahlke, E., Hyde, J. S., & Mertz, J. E. (2013). The effects of single-sex compared with coeducational schooling on mathematics and science achievement: Data from Korea. Journal of Educational Psychology, 105(2), 444–452. https://doi.org/10.1037/a0031857

  • Patrinos, H. A., & Angrist, N. (2019). Harmonized learning outcomes: Transforming learning assessment data into national education policy reforms. World Bank Blogs. https://blogs.worldbank.org/opendata/harmonized-learning-outcomes-transforming-learning-assessment-data-national-education

  • Reynolds, D., Sammons, P., De Fraine, B., Van Damme, J., Townsend, T., Teddlie, C., & Stringfield, S. (2014). Educational effectiveness research (EER): A state-of-the-art review. School Effectiveness and School Improvement, 25(2), 197–230. https://doi.org/10.1080/09243453.2014.885450

  • Ridge, N., & Jeon, S. (2020). Father involvement and education in the Middle East: Geography, gender, and generations. Comparative Education Review, 64(4), 725–748. https://doi.org/10.1086/710768

  • Rogers, H. F., & Vegas, E. (2009). No more cutting class? Reducing teacher absence and providing incentives for performance. World Bank Policy Research Working Paper No. 4847. https://doi.org/10.1596/1813-9450-4847

  • Rutkowski, L., Gonzalez, E., Joncas, M., & von Davier, M. (2010). International large-scale assessment data: Issues in secondary analysis and reporting. Educational Researcher, 39(2), 142–151. https://doi.org/10.3102/0013189X10363170

  • Stromquist, N. (2007). The gender socialization process in schools: A cross-national comparison (2008/ED/EFA/MRT/PI/71). Background paper for the Education for All Global Monitoring Report 2008, Education for All by 2015: Will we make it?

  • Teixeira, P. J., Marques, M. M., Silva, M. N., Brunet, J., Duda, J. L., Haerens, L., La Guardia, J., Lindwall, M., Lonsdale, C., Markland, D., Michie, S., Moller, A. C., Ntoumanis, N., Patrick, H., Reeve, J., Ryan, R. M., Sebire, S. J., Standage, M., Vansteenkiste, M., Weinstein, N., Weman-Josefsson, K., Williams, G. C., & Hagger, M. S. (2020). A classification of motivation and behavior change techniques used in self-determination theory-based interventions in health contexts. Motivation Science, 6(4), 438–455. https://doi.org/10.1037/mot0000172

  • von Davier, M., Gonzalez, E., & Mislevy, R. J. (2009). What are plausible values and why are they useful? IERI Monograph Series: Issues and Methodologies in Large-Scale Assessments, 2, 9–36.

  • World Bank. (2012). World development report 2012: Gender equality and development. World Bank. https://openknowledge.worldbank.org/handle/10986/4391

  • Younger, M., & Cobbett, M. (2014). Gendered perceptions of schooling: Classroom dynamics and inequalities within four Caribbean secondary schools. Educational Review, 66(1), 1–21. https://doi.org/10.1080/00131911.2012.749218

Acknowledgements

The authors would like to thank Harry Patrinos, Georgios Sideridis, and Tarek Mostafa for their valuable comments on an earlier version of this paper. The team is also grateful to Laura Gregory, Andreas Blom, Yisgedullish Amde, and Jee Yoon Lee for their helpful feedback and support.

Funding

This research has been conducted as part of the Technical Cooperation Program between the World Bank and the Education and Training Evaluation Commission (ETEC), funded by the Ministry of Finance of the Kingdom of Saudi Arabia under the Reimbursable Advisory Services (RAS) framework.

Author information

Contributions

All authors contributed to the design of the study. AC and VP analyzed and interpreted the data and together with ME, who managed the study, wrote all sections of the paper. NA and KA provided access to the national data, shared information about the national context as well as feedback in writing and in group discussions throughout the study, and contributed to the drafting of the paper. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Mahmoud A. A. Elsayed.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

  • Table S1. Predictor variables from the TIMSS and NALO databases.
  • Table S2. TIMSS cut-scores for categories of continuous indices, TIMSS 2019.
  • Table S3. Equations for each step of the hierarchical two-level linear regression models.
  • Table S4. Hierarchical two-level linear regression model for mathematics achievement, grade 4, TIMSS 2019.
  • Table S5. Hierarchical two-level linear regression model for science achievement, grade 4, TIMSS 2019.
  • Table S6. Hierarchical two-level regression model for mathematics achievement, grade 8, TIMSS 2019.
  • Table S7. Hierarchical two-level regression model for science achievement, grade 8, TIMSS 2019.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Elsayed, M.A.A., Clerkin, A., Pitsia, V. et al. Boys’ underachievement in mathematics and science: An analysis of national and international assessment data from the Kingdom of Saudi Arabia. Large-scale Assess Educ 10, 23 (2022). https://doi.org/10.1186/s40536-022-00141-9

  • DOI: https://doi.org/10.1186/s40536-022-00141-9

Keywords

  • Gender
  • Boys’ underachievement
  • Mathematics
  • Science
  • TIMSS
  • NALO
  • Hierarchical two-level linear regression modelling
  • Saudi Arabia