Classroom, media and church: explaining the achievement differences in civic knowledge in the bilingual school system of Estonia

This study investigates civic and citizenship education in a unique post-Communist context: the bilingual education system of Estonia. Estonia continues to have a bilingual school system in which Estonian and Russian language schools operate in parallel. While Estonian language school students rank very high in international comparisons, there is a significant difference between the achievement of Estonian and Russian language school students. We claim that this minority achievement gap in civic and citizenship knowledge is explained, in addition to family background characteristics, by behavioral and attitudinal factors that are moderated by the school language. The behavioral and attitudinal independent variables that we consider relevant in our analysis are classroom climate, trust in various media channels, and students’ beliefs in the influence of religion. We rely on hierarchical modeling to capture the embedded data and aim to explain how the different layers (school and student level) interact and impact civic knowledge. We show that an open classroom is beneficial to students and that part of the gap can be explained by Russian school students’ lower involvement in such practices. The strength of the belief in the influence of religion, on the contrary, hurts students: although the negative effect is smaller for minority students, its aggregate negative effect is higher, and it therefore also contributes to the minority achievement gap. Media trust indicators explain the gap only marginally; high trust in social media hurts students’ civic knowledge scores, and Russian school students trust social media more than Estonian school students do.

focusing on contextualised notions of citizenship studies and civic and citizenship education in a unique post-Communist context, often under-represented in related scholarship. Exploring whether students from different ethnic groups in a linguistically divided school system experience the potential influencers of civic knowledge differently, and how such differences relate to civic knowledge outcomes, would be an enhancement in this line. Second, measures of student religiosity and their associations with civic and citizenship education have remained mostly unexplored (ibid. 22).
This study addresses the research gap described above by investigating civic and citizenship education in a unique post-Communist context, the bilingual education system of Estonia. We aim to show how, in addition to student background characteristics, differences in classroom climate, trust in media and belief in the influence of religion interplay with a school-level characteristic, the language of the school, and how this interplay is associated with the minority achievement gap in civic and citizenship knowledge (CCK). Estonia became host to a sizeable Russian minority after World War II. Since then, a large proportion of the population (1/3) has been Russian speaking, and schools with Estonian or Russian as their respective languages of instruction still coexist in the contemporary education system of Estonia. There have been attempts to integrate these two educational streams (e.g., a regulation requiring 60 per cent Estonian-language use at the gymnasium level in Russian language schools since 2011, and various language immersion programs). However, the differences between the performance of children studying in schools offering Estonian-language instruction and those offering Russian-language instruction (hereinafter Estonian and Russian schools) are significant. Estonian school students, for instance, are among the top performers in international comparisons (e.g., the OECD's Programme for International Student Assessment (PISA, Schleicher 2019) and the International Civic and Citizenship Education Study (ICCS, Schulz et al. 2018a)). However, a significant gap between the achievement scores of Estonian and Russian schools exists (hereinafter minority achievement gap).
While lower achievement scores of immigrant background students are often associated with the additional difficulties of coping with the language, in Estonia (as in many other post-Soviet countries) the situation is different: the population in question consists of so-called historical immigrants; students, and often their parents, were born in their country of residence and study in their mother tongue. Besides, civic and citizenship education is provided as one of the general competences and as a separate subject in both Estonian and Russian schools, and there are differences neither in the curriculum nor in the governance of these schools.
This dichotomy poses the question of why students in schools with different languages of instruction score so differently despite a uniform national curriculum? We aim to show that after controlling for individual-level background characteristics the minority achievement gap in CCK in Estonia is, to a large extent, moderated by cultural-ideological factors such as perceived school climate, trust in different media channels, and the role of religiosity in everyday decisions that, first, have remarkable between-group differences in Russian and Estonian language schools; second, these cultural-ideological effects have specificities stemming from the school language.
Our data originate from the second cycle of the International Civic and Citizenship Education Study (ICCS), conducted by the International Association for the Evaluation of Educational Achievement (IEA) (Schulz et al. 2018a). Estonia was one of 24 participating countries. The data structure contains student-level data (the CCK test and a student questionnaire assessing background, attitudes and involvement); at the school level we use school language to distinguish between Estonian and Russian schools. In Estonia, 3,255 students participated in the second cycle of ICCS, from both Estonian and Russian language schools (2,391 and 864, respectively). As our empirical technique, we employ hierarchical modelling, which handles embedded data structures such as educational data well. However, modelling specificities related to missing data, scales of measurement and robustness allow multiple approaches, which we discuss in the paper.
To accomplish our aim, we start with a short case description and pose literature-informed hypotheses, illustrated by stylised facts from exploratory data analysis. Then we describe the data, impute missing data and discuss the statistical tools for the confirmatory analysis. After that, we continue with hierarchical modelling with school random effects and end with a discussion.

Case specificities, exploratory analysis, and hypotheses
The education system in Estonia is characterised by comprehensive schooling; most schools are owned and operated by municipalities. Besides, the education system is single track, i.e., children often stay in the same school for 12 years, schools follow the same curriculum, and transfer of students between schools is neither common nor encouraged. Even in the Soviet period, Estonian schools were able to keep Estonian as the language of instruction. Russian schools were introduced to serve the educational needs of first-generation immigrants. Following the restoration of independence, the language-based segregation of the school system remained, and schools with Estonian or Russian as their respective languages of instruction still coexist in today's education system. The Estonian government's approach to Russian schools has been characterised as laissez-faire (Khavenson and Carnoy 2016), reflecting the degree to which Russian schools continued teaching mainly, if not entirely, in Russian. Also, reform initiatives to increase the share of Estonian language classes in Russian schools remained slow. Not until 2007 did a secondary-school language ruling require 60 per cent of the curriculum of Russian upper secondary schools to be taught in Estonian by 2011 (Siiner 2014). The reform activity was arguably accelerated by the increased attention brought by the PISA tests (Khavenson and Carnoy 2016), which in 2006 revealed Estonian schools' high scores but also a considerable minority achievement gap between Estonian and Russian schools. The 60-40 requirement was implemented with a certain degree of flexibility and school by school, avoiding the kind of resistance met by Latvia's similar 2004 law (Siiner 2014). The issue of Russian school progress has remained salient in parliamentary elections since 2015, and all mainstream parties have promised to improve the situation of Russian schools.
However, despite the recent political emphasis, the incumbent government has not yet agreed on a concrete plan, and the minority achievement gap in Estonia revealed by several international comparisons remains worrisome.
The usual suspects in explaining differences in educational outcomes are differences in students' socio-economic characteristics such as immigrant status, parental education, parental occupation, and the number of books at home. Students in Russian schools in Estonia do originate more frequently from immigrant families, and they have fewer books in their homes (Põder et al. 2017). However, the share of the whole gap that these differences can explain is small (ibid.). Furthermore, as pointed out above, in Estonian schools the language spoken at home and in school by students with a migrant background generally differs, whereas in schools in which the language of instruction is Russian, immigrants primarily learn in their mother tongue. Thus, the language barrier that might explain the gap associated with immigrant status is not a relevant explanation for Russian schools in Estonia. Also, Lindemann (2013) found certain differences in the professional positions of the parents of students in Estonian and Russian schools, and Mikk et al. (2012) found variation in the teaching methods at schools and the learning skills of students. However, again, these explain only a very small part of the differences in academic results. Therefore, while admitting the relevance of parental background in explaining the minority achievement gap in Estonia, we rely on behavioural and attitudinal explanations in posing our hypotheses.
In exploring literature and our data, we illuminate specific patterns in the data that allow us to pose three hypotheses that might explain the achievement gap in CCK of Estonian and Russian school students.

Hypothesis 1: Classroom climate differs
Previous research on classroom climate has revealed a positive association between civic knowledge and openness to discussion and plurality of opinions (Campbell 2008; Knowles and McCafferty-Wright 2015). The assumption behind the association between political socialisation and participation is that a good citizen does not necessarily appear spontaneously; rather, early socialisation within the family, and particularly at school, should aim to provide the tools that allow future citizens to become successfully involved in political life and to show good civic knowledge (Castillo et al. 2015). A more recent line of research in this field has highlighted not only 'what' is transmitted by schools (knowledge) but also 'how' it is transmitted, arguing that exposure to a democratic environment in the school can be as significant as civic knowledge, or even more so (ibid.). This focus has emphasised the role of the classroom climate as an environment that could have an impact on students' political outcomes and knowledge (Alivernini and Manganelli 2011; Quintelier 2010). While the strength of an open classroom climate in supporting the civic development of youth has been documented in dozens of studies spanning decades (see overview in Knowles et al. 2018), many schools are unable to provide such spaces for all students (Castro and Knowles 2017; Levinson 2010). While many factors contribute to this deficiency, the political and ideological nature of the schooling experience likely plays a role.
While schools in Estonia are considered amongst the most autonomous, there is still limited horizontal diversity; schools rely on the same central curriculum and do not differ much in their teaching methods (OECD 2016). Whereas the rhetoric of curriculum reforms has been de-centralist, it has not always meant a real shift in power or in the professional autonomy of teachers (Erss 2015). Recent reform agendas have tried to minimise the standardisation of the education system and to emphasise a child-centric, renewed and more active approach to teaching. However, the level and speed of these initiatives often depend on the motivation and the resources of the school heads, and there are some concerns (Klaas-Lang et al. 2014; Loogma et al. 2009) that Russian schools have had problems revising their educational practices due to language deficiencies and inadequate preparation of teachers for bilingual classrooms. Furthermore, Kiilo and Kutsar (2012) have shown that state language policy has had unintended consequences in the form of a feeling of powerlessness among Russian-speaking educators, which in turn restricts a balanced approach towards legitimization of the state language and causes deficiencies in self-efficacy and disempowerment. Thus, we may speculate that Russian schools have been less prone to innovate and have remained more traditional in their approaches, which also has an impact on classroom climate. Therefore, in posing our hypothesis based on the students' assessment of an open classroom climate, we treat it as a proxy for school developments in terms of the changing paradigm of education.
H1: Lack of school resources and low self-efficacy to adapt to a new educational paradigm have resulted in a more traditional and less open classroom climate in Russian schools, which has negative consequences for their students' CCK scores and increases the minority achievement gap.
We have one composite indicator from the ICCS international database to measure classroom climate: OPDISC, which indicates students' perception of openness in classroom discussions. Higher values indicate perceptions of more openness. OPDISC is derived through the scaling of items (currently six categorical variables) by using item response modelling (Schulz et al. 2018a). We use this index in a standardised format. The index reflects answers to questions such as whether teachers encourage students to make up their own minds and express their opinions (even if they differ from the class opinion), but also whether students bring up current political events for discussion in class and whether teachers present several sides of the issues when explaining them in class. Exploratory insights from ICCS data show that for Russian-language schools the group mean is lower than zero (−0.126) and the standard deviation is higher than one (1.041), indicating that Russian school students have relatively lower values for OPDISC and more variance than Estonian school students (mean = 0.047 and standard deviation = 0.981). The overall statistics of the OPDISC variable are given in Table 1. Figure 1 shows the mean scores by group (vertical lines) and the distribution of the data. We see a group difference, as Estonian school students have relatively more observations in the upper right part of Fig. 1.
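The standardisation and group comparison described above can be illustrated with a minimal Python sketch. The raw scores and group labels below are hypothetical, chosen only for illustration; the sketch also ignores survey weights, which the actual analysis uses.

```python
from statistics import mean, pstdev

def standardise(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    m = mean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]

def group_summary(scores, groups):
    """Return {group: (mean, standard deviation)} of the scores per group."""
    summary = {}
    for g in set(groups):
        members = [s for s, label in zip(scores, groups) if label == g]
        summary[g] = (mean(members), pstdev(members))
    return summary

# Hypothetical raw scale scores (not ICCS data) and school-language labels.
raw = [38.0, 45.0, 52.0, 61.0, 47.0, 55.0, 58.0, 64.0]
lang = ["rus", "rus", "rus", "rus", "est", "est", "est", "est"]

z = standardise(raw)              # pooled mean 0, standard deviation 1
summary = group_summary(z, lang)  # group means fall on either side of zero
```

After standardising on the pooled sample, a group mean below zero and a group standard deviation above one are read exactly as in the text: lower typical values and more spread than the pooled sample.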

Hypothesis 2: Trust in media has non-linear effects
Previous research has shown that civic engagement and knowledge are strongly associated not only with individual-level characteristics such as socio-demographic variables but also with media use habits (Quintelier and Hooghe 2011; Romer et al. 2009; Pasek et al. 2006). More recently, social media has received attention and interest as people are making active use of the Internet in ways that may or may not contribute to civic engagement (Martens and Hobbs 2015). Whereas some scholars believe that civic engagement arises naturally from digital media use (Kahne et al. 2014), others believe that media literacy education is needed to provide the cognitive and social scaffolding that systematically supports civic engagement (Martens and Hobbs 2015). Part of this literature agrees that media usage and trust in media are important predictors of civic knowledge and engagement, but, first, the effects of traditional media and social media differ in that regard, and, second, the effect of the usage of and trust in social media might be adverse. For instance, there is some evidence (Koršňáková and Carstens 2017) that among active social media users there is a group of students who are not very knowledgeable concerning the key dimensions of the ICCS framework (civic society and systems, principles, participation, and identities) and often not very tolerant, raising concerns about 'disengaged' or even 'alienated' youth. Furthermore, Fennema and Tillie (1999) argue that ethnic minorities might have different 'ethnic' social capital. They also enjoy watching their cultural television stations in their languages, and this may influence young people's ethnic identity, feeling of belonging and internal group cohesion. However, ethnic television-watching can work both ways. On the one hand, as members of a minority group gradually perceive their ethnicity as more salient, one might argue that their political participation will increase in favour of their ethnic group (Miller et al. 1981).
On the other hand, the counter-argument could be that ethnic television creates a closed community attitude, with less political participation. Thus, both the non-linearity and the type of media consumed seem to be key in understanding the associations between trust in media and CCK. We see the question of usage of and trust in media as relevant in analysing the potential explanation behind the minority achievement gap in CCK in our case. In Estonia, the media enjoy a high degree of freedom. For quite a long time, Russian speakers have been portrayed as a quite monolithic and rather inactive part of Estonia's society. However, recent studies (Integration Monitoring … 2017; Tammaru et al. 2017) have revealed that although clustered according to language, this group is in fact very heterogeneous in terms of its ethno-cultural background, citizenship, political activity and preferences, educational and socio-economic parameters, proficiency in the Estonian language, and media consumption. Furthermore, quite a large share of Russian speakers rely on Russian sources in their media consumption, which carries potential risks of biased news or disinformation (CIMU-SEE, 2018). In general, three main media segments can be distinguished in Estonia: national Estonian-language media, local Russian-language media and foreign media (including Russian-language media from Russia). Furthermore, it has been demonstrated that social media networks are a more important source of information for young Russian speakers than for young Estonian speakers (Integration Monitoring … 2017). Given the distinctiveness in media consumption between the Russian-language and Estonian-language cohorts in terms of the type and intensity of usage, we assume trust in media to be one of the explanatory factors behind the gap in civic and citizenship education. However, we admit the potential problem of reverse causality here, i.e. students with a better understanding of how society works trust (social) media less than others. We take this concern into account in interpretation.

H2.1 and H2.2:
The association between trust in media and CCK scores has an inverted U-pattern: those who are the least and the most trustful show the lowest results, while those with modest trust have the highest results across language groups. While Estonian school students are relatively more trusting of traditional media ("completely" 7% vs 4%, "quite a lot" 44% vs 28%), they trust social media less ("not at all" 13% vs 11%, "a little" 57% vs 52%). H2.1 poses the hypothesis for the traditional media of television, newspapers and radio. H2.2 poses the same hypothesis for social media platforms, i.e. Twitter, blogs, and YouTube.
The trust in media variable is based on the question "How much do you trust each of the following groups, institutions or sources of information?". We use question IS3G26G to operationalise trust in 'traditional' media (TRUST_C) and question IS3G26H, which asks how much students trust social media (Twitter, blogs, YouTube), to operationalise trust in social media (TRUST_SM). ICCS data reveal that Russian school students have much lower trust in traditional media (12% do not trust it at all, compared with 6% of students in Estonian schools), while there is no significant difference in trust in social media (11% vs 13% do not trust social media at all). As Table 1 indicates, these are four-category variables; the ordered categories are "completely", "quite a lot", "a little" and "not at all". Figure 2 reveals considerable achievement differences between Russian and Estonian school students with regard to trust in media: among Estonian school students the middle categories perform considerably better and with a small variance, whereas there are almost no such effects among Russian school students. Also, as we see in Fig. 3, there is a clear difference in CCK between Estonian school students who have complete trust and those in all lower trust categories. In the case of Russian school students, there are no significantly different CCK effects across the categories of trust in social media. Thus, the expected non-linearity between trust in media and CCK seems to be present only among Estonian students, a phenomenon potentially explained in part by Russian speakers' distinctive media sources, some of which are not necessarily beneficial for civic knowledge and engagement. We hope to cast more light on this in the analysis section.
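A simple exploratory check of the inverted U-pattern is to compare mean CCK scores across the ordered trust categories. The following Python sketch uses hypothetical scores and a hypothetical helper, `is_inverted_u`; it is a descriptive check only, not the hierarchical model estimated later.

```python
from collections import defaultdict
from statistics import mean

# Ordered trust categories, from lowest to highest trust.
TRUST_ORDER = ["not at all", "a little", "quite a lot", "completely"]

def mean_by_category(scores, categories):
    """Mean score per trust category, returned in TRUST_ORDER."""
    buckets = defaultdict(list)
    for score, category in zip(scores, categories):
        buckets[category].append(score)
    return [mean(buckets[c]) for c in TRUST_ORDER]

def is_inverted_u(means):
    """True if both end categories score below the best middle category."""
    peak = max(means[1:-1])
    return means[0] < peak and means[-1] < peak

# Hypothetical CCK scores (not ICCS data), two students per category.
scores = [480, 500, 560, 570, 550, 540, 470, 490]
trust = ["not at all", "not at all", "a little", "a little",
         "quite a lot", "quite a lot", "completely", "completely"]

means = mean_by_category(scores, trust)
pattern = is_inverted_u(means)
```

Running the same check separately for each school language would mirror the group comparison drawn from Figs. 2 and 3.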

Hypothesis 3: Context-dependent effects of influence of religion
Recent scholarship has reinvigorated research on religion's influence on political life (Van Heuvelen 2014). More broadly, results suggest that religion plays an important role in structuring a citizenry's view of the moral economy, and thus has real consequences for political outcomes (Edgell 2012; Svallfors 2012). However, there are contradictory views on both the direction of the relationship and the causal mechanism. Furthermore, there is evidence that the relationship between religiosity and civic life might be moderated by context, and religion tends to empower (have a positive impact) only in settings where religiosity is a prevalent practice. There are studies (Sobolewska et al. 2015; Djupe and Gilbert 2006) that show that religious attendance has a strong positive effect on non-electoral participation in the case of those religions (Islam and Sikhism) that are politically salient, but not in the case of Hinduism, which is not politicised. Christianity presents a more complex case, where regular attendance bolsters electoral, but not non-electoral, participation.
In our analysis, we focus on the importance of religiosity in guiding decisions in civic life instead of the effects of different religious practices. First, Estonia has moderately low diversity in terms of religious practices followed; second, there are considerable differences in the share of believers between the Estonian- and Russian-speaking communities in Estonia. While Estonians are one of the least religious nations in Europe (Eichhorn 2012; Inglehart et al. 2014), almost half of Russian speakers claim to practise a religion regularly, mainly Orthodox Christianity, making the latter the most-practised religion in Estonia (Statistics in Estonia, 2013). In Estonia, 66% of Estonians and 30% of Russians do not follow any religion (ibid.). Thus, assuming the context dependency of the relationship between religiosity and civic life (religiosity restricts civic life in contexts where religiosity is rare and empowers where religiosity is prevalent), and considering the differences in religiosity between the Russian and Estonian groups, we assume the minority achievement gap to be diminished by religiosity.
H3: We hypothesise religiosity to be negatively associated with CCK in Estonian schools and positively associated with CCK in Russian schools.
We use the independent variable RELINF to capture the impact of religion on CCK. RELINF indicates students' endorsement of the influence of religiosity in society. Higher values indicate more positive attitudes toward the influence of religion.
RELINF is a composite of six categorical variables. These six variables are statements with which students can agree on a scale of four categories ("strongly agree" to "strongly disagree"). The statements are the following: (a) religion is more important to me than what is happening in national politics; (b) religion helps me to decide what is right and what is wrong; (c) religious leaders should have more power in society; (d) religion should influence people's behaviour towards others; (e) rules of life based on religion are more important than civil laws; (f) religious people are better citizens. So, a general composite score of RELINF is not a measure of religiosity as such; it is a proxy measuring students' attitudes toward the role of religion in society. Summary statistics given in Table 1 show that we use a standardised score of RELINF. Group summary statistics of RELINF indicate that the mean of the standardised composite score of Russian schools (0.207) is statistically significantly different from the Estonian school mean (−0.075). The standard deviations also differ; in Russian schools there is less variance (s.d. = 0.974) than in Estonian schools (s.d. = 1.038). Fig. 4 explores the correlations between civic knowledge and the influence of religion, and we see a negative correlation for both groups. However, in the case of Russian schools, the slope of a simple linear trend line is smaller than in the case of Estonian schools. Thus we see a larger negative association between religiosity and civic knowledge in Estonian schools, an exploratory finding which contradicts our theory-driven hypothesis (H3) but gives some evidence that trust in religion can be less harmful to CCK in settings where religiosity is a prevalent culture.
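The comparison of simple linear trend lines can be made concrete with an ordinary least squares slope computed separately per group. The Python sketch below uses hypothetical values (not ICCS data) and a hypothetical helper, `ols_slope`; it only illustrates what a "smaller slope" means here.

```python
from statistics import mean

def ols_slope(x, y):
    """Slope of the least-squares line of y on x: cov(x, y) / var(x)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical standardised RELINF values and CCK scores per group.
relinf = [-1.0, 0.0, 1.0, 2.0]
cck_est = [580.0, 550.0, 520.0, 490.0]  # steeper decline
cck_rus = [510.0, 500.0, 490.0, 480.0]  # flatter decline

slope_est = ols_slope(relinf, cck_est)  # -30 points per unit of RELINF
slope_rus = ols_slope(relinf, cck_rus)  # -10 points per unit of RELINF
```

Both slopes are negative, but the Russian-school slope is smaller in absolute value, which is the pattern the text reads from Fig. 4.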

Data, methods and statistical techniques
Our data originate from the IEA study ICCS, the cycle of 2016. To repeat, CCK scores between Estonian and Russian language schools differ by 57 points (more than 10% of the mean score). We aim to identify the factors associated with the differences in CCK scores between Estonian and Russian schools and to explain these differences. In general, our research design relies on hierarchical modelling to capture the layered data and aims to explain how the different layers (school and student level) interact and impact civic knowledge. We hypothesise that school language does not itself explain the minority achievement gap; instead, school language is a proxy for the 'social space' that moderates the effects of the independent student-level variables.
In general, student achievement is mostly explained by parental background characteristics, school and teacher effects, and peer effects. Regarding the school and classroom level, the literature has shown that citizenship competencies are not related to the subject matter, teaching time, the teachers' educational approach, tailoring of teaching to student needs, and other school-level variables (Dijkstra et al. 2015). Similarly, for most aspects of school quality, no significant effects were found in multilevel models (ibid.). Thus we rely mostly on student-level independent variables and allow the estimation technique to take care of school-level effects through random slopes and intercepts.
Our data are hierarchical, and we show the data structure in Fig. 5. In general, the data allow us to identify the school language (ITLANG), school-level variables and a multitude of individual-level variables. In setting up our model, we distinguish between the following categories of explanatory components (Fig. 5): the language of the school (school level); students' activities in school and outside school, including perceptions of classroom climate (student level); trust in various media channels (student level); and religiosity-related characteristics (student level). As mentioned, we apply hierarchical modelling, which allows us to handle unobservable school-level heterogeneity without intensive school-level controls. Our estimation strategy thus does not control directly for school-level factors; instead, school-level random intercepts are applied.
We can assume that our data require a random intercept and slope model, where the groups are schools. We support our assumptions by the rich data structures of ICCS data and by theoretical insights (e.g. Clarke et al. 2010) indicating that when the selection mechanism is fairly well understood and the researcher has access to rich data, the random effects model should be preferred. Random effects estimators of regression coefficients and shrinkage estimators of school effects are more statistically efficient than those for fixed effects. For embedded data structures, Barr et al. (2013) recommend specifying the maximal random effect structure, which means including random intercepts and random slopes for all within-subjects and within-items factors, as well as allowing correlations between within-unit random effects. These models are referred to as maximal LMEMs. Unfortunately, maximal LMEMs are prone to model non-convergence, which is the failure of the model estimation algorithm to reach a solution. In our case we were able to estimate random slopes and random intercepts by restricted maximum likelihood (REML) using the R package lme4 (Bates et al. 2015).
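To illustrate what a shrinkage estimator of a school effect does, the following Python sketch computes the textbook empirical Bayes estimate for a random intercept model without covariates. It assumes the variance components are already known; the function names and numbers are hypothetical, and the sketch is not a description of how lme4 estimates the model.

```python
def shrinkage_factor(var_between, var_within, n_students):
    """Reliability weight: var_between / (var_between + var_within / n)."""
    return var_between / (var_between + var_within / n_students)

def shrunken_school_effect(school_mean, grand_mean,
                           var_between, var_within, n_students):
    """Deviation of a school mean from the grand mean, shrunk towards zero."""
    weight = shrinkage_factor(var_between, var_within, n_students)
    return weight * (school_mean - grand_mean)

# Hypothetical variance components and school means (not ICCS estimates).
VAR_BETWEEN = 1000.0   # between-school variance of CCK scores
VAR_WITHIN = 7000.0    # within-school (student-level) variance

# The same raw deviation (+40 points) observed in a small and a large school.
small = shrunken_school_effect(560.0, 520.0, VAR_BETWEEN, VAR_WITHIN, 7)
large = shrunken_school_effect(560.0, 520.0, VAR_BETWEEN, VAR_WITHIN, 70)
```

The small school's deviation is halved because its mean rests on few students, while the large school keeps most of its raw deviation. Pooling information across schools in this way is what makes the estimates more efficient than fixed school effects.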
We use a stepwise modelling technique; however, in many steps we include several covariates that proxy the same phenomenon (e.g. background characteristics or student involvement factors). Our null model, which follows from standard assumptions about hierarchical data structures, is a two-level model with level 1: $y_{ij} = \pi_{0j} + e_{ij}$ and level 2: $\pi_{0j} = \beta_{00} + u_{0j}$. Random coefficients (level 2 predictors) are used to predict student performance in each school, while the intercept values for school performance are assumed to be a sample of the intercepts from a larger population of schools. In the specification we estimate, Y indicates CCK scores, L is the language of the school, F is the vector of family characteristics of the individual student, B is the vector of school and classroom quality indicators (measured at student level), and M, S and R are the trust in media, trust in social media, and influence of religion indicators, respectively. We also include the interactions between the language of the school L and the independent variables M, S and R to see how school language (identified as Estonian) moderates the effect of the student-level independent variables. In the ICCS dataset (Schulz et al. 2018a), there are categorical (ordered) variables and scaled composite variables. A scale is a special type of derived variable that assigns a score value to students based on their responses to the component variables. According to Schulz et al. (2018a), composite variables were typically calculated as weighted likelihood estimates, with scores scaled to a mean of 50 and a standard deviation of 10 for equally weighted countries. We turn all scaled variables into standardised variables with a country mean of zero and a standard deviation of one. The transformed data (see Table 1) contain 2,857 observations; we assume that all missing values are randomly allocated and thus apply imputation of missing values (see Appendix 2).
Imputations are important to make all our models comparable, since we use maximum likelihood estimation to fit the models. We have 2,857 observations from 164 schools, and 75% of the observations originate from Estonian language schools. Table 1 reports the descriptive statistics of the imputed dataset (descriptives of the raw ICCS data are given in Appendix 1). Many of our variables have self-explanatory abbreviations. Among the student background characteristics, HISEI is a composite indicator of the highest parental occupational status, and HISCED is the categorical highest parental educational level. The latter uses standard scales from "did not complete ISCED level 2" to "ISCED level 6, 7 or 8".
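One way to write down the full specification described verbally above is sketched below. It is a reconstruction under assumptions, not the authors' own equation: the coefficient names are illustrative, M and S are treated as vectors of category indicators, and only the school random intercept is shown (any random slopes are omitted).

```latex
% Sketch only: coefficient symbols are assumptions, not taken from the paper.
\begin{align}
Y_{ij} &= \pi_{0j} + \beta_{1} L_{j}
        + \boldsymbol{\beta}_{2}^{\top}\mathbf{F}_{ij}
        + \boldsymbol{\beta}_{3}^{\top}\mathbf{B}_{ij}
        + \boldsymbol{\beta}_{4}^{\top}\mathbf{M}_{ij}
        + \boldsymbol{\beta}_{5}^{\top}\mathbf{S}_{ij}
        + \beta_{6} R_{ij} \notag \\
       &\quad + L_{j}\left(\boldsymbol{\gamma}_{1}^{\top}\mathbf{M}_{ij}
        + \boldsymbol{\gamma}_{2}^{\top}\mathbf{S}_{ij}
        + \gamma_{3} R_{ij}\right) + e_{ij}, \\
\pi_{0j} &= \beta_{00} + u_{0j}.
\end{align}
```

Here $i$ indexes students and $j$ schools. Setting all $\beta$ and $\gamma$ coefficients to zero recovers the two-level null model, and the $\gamma$ terms carry the moderation of the student-level effects by school language.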
As indicated, in addition to the standardised score variables we have many categorical variables from the survey design. Because these have four or fewer categories, we do not treat any of the categorical variables as linear (numeric or continuous); instead, we estimate the effect of each category separately.
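Estimating each category separately amounts to entering a categorical variable as a set of 0/1 indicators against a reference category. The following is a minimal sketch of that coding step, not the authors' code: the function name `dummy_code` and the example responses are hypothetical, and the choice of "completely" as reference category is an assumption based on the comparisons reported in the results.

```python
def dummy_code(values, categories, reference):
    """Return one 0/1 indicator column per non-reference category."""
    if reference not in categories:
        raise ValueError("reference must be one of the categories")
    levels = [c for c in categories if c != reference]
    return {
        level: [1 if v == level else 0 for v in values]
        for level in levels
    }


# Category labels follow the trust items discussed in the text.
TRUST_LEVELS = ["completely", "quite a lot", "a little", "not at all"]

# Hypothetical responses of five students (illustration only, not ICCS data).
responses = ["completely", "a little", "not at all", "quite a lot", "a little"]

# "completely" is the assumed reference category, so it gets no column;
# a student in the reference category has zeros in every indicator.
indicators = dummy_code(responses, TRUST_LEVELS, reference="completely")

for level, column in indicators.items():
    print(level, column)
```

Each resulting column receives its own coefficient in the regression, so no ordering or equal spacing between the response categories is imposed.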
To sum up our design: the data require hierarchical modelling, and we opt for a random effects model with school random slopes and intercepts. In addition, we impute missing observations using Multivariate Imputation via Chained Equations and standardise all composite variables. All categorical variables are used as factor variables, estimating the effects for each category. For model estimation we use REML and student total weights.
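The standardisation of composite variables can be illustrated with a short sketch. It is a simplified illustration under stated assumptions, not the authors' procedure: the function name `standardise` and the scale scores are hypothetical, the population standard deviation is used, and student weights are ignored.

```python
from statistics import fmean, pstdev


def standardise(scores):
    """Rescale scores to mean 0 and standard deviation 1 (unweighted)."""
    mean = fmean(scores)
    sd = pstdev(scores)  # population SD; an assumption of this sketch
    if sd == 0:
        raise ValueError("cannot standardise a constant variable")
    return [(s - mean) / sd for s in scores]


# Hypothetical scale scores on the international metric (mean 50, SD 10).
scale_scores = [38.2, 45.0, 50.0, 52.5, 61.3, 47.9]

z_scores = standardise(scale_scores)
print([round(z, 2) for z in z_scores])
```

After this transformation a coefficient on a composite variable reads as the change in the CCK score associated with a one standard deviation change of that variable within the country sample.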

Results: confirmatory analysis
We apply hierarchical modelling to show the effects of class culture (OPDISC), trust in media-related variables (TRUST_C and TRUST_SM) and influence of religion (RELINF) on the civic knowledge score. We have the following models: Model 0: empty model; Model 1: Model 0 + ITLANG (a variable indicating the language used by the student in the test; ITLANG = 1 indicates Estonian); Model 2: Model 1 + student background variables; Models 2A and 2B introduce student involvement factors step by step (first SINT and then SCHPART); Model 3: Model 2B + OPDISC. Starting from Model 3, we use the stepwise method to show the effect of our main independent variables: classroom climate, trust in traditional and social media, and trust in religion (higher values indicating more positive attitudes toward the influence of religion). Interaction terms between Estonian schools (ITLANG) and the independent variables are applied in Models 6 and 7.
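For orientation, the full specification with interactions could be written along the following lines. This is a hedged sketch assembled from the variable definitions given in the estimation section, not the authors' own equation: the coefficient labels are assumptions, L is indexed by school j on the assumption that test language coincides with school language, the coefficients on the categorical trust variables M and S are written as vectors because each category is estimated separately, and the random slope term is omitted because the text does not state which variable carries it.

```latex
\begin{aligned}
Y_{ij} ={}& \beta_{0} + \beta_{1} L_{j}
        + \boldsymbol{\beta}_{2}^{\top} F_{ij}
        + \boldsymbol{\beta}_{3}^{\top} B_{ij}
        + \boldsymbol{\beta}_{4}^{\top} M_{ij}
        + \boldsymbol{\beta}_{5}^{\top} S_{ij}
        + \beta_{6} R_{ij} \\
       & + L_{j}\left(\boldsymbol{\beta}_{7}^{\top} M_{ij}
        + \boldsymbol{\beta}_{8}^{\top} S_{ij}
        + \beta_{9} R_{ij}\right)
        + u_{0j} + e_{ij}
\end{aligned}
```

Here i indexes students and j schools, u_{0j} is the school random intercept and e_{ij} the student-level residual. The earlier models in the sequence correspond to setting the later blocks of coefficients to zero.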

Intraclass correlations
In each case, we see how much variation of the dependent variable is explained between schools or between individuals (students). To obtain meaningful intraclass correlation coefficients (ICC) for models with random slopes as well, we use the "adjusted" option of the icc function in the R sjstats library. The adjusted ICC uses the mean random effect variance, which is based on the random effect variances for each value of the random slope and random effects (see also Johnson 2014). In general, the variance in question amounts to 16.5% of total variance and can be decomposed between schools, school language, and students. We show that language-level variance is larger than school-level variance (see Table 2); the latter is the 'similarity coefficient', which indicates the average degree of similarity between students in the same school. Model 1 shows that a large part of the school-level variance can be explained by school language. Controlling for individual-level characteristics also affects school-level variability (M2). Controlling for student involvement (M2A and M2B) has only a marginal effect on school-level variability, while our variable of interest, OPDISC (M3), has a slightly bigger effect. Including the trust in media and religion variables (Models 4 and 5) produces a marginal decrease in school-level variability, so we can assume that language explains the similarity of the students more than religious attitudes and trust in media do (including interactions). Still, with the full specification of the model (Model 7) we almost eliminate the school-level similarity between students.
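The logic of the ICC can be shown with a small sketch. It computes the plain ICC of a random-intercept model (between-school variance as a share of total variance), not the adjusted ICC for random-slope models used in the paper; the function name `icc` and the variance components are hypothetical and were chosen only so that the result matches the 16.5% share quoted in the text.

```python
def icc(between_var, within_var):
    """Share of total variance that lies between groups (here, schools)."""
    if between_var < 0 or within_var < 0:
        raise ValueError("variance components must be non-negative")
    total = between_var + within_var
    if total == 0:
        raise ValueError("total variance is zero")
    return between_var / total


# Hypothetical variance components (illustration only, not estimates from the paper).
school_var = 1650.0   # between-school variance
student_var = 8350.0  # residual, between-student variance

share = icc(school_var, student_var)
print(f"between-school share of variance: {share:.1%}")
```

As covariates that explain differences between schools are added, the between-school component shrinks and so does the ICC, which is the pattern traced across Models 1 to 7.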

Confirmation of hypotheses
In Table 3, we report the regression estimates for Models 2, 3, 5 and 6. In Model 2 we see that the 57-point CCK gap decreases to 52 points when we control for student-specific characteristics such as gender, language at home, immigrant status, and the highest educational level and highest occupational level of the parents. So, as expected, background characteristics explain part of the gap, but not all of it. Before estimating Model 3, which includes our independent variable for classroom climate (OPDISC), we estimate Models 2A and 2B, which we report only in Table 2.
In Table 3, we report the test of the first hypothesis, H1: the OPDISC effect for Russian language schools is larger than the average effect and highly significant, while for Estonian school students the effect size is smaller (Interaction1 is negative). We see that, on average, a one standard deviation change in OPDISC has a significant positive effect on student achievement. In Model 5, we show the effects of the other independent variables: trust in media, trust in social media and attitudes toward the impact of religion in society (RELINF). For the latter we see, as expected, a significant, negative and large effect: a one standard deviation increase in the belief in the influence of religion in society is associated with student results 23 points lower. However, this insight does not yet allow us to confirm H3.
Interpreting the effects of the trust indicators for media and social media is more complex. In general, we see nonlinearity, as expected. In the case of trust in traditional media, student results are considerably higher for those answering "quite a lot" or "a little" than for those answering "completely", while "not at all" has an insignificant effect compared to "completely". In the case of trust in social media, the effect sizes increase as the 'degree of trust' decreases. The effects are significant in all cases, so we interpret the estimates as follows: lower trust in social media is associated with higher average student scores in civic knowledge.
Table 2 Variance explained between students and schools. The table reports the ICC as a proportion of total variance, using the adjusted ICC, which is based on the mean random effect variance computed from the random effect variances for each value of the random slope, as suggested by Johnson et al. (2014).
In Model 7 (Table 4), we test hypotheses 2 (2.1 and 2.2) and 3 by adding interactions between ITLANG and the independent variables to Model 6. As we see, all media trust variables and the ITLANG school-level variable turn insignificant. Because the main effects of the student-level independent variables now reflect the effects for Russian school students, we conclude for H2.1 (trust in media) that Russian schools do not differ significantly from Estonian schools: the media effect by school language turns insignificant. For H2.2 (trust in social media), Russian schools differ significantly from Estonian schools: social media effects are insignificant for Russian schools but significant and of the expected sign for Estonian schools. In Estonian schools, lower trust in social media is associated with higher student scores, especially for "not at all" compared to "completely".
In addition, because Model 7 captures the moderating effects of school language on the trust in media variables, the school language variable (ITLANG) itself turns insignificant, both statistically and in magnitude. Finally, H3 (attitudes toward the influence of religion): we see a significant negative impact in both types of school, so that a weaker belief that religion has an influence in society is associated with higher scores in civic knowledge. However, the effects for Estonian schools are larger, meaning more negative. We therefore do not confirm our H3 that RELINF has a positive effect on CCK in Russian schools, although the effect is less negative there than for Estonian school students. See Appendix 3 for the full model specifications with controls and for model fit statistics. By way of comment, most of the controls have the expected signs and significance, including the student involvement factors. Among the student-level controls, gender (girl), having high-HISEI parents, books at home and a test language that matches the home language have considerable positive effects, while HISCED and IMMIG have unexpectedly insignificant effects. School-level variables such as SINF and SPART also have positive and significant effects.
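The reading of the interaction terms can be made explicit with a short derivation. The symbols below are introduced here for illustration and are assumptions, not the authors' notation: \beta_{R} denotes the main effect of RELINF, \beta_{L \times R} the coefficient on its interaction with school language, and L = 1 indicates an Estonian language school.

```latex
\frac{\partial Y}{\partial R} = \beta_{R} + \beta_{L \times R}\, L
\quad\Longrightarrow\quad
\begin{cases}
\beta_{R} & \text{Russian language schools } (L = 0),\\
\beta_{R} + \beta_{L \times R} & \text{Estonian language schools } (L = 1).
\end{cases}
```

Under this parameterisation the main effect is the effect for Russian school students, and a negative interaction coefficient makes the total effect for Estonian school students more negative, which is the pattern described for H3.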

The mechanism behind heterogeneous effects: language of the school or not
In this section, we look at a subsample: non-Estonians in Estonian schools. Figure 6 illustrates the distribution of civic education scores of non-Estonians in Estonian schools (136 observations) compared to all Russian school students. This exercise examines whether there is some self-selection into the Estonian language schools based on background characteristics and, above all, serves as a robustness check of our argument that school language is the mediator, not the explanation, for the achievement gap.
Descriptively, in Fig. 6 we see clear differences in distribution, while the mean scores of both groups are similar: 505.56 for Russian schools and 505.24 for non-Estonians in Estonian schools. The standard error of the subsample of non-Estonians in Estonian schools is smaller, so there is less variance in the data.
Hence, we are interested in how our estimates for Russian school students (Models 6 and 7) compare with the same specifications estimated on the subsample. We report the estimates in Table 5. We see mostly insignificant effects. Because of the maximum likelihood estimation, these models are not directly comparable with those in Table 3 and Table 4; however, we see a considerable similarity in effect sizes and statistical significance. We make the weak statement that, after controlling for student background and involvement, non-Estonians in Estonian schools perform similarly to students in Russian schools (the effect sizes of their independent variables are similar). So, we speculate that school language is just a 'cultural indicator' that moderates our independent variables and that school language itself is not an explanatory cause driving CCK scores.

Conclusions
Our aim was to explain the achievement gap (16.5% of total variance) between Estonian and Russian school students in Estonia. The empirical strategy for this was to minimise (explain) between-school and between-student variance. We showed that the gap between schools (16.5%) could be decomposed into between-school variance and between-student variance. In Model 1, we saw that between-school variance drops considerably if we control for school language. In Model 2, we indicated that individual background characteristics explain most of the between-school variance. Finally, after controlling for behavioural and attitudinal independent variables such as perceived openness of classroom climate, belief in the influence of religion, and trust in traditional and social media, the between-school variance in CCK is less than 1.5%. We use this model as the final specification.
We hypothesised that open classroom climate, trust in social and classical media, and the belief in the influence of religion explain the CCK gap between Russian and Estonian school students. In general, we formulated three hypotheses: H1, H2 (H2.1 and H2.2) and H3.
First, H1 stated that open classroom climate supports Estonian school students more. We saw the opposite: Russian school students benefit more from an open classroom climate, and this effect is highly significant. However, because the classroom climate in Russian schools is on average less open, the gap is still partially explained by this.
Even though trust in classical media channels is low on both sides of the language divide, it is exceptionally low among Russian school students. We were interested in whether this distinctiveness in trust in media (H2.1) and trust in social media (H2.2) has significant explanatory power for the knowledge gap between schools of different languages. We found that trust in the classical media has a positive, inverted U-shaped effect. Thus, students who trust media quite a lot or moderately (compared with completely trusting) have higher civic knowledge scores. Still, there is no significant difference between Estonian and Russian language school students in that regard. At the same time, social media as an independent variable acts differently: a small amount of trust in social media tends to increase student scores. However, this effect is present only in the case of Estonian school students.
Finally, we hypothesised (H3) that belief in the influence of religion in society matters in association with civic knowledge and that this effect is heterogeneous: positive for Russian school students and negative for Estonian school students. Descriptive statistics revealed that Russian school students have higher mean scores and less variance on the composite index measuring 'influence of religion' on everyday life. We showed that belief in the influence of religion on civic life matters for both language groups: a lower role of religion in everyday life is associated with higher CCK scores. However, this effect is significantly larger for Estonian school students. So, we found no confirmation of H3: in both types of school the 'religion variable' has negative effects, although different in size.
Moreover, our final specification allowed us to 'get rid of' the Estonian school (language) effect. We saw that when interactions are brought in, the Estonian school (language) effect turns statistically insignificant and also small in size. Thus, the language of the school influences the educational outcome indirectly, through the perception of the openness of the classroom, trust in media and religion.
We confirmed the robustness of our results by using a subsample of non-Estonians in Estonian schools. These students have mean scores in civic knowledge similar to those of Russian school students. We also see many similarities in effect sizes. This demonstrated that the positive effect of 'openness of classroom' and the negative effect of 'belief in the impact of religion' leave non-native students worse off independently of school language. Thus, we speculate that the civic knowledge gap of Russian language school students is, in addition to student background characteristics, explained by attitudinal and behavioural variables such as trust in social media and religiosity, a phenomenon also revealed by others (Schulz et al. 2018b; Schulz and Ainley 2018). The explanatory role of student background lies in socio-economic characteristics, working through lower parental occupational status (HISEI).
To conclude, our school random effects hierarchical models showed that most student background and involvement characteristics work as expected. Regarding behavioural and attitudinal variables, we showed that student-oriented classroom practices benefit Estonian students more than minority school students because these practices are experienced more often in Estonian schools. Russian school students have lower trust in classical media, and it can be speculated that this reflects their mingling between Estonian and Russian media channels. At the same time, we found marginal evidence that high levels of trust (or no trust at all) hurt students. Estonian students benefit considerably from low trust in social media, while the effects are insignificant for Russian school students. In addition, these associations turned out to be more complex than expected and should be elaborated further, addressing the concerns of reverse causality. Finally, the impact of belief in religion (instead of civil law) is stronger (more negative) for Estonian schools and has significant effects in both: the lower the role of religion, the better the results.
What could one learn from this comparative exercise? First, as the literature is torn between input quality (student background) and school quality (cf. Timmermans 2012) in explaining student achievement, we bring in another dimension: how can individual-level attitudinal and behavioural characteristics affect student civic achievement in different cultural settings (operationalised as the language of the school)? This 'social experiment' in Estonia, where minority- and majority-language students are embedded in different cultural settings (school language), gives us some weak evidence that the cultural setting (keeping Russian-speaking students in Russian language schools) moderates the effects of behavioural and attitudinal factors. However, we have no evidence that it can reverse the effects (e.g. belief in religion is still negatively associated with student achievement in civic knowledge), at least as far as we discuss students largely from a European-Christian background where the cultural split is mostly between Christianity in its Orthodox and Protestant dimensions. It can also be argued that the moderating effect of cultural setting shows how minority students are detached from civil society in specific case-contexts (see also Masso et al. 2013), and thus our study has limited external validity.
However, given the robustness of the positive association between open classroom climate and CCK, one policy implication can at best be related to our first hypothesis: Russian school students can benefit from a changed teaching paradigm, proxied as the openness of classroom climate. This means that the freedom to express their opinions and to discuss and dispute politics can benefit minority students to a large extent, and policies that eliminate the barriers that have held Russian schools back in implementing that change would be an important step in diminishing the minority achievement gap in Estonia.
In addition to the external validity concerns stated above, our study has limitations originating from our estimation strategy. A hierarchical model specification using random slopes and intercepts handles unobservable heterogeneity between schools well, so the estimation strategy is designed to adjust for unmeasured covariates; however, this is not enough to call the method a causal estimation technique. Accordingly, all effects should be interpreted as associations or partial correlations at best. Our second hypothesis needs specific care. Although we have some theoretical evidence that media consumption is a possible explanatory factor for civic engagement, our estimation strategy and model specification do not rule out the problem of reverse causality. Namely, low scores in civic and citizenship education can lead to extreme values of trust in media: either no trust or complete trust. So, we encourage further studies and causal methods to test for model specification error.

Appendix 2: Imputation of missing data
We have missing data in 12 variables. HISEI, which indicates the highest parental occupational status, has 81 missing observations. Table 7 indicates that, disregarding HISEI, variables have less than 3% of observations missing. Best practices for missing data management (e.g. Schlomer et al. 2010), which include reviewing the patterns of missing data and describing the strategies for dealing with them, suggest that mean substitution is a poor method for handling missing data. Our variables are often measured on ordered categorical scales, but we also have binary and standardised continuous variables.
We use the Multivariate Imputation via Chained Equations (MICE) package to create multiple imputations (Van Buuren and Groothuis-Oudshoorn 2011). Compared with a single imputation (such as the mean), multiple imputation is considered to take account of the uncertainty in missing values. MICE assumes that the missing data are Missing at Random (MAR), which is supported by the pattern of missing observations revealed in Fig. 7. MAR means that the probability that a value is missing depends only on observed values and can be predicted from them. MICE imputes data on a variable-by-variable basis by specifying an imputation model per variable. Multiple imputation is particularly well suited to dealing with missing data in large studies (see Burgette and Reiter 2010), including interactions and nonlinear relations. We use classification and regression trees (cart), which are suitable for both categorical and continuous variables.