Family related variables effect on later educational outcome: a further geospatial analysis on TIMSS 2015 Finland

Family-related factors, such as parents’ educational level, values and expectations, have a significant impact on a child’s early skills and later educational outcomes. Further, parents provide their child, alongside other learning environments, with broad mathematical and early literacy input. This study investigates the relationship between family-related socio-economic and other factors (parental education, number of books at home, parental attitudes towards mathematics and science, and parental perception of the child’s early skills) and the student’s later academic achievement. This is studied in the light of the Finnish data collected for the Trends in International Mathematics and Science Study (TIMSS) 2015. The results are presented with the help of a geospatial method called Kriging, which reveals regional variance. The results indicate that family-related background variables have different effects on a child’s later achievement in mathematics across Finland. The results suggest that some areas in Finland are better than others at ‘levelling the playing field’ for children and minimising the effect of family-related variables on educational outcomes.

correlated with later mathematical and reading skills (Watts et al. 2014). In all, a significant number of studies have been conducted on the effect of different family-related background variables on student achievement; these are described in more detail in the theoretical framework of this paper.
In this paper we also add a new, geographical view to the discussion of the family's effect on students' educational outcomes. The Finnish educational system strives for equality and equity in education for all. Despite this aim, there is still significant variance in socio-economic variables as well as in educational outcomes in mathematical, scientific and reading literacy across the country, as evidenced in previous PISA studies (Harju-Luukkainen and Vettenranta 2013; Vettenranta and Harju-Luukkainen 2013; Harju-Luukkainen et al. 2014; Vettenranta 2015; Harju-Luukkainen et al. 2016). However, no previous studies shed light on the areas in which parent-related variables and a child's early skills have a clear effect on students' later educational outcomes and the areas in which the effect is minimal. Therefore, we do not know which areas of Finland are good at "levelling the playing field".
From these premises, this study investigates the relationship between family-related factors and student performance in the Trends in International Mathematics and Science Study (TIMSS) 2015. More specifically, the analysis focuses on the areal variation of Finnish fourth-graders' test performance in mathematics as explained by certain family-related factors. The results are presented with the help of a geostatistical method called Kriging. This method offers a relatively new way of illustrating spatial distributions of educational variables and enables, for instance, related analyses of regional variation (Harju-Luukkainen and Vettenranta 2013; Vettenranta and Harju-Luukkainen 2013; Harju-Luukkainen et al. 2014; Vettenranta 2015; Harju-Luukkainen et al. 2016).

Socio-economic status and student performance
The index of socio-economic status (SES) has typically been employed in sociological and educational research on educational inequality to address or control for socio-economic differences. SES is generally regarded as one of the major variables in explaining student performance, together with institutional variables and other social status indicators such as ethnic background, gender, and other family-related factors (e.g. Yang 2003). Sirin (2005) conducted a meta-analytic review (comprising 58 articles from 1990 to 2000) of the relationship between SES and student achievement at both individual and school level. According to Sirin (ibid.), a family's socio-economic status is a clear correlate of academic performance at the individual level (average correlation of 0.299). Students with higher family SES are found to have much higher educational achievement than those with poorer family resources (e.g. Okpala et al. 2001; Engin-Demir 2009; Yang and Gustafsson 2004; Battle and Lewis 2002; Tomul and Savasci 2012). In another study, Sutton and Soderstrom (1999) also emphasised the "significant relation" between SES and student performance, where SES accounted for 74 percent of the variance in achievement, together with other factors over which "schools have no control", such as ethnic composition.
The explanatory power of SES-related factors for student achievement varies between countries. In Finland, socio-economic background explains very little of the between-student variance in mathematical literacy. According to Harju-Luukkainen et al. (2014), the student's economic, social and cultural status index (ESCS) could explain only approximately 9 percent of the between-student variance in the PISA 2012 mathematics test results. This was among the lowest percentages within the OECD countries (OECD average 15 percent). Among students with an immigrant background in Finland, too, the ESCS index explained only 10 percent of the variance between students. This has led to an assumption that the explanatory power of the ESCS would be equally low throughout Finland and throughout the span of education.
However, the influence of SES-related factors is not constant throughout the span of schooling. Caro et al. (2009) found that the achievement gap caused by SES varies by student age group from childhood to adolescence. According to their findings, the gap remains stable from the age of 7 to 11 years and widens increasingly from 11 to 15 years. This bears no clear implications for interventions, however; these are most likely needed and benefit disadvantaged children at any age from early childhood to adolescence. As such, SES-related differences in learning outcomes seem to remain more modest during elementary school but tend to grow significantly wider in the upper grades and stages.
The Trends in International Mathematics and Science Study (TIMSS) provides information on SES indicators related to student achievement, including the parental education level, the number of books at home, and home educational resources. TIMSS data over time have consistently shown positive associations between student achievement and SES indicators (see Baker et al. 2002; Bouhlila 2015; Byun and Kim 2010; Chudgar and Luschei 2009; Hanushek and Luque 2003; Harris 2007; Liu et al. 2006; Takashiro 2016; Yang 2003). For example, Baker et al. (2002) explored the effect of both student SES and school resource quality on student achievement in 36 countries, finding parent education levels and the number of books at home to be important. Similarly, Baker et al. (2002) reported that student SES explained from 1.5 to 20% of the variation in math and science test scores. In Korea, Byun and Kim (2010) also reported a strong relationship between student SES and achievement in TIMSS data from 1999, 2003, and 2007. Specific indicators of student SES have been explored within TIMSS. Recently, Takashiro (2016) found, using TIMSS 2013 Japan data, that the number of books, the possession of computers, and parental education as student SES indicators had a positive effect on student mathematics achievement. The largest predictor was the number of books, which accounted for 10.7% of the variance in student achievement. Overall, student SES appears to be a clear contributor to student achievement in TIMSS.

Early skills and later performance
Early childhood has become a policy priority in many countries (Garvis et al. 2018). There is a widely held view that high-quality early childhood education (or learning environments) provides many benefits for children and families in both the short and the long term. However, according to Taguma et al. (2012), these positive benefits are related to the 'quality' of early childhood education. The challenge lies in the fact that the definition of quality differs across countries and across interest groups. Research exists on the quality of early childhood environments from different perspectives, but less attention has been paid to the family's influence, even though parents are a child's first educators.
Parents provide their child, alongside other learning environments, with broad mathematical and early literacy input. The type of input matters, since mathematical knowledge in the early years is strongly correlated with later mathematical and reading skills (Watts et al. 2014). Similarly, according to Hannover Research (2016), early academic skills related to literacy and math are the most significant predictors of future academic achievement. Children's early non-academic skills, such as social competence and self-regulation, also contribute to school success.
However, not all kinds of support have an impact on the child's skills. Zippert and Rittle-Johnson (2018) found barely any links between parent support and children's broad mathematical skills. Further, a recent longitudinal study of 554 three-year-old children conducted by Lehr et al. (2019) shows that book exposure and the quality of verbal interaction regarding mathematics both predicted mathematical outcomes in secondary school, and that those effects were mediated through early language and arithmetic skills. Reading outcomes in secondary school were not directly predicted by early home learning environments but indirectly via early language and literacy skills. Path models revealed that the different dimensions of the early home learning environments were differentially associated with preschoolers' early competencies. All effects remained significant when the concurrent home learning environments during secondary school, which predicted reading outcomes directly, were included. Therefore, the quality of early learning environments seems to have an impact on later outcomes, which in turn have an impact on students' future prospects. The impact of early academic skills on students' educational outcomes can in turn vary depending on gender, socioeconomic status and English proficiency (Hannover Research 2016).
With this study we answer one research question: Do the family-related background variables in the TIMSS 2015 data (parental educational level, parental attitudes towards mathematics and science, and parental perception of the child's early skills) have a different effect on students' educational outcomes in different areas of Finland? In order to answer the question, we fit a linear regression model to the data and then display the results as a contour map of Finland, visualising the effect of the family-related background variables on student achievement geographically, something that has not been done previously in educational research.

Data
This paper draws on the TIMSS 2015 data for fourth-graders in Finland. The data comprise 158 schools and 5251 students. TIMSS is an international assessment of mathematics and science at the fourth and eighth grades. The first TIMSS assessment took place in 1995 and the programme has continued with subsequent rounds every four years since then. Approximately 70 countries were involved in TIMSS 2015, which makes it one of the largest international assessments in the world. TIMSS is conducted by the International Association for the Evaluation of Educational Achievement (IEA), which is an independent international cooperative of national research institutions and government agencies conducting cross-national achievement studies. The assessment yields information not only about students' overall achievement, backgrounds and attitudes toward mathematics and science but also about their teachers' education and training, classroom characteristics and activities, and school contexts for learning and instruction in mathematics and science. Accordingly, the TIMSS 2015 assessment employed questionnaires for students as well as for parents, schools and teachers.

Variables
For this study, the family-related index variables described below were chosen from the parent questionnaire (Early Learning Survey) for further analysis. These index variables were chosen according to their joint explanatory power for children's performance in mathematics, also taking into account the simplicity of the statistical model developed. The variables were derived from the following questions about the child's early skills: (a) "How well could your child do the following when he/she began primary/elementary school?" and (b) "Could your child do the following when he/she began primary/elementary school?"; a question about the parents' educational level: (c) "What is the highest level of education completed by the child's father (or stepfather or male guardian) and mother (or stepmother or female guardian)?"; a question about how parents value mathematics and science: (d) "How much do you agree with these statements about mathematics and science?"; and a question about the number of books at home: (e) "About how many books are there in your home?". A full list of variables and their options can be found in the Appendix. All four variables are indices standardised to a mean of 0 and a standard deviation of 1. The variables used in the modelling and the number of observations are shown in Table 1. The numerous missing observations were excluded from the analysis, as we had no particular reason to presume that the missing data would be regionally skewed, although the coefficients of the model could be affected.
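The standardisation described above can be illustrated with a minimal sketch. It assumes an unweighted z-score using the population standard deviation and treats missing answers as NaN; the function name `standardise` and the example values are invented for illustration and are not taken from the TIMSS data or its official scaling procedure.

```python
import numpy as np

def standardise(values):
    """Rescale an index to mean 0 and standard deviation 1.

    Missing observations (NaN) are ignored when computing the mean and the
    standard deviation and remain NaN in the output.
    Assumption: unweighted, population standard deviation (ddof=0).
    """
    values = np.asarray(values, dtype=float)
    mean = np.nanmean(values)
    sd = np.nanstd(values)
    return (values - mean) / sd

# Invented raw index values for five parents; one answer is missing.
raw_index = np.array([2.0, 4.0, np.nan, 6.0, 8.0])
z = standardise(raw_index)
```

After this step a value of 0 corresponds to the national mean of the index, which is how the map scales in the Results section can be read.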

Analytical methods
This study investigates the effect of the aforementioned family-related factors on Finnish fourth-graders' performance in the TIMSS 2015 mathematics test as well as the related regional variance. This is done in three steps. Firstly, a one-stage linear regression model (Model 1) was fitted to the data, using the following formula:
$$SC_{ijk} = f(X_{ijk}) + e_{ijk} \qquad (1)$$
where $SC_{ijk}$ is the weighted (total student weight in the TIMSS data) mathematics score point of student $ijk$, $X_{ijk}$ is a vector of indices formed from the home questionnaire of student $ijk$, and $e_{ijk}$ is the model residual for student $ijk$ in class $jk$ and school $k$. Normally, when dealing with stratified data such as the TIMSS data, the hierarchical structure of the data is taken into account by fitting a multilevel model (Model 2) to the data as follows:
$$SC_{ijk} = f(X_{ijk}) + SCH_{k} + Cl_{jk} + e_{ijk} \qquad (2)$$
where $SCH_{k}$ is the school random factor, $Cl_{jk}$ is the class random factor and $e_{ijk}$ is the residual of student $ijk$ in class $jk$ and school $k$. In this case the variation between students can be divided into variation between classes, variation between schools and residual variation. Thus the distribution of $e_{ijk}$ would be much narrower in Model 2 than in Model 1. As we are especially interested in students' deviation from the national average as well as in total regional differences in students' conditions, we used Model 1 in our preliminary analysis.
Secondly, a method called Kriging is applied to the data. This method offers a relatively new way of illustrating spatial distributions of educational variables and enables, for instance, related analyses of regional variation (Harju-Luukkainen and Vettenranta 2013; Vettenranta and Harju-Luukkainen 2013; Harju-Luukkainen et al. 2014; Vettenranta 2015; Harju-Luukkainen et al. 2016). More specifically, Kriging is a geostatistical interpolation method based on the spatial autocorrelation among the measured points. The geographical distribution of the schools in the TIMSS sample is not even.
This means, for example, that southern Finland and the larger cities have more sampled schools than the more sparsely populated areas. With the Kriging method, predictions can be made from a fixed number of nearest observations instead of a predefined search radius, which makes the method suitable even for areas where schools are sparse. Kriging weights the surrounding measured values to derive predictions for non-measured locations according to the distance between the measured points and the prediction location and the overall spatial arrangement of the measurements (McCoy and Johnston 2001). The results are presented as a geographical map. In this case, the method produces a contour map illustrating areal differences in the TIMSS variables affecting student achievement in mathematics and in the regional averages of the model residuals. For every school in the data, the averages of the model residual and of the other variables used were calculated. These school averages were used as spatial observations describing the realisation of regional differences. The regional average of the residuals describes the county-level bias of Model 1. Kriging predictions were calculated from the 12 nearest observations (school averages) for each node of a 10 by 10 km raster across the country, weighting the observations according to their distance from the node. The predicted values were then smoothed onto maps. The scales in the maps were fixed manually in order to maximise readability and simplicity.
Thirdly, on the basis of the regional distribution of model residuals observed on the map, Finland was divided into two regions according to the sign of the residuals (overestimates and underestimates), and this division was used as an areal dummy variable in the third model. The interactions between the areal dummy and the other variables were also studied.
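The interpolation step described above can be sketched as ordinary Kriging from the 12 nearest school averages onto a 10 km raster. This is a minimal illustration only: the text does not state which variogram model or software settings were used, so the exponential variogram and its parameters (`sill`, `range_km`), the function names and the randomly generated school coordinates and residuals are all assumptions made for the example.

```python
import numpy as np

def variogram(h, sill=1.0, range_km=150.0):
    """Exponential semivariogram without nugget (hypothetical parameters)."""
    return sill * (1.0 - np.exp(-h / range_km))

def krige_point(xy_obs, z_obs, xy_target, k=12):
    """Ordinary Kriging prediction at one location from the k nearest observations.

    Returns the predicted value and the Kriging weights of the k neighbours.
    """
    dist_to_target = np.linalg.norm(xy_obs - xy_target, axis=1)
    nearest = np.argsort(dist_to_target)[:k]
    pts, vals = xy_obs[nearest], z_obs[nearest]
    n = len(nearest)

    # Kriging system: semivariances between the neighbours, bordered by the
    # unbiasedness constraint that forces the weights to sum to one.
    pair_dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(pair_dist)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = variogram(dist_to_target[nearest])

    solution = np.linalg.solve(A, b)
    weights = solution[:n]  # the last element is the Lagrange multiplier
    return float(weights @ vals), weights

# Invented data: 40 school locations (km) and school-average model residuals.
rng = np.random.default_rng(0)
schools_xy = rng.uniform(0, 300, size=(40, 2))
school_resid = rng.normal(0, 10, size=40)

# Nodes of a 10 by 10 km raster over the invented 300 km x 300 km area.
grid_x, grid_y = np.meshgrid(np.arange(0, 301, 10), np.arange(0, 301, 10))
nodes = np.column_stack([grid_x.ravel(), grid_y.ravel()])
surface = np.array([krige_point(schools_xy, school_resid, node)[0] for node in nodes])
surface = surface.reshape(grid_x.shape)
```

Using a fixed number of neighbours rather than a fixed search radius means that every raster node receives a prediction, including nodes far from any sampled school; the `surface` array is what a contouring routine would then turn into a map.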
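The first and third steps can be sketched as a weighted least-squares fit with an areal dummy and its interaction with one index. This is a hedged illustration on simulated data, not the authors' code: the helper `weighted_ols`, the variable names, the two predictors and every coefficient used to generate the toy scores are invented, and the uniform weights merely stand in for the TIMSS total student weights.

```python
import numpy as np

def weighted_ols(X, y, w):
    """Weighted least squares: minimise sum_i w_i * (y_i - X_i @ beta)**2."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

rng = np.random.default_rng(1)
n = 2000
basic_skills = rng.normal(size=n)                   # standardised index (simulated)
books = rng.normal(size=n)                          # standardised index (simulated)
area = (rng.uniform(size=n) < 0.2).astype(float)    # 1 = positive-residual area
weights = rng.uniform(0.5, 1.5, size=n)             # stand-in for student weights

# Invented coefficients, used only to generate toy mathematics scores.
true_beta = np.array([500.0, 30.0, 10.0, 10.0, -6.0])
X = np.column_stack([
    np.ones(n),            # intercept
    basic_skills,
    books,
    area,                  # shift of the intercept in the chosen area
    area * basic_skills,   # change in the basic-skills effect in that area
])
score = X @ true_beta + rng.normal(0, 5, size=n)

beta = weighted_ols(X, score, weights)
residuals = score - X @ beta
```

In such a model the coefficient of `area` shifts the intercept in the chosen region, while the coefficient of `area * basic_skills` shows how much weaker or stronger the effect of basic skills is there compared with the rest of the country.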

Results
In order to determine the family-related factors that best explain fourth-graders' mathematical skills and, on the other hand, the differences in the effects of family across Finland, we fitted the one-stage model (Model 1) to the TIMSS 2015 data. The coefficients of the model are shown in Table 2. Among the four candidates in this study, the most powerful factor explaining students' mathematical skills in the fourth grade is the student's basic skills at the beginning of school. This model can explain about 31 percent of the total variation in students' actual mathematics score points. In Finnish conditions the class-level variation is so small that in the three-level model (Model 2, $SC_{ijk} = f(X_{ijk}) + SCH_{k} + Cl_{jk} + e_{ijk}$) the school-level variation almost disappears. The two-level model (without the random school factor) could explain the total variation in students' actual mathematics score points better, but the distribution of the residuals would become narrower, which would have been suboptimal for regional inspection.
In this paper the main results are illustrated by means of contour maps produced by applying the Kriging method to the TIMSS 2015 data for Finland. Figure 1a-d depicts the entire country in terms of the four indices. (For the purposes of statistical analysis, the national mean of each index was set to 0 with a standard deviation of 1.) The maps illustrate areal variation in the different indices across the country. As regards the index for children's basic skills when beginning school (1a), we can find the most negative conceptions among parents in the eastern and western areas of Finland (−0.46 to −0.2). In other parts of Finland the parents had a more positive view of their child's basic skills in mathematics and science. Figure 1b shows the results for the index for the number of books at home. On average, the reported numbers of books were lowest in the northern parts and highest in the southern parts of Finland. Parents' level of education (Fig. 1c) seems to be scattered around Finland. It is hardly a surprise, however, that the areas with the highest educational levels coincide with the locations of universities in Finland. Figure 1d presents how much parents value mathematics and science. The results for this index vary to some extent all over the country, but the most positive attitudes can be found in the southern and central parts of Finland. Figure 2 combines the variables presented in Figs. 1a-d, the statistical model and children's actual achievement level into a map and shows the residuals of the model applied to the data (Table 1). The indices in this model can explain about 31 percent of the total variation in students' actual mathematics score points. In the light areas of the map the students reached higher scores in TIMSS than other students with the same level of family-related factors. In other words, in these areas the students performed better than their family-related factors would suggest.
Correspondingly, in the dark areas of the map the students underachieved in this respect, that is, scored lower than could be expected on the basis of these factors. Hence, the effect of family-related factors on student achievement is by no means constant but varies across the country. This also means that in the light areas of the map, family background seems to play a smaller role in the student's success than in other places. Therefore, there is evidently something in these areas that is levelling the playing field and supporting the students.
For a closer investigation of areal effects on student performance, seven counties in south-east, west and south-west Finland, marked with "X" on the map, were assigned to the group of positive model residuals. This group consists of 35 schools and about 20% of the total number of students. When a two-stage linear model with the areal dummy was fitted to the data, the coefficients of the model were as shown in Table 3. This model could explain about 35% of the total variation in students' actual mathematics score points. The intercept was 8 points higher and the effect of basic skills was about 20% smaller (coeff.: 27.5, −5.4) in the positive-residual areas than in the other parts of Finland. The interactions between the other variables and the areal dummy were not significant. Thus, basic skills was the only variable used in the modelling whose strength was significantly different in the chosen areas.
Fig. 1 (a, b, c and d). Areal variation of family-related factors used in the mathematics score point model
Harju-Luukkainen et al. Large-scale Assess Educ (2020) 8:3

Discussion
This study investigated the relationship between family-related socio-economic and other factors (parental education, number of books at home, parental attitudes towards mathematics and science, and parental perception of the child's early skills) and the student's later academic achievement in the Trends in International Mathematics and Science Study (TIMSS) 2015. More specifically, the analysis focused on the areal variation of Finnish fourth-graders' test performance in mathematics as explained by certain family-related factors. The results are presented with the help of a geostatistical method called Kriging. This method offers a relatively new way of illustrating spatial distributions of educational variables and enables, for instance, related analyses of regional variation [...] regions is the nature of the method, which reveals spatial discrepancies on the grounds of the data. In this study we first identified areal variation (a) in parents' views of their child's basic skills in mathematics and science, (b) in the number of books at home, (c) in parental educational level and (d) in how parents value mathematics and science. This was done in order to produce an understanding of how family-related factors affect children's educational outcomes in the different areas of Finland. This model could explain about 31 percent of the total variation in students' actual mathematics score points. All of these (a-d) were combined into Fig. 2, which visualises children's actual achievement level and shows the residuals of the model applied to the data (Table 1). On this map it is possible to recognise areas where children achieve better than would be expected given their family-related factors. However, we could also find areas where children underachieved with regard to the studied family-related factors.
Therefore, in answer to the research question of this study, we can say that the effects of family-related factors are not constant but vary across Finland. In some areas the family-related background seems to play a more limited role in students' educational outcomes, and some factors in these areas seem to be "levelling the playing field" for the students.
Very often international assessment results are presented as country score points, and the underlying assumption among the public is that the results are constant. The results of this study indicate that the effects of parental involvement and socio-economic background factors are by no means static or invariable throughout the country. On the contrary, there is evident geographic variation in these effects (see also Harju-Luukkainen and Vettenranta 2013; Vettenranta and Harju-Luukkainen 2013; Harju-Luukkainen et al. 2014; Vettenranta 2015; Harju-Luukkainen et al. 2016). In other words, family-related background factors have different effects on student achievement in mathematics across Finland: in some areas the effect is stronger than in others. This gives us new information about the complex connections that exist between family-related background variables and students' educational outcomes. According to the results of this paper, some areas in Finland are better at 'levelling the playing field' for students irrespective of family-related factors. To find out why, further research is needed. We therefore need to take a closer look at what is done in these areas, in their schools and social support systems, that minimises the impact of SES-related variables on students' educational outcomes, and at the cultural and societal circumstances that affect the different regional realisation of parental attitudes and values. Further, Finnish education strives for equality for all of its students, and parents are involved in schools' everyday work throughout the entire country. However, the results of this study indicate that Finland is not necessarily as equal as often described when it comes to levelling the educational playing field for students.
We have seen in previous studies that naming, blaming or shaming the schools is not a good way to proceed (Elstad 2009) [...] areas. This is important from a societal perspective as well. With the help of geographical methods it is easier for policy makers, for instance, to understand areal differences and to take the actions needed. The method also makes it possible to view the outcomes of several assessments simultaneously (as demonstrated in Figs. 1a-d and 2). This is important, since a larger perspective gives a better understanding of the challenge at hand. Further, according to Elstad (2009), negative media coverage of schools can spur schools into improvement mechanisms, provoke a hostile reaction or result in panic measures. With the help of larger geographical illustrations we can to some extent avoid these local panic reactions in education. We argue that applying the Kriging method to larger international data sets would give valuable information about different areal variances as well as a better understanding of the reasons behind these inequalities from a global perspective. To date, only the Finnish international datasets have been studied with the Kriging method. There are naturally limitations to this study. The limitations stem from the requirements of the Kriging method used to analyse regional differences, and they carry over to the interpretation of the results. Although the Kriging method provides the best linear unbiased prediction of a spatial stochastic realisation (Suutari and Tarvainen 1999), the weakness of this method lies in its requirements for continuity of the spatial process and in the determination of the stochastic spatial process. Individual schools may cause discontinuities in the studied process (the impact of regional factors on social relationships), which, in turn, may create a bias in the prediction near the point of discontinuity.
This problem is avoided by looking at the differences between provincial-level areas, as is done in this study. The maps should be analysed with this in mind, and minor differences in them should therefore be ignored.

Conclusion
The conclusion of this study is that the effects of family-related factors are not constant but vary across different areas of Finland. In some areas the family-related background seems to play a more limited role in students' educational outcomes, and some underlying factors in these areas seem to be "levelling the playing field" for the students. The results of this study were presented with the help of a geostatistical method called Kriging. We argue that this method offers a way of illustrating spatial distributions of educational variables and enables, for instance, related analyses of regional variation in a way that has not been utilised in the educational sciences before.

Variable: Questions about child's basic skills
How well could your child do the following when he/she began primary/elementary school?
a) Recognize most of the letters of the alphabet
b) Read some words
c) Read sentences
d) Read a short paragraph
e) Write letters of the alphabet
f) Write some words
Options for answers: Very well, moderately well, not very well, not at all
Could your child do the following when he/she began primary/elementary school?
a) Count by himself/herself
b) Recognize written numbers
c) Write numbers
d) Do simple addition
e) Do simple subtraction
Options for answers: Not at all, up to 10, up to 20, up to 100 or higher; and options: yes, no
Variable: Questions about parents' education
What is the highest level of education completed by the child's father (or stepfather or male guardian) and mother (or stepmother or female guardian)?
How much do you agree with these statements about mathematics and science?
a) Most occupations need skills in math, science or technology
b) Science and technology can help solve the world's problems
c) Science explains how things in the world work