Are large surveys of adult literacy skills as comparable over time as we think?

Introduction
The concept of literacy, as defined in this article, is broader than just being able to read or write. Specifically, literacy skills here refer to “the ability to understand, evaluate, use and engage with written texts to participate in society, to achieve one’s goals, and to develop one’s knowledge and potential” (OECD 2013, p. 20). Indeed, as part of its Programme for the International Assessment of Adult Competencies (PIAAC), the Organisation for Economic Co-operation and Development (OECD), in collaboration with many experts, has refined the measurement of adult literacy skills.

There is abundant theoretical and empirical literature exploring the determinants of literacy proficiency. Age is a dimension often considered in analyses as it appears to have a significant effect on skill level. The literature mentions the existence of an inverted "U" curve that reflects the fact that adult literacy skills seem to peak among individuals in their thirties, after which they decline (Green and Riddell 2001; OECD 2016; Statistics Canada 2013; Willms and Murray 2007). This curve is thought to result from the negative effect of aging and the "practice effect" of literacy activities in daily life (Statistics Canada and OECD 2005). On the one hand, cognitive performance decreases with age, which affects the average literacy level of older individuals (Smith and Marsiske 1997). On the other hand, beyond the adverse effect of aging, Wagner (2002) shows that literacy levels also decline over time when literacy activities are not practised in everyday life (life-wide factors). The rising part of the inverted U-shaped curve (associated with people aged 15 to 30) illustrates the central role of education in the development of people's skills. Moreover, the fact that the top of the curve corresponds to an age beyond the average schooling period highlights the importance of lifelong learning (lifelong factors such as continuing education) as well as the practice of literacy activities at work or at home (life-wide factors) (Desjardins 2003; Reder and Bynner 2009).
This inverted "U" curve may also reflect significant variations in education quality and in the number of years of schooling received by individuals (i.e. cohort effects). Indeed, the cross-sectional design of adult literacy surveys makes it difficult to isolate cohort effects from the effects of aging. To disentangle the age, period, and cohort effects that are combined in the cross-sectional relationship between skills and age, experts adopt a pseudo-longitudinal approach that analyzes the different adult literacy surveys simultaneously. Combining data from different adult skills surveys in Canada and using the synthetic cohort methodology, experts (mostly labour economists) show a negative cohort effect (Barrett and Riddell 2016; Flisi et al. 2019; Green and Riddell 2013; Murray et al. 2016; Paccagnella 2016a; Willms and Murray 2007). In other words, Canadians' literacy levels would be declining uniformly from one cohort to the next after controlling for age, education and other characteristics. These studies suggest that the age-related decline in literacy skills may be somewhat underestimated due to differences in cohort composition. None of them mentions the period effect or the level of comparability of the surveys; all focus on the assessment of the cohort effect.
These studies show a negative cohort effect for Canada as well as for other countries such as the United States and Norway. However, positive cohort effects are measured in other OECD countries such as Italy and the Netherlands. A negative cohort effect has very serious implications, yet few interpretations of it are discussed in the literature. The only explanation given for the negative cohort effect observed in Canada (and elsewhere) is that the school system has become less effective at developing individuals' literacy skills. Green and Riddell (2013, p. 26) write: "[…] these results may suggest that schools are doing a poorer job of imparting literacy at any given [education] level […]". This brief explanation, put forward to account for such a consequential and far-reaching problem, raised some doubts and further questions in our minds. How can today's school system, a major component of government spending, be so dysfunctional as to produce less skilled graduates, all other things being equal, than those trained three or four decades ago? Are there no other reasons, related to the measuring instrument itself, that could provide additional explanations for this observation? This is particularly intriguing because these studies, which show a negative cohort effect in Canada, do not address the issue of survey comparability at all.
The main objective of this research is to advance knowledge about the effect that age and cohort have on adult literacy levels in Canada. First, the pseudo-longitudinal analyses found in the labour economics literature are replicated using data from the three surveys of adult skills in Canada. Specifically, Green and Riddell's (2013) method is replicated to observe the negative cohort effect that is thought to be confounded with the age effect in cross-sectional analyses. Second, an age-period-cohort (APC) analysis is conducted to better understand the effect of time on literacy skills.

Data and methods
This analysis uses data from the three surveys of adult skills in Canada: the 1994 IALS, the 2003 ALL and the 2012 PIAAC. Since each of the three surveys provides a representative sample of Canada's adult population, the literacy score of a given cohort is measured at three different points in time. It should be noted, however, that the 1994 survey sample is much smaller than those of the 2003 and 2012 surveys: approximately 3000 respondents in 1994 compared to about 16,000 in 2003 and 22,000 in 2012. The precision of the 1994 data is inevitably much lower and, consequently, some analyses focus only on the 2003 and 2012 data.
All three surveys use a cross-sectional multi-stage sample design. The sampling unit is the household and the sampling frame is the Canadian Census. Literacy skills are measured by psychometric tests, which place the literacy score of respondents on a scale from 0 to 500. This measure not only enables the identification of illiterate persons (those with scores lower than 175 points on the scale), but also distributes individuals along a continuum that denotes how well people use information to function in society and the economy (Statistics Canada 2013). To help with the interpretation of scores, a consortium of international experts led by the OECD divided this scale into five (or six, depending on the survey cycle) literacy proficiency levels. At a score of 276-325 points, Level 3 is considered "as a minimum for persons to understand and use information contained in the increasingly difficult texts and tasks that characterize the emerging knowledge society and information economy" (Statistics Canada and OECD 2005).
Following Green and Riddell's (2013) approach, the dependent variable of our estimation models is the log of the literacy score, so the estimated coefficients can be read as approximate percentage changes in literacy. The control variables considered in the underlying estimation models are also in line with most literature on the determinants of literacy proficiency (Barrett and Riddell 2016; Desjardins 2019; Murray et al. 2016; Scandurra and Calero 2017; Vézina et al. 2019; Willms and Murray 2007). Apart from the age-period-cohort variables, the following control variables are hence included in the models: sex, province of residence, type of region (urban/rural), level of education, knowledge and use of official languages, mother's level of education, practice of literacy activities at home, use of writing skills at work (and labour market participation status), and immigration status.
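The percentage reading of coefficients follows directly from the log specification of the dependent variable. As a worked illustration, with a single dummy regressor and symbols ($S_i$, $x_i$, $\alpha$, $\beta$) introduced here for exposition only and not taken from the original models:

```latex
\ln S_i = \alpha + \beta x_i + \varepsilon_i
\quad\Longrightarrow\quad
\frac{S(x = 1)}{S(x = 0)} = e^{\beta},
\qquad
\%\Delta S = 100\,\bigl(e^{\beta} - 1\bigr) \approx 100\,\beta
\quad \text{for small } |\beta| .
```

For a hypothetical coefficient of $\beta = -0.05$, the exact change is $100\,(e^{-0.05} - 1) \approx -4.88\%$, close to the approximate reading of $-5\%$.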
By combining the data from the three surveys into a single database, synthetic cohorts are generated according to the same methodology developed by Green and Riddell (2013). This methodology assumes that, for example, individuals aged 29 to 37 who were surveyed in 1994 are representative of those aged 38 to 46 in 2003 and those aged 47 to 55 in 2012. These synthetic cohorts allow for so-called pseudo-longitudinal analyses. The "cohort" variable contains eight categories as shown in Table 1.
In each survey, respondents are classified into cohorts based on their year of birth. The time range of categories for the cohort variable is 9 years, which allows the age group of individuals to be kept constant from one survey to the next, since all three surveys were conducted at 9-year intervals.
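The cohort construction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, only the three core cohorts named in the text are encoded (the full eight-category variable of Table 1 is not reproduced), and age is approximated as survey year minus birth year.

```python
from typing import Optional, Tuple

SURVEY_YEARS = (1994, 2003, 2012)
# The three core 9-year birth cohorts named in the text.
CORE_COHORTS = ((1948, 1956), (1957, 1965), (1966, 1974))


def assign_cohort(birth_year: int) -> Optional[str]:
    """Return the cohort label for a birth year, or None outside the core cohorts."""
    for start, end in CORE_COHORTS:
        if start <= birth_year <= end:
            return f"{start}-{end}"
    return None


def age_range(cohort: Tuple[int, int], survey_year: int) -> Tuple[int, int]:
    """Youngest and oldest age of a cohort in a survey year (assumes age = survey year - birth year)."""
    start, end = cohort
    return survey_year - end, survey_year - start


# Because cohorts span 9 years and surveys are 9 years apart, a cohort moves
# exactly one age band forward at each survey.
for year in SURVEY_YEARS:
    print(year, age_range((1957, 1965), year))
```

For the 1957-1965 cohort, the loop reproduces the age bands quoted in the text: 29 to 37 in 1994, 38 to 46 in 2003 and 47 to 55 in 2012.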
When using synthetic cohorts in pseudo-longitudinal analyses, one must assume that the composition of the different groups remains stable from one survey to the next (Vallin and Caselli 2001). To this end, Green and Riddell (2013) exclude individuals under 25 and over 65 years of age from their analyses to eliminate the effect of education and retirement on the composition of synthetic cohorts. We use the same strategy by focusing our analyses on cohorts of individuals born between 1948 and 1974, which are targeted in each of the three surveys. The individuals in these cohorts are at least 20 years old in 1994 and at most 64 years old in 2012. Of course, the effect of differential mortality between individuals by literacy level cannot be neutralized over the 1994-2012 period. This is not an overly problematic issue since the mortality rate of the working-age adult population is very low. Nonetheless, careful treatment of immigrants is necessary to minimize the potential bias of newcomers' arrival between two surveys. Immigrants admitted to Canada after the 1994 survey necessarily have different characteristics from those surveyed in 1994. Therefore, immigrants who arrived during the study period (between the time of the first and last survey) are removed from the analyses to maximize the comparability of the cohorts studied. Moreover, we sometimes run specific analyses on natives to eliminate the effect that immigrants could have on the relationship between literacy level, age and cohort. Indeed, the integration process implies an additional layer of complexity as it influences the relationship between age, cohort and period. Finally, residents of the three northern territories, as well as non-permanent residents, are also excluded from analyses since these two sub-groups were not targeted in the 1994 survey.
Note: a more detailed description of the Level 3 proficiency level is given in a 2013 source (p. 16): "Texts at this level are often dense or lengthy, and include continuous, non-continuous, mixed, or multiple pages of text. Understanding text and rhetorical structures become more central to successfully completing tasks, especially navigating of complex digital texts. Tasks require the respondent to identify, interpret, or evaluate one or more pieces of information, and often require varying levels of inference. Many tasks require the respondent to construct meaning across larger chunks of text or perform multi-step operations in order to identify and formulate responses. Often tasks also demand that the respondent disregard irrelevant or inappropriate content to answer accurately. Competing information is often present, but it is not more prominent than the correct information."
It should be noted that this pseudo-longitudinal analysis is made possible by a recent statistical re-estimation and rescaling of the 1994 and 2003 data by Statistics Canada. The 1994 and 2003 surveys measure respondents' literacy through two distinct dimensions: prose literacy and document literacy. Plausible values were also recalculated and merged to best match the instrument and scale used in 2012. This data harmonisation work was carried out in 2014 specifically to enable this type of trend analysis. In all tables and estimations, a statistical treatment specific to this type of data is applied, including the jackknife sampling weights, in order to produce robust and unbiased estimates of the skill level of individuals (Wu 2005). To undertake our analysis, we use the piaacdes, piaactab and piaacreg commands, which were developed for the STATA® software by OECD experts (Pokropek and Jakubowski 2014) and by Statistics Canada. These commands take into account not only the different sets of plausible values but also the jackknife sampling weights that are provided with the data.
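To make the role of plausible values and jackknife weights more concrete, the sketch below shows the generic logic such commands rely on: a replicate-based sampling variance, and Rubin's rules for combining results across plausible values. It is an illustration under stated assumptions, not the internals of piaacdes, piaactab or piaacreg; the function names are hypothetical, and the jackknife `factor` is left as a parameter (defaulting to 1.0) because it depends on the replication scheme documented with each survey file.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def jackknife_variance(full_estimate: float,
                       replicate_estimates: Sequence[float],
                       factor: float = 1.0) -> float:
    """Sampling variance from replicate estimates: factor * sum((theta_r - theta)^2).

    `factor` is an assumption here; it depends on the survey's replication scheme.
    """
    return factor * sum((r - full_estimate) ** 2 for r in replicate_estimates)


def combine_plausible_values(pv_estimates: Sequence[float],
                             pv_sampling_variances: Sequence[float]) -> Tuple[float, float]:
    """Combine per-plausible-value results with Rubin's rules.

    Returns (point estimate, standard error).
    """
    m = len(pv_estimates)
    point = mean(pv_estimates)
    within = mean(pv_sampling_variances)   # average sampling variance
    between = variance(pv_estimates)       # variance across plausible values (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return point, total ** 0.5


# Hypothetical numbers: three plausible-value estimates of a mean score,
# each with a sampling variance of 1.0.
estimate, std_error = combine_plausible_values([270.0, 272.0, 274.0], [1.0, 1.0, 1.0])
print(estimate, std_error)
```

The total variance therefore has two sources: sampling error (captured by the replicate weights) and measurement error in the latent skill (captured by the spread across plausible values). Ignoring either one would understate the standard errors.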
Despite this statistical harmonization, the questionnaires of the different surveys evolved considerably between 1994, 2003 and 2012, which may affect the comparability of the data. As mentioned above, the 1994 IALS and the 2003 ALL reported literacy as two separate domains on two separate scales, covering prose literacy and document literacy. The 2012 PIAAC does not distinguish those dimensions and reports literacy as a single domain that covers the reading of not only prose and document texts, but also digital texts (such as websites, results pages from search engines and blog posts) and mixed-format texts (i.e. texts containing both continuous and non-continuous elements). In a nutshell, the PIAAC survey conceives literacy more broadly than the previous surveys; respondents' literacy skills were therefore assessed on different bases and it is not possible to re-estimate or rescale the 2012 literacy scores to allow more direct comparisons with the specific literacy skills assessed in 2003 and 1994. Furthermore, the PIAAC survey "was mainly designed as a computer-based assessment" (Paccagnella 2016b), contrary to the IALS and ALL surveys, where respondents filled out a paper questionnaire (paper-based assessment). Undoubtedly, this difference in the delivery mode can also negatively affect the comparability of results across surveys.
The sampling method also varies among surveys. For example, in 2003, the survey was designed to provide reliable estimates for a variety of special target populations such as recent and established immigrants, Francophones in New Brunswick, Manitoba and Ontario, Anglophones in Quebec, Urban Aboriginals in Manitoba and Saskatchewan, Youth in Quebec and British Columbia and Aboriginal residents in the three northern territories.
The target population of the surveys also varied slightly between 1994 and 2012. All three surveys exclude residents living in institutions, residents of Indian (Aboriginal) reserves, residents of some sparsely populated areas and members of the Armed Forces. In 2012, the target population is limited to people aged 16 to 65 years old, while the 1994 and 2003 surveys targeted the population aged 16 and over. However, in the 1994 survey, residents of the territories as well as non-permanent residents were all excluded.
This paper goes beyond the pseudo-longitudinal analyses already published in the literature. Indeed, it presents the results of regression analyses that test not only the age and cohort effects but the "period" effect as well. To do this, the relationship between literacy level and age is estimated for each cohort and separately for each survey. Immigrants are excluded from these analyses and only three cohorts of native-born individuals are considered in order to maximize the homogeneity of the sample studied. Age is entered as a continuous variable and a quadratic term is added to estimate the relationship, similar to the methodology developed by Willms and Murray (2007). Using the STATA® "margins" command, the literacy score predicted by the age-specific literacy regression model is plotted to show trends.
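The idea of fitting a quadratic age profile within one cohort-by-survey stratum and then reading off predicted scores can be sketched as below. This is a deliberately simplified illustration, not the models estimated in this paper: it is unweighted, has no control variables, uses the raw score rather than its logarithm, and does not handle plausible values. The function names and the centering age of 45 are assumptions made for the example.

```python
from typing import Sequence, Tuple

CENTER_AGE = 45.0  # assumption: ages are centered to keep the normal equations well conditioned


def fit_quadratic(ages: Sequence[float], scores: Sequence[float]) -> Tuple[float, float, float]:
    """Unweighted least-squares fit of score = b0 + b1*(age - 45) + b2*(age - 45)^2.

    Solves the 3x3 normal equations by Gaussian elimination with partial pivoting.
    """
    rows = [[1.0, a - CENTER_AGE, (a - CENTER_AGE) ** 2] for a in ages]
    # Augmented normal-equation matrix [X'X | X'y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           + [sum(r[i] * y for r, y in zip(rows, scores))]
           for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(col + 1, 3):
            ratio = aug[k][col] / aug[col][col]
            aug[k] = [x - ratio * p for x, p in zip(aug[k], aug[col])]
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back substitution
        known = sum(aug[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (aug[i][3] - known) / aug[i][i]
    return coef[0], coef[1], coef[2]


def predict(coef: Tuple[float, float, float], age: float) -> float:
    """Predicted score at a given age from the fitted quadratic."""
    b0, b1, b2 = coef
    x = age - CENTER_AGE
    return b0 + b1 * x + b2 * x * x


# Hypothetical stratum: one cohort observed in one survey year.
ages = [38, 40, 42, 44, 46]
scores = [283.0, 281.5, 280.9, 280.2, 279.6]
coef = fit_quadratic(ages, scores)
print([round(predict(coef, a), 1) for a in (38, 42, 46)])
```

Repeating the fit for each cohort and each survey year yields one short curve per stratum. Under this setup, a pure cohort effect would show up as curves that join smoothly across surveys within a cohort, whereas breaks at the survey boundaries point to a period or instrument effect.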
A cohort effect would then be observed if the trends for a given cohort are relatively uninterrupted from one survey to another and the curves of the different cohorts evolve in parallel, but at distinct levels (see Fig. 1 for an illustration of this case).

Descriptive analyses
Table 2 presents some descriptive statistics on the population under study. First, we see that between 1994, 2003, and 2012, the sex composition of the population does not change substantially. Indeed, there are almost as many men as women in each survey. In contrast, the age structure is much older in 2012 than in 1994; the proportion of people aged between 55 and 64 increases from 14.5% in 1994, to 18.5% in 2003, to 24.1% in 2012. Conversely, the proportion of people aged 25 to 34 and 35 to 44 decreases from 31 to 24% between 1994 and 2012.
The information contained in Table 2 not only reflects the aging of the population over the past decades, but also shows general trends in the regional distribution of the population in Canada. The relative share of Ontario and Alberta is increasing mainly at the expense of Quebec and the Atlantic provinces. A small increase in British Columbia's relative share is observed, while the opposite is true for Manitoba and Saskatchewan.
There was a clear increase in the educational level of Canadians aged 25 to 64 between 1994 and 2012. The proportion of individuals with low levels of education (less than a high school diploma) decreases from 31.0 to 11.2%, while the proportion of university graduates increases from 18.1% in 1994 to 29.6% in 2012. The proportion of people with a high school diploma or post-secondary-non-university education also increases by almost ten percentage points over the same period.
The data in Table 2 are also in line with what is known about recent demo-linguistic trends, namely that the proportion of Canadians whose mother tongue is either French or English is decreasing in favor of Canadians whose mother tongue is neither French nor English. Between 1994 and 2012, the relative share of individuals with neither English nor French as either their mother tongue or their home language rises from 8.8 to 14.0%. All these trends can be partly explained by the fact that the relative share of immigrants in the Canadian population is increasing. In fact, Table 2 shows that the proportion of foreign-born Canadians increased from 21.3% in 1994 to 24.4% in 2012 among the total population aged 25 to 64.
Table 3 contains a descriptive analysis of the cohorts, the objective being to validate the use of the synthetic cohort methodology. If, for example, the sex ratio of a given cohort were to vary considerably from one survey to another, the comparability of the surveys and, consequently, the results of the pseudo-longitudinal analysis through the synthetic cohorts would have to be questioned. For the sake of brevity, only descriptive statistics relating to three cohorts (those born between 1948 and 1974) are described, since these cohorts form the core of the sample used for pseudo-longitudinal analyses. It should be noted that immigrants who arrived between 1994 and 2003 are removed from the sample in order to maximize the comparability of cohorts through time.
Table 3 shows that the distribution of the three cohorts by sex, province of residence and language is relatively stable from one survey to the next. The greatest variations are observed in the distributions by level of education. Percentages suggest that cohorts' education levels increased between 1994 and 2012. Indeed, for a given cohort, the proportion of individuals with a low level of education (less than a high school diploma) decreases in favor of university graduates, leaving the proportion of people in the intermediate category relatively stable. It should also be noted that the proportion of individuals with a low level of education is significantly lower for more recent cohorts. In summary, the data show not only that individuals' educational attainment increased between 1994 and 2012, but also that some individuals' educational paths did not stop in their late twenties and continued over a large part of the life cycle.
More likely, however, the increase in cohorts' educational attainment between 1994 and 2012 is overestimated. Unexpected variations are also observed from one survey to another, which calls into question the comparability of survey data. For example, the proportion of university graduates among those born between 1948 and 1956 decreases between 1994 and 2003, while it systematically increases among the two younger cohorts (1957-1965 and 1966-1974). Such problematic variations in cohorts' composition should prevent the use of the synthetic cohort approach. In our view, despite being quite simple, Table 3 is very informative about the comparability of cohorts from one survey to another. Nonetheless, no similar descriptive information is presented or discussed in the pseudo-longitudinal studies whose analyses this paper replicates. Finally, it should be noted that the percentages calculated from the 1994 survey data must be analysed with caution given the small sample size of this survey. As a result, there are indeed larger differences between the first two surveys (1994 and 2003) than between the 2003 and 2012 surveys, where the variation is generally plus or minus 1%.
Table 4 shows the results of a very simple regression model that includes only the "cohort" variable, with no control variables other than age and sex. This regression model is called the "gross model" as it gives an approximation of the "gross" cohort effect. The first column of Table 4 contains the results obtained using data from the three surveys (1994, 2003 and 2012). The other three columns contain the coefficients obtained by taking only two points in time, in order to see the effect of this methodological choice on the results. The purpose of this exercise is to determine whether taking into account the 1994 survey data, which are drawn from a very small sample, biases the coefficients obtained. Thus, the second column uses only the 2003 and 2012 data. The results in Table 4 do not show a significant cohort effect at the 95% threshold. However, the negative effect of age is very clear. The same observations are made regardless of whether all three surveys are taken into account or only two of the three. The level of significance of the coefficients is lowered when the 1994 data are included in the analyses, which is a direct consequence of the small sample size of this survey. Indeed, the age group coefficients are more significant in the second column, which is the only column that does not include the 1994 data.
Table 5 reports the coefficients obtained for the same dimensions as those in Table 4. This time, however, several control variables are included in the analysis: province of residence, type of region (urban/rural), level of education, knowledge and use of official languages, mother's level of education, practice of literacy activities at home, use of writing skills at work (and labour market participation status), and immigration status. Table 5 therefore replicates the analyses published in the labour economics literature mentioned in the introduction. Like those studies, we find a significant negative cohort effect. All other things being equal, the average score of younger cohorts is significantly lower than the score of older cohorts. The difference between the cohort with the greatest positive effect (1948-1956) and the one with the greatest negative effect (1993+) is between 8 and 10% depending on the surveys considered. Along with the negative cohort effect, we find a strongly negative and significant correlation between skill level and age. Again, the results are very similar whether or not the 1994 survey data are included in the analyses, but the level of significance of the coefficients is higher when the 1994 data are excluded.

Age-period-cohort (APC) analysis
As already mentioned in the introduction, the literature provides very few explanations as to why this negative cohort effect on adults' literacy skills is found. We took our analyses further to better understand the mechanics at play and to verify the validity of both the effects being measured and the instruments used to measure them.
In addition to age and cohort effects, an attempt was made to measure the possible "period" effects, which should be understood rather as survey effects. The variable "Survey year" was therefore created to measure the importance of this effect by classifying respondents according to the survey in which they participated (3 categories). In fact, by inserting the "Survey year" variable in the regression model, two effects are measured: the period effect itself, but also the potential effect of methodological changes between the three surveys on the estimated coefficients. However, it is not possible to distinguish between these two effects.
Like the analyses presented in the first column of Tables 4 and 5, the APC model uses data from the three surveys (1994, 2003 and 2012). The first column of Table 6 is identical to the first column of Table 5, which contains the coefficients generated by the model that considers the dimensions of age, cohort and many other control variables. However, the coefficients of the control variables are not displayed in Table 6 for the sake of brevity; indeed, the analytical interest lies in the coefficients of the age, period, and cohort variables. In the tables, significance levels are denoted ***p ≤ 0.001; **p ≤ 0.01; *p ≤ 0.05; † p ≤ 0.1, with robust standard errors in brackets. The subsequent columns of Table 6 contain the coefficients generated by similar models that alternate age, cohort and period ("Survey year") variables. It is not possible to generate a model that combines all three dimensions simultaneously since the relationship between age, period and cohort is circular; changes in the demographic process observed through one of these three dimensions cannot be statistically distinguished from those observed through the other two (Wilmoth 2001, p. 380).
Table 6 shows a very strong and significant effect of the "Survey year" variable, even though all control variables are included in the regression models. In other words, with equal characteristics in terms of age or cohort, individuals surveyed in 2003 have a higher level of literacy than those surveyed in 2012. The difference is even greater (and still significant) for respondents to the 1994 survey. We see that the period-cohort (PC) model shows a positive cohort effect, since the birth cohort variable captures the negative effect of age. The age effect is generally larger than the cohort effect. Finally, the fact that the variable "Survey year" is significant in both the age-period (AP) and period-cohort (PC) models demonstrates the potential lack of comparability from one survey to another.
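The circular relationship between age, period and cohort noted above can be illustrated with a small numerical sketch. It uses the linear version of the problem (the models in this paper use categorical variables, but the same dependency applies), and the coefficient values and function names are hypothetical. Because birth cohort equals survey year minus age, two different sets of coefficients can produce exactly the same predictions, so the data cannot tell them apart.

```python
from typing import Tuple


def linear_predictor(age: float, period: float,
                     b_age: float, b_period: float, b_cohort: float) -> float:
    """Linear predictor with age, period and cohort, where cohort = period - age."""
    cohort = period - age
    return b_age * age + b_period * period + b_cohort * cohort


def shifted(b_age: float, b_period: float, b_cohort: float, c: float) -> Tuple[float, float, float]:
    """A different coefficient set that yields the same predictions for any age and period."""
    return b_age + c, b_period - c, b_cohort + c


# Hypothetical coefficients on the log-score scale.
base = (-0.004, -0.002, -0.001)
alternative = shifted(*base, c=0.003)

observations = [(30, 1994), (39, 2003), (48, 2012), (60, 2003)]
max_gap = max(abs(linear_predictor(a, p, *base) - linear_predictor(a, p, *alternative))
              for a, p in observations)
print(base, alternative, max_gap)
```

In this example, the cohort coefficient changes sign between the two sets (from -0.001 to 0.002) while the fit is unchanged up to rounding error. This is why only two of the three dimensions are entered at a time, and why the sign of an estimated cohort effect depends on which dimension is left out.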
Period effects are associated with significant societal or technological changes that affect all cohorts (or all age groups) at a more or less specific time. For example, unemployment rates in a given year are strongly influenced by the current economic situation, as they are higher in a recession than in a period of economic expansion. Similarly, secularization and societal changes, combined with increased accessibility to effective contraceptive methods in the 1960s, explain (as a period effect) the drastic changes in fertility rates at the same time for all age groups in Canada (and elsewhere in the West). In the present case, it is highly unlikely that a major specific event or major change (social or policy) occurred that could explain such changes in literacy levels across cohorts and ages over 1994-2003 and 2003-2012. Rather, this significant difference suggests that the literacy score of an individual with the same characteristics differs significantly from one survey to another, mainly because it is estimated differently. In short, it is likely that this negative "period effect" results from methodological changes between the three surveys rather than from a genuine period effect as defined above.
An additional test was performed to ensure the validity of our findings. In order to limit potential biases related to the sample selected for analysis, the same regressions were conducted for the native-born only (excluding all immigrants from the analysis), belonging to the three birth cohorts observed in the three surveys (Table 7).
Results show that the variable "Survey Year" is significant, suggesting that, with equal characteristics, individuals surveyed in 2012 were assigned a lower literacy score than those surveyed in 2003 and 1994. Moreover, as in Table 6, Table 7 shows that by alternating the age, period and cohort dimensions, only the cohort effect is reversed. Indeed, the age effect remains negative in the age-cohort and age-period models and the period effect remains positive in the age-period and period-cohort models. These results weaken the idea of a clear negative cohort effect; much of this effect would in fact be caused by a non-comparable methodology that systematically assigned individuals lower scores in the most recent survey cycles.
Vézina and Bélanger Large-scale Assess Educ (2020)

Trend analysis of the relationship between literacy level and time
To examine the age-period-cohort relationship with adult literacy scores in Canada through a different approach, and to illustrate it, we measure the relationship between literacy and age separately for each cohort and for each survey year. In other words, we stratify the regressions by survey year and cohort, and limit our analysis to the native-born subgroup. The continuous age variable is used and a quadratic age parameter is inserted in the model. Using the STATA® "margins" command, we measure the literacy score predicted by the age-specific literacy regression model. These values are illustrated in Fig. 2 to show the relationship between literacy score and age, observed at three different points in time, across the three cohorts. When comparing the age-literacy relationship separately for three cohorts of native-born individuals, we observe a significant gap (about 10 points on the literacy scale) between the surveys.

Table 7 Weighted linear regression of the natural logarithm of the literacy score, Canadian-born individuals born between 1948 and 1974, Canada, 1994 IALS, 2003 ALL and 2012 PIAAC data combined (full model with all control variables and alternating age, period and cohort variables)
***p ≤ 0.001; **p ≤ 0.01; *p ≤ 0.05; † p ≤ 0.1. Robust standard errors in brackets. The regressions also include many other variables whose coefficients are not shown in the table: sex, province of residence, type of region (urban/rural), level of education, knowledge and use of official languages, mother's level of education, practice of literacy activities at home, use of writing skills at work (and labour market participation status), and immigration status.

Figure 2 illustrates how the trend analysis is more difficult to make between 1994 and 2003, since the coefficients generated by the 1994 survey data are estimated from relatively small samples. Nevertheless, there is a clear gap between 2003 and 2012, which is problematic as it suggests data incomparability. Hence, this gap also calls into question the validity of analyses that apply the synthetic cohort method to cross-sectional surveys on adult literacy skills.
Figure 3 combines all elements contained in Fig. 2, except for those resulting from the 1994 survey data. The relationship between literacy and age measured in 2003 and 2012 is therefore displayed for the three targeted native-born cohorts: individuals born between 1948 and 1956, those born between 1957 and 1965, and those born between 1966 and 1974. The relevance of this combination is revealed in Fig. 4, which adds linear trend lines by cohort and by survey year.

Fig. 2 Age-literacy relationship for three native-born cohorts, Canada, 1994 IALS, 2003 ALL and 2012 PIAAC (axes: literacy score and age). The predicted values of the literacy score are generated by regression models stratified by cohort and survey year. These models also include the following control variables: sex, province of residence, type of region (urban/rural), level of education, knowledge and use of official languages, mother's level of education, practice of literacy activities at home, and use of writing skills at work (and labour market participation status).

Figure 4 shows that the age-literacy relationship is more strongly correlated by survey year than by cohort. Indeed, the fit is better by survey year than by cohort. In short, within a given cohort, trends are discontinuous from one survey to the next; conversely, the trend is much smoother within the same survey year, even when moving from one cohort to another. Figure 4 illustrates well why and how the effect of the variable "Survey year" is strong and always significant in the pseudo-longitudinal analyses, the results of which appear in Tables 6 and 7. This suggests that the negative cohort effect is due mainly to the fact that the assessment of respondents' literacy score is systematically lower in 2012 than in 2003. Using the synthetic cohort method in such a situation (where the measurement instrument is not fully comparable) leads to erroneous conclusions.
The age-literacy relationship estimated in our analyses does not correspond to what is shown in Fig. 1. Admittedly, the patterns of the cohort trend lines shown in Fig. 4 evolve in parallel at significantly different levels when the analysis is carried out without taking into account the survey year ("period" effect). In fact, with equal characteristics in terms of age, cohort, and many other dimensions (control variables), Fig. 4 clearly shows that individuals surveyed in 2003 have a higher level of literacy than those surveyed in 2012. On the other hand, the obvious discontinuity in cohort trends from one survey to another (shown in Fig. 3) further suggests that there is a significant problem with the comparability of the measurement instrument and, therefore, casts serious doubt on the validity of any pseudo-longitudinal analysis of adult skills survey data using the synthetic cohort method.
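The mechanism described above can be sketched with a small simulation. This is only a minimal illustration under stated assumptions, not the regression models estimated in this article: the baseline score, the age slope, the cohort midpoints and the 10-point instrument shift are all made-up values, and the function names are hypothetical. By construction there is no cohort effect at all, so any extra within-cohort decline comes solely from the 2012 instrument reading lower.

```python
# Hypothetical illustration only: every number below is made up and none
# comes from the IALS, ALL or PIAAC data.

# Midpoint birth year of each synthetic cohort (assumption).
COHORT_BIRTH_YEAR = {"1948-1956": 1952, "1957-1965": 1961, "1966-1974": 1970}


def latent_score(age, base=300.0, age_slope=-0.5):
    """Latent literacy: a linear decline with age and no cohort effect."""
    return base + age_slope * (age - 30)


def measured_score(birth_year, survey_year, instrument_shift):
    """Score recorded by a survey whose instrument may be offset (period effect)."""
    age = survey_year - birth_year
    return latent_score(age) + instrument_shift.get(survey_year, 0.0)


def within_cohort_change(instrument_shift, first=2003, last=2012):
    """Synthetic cohort method: follow each birth cohort from one survey to the next."""
    return {
        label: measured_score(birth_year, last, instrument_shift)
        - measured_score(birth_year, first, instrument_shift)
        for label, birth_year in COHORT_BIRTH_YEAR.items()
    }


# Same instrument in both years: the change reflects nine years of ageing only.
comparable = within_cohort_change({})
# The 2012 instrument reads 10 points lower: the change mixes ageing and period.
shifted = within_cohort_change({2012: -10.0})

for label in COHORT_BIRTH_YEAR:
    print(label, comparable[label], shifted[label])
```

With a comparable instrument, each cohort loses 4.5 points over nine years in this toy setting; with the shifted instrument, the apparent loss is 14.5 points. An analyst who does not model the survey year would read the additional 10 points as a skill loss within cohorts, although it is purely an artefact of measurement.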

Conclusions
To summarize, in this article we first replicated the results already published in the literature (Barrett and Riddell 2016; Flisi et al. 2019; Green and Riddell 2013; Murray et al. 2016; Paccagnella 2016a; Willms and Murray 2007). After aggregating data from the three most recent cross-sectional surveys on adult literacy skills to build synthetic cohorts, we measured a negative cohort effect after controlling for the effect of different socio-economic factors (control variables). However, in a second stage of analysis, intended to better understand and describe the mechanisms of skill maintenance over time, we found not only a negative effect of age on literacy scores but also a significant period effect that can mainly be explained by a change in the instrument used to measure respondents' literacy skills. The contribution of these findings is important because they highlight the limitations of analyses that apply the synthetic cohort method to cross-sectional adult literacy survey data, not only in Canada but also in all other OECD countries where these surveys have been conducted. They also call into question the validity of the cohort literacy declines, along with their socio-economic implications, identified by these analyses.
For example, by observing a net literacy skills loss among younger cohorts, experts conclude that "Canada's education system is failing to impart durable skills, or at least the attitudes, values and behaviours that would allow their graduates to retain the literacy skills they learned" (Willms and Murray 2007, p. 23). The economic loss to Canada associated with this decline in literacy is estimated at not less than $118 billion per year (Murray et al. 2016, p. 35). Quite surprisingly, the issue of the comparability of the skill measurement instrument is never raised in any of these studies, even though synthetic cohort analyses have been applied to other countries, such as the United States and Norway, with similar results (see, for example, Barrett and Riddell 2016). In the literature, we now note the emergence of other analyses using the same methodology (synthetic cohorts) and adult skills survey data to answer other research questions (see, for example, Chmielewski (2018) and Flisi et al. (2015)).
We acknowledge that this approach is attractive to researchers. In the absence of truly longitudinal survey data on adults' skills, scientists rightly rely on available data from cross-sectional surveys to improve our understanding of the dynamics and mechanisms for skill gains and losses over the life cycle. This understanding is indeed at the heart of very important issues with substantial socio-economic implications. Hence, considerable efforts are made to construct these synthetic cohorts from cross-sectional surveys to measure the age effect, the cohort effect, but also the effect of other life-cycle events (continuing education, graduation, unemployment episode, etc.) on an individual's skill maintenance and acquisition over time.
Basically, we can currently only theorize about the evolution of adult literacy skill level over the lifespan and across time. Hertzog et al. (2009) illustrate the extent of possible trajectories, based on neurobiological studies showing the effects of behaviour and environment on brain structures and cognition (Fig. 5).
The shaded area represents the possible limits of the trajectories and the curves show examples of trajectories. Although all trajectories start at the same level of cognition at age 20, the figure illustrates four different scenarios of changes in cognitive skill level. The red curve corresponds to an individual who would be immersed in optimal conditions for learning and using skills throughout his or her working life. Conversely, the yellow curve illustrates the theoretical evolution of an individual's cognitive skill level under adverse conditions. The blue curve illustrates an average trajectory and the green curve illustrates the idea that the evolution of an individual's cognitive skill level can go in all directions over time, resulting in rather heterogeneous trajectories, all other things being equal. More precisely, when describing the green curve, the authors of Fig. 5 suggest: "[…] that late-life improvement in cognitive functioning is possible if an individual engages in enrichment behaviors in midlife that are of a quality and degree not previously manifested at earlier ages, with those behaviors pushing the individual toward optimality" (Hertzog et al. 2009, p. 8). This hypothesis is directly related to Reder's (1994) Practice-Engagement theory, which states that an individual's skill level is influenced by learning and exercising through the practice of literacy activities in his or her daily activities, whether at home, at work or in everyday life. Hence, the negative age effect on skills could be cancelled out, or at least slowed down, by frequent and regular use of literacy skills and practice of literacy activities in everyday life for a significant time period. As such, these dimensions were rightly included in our analyses.
Only a longitudinal survey would really test this theory and assess the possible trajectories of changes in individuals' skill levels over the life cycle and across time. Such an investigation would represent a significant improvement in our understanding of the maintenance of skills. Efforts in this direction are currently underway in Germany, where the literacy skills of the same respondents are measured at two points in time, 2012 and 2015 (Rammstedt et al. 2017). To our knowledge, no studies about adult literacy skill trajectories or the measurement of a possible cohort effect based on the analysis of these new data have yet been published.
In Canada, there is one longitudinal survey of adults: the Longitudinal and International Study of Adults (LISA). This survey has collected information from respondents across Canada every 2 years since 2012 on a variety of topics (work, education, health and family). Although the first wave of this longitudinal survey partly corresponds to the information collected by the 2012 PIAAC Survey, including the literacy score of respondents, the second wave of the LISA did not measure respondents' literacy skills. According to information available about this survey on Statistics Canada's website, the third and subsequent waves are unlikely to have measured, and will not re-measure, respondents' literacy proficiency levels (Statistics Canada 2018). However, it would be highly desirable that this dimension be included in the LISA questionnaire, as is done in the German research programme mentioned above. This would potentially enrich our understanding of adults' literacy skill gain and loss over time and throughout the life cycle. Of course, longitudinal surveys and data have their own specific limitations, such as sample attrition. Nonetheless, a longitudinal survey on adult literacy skills in Canada would provide complementary information to the already-existing collection of cross-sectional surveys. By providing more comparable measurement instruments of adult literacy skills over time, these new data would help move toward filling the gaps outlined in this article, and hence would potentially bring greater focus and clarity to public policies concerning adult literacy skills.