Motivation to learn by age, education, and literacy skills among working‑age adults in the United States

Using nationally representative data, this study adopted the alignment optimization method (see the “Analytic approach” section for a detailed description and relevant associated research) to estimate motivation to learn across subpopulations defined by educational attainment, literacy skill levels, gender, and age. Motivation to learn is critical for promoting adult education and training for economic security in a dynamic era of job automation. However, little is known about the distribution of motivation to learn at the national level.

Abstract
This study highlighted how particular intersections of personal characteristics were related to motivation to learn (MtL) among adults. MtL is a prerequisite for adult education and training participation. However, little is known about MtL across subpopulations due to several methodological limitations. This study developed a national profile of MtL by key subpopulations defined by combinations of age, gender, education level, and literacy proficiency in the United States. Data were obtained from the 2012/2014/2017 Program for the International Assessment of Adult Competencies (PIAAC) restricted use file (N = 8400). The alignment optimization (AO) method was employed to estimate subpopulation means of a PIAAC-based latent MtL construct. Subpopulations with younger age, greater educational attainment, and higher literacy proficiency showed significantly greater MtL.

Yamashita et al. Large-scale Assessments in Education (2022) 10:1

[...] valence and expectancy often are critical components (Cook & Artino, 2016; Wigfield & Eccles, 2000). Valence is a perceived value that is placed on learning outcomes (Knowles et al., 1998). Valence can be further distinguished by intrinsic (e.g., interest, fun) and extrinsic (e.g., rewards such as career advancement and social status) sources (Cook & Artino, 2016). Expectancy is the perceived chance of success in learning activities (Vroom, 1964). In the context of training and development activities, when perceived needs for further education/training are in alignment with the expected benefits, learning activity is considered valuable and, therefore, motivational (Noe & Wilk, 1993). In other words, MtL is jointly determined by valence and expectancy, and this notion is depicted in the well-known expectancy-value theory (Vroom, 1964; Wigfield & Eccles, 2000).

Benefits of motivation to learn
In the context of adult education, MtL plays a critical role in facilitating participation in a variety (e.g., formal, non-formal, informal) of adult education and training (AET) activities or lifelong learning (Boeren, 2017; Lazowski & Hulleman, 2016; Yamashita et al., 2019). For example, activities such as attending a degree program, taking an online course, going to a public lecture, participating in job-related training, and reading books or using a computer to learn new things can all be considered AET. AET is associated with human, social, and identity capital (e.g., individual knowledge and skills; social support; self-confidence), which jointly benefit multiple life domains, such as employability and social/political participation (Schuller et al., 2004; Tikkanen, 2017). Additionally, MtL is linked to greater training satisfaction (Klein et al., 2006), training effectiveness (Bell & Ford, 2007), and persistence (i.e., consistent AET participation; Colquitt et al., 2000). Moreover, MtL promotes higher use of metacognition, which reflects greater depth of thinking and learning, as well as higher performance (e.g., academic grades; Klein et al., 2006). Furthermore, those who are motivated to learn are more likely to achieve training transfer, that is, to use acquired skills and knowledge in practice (Gegenfurtner et al., 2009). On a related note, above and beyond learning outcomes, increased AET participation leads to wider societal benefits such as more social, cultural, and political participation, as well as greater life satisfaction (Hammond, 2005). Although the individual capability for AET participation is often structured by the socioeconomic context (e.g., Bounded Agency Model; see Rubenson & Desjardins, 2009), one may argue that MtL not only initiates AET participation but also facilitates desirable AET outcomes.
However, AET participation over individuals' life course is relatively low in the United States and in need of improvement. Indeed, although anyone may benefit from AET, the AET participation rate in the last 12 months among those aged 16 to 65 years held consistently at approximately 50% between 2012 and 2017 (Desjardins, 2015; OECD, 2016a). Arguably, MtL is a necessary prerequisite for improving AET participation rates because increased MtL can lead to increases in AET participation and maximize the utility of AET (Boeren, 2017). Although barriers to AET participation, including situational barriers (e.g., money, time, caregiving responsibility), institutional barriers (e.g., AET location and schedule), and dispositional barriers (e.g., older age, previous negative experience in education; Cross, 1981; Liu et al., 2011), have been discussed extensively, the potential benefits of enhancing MtL have received relatively less attention. The lack of MtL is an additional example of a dispositional barrier.
There are three key demographic and socioeconomic characteristics relevant to the context of MtL: age, gender, and education (Hughes et al., 2005). Previous research shows that older age is associated with lower MtL (e.g., Boeren, 2017; Cummins et al., 2015), although findings related to the effects of age remain mixed in the literature (Gegenfurtner & Vauras, 2012). Also, compared to younger adults, older adults tend to focus more on intrinsic motivation, including the motivation related to the acquisition of new knowledge and the joy of learning (Liu et al., 2011). Older age also is related to underlying cognitive and sensory functional limitations (e.g., vision, hearing) and to other known barriers (e.g., mobility, transportation, time) that correspond to lower MtL (Roosmaa & Saar, 2017; Wlodkowski & Ginsberg, 2017).
Gender is another key determinant of MtL in the context of AET. Gendered career trajectories and cultures in organizations (e.g., differentiated learning opportunities and gender-related attitudes in the workplace) underscore the importance of examining gender effects in AET research (Hughes et al., 2005). In recent decades, women's postsecondary education participation has outpaced men's (Ryan & Bauman, 2016). Women also are more likely than men to see education as a pathway to upward social mobility (i.e., economic gain; Wlodkowski & Ginsberg, 2017). Overall, gender consistently has been included in education research (e.g., Gorges et al., 2017; Smith et al., 2015).
In addition to basic demographic characteristics such as age and gender, socioeconomic status (education in particular) is relevant to MtL and, in turn, AET participation. Higher educational attainment is associated with greater participation and performance in subsequent AET (e.g., postsecondary education participation; Robbins et al., 2004). Indeed, a positive experience in AET (e.g., successful degree program completion) and MtL can reciprocally enhance each other among adult learners (Chang & Lin, 2011). At the same time, lower qualification (e.g., lower educational attainment) as well as failure and negative experience in earlier life AET can suppress MtL and, correspondingly, AET participation (Roosmaa & Saar, 2017; Saar et al., 2013). Also, compared to their counterparts with higher educational attainment, adults with less education are more likely to act based on extrinsic motivation (e.g., economic advantage) than on intrinsic motivation (Rothes et al., 2014). Although the effects of age, gender, and educational attainment clearly are important, the importance of other sociodemographic factors such as race/ethnicity or employment status should not be overlooked (Hughes et al., 2005; Klein et al., 2006). Indeed, race/ethnicity and employment are interrelated with the aforementioned factors (age, gender, and education; Center for Community College Student Engagement, 2014). Building on the existing MtL literature that provides rich empirical evidence on age, gender, and education but relatively little on race and employment, this study focused on age, gender, and education as the established foundation. One existing gap in the MtL and AET research literature involves basic skills. Emerging research with large-scale assessment data has shown that basic skills such as literacy are associated with MtL (e.g., Patterson & Paulson, 2015).
Literacy skills (e.g., a foundation to understand more complex topics) can reflect a part of readiness to learn (Smith et al., 2015). Also, literacy skills can partially explain differences in readiness to learn (e.g., for postsecondary education), motivation, and AET participation across demographic and socioeconomic characteristics (Center for Community College Student Engagement, 2014; OECD, 2016a). Basic skills may be especially important among older adults because of existing gaps between formal education and current skills. In fact, along with MtL, literacy skills are predictive of AET participation among adult learners (Yamashita et al., 2019).

Measurements of motivation to learn
In adult education settings, two of the most commonly used MtL assessment tools are those developed by Noe and Schmitt (1986) and Duncan and McKeachie (2005). Noe and Schmitt's 16-item MtL School Administrative Descriptive Survey (SADS) consists of three domains: intensity, persistency, and direction. The original SADS as well as the modified (for work settings) scale have been used extensively in previous research (e.g., Colquitt & Simmering, 1998; Facteau et al., 1995). Indeed, over 60% of 38 studies reviewed in a meta-analysis used the original or modified versions of Noe and Schmitt's MtL scale. The validity evidence supporting Noe and Schmitt's MtL scales is somewhat limited, although high reliability (Cronbach's alpha = 0.81) has been reported based on a large sample with over 1000 survey respondents (Noe & Wilk, 1993). The second tool is the Motivated Strategies for Learning Questionnaire (MSLQ; Duncan & McKeachie, 2005). More recently, the PIAAC 4-item MtL scale has been shown to be conceptually in alignment with the critical propositions (e.g., intrinsic nature) of the expectancy-value theory (Gorges et al., 2016; Wigfield & Eccles, 2000). Also, scores obtained from the 4-item PIAAC MtL scale showed evidence of sound construct validity (i.e., predictive of AET participation) and comparability (i.e., measurement invariance) across 21 economically developed nations (Gorges et al., 2016). Recent studies have adopted the 4-item PIAAC MtL scale or a modified version and reported adequate psychometric properties with middle-aged and older adults (Liu et al., 2019; Yamashita et al., 2019). Gorges et al. (2017) further examined the psychometric properties of the 4-item MtL scale derived from the PIAAC data across gender, age, education level, and immigration background, and reported that scores from the scale met measurement invariance criteria (e.g., partial scalar/strong invariance) across most of the 21 countries. Although the SADS and MSLQ have been used extensively, and psychometrically sound properties for Gorges et al.'s PIAAC-based MtL scale have been identified, a variety of other assessment tools are regularly used in research, yet validity evidence often is under-reported (Gegenfurtner & Vauras, 2012).

Gaps in the literature
One of the critical missing elements in the literature involves MtL assessment among diverse populations. Additionally, to the best of our knowledge, except for Gorges et al.'s (2016) 4-item PIAAC-based MtL scale, validity evidence has not been presented for existing MtL assessment tools, nor have those tools been used with data from nationally representative adult populations. In addition, whereas Gorges et al. (2017) showed assessment comparability across countries and by sub-populations (i.e., young vs. middle-age vs. older age; females vs. males; high vs. intermediate vs. low level of education; and immigrants vs. native born), there is a need for a more in-depth investigation of the comparability of scale scores across more detailed sub-populations. Such investigation would lead to increased understanding of MtL in demographically and socioeconomically diverse nations like the United States. Despite several decades of research, relatively little is known about the distribution of MtL at sub-population levels.
There are two major reasons why MtL at the adult population level has been overlooked. First, to the best of our knowledge, nearly all studies focusing on MtL have used nonrepresentative or convenience samples (Gegenfurtner & Vauras, 2012). Some exceptions include studies using the PIAAC data (e.g., Gorges et al., 2016, 2017; Yamashita et al., 2019). Gorges et al. (2017) conducted rigorous measurement invariance analysis and showed the comparability of the PIAAC-based MtL scale scores across gender (female and male); age (early working age 16-29, mid-life working age 30-49, later working age 50-65); education level (low, intermediate, and high level); and immigration background (based on native language) among the U.S. population. However, these findings are based on separate analyses that employ one grouping variable at a time. As such, the comparability between more detailed sub-populations involving intersections of characteristics (e.g., older, working-age women with low educational attainment compared to men with the same background) still is uncertain. By the same token, although Gorges et al. (2017) demonstrated that their PIAAC-based MtL scale can be used to estimate the mean MtL across sub-groups in many countries including the United States, the mean MtL across more detailed sub-populations has yet to be evaluated.
Second, conventionally used measurement invariance tests often are restrictive. Multigroup confirmatory factor analysis (MGCFA) and item response theory (IRT) arguably are the most common methodological approaches in this context (Munck et al., 2018). The conventional invariance tests evaluate configural invariance (i.e., consistency in model specification across groups), metric/weak invariance (i.e., consistency in the factor loadings across groups), and scalar/strong invariance (i.e., consistency in both the factor loadings and intercepts across groups; Kline, 2016; Wang & Wang, 2012). However, although it is arguably the most common approach, MGCFA generally is used for comparisons involving a small number of groups (e.g., 2-3 groups) and is not suitable for comparisons involving larger numbers of groups (Kim et al., 2017). Further, the conventional approach to measurement invariance, which is considered an exact approach, often requires data-driven step-wise adjustments to the model (e.g., based on modification indices) and typically results in a model that differs from the initially specified model when many groups are compared (Lomazzi, 2018). For example, if 16 groups are involved in an analysis, 120 pairwise group comparisons (16 × 15 / 2) would be required for assessment of metric and scalar invariance. As such, even just for the metric invariance of a simple model with three freely estimated factor loadings, the number of pairwise parameter comparisons could be up to 360 (120 × 3). With a larger number of pairwise comparisons, there is a corresponding need to modify individual models, and the increased potential for statistical error (e.g., Type I error) should be of concern when conducting mean comparisons across the many groups (Asparouhov & Muthén, 2014).
As such, obtaining evidence for strong invariance, which is a requisite condition for latent mean comparisons, is difficult, and strict standards for this evidence often are relaxed, allowing for partial measurement invariance (i.e., allowing for non-invariance in a small number of factor loadings/intercepts; Byrne & Vijver, 2017). In short, due to limited research with nationally representative data and challenges with the conventional measurement invariance tests, estimation and comparison of mean MtL levels for detailed sub-populations have not been reported.
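The comparison counts cited above follow from simple combinatorics. The short sketch below reproduces them; the function name and its arguments are illustrative and do not come from the study.

```python
from math import comb


def pairwise_comparisons(n_groups: int, n_free_params: int = 1) -> int:
    """Number of group pairs, multiplied by the parameters compared per pair."""
    return comb(n_groups, 2) * n_free_params


if __name__ == "__main__":
    print(pairwise_comparisons(16))     # 120 group pairs among 16 sub-groups
    print(pairwise_comparisons(16, 3))  # 360 with three freely estimated loadings
```

The count grows quadratically with the number of groups, which is why pairwise exact-invariance testing becomes unwieldy as sub-populations are defined more finely.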
The objective of this study is to develop a national profile of MtL scores by sub-populations in the United States. Building on the rigorous psychometric work (Gorges et al., 2016, 2017), this study employed the alignment optimization method (Asparouhov & Muthén, 2014; Muthén & Asparouhov, 2018) and the nationally representative PIAAC data to estimate and compare the mean MtL scores across key sub-populations.

Methods
Data were obtained from the 2012/2014/2017 U.S. PIAAC restricted use file (RUF; Krenzke et al., 2019). PIAAC is an ongoing international study (commencing in 2012) that collects data on basic skills (e.g., literacy and numeracy) from adult populations in 38 countries. The U.S. PIAAC data include participants aged 16 to 74 years. PIAAC adopted a repeated, cross-sectional, complex sampling design, as well as a rigorous skill assessment, to provide nationally/internationally representative skill proficiency data (OECD, 2016b). PIAAC systematically assessed basic skills such as literacy and numeracy and generated 10 sets of plausible values for use in statistical analyses. The U.S. RUF provides the 2012, 2014, and 2017 wave data together with adjusted survey weights, and data from the combined waves can be used as a single, large, cross-sectional data set (for the technical details of each individual wave, see Krenzke et al., 2019). The PIAAC RUF data security protocol and data use license (# Masked for Blind Review) were approved by the Institute of Education Sciences (IES), U.S. Department of Education. This study focused on the typical working-age adult population consisting of individuals between 25 and 65 years of age. Among those who were administered the basic skills assessment, 8050 to 8400 had valid responses to the necessary measures of interest (i.e., MtL items) and grouping variables for the analyses of this study. The sample sizes for the sub-groups are reported in Figs. 1 and 2. The percentage of missing values was less than 1%. In this study, the sample size and percentage of missing values have been rounded per the IES data security guidelines.

Measures
Outcome variable: The 4-item PIAAC MtL scale (Gorges et al., 2016) was used as the outcome variable. The scale consists of four survey items: "I like learning new things," "I like to get to the bottom of difficult things," "I like to figure out how different ideas fit together," and "If I don't understand something, I look for additional information to make it clearer." These items have 5-point Likert-type response categories: 1 = "Not at all," 2 = "Very little," 3 = "To some extent," 4 = "To a high extent," and 5 = "To a very high extent."
Figure note: n = unweighted count; the sample size was rounded to the nearest 50 in accordance with IES data security guidelines; detail may not sum to totals because of rounding. College+ = college or higher education; No college = less than college education.
Sub-population (sub-group) indicators: Gender was a dichotomous measure indicating women and men (where men served as the reference group). Age was recorded as age groups with 10-year increments (1 = 25-34 years old; 2 = 35-44 years old; 3 = 45-54 years old; 4 = 55-65 years old). Although information on age in single-year increments was available in the U.S. PIAAC RUF, these 10-year age groups were used considering methodological concerns (e.g., sample size) and practical implications (e.g., average MtL for a specific age has little utility). Education level was a dichotomous measure indicating respondents with college or higher education (i.e., postsecondary degrees including the associate degree) vs. less than college education (the reference group). Literacy level was a dichotomous variable indicating high or medium proficiency vs. low proficiency (reference group). PIAAC provides guidelines for interpreting literacy proficiency level (from Below Level 1 to Level 5) based on the estimated proficiency scores (0-500 points; OECD, 2016b).
Whereas different classifications were possible, a two-level classification [high or medium (levels 3-5 or score 276-500) vs. low (Below level 1-level 2 or score 0-275)] was employed, considering precedent provided by the U.S. National Center for Education Statistics (2019) and methodological concerns (e.g., cell sample sizes). In this study, the provided plausible values for literacy proficiency scores were used to classify the respondents into proficiency levels for those analyses that involved literacy proficiency levels.
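The grouping rules described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and labels are hypothetical, and treating any score of 276 or more as medium/high is an assumption about how fractional scores between 275 and 276 are handled.

```python
from itertools import product

# Assumption: scores >= 276 fall in Levels 3-5; lower scores are "low".
MEDIUM_HIGH_MIN = 276

# Age-group codes and bounds as described in the Measures section.
AGE_BANDS = {1: (25, 34), 2: (35, 44), 3: (45, 54), 4: (55, 65)}


def literacy_level(score: float) -> str:
    """Two-level literacy classification from one plausible value."""
    if not 0 <= score <= 500:
        raise ValueError("PIAAC proficiency scores range from 0 to 500")
    return "medium_high" if score >= MEDIUM_HIGH_MIN else "low"


def age_group(age: int) -> int:
    """Map an age in years to the 10-year age-group code."""
    for code, (low, high) in AGE_BANDS.items():
        if low <= age <= high:
            return code
    raise ValueError("age outside the 25-65 analytic sample")


# Crossing literacy (2) x age group (4) x gender (2) gives the sub-group cells.
cells = list(product(("low", "medium_high"), AGE_BANDS, ("men", "women")))
print(len(cells))  # 16
```

Replacing the literacy labels with the two education categories gives the other 16-cell scheme, and crossing all four indicators gives 32 cells.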
Sampling weights, replicate weights, and all 10 sets of plausible values were used in the weighted descriptive statistics, and the sampling weights were used for the main statistical analysis (i.e., alignment optimization method or AO method). However, given technical incompatibility (see below for the AO method description), we were not able to use all 10 plausible values or replicate weights in the AO analyses. We used the first set of plausible values for the proficiency level classification for the AO analysis. Therefore, the statistical significance close to the cut-off point needs to be interpreted with caution due to the potentially underestimated error variance. At the same time, we believe that the results with the first set of plausible values are trustworthy for the purpose of this study because the AO analysis results that were obtained using other individual plausible values (e.g., second plausible value) were consistent.
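One common way to obtain a weighted point estimate from several plausible values is to compute the statistic once per plausible-value set and average the results. The sketch below illustrates that idea for the share of adults with medium or high literacy. The data are made up, the function names are hypothetical, and variance estimation with replicate weights is deliberately omitted, so this is not a reproduction of the study's descriptive analysis.

```python
def weighted_share_high(scores, weights, cutoff=276):
    """Weighted share of respondents whose score is at or above the cutoff."""
    total = sum(weights)
    high = sum(w for s, w in zip(scores, weights) if s >= cutoff)
    return high / total


def combine_over_plausible_values(pv_sets, weights, cutoff=276):
    """Average the weighted share across plausible-value sets (point estimate only)."""
    shares = [weighted_share_high(pv, weights, cutoff) for pv in pv_sets]
    return sum(shares) / len(shares)


# Hypothetical toy data: three respondents, two plausible-value sets.
weights = [1.0, 2.0, 1.0]
pv_sets = [
    [300, 250, 280],  # share at or above 276: (1 + 1) / 4 = 0.50
    [270, 280, 290],  # share at or above 276: (2 + 1) / 4 = 0.75
]
print(combine_over_plausible_values(pv_sets, weights))  # 0.625
```

Using only the first plausible value, as in the AO analyses, corresponds to passing a single set, which gives a point estimate but ignores the between-set variability that a full analysis would fold into the standard error.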

Analytic approach
The weighted descriptive statistics were computed using all the sampling weights, replicate weights, and a set of 10 proficiency level classifications, each based on a different set of plausible values. This study applied the AO method to overcome some of the methodological issues with the conventional measurement approaches such as MGCFA and IRT (Asparouhov & Muthén, 2014). Essentially, the AO method is an exploratory MGCFA that seeks to find the measurement model with the minimum amount of measurement non-invariance, and to compare the estimated latent means across many groups (Muthén & Asparouhov, 2018). The AO method is considered an approximate invariance approach, and is more flexible than the conventional, exact invariance approach (Byrne & Vijver, 2017). In the conventional approaches, at least partial strong invariance (i.e., equivalent factor loadings and mostly equivalent intercepts) is required to compare estimated latent means across groups.
However, when many constraints are imposed simultaneously (e.g., fixing all factor loadings to be equal across groups) in a situation involving many groups, achieving strong invariance with an adequate model fit is challenging without applying multiple modifications to the model. Additionally, in the presence of many groups, manually adjusting models in a data-driven manner (e.g., through examination of modification indices) can be complex and inefficient. The AO method only requires configural invariance among groups to estimate and compare the latent means, even when up to 25% of parameters are non-invariant (Muthén & Asparouhov, 2018). The AO method algorithm is implemented in an efficient, automated manner in Mplus (Muthén & Muthén, 2017). In summary, the AO method is useful for addressing some of the limitations inherent in conventional methods, and for comparing the latent means across many groups even in the presence of measurement non-invariance. These strengths make the AO method well suited to comparing sub-populations.
A brief explanation of the AO method is provided here as the technical details are published elsewhere (Asparouhov & Muthén, 2014; Lomazzi, 2018). There are three steps in the AO method. First, the AO algorithm fits a measurement model (i.e., confirmatory factor analysis) with the mean and variance of the latent variable fixed to 0 and 1, respectively, in all sub-groups. This step mainly establishes the baseline model fit and configural invariance across the sub-groups. Second, the latent means and variances for all sub-groups are freely estimated to find the combination that minimizes the amount of non-invariance by optimizing a simplicity function. That is, the freely estimated latent means and variances are incorporated into the estimation of intercepts and factor loadings, and the differences in each combination of sub-groups are summed in the simplicity function (Muthén & Asparouhov, 2018). The role of the simplicity function in the AO method is essentially equivalent to rotation in exploratory factor analysis, which extracts information without changing the resultant structure or measurements of relevant variables (Lomazzi, 2018). Finally, the post-estimation algorithm finds the combination that minimizes the overall differences (i.e., optimizes the simplicity function) across sub-groups. In this final step, a significance test with an adjusted Type I error rate (α = 0.001) is conducted for the mean intercept or loading, as well as each combination of the intercepts and/or factor loadings, to classify the invariant and non-invariant sets of parameters (i.e., intercepts and factor loadings). The largest set of invariant parameters (or conversely, the smallest set of parameters that differ across groups) indicates the optimal model that can be used for latent mean estimation and comparison even with up to 25% of parameters non-invariant across sub-groups (Muthén & Asparouhov, 2018).
Regardless of the final model, the model fit remains the same as the configural model. Automated AO functionality is incorporated into Mplus version 7.1 and higher (Muthén & Muthén, 2017).
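To make the second step more concrete, the sketch below evaluates a simplicity function for a one-factor model. It follows the general form described by Asparouhov and Muthén (2014): configural loadings and intercepts are rescaled by candidate latent means and variances, and pairwise differences are penalized with a component loss. The specific loss, the value of `eps`, the sample-size weights, and all numbers are assumptions of this illustration, and it does not reproduce the Mplus implementation.

```python
from itertools import combinations
from math import sqrt


def component_loss(x, eps=0.01):
    """Loss that grows slowly, favoring few large differences over many small ones."""
    return sqrt(sqrt(x * x + eps))


def simplicity_loss(loadings0, intercepts0, means, variances, sizes, eps=0.01):
    """Total non-invariance for candidate latent means and variances.

    loadings0[g][p] and intercepts0[g][p] are configural estimates for group g
    and item p, obtained with the latent mean fixed to 0 and variance to 1.
    """
    total = 0.0
    n_items = len(loadings0[0])
    for g1, g2 in combinations(range(len(sizes)), 2):
        weight = sqrt(sizes[g1] * sizes[g2])
        for p in range(n_items):
            # Rescale loadings by the candidate latent standard deviations.
            lam1 = loadings0[g1][p] / sqrt(variances[g1])
            lam2 = loadings0[g2][p] / sqrt(variances[g2])
            # Shift intercepts by the candidate latent means.
            nu1 = intercepts0[g1][p] - means[g1] * lam1
            nu2 = intercepts0[g2][p] - means[g2] * lam2
            total += weight * (component_loss(lam1 - lam2, eps)
                               + component_loss(nu1 - nu2, eps))
    return total


# Hypothetical two-group example: same loadings, but group 2 has a true
# latent mean of 0.5, which appears as shifted configural intercepts.
lam = [0.8, 0.7, 0.9, 0.6]
nu = [4.0, 3.9, 3.8, 4.1]
loadings0 = [lam, lam]
intercepts0 = [nu, [n + 0.5 * l for n, l in zip(nu, lam)]]
sizes = [500, 500]

unaligned = simplicity_loss(loadings0, intercepts0, [0.0, 0.0], [1.0, 1.0], sizes)
aligned = simplicity_loss(loadings0, intercepts0, [0.0, 0.5], [1.0, 1.0], sizes)
print(aligned < unaligned)  # True
```

In this toy case, setting the second group's latent mean to 0.5 removes all intercept differences, so the loss drops to its floor. An optimizer would search over means and variances for that minimum, with one group's mean held at 0 for identification, in the spirit of the fixed alignment option described in the Procedure.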

Procedure
The weighted descriptive summary was estimated with the imputation technique. The main analysis was conducted using the AO method implemented in Mplus version 8 (Muthén & Muthén, 2017). Two different sets of sub-groups were examined. The first set included 16 sub-groups that were classified based on combinations of education level (college or higher vs. less than college), age group (4 groups), and gender (women vs. men). The second set included 16 sub-groups that were classified based on combinations of literacy proficiency (low proficiency vs. medium or high proficiency), age group, and gender. The classification scheme and specific sub-groups are shown in Figs. 1 and 2. Additionally, as an exploratory analysis, a set of 32 sub-groups was created based on combinations of gender, age group, education level, and literacy level, although the cell sample sizes for two of these sub-groups were concerning (cell sample sizes less than 100; see Additional file 1: Figure S1). Following previous research (Gorges et al., 2016), the measurement model with the latent MtL variable and the four manifest variables was specified (see Additional file 2: Figure S2). In a preliminary analysis, MGCFA was conducted and the model fit was evaluated based on the chi-square statistic, comparative fit index (CFI), root mean square error of approximation (RMSEA), and standardized root mean squared residual (SRMR), using the cutoff criteria recommended by Wang and Wang (2012): CFI > 0.90, RMSEA < 0.10, and SRMR < 0.10. In each AO model, one of the latent means was fixed to 0 rather than freely estimated, using the ALIGNMENT = FIXED option in Mplus. Robust maximum likelihood (MLR) estimation was used given the 5-point Likert-type response in the MtL items and somewhat negatively skewed distributions (DeMaris, 2005).
The final sampling weights (SPFWT0) were applied for all analyses to allow for nationally representative point estimates.

Results
Table 1 presents the weighted descriptive summary for all measures of interest. The means of the motivation to learn items ranged from 3.8 to 4.2 out of 5.0. Age group and gender were approximately uniformly distributed. There were more working-age adults without college degrees (57%) than those with college degrees (43%). Slightly more than half of adults (52%) had middle-to-high literacy proficiency. Before applying the AO method, a preliminary MGCFA configural model was fitted. For the multigroup model in which groups were formed by educational attainment, age, and gender, the model fit was adequate [χ²(113) = 294.26, p < 0.001; CFI = 0.97; RMSEA = 0.06; SRMR = 0.08] after nine of the sixty-four intercepts were freely estimated (the list is available upon request). When groups defined by literacy proficiency level, age, and gender were considered, the fit of the configural model was adequate [χ²(119) = 310.12, p < 0.001; CFI = 0.96; RMSEA = 0.06; SRMR = 0.08] after three of the sixty-four intercepts were freely estimated (the list is available upon request). Based on these models, the AO method was applied to both sets of groups. Table 2 presents the results from the AO method for the groups defined by educational attainment, age, and gender. These results show the national profile of mean motivation to learn for each specific sub-population and indicate which pairwise differences between groups were statistically significant. For example, Group #2 (ranked first, with the highest level of MtL) had MtL significantly higher than 13 other groups. The approximate measurement invariance test showed that four out of sixty-four intercepts were non-invariant. Yet, 94% of the intercepts and all factor loadings were invariant.
Given that Muthén and Asparouhov's (2018) suggested criterion of fewer than 25% of parameter estimates showing non-invariance was met, latent mean comparisons were conducted. Overall, MtL rankings could be predicted by educational attainment. Regardless of age and gender, college-educated adults tended to have higher MtL. In particular, college-educated younger adults had significantly higher MtL than the other groups. On the other hand, older working-age adults (age 55-65 years) without college degrees had significantly lower MtL than the other groups.

Table 2 Ranking by mean motivation to learn for 16 subgroups defined by gender, age group, and educational attainment
Ranking | Group | Group description | Factor mean | Groups with significantly smaller factor mean^a
1 | 2 | College+, age 25-34, men | 0.59 | 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
2 | 4 | College+, men | 0.53 | 3, 6, 9, 10, 11, 12, 13, 14, 15

Table 3 presents, based on application of the AO method, the subgroup rankings by their estimated means, and significance test results for all combinations of groups defined by literacy proficiency, age, and gender. The approximate measurement invariance test showed that three out of sixty-four intercepts and one out of sixty-four factor loadings were non-invariant. Yet, 95% of the intercepts and 98% of the factor loadings were invariant. Given that the non-invariant parameters constituted fewer than 25% of all estimated parameters, latent mean comparisons were conducted. Similar to the results observed for educational attainment, the groups with higher literacy proficiency were more likely to have higher MtL. At the same time, younger adults with low literacy proficiency had MtL equivalent to their counterparts with higher literacy proficiency. Younger adults with higher literacy proficiency had significantly higher MtL than the individuals from the remaining groups. In comparison, older working-age adults (age 55-65 years) with low literacy proficiency had significantly lower MtL than the other groups.
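The invariance percentages reported for the two 16-group analyses can be checked with simple arithmetic: 16 groups and 4 items give 64 intercepts and 64 loadings. The sketch below recomputes them and applies the 25% guideline to the combined parameter set. The function and its return keys are illustrative, and applying the threshold to intercepts and loadings jointly is an assumption of this sketch.

```python
def noninvariance_summary(noninv_intercepts, noninv_loadings,
                          n_groups, n_items=4, threshold=0.25):
    """Share of invariant parameters and a check against the 25% guideline."""
    per_type = n_groups * n_items  # intercepts (or loadings) across groups
    share_noninv = (noninv_intercepts + noninv_loadings) / (2 * per_type)
    return {
        "intercepts_invariant": 1 - noninv_intercepts / per_type,
        "loadings_invariant": 1 - noninv_loadings / per_type,
        "share_noninvariant": share_noninv,
        "below_threshold": share_noninv < threshold,
    }


education = noninvariance_summary(4, 0, n_groups=16)
literacy = noninvariance_summary(3, 1, n_groups=16)

print(round(100 * education["intercepts_invariant"]))  # 94
print(round(100 * literacy["intercepts_invariant"]))   # 95
print(round(100 * literacy["loadings_invariant"]))     # 98
print(education["below_threshold"], literacy["below_threshold"])  # True True
```

In both analyses only 4 of 128 parameters (about 3%) were flagged, far below the 25% guideline.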
Table 3 Ranking of mean motivation to learn for 16 subgroups defined by gender, age group, and literacy proficiency level

Ranking | Group | Group description | Factor mean | Groups with significantly smaller factor mean^a
1 | 2 | Literacy+, age 25-34, men | 0.51 | 1, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
2 | 4 | Literacy+, men | 0.46 | 5, 6, 8, 9, 11, 12, 13, 14, 15

As a follow-up exploratory analysis, the AO method was applied to the 32 groups that were defined by combinations of educational attainment, literacy proficiency, age group, and gender. These groups and corresponding results are shown in Additional file 1: Figure S1 and Additional file 3: Table S1. In the MGCFA invariance test, after several attempts to modify the model and freely estimating seven out of 128 intercepts, the model still showed only marginal fit [χ²(243) = 639.88, p < 0.001; CFI = 0.93; RMSEA = 0.08; SRMR = 0.13]. Additionally, two groups (exact sample sizes are not reported in accordance with IES data security guidelines) had sample sizes slightly smaller than the recommended sample size of 100 (Muthén & Asparouhov, 2018). Considering that the observed CFI and RMSEA values were adequate and that the majority (94%) of the groups had sample sizes greater than 100, we proceeded to apply the AO method. The approximate measurement invariance test showed that five out of 128 intercepts were non-invariant whereas all factor loadings were invariant. As fewer than 25% of the intercepts and loadings were non-invariant, mean comparisons were conducted. The estimated means and rankings are reported in Additional file 3: Table S1. Overall, college-educated younger adults with middle-to-high literacy proficiency tended to be more motivated than their counterparts. There were no clear patterns by gender. At the same time, older men with less than a college education were significantly less likely to be motivated than most other groups. Similarly, older women with less than a college education were less motivated than most other groups. Results from these 32 groups should be treated as preliminary findings due to the marginal baseline model fit and borderline sample sizes for two groups.
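The 25% rule of thumb applied above is simple arithmetic, and a minimal sketch may make the check easier to reproduce. The class and function names below are hypothetical and are not part of the authors' analysis code; the counts are those reported in the text, and pooling intercepts and loadings into one share is an assumption about how the criterion is applied.

```python
from dataclasses import dataclass

# Rule of thumb attributed in the text to Muthén & Asparouhov (2018):
# fewer than 25% of parameters should be non-invariant.
THRESHOLD = 0.25


@dataclass(frozen=True)
class InvarianceCounts:
    """Counts of tested and non-invariant parameters for one alignment run (hypothetical helper)."""
    label: str
    n_intercepts: int
    noninvariant_intercepts: int
    n_loadings: int
    noninvariant_loadings: int

    def share_noninvariant(self) -> float:
        # Assumption: intercepts and loadings are pooled into a single share.
        total = self.n_intercepts + self.n_loadings
        flagged = self.noninvariant_intercepts + self.noninvariant_loadings
        return flagged / total

    def supports_mean_comparison(self, threshold: float = THRESHOLD) -> bool:
        return self.share_noninvariant() < threshold


# Counts as reported in the Results text.
literacy = InvarianceCounts("16 groups (literacy, age, gender)", 64, 3, 64, 1)
combined = InvarianceCounts("32 groups (education, literacy, age, gender)", 128, 5, 128, 0)

for run in (literacy, combined):
    print(f"{run.label}: {run.share_noninvariant():.1%} non-invariant, "
          f"compare means: {run.supports_mean_comparison()}")
```

Under these assumptions both analyses fall well below the threshold, which is consistent with the decision to proceed to latent mean comparisons. The check says nothing about the marginal baseline model fit or the two small groups, which remain separate caveats.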

Discussion
This study applied the seldom-utilized AO method in the context of MtL among working age adults in the United States. When estimating latent means and comparing them across many groups, the AO method has several advantages over conventional methodological approaches such as MGCFA and IRT. Using the existing 4-item MtL scale, rankings based on the estimated means, which can be considered to represent a national profile of MtL by the indicated subpopulations, were computed. Overall, higher educational attainment, middle-to-high literacy proficiency levels, and younger age, as well as the combinations of these three factors, best represented the MtL rankings. However, there were no clear patterns by gender in the sub-group comparisons. It is notable that educational attainment, self-evaluation, and MtL each provide individuals with a "feedback loop" over their life course that can lead to engagement with AET. That is, successful completion of, or positive experiences in, education can lead to more positive self-evaluation (e.g., self-efficacy) and greater MtL and, in turn, increased participation in continuing AET (Chang & Lin, 2011). Interestingly, literacy proficiency seemed to follow patterns similar to MtL. Literacy levels may partially explain the pathways between education and MtL, and/or a part of readiness to learn, which is linked to MtL (Center for Community College Student Engagement, 2014; Smith et al., 2015). In other words, education may have contributed to literacy skills, which in turn might have enhanced MtL. The observed patterns of younger adults with higher MtL are consistent with some previous research (Boeren, 2017; Cummins et al., 2015). However, the findings in this study only show associations. Additional qualitative inquiry is needed to identify more detailed reasons why older age is linked to lower MtL, beyond the currently known aging-related barriers (e.g., lower sensory functions, mobility; Roosmaa & Saar, 2017; Wlodkowski & Ginsberg, 2017).
Although gender did not show any clear association with the MtL distributions, gender perhaps plays distinct roles in AET participation, satisfaction, and effectiveness (Bell & Ford, 2007;Klein et al., 2006).
Whereas the potential explanations for individual MtL determinants are useful, the intersectionality of key MtL determinants requires further exploration. The findings from this study provide a foundation to further advance MtL research. In particular, the rankings and detailed subpopulation characteristics defined by education, literacy proficiency, age, and gender illuminate specific subpopulations. For example, a combination of lower education and older age appeared to be negatively associated with MtL. The rankings from this study provide empirical justifications for further inquiries. For example, given the brevity yet sound psychometric properties of the 4-item PIAAC MtL scale, it can be administered and used relatively easily in applied research with an adult education institution, organization, and/or community. Such case studies would be useful to compare with the findings from the current study, which provides benchmark national averages against which results from local MtL assessments for particular sub-groups of adults can be evaluated. However, theoretical explanations of how lower educational attainment and older age result in lower MtL remain unclear. Older age may indicate exposure to negative educational experiences in early life. At the same time, older age may indicate possible functional limitations (e.g., vision and hearing impairment) and amplify negative effects of lower educational attainment. These are merely speculations, however, and future research is needed to identify the mechanisms driving the MtL rankings of the subpopulations.
Given the observed national profile of MtL, several implications for education policy and AET practice are worth exploring. First, the rankings and mean differences across subpopulations can inform AET promotion and intervention programs targeted to specific groups. In particular, providing supports to middle-aged and older working age (45-65 years) adults with lower levels of education as well as lower literacy proficiency is meaningful. In this case, non-formal AET, which does not directly lead to a formal qualification or degree, may be a more suitable starting point for facilitating a positive educational experience and enhancing intrinsic motivation (Saar et al., 2013). Second, providing AET opportunities to younger, educated adults may result in a higher return on investment (e.g., greater training/education outcomes) as they tend to be more motivated to learn.
Third, given the complex gender differences in MtL, current structures and programs for AET should be reviewed in terms of accessibility. Women and men are similarly motivated overall, with some exceptions. For example, college educated women and men in the age 25-34 group had equivalent MtL, although college educated men in the age 35-44 group showed significantly greater MtL than college educated women in the same age group. Also, men with high literacy proficiency in the age 25-34 group had significantly greater MtL than women with high literacy proficiency in the same age group. Overall, better access to AET opportunities should result in positive outcomes for both women and men, although additional interventions may be warranted in the cases of the gender differences by education and literacy proficiency. On a relevant note, future research may further explore how individuals with different MtL may interact in a lifelong learning environment. For instance, sub-groups of adults with low MtL may be inspired by those with higher MtL when they are included in the same learning environment. It is also possible that low MtL may be transmitted through social and learning networks. Yet, little is known about whether individual or group learning may be more appropriate to improve MtL. An exploration of how MtL changes based on the adult learning environment would be informative for developing effective interventions. Finally, given the MtL rankings and differences across subpopulations, education policy should aim at improving the national averages as well as subpopulation averages to promote equality in learning. The mean estimation strategies (i.e., the four-item PIAAC MtL scale and the AO method) that were used in this study can be incorporated in national data collection efforts and used to monitor MtL trends at the subpopulation level.
With respect to national profiles and education policy implications, applications of the AO method in other PIAAC participating countries would open more opportunities for international comparisons of MtL sub-group profiles as well as inform lifelong learning policies in a variety of contexts and locations.

Limitations
Several limitations should be noted. While the AO method is relatively novel and efficient, varied user inputs such as the estimation methods, options (e.g., fixed vs. freely estimated latent means), and sample sizes could return inconsistent findings. In this study, the methodological decisions were made based on the theoretical framework and feasibility with the data. In addition, we referred to the published guidelines and recommendations to make methodological decisions. More empirically rigorous approaches (e.g., employing a simulation for power analysis) were beyond the scope of this study. Moreover, the findings from this study should be supplemented with qualitative inquiries to identify underlying explanations for differential MtL by the selected characteristics. For example, the findings from this study showed significant differences in MtL between older women with high literacy proficiency and those without (see, for example, Group #7 vs. #15 in Table 3). However, unlike for educational attainment, limited empirical evidence on literacy and MtL is available. Qualitative inquiries may uncover more detailed explanations of how basic skills may impact MtL. Finally, given the exploratory nature of the AO method, more research with future releases of national data sets (e.g., cross-sectional and longitudinal data) to verify the findings is an indispensable next step.

Contributions
This study made three important contributions to the literature. First, the use of the AO method extended existing research (Gorges et al., 2016, 2017) on the MtL construct to more detailed socio-demographic subpopulations. It demonstrated that the four-item MtL measure can validly be used to compare subgroups of working-age adults with varying combinations of age, educational attainment, and literacy levels. Previously, sub-group comparisons were limited to one characteristic at a time, but the introduction of the AO method now allows for simultaneous examination of the latent MtL construct across multiple sub-groups. Our findings added to the literature, increased confidence about the key MtL determinants, and shed light on the intersectionality of sub-groups. Second, a national profile/ranking of MtL by the subgroups of working-age adults was developed, which is useful to identify groups of adults with lower MtL. Such at-risk groups (e.g., older working age adults with less than a college education) may need additional assistance or intervention to promote their participation in lifelong learning. At the same time, groups with higher motivation should be encouraged to participate in lifelong learning because positive learning outcomes can be expected.
Finally, this study showcased an underutilized analytical approach, the AO method, in a population-level study in the field of education. Application of AO methods may address common measurement issues (e.g., cross-group validity, latent mean comparisons) in previously published research with the conventional MGCFA.

Conclusion
Developing national MtL profiles by subpopulations has been challenging due to several limitations of conventional methods such as MGCFA and IRT. Use of a novel AO method produced, arguably, the first national MtL rankings by subpopulations defined by age, gender, educational attainment, and literacy proficiency levels. While the findings from this study were consistent with previous research in terms of identifying factors that might promote MtL, including younger age, greater education level, and higher literacy proficiency, this study highlighted the complexity of intersections across these promoting factors. This study is still an early attempt to better understand MtL, which positively stimulates AET participation at the population level. Preliminary adult education policy discussions are warranted in view of the results of this study. Yet, more data collection, analysis of different and more detailed subgroups, and refinement of methodological approaches, including both conventional and novel methods, are needed to document MtL distributions and assess how various intersections of factors may promote MtL.