Asparouhov, T., & Muthén, B. (2010). Bayesian analysis using Mplus: Technical implementation (Mplus Technical Report). http://statmodel.com/download/Bayes3.pdf. Accessed 12 November 2020.
Asparouhov, T., & Muthén, B. (2014). Multiple-group factor analysis alignment. Structural Equation Modeling: A Multidisciplinary Journal, 21(4), 495–508. https://doi.org/10.1080/10705511.2014.919210
Aßmann, C., Steinhauer, H. W., Kiesl, H., Koch, S., Schönberger, B., Müller-Kuller, A., Rohwer, G., Rässler, S., & Blossfeld, H.-P. (2011). 4 Sampling designs of the National Educational Panel Study: Challenges and solutions. Zeitschrift Für Erziehungswissenschaft, 14(S2), 51–65. https://doi.org/10.1007/s11618-011-0181-8
Bast, J., & Reitsma, P. (1998). Analyzing the development of individual differences in terms of Matthew effects in reading: Results from a Dutch longitudinal study. Developmental Psychology, 34(6), 1373–1399. https://doi.org/10.1037/0012-1649.34.6.1373
Baumert, J., Klieme, E., Neubrand, M., Prenzel, M., Schiefele, U., Schneider, W., Stanat, P., Tillmann, K.-J., & Weiß, M. (2001). PISA 2000: Basiskompetenzen von Schülerinnen und Schülern im internationalen Vergleich. Leske + Budrich. https://doi.org/10.1007/978-3-322-83412-6
Baumert, J., Stanat, P., & Watermann, R. (2006). Schulstruktur und die Entstehung differenzieller Lern- und Entwicklungsmilieus. In J. Baumert, P. Stanat, & R. Watermann (Eds.), Herkunftsbedingte Disparitäten im Bildungssystem (pp. 95–188). VS Verlag für Sozialwissenschaften.
Baumert, J., Trautwein, U., & Artelt, C. (2003). Schulumwelten – institutionelle Bedingungen des Lehrens und Lernens. In J. Baumert, C. Artelt, E. Klieme, M. Neubrand, M. Prenzel, U. Schiefele, W. Schneider, K.-J. Tillmann, & M. Weiß (Eds.), PISA 2000. Ein differenzierter Blick auf die Länder der Bundesrepublik Deutschland (pp. 261–331). Leske + Budrich.
Bayer, M., Goßmann, F., & Bela, D. (2014). NEPS technical report: Generated school type variable t723080_g1 in Starting Cohorts 3 and 4 (NEPS Working Paper No. 46). Bamberg: Leibniz Institute for Educational Trajectories, National Educational Panel Study. https://www.neps-data.de/Portals/0/Working%20Papers/WP_XLVI.pdf. Accessed 12 November 2020.
Becker, M., Lüdtke, O., Trautwein, U., & Baumert, J. (2006). Leistungszuwachs in Mathematik. Zeitschrift Für Pädagogische Psychologie, 20(4), 233–242. https://doi.org/10.1024/1010-0652.20.4.233
Blossfeld, H.-P., Roßbach, H.-G., & von Maurice, J. (Eds.). (2011). Education as a lifelong process: The German National Educational Panel Study (NEPS) [Special Issue]. Zeitschrift für Erziehungswissenschaft, 14.
Bos, W., Bonsen, M., & Gröhlich, C. (2009). KESS 7 Kompetenzen und Einstellungen von Schülerinnen und Schülern an Hamburger Schulen zu Beginn der Jahrgangsstufe 7. HANSE—Hamburger Schriften zur Qualität im Bildungswesen (Vol. 5). Waxmann.
Brown, T. A. (2006). Confirmatory factor analysis for applied research. Guilford Press.
Camilli, G. (1993). The case against item bias detection techniques based on internal criteria: Do item bias procedures obscure test fairness issues? In P. W. Holland & H. Wainer (Eds.), Differential item functioning: Theory and practice (pp. 397–417). Erlbaum.
Camilli, G. (2006). Test fairness. In R. Brennan (Ed.), Educational measurement (4th ed., pp. 221–256). American Council on Education and Praeger.
Chall, J. S. (1983). Stages of reading development. McGraw-Hill.
Cohen, J. (1969). Statistical power analysis for the behavioral sciences. Academic Press.
Cortina, K. S., & Trommer, L. (2009). Bildungswege und Bildungsbiographien in der Sekundarstufe I. Das Bildungswesen in der Bundesrepublik Deutschland: Strukturen und Entwicklungen im Überblick. Waxmann.
Ditton, H., Krüsken, J., & Schauenberg, M. (2005). Bildungsungleichheit—der Beitrag von Familie und Schule. Zeitschrift Für Erziehungswissenschaft, 8(2), 285–304. https://doi.org/10.1007/s11618-005-0138-x
Edossa, A. K., Neuenhaus, N., Artelt, C., Lingel, K., & Schneider, W. (2019). Developmental relationship between declarative metacognitive knowledge and reading comprehension during secondary school. European Journal of Psychology of Education, 34(2), 397–416. https://doi.org/10.1007/s10212-018-0393-x
Finch, W. H., & Bolin, J. E. (2017). Multilevel modeling using Mplus. Chapman and Hall/CRC.
Fischer, L., Gnambs, T., Rohm, T., & Carstensen, C. H. (2019). Longitudinal linking of Rasch-model-scaled competence tests in large-scale assessments: A comparison and evaluation of different linking methods and anchoring designs based on two tests on mathematical competence administered in grades 5 and 7. Psychological Test and Assessment Modeling, 61, 37–64.
Fischer, L., Rohm, T., Gnambs, T., & Carstensen, C. H. (2016). Linking the data of the competence tests (NEPS Survey Paper No. 1). Bamberg: Leibniz Institute for Educational Trajectories, National Educational Panel Study. https://www.lifbi.de/Portals/0/Survey%20Papers/SP_I.pdf. Accessed 12 November 2020.
Fox, J.-P. (2010). Bayesian item response modeling: Theory and applications. Springer.
Fox, J.-P., & Glas, C. A. W. (2001). Bayesian estimation of a multilevel IRT model using Gibbs sampling. Psychometrika, 66, 271–288.
Gamoran, A., & Mare, R. D. (1989). Secondary school tracking and educational inequality: Compensation, reinforcement, or neutrality? American Journal of Sociology, 94(5), 1146–1183. https://doi.org/10.1086/229114
Gehrer, K., Zimmermann, S., Artelt, C., & Weinert, S. (2013). NEPS framework for assessing reading competence and results from an adult pilot study. Journal for Educational Research Online, 5, 50–79.
Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2004). Bayesian data analysis (2nd ed.). Chapman & Hall.
Gelman, A., & Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical Science, 7, 457–472.
Heck, R. H., Price, C. L., & Thomas, S. L. (2004). Tracks as emergent structures: A network analysis of student differentiation in a high school. American Journal of Education, 110(4), 321–353. https://doi.org/10.1086/422789
Holland, P. W., & Wainer, H. (1993). Differential item functioning. Routledge. https://doi.org/10.4324/9780203357811
Hox, J. J. (2002). Multilevel analysis: Techniques and applications. Quantitative methodology series. Erlbaum.
Jak, S., & Jorgensen, T. (2017). Relating measurement invariance, cross-level invariance, and multilevel reliability. Frontiers in Psychology, 8, 1640. https://doi.org/10.3389/fpsyg.2017.01640
Jöreskog, K. G. (1971). Simultaneous factor analysis in several populations. Psychometrika, 36(4), 409–426. https://doi.org/10.1007/BF02291366
Kamata, A., & Vaughn, B. K. (2010). Multilevel IRT modeling. In J. J. Hox & J. K. Roberts (Eds.), Handbook of advanced multilevel analysis (pp. 41–57). Routledge.
Kaplan, D., Kim, J.-S., & Kim, S.-Y. (2009). Multilevel latent variable modeling: Current research and recent developments. In R. E. Millsap & A. Maydeu-Olivares (Eds.), The Sage handbook of quantitative methods in psychology (pp. 592–612). Sage Publications Ltd. https://doi.org/10.4135/9780857020994.n24
Kim, E., Cao, C., Wang, Y., & Nguyen, D. (2017). Measurement invariance testing with many groups: A comparison of five approaches. Structural Equation Modeling: A Multidisciplinary Journal. https://doi.org/10.1080/10705511.2017.1304822
Köller, O., & Baumert, J. (2001). Leistungsgruppierungen in der Sekundarstufe I. Ihre Konsequenzen für die Mathematikleistung und das mathematische Selbstkonzept der Begabung. Zeitschrift Für Pädagogische Psychologie, 15, 99–110. https://doi.org/10.1024//1010-0652.15.2.99
Köller, O., & Baumert, J. (2002). Entwicklung von Schulleistungen. In R. Oerter & L. Montada (Eds.), Entwicklungspsychologie (pp. 735–768). Beltz/PVU.
Krannich, M., Jost, O., Rohm, T., Koller, I., Carstensen, C. H., Fischer, L., & Gnambs, T. (2017). NEPS Technical report for reading—scaling results of starting cohort 3 for grade 7 (NEPS Survey Paper No. 14). Bamberg: Leibniz Institute for Educational Trajectories, National Educational Panel Study. https://www.neps-data.de/Portals/0/Survey%20Papers/SP_XIV.pdf. Accessed 12 November 2020.
Lehmann, R., Gänsfuß, R., & Peek, R. (1999). Aspekte der Lernausgangslage und der Lernentwicklung von Schülerinnen und Schülern an Hamburger Schulen: Klassenstufe 7; Bericht über die Untersuchung im September 1999. Hamburg: Behörde für Schule, Jugend und Berufsbildung, Amt für Schule.
Lehmann, R. H., & Lenkeit, J. (2008). ELEMENT. Erhebung zum Lese- und Mathematikverständnis. Entwicklungen in den Jahrgangsstufen 4 bis 6 in Berlin. Berlin: Senatsverwaltung für Bildung, Jugend und Sport.
LeTendre, G. K., Hofer, B. K., & Shimizu, H. (2003). What is tracking? Cultural expectations in the United States, Germany, and Japan. American Educational Research Journal, 40(1), 43–89. https://doi.org/10.3102/00028312040001043
Loyd, B. H., & Hoover, H. D. (1980). Vertical equating using the Rasch model. Journal of Educational Measurement, 17, 179–193.
Lu, I. R. R., Thomas, D. R., & Zumbo, B. D. (2005). Embedding IRT in structural equation models: A comparison with regression based on IRT scores. Structural Equation Modeling: A Multidisciplinary Journal, 12(2), 263–277. https://doi.org/10.1207/s15328007sem1202_5
Lüdtke, O., Marsh, H. W., Robitzsch, A., Trautwein, U., Asparouhov, T., & Muthén, B. (2008). The multilevel latent covariate model: A new, more reliable approach to group-level effects in contextual studies. Psychological Methods, 13, 203–229.
Lüdtke, O., Marsh, H. W., Robitzsch, A., & Trautwein, U. (2011). A 2x2 taxonomy of multilevel latent contextual model: Accuracy-bias trade-offs in full and partial error correction models. Psychological Methods, 16, 444–467.
Marsh, H. W., Lüdtke, O., Robitzsch, A., Trautwein, U., Asparouhov, T., Muthén, B., & Nagengast, B. (2009). Doubly-latent models of school contextual effects: Integrating multilevel and structural equation approaches to control measurement and sampling error. Multivariate Behavioral Research, 44, 764–802.
McNeish, D., Stapleton, L. M., & Silverman, R. D. (2017). On the unnecessary ubiquity of hierarchical linear modeling. Psychological Methods, 22(1), 114–140. https://doi.org/10.1037/met0000078
Millsap, R. E., & Everson, H. T. (1993). Methodology review: Statistical approaches for assessing measurement bias. Applied Psychological Measurement, 17(4), 297–334. https://doi.org/10.1177/014662169301700401
Muthén, B., & Asparouhov, T. (2012). Bayesian SEM: A more flexible representation of substantive theory. Psychological Methods, 17, 313–335.
Muthén, B., & Asparouhov, T. (2014). IRT studies of many groups: The alignment method. Frontiers in Psychology, 5, 978. https://doi.org/10.3389/fpsyg.2014.00978
Muthén, L. K., & Muthén, B. O. (1998–2020). Mplus user’s guide (8th ed.). Los Angeles, CA: Muthén & Muthén.
Nagy, G., Retelsdorf, J., Goldhammer, F., Schiepe-Tiska, A., & Lüdtke, O. (2017). Veränderungen der Lesekompetenz von der 9. zur 10. Klasse: Differenzielle Entwicklungen in Abhängigkeit der Schulform, des Geschlechts und des soziodemografischen Hintergrunds? Zeitschrift Für Erziehungswissenschaft, 20(S2), 177–203. https://doi.org/10.1007/s11618-017-0747-1
Naumann, J., Artelt, C., Schneider, W., & Stanat, P. (2010). Lesekompetenz von PISA 2000 bis PISA 2009. In E. Klieme, C. Artelt, J. Hartig, N. Jude, O. Köller, & M. Prenzel (Eds.), PISA 2009. Bilanz nach einem Jahrzehnt. Münster: Waxmann. https://www.pedocs.de/volltexte/2011/3526/pdf/DIPF_PISA_ISBN_2450_PDFX_1b_D_A.pdf. Accessed 12 November 2020.
Neumann, M., Schnyder, I., Trautwein, U., Niggli, A., Lüdtke, O., & Cathomas, R. (2007). Schulformen als differenzielle Lernmilieus. Zeitschrift Für Erziehungswissenschaft, 10(3), 399–420. https://doi.org/10.1007/s11618-007-0043-6
O’Brien, D. G., Moje, E. B., & Stewart, R. A. (2001). Exploring the context of secondary literacy: Literacy in people’s everyday school lives. In E. B. Moje & D. G. O’Brien (Eds.), Constructions of literacy: Studies of teaching and learning in and out of secondary classrooms (pp. 27–48). Erlbaum.
Oakes, J., & Wells, A. S. (1996). Beyond the technicalities of school reform: Policy lessons from detracking schools. UCLA Graduate School of Education & Information Studies.
OECD. (2017). PISA 2015 assessment and analytical framework: science, reading, mathematic, financial literacy and collaborative problem solving. OECD Publishing. https://doi.org/10.1787/9789264281820-en
OECD & Statistics Canada. (1995). Literacy, economy and society: Results of the first international adult literacy survey. OECD Publishing.
Pfost, M., & Artelt, C. (2013). Reading literacy development in secondary school and the effect of differential institutional learning environments. In M. Pfost, C. Artelt, & S. Weinert (Eds.), The development of reading literacy from early childhood to adolescence: Empirical findings from the Bamberg BiKS longitudinal studies (pp. 229–278). Bamberg: University of Bamberg Press.
Pfost, M., Hattie, J., Dörfler, T., & Artelt, C. (2014). Individual differences in reading development: A review of 25 years of empirical research on Matthew effects in reading. Review of Educational Research, 84(2), 203–244. https://doi.org/10.3102/0034654313509492
Pfost, M., Karing, C., Lorenz, C., & Artelt, C. (2010). Schereneffekte im ein- und mehrgliedrigen Schulsystem: Differenzielle Entwicklung sprachlicher Kompetenzen am Übergang von der Grund- in die weiterführende Schule? Zeitschrift Für Pädagogische Psychologie, 24(3–4), 259–272. https://doi.org/10.1024/1010-0652/a000025
Pohl, S., Haberkorn, K., Hardt, K., & Wiegand, E. (2012). NEPS technical report for reading—scaling results of starting cohort 3 in fifth grade (NEPS Working Paper No. 15). Bamberg: Otto-Friedrich-Universität, Nationales Bildungspanel.
Protopapas, A., Parrila, R., & Simos, P. G. (2016). In search of Matthew effects in reading. Journal of Learning Disabilities, 49(5), 499–514. https://doi.org/10.1177/0022219414559974
Rabe-Hesketh, S., Skrondal, A., & Zheng, X. (2007). Multilevel structural equation modeling. In S.-Y. Lee (Ed.), Handbook of latent variable and related models (pp. 209–227). Elsevier.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods. Advanced quantitative techniques in the social sciences (Vol. 1). Thousand Oaks, CA: Sage.
Raykov, T. (1999). Are simple change scores obsolete? An approach to studying correlates and predictors of change. Applied Psychological Measurement, 23(2), 120–126. https://doi.org/10.1177/01466219922031248
Retelsdorf, J., Becker, M., Köller, O., & Möller, J. (2012). Reading development in a tracked school system: A longitudinal study over 3 years using propensity score matching. The British Journal of Educational Psychology, 82(4), 647–671. https://doi.org/10.1111/j.2044-8279.2011.02051.x
Retelsdorf, J., & Möller, J. (2008). Entwicklungen von Lesekompetenz und Lesemotivation: Schereneffekte in der Sekundarstufe? Zeitschrift Für Entwicklungspsychologie Und Pädagogische Psychologie, 40(4), 179–188. https://doi.org/10.1026/0049-8637.40.4.179
Robitzsch, A., & Lüdtke, O. (2020). A review of different scaling approaches under full invariance, partial invariance, and noninvariance for cross-sectional country comparisons in large-scale assessments. Psychological Test and Assessment Modeling, 62(2), 233–279. https://www.psychologie-aktuell.com/fileadmin/Redaktion/Journale/ptam-2020-2/03_Robitzsch.pdf
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. Wiley. https://doi.org/10.1002/9780470316696
Scharl, A., Fischer, L., Gnambs, T., & Rohm, T. (2017). NEPS Technical report for reading: scaling results of starting cohort 3 for grade 9 (NEPS Survey Paper No. 20). Bamberg: Leibniz Institute for Educational Trajectories, National Educational Panel Study. https://www.neps-data.de/Portals/0/Survey%20Papers/SP_XX.pdf. Accessed 12 November 2020.
Schneider, W., & Stefanek, J. (2004). Entwicklungsveränderungen allgemeiner kognitiver Fähigkeiten und schulbezogener Fertigkeiten im Kindes- und Jugendalter. Zeitschrift Für Entwicklungspsychologie Und Pädagogische Psychologie, 36(3), 147–159. https://doi.org/10.1026/0049-8637.36.3.147
Schweig, J. (2014). Cross-level measurement invariance in school and classroom environment surveys: Implications for policy and practice. Educational Evaluation and Policy Analysis, 36(3), 259–280. https://doi.org/10.3102/0162373713509880
Silva, B. C., Bosancianu, C. M., & Littvay, L. (2019). Multilevel structural equation modeling. Sage.
Stanovich, K. E. (1986). Matthew effects in reading: Some consequences of individual differences in the acquisition of literacy. Reading Research Quarterly, 21(4), 360–407. https://doi.org/10.1598/RRQ.21.4.1
Stapleton, L. M., McNeish, D. M., & Yang, J. S. (2016). Multilevel and single-level models for measured and latent variables when data are clustered. Educational Psychologist, 51(3–4), 317–330. https://doi.org/10.1080/00461520.2016.1207178
Steenkamp, J.-B. E. M., & Baumgartner, H. (1998). Assessing measurement invariance in cross-national consumer research. Journal of Consumer Research, 25, 78–90. https://doi.org/10.1086/209528
Steinhauer, H. W., & Zinn, S. (2016). NEPS technical report for weighting: Weighting the sample of starting cohort 3 of the National Educational Panel Study (Waves 1 to 3) (NEPS Working Paper No. 63). Bamberg: Leibniz Institute for Educational Trajectories, National Educational Panel Study. https://www.neps-data.de/Portals/0/Working%20Papers/WP_LXIII.pdf. Accessed 12 November 2020.
Steyer, R., Partchev, I., & Shanahan, M. J. (2000). Modeling true intraindividual change in structural equation models: The case of poverty and children’s psychosocial adjustment. In T. D. Little, K. U. Schnabel, & J. Baumert (Eds.), Modeling longitudinal and multilevel data: Practical issues, applied approaches and specific examples (pp. 109–126). Mahwah, N.J.: Lawrence Erlbaum Associates. https://www.metheval.uni-jena.de/materialien/publikationen/steyer_et_al.pdf. Accessed 12 November 2020.
Sweeney, R. E., & Ulveling, E. F. (1972). A transformation for simplifying the interpretation of coefficients of binary variables in regression analysis. The American Statistician, 26(5), 30–32. https://doi.org/10.2307/2683780
Te Grotenhuis, M., Pelzer, B., Eisinga, R., Nieuwenhuis, R., Schmidt-Catran, A., & Konig, R. (2017). When size matters: Advantages of weighted effect coding in observational studies. International Journal of Public Health, 62(1), 163–167. https://doi.org/10.1007/s00038-016-0901-1
van de Schoot, R., Kluytmans, A., Tummers, L., Lugtig, P., Hox, J., & Muthén, B. (2013). Facing off with Scylla and Charybdis: A comparison of scalar, partial, and the novel possibility of approximate measurement invariance. Frontiers in Psychology, 4, 770. https://doi.org/10.3389/fpsyg.2013.00770
Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3(1), 4–70. https://doi.org/10.1177/109442810031002
Walberg, H. J., & Tsai, S.-L. (1983). Matthew effects in education. American Educational Research Journal, 20(3), 359–373. https://doi.org/10.2307/1162605
Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427–450. https://doi.org/10.1007/BF02294627
Weis, M., Doroganova, A., Hahnel, C., Becker-Mrotzek, M., Lindauer, T., Artelt, C., & Reiss, K. (2020). Aktueller Stand der Lesekompetenz in PISA 2018. In K. Reiss, M. Weis, & A. Schiepe-Tiska (Eds.), Schulmanagement Handbuch (pp. 9–19). München: Cornelsen. https://www.pisa.tum.de/fileadmin/w00bgi/www/_my_direct_uploads/PISA_Bericht_2018_.pdf. Accessed 12 November 2020.
Weis, M., Zehner, F., Sälzer, C., Strohmeier, A., Artelt, C., & Pfost, M. (2016). Lesekompetenz in PISA 2015: Ergebnisse, Veränderungen und Perspektiven. In K. Reiss, C. Sälzer, A. Schiepe-Tiska, E. Klieme & O. Köller (Eds.), PISA 2015—Eine Studie zwischen Kontinuität und Innovation (pp. 249–283). Münster: Waxmann.
Williamson, G. L., Appelbaum, M., & Epanchin, A. (1991). Longitudinal analyses of academic achievement. Journal of Educational Measurement, 28(1), 61–76. https://doi.org/10.1111/j.1745-3984.1991.tb00344.x