Chapter 5 References

Adams, R. J. (2005). Reliability as a Measurement Design Effect. Studies in Educational Evaluation, 31(2-3), 162–172. https://doi.org/10.1016/j.stueduc.2005.05.008

Adams, R. J., Doig, B. A., & Rosier, M. (1991). Science Learning in Victorian Schools. Australian Council for Educational Research.

Adams, R. J., & Gonzales, E. J. (1996). Third International Mathematics and Science Study Technical Report, Volume 1: Design and Development (M. O. Martin & D. L. Kelly, Eds.). Center for the Study of Testing, Evaluation and Educational Policy, Boston College.

Adams, R. J., & Wilson, M. R. (1996). A Random Coefficients Multinomial Logit: A Generalized Approach to Fitting Rasch Models. In G. Engelhard & M. R. Wilson (Eds.), Objective Measurement III: Theory into Practice (pp. 143–166). Ablex.

Adams, R. J., Wilson, M. R., & Wang, W.-c. (1997). The Multidimensional Random Coefficients Multinomial Logit Model. Applied Psychological Measurement, 21, 1–24. https://doi.org/10.1177/0146621697211001

Adams, R. J., Wilson, M. R., & Wu, M. (1997). Multilevel Item Response Models: An Approach to Errors in Variables Regression. Journal of Educational and Behavioral Statistics, 22, 46–75.

Andersen, E. B. (1985). Estimating Latent Correlations Between Repeated Testings. Psychometrika, 50, 3–16. https://doi.org/10.1007/BF02294143

Andrich, D. (1978). A Rating Formulation for Ordered Response Categories. Psychometrika, 43, 561–573. https://doi.org/10.1007/BF02293814

Beaton, A. E. (1987). Implementing the New Design: The NAEP 1983–84 Technical Report (No. 15-TR-20). Educational Testing Service.

Beaton, A. E., Mullis, I. V., Martin, M. O., Gonzales, E. J., Kelly, D. L., & Smith, T. A. (1996). Mathematics Achievement in the Middle School Years. In IEA’s Third International Mathematics and Science Study. Boston College.

Birnbaum, A. (1968). Some Latent Trait Models and Their Use in Inferring an Examinee’s Ability. In F. M. Lord & M. R. Novick (Eds.), Statistical Theories of Mental Test Scores. Addison-Wesley.

Bock, R. D. (1972). Estimating Item Parameters and Latent Ability When Responses Are Scored in Two or More Nominal Categories. Psychometrika, 37, 29–51. https://doi.org/10.1007/BF02291411

Bock, R. D., & Aitkin, M. (1981). Marginal Maximum Likelihood Estimation of Item Parameters: An Application of the EM Algorithm. Psychometrika, 46, 443–459.

Bradley, R., & Terry, M. (1952). Rank Analysis of Incomplete Block Designs: I. The Method of Paired Comparisons. Biometrika, 39(3/4), 324–345. https://doi.org/10.2307/2334029

Congdon, P., & McQueen, J. (1997, March 24–28). The Stability of Rater Severity Estimates in Large Scale Performance Assessment Programmes. Annual Meeting of the American Educational Research Association.

Crocker, L. M., & Algina, J. (1986). Introduction to Classical and Modern Test Theory. Holt Rinehart and Winston.

Dempster, A., Laird, N., & Rubin, D. (1977). Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39, 1–38.

Embretson, S. E. (1991). A Multidimensional Latent Trait Model for Measuring Learning and Change. Psychometrika, 56, 495–515.

Engle, R. (1984). Wald, Likelihood Ratio, and Lagrange Multiplier Tests in Econometrics. In Z. Griliches & M. D. Intriligator (Eds.), Handbook of Econometrics II (pp. 775–826). Elsevier. http://www.stern.nyu.edu/rengle/LagrangeMultipliersHandbook_of_Econ__II___Engle.pdf

Fischer, G. H. (1973). The Linear Logistic Model as an Instrument in Educational Research. Acta Psychologica, 37, 359–374.

Fischer, G. H. (1983). Logistic Latent Trait Models with Linear Constraints. Psychometrika, 48, 3–26.

Glas, C. W. (1989). Contributions to Estimating and Testing Rasch Models [Doctoral dissertation]. University of Twente.

Glickman, M. (1999). Parameter Estimation in Large Dynamic Paired Comparison Experiments. Journal of the Royal Statistical Society: Series C (Applied Statistics), 48(3), 377–394. https://doi.org/10.1111/1467-9876.00159

Kelderman, H., & Rijkes, C. P. (1994). Loglinear Multidimensional IRT Models for Polytomously Scored Items. Psychometrika, 59, 149–176.

Linacre, J. M. (1994). Many-Facet Rasch Measurement. MESA Press.

Lokan, J., Ford, P., & Greenwood, L. (1996). Maths and Science on the Line: Australian Junior Secondary Students’ Performance in the Third International Mathematics and Science Study. Australian Council for Educational Research.

Lord, F. M. (1983). Unbiased Estimators of Ability Parameters, of Their Variance, and Parallel-Forms Reliability. Psychometrika, 48, 233–245.

Lord, F. M. (1984). Maximum Likelihood and Bayesian Parameter Estimation in Item Response Theory (Research Report No. RR-84-30-ONR). Educational Testing Service.

Luce, R. (2005). Individual Choice Behavior: A Theoretical Analysis. Dover Publications.

Masters, G. N. (1982). A Rasch Model for Partial Credit Scoring. Psychometrika, 47, 149–174.

Mislevy, R. J. (1984). Estimating Latent Distributions. Psychometrika, 49, 359–381.

Mislevy, R. J. (1985). Estimation of Latent Group Effects. Journal of the American Statistical Association, 80, 993–997.

Mislevy, R. J. (1991). Randomization-Based Inference about Latent Variables from Complex Samples. Psychometrika, 56, 177–196.

Mislevy, R. J., Beaton, A. E., Kaplan, B., & Sheehan, K. M. (1992). Estimating Population Characteristics from Sparse Matrix Samples of Item Responses. Journal of Educational Measurement, 29, 133–161.

Mislevy, R. J., & Sheehan, K. M. (1989). The Role of Collateral Information about Examinees in Item Parameter Estimation. Psychometrika, 54, 661–679.

Muraki, E. (1992). A Generalized Partial Credit Model. Applied Psychological Measurement, 16, 159–176. https://conservancy.umn.edu/handle/11299/115645

Rasch, G. (1980). Probabilistic Models for Some Intelligence and Attainment Tests. University of Chicago Press.

Roberts, L., Wilson, M. R., & Draney, K. (1997). The SEPUP Assessment System: An Overview. University of California.

Volodin, N., & Adams, R. J. (1995). Identifying and Estimating a D-Dimensional Item Response Model. International Objective Measurement Workshop, University of California.

Wang, W.-c. (1995). Implementation and Application of the Multidimensional Random Coefficients Multinomial Logit [Unpublished doctoral dissertation]. University of California.

Whitely, S. E. (1980). Multicomponent Latent Trait Models for Ability Tests. Psychometrika, 45, 479–494.

Wilson, M. R. (1992). The Ordered Partition Model: An Extension of the Partial Credit Model. Applied Psychological Measurement, 16, 309–325.

Wilson, M. R., & Adams, R. J. (1995). Rasch Models for Item Bundles. Psychometrika, 60, 181–198.

Wilson, M. R., & Masters, G. N. (1993). The Partial Credit Model and Null Categories. Psychometrika, 58, 87–99.

Wright, B. D., & Masters, G. N. (1982). Rating Scale Analysis: Rasch Measurement. MESA Press.

Wright, B. D., & Panchapakesan, N. (1969). A Procedure for Sample-Free Item Analysis. Educational and Psychological Measurement, 29, 23–48.

Wright, B. D., & Stone, M. H. (1979). Best Test Design: Rasch Measurement. MESA Press.

Wu, M. (1997). The Development and Application of a Fit Test for Use with Marginal Maximum Likelihood Estimation and Generalised Item Response Models [Unpublished master’s dissertation]. University of Melbourne.

Wu, M., & Adams, R. J. (1993). Simulating Parameter Recovery for the Random Coefficients Multinomial Logit. Fifth International Objective Measurement Workshop.

Zammit, S. A. (1997). English and Home Background Languages in Australian Primary Schools. In P. McKay (Ed.), The Bilingual Interface Project Report (pp. 111–146). Department of Employment, Education and Training.