International Conference on Computer-Based Testing and the Internet: Building Guidelines for Best Practice

Winchester Guildhall, Winchester, England, June 12th - 15th 2002

Pre-conference workshop
Bruno Zumbo, University of British Columbia, & Stephen G. Sireci, UMass-Amherst: Methods for Detecting Problematic Items in Test Adaptations

Opening session
Ronald K. Hambleton, UMass-Amherst: The Conference Programme

Mary J. Pitoniak, UMass-Amherst: Automatic Item Generation Methodology: Theory, Current Practices, and Future Directions
Abstract: Interest in automatic item generation (AIG) procedures has grown as computer-based tests, which require larger pools of items for security purposes, have become more common. In AIG, the computer is used as an item-creation adjunct to produce items more efficiently and, after the initial outlay of cost and effort, more economically. This paper reviews the theory and practice of AIG, describing both early methods that grew out of the criterion-referenced testing movement and more advanced approaches tied to cognitive psychology. Operational implementation and important directions for future research are also highlighted.
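To make the template-based ("item cloning") idea concrete, the Python sketch below generates items by filling slots in an item model and deriving a key and distractors by rule. It is a minimal illustration under assumed conventions; the item model, slot values, and distractor rules are hypothetical and are not the procedures described in the paper.

# Minimal sketch of template-based automatic item generation (item cloning).
# The item model, slot ranges, and distractor rule are hypothetical examples.
import random

TEMPLATE = "A train travels {speed} km/h for {hours} hours. How far does it travel?"

def generate_item(rng):
    """Instantiate one item from the template; compute its key and distractors."""
    speed = rng.choice(range(40, 121, 10))
    hours = rng.choice(range(2, 6))
    key = speed * hours
    # Rule-based distractors modeling common arithmetic slips.
    distractors = {speed + hours, key + speed, key - hours}
    distractors.discard(key)
    options = sorted(distractors)[:3] + [key]
    rng.shuffle(options)
    return {"stem": TEMPLATE.format(speed=speed, hours=hours),
            "options": options,
            "key": key}

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        item = generate_item(rng)
        print(item["stem"], item["options"], "answer:", item["key"])

Each run of the generator yields a structurally parallel item, which is the sense in which AIG can expand an item pool more economically once the template has been written and validated.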

Michael G. Jodoin, UMass-Amherst: A Comparison of Linear-on-the-Fly and Multistage Test Designs in Making Credentialing Decisions
Abstract: The main purpose of certification and licensure examinations is to make criterion-referenced decisions about examinees, resulting in the granting or withholding of a credential. As such, these examinations are high-stakes and must ensure that examinees are consistently classified. Although computer adaptive tests provide a theoretically superior method for accurately assessing ability across the full range of examinees, they are expensive to develop and carry well-documented security risks unless item banks are quite large. Alternative designs, such as linear-on-the-fly and multistage test designs, can reduce the complexity of the test development process and may offer highly desirable properties when tests are built to measure best at the pass-fail decision point. Using data from a large-scale, high-stakes certification examination, this paper examines differences in decision consistency between linear-on-the-fly and multistage test designs for a variety of models. It also considers the practical consequences of reduced test length and the gains that may be expected from general improvements to an item bank.
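Decision consistency here refers to the proportion of examinees who would receive the same pass/fail decision on two parallel administrations. The sketch below estimates that quantity by simulation under a simple Rasch model; the test length, item difficulties, and cut score are hypothetical values chosen only to illustrate the computation, not the operational design studied in the paper.

# Minimal sketch: estimate decision consistency as the agreement rate of
# pass/fail decisions across two simulated parallel forms (Rasch model).
# Test length, difficulties, and the cut score below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n_examinees, n_items = 5000, 60
cut_score = 36                                     # hypothetical raw-score standard

theta = rng.normal(0.0, 1.0, n_examinees)          # true abilities
b = rng.normal(0.0, 1.0, n_items)                  # item difficulties

def simulate_raw_scores(theta, b, rng):
    """Raw scores on one form under the Rasch model."""
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
    responses = rng.random((theta.size, b.size)) < p
    return responses.sum(axis=1)

pass1 = simulate_raw_scores(theta, b, rng) >= cut_score
pass2 = simulate_raw_scores(theta, b, rng) >= cut_score    # parallel form
print(f"Estimated decision consistency: {np.mean(pass1 == pass2):.3f}")

Designs that concentrate measurement precision near the cut score tend to raise this agreement rate even when overall test length is reduced, which is the trade-off the paper investigates.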

Lisa A. Keller and H. Swaminathan, UMass-Amherst; Tim Davey, ETS: Using Collateral Information in IRT Parameter Estimation
Abstract: In computer adaptive testing, the accurate estimation of item parameters is critical, since the sequential administration of items depends on these parameter values. Typically, experimental items are administered alongside operational test items in order to estimate their IRT parameters. However, test security concerns make the large samples that are highly desirable difficult, if not impossible, to obtain. In such situations, auxiliary information available on the items may be used to predict item parameters; these predicted values, in turn, can serve as prior information in the estimation process. Procedures for incorporating such auxiliary information are described in this paper. Both simulation studies and analyses of real data have shown that using collateral information in this manner greatly improves the accuracy of item parameter estimates over traditional marginal maximum likelihood or Bayesian procedures.
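The general idea can be illustrated with a small sketch: predict a new item's difficulty from auxiliary features of previously calibrated items, then use that prediction as the mean of a prior when estimating the difficulty from a small response sample. The features, regression step, prior variance, and sample sizes below are assumptions made for illustration and do not reproduce the estimation procedures reported in the paper.

# Minimal sketch of collateral information as a prior in Rasch difficulty
# estimation. All features, sample sizes, and the prior SD are hypothetical.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)

# Step 1: regress known difficulties on auxiliary item features
# (e.g., word count, content category) for previously calibrated items.
X_old = rng.normal(size=(200, 2))
b_old = 0.8 * X_old[:, 0] - 0.4 * X_old[:, 1] + rng.normal(0, 0.3, 200)
design = np.column_stack([np.ones(200), X_old])
beta, *_ = np.linalg.lstsq(design, b_old, rcond=None)

x_new = np.array([1.0, 0.5, -1.0])                 # intercept + features of a new item
prior_mean = x_new @ beta                          # predicted difficulty = prior mean
prior_sd = 0.5                                     # assumed prior uncertainty

# Step 2: a small calibration sample for the new item (security limits sample size).
true_b = 0.7
theta = rng.normal(0, 1, 150)                      # examinee abilities, treated as known
responses = rng.random(150) < 1 / (1 + np.exp(-(theta - true_b)))

def neg_log_posterior(b):
    """Negative log-posterior: Rasch likelihood plus normal prior from Step 1."""
    p = 1 / (1 + np.exp(-(theta - b)))
    loglik = np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
    logprior = -0.5 * ((b - prior_mean) / prior_sd) ** 2
    return -(loglik + logprior)

b_map = minimize_scalar(neg_log_posterior, bounds=(-4, 4), method="bounded").x
print(f"Prior mean from collateral info: {prior_mean:.2f}, MAP estimate: {b_map:.2f}")

With small samples, the prior pulls the estimate toward the value predicted from the collateral information, which is how such procedures can outperform estimation that ignores the auxiliary data.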

April L. Zenisky, UMass-Amherst: Technology and Test Items in Computer-Based Assessment: Enhancing Measurement through Format and Design
Abstract: In computer-based testing (CBT), many technologies are integrated at once to define the format of the test items. First, there are many ways in which the item stem can be structured with regard to the nature of the task and how it is presented to examinees. Second, CBT assessments can be designed so that examinees have access to online reference materials such as calculators and spell-checkers. Finally, technology shapes the ways in which examinees can respond: selecting the correct response, reordering data, manipulating onscreen information, or constructing an original answer, each of which can be implemented in different ways to ensure that the test appropriately measures the construct of interest. The purpose of this presentation is to provide an overview of recent developments in assessment tasks, with particular emphasis on the current state of research on emerging item types, and to identify several areas for additional research.



Copyright 2003 Research and Evaluation Methods Program, University of Massachusetts Amherst,
152 Hills South, Amherst, Massachusetts, 01003. (413) 545-0262.
