SIBTEST implements a nonparametric method for estimating and testing differential item functioning (DIF) in one or more items and/or differential bundle functioning (DBF) in one or more bundles of items. The method is based on Shealy and Stout’s (1993) multidimensional IRT model for DIF, which was further developed by Roussos and Stout (1996). The model assumes that examinees matched on the latent ability the test is designed to measure may still differ in expected performance on an item or item bundle because of construct-irrelevant sources of score variation; such differences constitute DIF. Because the latent target ability cannot be observed directly, the matching is done approximately, using either the total test score or a user-specified subscore believed to measure the target ability validly across the examinee subpopulations under study. A common and sound choice of matching score is the total score with any items suspected of producing DIF removed. Through a flexible, user-friendly front-end program, the user specifies the particular DIF/DBF hypothesis tests to be performed, including:
- which item(s) or item bundle(s) will be tested for DIF/DBF,
- the alternative hypothesis to be tested: either a one-sided hypothesis of DIF/DBF against the reference group, a one-sided hypothesis of DIF/DBF against the focal group, or a two-sided hypothesis of DIF/DBF against either group, and
- which items will be used to construct the examinee matching score.
SIBTEST matches examinees using a nonlinear regression correction procedure (Jiang & Stout, 1998), which has been shown to control false-positive flagging of non-DIF items better than the original linear regression correction of Shealy and Stout (1993).
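The matching idea and the three alternative hypotheses described above can be illustrated with a rough Python sketch. It is not the SIBTEST program itself and it deliberately omits the regression correction: it only stratifies examinees on the matching score, takes a weighted average of the reference-minus-focal differences in mean studied score, and refers the standardized result to a normal distribution. The function name `sib_uncorrected`, its arguments, the pooled stratum weights, the rule of dropping strata with fewer than two examinees per group, and the sign convention (positive values read as DIF/DBF against the focal group) are all assumptions made for this illustration.

```python
import math
from collections import defaultdict
from statistics import NormalDist, mean, variance


def sib_uncorrected(studied, matching, group, alternative="two-sided"):
    """Illustrative, uncorrected SIB-style statistic (hypothetical helper).

    studied  : studied item or bundle score for each examinee
    matching : matching-subtest score for each examinee
    group    : "R" (reference) or "F" (focal) for each examinee
    Returns (beta, z, p_value).
    """
    # Group the studied scores by matching score and by examinee group.
    strata = defaultdict(lambda: {"R": [], "F": []})
    for y, x, g in zip(studied, matching, group):
        strata[x][g].append(y)

    # Assumption: keep only strata with at least two examinees per group,
    # so that a within-stratum variance can be estimated.
    usable = [s for s in strata.values()
              if len(s["R"]) >= 2 and len(s["F"]) >= 2]
    if not usable:
        raise ValueError("no matching-score stratum has enough examinees")
    n_total = sum(len(s["R"]) + len(s["F"]) for s in usable)

    beta, var = 0.0, 0.0
    for s in usable:
        # Assumption: weight each stratum by its pooled share of examinees.
        w = (len(s["R"]) + len(s["F"])) / n_total
        beta += w * (mean(s["R"]) - mean(s["F"]))
        var += w ** 2 * (variance(s["R"]) / len(s["R"])
                         + variance(s["F"]) / len(s["F"]))

    if var == 0:
        return beta, math.nan, math.nan  # statistic undefined without variation

    z = beta / math.sqrt(var)
    cdf = NormalDist().cdf
    if alternative == "against_focal":        # one-sided: beta > 0
        p = 1 - cdf(z)
    elif alternative == "against_reference":  # one-sided: beta < 0
        p = cdf(z)
    elif alternative == "two-sided":
        p = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("unknown alternative: " + alternative)
    return beta, z, p
```

Because matching on an observed score is only approximate, an uncorrected statistic of this kind can flag non-DIF items when the two groups differ in target ability; that is the problem the regression correction is meant to address.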