Difference in method of administration did not significantly impact item response: an IRT-based analysis from the Patient-Reported Outcomes Measurement Information System (PROMIS) initiative

Quality of Life Research

Abstract

Purpose

To test the impact of method of administration (MOA) on the measurement characteristics of items developed in the Patient-Reported Outcomes Measurement Information System (PROMIS).

Methods

Two non-overlapping parallel 8-item forms from each of three PROMIS domains (physical function, fatigue, and depression) were completed by 923 adults (age 18–89) with chronic obstructive pulmonary disease, depression, or rheumatoid arthritis. In a randomized cross-over design, subjects answered one form by interactive voice response (IVR) technology, paper questionnaire (PQ), personal digital assistant (PDA), or personal computer (PC) on the Internet, and a second form by PC, in the same administration. Structural invariance, equivalence of item responses, and measurement precision were evaluated using confirmatory factor analysis and item response theory methods.

Results

Multigroup confirmatory factor analysis supported equivalence of factor structure across MOA. Analyses by item response theory found no differences in item location parameters and strongly supported the equivalence of scores across MOA.

Conclusions

We found no statistically or clinically significant differences in score levels in IVR, PQ, or PDA administration as compared to PC. Availability of large item response theory-calibrated PROMIS item banks allowed for innovations in study design and analysis.

Abbreviations

CAT: Computerized adaptive testing

COPD: Chronic obstructive pulmonary disease

DEP: Depression

FAT: Fatigue

IRT: Item response theory

IVR: Interactive voice response

MOA: Method of administration

PC: Personal computer

PDA: Personal digital assistant

PF: Physical functioning

PQ: Paper questionnaire

NLMIXED: SAS procedure for estimating mixed models

PRO: Patient-reported outcomes

PROMIS: Patient-Reported Outcomes Measurement Information System

WLSMV: Weighted least squares with mean and variance adjustment

Acknowledgments

The Patient-Reported Outcomes Measurement Information System (PROMIS) is a National Institutes of Health (NIH) Roadmap initiative to develop a computerized system measuring patient-reported outcomes in respondents with a wide range of chronic diseases and demographic characteristics. PROMIS was funded by cooperative agreements to a Statistical Coordinating Center (Northwestern University, PI: David Cella, PhD, U01AR52177) and six Primary Research Sites (Duke University, PI: Kevin Weinfurt, PhD, U01AR52186; University of North Carolina, PI: Darren DeWalt, MD, MPH, U01AR52181; University of Pittsburgh, PI: Paul A. Pilkonis, PhD, U01AR52155; Stanford University, PI: James Fries, MD, U01AR52158; Stony Brook University, PI: Arthur Stone, PhD, U01AR52170; and University of Washington, PI: Dagmar Amtmann, PhD, U01AR52171). NIH Science Officers on this project are Deborah Ader, PhD, Susan Czajkowski, PhD, Lawrence Fine, MD, DrPH, Louis Quatrano, PhD, Bryce Reeve, PhD, William Riley, PhD, and Susana Serrate-Sztein, PhD. This manuscript was reviewed by the PROMIS Publications Subcommittee prior to external peer review. The authors would like to thank two anonymous PROMIS reviewers and two journal reviewers for comments on a previous version of this manuscript. See the web site at www.nihpromis.org for additional information on the PROMIS cooperative group.

Author information

Corresponding author

Correspondence to Jakob B. Bjorner.

Appendix

The standard graded response IRT model can be formulated:

$$ \log \left( {\frac{{P(x_{ji} \ge c)}}{{P(x_{ji} < c)}}} \right) = \alpha_{i} \;(\theta_{j} - (\lambda_{i} - \tau_{ic} )) $$

where $\theta_{j}$ is the latent health of person $j$ (here: physical functioning, fatigue, or depression), $\alpha_{i}$ is the discrimination parameter for item $i$, $\lambda_{i}$ is the location parameter for item $i$, and $\tau_{ic}$ is the item category parameter for response category $c$. An extended graded response model can be formulated in the following way:

$$ \log \left( {\frac{{P (x_{jiopqa} \ge c )}}{{P (x_{jiopqa} < c )}}} \right) = (\alpha_{i} + \alpha_{o} + \alpha_{p} + \alpha_{q} )\;(\theta_{j} - (\lambda_{i} + \lambda_{o} + \lambda_{p} + \lambda_{q} - \tau_{ic} )) $$

where $\alpha_{o}$ and $\lambda_{o}$ represent the potential effect of item order (being administered in the second part of the form as opposed to the first) on the item discrimination and location parameters, $\alpha_{p}$ and $\lambda_{p}$ represent the potential effect of IVR phone administration (as opposed to Internet administration), and $\alpha_{q}$ and $\lambda_{q}$ represent the potential effect of paper-and-pencil questionnaire administration (as opposed to Internet administration).
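To make the two formulas concrete, the following is a minimal Python sketch of how response category probabilities could be computed from them. It is an illustration under stated assumptions, not the authors' code: the function name, the dictionary keys, and all parameter values are hypothetical, and the effect terms are assumed to enter through 0/1 indicators so that each one applies only under its own condition (second part of the form, IVR, or paper). With every indicator off, the function reduces to the standard graded response model.

```python
import math

def extended_category_probs(theta, alpha_i, lam_i, taus, effects=None,
                            late=False, ivr=False, paper=False):
    """Category probabilities under the extended graded response model.

    taus[c - 1] is the category parameter for the boundary x >= c, so an
    item with C boundaries has response categories 0..C.
    effects maps hypothetical keys ("alpha_o", "lam_o", "alpha_p", "lam_p",
    "alpha_q", "lam_q") to values; each applies only when its indicator is on.
    """
    effects = effects or {}
    active = {"o": late, "p": ivr, "q": paper}
    # Add the order and administration effects that are switched on.
    alpha = alpha_i + sum(effects.get("alpha_" + k, 0.0)
                          for k, on in active.items() if on)
    lam = lam_i + sum(effects.get("lam_" + k, 0.0)
                      for k, on in active.items() if on)
    # Cumulative probabilities P(x >= c); P(x >= 0) = 1 and P(x >= C + 1) = 0.
    cum = [1.0]
    for tau in taus:
        cum.append(1.0 / (1.0 + math.exp(-alpha * (theta - (lam - tau)))))
    cum.append(0.0)
    # Adjacent differences give the probability of each single category.
    return [cum[c] - cum[c + 1] for c in range(len(cum) - 1)]

# Hypothetical item and effect size, chosen only for illustration.
item = dict(alpha_i=2.0, lam_i=0.0, taus=[1.0, 0.0, -1.0])
effects = {"lam_p": 0.2}
internet = extended_category_probs(0.0, **item, effects=effects)
by_phone = extended_category_probs(0.0, **item, effects=effects, ivr=True)
print([round(p, 3) for p in internet])
print([round(p, 3) for p in by_phone])
```

In this sketch a positive location effect for IVR raises every threshold, so the lowest category becomes more likely under phone administration at the same latent health. An effect of zero would leave the two probability vectors identical, which is the pattern the equivalence hypothesis predicts.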

The model was estimated using SAS PROC NLMIXED. The item parameters $\alpha_{i}$, $\lambda_{i}$, and $\tau_{ic}$ were initially treated as known constants and fixed to the values estimated in the PROMIS item bank development calibrations. In additional analyses, $\alpha_{i}$, $\lambda_{i}$, and $\tau_{ic}$ were estimated for each item using the current sample. The mean and standard deviation of $\theta$ were estimated separately for each diagnostic group.
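As a rough illustration of what estimation with fixed item parameters involves, the sketch below computes the marginal likelihood of one response pattern by integrating the latent variable out over a normal distribution with a given group mean and standard deviation. It is a simplified stand-in, not the procedure's own integration routine: the fixed evenly spaced grid, the function names, and all parameter values are assumptions made for brevity, and only a single common location shift is shown.

```python
import math
from itertools import product

def category_probs(theta, alpha, lam, taus):
    """Graded response category probabilities for one item (fixed parameters)."""
    cum = [1.0]
    for tau in taus:
        cum.append(1.0 / (1.0 + math.exp(-alpha * (theta - (lam - tau)))))
    cum.append(0.0)
    return [cum[c] - cum[c + 1] for c in range(len(cum) - 1)]

def quadrature_grid(mean, sd, n_points=61, width=5.0):
    """Evenly spaced theta grid with normalized normal-density weights."""
    step = 2.0 * width / (n_points - 1)
    nodes = [mean + sd * (-width + k * step) for k in range(n_points)]
    dens = [math.exp(-0.5 * ((t - mean) / sd) ** 2) for t in nodes]
    total = sum(dens)
    return nodes, [d / total for d in dens]

def marginal_likelihood(responses, items, mean, sd, lam_shift=0.0):
    """Likelihood of one response pattern with theta integrated out.

    items: list of (alpha, lam, taus) tuples held fixed at known values.
    lam_shift: a common location shift, e.g. a hypothetical MOA effect.
    """
    nodes, weights = quadrature_grid(mean, sd)
    total = 0.0
    for theta, w in zip(nodes, weights):
        like = 1.0
        for x, (alpha, lam, taus) in zip(responses, items):
            like *= category_probs(theta, alpha, lam + lam_shift, taus)[x]
        total += w * like
    return total

# Hypothetical two-item form; values are illustrative, not PROMIS calibrations.
items = [(2.0, -0.5, [1.0, -1.0]), (1.5, 0.5, [0.8, -0.8])]
patterns = list(product(range(3), repeat=2))
total_mass = sum(marginal_likelihood(p, items, mean=0.0, sd=1.0)
                 for p in patterns)
print(round(total_mass, 6))
```

Summing the log of this quantity over respondents gives a log-likelihood that an optimizer could maximize over the administration effects and the group means and standard deviations while the item parameters stay fixed. The check that the likelihoods of all possible response patterns sum to one is a quick sanity test of the quadrature weights.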

About this article

Cite this article

Bjorner, J.B., Rose, M., Gandek, B. et al. Difference in method of administration did not significantly impact item response: an IRT-based analysis from the Patient-Reported Outcomes Measurement Information System (PROMIS) initiative. Qual Life Res 23, 217–227 (2014). https://doi.org/10.1007/s11136-013-0451-4

  • DOI: https://doi.org/10.1007/s11136-013-0451-4
