Improved responsiveness and reduced sample size requirements of PROMIS physical function scales with item response theory

Arthritis Res Ther. 2011;13(5):R147. doi: 10.1186/ar3461. Epub 2011 Sep 14.

Abstract

Introduction: The Health Assessment Questionnaire Disability Index (HAQ) and the SF-36 PF-10, among other instruments, yield sensitive and valid Disability (Physical Function) endpoints. Modern techniques, such as Item Response Theory (IRT), now enable development of more precise instruments using improved items. The NIH Patient Reported Outcomes Measurement Information System (PROMIS) is charged with developing improved IRT-based tools. We compared the ability to detect change in physical function using original (Legacy) instruments with Item-Improved and PROMIS IRT-based instruments.

Methods: We studied two Legacy (original) Physical Function/Disability instruments (HAQ, PF-10), their item-improved derivatives (Item-Improved HAQ and PF-10), and the IRT-based PROMIS Physical Function 10- (PROMIS PF 10) and 20-item (PROMIS PF 20) instruments. We compared sensitivity to detect 12-month changes in physical function in 451 rheumatoid arthritis (RA) patients and assessed relative responsiveness using P-values, effect sizes (ES), and sample size requirements.

Results: The study sample was 81% female, 87% Caucasian, 65 years of age, had 14 years of education, and had moderate baseline disability. All instruments were sensitive to detecting change (P < 0.05) in physical function over one year. The most responsive instruments in these patients were the Item-Improved HAQ and the PROMIS PF 20. IRT-improved instruments could detect a 1.2% difference with 80% power, while reference instruments could detect only a 2.3% difference (P < 0.01). The best IRT-based instruments required only about one-quarter of the sample size of the Legacy (PF-10) comparator (95 versus 427). The HAQ outperformed the PF-10 in more impaired populations; the reverse was true in more normal populations. Considering especially the range of severity measured, the PROMIS PF 20 appears the most responsive instrument.
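
The link between responsiveness and sample size in these results can be illustrated with a standard normal-approximation formula, in which the required sample size scales with the inverse square of the standardized effect size. The following is a minimal sketch, not the authors' method: the function name `required_n`, the two-sided alpha of 0.05, and the one-sample (paired change) z-test are assumptions made for illustration. Only the sample sizes 95 and 427 come from the abstract.

```python
from statistics import NormalDist


def required_n(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size for a two-sided one-sample z-test.

    Hypothetical helper (not from the paper): assumes the outcome is a
    within-patient change score and effect_size = mean change / SD of change.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return ((z_alpha + z_beta) / effect_size) ** 2


# Sample sizes reported in the abstract.
n_legacy, n_irt = 427, 95

# Fraction of the Legacy sample size needed by the best IRT-based instrument.
ratio = n_irt / n_legacy  # about 0.22

# Because n scales with 1 / effect_size**2 under this approximation,
# the implied ratio of effect sizes is the square root of the n ratio.
implied_es_ratio = (n_legacy / n_irt) ** 0.5  # about 2.12

print(f"sample size ratio: {ratio:.2f}")
print(f"implied effect size ratio: {implied_es_ratio:.2f}")
```

Under this approximation, the roughly 4.5-fold reduction in sample size corresponds to an effect size about 2.1 times larger for the more responsive instrument. Doubling the effect size always cuts the required sample size to one-quarter.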

Conclusions: Physical Function scales using item-improved or IRT-based items can result in greater responsiveness and precision across a broader range of physical function. This can reduce sample size requirements and thus study costs.

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Aged
  • Arthritis, Rheumatoid / physiopathology*
  • Arthritis, Rheumatoid / psychology
  • Arthritis, Rheumatoid / therapy*
  • Disability Evaluation*
  • Female
  • Humans
  • Male
  • Recovery of Function / physiology*
  • Sample Size
  • Surveys and Questionnaires / standards*