Abstract
Objective. Description of use and metric properties of instruments measuring pain, physical function, or patient’s global assessment (PtGA) in hand osteoarthritis (OA).
Methods. Medical literature databases up to January 2014 were systematically reviewed for studies reporting on instruments measuring pain, physical function, or PtGA in hand OA. The frequency of use of these instruments was described, as were their metric properties, including discrimination (reliability, sensitivity to change), feasibility, and validity.
Results. In 66 included studies, various questionnaires and performance- or assessor-based instruments were applied for evaluation of pain, physical function, or PtGA. No major differences regarding metric properties were observed between the instruments, although the amount of supporting evidence varied. The most frequently evaluated questionnaires were the Australian/Canadian Hand OA Index (AUSCAN) pain subscale and visual analog scale (VAS) pain for pain assessment, and the AUSCAN function subscale and Functional Index for Hand OA (FIHOA) for physical function assessment. Excellent reliability was shown for the AUSCAN and FIHOA, and good sensitivity to change for all mentioned instruments; additionally, the FIHOA had good feasibility. Good construct validity was suggested for all mentioned questionnaires. The most commonly applied performance- or assessor-based instruments were the grip and pinch strength for the assessment of physical function, and the assessment of pain by palpation. For these measures, good sensitivity to change and construct validity were established.
Conclusion. The AUSCAN, FIHOA, VAS pain, grip and pinch strength, and pain on palpation were most frequently used and provided most supporting evidence for good metric properties. More research has to be performed to compare the different instruments with each other.
Hand osteoarthritis (OA) is a highly prevalent disorder, characterized by bony enlargements and deformities1,2,3. Most studies on individuals with OA are based on the general population. Individuals with hand OA can experience symptoms such as pain, decreased grip strength, and disability, leading to a high clinical burden4,5,6. In clinical practice, treatment for patients with hand OA (individuals with hand OA seeking healthcare) is administered to decrease symptoms and improve function; however, the evidence to support these treatments is limited because few high-quality clinical trials have been performed in hand OA7,8.
An important factor in the lack of high-quality clinical trials in hand OA is the lack of standardization of outcome measures8. Therefore, the Outcome Measures in Rheumatology (OMERACT) and the Osteoarthritis Research Society International Task Force on Clinical Trials Guidelines defined core domains to describe outcomes in clinical trials on symptom modification, consisting of pain, physical function, and patient’s global assessment (PtGA)9,10,11,12.
For the assessment of these domains, several patient-reported outcome measures are available. Hand OA-specific questionnaires such as the Functional Index for Hand OA (FIHOA) and the Australian/Canadian Hand OA Index (AUSCAN)13,14 have been developed, and hand disorder- or arthritis-specific questionnaires such as the Michigan Hand Outcomes Questionnaire (MHQ), Arthritis Impact Measurement Scale-2 (AIMS-2), and Health Assessment Questionnaire (HAQ) are also used to assess 1 or more of these domains15,16,17. Physical function can also be assessed using performance-based measures such as grip or pinch strength or the Arthritis Hand Function Test (AHFT). Further, assessor-based measures such as joint tenderness upon palpation are used for the assessment of pain18,19. Besides these questionnaires and assessor- or performance-based measures, several other instruments, which are described in this manuscript, are used for the clinical assessment of hand OA. Although most available instruments have been shown to be reliable for the measurement of pain, physical function, or PtGA, a systematic comparison of the different instruments for the assessment of hand OA has not been performed.
Our study was conducted in the framework of the OMERACT hand OA working group, aiming to identify instruments for the measurement of pain, physical function, and PtGA in hand OA that can be recommended for use in clinical trials on OA. Therefore, insight into available instruments and their metric properties is needed. To this end, we performed a systematic literature review aiming to describe the frequency of use of available instruments measuring pain, physical function, or PtGA in studies on hand OA, and to describe the metric properties of these instruments20. Metric properties were described using the OMERACT filter21, focusing on the aspects of discrimination (reliability and sensitivity to change), feasibility, and truth (validity).
MATERIALS AND METHODS
Study design and identification of studies
The study design and performance followed the Preferred Reporting Items for Systematic reviews and Meta-Analyses guidelines20. In cooperation with a medical librarian (JWS), a systematic literature search was performed to obtain all manuscripts reporting on instruments measuring pain, physical function, or PtGA in hand OA. Medical literature databases (PubMed, Embase, Web of Science, COCHRANE, CINAHL, Academic Search Premier, and ScienceDirect) were searched from the date of their inception up to January 2014, using all variations of the following key words: “hand,” “osteoarthritis,” “outcome assessment,” “reliability,” “sensitive,” “feasibility,” and “validity” (Supplementary Data available from the authors on request).
Inclusion and exclusion criteria
First, all retrieved titles were screened, subsequently selected abstracts were reviewed, and finally full-text articles of the remaining references were read by 1 reviewer (AWV). A random sample of 200 titles (9% of the titles identified by literature search) was also reviewed by a second reviewer (MK). Because of the similar selection of titles, further extraction was done by a single reviewer, but in case of uncertainties, these were discussed and solved by consensus.
Studies reporting on the metric properties of the instruments assessing pain, physical function, and PtGA in hand OA were included. The metric properties of the studied instruments were described according to 4 items: reliability, sensitivity to change, feasibility, and validity. Inclusion criteria differed per item:
Reliability was described based on studies evaluating the reliability of 1 or more instruments performed more than once in the same group of patients, either by the same performer over time or by different performers during 1 study visit. Both cross-sectional and longitudinal studies were included.
Sensitivity to change was described based on longitudinal studies evaluating change of pain, physical function, or PtGA in hand OA measured by 1 or more instruments.
Feasibility was described based on studies evaluating this item of 1 or more instruments.
Validity was described based on studies comparing different instruments assessing pain, physical function, or PtGA in the same patients. Again, both cross-sectional and longitudinal studies were included.
Studies that fulfilled the requirements for at least 1 of these 4 items were included in our review. To be able to generalize the description of the metric properties of the applied instruments to different populations, evaluation by only 1 study was considered as insufficient evidence to draw conclusions. Therefore, only instruments that were assessed by at least 2 studies were included in the description of metric properties.
Studies reporting on surgical interventions, on fewer than 25 patients with hand OA, or on diseases other than hand OA were excluded, as were animal studies, reviews, abstracts, letters to the editor, and studies in languages other than English. Because of the published systematic literature review on outcome measures in trapeziometacarpal OA by Marks, et al22, studies reporting only on trapeziometacarpal OA were also excluded.
Data extraction
A self-made standardized form was used to extract information on the following data: (1) study population (population size, setting, age, sex), (2) instruments and assessed domains, (3) study design and followup duration, (4) results concerning measures of reliability [intraclass correlation coefficient (ICC), κ value, percentage of agreement, smallest detectable difference (SDD)], sensitivity to change (percentage of change, amount of change, standardized response mean), feasibility (time needed to perform outcome measure), and validity (correlation, association, and measures of agreement between different instruments assessing the same domain). From 6 random studies, data were also extracted by MK, resulting in similar extracted data. All extracted results were discussed by both reviewers to avoid missing information.
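Two of the extracted measures, the standardized response mean (SRM) and the smallest detectable difference (SDD), can be illustrated with a minimal Python sketch. The formulas below are commonly used definitions and are an assumption here, because the included studies may have calculated these measures differently. The function names are illustrative, and all scores, the SD, and the ICC are hypothetical values, not data from this review.

```python
import math
import statistics


def standardized_response_mean(baseline, followup):
    """SRM: mean of the individual changes divided by the SD of those changes."""
    changes = [after - before for before, after in zip(baseline, followup)]
    return statistics.mean(changes) / statistics.stdev(changes)


def smallest_detectable_difference(sd, icc):
    """SDD = 1.96 * sqrt(2) * SEM, with SEM = SD * sqrt(1 - ICC) (assumed definition)."""
    sem = sd * math.sqrt(1.0 - icc)
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical VAS pain scores (0-100 mm) for five patients; not data from the review.
baseline = [60, 55, 70, 45, 65]
followup = [50, 50, 55, 45, 50]

srm = standardized_response_mean(baseline, followup)
sdd = smallest_detectable_difference(sd=20.0, icc=0.91)

print(f"SRM = {srm:.2f}")
print(f"SDD = {sdd:.1f} mm")
```

Under these definitions, a larger absolute SRM indicates greater sensitivity to change, and a change smaller than the SDD cannot be distinguished from measurement error in an individual patient.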
Statistical analyses
Because of the heterogeneity of the studies with respect to the evaluated instruments, it was not possible to perform a metaanalysis. Therefore, we performed a descriptive review.
RESULTS
Literature flow
In total, 4351 titles were identified and 2244 unique references were left for screening after removing duplicate references (Figure 1). During the screening, 2008 references could be removed based on title. After reviewing 236 abstracts and 92 full-text articles, 66 studies satisfied the inclusion criteria (Table 1)13,18,19,23–85.
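The literature flow can be checked with simple arithmetic. The sketch below uses only the counts reported in the text. The per-stage exclusions are derived values that assume the stages were strictly sequential (titles, then abstracts, then full texts), as described under the inclusion and exclusion criteria; the variable names are illustrative.

```python
# Counts reported in the text.
identified = 4351
unique_refs = 2244
removed_on_title = 2008
full_text_read = 92
included = 66
second_reviewer_sample = 200

# Derived values, assuming strictly sequential screening stages.
duplicates_removed = identified - unique_refs
abstracts_reviewed = unique_refs - removed_on_title
excluded_on_abstract = abstracts_reviewed - full_text_read
excluded_on_full_text = full_text_read - included
second_reviewer_share = second_reviewer_sample / unique_refs

print(f"Duplicates removed:        {duplicates_removed}")
print(f"Abstracts reviewed:        {abstracts_reviewed}")
print(f"Excluded on abstract:      {excluded_on_abstract}")
print(f"Excluded on full text:     {excluded_on_full_text}")
print(f"Titles seen by 2nd reader: {second_reviewer_share:.0%}")
```

The derived number of abstracts (2244 − 2008 = 236) matches the number reported, and 200 of 2244 titles corresponds to the 9% sample screened by the second reviewer.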
Clinical outcome measures
The instruments used for the assessment of the OMERACT core domains pain, physical function, and PtGA in the 66 identified studies are specified in Table 2 (references 13–18 and 86–102). Different instruments were applied, consisting of 12 questionnaires, 1 interview, and a number of rating scales [visual analog scale (VAS), numeric rating scale (NRS), or Likert]. Further, 9 different performance- or assessor-based measures were applied for the assessment of physical function; pain was assessed by palpation, using the number of painful or tender joints, the Doyle index, or the Ritchie articular index.
The AUSCAN was most frequently applied (n = 34), followed by the VAS pain (n = 30), FIHOA (n = 17), VAS global (n = 16), and HAQ (n = 12). The AIMS-2 was applied in 5 studies, the Cochin scale and Score for Assessment and Quantification of Chronic Rheumatoid Affections of the Hands (SACRAH) in 4 studies, the Canadian Occupational Performance Measure (COPM) in 3 studies, and the Arthritis Self Efficacy Scale (ASES) in 2 studies. The Measure of Activity Performance (MAP-hand), MHQ, Older Americans’ Resources and Services Multidimensional Functional Assessment Questionnaire, Patient-Rated Wrist/Hand Evaluation (PRWHE), and Revel functional index were each used in only 1 study.
Of the performance- or assessor-based measures, grip strength was applied most frequently (n = 35), followed by pain or tenderness on palpation (n = 21). Other applied performance- or assessor-based measures were pinch strength (n = 17), the grip ability test (GAT; n = 4), Moberg Pick-Up Test (MPUT; n = 3), AHFT (n = 2), evaluation of dexterity (n = 3), button test (n = 1), Hand Mobility in Scleroderma Test (HAMIS; n = 1), Hand Functional Index (HFI; n = 1), and Jebsen-Taylor Hand Function Test (JTHFT; n = 1).
Study characteristics
The characteristics of the 66 included studies are described in Table 1. The source populations were predominantly secondary care (n = 41), in addition to primary care (n = 6), population-based (n = 6), and familial OA studies (n = 5). All studies included more women than men, and the mean age was > 50 years in almost all studies. Different study designs were included: 26 observational studies, 35 randomized controlled trials (RCT), and 4 intervention studies.
Of the included studies, 25 were primarily aimed at the evaluation of metric properties of 1 or more instruments measuring pain, physical function, or PtGA13,18,19,23–44. The remaining studies applied these instruments to evaluate the effect of a treatment or intervention (n = 37)45–81, or to evaluate disease course over time (n = 4)82,83,84,85.
Metric properties of clinical outcome measures (discrimination: reliability)
Only 11 studies provided data on measures of reliability, including 7 instruments13,19,25,27,30,34,35,36,37,43,44. The FIHOA and AUSCAN were most frequently evaluated (Table 3). The AHFT and GAT were evaluated in only 1 study each18,35. The reported measures of reliability of instruments that were assessed in at least 2 studies are listed in Table 3.
In general, all evaluated instruments showed good measures of reliability. Three studies evaluated 2 questionnaires for the assessment of physical function, enabling direct comparison of these measures30,34,37. Haugen, et al reported excellent reliability for both the AUSCAN function subscale and FIHOA30. Moe, et al reported the same, in addition to comparable SDD for both questionnaires34. Poole, et al evaluated the FIHOA, in addition to the Cochin scale, reporting the highest ICC for the Cochin scale37.
Performance- or assessor-based measures were assessed less frequently, but showed good measures of reliability.
Only 2 instruments (AUSCAN and FIHOA) were extensively tested, showing excellent measures of reliability for both questionnaires. Other instruments, while showing good measures of reliability, had only been tested in 1 or 2 studies. Therefore, only tentative conclusions can be drawn for these instruments.
Discrimination: Sensitivity to change
Of the 45 studies assessing change over time in pain, physical function, or PtGA25,26,29,36,42,45,47–85, 7 studies did not demonstrate any significant change (1 observational study, 6 RCT)62,69,75,78,79,80,81. Six studies observed a statistically significant change only in pain or PtGA (1 observational study, 5 RCT)29,50,54,60,61,77, and 5 studies observed change only in physical function (all RCT)45,47,59,65,76.
The studies that detected change in at least 1 instrument assessing the corresponding domain are summarized in Table 4. The results of these studies regarding measured change over time are described in the Supplementary Table (available from the authors on request).
Pain was most frequently assessed using the VAS or NRS, detecting change in 88% of these studies. Other applied instruments were the AUSCAN pain scale and pain/tenderness assessed on palpation, detecting change in 77% and 92% of the studies, respectively (Table 4)29,36,48,49,52,54,56,61,72,73,74,83,84. The ASES pain scale was applied in only 1 study and therefore not included in the table50.
Physical function was most frequently assessed by measured grip strength, detecting change in 75% of these studies. Other commonly applied instruments were the AUSCAN function scale (82% detecting change), FIHOA (67% detecting change), HAQ (50% detecting change), and pinch strength (57% detecting change). The Cochin scale and VAS or NRS were less frequently used (Table 4). The AIMS-2, COPM, dexterity, GAT, and MPUT were each assessed in only 1 study67,59,68,50,77.
PtGA was assessed using the VAS global, detecting change in 60% of these studies. The 40% that did not detect change over time did measure change in the AUSCAN function, COPM, or number of tender joints. A few studies assessed change in PtGA using the AUSCAN total (Table 4).
The VAS pain was by far the most frequently applied instrument for the assessment of change over time of pain in hand OA, followed by the AUSCAN pain subscale and pain on palpation. For the assessment of change of physical function, the AUSCAN function subscale, FIHOA, and grip strength assessment were commonly used. Change in PtGA was most frequently evaluated using the VAS global. The majority of studies that reported change in pain, physical function, or PtGA detected this change by all applied instruments assessing the corresponding domain, suggesting good sensitivity to change for all evaluated instruments.
Feasibility
The number of items of the different applied instruments is described in Table 2. Although most of these instruments are available in the public domain, payment is required for the use of the AUSCAN.
Only 4 of the included studies reported data on the time needed to apply the used instruments13,19,37,39. Two studies reported the completion time of a questionnaire: for completion of the modified SACRAH, a median of 95 s was measured (range 80–175 s)39, and for completion of the FIHOA, a mean of 165 s (SD 119 s, range 50–600) was measured in patients with painful OA whereas inactive OA patients needed on average 136 s (SD 97 s, range 20–240)13. The other 2 studies reported the time required to administer 1 or 2 assessor- or performance-based measures: for the Doyle index, a mean time of 5.1 min (range 2.4–7.8) was reported19, and the AHFT and HAMIS were reported to require 20–25 min and 5 min, respectively37.
Questionnaires took less time than assessor- or performance-based measures. The completion time of both assessed questionnaires was short, so both the FIHOA and the modified SACRAH were highly feasible.
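The comparison of administration times is easier to follow when all reported values are expressed in seconds. The sketch below uses only the times reported in the 4 studies above. It is a rough comparison, because the studies reported different summary statistics (median, mean, or a single value), and the dictionary names are illustrative.

```python
# Reported completion times of questionnaires, in seconds.
questionnaire_seconds = {
    "modified SACRAH (median)": 95,
    "FIHOA, painful OA (mean)": 165,
    "FIHOA, inactive OA (mean)": 136,
}

# Reported administration times of assessor- or performance-based measures,
# converted from minutes to seconds.
assessor_or_performance_seconds = {
    "Doyle index (mean 5.1 min)": round(5.1 * 60),
    "HAMIS (5 min)": 5 * 60,
    "AHFT (20 min, lower bound of 20-25 min)": 20 * 60,
}

slowest_questionnaire = max(questionnaire_seconds.values())
fastest_other = min(assessor_or_performance_seconds.values())

print(f"Slowest questionnaire:                {slowest_questionnaire} s")
print(f"Fastest assessor/performance measure: {fastest_other} s")
```

Even the slowest reported questionnaire average (165 s) is shorter than the fastest reported assessor- or performance-based measure (300 s).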
Validity
Eighteen studies correlated different instruments (mostly questionnaires), providing information on construct validity. The reported correlations between instruments assessing pain, physical function, or PtGA are presented in Table 5. Most of the studies (n = 16) reported cross-sectional correlations, whereas correlations or associations between assessed change over time were reported in only 3 studies23,28,46.
The AUSCAN, grip strength, and FIHOA scores were most frequently compared with other outcome measures (Table 5). Correlations of the ASES pain scale, COPM, and MAP-hand with other clinical outcome measures were evaluated in only 1 study28, as were the JTHFT41, Revel functional index36, PRWHE33, MHQ, HFI, and HAMIS37. These studies were therefore not included in Table 5.
Varying correlation coefficients were reported among the different studies. In general, correlations between different questionnaires were stronger than correlations of performance-based measures with other performance-based measures or with questionnaires. Correlations between different instruments assessing physical function ranged from 0.52 to 0.89 between questionnaires, from 0.05 to 0.67 between questionnaires and performance-based measures, and from 0.25 to 0.96 between performance-based measures. For the assessment of pain, correlations of 0.55–0.81 were observed between questionnaires, and of 0.47–0.65 between questionnaires and pain on palpation. Only a few correlation coefficients above 0.90 were observed, suggesting that different instruments detect different aspects of the assessed domain.
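How such a correlation between 2 instruments is obtained can be shown with a short sketch. The Pearson coefficient is used here as an assumption, because the included studies may have reported other coefficients (for example, Spearman). All scores are hypothetical and serve only to illustrate the calculation; the sketch also assumes that higher questionnaire scores indicate worse function, so a negative sign is expected against grip strength.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores of six patients; not data from the review.
questionnaire_a = [5, 12, 18, 9, 22, 15]      # higher = worse function (assumed)
questionnaire_b = [8, 14, 25, 10, 27, 16]     # higher = worse function (assumed)
grip_strength_kg = [28, 22, 20, 27, 12, 24]   # higher = better function

r_between_questionnaires = pearson_r(questionnaire_a, questionnaire_b)
r_questionnaire_vs_grip = pearson_r(questionnaire_a, grip_strength_kg)

print(f"Questionnaire A vs B:        r = {r_between_questionnaires:.2f}")
print(f"Questionnaire A vs grip (kg): r = {r_questionnaire_vs_grip:.2f}")
```

When instruments are scored in opposite directions, the absolute value of the coefficient is what is compared across studies.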
Two of the 3 studies associating change over time by different instruments presented correlation coefficients, which were in line with the results described above28,46. The third study calculated β coefficients for the association of change of the AUSCAN and grip and pinch strength with global assessment of change, adjusted for age, sex, number of osteoarthritic hand joints, and time between assessments. The strongest association with global assessment of change was observed for the AUSCAN23.
Construct validity of various instruments measuring pain, physical function, or PtGA has been assessed in multiple cross-sectional studies, but only few longitudinal data are available. Moderate to good correlations were observed, especially between questionnaires, suggesting good construct validity.
Table 6 summarizes the available information on metric properties per domain for the 6 most frequently applied instruments for the assessment of pain, physical function, and PtGA. Information on metric properties was considered established when supporting results were observed in at least 3 studies. The unavailability of the AUSCAN in the public domain was included as negative evidence regarding its feasibility.
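The rule used for Table 6 can be written as a simple threshold. In the sketch below, the threshold of 3 studies is taken from the text; the function name, the label "not established", and the example counts are hypothetical.

```python
ESTABLISHED_MIN_STUDIES = 3  # threshold stated in the text for Table 6


def evidence_status(n_supporting_studies: int) -> str:
    """Classify a metric property by the number of studies with supporting results."""
    if n_supporting_studies >= ESTABLISHED_MIN_STUDIES:
        return "established"
    return "not established"  # hypothetical label for everything below the threshold


# Hypothetical counts of supporting studies; not data from the review.
for n_studies in (1, 2, 3, 5):
    print(n_studies, evidence_status(n_studies))
```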
DISCUSSION
The most frequently applied and evaluated instruments for the assessment of pain were the AUSCAN pain subscale, VAS pain, and pain on palpation. The AUSCAN function subscale, FIHOA, and grip and pinch strength were most frequently applied and evaluated for the assessment of physical function. PtGA was most frequently evaluated using the VAS global.
In the description of discrimination, the reliability of the AUSCAN and FIHOA was found to be extensively tested and shown to be excellent. The reliability of other instruments was suggested to be good, but only scarce evidence was available.
The VAS pain was by far the most commonly used instrument for the assessment of the change of pain, followed by the AUSCAN pain subscale and pain on palpation. The AUSCAN function subscale, FIHOA, and assessment of grip and pinch strength were regularly applied for the assessment of the change of physical function. The change of PtGA was most often evaluated by the VAS global. The majority of studies detected change by all used instruments, suggesting good sensitivity to change for the evaluated instruments. The change in pain was detected most frequently by the VAS pain or pain on palpation, whereas the change in physical function was detected most frequently by the AUSCAN function subscale or measured grip strength.
In the description of feasibility, only a few of the studies reported on the time needed to perform the instruments. Questionnaires took less time than performance-based measures. Of the frequently applied instruments, only the FIHOA was evaluated and seemed feasible. This is supported by the availability of this questionnaire in the public domain, in contrast with the AUSCAN.
For the description of validity, numerous cross-sectional studies assessed correlations between various instruments, but few longitudinal data were available. The strongest correlations were reported between different questionnaires assessing pain or physical function. Remarkably, the VAS pain, as 1 of the most frequently applied instruments, was evaluated in only a limited number of studies.
For further evaluation of validity, comparison with an external standard should be performed. However, no external standards for the evaluation of pain, physical function, and PtGA have been agreed upon, perhaps because of the varying definitions and measurement of these concepts. For the assessment of physical function, observation of the performance of tasks as described by specific instruments assessing physical function may be useful in the evaluation of validity of these instruments103.
Based on our review, it is not possible to decide on 1 instrument that should be recommended for the measurement of pain, physical function, or PtGA in hand OA research. Although no major differences regarding metric properties of the evaluated instruments were observed, the amount of supporting evidence varied extensively between the instruments.
Before consensus can be reached on which instruments should be applied, some aspects need further investigation. The reliability of the VAS pain, grip and pinch strength, and pain on palpation needs to be further established in a variety of populations. Regarding the sensitivity to change, the minimal clinically important difference of instruments needs to be determined. Only for the AUSCAN has a minimal clinically important improvement been proposed104. Validity of instruments assessing physical function should be further investigated by comparing these instruments with an external standard. Further, future research should evaluate instruments within specific subtypes of hand OA.
Our study has some limitations. We intended to include as many available studies as possible that provided information on instruments and their metric properties, and not only studies that actually aimed at evaluating this. Because of the large heterogeneity across studies regarding their purpose (primarily aiming at evaluating instruments or applying instruments for other primary aims) and study design, the methodological quality of the included studies was not assessed. Further, the heterogeneity did not allow the data to be pooled in a metaanalysis or the presence of publication bias to be assessed.
Limitations regarding the literature search are the included databases, restriction to English language, and exclusion of abstracts and unpublished results.
Within all studies assessing the VAS pain or VAS global, different questions were used. The individual questions were observed to be highly variable, especially regarding the type of pain (global pain, overall disease severity, intensity, not specified) and time settings (last 24 h or 48 h, 2 days, 2 weeks, not specified). In future research, this phrasing should be standardized. Further, the VAS pain score has been shown to be influenced by the information on the disease and its consequences that is given to patients when determining the VAS105, which could not be addressed because of the lack of information on this topic in the included studies. However, future studies evaluating the VAS should take the effect of patient information into account.
Our systematic literature review provides an overview of the instruments that are used for the measurement of pain, physical function, and PtGA in hand OA. Most information on the metric properties of these instruments was available for the questionnaires AUSCAN (assessing pain and function), FIHOA (assessing function), and VAS pain, and for the performance- or assessor-based instruments grip and pinch strength, and pain on palpation. To enhance comparability across future studies in hand OA, consensus has to be reached on recommended instruments for the measurement of pain, physical function, and PtGA in hand OA. More research has to be performed to compare the different instruments with each other.
Footnotes
Supported by the Dutch Arthritis Foundation (grant number 10-1-309).
- Accepted for publication July 14, 2015.