Abstract
Objective. The objective of this paper is to assess the content and measurement constructs of the candidate instruments for the domains of “pain” and “physical function/activity” in the Outcome Measures in Rheumatology (OMERACT) shoulder core set. The results of this International Classification of Functioning, Disability, and Health (ICF)–based analysis may inform further decisions on which instruments should ultimately be included in the core set.
Methods. The materials for the analysis were the 13 candidate measurement instruments within pain and physical function/activity in the shoulder core domain set, which either passed or received amber ratings (meaning there were some issues with the instrument) in the OMERACT filtering process. The content of the candidate instruments was extracted and linked to the ICF using the refined linking rules. The linking rules enhance the comparability of instruments by providing a comprehensive overview of the content of the instruments, the context in which the measurements take place, the perspectives adopted, and the types of response options.
Results. The ICF content analysis showed a large variation in content and measurement constructs in the candidate instruments for the shoulder core outcome measurement set.
Conclusion. Two of 6 pain instruments include constructs other than pain. Within the physical function/activity domain, 2 candidate instruments matched the domain, 3 included additional content, and 2 included meaningful concepts in the response options, suggesting that they should be omitted as candidate instruments. The analyses show that the content in most existing instruments of shoulder pain and functioning extends across core set domains.
Shoulder pain is a common musculoskeletal disorder with an incidence of 10 per 1000 patients in primary care and point prevalence estimates of 7% to 26% in the general population1. Shoulder disorders can be long lasting; in a Dutch study of patients presenting to their general practitioner with a new episode of shoulder pain, a considerable number (41%) showed persistent symptoms after 12 months2. The associated disability and effect in terms of earnings, missed workdays, and disability payments are substantial3–7.
The domains and measurement instruments reported in trials on shoulder disorders are widely diverse; therefore, the development of a core outcome set for use in clinical trials across shoulder disorders has been advocated8. Since 2016, there has been an ongoing effort to develop a shoulder core set within the Outcome Measures in Rheumatology (OMERACT)9–12. At the OMERACT 2018 conference, a shoulder core domain set was approved by the delegates13. It consists of 4 mandatory domains for all trials of shoulder disorders: pain, physical function/activity, patient global – shoulder, and adverse events including death; and 4 important but optional domains: participation (recreation/work), sleep, emotional well-being, and condition-specific pathophysiological manifestations13. The next phase will be to recommend specific measurement instruments for a core outcome measurement set10.
Preliminary work has investigated instruments within 2 of the mandatory domains, pain and physical function/activity, identified from a systematic review of outcome domains and measurement instruments reported across randomized trials of any interventions for various shoulder disorders8. Pain was defined as “how much a person’s shoulder hurts, reflecting the overall magnitude of the pain experience (i.e., at rest, during and after activity, at night)”. Physical function/activity was defined as “a person’s ability to carry out daily physical activities, ranging from self-care (e.g., bathing, combing hair) to more complex activities that require a combination of skills (e.g., driving a car)”13. Thirty-eight instruments within the pain domain and 45 within the physical function/activity domain were further investigated with the “Truth” Part 1 and “Feasibility” filters of OMERACT11,12. Altogether, 6 instruments in the pain domain and 7 within the physical function/activity domain passed both filters and are candidates for further assessment14,15. However, 5 in the pain domain and 3 in the physical function/activity domain received amber ratings for content validity, indicating potential limitations in their utility14,15.
The International Classification of Functioning, Disability, and Health (ICF) is the World Health Organization framework for measuring health and disability16. Since its publication in 2001, the ICF has been used to describe and compare health information. To establish a standardized procedure to translate the content of measurement instruments into ICF concepts, a set of 10 linking rules was published in 2002 and updated in 200517,18. Since their introduction, a number of instruments have been linked to the ICF19–21. Enhancing the comparability of instruments, and ultimately aggregating information gathered with various instruments, requires not only content comparability of items but also a reflection on the perspective the instruments have adopted and the categorization of their response options.
In 2016, the linking rules were refined to account for these aspects, offering a more transparent tool to assess the content of measurement instruments and the context in which the measurements take place22. Thus, content linking of outcome measurement instruments based on the refined ICF linking rules provides information on important aspects of content validity. Content validity is considered to be the most important measurement property of an outcome measurement instrument, because if it is unclear what an instrument is actually measuring, the assessment of other measurement properties may be irrelevant23.
The aim of the present study was to assess the content and measurement constructs of the candidate instruments for the domains of pain and physical function/activity in the OMERACT shoulder core set, using the refined ICF linking rules. The results of this ICF-based analysis may inform further decisions on which instruments should ultimately be included in the core set.
MATERIALS AND METHODS
The materials for the analysis were the 13 candidate measurement instruments within pain and physical function/activity in the shoulder core domain set, which either passed or received amber ratings (meaning there were some issues with the instrument) in the OMERACT filtering process14,15. The 6 candidate instruments within the pain domain and the 7 within the physical function/activity domain are presented in Table 124–34. These instruments are widely used in the clinical and epidemiological research of shoulder pain conditions8.
Analysis of content and measurement constructs
The ICF is based on an integrative model of health that classifies functioning within the components of body functions (b), body structures (s), activities and participation (d), environmental factors (e), and personal factors (not classified)16. The ICF provides 4 subclassifications (b, s, d, e), where categories of functioning and environmental factors are arranged hierarchically using an alphanumeric coding system. At the first level, the initial letter is followed by a numeric code (1-digit; e.g., d4 Mobility), 2 more digits for the second level (e.g., d445 Hand and arm use), and a total of 4 digits for third level categories (e.g., d4452 Reaching). A fourth level is also available when appropriate. An overview of the chapter structure of the components body functions and activities and participation is shown in Table 2.
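The alphanumeric coding scheme described above can be illustrated with a short sketch. The function name `parse_icf_code` and the `ICFCode` structure are hypothetical helpers introduced only for illustration; they are not part of the ICF or of the linking rules. The sketch splits a code into its component and hierarchical levels, assuming the digit pattern given in the text (1 digit for the chapter, 2 more for the second level, and 1 more for each of the third and fourth levels).

```python
import re
from typing import NamedTuple, Optional

# Component letters as listed in the text (personal factors are not classified).
COMPONENTS = {
    "b": "body functions",
    "s": "body structures",
    "d": "activities and participation",
    "e": "environmental factors",
}

class ICFCode(NamedTuple):
    """Hypothetical container for the levels of one ICF category code."""
    component: str
    chapter: str                  # first level, e.g. "d4"
    second_level: Optional[str]   # e.g. "d445"
    third_level: Optional[str]    # e.g. "d4452"
    fourth_level: Optional[str]

# letter + 1 digit, optionally + 2 digits, optionally + 1 digit, optionally + 1 digit
_PATTERN = re.compile(r"^([bsde])(\d)(?:(\d{2})(?:(\d)(\d)?)?)?$")

def parse_icf_code(code: str) -> ICFCode:
    """Split an ICF code such as 'd4452' into its hierarchical levels."""
    match = _PATTERN.match(code.strip().lower())
    if match is None:
        raise ValueError(f"not a valid ICF category code: {code!r}")
    letter, chapter_digit, second, third, fourth = match.groups()
    chapter = letter + chapter_digit
    second_level = chapter + second if second else None
    third_level = second_level + third if third else None
    fourth_level = third_level + fourth if fourth else None
    return ICFCode(COMPONENTS[letter], chapter, second_level, third_level, fourth_level)
```

For example, `parse_icf_code("d4452")` yields the chapter `d4`, the second-level category `d445`, and the third-level category `d4452`. Truncating codes to `second_level` in this way is one possible route to comparing linkages at the second ICF category level.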
The content from each item in the measurement instruments was linked to the ICF according to the 10 refined linking rules22. Linking rules 1 to 3 specify how to get familiar with the ICF and identify the purpose of an instrument and concepts to be linked to the ICF. Both the researchers who conducted the analyses (YR and SØ) had previously linked the content of shoulder pain instruments to the ICF21.
First, the actual meaning (main and additional concepts) of the information to be linked was identified, consistent with rules 2 and 322. When identifying the concepts, both the item text and the text that sets the premises for the interpretation of the item content were taken into consideration. For most items, it was straightforward to identify the main and additional concepts. For example, for the item “How severe is your pain: pushing with the involved arm?”, “pain” was identified as the main concept and “pushing with the arm” as an additional concept. In this item, the additional concept defines the context in which pain is assessed. Sometimes more than 1 activity was listed in the same item. When this was the case, all of the listed activities were recognized as main concepts. In a few cases, the item was framed in general terms, while specific activities were included in the response options, such as in the function subscale of the University of California at Los Angeles (UCLA) Shoulder Score and the activities of daily living subscale of the Shoulder Function Assessment scale (SFA)32,33,34. The name of the item was then identified as the main concept and the specific activities as additional concepts.
The next step was to document the perspectives from which the information was collected (linking rule 4). The most common perspectives included in measurement instruments are the descriptive, appraisal, and needs or dependency perspectives22. The descriptive perspective refers to a person’s function of the body, ability to perform a task in a standardized environment (capacity), or actual performance of certain tasks or activities in the natural environment. According to linking rule 5, the categorization of the response option in every measurement instrument was identified and documented.
Finally, all main and additional concepts identified during steps 2 and 3 were linked to the most precise ICF category (linking rules 6–10). For concepts not sufficiently specified to be linked, the “not definable” option was used. If a concept was not covered by any of the ICF classifications, the option “not covered” was used.
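The linking procedure described in the preceding paragraphs can be pictured as one record per item. The sketch below is only an illustration: the class names, the `nd`/`nc` labels, and in particular the ICF codes `b280` and `d4451` assigned to the example item are assumptions made here, not the linkages reported by the authors.

```python
from dataclasses import dataclass, field
from typing import List

# Labels chosen for this sketch for concepts that cannot be linked.
NOT_DEFINABLE = "nd"   # concept not sufficiently specified to be linked
NOT_COVERED = "nc"     # concept not covered by any ICF classification

@dataclass
class LinkedConcept:
    text: str
    role: str        # "main" or "additional"
    icf_code: str    # an ICF category code, NOT_DEFINABLE, or NOT_COVERED

@dataclass
class LinkedItem:
    instrument: str
    item_text: str
    perspective: str        # documented under linking rule 4
    response_options: str   # documented under linking rule 5
    concepts: List[LinkedConcept] = field(default_factory=list)

    def codes(self, role: str) -> List[str]:
        """Return the codes of all concepts with the given role."""
        return [c.icf_code for c in self.concepts if c.role == role]

# Example item quoted in the text; the codes are illustrative assumptions.
item = LinkedItem(
    instrument="SPADI pain subscale",
    item_text="How severe is your pain: pushing with the involved arm?",
    perspective="descriptive performance",
    response_options="intensity",
    concepts=[
        LinkedConcept("pain", "main", "b280"),                         # assumed code
        LinkedConcept("pushing with the arm", "additional", "d4451"),  # assumed code
    ],
)

# A context such as "at rest" is too unspecific to link and gets the nd label.
at_rest = LinkedConcept("at rest", "additional", NOT_DEFINABLE)
```

Keeping main and additional concepts in the same record, separated only by `role`, makes it straightforward to report them separately, as is done for the pain instruments in the Results.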
All instruments were independently assessed by 2 researchers (YR and SØ). Differences in linking were resolved by discussion. There were no cases of disagreement in the identification and documentation of perspectives and response options.
Agreement between the researchers in the linking of concepts at the second ICF category level was calculated with the Cohen κ coefficient. The 95% CI for the κ coefficient was calculated using the standard error (SE) of the kappa35: κ − 1.96 × SEκ to κ + 1.96 × SEκ. The calculated κ coefficient for the linking of the main and additional concepts was 0.85 (95% CI 0.78–0.91), which was considered excellent (range, 0.61–1.00)36. The study did not include data from patients or any other sensitive material; thus, ethical approval was waived.
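The agreement statistic and its confidence interval can be sketched as follows. This is a minimal illustration, not the authors' computation: the function name `cohen_kappa_with_ci` is hypothetical, and the standard error uses a common large-sample approximation, SE = sqrt(p_o(1 − p_o) / (n(1 − p_e)²)), which may differ from the formula in the cited reference.

```python
import math
from collections import Counter
from typing import Hashable, Sequence, Tuple

def cohen_kappa_with_ci(
    rater_a: Sequence[Hashable],
    rater_b: Sequence[Hashable],
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Return (kappa, lower, upper) for two raters' category assignments.

    The interval is kappa -/+ z * SE, with SE taken from a simple
    large-sample approximation (an assumption of this sketch).
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of concepts")
    n = len(rater_a)

    # Observed proportion of agreement.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (p_o - p_e) / (1 - p_e)
    se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    return kappa, kappa - z * se, kappa + z * se
```

The inputs would be the second-level ICF categories assigned by each researcher to the same list of concepts. Note that the upper bound from this approximation is not clipped and can exceed 1 for small samples.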
RESULTS
Descriptive information about the 13 candidate instruments within the pain and physical function/activity domains of the shoulder core set is shown in Table 1.
“Pain” candidate instruments
The analysis of the perspectives showed that the “descriptive performance perspective” was adopted in all 6 instruments. The response options in the visual analog scale (VAS), the numerical rating scale (NRS), the Oxford Shoulder Score (OSS), and the Shoulder Pain and Disability Index (SPADI) pain subscale reflected intensity. In the verbal rating scale (VRS), the response options reflected “qualitative attributes”, and in the Shoulder Pain Scale (SPS), a combination of intensity and qualitative attributes.
All instruments had main concepts linked to sensory functions and pain categories in the ICF (Table 3). For the 3 overall pain scales (VAS, NRS, and VRS) and the SPADI pain subscale, all main concepts were linked to a pain ICF category. The overall pain scales only cover a single ICF pain category, while the SPADI includes 5 categories.
In addition to pain categories, the SPS included a main concept linked to a mobility category in the ICF. One instrument stood out from the others: in the OSS, 10 of 14 main concepts were linked to ICF categories other than pain, namely to activity and participation categories within the mobility, self-care, and domestic life chapters.
The additional concepts in the pain instruments provide information about the context in which the pain is assessed. In the 3 overall pain scales, no additional concepts were identified. In the SPS, 3 additional concepts were not sufficiently specified to be classified in the ICF (at rest, in motion, and nightly), whereas in the SPADI pain subscale, pain was measured in the context of 4 different mobility activities. Of the 4 main concepts in the OSS that assessed pain, 2 were provided without any additional concepts, one was linked to a mobility category, and another was assigned to “not definable.”
“Physical function/activity” candidate instruments
The analysis of perspectives in the candidate instruments showed that a descriptive performance perspective was adopted in all 7 instruments. With respect to the response options, 4 instruments assessed “Intensity”: the Penn Shoulder Score, Function Subscale (Penn), the L’Insalata Shoulder Rating Questionnaire (SRQ), the American Shoulder and Elbow Surgeons Shoulder Outcome Score (ASES), and the SPADI disability subscale. The Simple Shoulder Test (SST) and the University of California at Los Angeles Shoulder Score (UCLA) assessed “Confirmation/agreement,” and the SFA scale assessed “Qualitative attributes”.
The instruments varied with respect to the depth and breadth of information (Table 4). The additional concepts in the physical function/activity measures were often used for specifying the content, and thus should be interpreted differently than in the pain candidate instruments. All physical function/activity candidate instruments included concepts linked to self-care ICF categories, and all except one, the SFA, included concepts linked to both self-care and mobility.
The Penn was the most wide-ranging instrument with concepts linked to categories in 5 chapters of the activities and participation component of the ICF. In particular, the Penn comprehensively covers mobility, self-care, and domestic life (23 of 27 main concepts). It is also worth noting that the Penn included 4 main concepts linked to a sleep category, which is classified as body functions in the ICF, and also linked to work and leisure activities in the activities and participation component of the ICF. Similarly, the ASES covered mobility and self-care comprehensively, but it also included concepts linked to sleep functions and to work and leisure activities. In the SST, 8 of 15 concepts were linked to mobility categories and the rest to work, sleep, and pain categories in the ICF.
Two instruments, the SRQ and the SPADI disability subscale, covered mobility and self-care comprehensively. In the SRQ, the content was linked to 3 mobility categories and 7 different self-care categories. In addition, 2 concepts were linked to domestic life activities. The SPADI disability subscale had concepts linked to 4 mobility categories and 5 self-care categories (of these, only 2 are unique).
In the 2 last instruments, the SFA and the UCLA, the meaningful concepts were identified in the response options. For the UCLA, these concepts were linked to mobility, self-care, and domestic life categories in the ICF, and for the SFA to self-care categories.
DISCUSSION
The ICF content analysis showed a large variation in content and measurement constructs in the candidate instruments for pain and physical function/activity for the shoulder core outcome measurement set.
Among the 6 pain candidate instruments, all included concepts linked to a pain category in the ICF. However, 2 of the instruments, the SPS and the OSS, also covered sleep functions, mobility, self-care, and domestic life activities. This was particularly prominent in the OSS, where more than two-thirds of the items covered concepts other than pain.
In pain assessments, it is important to take into account the context in which the pain is experienced. This is consistent with the definition of pain in the shoulder core set, relating pain experiences to a given context (“i.e., at rest, during, and after activity”)13. The only candidate instrument where all main concepts cover pain and at the same time refer to a specific context was the SPADI Pain. It should, however, be noted that all except 1 SPADI item measure pain in the context of performing hand and arm mobility activities, and the remaining item measures pain at its worst. Thus, none of its items measures pain in relation to self-care or domestic life activities, or pain at rest.
The overall pain candidate scales, the VAS, NRS, and VRS, measure the magnitude of the pain regardless of any contextual information. Because of the vagueness in construct definition, it has been recommended that such scales can only complement and not replace genuine, validated pain scales37.
Based on our ICF analysis, no single candidate instrument completely matches the magnitude of the pain experience, as defined in the shoulder core outcome set13. However, the use of the SPADI Pain in combination with an overall pain scale (VAS, NRS, or VRS) might provide an acceptable coverage of the pain domain. Moreover, the documented inconsistencies in the content of the SPS and OSS should be considered in further discussions regarding which pain instruments are to be included in the core set.
Seven candidate instruments in the physical function/activity domain were included in the ICF content analysis. As defined in the core set, this domain covers functions ranging from self-care (e.g., bathing, combing hair) to more complex activities (e.g., driving a car)13. Our analysis showed that a majority of the candidate instruments cover mobility and self-care activities, which matches the domain definition of the core set13,16. Nevertheless, a majority of the candidate instruments also cover content that falls outside the domain definition. In particular, 1 instrument, the Penn, included content from 5 of 9 chapters within the activities and participation component and content that was linked to the body functions component of the ICF. A similar content coverage was found in the SST and the ASES. This wider content coverage, as provided by the Penn, SST, and ASES, is supported by empiric evidence showing that patient-reported problems are frequently reported within a range of body functions and activities and participation chapters38.
The candidate instruments that provided the best match with physical function/activity were the SRQ and the SPADI Disability. Both instruments covered mobility and self-care activities, and included little additional content. Although both instruments had a similar content profile, an important difference was discovered: while the SRQ covers a range of self-care activities, the SPADI Disability only included 2 such activities. It should also be noted that only 6 of the 15 items in the full version of the SRQ were selected as candidates for the shoulder outcome measurement set. From our previous content analyses of shoulder pain instruments, we learned that the full version of the SRQ covers similar ICF domains as the most wide-ranging candidate instrument, the Penn21.
The last 2 candidate instruments, the UCLA Shoulder Score and the SFA, have little or no content that addresses mobility activities of the hand and arm. In addition, their structure places the meaningful concepts in the response options rather than in the item itself. This limitation needs to be considered in the ongoing selection process.
Our ICF analysis showed that a majority of the physical function/activity candidate instruments had content that did not perfectly match the OMERACT domain definition13. In addition to mobility and self-care activities, most of the measures covered content belonging to pain and 2 optional core set domains, participation (recreation and work) and sleep13. There were also examples of domestic life activities (e.g., household tasks) in the instruments that are not included in any of the recommended core set domains13.
We suggest that the lack of alignment between the definition of physical function/activity in the shoulder core outcome set, and the content of the candidate measures needs further consideration by the OMERACT Shoulder working group. The group could consider either adjusting the domain definition or not including instruments that do not comply with the current definition. In this undertaking, the consensus-based guidelines for selection of outcome measurement instruments, developed as a joint initiative between the Core Outcome Measures in Effectiveness Trials (COMET) initiative and the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) initiative, will be useful23,39.
A limitation of our study was that some of the measures included content that either could not be defined or is not covered by the ICF. Because of this, the results do not provide a complete overview of the content in the measures.
The ICF-based analysis of the candidate instruments within the mandatory pain and physical function/activity domains of the OMERACT shoulder core outcome set showed large variations in the content and measurement constructs covered. Two of 6 pain instruments include constructs other than pain. Within physical function/activity, 2 candidate instruments matched the domain, 3 included additional content, and the last 2 instruments included meaningful concepts in the response options, suggesting that they should be omitted as candidate instruments. The analyses show that the content in most existing instruments of shoulder pain and functioning extends across core set domains.
Accepted for publication January 27, 2020.