Abstract
Objective. To develop and test shortened versions of the Manual Muscle Test-8 (MMT-8) in juvenile dermatomyositis (JDM).
Methods. Construction of reduced tools was based on a retrospective analysis of individual scores of MMT-8 muscle groups in 3 multinational datasets. The 4 and 6 most frequently impaired muscle groups were included in MMT-4 and MMT-6, respectively. Metrologic properties of reduced tools were assessed by evaluating construct validity, internal consistency, discriminant ability, and responsiveness to change.
Results. Neck flexors, hip extensors, hip abductors, and shoulder abductors were included in MMT-4, whereas MMT-6 also included elbow flexors and hip flexors. Both shortened tools revealed strong correlations with MMT-8 and other muscle strength measures. Correlations with other JDM outcome measures were in line with predictions. Internal consistency was good (0.88–0.96) for both MMT-4 and MMT-6. Both reduced tools showed strong ability to discriminate between disease activity states, assessed by the caring physician or a parent (P < 0.001), and between patients whose parents were satisfied or not satisfied with illness course (P < 0.001). Responsiveness to change (assessed by both standardized response mean and relative efficiency) of MMT-4 and, to a lesser degree, MMT-6, was slightly superior to that of MMT-8.
Conclusion. Overall, the metrologic performance of MMT-4 and MMT-6 was comparable to that of the other established muscle strength tools, which indicates that they may be suitable for use in clinical practice and research, including clinical trials. The measurement properties of these tools should be further tested in other patient populations and evaluated prospectively.
- outcome assessment
- pediatric dermatomyositis/polymyositis
- pediatric rheumatic diseases
Juvenile dermatomyositis (JDM) is a multisystem inflammatory disease of presumed autoimmune origin that predominantly affects the skin and the skeletal muscles, but it may also involve visceral organs, especially the lungs and the gastrointestinal tract, and is associated with poorly understood complications, namely dystrophic calcinosis and lipodystrophy.1,2 Although over the past 2 decades there has been a remarkable improvement in the management and outcome of JDM, there are still many patients who respond suboptimally to contemporary therapies and experience chronic disease activity. These patients are at risk of developing irreversible damage and physical functional disability, which may have a profound effect on their quality of life (QOL).3,4,5,6,7
Muscle weakness is a cardinal feature of JDM, which can be due to either ongoing muscle inflammation or residual muscle damage. Improvement of muscle disease is a key determinant of disease prognosis and a leading objective of all therapeutic interventions. Hence, measurement of muscle strength is a fundamental component of the clinical assessment of children with JDM, and must be performed regularly to monitor the course of the disease over time and to evaluate the effectiveness of management strategies.
The Manual Muscle Test-8 (MMT-8) is one of the most popular tools for the measurement of muscle strength in children with JDM.8 It is the shorter version of the original instrument, in which 8 proximal, distal, and axial muscle groups are tested unilaterally, on the patient’s dominant side. Each muscle group examined is scored on a scale of 0 to 10 (where 0 = extreme weakness and 10 = normal strength), according to its ability to move against gravity or against pressure applied by the examiner. In our experience with the use of the MMT-8, we have noticed that the upper and lower extremity proximal muscle groups and the cervical muscles are more frequently and severely affected than the muscles of the distal extremities. This disparity was expected as it reflects the typical pattern of weakness in JDM,9,10 although distal muscle involvement can be noticeable. This observation led us to hypothesize that limiting the evaluation to the most impaired muscles could enhance the measurement performance of the instrument. We also considered that reducing the number of muscle sites tested may facilitate assessment in younger children, who may not cooperate for the entire length of the exam.
For these reasons, we undertook the study described herein, which aimed to test the metrologic properties of the 4- and 6-muscle reduced versions of the MMT-8 (named MMT-4 and MMT-6, respectively). We also compared them with those of the complete tool and of 2 other established measures of muscle strength in JDM: the Childhood Myositis Assessment Scale (CMAS)11,12 and the hybrid MMT-8/CMAS (hMC).13
METHODS
Development of MMT-4 and MMT-6. The construction of the 2 shortened versions of the MMT-8 was based on the analysis of the frequency of the individual scores assigned to each muscle group in 3 multinational datasets of patients with JDM enrolled in previous studies. The MMT-4 and MMT-6 were designed to incorporate the 4 and 6 most frequently impaired muscle groups, respectively. The composition and theoretical range of the 3 versions of the MMT are shown in Table 1.
Study datasets. The first dataset comprised 213 patients followed in routine care at 13 pediatric rheumatology centers and evaluated at baseline and after a median of 5.9 months. The second dataset included 139 patients enrolled in a randomized controlled trial (RCT) aimed at comparing the efficacy and safety of prednisone alone with that of prednisone plus either methotrexate or cyclosporine.14 The third dataset included 322 patients with a disease duration ≥ 2 years enrolled in a cross-sectional study of the long-term outcomes of JDM.3 The first study protocol was approved by the regional ethics committee of Liguria, Genoa, Italy, on June 18, 2018 (meeting minutes no. 10/2018), and the other 2 were approved by the ethics committee of Istituto Giannina Gaslini, Genoa, Italy, on February 9, 2006 (meeting minutes no. 458/2006), and on April 23, 2003 (meeting minutes no. 1006/2003). Written informed consent/assent to participate in the studies was provided by both the parent/guardian and the patient (when applicable). For the sake of brevity, the 3 datasets are hereafter named “Routine dataset,” “JDM trial dataset,” and “Outcome dataset,” respectively. The demographic characteristics and the results of outcome assessments in the 3 patient samples have been reported elsewhere.13,14,15
Assessment of additional JDM outcome measures. Besides the MMT-8, measurement of muscle strength was carried out with the CMAS11,12 and the hMC.13 Briefly, the CMAS assesses the capacity of the patient to perform 14 activities or maneuvers, or the duration of performance of particular tasks. Its score ranges from 0 (worst) to 52 (best). The hMC is constructed by combining the entire MMT-8 with 3 of the 14 items of the CMAS (time of head lift, sit-ups, and floor rise). Its score ranges from 0 (worst) to 100 (best).
Clinical assessment also included quantification of the other aspects of disease impact through the traditional outcome measures for JDM. These measures included, depending on the sample, the following: (1) physician global assessment of overall disease activity (PGA) on a visual analog scale (VAS; 0 = no activity and 10 = maximum activity); (2) parent global assessment of child’s well-being (PaGA) on a VAS (0 = best and 10 = worst); (3) parent global rating of child’s pain on a 10-cm VAS (0 = no pain and 10 = maximum pain); (4) parent global rating of child’s fatigue on a 10-cm VAS (0 = no fatigue and 10 = extreme fatigue); (5) estimation of overall disease activity through the total score of the Disease Activity Score (DAS total; 0 = no activity and 20 = maximum activity)16; (6) assessment of muscle disease activity with the muscle component of the DAS (DAS muscle; 0 = no activity and 11 = maximum activity)16; (7) PGA of muscle disease activity on a 10-cm VAS (muscle activity VAS; 0 = no activity and 10 = maximum activity)17; (8) assessment of skin disease activity with the skin component of the DAS (DAS skin; 0 = no activity and 9 = maximum activity)16; (9) PGA of skin disease activity on a 10-cm VAS (skin activity VAS; 0 = no activity and 10 = maximum activity)17; (10) assessment of physical function through the Childhood Health Assessment Questionnaire (0 = best and 3 = worst)18; (11) assessment of health-related QOL through the Child Health Questionnaire (CHQ), and expressed by the CHQ physical summary score (CHQ-PhS) and CHQ psychosocial summary score (CHQ-PsS)19,20; (12) assessment of cumulative damage with the Myositis Damage Index (MDI; 0 = no damage and 35 = maximum damage)17; and (13) determination of the serum muscle enzyme creatine kinase (CK).
The Routine dataset also included the following evaluations: (1) physician subjective assessment of disease state as inactive disease, low disease activity, moderate disease activity, or high disease activity; (2) physician subjective assessment of disease course at second visit as improved, stable, or worsened; (3) parent subjective assessment of disease state as remission, continued activity, or flare; and (4) parent satisfaction with illness outcome. To evaluate satisfaction, parents were asked the question, “Considering all the ways the illness affects your child, would you be satisfied if his/her condition remained stable/unchanged for the next few months?” This was to be answered yes or no.21
Evaluation of measurement performance of the MMT-4 and MMT-6. The metrologic properties of the reduced versions of the MMT-8 were examined following the standard procedures that are used in the validation of outcome measures.22,23,24,25 Specific assessments in the present study included evaluation of construct validity, internal consistency, discriminant ability, and responsiveness to change over time. In the assessment of all these properties, the measurement performance of the MMT-4 and MMT-6 was compared with that of the MMT-8, CMAS, and hMC. No imputation of missing data was made.
Construct validity was assessed by calculating the Spearman correlation of muscle tools with the other JDM outcome measures. Correlations were considered high if > 0.7, moderate if 0.4–0.7, and low if < 0.4.25,26 Internal consistency was assessed using Cronbach α coefficient27 and was defined as follows: < 0.6 = poor, 0.6–0.64 = slight, 0.65–0.69 = fair, 0.7–0.79 = moderate, 0.8–0.89 = substantial, and > 0.9 = almost perfect.28
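As a rough illustration of the internal consistency analysis, the following Python sketch computes Cronbach α from complete-case item scores and maps α and correlation coefficients onto the bands listed above. The function names, the use of sample variances, the contiguous cutpoints between the listed bands, and the use of the absolute value of the correlation are assumptions of this sketch; the item scores are invented and are not study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach alpha for complete cases; rows[i][j] is patient i's score on item j."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 patients")
    if any(len(r) != k for r in rows):
        raise ValueError("every patient needs a score for every item (no imputation)")
    item_variances = [variance(col) for col in zip(*rows)]
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_band(alpha):
    """Label alpha with the bands quoted in the text (cutpoints assumed contiguous)."""
    if alpha < 0.60:
        return "poor"
    if alpha < 0.65:
        return "slight"
    if alpha < 0.70:
        return "fair"
    if alpha < 0.80:
        return "moderate"
    if alpha < 0.90:
        return "substantial"
    return "almost perfect"


def correlation_band(r):
    """Label a correlation by magnitude: high > 0.7, moderate 0.4-0.7, low < 0.4."""
    magnitude = abs(r)  # assumption: the sign is ignored when grading strength
    if magnitude > 0.7:
        return "high"
    if magnitude >= 0.4:
        return "moderate"
    return "low"


# Invented 4-item scores for 4 hypothetical patients (not study data).
example_rows = [[10, 9, 9, 10], [7, 6, 7, 6], [4, 5, 4, 4], [8, 8, 9, 8]]
example_alpha = cronbach_alpha(example_rows)
print(round(example_alpha, 3), alpha_band(example_alpha))
```

Taking the absolute value in `correlation_band` reflects that strength scores and disease activity scores run in opposite directions, so a strongly negative coefficient still indicates a close relationship.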
Construct validity was also examined by carrying out a multivariable logistic regression analysis, in which the individual muscle strength tools were the dependent variable and the other JDM outcome measures were the explanatory variables. This analysis was performed separately for each dataset.
To evaluate the capacity of the tools to differentiate between patients with varying degrees of disease activity, we compared their scores between patients grouped using physicians’ subjective assessment of disease state, parents’ subjective assessment of disease state, and parents’ satisfaction with illness outcome. Comparisons among groups were made with the Mann-Whitney U test or the Kruskal-Wallis test, as appropriate.
Responsiveness to change between 2 consecutive visits was assessed by computing the standardized response mean (SRM), calculated as the mean change in score divided by the SD of individuals’ change in score. According to Cohen,29 threshold levels for SRM were defined as follows: ≥ 0.2 = small, ≥ 0.5 = moderate, and ≥ 0.80 = good. In the Routine dataset, responsiveness was calculated for patients judged by the physician as improved at second visit. Patients whose baseline scores were at ceiling (i.e., the maximum score for the assessed tool), and who therefore could not improve further, were excluded from the assessment of responsiveness in the group judged as improved. In the JDM trial dataset, responsiveness was calculated only for responders, in relation to the magnitude of improvement by the Pediatric Rheumatology International Trials Organization (PRINTO) response criteria.7,30,31
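The SRM calculation and the exclusion of patients at ceiling can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the sample SD (n − 1 denominator) is assumed, the label for values below 0.2 is not defined in the text, and the scores are invented rather than taken from the study datasets.

```python
from statistics import mean, stdev


def drop_ceiling(pairs, max_score):
    """Remove patients whose baseline score is already at the tool's maximum."""
    return [(baseline, follow_up) for baseline, follow_up in pairs if baseline < max_score]


def standardized_response_mean(pairs):
    """Mean change in score divided by the SD of the individual changes."""
    changes = [follow_up - baseline for baseline, follow_up in pairs]
    sd = stdev(changes)  # assumption: sample SD; needs at least 2 patients
    if sd == 0:
        raise ValueError("SD of change is zero; SRM is undefined")
    return mean(changes) / sd


def srm_label(srm):
    """Thresholds quoted in the text: >= 0.2 small, >= 0.5 moderate, >= 0.80 good."""
    magnitude = abs(srm)
    if magnitude >= 0.80:
        return "good"
    if magnitude >= 0.5:
        return "moderate"
    if magnitude >= 0.2:
        return "small"
    return "below small"  # assumption: the text gives no label under 0.2


# Invented (baseline, follow-up) MMT-4 scores; the last patient starts at ceiling (40).
visits = [(22, 30), (30, 33), (18, 31), (35, 36), (40, 40)]
improvable = drop_ceiling(visits, max_score=40)
srm = standardized_response_mean(improvable)
print(len(improvable), round(srm, 2), srm_label(srm))
```

Because the SRM divides by the spread of the individual changes, a tool can score higher either by detecting a larger average change or by detecting a more uniform one.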
In addition to SRM, responsiveness to change was assessed by computing the relative efficiency (RE), which is the square of the ratio between the SRM of the new tool (i.e., the MMT-4 or MMT-6) and the SRM of the tool used as reference (i.e., the MMT-8) and is calculated through the following formula:
$$\mathrm{RE} = \left(\frac{\mathrm{SRM}_{\text{new tool}}}{\mathrm{SRM}_{\text{reference tool}}}\right)^{2}$$
An RE > 1 indicates that the evaluated tool is more efficient in detecting change than the reference tool.32,33
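For illustration, the relative efficiency can be computed from 2 SRM values as in the short Python sketch below. The function name and the SRM values are hypothetical and are not the figures reported in Table 5.

```python
def relative_efficiency(srm_new_tool, srm_reference_tool):
    """Square of the ratio between the SRM of the new tool and that of the reference."""
    if srm_reference_tool == 0:
        raise ValueError("reference SRM is zero; relative efficiency is undefined")
    return (srm_new_tool / srm_reference_tool) ** 2


# Hypothetical SRM values (not study results): new tool 1.2, reference tool 1.0.
re_value = relative_efficiency(1.2, 1.0)
print(round(re_value, 2), "more efficient" if re_value > 1 else "not more efficient")
```

Because the ratio is squared, the sign of the SRM is lost, so the comparison is meaningful only when both tools change in the same direction.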
All statistical tests were 2-sided; a P value < 0.05 was considered statistically significant. The statistical packages used were Statistica (release 6.1, StatSoft), Stata (release 9.2, StataCorp), XLSTAT (version 1.02, Addinsoft), and R (version 3.3.3; The R Foundation for Statistical Computing, www.R-project.org).
RESULTS
Construction of MMT-4 and MMT-6. The frequency of impairment of each of the 8 items of the MMT-8 and the frequency of their individual scores in the 3 patient datasets are shown, together with their mean and median scores, in Table 2. As expected, the JDM trial dataset, which included treatment-naïve patients enrolled at disease onset, had a greater degree of muscle weakness than the Routine dataset, which was composed of consecutive patients followed in standard clinical care. The Outcome dataset, whose patients had a disease duration ≥ 2 years and a high prevalence of disease remission, had the least severe muscle impairment. In accordance with Harris-Love, et al,9 each muscle/muscle group was termed according to its function, rather than to its anatomic name.
In the 3 datasets, neck flexors consistently had the lowest frequency of the normal score of 10 and the highest frequency of a score < 5, followed by hip extensors, hip abductors, shoulder abductors, elbow flexors, hip flexors, wrist extensors, and ankle dorsiflexors. Based on the observed figures, neck flexors, hip extensors, hip abductors, and shoulder abductors were included in the MMT-4, whereas the MMT-6 also included elbow flexors and hip flexors. The scores of the MMT-8, MMT-6, and MMT-4 range from 0 to 80, 0 to 60, and 0 to 40, respectively, with the highest score indicating normal muscle strength.
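The composition of the 3 tools can be illustrated with a short Python sketch that sums the 0 to 10 item scores of the relevant muscle groups. The dictionary keys, the function name, and the example patient are hypothetical; the requirement that all 8 items be present mirrors the absence of imputation and is an assumption of the sketch.

```python
MMT8_ITEMS = (
    "neck flexors", "hip extensors", "hip abductors", "shoulder abductors",
    "elbow flexors", "hip flexors", "wrist extensors", "ankle dorsiflexors",
)
MMT4_ITEMS = MMT8_ITEMS[:4]  # the 4 most frequently impaired muscle groups
MMT6_ITEMS = MMT8_ITEMS[:6]  # MMT-4 plus elbow flexors and hip flexors


def mmt_totals(item_scores):
    """Return MMT-8, MMT-6, and MMT-4 totals from a dict of 0-10 item scores."""
    for item in MMT8_ITEMS:
        if item not in item_scores:
            raise ValueError(f"missing score for {item} (no imputation)")
        if not 0 <= item_scores[item] <= 10:
            raise ValueError(f"score for {item} must be between 0 and 10")
    return {
        "MMT-8": sum(item_scores[i] for i in MMT8_ITEMS),  # range 0-80
        "MMT-6": sum(item_scores[i] for i in MMT6_ITEMS),  # range 0-60
        "MMT-4": sum(item_scores[i] for i in MMT4_ITEMS),  # range 0-40
    }


# Invented patient with predominantly proximal and axial weakness (not study data).
patient = {
    "neck flexors": 5, "hip extensors": 7, "hip abductors": 8, "shoulder abductors": 8,
    "elbow flexors": 9, "hip flexors": 9, "wrist extensors": 10, "ankle dorsiflexors": 10,
}
print(mmt_totals(patient))  # {'MMT-8': 66, 'MMT-6': 46, 'MMT-4': 28}
```

In this invented case the patient retains 82.5% of the maximum MMT-8 score but only 70% of the maximum MMT-4 score, which shows how dropping the least affected distal groups concentrates the score on the weakest muscles.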
The score values of the muscle strength tools assessed in the study in the 3 patient datasets are presented in Table 3.
Construct validity. The Spearman correlations of the MMT-4 and MMT-6 with the other muscle strength tools and the JDM outcome measures in the 3 datasets are shown in Table 4. Both reduced tools were closely correlated with the original MMT-8 (with all r values > 0.9). Correlations were also strong with both CMAS and hMC, although lower for the CMAS (which does not include MMT components). Most of the correlations with the other JDM outcome measures were in line with expectations, as they were stronger with the tools that assess constructs related to muscle strength, such as the PGA of muscle activity and the DAS muscle, and weaker with measures that address different disease domains, such as skin disease, pain, and CK. Overall, the correlations yielded by the MMT-4, MMT-6, MMT-8, CMAS, and hMC were comparable.
Multivariable logistic regression analysis confirmed the similarity of the correlations of the original and shortened tools as well as of the hMC (which incorporates the full MMT-8) with the other JDM outcome measures. This analysis showed that the correlation level was substantial only with the CMAS.
Internal consistency. This analysis was performed in each dataset separately. Overall, the Cronbach α was in the substantial to almost perfect range for both the MMT-4 (0.88–0.93) and the MMT-6 (0.91–0.96) and was comparable to that of the MMT-8 (0.93–0.97), CMAS (0.93–0.96), and hMC (0.91–0.96).
Responsiveness to change. The SRM and relative efficiency values for the MMT-4 and MMT-6 and the original tool in the Routine and JDM trial datasets are shown in Table 5. Both reduced tools performed slightly better than the MMT-8. The MMT-4 proved more responsive than the MMT-6.
Discriminant validity. In the Routine dataset, both reduced tools showed strong ability to discriminate patients judged subjectively by the caring physician as being in different disease activity states, with the median values increasing progressively from the state of high disease activity to the state of inactive disease (P < 0.001; Figure 1). A similarly good performance was seen in the discrimination between patients judged subjectively by the parent as being in the states of remission or continued activity/flare, and between patients whose parents were satisfied or not satisfied with the course of their child’s illness (P < 0.001; results not shown). The discriminant capacity of the MMT-4 and MMT-6 was overall comparable to that of the other muscle strength instruments (results not shown).
DISCUSSION
We developed and tested 2 reduced versions of the MMT-8, which are composed of a core set of 4 (MMT-4) or 6 (MMT-6) muscle groups. The selection of the items included in the shortened tools was based on the analysis of the distribution and severity of weakness in the 8 muscle groups that are part of the MMT-8 in a multinational sample of patients followed in routine clinical care, enrolled in an RCT, or included in a long-term outcome study. Altogether, these patients are likely representative of a broad range of activity and severity of JDM.
Individual muscle groups were stratified based on the 10-point MMT-8 grading criteria (Table 2). This exercise provides a detailed account of weakness, assessed with a standardized approach, in a large cohort of patients with JDM. Its findings may offer guidance to delineate the objectives of exercise interventions and may influence the selection of targets in future therapeutic trials.
In accordance with previous studies9 and the well-known proximal pattern of weakness in JDM,10,34 we found that neck flexors, shoulder abductors, hip extensors, and hip abductors were the weakest muscle groups. These muscle groups were elected to constitute the MMT-4. Elbow flexors and hip flexors, which revealed an intermediate degree of impairment, were added to the muscle groups placed in the MMT-4 to make up the MMT-6. Wrist extensors and ankle dorsiflexors were the least severely affected and were therefore excluded.
In validation analyses, both reduced tools revealed strong correlation with the complete instrument and the other established muscle strength measures. As expected, the correlation level was highest with the MMT-8, and was closer with the hMC (which includes MMT components) than with the CMAS. The correlations with the other JDM outcome measures were in line with a priori predictions, showing better relationships with tools that assess similar constructs and poorer correlation with measures that address different disease domains. The Cronbach α for internal consistency was in the substantial to almost perfect range for both MMT-4 and MMT-6 in all datasets, and was comparable to that of MMT-8, CMAS, and hMC. Both reduced tools showed good responsiveness to change over time in both routine and clinical trial samples, and strong ability to discriminate between disease activity states, assessed subjectively by the caring physician or a parent, and between patients whose parents were satisfied or not satisfied with the course of their child’s illness. These findings indicate that both MMT-4 and MMT-6 possess good measurement properties and may serve as surrogates for the complete tools in routine practice and potentially also in research.
It is noteworthy that the responsiveness to change of the MMT-4 and, to a lesser degree, of the MMT-6, was slightly superior to that of the MMT-8. This finding suggests that focusing the assessment on a restricted core set of the most affected muscle groups could enhance the capacity of the MMT to capture improvement or worsening of muscle disease, which can be advantageous for its use in clinical trials.
Our study should be viewed in the light of certain limitations. Because validation analyses were conducted on data stored in existing databases, the inter- and intraobserver reliability of the reduced tools could not be evaluated. Further, due to the lack of prospective assessments, we could not investigate the capacity of the shortened tools to predict disease outcomes, such as continued activity, cumulative damage, or functional disability. In addition, the retrospective nature of our study did not allow us to assess the actual clinical performance of the abbreviated tools. We acknowledge that the results of our study do not imply that other muscle groups excluded from the MMT-4 and MMT-6 are not contributors to the impaired strength, functional limitation, and disability observed in children with JDM. Unfortunately, the study data did not allow us to examine alternative sets of muscle groups, such as those evaluated in the previous study by Rider et al.35 We previously underscored that the MMT-8 lacks the assessment of abdominal muscles, a major site of muscle disease in JDM and often, together with neck flexors, the last muscle group to recover.13 We also recognize that omitting 2 or 4 items of the MMT-8 does not lead to a significant reduction of the length of the examination. Further, the features of our study cohorts did not allow us to evaluate the full spectrum of weakness present in patients with JDM. Although there was no overlap between study visits across the 3 datasets, we cannot exclude that the same patient was included more than once.
We could not address the capacity of the simplified tools to distinguish between the strength impairment that resulted from disease activity vs disease damage.36 Muscle imaging with magnetic resonance imaging (MRI) could be more suitable for this purpose.37 In this respect, a previous study with whole-body MRI has shown a high frequency of increased signal intensity in clinically asymptomatic distal muscles in the limbs.37 These muscles were the least affected in our patient population. The significance of this discordance between clinical and MRI findings is unclear. However, the exclusion of distal muscles (wrist extensors and ankle dorsiflexors) from the shortened composite scores may make them unsuitable in the initial evaluation of patients with JDM or in certain subgroups of patients. The use of very abbreviated sets of muscles in testing could be problematic for patients with severely impaired joint range of motion due to contractures or calcinosis. Finally, we should emphasize that although the shortened tools may also be suited for use in clinical trials and research, they are primarily proposed for use in routine care, particularly in children who may not cooperate for the entire assessment.
In conclusion, we found that the metrologic properties of the MMT-4 and MMT-6 were comparable to those of the other established tools, which suggests that the shortened versions may serve as surrogates for the more comprehensive instruments, particularly in a busy clinical setting or in children who cannot cooperate for the entire duration of the assessment. The better responsiveness to change of the reduced tools, particularly the MMT-4, may make them suitable for use as endpoints in clinical trials. The measurement performance of these tools should be further tested in other populations of patients (including adults with dermatomyositis and polymyositis) and evaluated prospectively prior to potential use in clinical trials.
Footnotes
None of the authors declare any competing interests related to the present manuscript.
- Accepted for publication October 2, 2020.
- Copyright © 2021 by the Journal of Rheumatology