Abstract
Objective. Temporomandibular joint (TMJ) arthritis, commonly considered oligoarthritic/asymptomatic, occurs frequently in children with juvenile idiopathic arthritis (JIA), and gadolinium-enhanced magnetic resonance imaging (Gd-MRI) has proved to be a sensitive diagnostic tool in this context. We compared the reliability of clinical examinations to Gd-MRI results in diagnosing the condition.
Methods. Patients with JIA (134 consecutive) underwent routine clinical and Gd-MRI examinations. The clinical items examined were clicking, tenderness (TMJ/adjacent muscles), and mouth-opening capacity. Blinded MRI reading focused on inflammation (synovitis/hypertrophy). After statistical power analysis, the clinical findings for 134 healthy controls were included. Contingency analysis was used to determine the sensitivity, specificity, and frequency of clinical symptoms (JIA/healthy controls); Cohen’s κ was used to establish the interrater reliability.
Results. Statistically significant differences were observed between JIA and healthy control groups with regard to the concise screening items (power analysis > 0.95), whereas no differences in mouth-opening capacity were noted. In 80% of the patients with JIA, Gd-MRI revealed signs of TMJ arthritis, with positive correlations between concise screening items and Gd-MRI results. The average specificity was 0.81, but the sensitivity was low, at 0.42. Combining items led to a marked increase in the sensitivity (0.73). There was a high rate of both false-negative and false-positive results (corresponding to clinical underdiagnosis or overdiagnosis of TMJ arthritis).
Conclusion. Despite a relatively high specificity, clinical examination alone does not seem sufficiently sensitive to adequately detect TMJ arthritis. Thus, a relatively high number of cases will be missed or overdiagnosed, potentially leading to undertreatment or overtreatment. Gd-MRI may support correct diagnosis, thereby helping to prevent undertreatment or overtreatment.
- JUVENILE IDIOPATHIC ARTHRITIS
- TEMPOROMANDIBULAR JOINT
- CLINICAL SYMPTOMS
- MAGNETIC RESONANCE IMAGING
- TEMPOROMANDIBULAR JOINT ARTHRITIS
The prevalence of temporomandibular joint (TMJ) arthritis in juvenile idiopathic arthritis (JIA), first described by Still1 as early as 1897, has recently become the focus of clinical treatment requirements and scientific research. TMJ arthritis has long been underreported, perhaps because of the insidious progression of the disease and the difficulties in detecting it using clinical examination and imaging techniques2,3,4. Depending on the examination method, reports on the prevalence of TMJ arthritis have varied widely (from 17% to 87%), with more recent results based on gadolinium-enhanced magnetic resonance imaging (Gd-MRI) accounting for the higher numbers3,5,6,7,8,9,10,11. Müller, et al2 and Weiss, et al3 compared MRI and ultrasonography (US) to clinical TMJ examinations and showed that MRI was capable of diagnosing TMJ arthritis in 75% of all cases while US was neither sufficiently sensitive nor specific. With regard to the longterm sequelae of the disease, US proved somewhat more effective: at least 29% of TMJ arthritis cases were identified by US compared with 69% of cases by MRI3. Further, Müller, et al2 showed that, taking MRI as the gold standard, significantly more cases of TMJ arthritis were diagnosed correctly by clinical orthodontic examination than by US. TMJ arthritis has significant potential for subclinical progression, leading to destruction of the condyles, which may result in severe mandibular growth disturbances and subsequent facial asymmetry. Consequently there is a need to ensure reliable diagnosis as early as possible6. Diagnosis by Gd-MRI is known to correlate with histological findings to a great extent but is costly, psychosocially burdensome, and not always available12,13,14.
The aim of our present study was to test the reliability of clinical findings (including a concise screening protocol based solely on the clinical examination of the TMJ) in diagnosing TMJ arthritis as compared with the reference Gd-MRI method.
MATERIALS AND METHODS
Our study was approved by the ethics committee (University of Tübingen project number 613/2011A).
First, over a 3-month period, a total of 134 consecutive patients with JIA underwent routine clinical examination and Gd-MRI within the same week at the interdisciplinary TMJ outpatient clinic. Demographics were recorded as part of this process. Because the aim of the present work was a comparison of clinical and MRI results acquired at the same point in time, we did not take into account the JIA subtype, disease duration, disease activity variables (e.g., physician global assessment scale, number of active joints), or current medication, nor did we note whether this was the patient’s first TMJ examination (clinical or Gd-MRI).
Second, healthy controls were recruited from patients attending the dental practice for routine dental checks. For ethical reasons no Gd-MRI was performed for this control group. Prior to the clinical examination, a survey containing questions about TMJ-related symptoms (jaw pain, TMJ sounds, headache, etc.) as well as general health (e.g., autoimmune diseases) was distributed among the children/their families. Children with TMJ complaints, TMJ-related symptoms, or autoimmune diseases were excluded from the healthy control group. Finally, clinical examination results were obtained for 134 healthy children and adolescents (matched with the patients for age and sex). The medical histories of these subjects showed no indications of TMJ-related symptoms/complaints, autoimmune diseases, or TMJ dysfunction. The clinical study protocol (CSP) for the healthy controls consisted of the same concise screening items (CSI) used in the CSP for the patients with JIA.
The MRI examinations were performed on a 1.5 Tesla system (Magnetom Avanto 1.5 T, Siemens AG) with 3-mm slice thickness. A standard dose of Gd was used (gadobutrol 0.2 ml/kg body weight, gadoterate meglumine), with imaging in the axial, coronal, and sagittal planes2,15. All MR images were evaluated on certified viewing screens using Centricity PACS (GE Healthcare).
For the purposes of the study, inflammation of the TMJ (active TMJ arthritis) was defined as excessive synovial Gd enhancement and synovial hypertrophy, whereby a line less than 0.5 mm thick was considered normal, as in other studies published in this field15,16,17,18,19. The extent of the inflammation or TMJ damage (flattening or destruction of the condyles) was not taken into account in our study. Synovial fluid accumulation was also not documented, because it is not clear the extent to which synovial fluid observed on short-tau inversion recovery/T2 MRI constitutes an indication of pathology rather than a normal finding. Examples of the MR images are provided in Figure 1. Two experienced MRI readers (1 pediatric rheumatologist with more than 5 years’ experience in MRI imaging for TMJ arthritis in JIA, and 1 orthodontist with more than 5 years’ experience in MRI imaging for TMJ arthritis in JIA and temporomandibular disorders), blinded for the results of the clinical examination, reviewed all results. Cohen’s κ was used to calculate the interrater reliability. In cases where discrepancies arose, the final decision was made by consensus.
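The interrater agreement statistic mentioned above can be illustrated with a minimal Python sketch of Cohen’s κ for 2 readers. The function name and the example ratings are hypothetical and are not study data; the sketch only shows how observed agreement is corrected for chance agreement.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical readings (1 = TMJ arthritis on MRI, 0 = no arthritis).
reader_1 = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
reader_2 = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
kappa = cohens_kappa(reader_1, reader_2)
```

In this made-up example the readers agree on 8 of 10 cases, but part of that agreement is expected by chance, so κ is lower than the raw agreement of 0.8.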
The CSP was conducted in accordance with the relevant provisions of the Research Diagnostic Criteria for Temporomandibular Disorders20 and consisted of the following measures (Figure 2): (1) TMJ clicking (including crepitation) during mouth opening or closing: placing the fingers without pressure on the lateral TMJ pole, behind the tragus of the ear, the examiner asked the patient to gradually open his/her mouth as widely as possible, starting with teeth touching, and then close it again. This procedure was performed 3 times; (2) tenderness upon palpation of the lateral TMJ pole: the examiner applied 1 lb of pressure to the lateral TMJ pole, holding the head with the other hand and asking the patient if he/she felt any pain; (3) tenderness upon palpation of the masticatory muscles (pars superficialis of the masseter muscle, pars anterior of the temporalis muscle): palpation of the body of the masseter with a pressure of 2 lb, from the anterior region back to the angle of the mandible and continuing along the middle of the temporalis about 4–5 cm parallel to the lateral border of the eyebrow; and (4) reduced or asymmetric mouth opening: effective unassisted mouth opening, with or without pain. This item was measured with a disinfectable ruler based on the maximum active interincisal distance, corrected to take the overbite into account. Mouth-opening capacity was considered reduced at < 35 mm in children younger than 10 and at < 40 mm in children 10 years of age or older as compared with the findings for healthy children21,22. Mouth opening was considered asymmetric if the lateral deflection was > 2 mm. To measure the deflection, the position of the corresponding inferior front tooth in relation to the middle of the face was marked with black pen when the mouth was closed and then at its widest, with the distance between the 2 marks constituting the lateral deflection. 
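The mouth-opening thresholds described above can be written as a simple decision rule. The following Python sketch is an illustration only; the function names are hypothetical, and the opening value is assumed to be already corrected for overbite, because the correction procedure itself is not specified in the text.

```python
def is_reduced_opening(corrected_opening_mm: float, age_years: float) -> bool:
    """Reduced opening: < 35 mm under 10 years of age, < 40 mm from 10 years on."""
    threshold_mm = 35.0 if age_years < 10 else 40.0
    return corrected_opening_mm < threshold_mm


def is_asymmetric_opening(lateral_deflection_mm: float) -> bool:
    """Asymmetric opening: lateral deflection greater than 2 mm."""
    return lateral_deflection_mm > 2.0


# Hypothetical patient: 9 years old, 34 mm corrected opening, 2.5 mm deflection.
reduced = is_reduced_opening(34.0, 9)
asymmetric = is_asymmetric_opening(2.5)
```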
The same examiner, who was experienced in temporomandibular disorders and blinded for the MRI results, performed all clinical examinations.
The first step of the statistical analysis was to compare the data obtained for the patients with JIA and the healthy controls as part of the concise screening and to test for discrepancies. Second, the results for the patients with JIA were compared to the Gd-MRI results regarding inflammation (TMJ arthritis).
Power analysis with G*Power 3 (HHU)4,23 was used to determine whether the patient sample was sufficient and to ensure an adequate number of healthy controls.
The statistical analysis consisted of a contingency analysis based on the right-sided Fisher’s exact test (α level 0.05) to evaluate the correlation between frequencies of pathological findings in the JIA patient group versus the control group.
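As a rough illustration of a right-sided Fisher’s exact test, the following Python sketch computes the one-sided p value for a 2 × 2 table directly from the hypergeometric distribution. The function name is hypothetical. The example counts are an assumption: they are back-calculated by rounding the frequencies reported in the Results for asymmetric mouth opening (62% and 16% of 134 per group) and are not the study’s raw data, so the resulting p value is indicative only.

```python
from math import comb


def fisher_right_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided (right-tail) Fisher's exact test for the table [[a, b], [c, d]].

    Rows are groups (e.g., patients, controls); columns are finding
    present/absent. Returns P(X >= a) with all margins held fixed.
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of positive findings
    n = a + b + c + d     # all individuals
    upper = min(row1, col1)
    total = comb(n, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return tail / total


# Assumed counts: 83/134 patients vs 21/134 controls with asymmetric opening.
p_asymmetric = fisher_right_sided(83, 51, 21, 113)
```

With these assumed counts the p value falls well below the α level of 0.05, in line with the significant group difference reported for this item.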
Next, a contingency analysis and Fisher’s exact test (α level 0.05) were applied to determine the sensitivity, specificity, and false-positive rates of clinical symptoms compared to Gd-MRI. Items were analyzed both singly and as combinations to test for changes in sensitivity/specificity or false-negative/positive rates for multiple items. To this end, each single concise screening item was combined separately with 1, 2, 3, and 4 other items in all possible combinations.
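The combination analysis can be sketched in Python as follows. Two points are assumptions made for illustration, not details given in the text: a combination is scored as clinically positive if any of its items is positive, and the patient records are invented. The item labels and function names are likewise hypothetical.

```python
from itertools import combinations

ITEMS = ["clicking", "pain_tmj", "pain_masseter",
         "reduced_opening", "asymmetric_opening"]


def item_combinations(items):
    """All combinations of 2 or more items (each item with 1 to 4 others)."""
    return [c for r in range(2, len(items) + 1) for c in combinations(items, r)]


def sens_spec(records, combo):
    """Sensitivity and specificity of a combination against the MRI result.

    Assumed rule: the combination is positive if ANY of its items is positive.
    """
    tp = fn = tn = fp = 0
    for rec in records:
        clinical_positive = any(rec[item] for item in combo)
        if rec["mri"]:
            tp += clinical_positive
            fn += not clinical_positive
        else:
            fp += clinical_positive
            tn += not clinical_positive
    return tp / (tp + fn), tn / (tn + fp)


def record(mri, *positive_items):
    """Build one invented patient record."""
    return {"mri": mri, **{item: item in positive_items for item in ITEMS}}


# Invented records: 4 MRI-positive and 2 MRI-negative patients.
RECORDS = [
    record(True, "clicking"),
    record(True, "pain_tmj"),
    record(True, "asymmetric_opening"),
    record(True),
    record(False, "pain_masseter"),
    record(False),
]

single = sens_spec(RECORDS, ("clicking",))
all_items = sens_spec(RECORDS, tuple(ITEMS))
```

Under the assumed "any item positive" rule, adding items can only keep or raise sensitivity and keep or lower specificity, which is the trade-off a combination analysis has to weigh.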
As well as comparing the diagnostic effectiveness of clinical examination and Gd-MRI concerning TMJ arthritis, this study was also intended to evaluate the diagnostic reliability of the various clinical items (CSI). Therefore we divided our analysis into single-item and combination analysis (testing all possible combinations) to test whether any single item would have sufficient sensitivity and specificity to detect TMJ arthritis or whether instead a combination of items would prove more effective.
RESULTS
A total of 268 individuals were included in the study: 134 consecutive patients with JIA matched for age and sex with 134 healthy controls. The mean age of both cohorts was 13.3 years (SD 2.8), 66% female, and 34% male. Owing to the nature and aim of our study, no data concerning JIA subtype, disease duration, or disease activity variables were included.
The post hoc power analysis with the proportions P1 = 0.22 (patients) versus P2 = 0.13 (control group), α level 0.05, showed a sufficient power of 0.965 for the given proportions and a sample size of 268 individuals in 2 groups of equal size.
The comparison between the concise screening items of the JIA vs healthy control groups revealed significant discrepancies (p < 0.01) for all items except reduced mouth-opening capacity (Table 1). The latter occurred with nearly equal frequency (patients 19% vs control group 18%) in both groups (p = 0.37). The largest difference between the 2 groups was for asymmetric opening of the mouth, with a frequency of 62% in the patient group and 16% in the healthy control group.
TMJ arthritis was diagnosed by Gd-MRI in 80% of the 134 patients with JIA, with 25% showing signs of unilateral and 55% signs of bilateral TMJ arthritis. Cohen’s κ for the interrater reliability of the MRI-based diagnosis of TMJ arthritis was 0.74.
The most sensitive single clinical item (Table 2) for the detection of TMJ arthritis was asymmetric mouth opening, with a sensitivity (sens.) of 0.65 and a specificity (spec.) of 0.78. Both items for pain on palpation yielded intermediate values (sens. 0.61, spec. 0.71 for pain on palpation of masseter muscle and sens. 0.40, spec. 0.86 for pain on palpation TMJ). The least sensitive item was reduced mouth-opening capacity (sens. 0.21, spec. 0.83), followed by TMJ clicking (sens. 0.23, spec. 0.87).
There was a high rate of false negatives (up to 0.79) in the single-item analysis (Table 2).
High false-positive rates of 0.29 and 0.22, respectively, were observed for pain on palpation of masseter muscle and asymmetric mouth opening, with low false-positive rates for the other items.
Fisher’s exact test for the single-item analysis revealed significant correlations with TMJ arthritis detection by Gd-MRI, with low to moderate sensitivity and moderate to high specificity (Table 2).
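The single-item error rates follow arithmetically from the sensitivities and specificities quoted above: the false-negative rate is 1 − sensitivity and the false-positive rate is 1 − specificity. The short Python sketch below, with hypothetical variable and function names, recomputes these rates and their averages from the values given in the text.

```python
# Single-item (sensitivity, specificity) pairs as quoted in the text (Table 2).
SINGLE_ITEMS = {
    "asymmetric mouth opening": (0.65, 0.78),
    "pain on palpation, masseter": (0.61, 0.71),
    "pain on palpation, TMJ": (0.40, 0.86),
    "TMJ clicking": (0.23, 0.87),
    "reduced mouth opening": (0.21, 0.83),
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# False-negative rate = 1 - sensitivity; false-positive rate = 1 - specificity.
fn_rates = {item: 1 - sens for item, (sens, _) in SINGLE_ITEMS.items()}
fp_rates = {item: 1 - spec for item, (_, spec) in SINGLE_ITEMS.items()}

mean_sens = mean(sens for sens, _ in SINGLE_ITEMS.values())
mean_spec = mean(spec for _, spec in SINGLE_ITEMS.values())
mean_fn = mean(fn_rates.values())
mean_fp = mean(fp_rates.values())
```

Rounded to 2 decimal places, these averages are 0.42 (sensitivity), 0.81 (specificity), 0.58 (false-negative rate), and 0.19 (false-positive rate), and the highest single-item false-negative rate is 0.79 for reduced mouth opening.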
The analysis of item combinations revealed an increase in the average sensitivity from 0.42 for the single-item analysis to 0.73 for the combinations. The highest sensitivity was recorded for the combination of all 5 items (sens. 0.85; spec. 0.54).
In general, the average specificity dropped as items were combined, decreasing from 0.81 for the single-item analysis to 0.70 for the combination analysis. The highest specificity was recorded for “TMJ clicking” + “Pain on palpation TMJ” (sens. 0.50; spec. 0.80).
The combination analysis revealed a marked decrease in the false-negative rate from 0.58 for the single-item analysis to 0.33 for the combination analysis.
We also observed an increase in the average false-positive rate from 0.19 for the single-item analysis to 0.35 for the combination analysis. A maximum of 0.46 was recorded for the combination of all 5 items.
DISCUSSION
Since TMJ arthritis may have an oligoarthritic/asymptomatic, progressive/destructive course, the importance of early diagnosis and consequent adequate treatment is obvious10,24,25,26,27,28,29.
The results of our present study confirm the high prevalence of TMJ arthritis in patients with JIA previously indicated by Gd-MRI examinations (80% of patients showed MRI signs of TMJ arthritis). Similar rates have been observed in the respective study populations of other MRI-based studies2,3,5,29. These findings contrast with those of other, non-MRI based studies, which revealed lower rates based on clinical examination and/or the frequency of condylar alterations in radiographic examinations6,7,8,9,10. While the role of Gd-MRI as a highly sensitive and specific diagnostic tool for joint inflammation remains undisputed12,13,30,31, its routine use for screening purposes and for the early detection of TMJ arthritis is precluded for a number of reasons such as invasiveness, limited availability, and high cost.
Although clinical procedures for manual functional analysis and various screening techniques have been validated20,32,33 for the diagnosis of temporomandibular disorders, they have also been subject to criticism34. The concise screening tool for the diagnosis of TMJ arthritis was developed on the basis of these clinical procedures with the aim of creating a short, easily applicable clinical diagnostic procedure for the detection of TMJ abnormalities, particularly in children with JIA.
The statistically significant discrepancies in CSI frequency observed between patients with JIA and healthy controls were to be expected, because healthy controls should not display a higher frequency of pathological findings. Interestingly, both groups yielded similar results for mouth-opening capacity. This is surprising because mouth-opening capacity is widely considered a suitable criterion for the diagnosis and treatment monitoring of TMJ arthritis2,3,35. Assuming that the healthy controls did not have undetected TMJ arthritis, these findings cannot be explained conclusively. Possible explanations include the high variance of mouth-opening capacity in healthy children, demonstrated by Müller, et al22, and the problem of the smallest detectable difference while measuring mouth-opening capacity, discussed by Stoustrup, et al36. Another explanation could be that most publications indicating a high frequency of mouth-opening restriction listed this orofacial abnormality as an inclusion criterion, which may have biased the results. The present work included 134 consecutive patients with JIA, irrespective of the presence of orofacial abnormalities. Thus, reduced mouth-opening capacity may not always be a very reliable criterion for diagnosing TMJ arthritis. Further studies are needed to evaluate the significance and reliability of this item in diagnosing TMJ arthritis.
The results of the clinical examination revealed formal positive correlations between each single concise screening item and TMJ arthritis (detected by Gd-MRI). However, the overall sensitivity of the single items was low (between 0.21 and 0.65). This finding is in line with the low sensitivity of the clinical examination results observed by Weiss, et al3 and Müller, et al2. While these studies found that reduced mouth-opening capacity was the only clinical item to correlate significantly with TMJ arthritis or used it to document treatment response, our study revealed a very low sensitivity for this item (0.21). Despite its high specificity (0.83) in our study, the low sensitivity seems to indicate that it is not a reliable clinical item for the detection of TMJ arthritis. This may be due to the lack of distinction in mouth-opening capacity between patients with JIA and healthy controls as described above. Compared to other studies, this discrepancy may also be the result of specificities of the population of northern Germany, where joint hypermobility appears to be relatively prevalent37. This might cause difficulties in finding the limitation of the range of motion for all joints and particularly the TMJ. In any case, because no difference in mouth-opening capacity was observed between both groups, it appears to be of little relevance in this context and can only be considered useful in daily practice when combined with other diagnostic options and not as a key symptom.
In our study, the highest single-item sensitivity measured was for asymmetric mouth opening (0.65), which also demonstrated favorable specificity (0.78). This reflects the overall sensitivity and specificity results for clinical items found in other publications2,3,35. Nonetheless, no single item was found to have sufficient sensitivity to reliably detect TMJ arthritis. We therefore analyzed combinations of various concise screening items to test whether a combination of 2 or more clinical items would prove more effective. The combination of clinical items gave rise to a marked increase in the average sensitivity (from 0.42 to 0.73). The maximum increase observed was for the combination of all 5 items (0.85), for which a small but acceptable decrease in the overall specificity was also observed (from 0.81 to 0.70). Although these findings sound logical, this is the first scientific study, to our knowledge, to compare combinations of different clinical items and Gd-MRI results and to show that combining items can increase sensitivity while maintaining favorable specificity, which makes the clinical examination with combined items more reliable in detecting TMJ arthritis than the use of single items. This is especially relevant for the development of clinical scoring systems. Even in daily practice, combining concise screening items during clinical examination should increase the effectiveness of the examination in clinical diagnosis of TMJ arthritis.
An important finding in our present study is the high false-negative rate of clinically detectable cases of TMJ arthritis, corresponding to the number of cases of TMJ arthritis that will not be diagnosed by clinical examination alone. Although the false-negative rate follows directly from the low sensitivity of the single-item analysis, we believe this is the first study to report it explicitly. On the other hand, analysis of item combinations showed an increase in the number of patients diagnosed with TMJ arthritis, indicating that item combinations are more reliable as a diagnostic tool. Nevertheless, a relatively high number of patients would still fail to be diagnosed correctly with TMJ arthritis based on clinical examination alone, as already demonstrated by other studies.
The most significant result of our present study is the relatively high false-positive rate of clinically diagnosed TMJ arthritis, corresponding to the number of patients potentially wrongly diagnosed with TMJ arthritis by clinical examination alone. This confirms the results of Müller, et al2, which already included (with a smaller number of patients) false-positive and false-negative results based on a clinical orthodontic examination, where the presence of 2 or more of 6 items was defined as “active arthritis”. By contrast, the present investigation is the first to show the precise correlation of each CSI to MRI-detected inflammation. Thus, patients may be overtreated if treatment decisions are based only on clinical examination. Although the false-positive rate seems to rise with the number of items included in the combination, this increase cannot be explained without further investigation and may simply be the result of statistical analysis.
However, there are some limitations to this study. MRI is considered the gold standard for the detection of TMJ arthritis and clinical results are compared to this standard. Although synovial Gd enhancement is considered pathologic15, some concerns exist regarding the overinterpretation of MRI findings especially for synovial Gd enhancement16. Therefore, only excessive synovial Gd enhancement above normal in combination with synovial hypertrophy was classified as inflammation on the MRI images, whereby synovial hypertrophy caused by the response of the synovial tissue to the autoimmune inflammatory process was taken into consideration. Further MRI measures, such as synovial fluid accumulation or bone marrow edema/osteitis, were not taken into account, because the diagnostic value of these items with respect to TMJ arthritis is not known.
The extent of the TMJ inflammation or damage was not considered in our study and therefore no statements could be made regarding the relationship between the severity of the inflammation/damage and the sensitivity/specificity of the clinical examination.
For ethical reasons, healthy controls were not subjected to Gd-MRI examinations. Therefore our results are based on the presumption that these healthy controls do not have asymptomatic TMJ arthritis15,18,19,38.
Our present study has a cross-sectional design and represents patients attending a single center analyzed at a single point in time. Further longitudinal investigations are needed to scrutinize the relative ability of the various clinical items to predict damage.
Lastly, some of the studies referenced deal with TMJ dysfunction (TMD) and not TMJ arthritis. It is not known how TMD presents in Gd-MRI and whether there are similarities or discrepancies in the diagnosis of both conditions using Gd-MRI. Given the frequent delay in diagnosing JIA, the possibility that a certain portion of patients with TMD may also have JIA or another underlying chronic inflammatory condition cannot be ruled out. It is also important to mention that this screening tool does not include mouth-opening capacity with or without pain. Inflammation may not necessarily cause a mechanical limitation in mouth-opening capacity, but increasing pain could prevent the patient from opening his or her mouth wide. In this regard, Salé, et al39 demonstrated the high prevalence of symptoms in patients with disc displacement, which could cause a mechanical limitation of the TMJ. On the other hand, pain is always subjective and variable and cannot always be reproduced or quantified reliably40,41,42. Further, the physiological TMJ movement is a combination of rotation and translation of the condyle head on the top of the discus articularis, which separates the joint into an upper sliding part and a lower rotating part. Mouth-opening limitations and deflections may be related to dysfunctions where the translation is blocked and only rotation in the lower compartment is possible. These 2 issues require further investigation.
Clinical TMJ examination may not be a sufficiently reliable diagnostic tool in its own right for diagnosing TMJ arthritis. Considering Gd-MRI as the gold standard for the diagnosis of TMJ arthritis, clinical findings will not correlate to Gd-MRI results in a relatively high proportion of patients, and false-positive clinical findings may sometimes result in the administration of unnecessarily aggressive medical treatment.
Our results demonstrate that individual CSI used to diagnose TMJ arthritis offer relatively high specificity for each single item, together with a low overall sensitivity. Thus a relatively high number of patients will remain undiagnosed with TMJ arthritis by clinical examination alone, exposing them to undetected joint destruction. More importantly, clinical examination may yield a relatively high false-positive rate for TMJ arthritis, resulting in a relatively high number of potentially overtreated patients.
Thus it is clear that clinical examination alone is not always sufficiently reliable for the detection of TMJ arthritis and involves the risk of overdiagnosis or underdiagnosis. Gd-MRI is recognized as the gold standard for the diagnosis of TMJ arthritis and should be considered as an important additional diagnostic tool in cases of uncertainty.
- Accepted for publication April 3, 2014.