Abstract
Objective. To investigate the correlation between ultrasound (US) B-lines and high-resolution computed tomography (HRCT) findings in the assessment of pulmonary fibrosis (PF) in patients with connective tissue disorders (CTD).
Methods. Thirty-four patients with a diagnosis of CTD were included. Each patient underwent clinical examination, pulmonary function test (PFT), chest HRCT, and lung US by an experienced radiologist or rheumatologist. A second rheumatologist carried out US examinations to assess interobserver agreement. In each patient, US B-line lung assessment including 50 intercostal spaces (IS) was performed. For the anterior and lateral chest, the IS were the second to the fifth along the parasternal, mid-clavicular, anterior axillary, and medial axillary lines (the left fifth IS of the anterior and lateral chest was not performed because of the presence of the heart, which limits lung visualization). For the posterior chest, the IS assessed were the seventh to the eighth along the posterior-axillary and subscapular lines. The second to eighth IS were assessed in the paravertebral line. In each IS, the number of US B-lines under the transducer was recorded, summed, and graded according to the following semiquantitative scoring: grade 0 = normal (< 10 B-lines); grade 1 = mild (11 to 20 B-lines); grade 2 = moderate (21 to 50 B-lines); and grade 3 = marked (> 50 B-lines).
Results. A total of 1700 IS in 34 patients were assessed. A significant linear correlation was found between the US score and the HRCT score (p < 0.001; correlation coefficient ρ = 0.875). A positive correlation was found between US B-line assessments and values of DLCO (p = 0.014). Both κ values and overall percentages of interobserver agreement showed excellent agreement.
Conclusion. Our study demonstrates that US B-line assessment may be a useful and reliable additional imaging method in the evaluation of PF in patients with CTD.
Pulmonary fibrosis (PF) is one of the manifestations of lung involvement in connective tissue diseases (CTD) that may lead to severe impairment of respiratory function and gas exchange1,2,3,4. Conventional chest radiography has been widely used as the first imaging approach to assess PF5, but low sensitivity in the early stages has limited its use in daily clinical practice6. Currently, high-resolution computed tomography (HRCT) is the most common imaging technique used in the assessment of PF7,8. However, its use involves high doses of ionizing radiation.
Although the role of ultrasound (US) in different lung diseases has been described9,10,11,12,13,14,15,16,17, only in recent years has its criterion validity in the assessment of PF been investigated in patients with systemic sclerosis (SSc)18. This has opened a new research field focused on the use of US for assessment of PF in CTD.
Our main aim was to investigate the correlation between US and HRCT findings in the assessment of PF in patients with CTD. Our secondary aims were to determine the interobserver agreement in the assessment of US findings and the correlation between the US and clinical data.
MATERIALS AND METHODS
Patients
We included consecutive patients with a diagnosis of CTD [SSc, Sjögren’s syndrome (SS), antisynthetase syndrome, dermatomyositis (DM), mixed CTD, and undifferentiated CTD] attending the outpatient and inpatient clinics of the Rheumatology Department of Università Politecnica delle Marche, Italy. The diagnoses were made according to the respective international criteria. The inclusion criteria were age > 18 years, nonsmoker, and having a diagnosis of CTD. To avoid overlap of lung findings, patients were excluded if they had a history of pulmonary diseases other than secondary PF, such as chronic obstructive pulmonary disease and pulmonary edema due to heart failure, or if they had ever undergone pulmonary surgical procedures. Other causes of PF were excluded mainly on the basis of clinical manifestations. In patients with even a minimal suspicion of heart failure, echocardiography was done to exclude cardiac involvement.
Study design
All patients underwent the following procedures on the same day: (1) a complete clinical evaluation by an expert rheumatologist, with particular attention paid to the detection of fine crackles and of a “Velcro sound” at the lung bases during auscultation, and with health-related quality of life calculated using a validated Italian translation of the self-administered Medical Outcomes Study Short-Form 36 (SF-36; IQOLA SF-36 Italian Version 1.6)19,20,21; (2) pulmonary function test (PFT) and DLCO measurement; and (3) chest US examination, carried out by 2 rheumatologists to evaluate the interobserver reliability. The first rheumatologist (MG) had 8 years of experience in musculoskeletal US and 4 years of experience in sonographic lung assessment, and was blinded to clinical and spirometric data. The second (MT) had 2 years of experience in musculoskeletal US and 1 year of experience in sonographic lung assessment, and was blinded to clinical and spirometric data and to the first operator’s results. Prior to our study, the investigators reached a consensus on the US scanning technique and on interpretation of US findings.
HRCT examinations were carried out at the Radiology Department of the same university and performed within 7 days after US assessment by an expert radiologist who was blinded to clinical, PFT, and US findings.
Our study was conducted according to the Declaration of Helsinki and local regulations. The local Ethics Committee gave approval for the study and informed consent was obtained from all patients.
Pulmonary function tests
Standard spirometric measurements of lung volumes, flow indices, and DLCO were performed in the Lung Function Laboratory of the Pulmonary Department (Ospedale “Carlo Urbani”, Jesi, Ancona, Italy). Forced vital capacity (FVC) and forced expiratory volume in the first second (FEV1) were measured with a computerized lung analyzer (Masterscreen PFT-PRO, Viasys Jaeger, Höchberg, Germany). The DLCO was determined as the single-breath diffusing lung capacity and corrected for hemoglobin and CO levels. The results were expressed as a percentage of predicted values.
US assessment
US examinations were performed using a MyLab 70 XVG (Esaote Biomedica, Genoa, Italy) equipped with a 2–7 MHz broadband convex multifrequency transducer, with patients in the supine or near-supine position (with the arms elevated and hands clasped behind the neck) for anterior and lateral scanning, and in the sitting position (with the arms along the trunk) for posterior scanning.
The anterior chest wall was defined from clavicles to diaphragm and from sternum to anterior axillary line. The lateral chest wall was defined from armpit to diaphragm and from anterior to posterior axillary line, whereas the posterior chest wall was delineated from a line at the first dorsal spinal apophysis to that at the tenth dorsal spinal apophysis, from the posterior axillary line to the paravertebral line.
For the anterior and lateral chest wall, US assessment was performed bilaterally on the parasternal, mid-clavicular, anterior axillary, and medial axillary lines. On the right side, US assessment was performed from the second to the fifth intercostal space (IS), whereas on the left side the fifth IS was not evaluated because the heart limits lung visualization22,23. US examination of the posterior chest wall included the seventh and eighth IS along the posterior-axillary and subscapular lines. Finally, along the paravertebral line, US examination was performed from the second to the eighth IS. Figures 1 and 2 show the IS assessed by US and the relative patient and probe positions. The scanning protocol included assessment of 50 IS for each patient (28 for the anterior and lateral chest and 22 for the posterior chest).
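The site count in this protocol can be checked by enumerating the scanning positions. The following is only an illustrative sketch: the function name, tuple layout, and region labels are hypothetical and are not part of the study protocol.

```python
# Illustrative enumeration of the scanning protocol described in the text.
# All names here (scanning_sites, the tuple layout) are hypothetical.

ANTERIOR_LATERAL_LINES = [
    "parasternal",
    "mid-clavicular",
    "anterior axillary",
    "medial axillary",
]
POSTERIOR_LINES = ["posterior axillary", "subscapular"]


def scanning_sites():
    """Return one (side, region, line, intercostal_space) tuple per site."""
    sites = []
    for side in ("right", "left"):
        # Second to fifth IS on the right; the fifth IS is skipped on the
        # left because the heart limits lung visualization.
        last_space = 5 if side == "right" else 4
        for line in ANTERIOR_LATERAL_LINES:
            for space in range(2, last_space + 1):
                sites.append((side, "anterior/lateral", line, space))
        # Seventh and eighth IS on the posterior-axillary and subscapular lines.
        for line in POSTERIOR_LINES:
            for space in (7, 8):
                sites.append((side, "posterior", line, space))
        # Second to eighth IS on the paravertebral line.
        for space in range(2, 9):
            sites.append((side, "posterior", "paravertebral", space))
    return sites


sites = scanning_sites()
anterior_lateral = [s for s in sites if s[1] == "anterior/lateral"]
posterior = [s for s in sites if s[1] == "posterior"]
print(len(sites), len(anterior_lateral), len(posterior))  # 50 28 22
```

With 34 patients, this enumeration gives 34 × 50 = 1700 IS in total.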
In each IS, the elementary finding evaluated was the US B-line, an artefact generated by thickened interlobular septa at the lung surface. A B-line was defined as a hyperechoic, narrow-based reverberation artefact spreading like a laser ray up to the edge of the screen. The global number of US B-lines was recorded, summed, and graded according to the following semiquantitative scoring: grade 0 (normal) < 10 B-lines; grade 1 (mild) 11 to 20 B-lines; grade 2 (moderate) 21 to 50 B-lines, and grade 3 (marked) > 50 B-lines24. The US B-line data were classified according to the semiquantitative score to correlate with the HRCT findings. The semiquantitative score was obtained using the distribution of percentile analysis. US examinations were performed independently by 2 sonographers blinded to clinical and HRCT data. They recorded representative samples of the full examination.
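The semiquantitative grading can be written as a simple threshold function. This is a minimal sketch with a hypothetical function name. Note that the published cutoffs (< 10 for grade 0, 11 to 20 for grade 1) leave a total of exactly 10 B-lines unassigned; the sketch places it in grade 0, which is an assumption and not something stated in the text.

```python
def b_line_grade(total_b_lines: int) -> int:
    """Map the summed B-line count over all IS to the semiquantitative grade.

    Assumption: a total of exactly 10 is treated as grade 0, because the
    published cutoffs do not say where it belongs.
    """
    if total_b_lines < 0:
        raise ValueError("B-line count cannot be negative")
    if total_b_lines <= 10:   # grade 0 = normal (exactly 10 is an assumption)
        return 0
    if total_b_lines <= 20:   # grade 1 = mild (11 to 20)
        return 1
    if total_b_lines <= 50:   # grade 2 = moderate (21 to 50)
        return 2
    return 3                  # grade 3 = marked (> 50)


# Usage: sum the per-IS counts first, then grade the total.
# The counts below are made-up illustration values, not study data.
example_counts = [0, 2, 1, 0, 3, 5, 4]
print(b_line_grade(sum(example_counts)))  # 1
```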
HRCT assessment
HRCT examination was performed using a standard protocol with a CT 644E Light Speed VCT scanner (GE Healthcare) and a scanning time of 0.65 s. Scans were obtained at full inspiration with the patients in the supine position, at 120 kV and 300 mAs, with a slice thickness of 1.25 mm and slice spacing of 7 mm. The scans were then reconstructed with a high-resolution “bone” algorithm (window level −500 to −600 HU; window width 1800–2000 HU). In cases showing increased opacification in the posterobasal segments, a limited number of sections was also acquired through the lower zones of the lung, with the patient in the prone position, to ensure that opacification was not due to gravity-dependent perfusion.
Pulmonary involvement was evaluated according to the Warrick, et al score25. To correlate the US B-lines with HRCT findings, the following semiquantitative score was used: grade 0 = absence of abnormalities (0 points), grade 1 = < 8, grade 2 = 8–15, and grade 3 = > 15 points.
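The mapping from Warrick points to the semiquantitative HRCT grade can be sketched in the same way. The function name is hypothetical, and the sketch assumes the input is the total Warrick score in points.

```python
def hrct_grade(warrick_points: int) -> int:
    """Map a total Warrick score (in points) to the semiquantitative grade."""
    if warrick_points < 0:
        raise ValueError("Warrick score cannot be negative")
    if warrick_points == 0:    # grade 0 = absence of abnormalities
        return 0
    if warrick_points < 8:     # grade 1 = < 8 points
        return 1
    if warrick_points <= 15:   # grade 2 = 8-15 points
        return 2
    return 3                   # grade 3 = > 15 points


print([hrct_grade(p) for p in (0, 7, 8, 15, 16)])  # [0, 1, 2, 2, 3]
```

Putting both imaging results on the same 0 to 3 ordinal scale is what allows the US and HRCT grades to be compared directly.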
Statistical analysis
Statistical analysis was performed using MedCalc, version 10.0 (MedCalc Software, Mariakerke, Belgium). Descriptive results were expressed as mean, SD, and median. Categorical data were expressed as proportions. Chi-square analysis was used for comparison between the US and HRCT grades, while Spearman’s rank correlation coefficient was used to assess the correlation between the US and HRCT scores. P values < 0.05 were considered statistically significant. Box-and-whisker plots with the Kruskal-Wallis test were used to represent the relationship between DLCO and US findings.
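For readers unfamiliar with Spearman’s coefficient, it is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below is a plain-Python illustration of that definition. It is not the MedCalc implementation used in the study, the function names are hypothetical, and the example grades are made up.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (var_x * var_y) ** 0.5


# Made-up grades for illustration only (not study data).
us_grades = [0, 1, 2, 3, 3, 2]
hrct_grades = [0, 0, 2, 3, 3, 2]
print(round(spearman_rho(us_grades, hrct_grades), 3))
```

Because both scores are ordinal grades with many ties, the average-rank handling matters; the shortcut formula based on squared rank differences is exact only when there are no ties.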
The interobserver agreement between the 2 investigators was calculated for the semiquantitative scoring using the weighted κ statistic and overall agreement. A κ value of 0.01–0.20 was considered poor, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, and 0.81–1.00 almost perfect. The minimal detectable difference was calculated by analysis of Bland-Altman plots for each anatomical chest line.
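A weighted κ for two raters grading on the 0 to 3 scale can be sketched as follows. The text does not state which weighting scheme was used, so the linear weights here are an assumption, and the function names are hypothetical. The interpretation bands follow the text; values at or below 0.01, which the text does not cover, are labeled "poor" here as a further assumption.

```python
def weighted_kappa(rater_a, rater_b, n_categories=4):
    """Weighted kappa for two raters using linear disagreement weights.

    Assumption: linear weights |i - j| / (k - 1); the source does not
    specify the weighting scheme.
    """
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("need two non-empty sequences of equal length")
    k = n_categories
    # Observed joint proportions of (grade by A, grade by B).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        return abs(i - j) / (k - 1)

    disagreement_obs = sum(
        weight(i, j) * observed[i][j] for i in range(k) for j in range(k)
    )
    disagreement_exp = sum(
        weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k)
    )
    if disagreement_exp == 0:
        raise ValueError("kappa is undefined: both raters used one category")
    return 1 - disagreement_obs / disagreement_exp


def kappa_label(kappa):
    """Interpretation bands from the text (<= 0.01 as 'poor' is an assumption)."""
    if kappa > 0.80:
        return "almost perfect"
    if kappa > 0.60:
        return "substantial"
    if kappa > 0.40:
        return "moderate"
    if kappa > 0.20:
        return "fair"
    return "poor"


# Made-up grades for illustration only (not study data).
first_reader = [0, 1, 2, 3, 3, 2, 0, 1]
second_reader = [0, 1, 2, 3, 2, 2, 0, 1]
print(kappa_label(weighted_kappa(first_reader, second_reader)))
```

Unlike the unweighted statistic, the weighted form penalizes a grade 0 versus grade 3 disagreement more heavily than a disagreement between adjacent grades, which suits an ordinal score.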
RESULTS
Thirty-four patients (30 women, 4 men) with a diagnosis of CTD (26 SSc, 2 SS, 2 antisynthetase syndrome, 2 DM, 1 mixed CTD, and 1 undifferentiated CTD) were included in our study. Demographic and clinical data are reported in Table 1.
A total of 1385 US B-lines were found in 1700 IS of 34 patients. Overall (data obtained from both imaging techniques), 14 patients (41.2%) showed grade 3 PF according to the semiquantitative scoring, 6 patients (17.6%) grade 2, and no patients grade 1.
When the imaging techniques were analyzed separately, 16 patients (47.1%) showed grade 3 PF according to the US semiquantitative scoring, 8 patients (23.5%) showed grade 2, and 4 patients (11.8%) showed grade 1. Five patients (14.7%) did not show US signs of PF. Regarding the HRCT findings, 16 patients (47.1%) showed grade 3 PF according to the Warrick score, 10 patients (29.4%) showed grade 2, and no patient showed grade 1. Eight patients (23.5%) did not show HRCT signs of PF. Five patients (14.7%) had pulmonary hypertension on echocardiography (using a mean pulmonary arterial pressure > 25 mm Hg as the cutoff).
The mean time spent on each US examination (including 50 IS) was 23 min (SD 4.5, range 16–30).
The IS with the most US B-lines was the eighth of the subscapular line, followed by the eighth of the posterior-axillary line, whereas the one with the fewest B-lines was the second of the medial axillary line. Overall, more B-lines were found on the posterior wall than on the anterior and lateral aspects of the chest (Table 2).
A significant linear correlation was found between the US and the HRCT scores (p < 0.001; coefficient of rank correlation, ρ = 0.875; Figure 3A). This correlation was also calculated using the chi-square test that confirmed the data (p = 0.0006, contingency coefficient 0.745). Moreover, a positive correlation between US B-line assessments and values of DLCO and DLCO/FVC was found (p = 0.014, p = 0.032, respectively; Figure 3B). Figure 4 shows the various US scores of PF involvement.
Both the κ values and percentages of overall agreement of US B-lines in semiquantitative interobserver assessments showed almost perfect agreement in results between the 2 sonographers (Table 3).
DISCUSSION
Our study demonstrates a significant positive correlation between US and HRCT scores in the assessment of PF in patients with CTD. Such a correlation should be interpreted considering that our cohort of patients showed varying grades of PF severity (from mild to severe), unlike the patients in previous studies. The potential of US in the assessment of PF in patients with SSc has been suggested. Gargani, et al18 showed a positive correlation between US and HRCT data using a cardiac sector transducer (2.5–3.5 MHz). We obtained similar results in a heterogeneous group of patients, confirming the usefulness of this technique.
Moreover, we proposed a semiquantitative system for the assessment of US B-lines that proved highly reproducible, with excellent interobserver agreement between the 2 investigators. This is important for medical training and for the development of standardized teaching programs for use of this technique.
The mean time needed to perform a complete US B-line assessment for each patient was 23 min (range 16–30). This result is controversial, because some authors have indicated that a comprehensive US B-line assessment could be performed in < 10 min18,26. We believe that this is possible in patients without severe PF. Counting B-lines takes longer in patients with severe PF because there are more B-lines to evaluate and because assessment is more challenging in the upper IS, where the ribs are slanted.
The positive correlation between US B-lines and DLCO and DLCO/FVC is a further element supporting the construct validity of US B-line assessment in PF evaluation. Considering that DLCO reduction is not specific for the presence of PF27, a US B-line count could add further data that can be used for both diagnostic and monitoring purposes. In addition, elderly patients cannot always perform breathing maneuvers correctly, which can make PFT results unusable.
We found a higher number of US B-lines in the lower IS of the posterior chest. This is in accord with the typical features of PF in CTD, which involves mainly the lower parts of the lung. Thus, the lower IS of the posterior chest could be used to reveal US B-lines in the early phase of the disease.
The advantages of US are well known: it is a widely available bedside procedure that is nonionizing, inexpensive, and well accepted by patients. These features are also of value in the assessment of US B-lines. Even portable machines and transducers with large surfaces and frequencies between 5 and 7.5 MHz permit a complete and detailed lung assessment28.
Although HRCT remains the most frequently used imaging technique to assess PF, our results indicate that US, together with the PFT and DLCO, is a useful tool for systematic evaluation of patients who may have pulmonary involvement. Moreover, the characteristics of US make it ideal for patients who should avoid ionizing radiation, such as young patients or pregnant women.
Our study has some limitations. First, the low number of patients did not permit an accurate evaluation of sensitivity and specificity, which could better support these data. Second, intraobserver agreement was not tested, which would have supported our reliability data more consistently. Finally, the US findings were not correlated with stratified HRCT findings such as ground-glass opacity or alveolitis. Therefore, our US results do not necessarily reflect the full extent of pulmonary disease in patients with CTD. However, Lichtenstein, et al showed that US B-lines have a sensitivity of 92.5% and a specificity of 65.1% for diagnosis of alveolar-interstitial syndrome23.
Our study shows that US B-line assessment may be a valid and reliable additional imaging method in the evaluation of PF in patients with CTD and provides evidence justifying further investigation and multicenter studies. Such studies should involve larger cohorts and should include sensitivity, specificity, predictive value, and stratification of the Warrick score into fibrosis and ground-glass/alveolitis, to demonstrate which correlate better with HRCT findings.
Acknowledgment
We thank the Lung Function Laboratory of the Division of Pulmonary Diseases (Carlo Urbani Hospital, Jesi, Ancona, Italy) for the assessments of the pulmonary function test.
Accepted for publication April 30, 2012.