Abstract
Objective. To estimate systemic autoimmune rheumatic disease (SARD) prevalence across 7 Canadian provinces using population-based administrative data, evaluating both regional variations and the effects of age and sex.
Methods. Using provincial physician billing and hospitalization data, cases of SARD (systemic lupus erythematosus, scleroderma, primary Sjögren syndrome, polymyositis/dermatomyositis) were ascertained. Three case definitions (rheumatology billing, 2-code physician billing, and hospital diagnosis) were combined to derive a SARD prevalence estimate for each province, categorized by age, sex, and rural/urban status. A hierarchical Bayesian latent class regression model was fit to account for the imperfect sensitivity and specificity of each case definition. The model also provided sensitivity estimates of different case definition approaches.
Results. Prevalence estimates for overall SARD ranged between 2 and 5 cases per 1000 residents across provinces. Similar demographic trends were evident across provinces, with greater prevalence in women and in persons over 45 years old. SARD prevalence in women over 45 was close to 1%. Overall sensitivity was poor, but estimates for each of the 3 case definitions improved within older populations and were slightly higher for men compared to women.
Conclusion. Our results are consistent with previous estimates and other North American findings, and provide results from coast to coast, as well as useful information about the degree of regional and demographic variations that can be seen within a single country. Our work demonstrates the usefulness of using multiple data sources, adjusting for the error in each, and providing estimates of the sensitivity of different case definition approaches.
Systemic autoimmune rheumatic diseases (SARD) are complex autoantibody-associated chronic inflammatory disorders characterized by rheumatic manifestations and multiorgan inflammation that often lead to damage. The definition of SARD used in our study includes the following: systemic lupus erythematosus (SLE), systemic sclerosis (SSc), primary Sjögren syndrome (pSS), and polymyositis-dermatomyositis. (In other contexts, the term SARD combines the above with other autoimmune rheumatic diseases such as rheumatoid arthritis and vasculitis.) SARD frequently require intensive specialty care and are potentially disabling, incurring high direct medical costs1,2,3 as well as indirect costs due to loss of productivity4. These effects are especially significant in developed countries with aging populations (because SARD affect people in midlife and beyond), particularly in the setting of a declining pool of specialists (e.g., rheumatologists)5.
In 2011, our group published SARD prevalence estimates across 3 Canadian provinces: Quebec, Manitoba, and Nova Scotia6. The results from this work suggested a high burden, with the prevalence for certain demographics (e.g., older women) reaching or exceeding 1%. Ongoing surveillance of these diseases is important from both medical and public health perspectives, to improve understanding of their medical, personal, and societal effects. Our current paper thus aims to extend previous results, to estimate SARD prevalence across 7 (out of 10) provinces, providing a more complete national Canadian perspective. (Data from 3 small Atlantic provinces, namely New Brunswick, Prince Edward Island, and Newfoundland and Labrador, were not easily available; together these provinces account for less than 5% of the population of Canada.)
Population-based administrative databases offer a potentially useful way of acquiring longitudinal epidemiological data on an entire jurisdiction. An inherent limitation, however, is that the diagnoses within billing and hospital data are not necessarily clinically confirmed, and any case definition has imperfect sensitivity and specificity. For this reason, as described below in our methods, we use latent class models to help deal with the imperfect nature of administrative data. The methods we use take into account imperfect sensitivity and specificity, and also afford a means of estimating case definition sensitivity and specificity, especially when these variables might differ importantly across different subpopulations or jurisdictions.
MATERIALS AND METHODS
Our research was approved by all relevant provincial data access and institutional review boards.
Data sources
Essentially all Canadians are covered by provincial healthcare plans. In Canada (and other countries with comprehensive healthcare), all citizens are entitled to publicly funded physician care. Normally, each time an encounter occurs, a physician may bill the provincial government for services rendered. However, some physicians partake in “alternative payment plans,” which provide, for example, an annual salary in place of fee-for-service remuneration. Such arrangements are sometimes made for Canadian physicians who practice in an academic setting. To maintain provincial statistics regarding physician use, administrative databases collect “shadow bills,” meaning a claim is submitted with each visit but not remunerated directly.
The data sources used in our study were the provincial health administrative databases containing information on virtually all residents of Nova Scotia (913,000), Quebec (7.5 million), Ontario (12.2 million), Manitoba (1.1 million), Saskatchewan (968,000), Alberta (3.3 million), and British Columbia (4.1 million). These sources document essentially all physician services (with 1 physician diagnostic code for each visit, except Alberta and Nova Scotia, which allow more than 1), and all hospitalizations (with multiple discharge diagnoses for each hospitalization). In both cases, diagnoses are captured under International Classification of Diseases (ICD) codes, with SARD cases falling under ICD-9 code 710.x and ICD-10 Canadian codes7 M32.1, M32.8-32.9, M33–M34, M35.0, M35.8-35.9, and M36.0.
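As an illustration only, the code list above can be expressed as a small membership check. This is a minimal sketch, not the extraction logic used by any province: the function name `is_sard_code` is hypothetical, the range M33–M34 is assumed to mean every subcode of both categories, and codes are assumed to be stored with a decimal point.

```python
import re

# ICD-10-CA codes named in the text. Treating each as a prefix (so that
# finer subcodes also match) is an assumption of this sketch.
ICD10_PREFIXES = ("M32.1", "M32.8", "M32.9", "M35.0", "M35.8", "M35.9", "M36.0")
# Assumed reading of the range "M33-M34": every subcode of both categories.
ICD10_CATEGORIES = ("M33", "M34")


def is_sard_code(code: str, version: int) -> bool:
    """Return True if `code` belongs to the SARD code set for ICD-9 or ICD-10-CA."""
    normalized = code.strip().upper()
    if version == 9:
        # Matches the 3-digit code "710" as well as 710.x subcodes.
        return re.fullmatch(r"710(\.\d+)?", normalized) is not None
    if version == 10:
        if normalized[:3] in ICD10_CATEGORIES:
            return True
        return normalized.startswith(ICD10_PREFIXES)
    raise ValueError(f"unsupported ICD version: {version}")
```

Accepting the bare 3-digit code matters because some provinces record only 3-digit billing codes, which is also why the 4 conditions are analyzed as a single SARD category.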
Case definitions
We used 3 definitions to establish probable cases of SARD: (1) 1 hospitalization with a SARD diagnostic code; (2) at least 2 physician visits for any SARD code, at least 2 months apart, but within a 2-year span; (3) 1 SARD billing code provided by a rheumatologist. Individuals could be detected by 1 or more, but not necessarily all 3, of these definitions. Given that some provinces (e.g., Saskatchewan, Manitoba, Ontario) only record 3-digit billing codes, it was necessary to group SARD into 1 category rather than exploring individual conditions (such as SLE, which is represented by the 4-digit ICD-9 code of 710.0). The prevalence estimates included all cases identified for which there was provincial health insurance coverage as of December 31 of the last year of the study period: 1990–2004 for Nova Scotia, 1994–2007 for Quebec, 1993–2006 for Ontario, 1989–2009 for Manitoba, 1998–2007 for Saskatchewan, 1993–2007 for Alberta, and 1988–2007 for British Columbia. The calendar year period varied for each province because of differences in data access availability; we used in each case the maximum number of years available, and consider the potential effects of this in our discussion.
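The 3 case definitions can be sketched in code as follows. This is an illustrative reading under stated assumptions, not the provincial extraction programs: the `Encounter` record layout is hypothetical, every encounter passed in is assumed to already carry a SARD code, and "2 months" and "2 years" are approximated as 60 and 730 days.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass(frozen=True)
class Encounter:
    """One SARD-coded encounter for a single person (hypothetical layout)."""
    day: date
    kind: str                    # "hospital" or "physician"
    rheumatologist: bool = False


MIN_GAP_DAYS = 60     # assumed reading of "at least 2 months apart"
MAX_SPAN_DAYS = 730   # assumed reading of "within a 2-year span"


def case_definitions(encounters):
    """Report which of the 3 case definitions a person's encounters satisfy."""
    hospital = any(e.kind == "hospital" for e in encounters)
    physician_days = sorted(e.day for e in encounters if e.kind == "physician")
    # Definition 2: some pair of visits far enough apart, yet close enough together.
    two_code = any(
        MIN_GAP_DAYS <= (later - earlier).days <= MAX_SPAN_DAYS
        for earlier, later in combinations(physician_days, 2)
    )
    rheumatology = any(
        e.kind == "physician" and e.rheumatologist for e in encounters
    )
    return {"hospital": hospital, "two_code": two_code, "rheumatology": rheumatology}
```

Returning all 3 flags, rather than a single yes/no, keeps the pattern of positive and negative results for each person, which is the input the latent class model requires.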
Prevalence estimates
The numerator for the prevalence estimates included all cases identified (that is, anyone who met at least 1 of the 3 definitions) who had provincial health insurance coverage as of December 31 of the last year of the study period. Persons who had died before the end of the observation interval were thus excluded. The denominator was the provincial population in the same year, obtained from Statistics Canada. Because each province captures demographic information (year of birth, sex, and postal code residence) for all healthcare beneficiaries, we were able to provide estimates stratified by age, sex, and region. Stratification by 2 age groups (all ages < 45 and ≥ 45) was required because some provinces provided cell-counts according to definitions, instead of raw data.
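The arithmetic behind a stratum-specific prevalence can be sketched as below. The function name is hypothetical, and any counts passed to it in an example would be illustrative rather than taken from the provincial data.

```python
def prevalence_per_1000(cases: float, population: int) -> float:
    """Prevalence as cases per 1000 residents for one stratum.

    `cases` may be fractional, because a model-based numerator is a sum of
    probabilities rather than a head count.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000.0 * cases / population
```

For instance, a hypothetical stratum with 3000 cases among 1,000,000 insured residents corresponds to 3 cases per 1000.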
Statistical analyses
We used a previously developed Bayesian hierarchical latent class regression model, which does not assume the existence of a gold standard8, to adjust for the imperfect sensitivity and specificity of each case definition. Latent class methods consider different “tests” (case definitions) and the results for each subject, that is, whether they test positive or negative for each of the tests. Whether a case is a true case is not directly observed, but the probability that a given subject is a true case can be estimated based on the combined results for the multiple tests. Then, comparisons of the results of 1 test versus another can allow us to estimate which test has higher or lower sensitivity and specificity. The sensitivity and specificity estimates produced from this model are relative to the true disease status of subjects, which is not known and is thus a “latent” variable. This allows simultaneous estimation of disease prevalence, as well as the sensitivity and specificity of each case definition9. In the absence of a gold standard, multiple case definitions each provide some information about the case status of subjects. This allows the disease status for each subject to be estimated probabilistically, and the sum of these probabilities provides the number of estimated cases.
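The probabilistic step described above can be sketched with Bayes' theorem. This is a simplified illustration, not the fitted model: it assumes the case definitions are conditionally independent given true disease status (the model used here additionally allows correlation between the 2 billing-based definitions), it treats prevalence, sensitivity, and specificity as known values rather than estimating them, and both function names are hypothetical.

```python
from math import prod


def posterior_case_probability(results, prevalence, sensitivities, specificities):
    """P(true case | pattern of results), assuming conditional independence.

    `results` holds one True/False value per case definition.
    """
    if not (len(results) == len(sensitivities) == len(specificities)):
        raise ValueError("need one sensitivity and one specificity per definition")
    # Likelihood of the observed pattern if the person is a true case.
    like_case = prod(se if r else 1.0 - se for r, se in zip(results, sensitivities))
    # Likelihood of the same pattern if the person is not a case.
    like_noncase = prod(1.0 - sp if r else sp for r, sp in zip(results, specificities))
    numerator = prevalence * like_case
    return numerator / (numerator + (1.0 - prevalence) * like_noncase)


def expected_cases(pattern_counts, prevalence, sensitivities, specificities):
    """Sum of per-person case probabilities over all observed result patterns.

    `pattern_counts` maps a tuple of True/False results to a number of people.
    """
    return sum(
        count * posterior_case_probability(pattern, prevalence, sensitivities, specificities)
        for pattern, count in pattern_counts.items()
    )
```

In a full latent class analysis the prevalence, sensitivities, and specificities are unknowns estimated jointly from the pattern counts; the sketch shows only how a given set of values translates into an estimated number of cases.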
Bayesian methods10 use probability distributions to reflect uncertainty about variables in a model. One begins with a “prior distribution” which may be “uninformative” (where the results will thus be “informed” mainly by the data) or “informative” (which indicates that there is some knowledge outside of data concerning the likely values for a variable of interest that will be combined with the information in the data). Because 2 of our case definitions were based on a similar source (physician billing claims), our model also had to consider possible between-test correlation (the case definitions being considered as a “diagnostic test”). We handled this with a covariance term, as described11,12.
In the absence of a gold standard, there may be more variables to estimate than degrees of freedom. Therefore prior information is required on a subset of variables9. To estimate the variables of interest (disease prevalence and sensitivity/specificity of each case definition) we used informative prior distributions for some of the specificity values of the case definitions11. In our previous evaluations of rheumatic disease prevalence using administrative databases13,14, all case definitions had very high specificity, generally greater than 98%. So, for our primary analyses we set informative beta (α = 248.3, β = 1.65) prior distributions for the specificities of the 2 billing data case definitions. This prior corresponds to specificities of 99% (95% credible interval 98, 100). Noninformative prior distributions were used for all other variables. For example, we used a uniform density for the prevalence of a SARD [density for prevalence is uniform on (0,1)].
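The correspondence between the beta parameters and the stated specificity range can be checked numerically. The sketch below is an illustration only: it uses simulation with Python's standard library instead of an exact quantile function, and the function name, number of draws, and seed are arbitrary choices.

```python
import random


def beta_prior_summary(alpha, beta, draws=100_000, seed=0):
    """Return (mean, 2.5th percentile, 97.5th percentile) of a beta distribution.

    The mean is exact; the percentiles are Monte Carlo approximations.
    """
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    lower = samples[int(0.025 * draws)]
    upper = samples[int(0.975 * draws)]
    mean = alpha / (alpha + beta)
    return mean, lower, upper


if __name__ == "__main__":
    mean, lower, upper = beta_prior_summary(248.3, 1.65)
    print(f"mean={mean:.4f}, 95% interval=({lower:.4f}, {upper:.4f})")
```

The exact mean is α / (α + β) = 248.3 / 249.95, or about 0.993, and the simulated interval should fall close to the 98% to 100% range quoted above.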
Various factors, particularly age and sex, affect disease frequency, and these variables (as well as rural vs urban residence) may also affect the sensitivity of case definitions. For example, residents of urban areas likely have better access to rheumatology care than rural residents (who must often travel considerable distances to obtain rheumatology care); thus, rheumatology billing claims data may be more sensitive to detect SARD cases in urban areas. Our hierarchical model accounted for these differences11,15,16. Levels of the hierarchical model included (1) population sampling variability (assigned a binomial distribution) and misclassification error, adjusting for false-negative and false-positive case assignment according to estimated sensitivity and specificity; (2) demographic-related differences in disease prevalence (age, sex, and urban-vs-rural residence), input as a logistic regression model on the binomial probabilities from the first level of our model; (3) differences in the sensitivity of case definitions, according to the same demographics, input as a distinct variable for the sensitivity of each case definition. We used postal-code data to define urban-versus-rural residence (urban areas defined by Census Metropolitan Area classifications, residence codes in Saskatchewan)17. WinBUGS version 1.4.3 (MRC Biostatistics Unit) was used for all analyses.
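The first 2 levels of such a hierarchy can be sketched as follows. This shows the general structure only, not the WinBUGS model that was fit: the function names and the covariate coding (indicator variables per stratum) are assumptions, any coefficient values supplied would be hypothetical, and the third level is represented simply by passing a stratum-specific sensitivity.

```python
from math import exp


def inverse_logit(x: float) -> float:
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + exp(-x))


def stratum_prevalence(intercept, coefficients, covariates):
    """Level 2: logistic regression of true prevalence on stratum indicators.

    `coefficients` and `covariates` are dicts sharing keys such as "age_45_plus",
    "female", and "urban" (hypothetical names).
    """
    linear = intercept + sum(
        coefficients[name] * covariates[name] for name in coefficients
    )
    return inverse_logit(linear)


def probability_definition_positive(prevalence, sensitivity, specificity):
    """Level 1: chance that a resident meets 1 case definition.

    Combines true positives (prevalence x sensitivity) with false positives
    among non-cases; this is the binomial probability for the observed counts.
    """
    return prevalence * sensitivity + (1.0 - prevalence) * (1.0 - specificity)
```

Because the observed proportion mixes true and false positives, a definition with 99% specificity applied to a condition with prevalence below 1% can yield about as many false positives as true cases, which is why the informative priors on specificity carry so much weight.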
We also did sensitivity analyses in which different prior distributions for prevalence and sensitivity were used. The estimates from these sensitivity analyses were essentially unchanged from our primary analysis, so only the results from the primary analyses were reported.
RESULTS
Table 1 provides SARD prevalence estimates from the Bayesian latent class hierarchical models categorized by age, sex, rural/urban status, and province. The total prevalence (Table 1C) in each province ranged from 2 to 5 cases per 1000 residents, with marginally higher prevalence rates in British Columbia and Ontario compared to the other provinces. Female to male ratios were similar across provinces. In all provinces, the highest prevalence was seen among women aged ≥ 45 years. There were trends for higher prevalence in urban-versus-rural settings, which was especially evident in British Columbia.
Sensitivity estimates for each of the 3 case definitions (rheumatology billing, 2-code physician billing, and hospital diagnosis) within each province are shown in Figure 1. As a general trend, sensitivity estimates for the case definition based on at least 2 SARD billing codes tended to be higher than sensitivity estimates for the rheumatology billing code definition or for hospitalization diagnoses. Rheumatology billing code definition sensitivity estimates tended to be higher for the younger groups and the urban groups. In general, hospitalization data were less sensitive for case detection, across provinces and demographics, compared to billing data.
DISCUSSION
In our earlier preliminary results from 3 provinces, we estimated the overall prevalence of SARD to be about 2–3 cases per 1000 residents. Stratified prevalence estimates across provinces suggested greater prevalence in females-versus-males, and in persons of older age. The prevalence in older females approached or exceeded 1 in 100. Adjusting for demographics, there was a greater prevalence in urban-versus-rural settings.
Our current results suggest an overall SARD prevalence of 2–5 cases per 1000 Canadians, which is consistent with our earlier, more preliminary analyses, based on 3 provinces. This is also consistent with the summed prevalence of North American estimates specifically for SLE, SSc, pSS, and inflammatory myopathies (polymyositis/dermatomyositis)8,18,19. Again, our prevalence estimates suggested similar demographic trends across provinces (i.e., higher prevalence in females vs males and with older age). In older women, the prevalence approached or exceeded 1 in 100 (Table 1), likely because of pSS, which has been shown to affect up to 1% of older women1. The consistency of findings between our analyses attests to the strength of the methods used. Moreover, this is the most comprehensive, inclusive Canadian estimate of SARD prevalence to date.
There were some trends for higher prevalence in urban-versus-rural settings, especially evident in British Columbia, although these trends need to be interpreted with caution given the possibility of residual confounding by slight differences in age distributions (because we only adjusted broadly, for age < 45 and age ≥ 45). Another limitation is our definition of urban-versus-rural areas, which provides a very broad definition of the concept of rurality20 and does not necessarily account for variations, among different rural locations, in access to medical care. British Columbia and Ontario showed trends for higher overall prevalence estimates. Again, residual confounding by slight differences in age distributions across provinces is possible, although as mentioned above, our estimates do account for age group (< 45 or ≥ 45 yrs).
The result from British Columbia could reflect that province’s position as the “Asia-Pacific gateway” (Asian immigrants and offspring being an important part of urban British Columbia), as well as the considerable proportion (5%) of British Columbia First Nations/Metis residents (who are found in both urban and rural locations). On the other hand, the First Nations/Metis population is also fairly high in Alberta (6%), Saskatchewan (15%), and Manitoba (15%)21. The Asian community in Canada is highly concentrated in urban areas of Ontario and British Columbia, and Asian Canadians are more likely than the total Canadian population to be between the ages of 15 and 45 years22. Both of these race/ethnic groups are at increased risk for SLE and possibly other SARD, likely owing to genetic factors23,24,25. Unfortunately, the administrative data used do not contain data by race/ethnicity, so exploration of this on a province-by-province basis is not possible.
Stratification by 2 age groups (< 45 and ≥ 45 yrs) was required because some provinces provided grouped data according to definitions, instead of individual-level data. This means that residual differences in age structure must be considered, particularly because SARD prevalence increases with age. However, the percentage of seniors is in fact very similar across provinces (about 14–16% in 2010), aside from a trend in recent years for a lower figure in Alberta (11%), because of a positive net interprovincial migration as of 2009 of younger people26.
Because of differences in data-access availability, the number of years of data varied somewhat across provinces, although in most cases it was similar (i.e., about 14 years in most provinces). Recently, members of our group looked at the effects of increasing the number of years of observation, to determine effects on case ascertainment, and prevalence estimates, of SLE27. Those analyses suggested that periods of observation time less than 10 years will give falsely low prevalence estimates, but that beyond 10 years there is only a small gain in prevalence. This appears to be illustrated in our case, because the SARD prevalence estimates for 2 provinces with very similar race/ethnicity and age distributions, Saskatchewan and Manitoba, were comparable, despite the fact that the data period for Saskatchewan was 10 years, and for Manitoba, almost twice that.
Also interesting are trends for greater sensitivity of rheumatology billing code data in certain provinces, particularly British Columbia, and to an extent other provinces such as Saskatchewan. This contrasts with the relatively lower sensitivity of rheumatology billing code data in other provinces, such as Alberta. On one hand this may suggest fewer numbers of, and/or poorer access to, rheumatologists in 1 province versus another. However, based on Canadian Medical Association statistics for the distribution of rheumatologists across Canada28, the per-capita number of rheumatologists does not seem to correlate well with our sensitivity estimates for per-province rheumatology billing codes. For example, although the per-capita number of rheumatologists is slightly higher in British Columbia (1.09 per 100,000 residents) than Alberta (1.00 per 100,000 residents), Saskatchewan (which appeared to have fairly high sensitivity for rheumatology billing code data in SARD case detection) has the lowest per-capita number of rheumatologists in the nation (0.49 per 100,000 residents). This observation is limited because statistics for the number of rheumatologists in Canada may not reflect active practice: they do not account for the number of part-time versus full-time rheumatologists or for academic versus clinical practice.
Although most physicians in Canada participate in fee-for-service care, alternative payment plans do exist, particularly for specialists (including rheumatologists) in certain provinces, although the extent varies across jurisdictions. Unfortunately, provinces have not followed consistent approaches to reporting services provided under alternative payment programs (e.g., shadow billing). As of 2003, about 11.7% of physicians’ payments occurred through alternative plans in Canada. That figure varies considerably across the provinces studied: 4.5% in Alberta, 6.6% in Ontario, 10.7% in British Columbia, 17.8% in Quebec, 15.4% in Manitoba, 21.9% in Saskatchewan, and 29.1% in Nova Scotia. Shadow billing is used to varying degrees in Quebec, Nova Scotia, and Saskatchewan. Ontario and the western provinces also use shadow billing, although as of 2005, none of those provinces had standard policies. While the effects of this would be to lower the prevalence estimates to different degrees in different provinces, it is difficult to estimate the magnitude of the effects29.
There are obvious limitations in using administrative databases. Most provinces, except Alberta and Nova Scotia, allow only 1 physician diagnostic code for each visit and therefore sensitivity for ascertainment of chronic diseases may be affected by comorbidities taking precedence. We are also limited because not all patients with SARD will have had a physician claim for a SARD during our study period, and such patients would not be counted as cases. This might result in the underascertainment primarily of milder cases. In addition to missing some true cases, the misclassification of some cases is also inevitable if sources such as medical records or classification criteria are considered the true gold standard. In an article determining the accuracy of SARD diagnoses from administrative data it was found that the majority of identified cases do have some type of SARD, although there is some misclassification between categories (SLE vs SSc, for example)30.
We note that model-based approaches to case definition from administrative data have been used recently by others to produce prevalence estimates for hypertension31 and comorbidities in multiple sclerosis32. These results, along with our own, have concrete applications for those who want to improve outcomes in these conditions. For instance, individuals with any of the conditions that comprise SARD require followup care from a select number of specialists (including rheumatologists). Therefore, knowledge about disease prevalence on a provincial level would be helpful to determine whether the resources in each province (such as physician-to-population ratios33) are adequate in dealing with the disease prevalence. While we did not explore differences in disease prevalence by health regions, this information is available for all jurisdictions.
Our results possibly suggest differences across provinces in how patients with SARD obtain medical attention. For example, in Nova Scotia, patients with SARD seem much less likely to be identified from hospitalization (vs physician billing) data. This phenomenon may be partially explained by the triage system for referrals at the province’s academic rheumatology center; SARD are considered a priority in terms of ambulatory visit wait times. This system may create better access to rheumatology care in Nova Scotia as compared to other provinces and this may optimize care and prevent hospitalizations. In addition, Nova Scotia allows for the recording of more than 1 diagnostic code per visit (although Alberta does, also). On the other hand, barriers to hospital admission for patients with SARD in Nova Scotia could result in a similar outcome.
Recently our group used administrative data from Alberta to study potential differences in SARD according to First Nation status34,35. This showed some suggestion of increased SLE and scleroderma in certain First Nation demographic groups; these trends were not noted for patients with inflammatory myopathy. The results may reflect true variations in SARD prevalence in these demographic groups, or other factors, such as systematic differences in healthcare delivery, which could lead to case ascertainment biases36.
Administrative databases have potential as vital resources for decision makers, including public healthcare officials, because they allow chronic disease surveillance. Our results suggest that such surveillance of some rheumatic diseases may indeed be feasible and useful, especially when using multiple data sources (e.g., billing and hospitalization data) adjusted for error. Using these methods, the prevalence of SARD is estimated to be 2–5 cases per 1000 Canadians overall, and prevalence approaches or exceeds 1% in older women.
Acknowledgment
The authors are indebted for the provision of data to Manitoba Health and Healthy Living, the 2 provincial departments in Manitoba responsible for healthcare services and healthy living initiatives. The results and conclusions are those of the authors, and no official endorsement by Manitoba Health and Healthy Living is intended or should be inferred.
Accepted for publication December 19, 2013.