Abstract
The Cochrane Musculoskeletal Group (CMSG), one of 53 groups of the not-for-profit, international Cochrane Collaboration, prepares, maintains, and disseminates systematic reviews of treatments for musculoskeletal diseases. It is important that authors conducting CMSG reviews and the readers of our reviews be aware of and use updated, state-of-the-art systematic review methodology. One hundred sixty reviews have been published. Previous method guidelines for systematic reviews of interventions in the musculoskeletal field published in 2006 have been substantially updated to incorporate methodological advances that are mandatory or highly desirable in Cochrane reviews and knowledge translation advances. The methodological advances include new guidance on searching, new risk-of-bias assessment, grading the quality of the evidence, the new Summary of Findings table, and comparative effectiveness using network metaanalysis. Method guidelines specific to musculoskeletal disorders are provided by CMSG editors for various aspects of undertaking a systematic review. These method guidelines will help improve the quality of reporting and ensure high standards of conduct as well as consistency across CMSG reviews.
The aim of the Cochrane Collaboration is to help healthcare providers, patients, patient advocates and carers, and policy makers arrive at well-informed decisions on healthcare treatments by preparing, maintaining, and disseminating methodologically strong systematic reviews1.
The systematic review is an essential tool for managing the vast amount of information generated on the etiology, incidence/prevalence, diagnosis, prognosis, and treatment of disease. These method guidelines focus on the assessment of treatments (including both benefits and harms) that are aimed at improving health. Cochrane systematic reviews are increasingly used as the basis for clinical decision support resources such as UpToDate and clinical guidelines [e.g., American College of Rheumatology (ACR) osteoarthritis (OA)2 and rheumatoid arthritis (RA)3 guidelines]. Compared to a narrative literature review, the systematic review uses “scientific strategies that are systematic, and designed to limit bias in the assembly, critical appraisal and synthesis of all relevant studies on a specific topic”4. Most Cochrane reviews are quantitative and where appropriate include metaanalysis, but they can now also include complementary qualitative data to address context and syntheses of how interventions work.
The Cochrane Musculoskeletal Group (CMSG), one of the largest of the 53 international review groups in the Cochrane Collaboration, synthesizes the results of existing studies of appropriate “fit-for-purpose” designs to determine the benefit and harm of interventions for the prevention, treatment, and management of musculoskeletal diseases including various forms of arthritis, soft tissue conditions, and osteoporosis. Conditions specific to the back and musculoskeletal injuries are addressed by the Cochrane Back Group (http://back.cochrane.org) and the Cochrane Bone, Joint, and Muscle Trauma Group (http://bjmtg.cochrane.org), respectively.
As of Issue 2, 2013, there are 160 completed CMSG reviews and 88 protocols in the Cochrane Database of Systematic Reviews, which has its own impact factor of 6.471.
The rigorous, systematic approach used by Cochrane reviews aims to provide a definitive statement on the effects of healthcare treatments. This is useful when there are no other systematic reviews but may also help clarify confusion arising from single studies or systematic reviews with conflicting results.
The 2011 updated Cochrane Handbook for Systematic Reviews of Interventions5 reflects major methods advances; this has now been complemented by a set of essential and desirable methodological standards for Cochrane systematic reviews endorsed at the 2011 Madrid Cochrane Colloquium6. The methods have changed substantively since our original methods guidance paper published in 20067.
Although the updated Cochrane Handbook is comprehensive and easily accessible, its size is daunting to new authors and a number of review groups have found it helpful to tailor a short summary for their clinical area8,9,10,11,12. This tailored guidance is in alignment with the Cochrane Handbook and should improve consistency among authors and thus facilitate comparison across reviews.
Many clinicians and their patients find it difficult to master the key features of systematic reviews and communicate the key issues to each other and to others such as family members. This paper will help the reader master the key features. The companion article will provide a primer on options for translating results from CMSG reviews into “usable” and “useful” formats with the advent of summary of findings tables, podcasts, videos, decision aids, phone apps, and cloud technology13.
These guidelines, prepared by the CMSG editors (who have a combination of clinical, knowledge translation, methods and statistics expertise), are intended to complement the handbook and not to substitute for the handbook. We will use specific examples from musculoskeletal reviews to illustrate recommendations from the handbook.
Defining the question
First, the research question needs to be clearly formulated using the “PICOS” framework, i.e., a clinically relevant or policy-relevant question that takes into account the patient/population, intervention, comparison, outcomes, and study design, and includes both the benefit and harm of the intervention being studied.
Priority topics for new and updated reviews have been identified by CMSG editors and consumers based on criteria including the burden of disease, equity, identification of new interventions, number of new studies, and frequency of access for existing reviews.
Literature search and study selection
The complete search strategy for each database searched is defined a priori and is documented in the review appendices with the date so that the search can be duplicated14,15. The search strategy frequently needs tailoring to the topic, so it is reviewed before implementation by the CMSG Trial Search Coordinator and peer reviewed by information science specialists using the PRESS checklist (Peer Review of Electronic Search Strategies)16.
It is recommended that, at a minimum, the following databases and trial registers be searched: MEDLINE, EMBASE, the Cochrane Central Register of Controlled Trials (CENTRAL), ClinicalTrials.gov, and the World Health Organization International Clinical Trials Registry Platform portal. In addition, we recommend checking the references in identified relevant systematic reviews and individual studies that meet the review’s inclusion criteria. For systematic reviews of drugs, authors should search for adverse effects on the Websites of regulatory authorities such as the US Food and Drug Administration (FDA) and the European Medicines Agency (EMA, formerly EMEA)17. For example, the FDA Website contains important trial and observational study data on tuberculosis and fungal infections from the use of biologics, which were included in the Cochrane and BMJ reviews on biologics overview and network metaanalysis18,19. The Trial Search Coordinator may suggest additional sources, such as conference abstracts from ACR and the European League Against Rheumatism (EULAR), depending on the topic.
We do not recommend excluding trials in languages other than English20. Some topics such as studies of the effects of medicinal plants may have a significant number of trials published in another language and the CMSG can assist with translation when necessary.
Two people should independently screen the titles and abstracts from the results of the searches for the selection of trials meeting the predefined inclusion criteria. The full text of those articles that appear to meet the inclusion criteria should then be obtained and assessed for eligibility. Those full-text studies that do not meet the inclusion criteria should be added to the Table of Excluded Studies and a reason provided for their exclusion.
INCLUSION CRITERIA
The minimum criteria for trial inclusion in the systematic review should be defined in advance and address several items using the PICOS framework.
Population
Participants of trials should be defined by acceptable diagnostic criteria where possible, such as the ACR criteria for OA, RA, gout, systemic lupus erythematosus, and fibromyalgia (FM)21,22,23,24,25,26. Specific exclusions, such as age, sex, and condition, must be detailed.
Trials may report “mixed populations” in which patients with different conditions are enrolled. For example, for reviews on knee OA, randomized controlled trials (RCT) included patients with both hip and knee OA27. If such situations are anticipated, review authors should define in advance how to handle these reports. Rather than excluding, a rule may be chosen to include those trials, requiring that at least a given percentage, such as 75%, of patients meet the inclusion criteria. It is also desirable to contact trial authors, to obtain data for the population of interest.
Intervention
Glasziou, et al have pointed out the importance of having sufficient information in papers to be able to apply the intervention to patients28. The intervention must be explicitly described. If applicable, the route of administration, dose, timing, duration of treatment, and concomitant treatments should be outlined.
An example of a definition of type of intervention could read like this29: Trials were included that investigated treatment with adalimumab 40 mg subcutaneously every week to every other week, alone or in combination with disease-modifying antirheumatic drugs (DMARD) for a minimum of 12 months.
Comparator
The comparator intervention should be explicitly defined (e.g., placebo, another treatment).
An example of a definition of type of comparator30: Studies were included comparing leflunomide treatment (as monotherapy or in combination with other DMARD) at a dose of 20 to 25 mg/day (with or without a loading daily dose of 100 mg given in the first 1 to 3 days) to placebo or other DMARD.
Outcomes
Cochrane reviews are only as useful as their outcomes are relevant and accurate. Cochrane reviews now report results by outcome. They should include all outcomes that are likely to be meaningful, and not include trivial outcomes. At the time of the title registration, the authors should list all patient-important outcomes, including both benefits and harms relevant to the intervention, organized from the most important to the least important. The major outcomes (up to a maximum of 7) to be presented in the “summary of findings” (SoF) table are selected at the protocol stage.
Review authors must choose at the protocol stage what they consider the main timepoint of interest for each outcome. This does not imply that they should extract only 1 timepoint; to the contrary, analyzing and depicting results over time is very informative. However, defining the timing in advance forces the review author to think about short-term and longterm effects and to consider whether both are relevant for their intervention in question. It helps to plan statistical analyses and to define the focus of the review.
The SoF table (Figure 1), which may be created using GRADEProfiler software31, is now shown on the first page of every Cochrane review, along with the matching abstract and plain language summary. Although review authors may complete an SoF table for each major comparison in their review, to best convey the main “evidence-based actionable message” to users, the Cochrane Library format places the single most important SoF table on the first page of the review. This means that the accuracy and consistency of the numbers and wording across the SoF table, plain language summary, and abstract are pivotal.
The CMSG is in the process of developing default outcome templates for SoF tables for classes of interventions for each condition. Standardizing the outcomes presented in these tables will improve consistency for readers and also permit easier production of overviews of reviews using network metaanalyses. The CMSG editorial team has drafted preliminary default guidelines for which outcomes should be included in SoF tables for pharmacologic and complementary interventions in the following conditions: RA, OA, FM, and ankylosing spondylitis. The preliminary default templates for pharmacologic and complementary interventions for RA and OA are shown in Table 1. These may need tailoring for specific interventions or for specific research questions and may require a different set of major outcomes. For example, in a review on a biologic for RA32, radiographic progression would be an important outcome of interest; in a review of arthroplasty, the imaging outcomes would be very different; while in a review on the effect of patient education programs for RA33, imaging changes may not be of key interest.
Common core sets established and validated by groups such as OMERACT (Outcome Measures in Rheumatology — an international initiative to improve outcome measurement in rheumatology) and their associated groups are encouraged. This has yet to be done for regional musculoskeletal disorders such as shoulder and elbow disorders, where currently a set of standardized measures does not exist; until then we favor a description of the most relevant ones from the patient’s perspective. The CMSG has a joint working group with OMERACT and their partners to develop standardized outcomes by both condition and intervention for use in CMSG SoF tables.
The CMSG accepts surrogate outcomes if other outcomes are not available and if they meet the following conditions: (1) they have been shown to be on the causal pathway between the disease and target patient-important outcome34,35,36. A high correlation would be indicative of this37; (2) the change in the surrogate largely identifies the intervention’s effect on the patient-important outcome34,35,36. For example, it is tempting to make inferences about the anti-fracture efficacy of pharmacotherapies on the basis of their effects on bone mineral density (BMD). However, there are many limitations associated with using BMD for this purpose. Studies using efficacy estimates from metaanalyses of RCT of antiresorptive therapies to explore the relationship between BMD and fractures using logistic regression analysis have demonstrated that the increases in BMD do not adequately explain the reduction in fracture risk38,39. It is important that the limitations of surrogate outcomes be clearly outlined in the review, and these are reflected in the subsequent Grading of Recommendation Assessment Development and Evaluation (GRADE) evidence profile.
Information on harms as well as benefits must be included. The Cochrane Adverse Effects Methods Group has developed guidelines for evaluating adverse effects17. The minimum recommendation is to collect the adverse events/effects reported in the trials. Adverse events are usually generic. If they do not differ across clinical indications, authors are encouraged to pool these from the other indications, when possible18. Where rare or delayed serious toxicity is a major concern, it is appropriate to do a more comprehensive review of adverse effects by including data from observational studies and high-quality registries. This is a daunting task when there are numerous observational studies, so at a minimum, information from regulatory authority Websites should be searched and a focused, selective review carried out. One example is the review by Rostom, et al,40 in which unpublished RCT safety data from the FDA Website were included in the metaanalyses.
STUDY DESIGN
Review authors should consider what study designs are likely to provide valid data to answer their questions. The study designs included will depend upon the question, the context, and the resources of the systematic review team. Reviews should define selection criteria for study designs according to their “fitness for purpose” for the research question being posed, rather than just follow an evidence hierarchy5. The rationale for the fitness for purpose should be clearly stated and explained.
RCT, where 2 or more groups are formed by randomly allocating participants so that any differences between groups can be attributed to the intervention, should always be included. Controlled clinical trials (CCT) are trials where allocation to treatment and control groups is quasi-random, for example, alternation, date of birth, or case record number. In some treatment settings, such as educational interventions, it is not possible to randomize individuals because of the risk of 1 group receiving some or all of the intervention of the other group (i.e., contamination) as a result of being in the same setting or place; cluster RCT overcome this contamination by randomizing the different individual practices to different groups41. Crossover RCT, in which each patient is allocated to a sequence of treatment and control interventions, can also be included, but their analyses need special attention5.
Other study designs can also be included, whenever possible, according to their “fitness for purpose”. In 2009, 6% of Cochrane reviews included nonrandomized study designs42 and it has become a priority to develop the skills and best-practice methods to ensure that this component of systematic reviews is useful. The Cochrane Non-Randomised Studies Methods Group has developed guidelines for nonrandomized studies to standardize searching and assessment of rare and delayed adverse effects that will not be detected in short-term trials43. A series of 6 papers has been published that provides an update on the increasing consensus on how these studies should be assessed and synthesized. The last article in the series provides useful checklists to help authors44.
An example of study design45: To assess benefits and harms we included RCT. To further assess harms, we included the following types of studies as long as they reported at least 1 year of followup for patients taking anti-tumor necrosis factor agents, had a sample size > 100 patients46, and reported an a priori-selected adverse effect outcome: CCT, cohort studies (prospective, e.g., longterm extension of RCT, or retrospective), case-control studies, case series, and published registry data.
Assessment of risk of bias
In RCT, an included study may be performed to the highest possible standards, yet some individual patient-important outcomes may be underpowered and/or still have an important risk of bias; the 2011 handbook therefore also recommends assessing risk of bias by outcome5.
Risk of bias is assessed in a 2-step process (Figure 2, Figure 3).
Step 1: The risk-of-bias tool addresses 7 different domains: (1) sequence generation; (2) allocation concealment; (3) blinding of participants and personnel; (4) blinding of outcome assessment; (5) incomplete outcome data; (6) selective outcome reporting; and (7) “other sources of bias”. Review authors need to specify in their protocol which issues they will consider for “other sources of bias”. Other potential sources of bias should address issues that may affect the internal validity of the study. The handbook provides further details on these issues such as significant baseline imbalances between groups, or situations where a cointervention is not administered evenly between groups. Each domain includes 1 or more specific entries in a risk-of-bias table. Within each entry, the first part of the tool involves assigning a judgment of low risk, high risk, or unclear risk, and the second part involves providing an explanation of the judgment. A summary table of review authors’ judgments for each risk-of-bias item for each study is shown in Figure 2 for the “Abatacept for rheumatoid arthritis” review. Independent assessment of risk of bias should be undertaken by at least 2 review authors. Where differences in assessment cannot be resolved, arbitration by a third person is warranted.
Step 2: The handbook suggests summarizing risk of bias for each important outcome within and across studies using 3 categories: low, unclear, and high risk of bias. Different outcomes may be at different risk of bias, both within a single trial and across studies, given that different studies may contribute to each outcome. Figure 3 shows a plot of the distribution of review authors’ judgments across studies for each risk-of-bias item in the “Abatacept for rheumatoid arthritis” review.
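As an illustration of the 2-step process, the sketch below records a judgment and an explanation for each of the 7 domains and then tabulates the distribution of judgments across studies, in the spirit of the plot in Figure 3. The study labels, judgments, and explanations are invented for the example, and the data layout is an assumption rather than the format used by RevMan.

```python
from collections import Counter

# The 7 domains named in Step 1.
DOMAINS = [
    "sequence generation",
    "allocation concealment",
    "blinding of participants and personnel",
    "blinding of outcome assessment",
    "incomplete outcome data",
    "selective outcome reporting",
    "other sources of bias",
]
JUDGMENTS = ("low", "unclear", "high")


def summarize_across_studies(assessments):
    """Return, per domain, the percentage of studies at each judgment.

    `assessments` maps a study label to {domain: (judgment, explanation)}.
    """
    summary = {}
    for domain in DOMAINS:
        counts = Counter(study[domain][0] for study in assessments.values())
        total = sum(counts.values())
        summary[domain] = {j: 100.0 * counts[j] / total for j in JUDGMENTS}
    return summary


# Hypothetical judgments for two invented studies (illustration only).
example = {
    "Study A": {d: ("low", "described adequately") for d in DOMAINS},
    "Study B": {d: ("unclear", "not reported") for d in DOMAINS},
}
example["Study B"]["incomplete outcome data"] = ("high", "large differential dropout")

summary = summarize_across_studies(example)
print(summary["incomplete outcome data"])
```

Keeping the explanation next to each judgment mirrors the two parts of each risk-of-bias entry and makes disagreements between the 2 independent assessors easier to trace.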
In nonrandomized (observational) studies, assessment of risk of bias is more difficult than assessment in an RCT. Risk-of-bias assessment methods for systematic reviews of nonrandomized studies are under development for the Cochrane Collaboration. Meanwhile, 6 existing useful tools have been identified3,48. One tool is the Newcastle-Ottawa Scale49 (www.ohri.ca/programs/clinical_epidemiology/oxford.htm), which assesses cohort and case-control studies and takes 5–10 min to complete. The second, by Downs and Black50, is a longer tool taking about 10–20 min to complete. CMSG authors should also consider the methodological checklists developed by the Scottish Intercollegiate Guidelines Network (www.sign.ac.uk/methodology/checklists.html).
DATA COLLECTION AND ANALYSIS
The CMSG recommends that at least 2 review authors independently extract data from included studies. Data collection forms should be used on all CMSG reviews and it is recommended that they be piloted on a sample of studies. It is important that key characteristics and contextual factors of each study be identified for entry into the “Table of Included Studies.” At the editorial office we have developed data collection forms that can be modified for new reviews.
Priority is given to extracting the information on up to 7 outcomes predefined for inclusion in the SoF table that is presented on the first page of the Cochrane review; however, the review authors should be alert to the possibility of important, unexpected findings, particularly serious adverse effects that may need to be recorded.
It is usually desirable to collect summary data separately for each intervention group and to enter these into RevMan (Review Manager, the software used for preparing and maintaining Cochrane Reviews), where effect estimates can be calculated. Examples are frequency summary data upon which effect estimates such as risk ratio (RR), OR, and risk difference (RD) can be calculated, or mean and SD for continuous data upon which effect estimates such as mean difference (MD) and standardized mean difference (SMD) can be calculated. Chapter 7 in The Cochrane Handbook describes how data should be extracted and converted when necessary to obtain an effect estimate.
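To make the link between frequency summary data and effect estimates concrete, the following minimal sketch computes RR, OR, and RD for a single trial from the number of events and participants in each group. The counts are hypothetical, and the function is an illustration of the standard definitions, not a reproduction of RevMan's calculations (which also handle weighting, CI, and zero cells).

```python
def dichotomous_effects(events_trt, n_trt, events_ctl, n_ctl):
    """Return RR, OR, and RD from the 2x2 frequency summary of one trial."""
    risk_trt = events_trt / n_trt
    risk_ctl = events_ctl / n_ctl
    odds_trt = events_trt / (n_trt - events_trt)
    odds_ctl = events_ctl / (n_ctl - events_ctl)
    return {
        "RR": risk_trt / risk_ctl,   # risk ratio
        "OR": odds_trt / odds_ctl,   # odds ratio
        "RD": risk_trt - risk_ctl,   # risk difference
    }


# Hypothetical trial: 30/100 events with treatment, 20/100 with control.
effects = dichotomous_effects(30, 100, 20, 100)
print(effects)
```

This sketch raises a ZeroDivisionError when the control group has no events or when every participant in a group has the event; such trials need the special handling described in the handbook.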
Cluster randomized trials are often incorrectly analyzed41. When including cluster randomized trials in a review, we recommend that a CMSG statistical editor be consulted.
Metaanalysis
Because we want to provide the best numerical estimate of the probability of each patient-important outcome, metaanalysis should be undertaken when data are sufficiently clinically homogeneous. Straightforward statistical analyses should be performed using RevMan, if data are available and sufficiently similar. The timing of outcome measures should be provided for the most clinically relevant time frame. It may be appropriate to provide summary estimates for short, medium, or long term, depending on the intervention. For example, in the CMSG network metaanalysis of biologics for RA18, the methods section defined the following timing of outcomes: short (≤ 6 mo), intermediate (> 6 to 12 mo), or long duration (> 1 yr). Estimates based on these different timings were presented in the SoF tables.
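A prespecified timing rule such as the one in this example can be written down unambiguously at the protocol stage. The sketch below encodes the 3 categories quoted above; the function name is hypothetical, and other reviews may justify different cutoffs.

```python
def classify_followup(months):
    """Classify followup duration using the cutoffs quoted in the text."""
    if months <= 6:
        return "short"          # <= 6 mo
    if months <= 12:
        return "intermediate"   # > 6 to 12 mo
    return "long"               # > 1 yr


for months in (3, 6, 9, 12, 24):
    print(months, classify_followup(months))
```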
Effect measures
The “effect sizes” for dichotomous outcomes may be expressed as RR, OR, Peto OR, or RD. Although absolute differences in important outcomes are the preferred way of presenting the magnitude of the benefit or harm of an intervention to patients and their clinicians, an absolute measure such as the RD is very vulnerable to heterogeneous baseline rates; a relative effect size measure is more stable, so relative measures are preferred for the statistical estimation. The CMSG recommends that RR be used to express dichotomous outcomes because it is easier to understand51. When events are rare, the Peto OR is recommended52. There is no consensus on the definition of “rare”; a working rule of an event rate of < 10% can be used, but special care is needed when studies within the metaanalysis do not have any events.
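For readers unfamiliar with the Peto OR, the sketch below shows the usual one-step calculation, in which each trial contributes its observed minus expected events (O - E) and a hypergeometric variance (V), and the pooled log OR is the sum of O - E divided by the sum of V. The trial counts are hypothetical, and this is a simplified illustration rather than the RevMan implementation.

```python
import math


def peto_odds_ratio(tables, z=1.96):
    """Pooled Peto OR and 95% CI.

    `tables` is a list of (events_trt, n_trt, events_ctl, n_ctl) tuples.
    """
    sum_o_minus_e = 0.0
    sum_v = 0.0
    for events_trt, n_trt, events_ctl, n_ctl in tables:
        n = n_trt + n_ctl
        total_events = events_trt + events_ctl
        total_nonevents = n - total_events
        expected = n_trt * total_events / n
        variance = (n_trt * n_ctl * total_events * total_nonevents) / (n ** 2 * (n - 1))
        sum_o_minus_e += events_trt - expected
        sum_v += variance
    log_or = sum_o_minus_e / sum_v
    se = 1.0 / math.sqrt(sum_v)
    return math.exp(log_or), (math.exp(log_or - z * se), math.exp(log_or + z * se))


# Two hypothetical trials with rare events.
pooled_or, ci = peto_odds_ratio([(1, 100, 3, 100), (0, 150, 2, 150)])
print(round(pooled_or, 3), ci)
```

A trial with no events in either group has O - E = 0 and V = 0, so it contributes nothing to the pooled estimate; if no trial has any event, the sum of V is zero and the estimate is undefined.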
For continuous outcomes, relative differences are again used, such as MD between the postintervention values, or the difference between baseline values and postintervention values, of the intervention and control groups. SMD should be used when results for continuous outcomes measuring the same concept are presented on different scales; for example, visual analog scale (VAS) and Likert pain scales. One important caveat associated with the use of SMD values in metaanalyses is that few clinicians, patients, journalists, or policy makers understand how to interpret them; we recommend transforming them back to a well-known scale for the SoF table, abstract, and plain language summary (e.g., VAS pain). For examples of this conversion, see Bliddal and Christensen53.
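The sketch below illustrates one simple way to compute an SMD and to re-express it on a familiar scale: the SMD is multiplied by an SD taken from a representative study that used that scale. The means, SD, and the reference SD of 25 mm on a 100-mm VAS are hypothetical values chosen for the example, and the calculation omits the small-sample correction that some software applies.

```python
import math


def standardized_mean_difference(mean_trt, sd_trt, n_trt, mean_ctl, sd_ctl, n_ctl):
    """SMD using the pooled SD of the two groups (no small-sample correction)."""
    pooled_var = ((n_trt - 1) * sd_trt ** 2 + (n_ctl - 1) * sd_ctl ** 2) / (n_trt + n_ctl - 2)
    return (mean_trt - mean_ctl) / math.sqrt(pooled_var)


def back_transform(smd, reference_sd):
    """Re-express an SMD in the units of a familiar scale."""
    return smd * reference_sd


# Hypothetical pain scores: treatment 40 (SD 20, n = 50), control 50 (SD 20, n = 50).
smd = standardized_mean_difference(40, 20, 50, 50, 20, 50)

# Assumed SD of 25 mm on a 100-mm VAS from a representative trial.
vas_difference_mm = back_transform(smd, 25)
print(smd, vas_difference_mm)
```

The choice of reference SD drives the back-transformed value, so its source should be reported alongside the result.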
Although relative difference metrics are used in the RevMan statistic calculations, patients and their clinicians also need to be provided with absolute differences in the patient-important benefits and harms as listed in the SoF table. The frequency of events without treatment (i.e., the baseline prevalence) makes a marked difference. For example, a relative 50% success in achieving a patient-important reduction in severe pain in a group of 100 patients, 90 of whom report severe pain without the treatment of interest, gives an absolute patient-important reduction in severe pain in 45 patients out of 100 patients with pain. This is substantively different from the same relative 50% success only providing a patient-important reduction in severe pain in 5 of a group of 100 patients when only 10 report severe pain without the treatment of interest. Another way to interpret an SMD value is to convert it into a number needed to treat (NNT) through a transformation to an OR54,55. This OR can then be combined with an assumed control group risk to obtain an absolute benefit as in NNT56. To do this for continuous or categorical data, the review authors need to estimate a reasonable control event rate — the percentage of patients who would be expected to respond to placebo/sham therapy.
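The arithmetic in this paragraph can be checked with a short sketch. The first function reproduces the 45 versus 5 patients example. The second and third show one widely used route from an SMD to an NNT, assuming the conversion ln(OR) = (π/√3) × SMD; the text does not state which transformation its references use, so this choice, the SMD of 0.5, and the control event rate of 30% are assumptions made for illustration.

```python
import math


def patients_helped(n_with_outcome, relative_success):
    """Absolute number of patients who benefit, given the baseline count."""
    return n_with_outcome * relative_success


def smd_to_odds_ratio(smd):
    """Convert an SMD to an OR, assuming ln(OR) = pi / sqrt(3) * SMD."""
    return math.exp(math.pi / math.sqrt(3) * smd)


def nnt_from_odds_ratio(odds_ratio, control_event_rate):
    """NNT implied by an OR at an assumed control event (response) rate."""
    cer = control_event_rate
    eer = odds_ratio * cer / (1 - cer + odds_ratio * cer)  # expected rate with treatment
    return 1 / abs(eer - cer)


# Example from the text: a relative 50% success at two baseline frequencies.
print(patients_helped(90, 0.5), patients_helped(10, 0.5))

# Hypothetical SMD of 0.5 in favor of treatment, 30% response with placebo.
odds_ratio = smd_to_odds_ratio(0.5)
nnt = nnt_from_odds_ratio(odds_ratio, 0.30)
print(round(odds_ratio, 2), math.ceil(nnt))
```

By convention the NNT is rounded up to the next whole patient. The function is undefined when the OR equals 1, because the absolute difference is then zero.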
Subgroup analyses
Subgroups are frequently of clinical or policy importance, e.g., to determine the effects of dosage or disease severity on the response to treatment. The problem is that subgroup analyses may show spurious differences, i.e., differences that arise by chance alone. However, if the review authors have clear objectives that justify a subgroup analysis in advance (to confirm clinically sound hypotheses), the CMSG endorses such prespecified analyses. As few subgroups as possible should be prespecified, and these should be justified against the criteria proposed by Sun, et al57 (Table 2).
Diversity/heterogeneity of effect sizes across available studies
Following the terminology of the handbook, the terms “heterogeneity” or “diversity” may be used to describe variability among studies included in a systematic review. Clinical diversity (or heterogeneity) is the most important — that is, the variability in participants, interventions (e.g., dose), context, comparator (including differences in “usual care”), and outcomes (both surrogate and clinical). Variability in study design is termed “methodological diversity (or heterogeneity)”. “Statistical heterogeneity” (or conventionally just heterogeneity) is the term used when the variation in intervention effects between studies is greater than that expected by chance. “Inconsistency” is the term used for quantifying the effect of heterogeneity on the metaanalysis.
This issue is characterized by the expression “one cannot combine apples and oranges”. It is important to take an initial look at the results for both clinical diversity and methodological diversity. Clinical diversity is assessed by checking that the patients, interventions, and comparators are not too different from each other such that combining them is clinically useless. Methodological diversity means checking that the studies are similar in terms of study design and risk of bias. Once satisfied that the studies are minimally diverse and that it makes sense to combine them in a metaanalysis, an assessment of the statistical heterogeneity must be undertaken by examining the forest plot and result of the I² statistic and the τ² statistic, described below.
A forest plot provides a visual sense of heterogeneity because one can easily see whether the different point estimates of the effect size of each trial all show either a benefit or harm. RevMan calculates I² and τ² statistics, used to indicate the presence of statistical heterogeneity. The τ² statistic provides an estimate of the between-study variance. The I² statistic describes the percentage of total variation across studies due to heterogeneity, and it does not inherently depend on the number of studies in the metaanalysis58, although the size of trials included in the metaanalysis should be taken into consideration for proper interpretation59. In Figure 4, at 12 months the effect size has an I² of 0%, which is consistent with a τ² of 0.
If the effects observed across trials are inconsistent and vary to a large extent (say, I² > 50%), it is important that the review authors explore the results again and try to assess whether the differences can be explained by some clinical or methodological heterogeneity60. Inconsistency that cannot be explained (i.e., reduced) by prespecified stratified analyses lowers the confidence that can be placed in the overall estimate from the metaanalysis. In this case, a random-effects metaanalysis is a suitable, more conservative alternative to the fixed-effect approach: the between-study variance is taken into account, and the uncertainty of the effect estimate is reflected in a wider CI.
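The relationship between the heterogeneity statistics and the fixed-effect and random-effects estimates can be sketched as follows. The example uses inverse-variance weights and the DerSimonian and Laird method-of-moments estimator of the between-study variance, which is one common choice and is assumed here for illustration; it may not match the software output in every setting. The effect estimates (on the log scale) and their variances are hypothetical.

```python
import math


def meta_analysis(effects, variances):
    """Fixed-effect and random-effects pooling with Q, I-squared, and tau-squared."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w

    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # % of variation due to heterogeneity

    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    random = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)

    return {
        "fixed": fixed,
        "se_fixed": math.sqrt(1.0 / sum_w),
        "random": random,
        "se_random": math.sqrt(1.0 / sum(w_re)),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }


# Three hypothetical trials with inconsistent effects.
inconsistent = meta_analysis([0.1, 0.5, 0.9], [0.04, 0.04, 0.04])
print(round(inconsistent["I2"], 1), round(inconsistent["tau2"], 3))

# Three hypothetical trials with identical effects.
consistent = meta_analysis([0.3, 0.3, 0.3], [0.04, 0.01, 0.02])
print(consistent["I2"], consistent["tau2"])
```

In the inconsistent example the random-effects SE is larger than the fixed-effect SE, which is the wider CI referred to above; in the consistent example both I² and τ² are 0 and the two models coincide.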
Sensitivity analyses
Sensitivity analyses should be performed to examine the robustness of the results to risk of bias and the influence of other variables. Authors should prespecify in their protocol which key domains of the risk-of-bias criteria will be used to perform a sensitivity analysis, by outcome. For example, for each major outcome, those studies contributing data to that outcome that are judged at low risk of bias for the domains of allocation concealment, blinding of patients and outcome assessors, and incomplete outcome data may be compared with all studies to check the robustness of the result when all studies are included versus a restricted set of studies with a stronger methodological design. Effectiveness/pragmatic studies may need additional sensitivity analyses of considerations such as different populations, differences in interventions, or patient adherence61.
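A sensitivity analysis of this kind amounts to pooling the same outcome twice: once with all studies and once with only those at low risk of bias in the prespecified domains. The sketch below uses a simple inverse-variance fixed-effect pool on the log RR scale; the studies, their estimates, and their judgments are hypothetical, and blinding is collapsed into a single domain for brevity.

```python
# Domains prespecified in the (hypothetical) protocol for this outcome.
KEY_DOMAINS = ("allocation concealment", "blinding", "incomplete outcome data")


def pool_fixed(studies):
    """Inverse-variance fixed-effect pooled log RR."""
    weights = [1.0 / s["variance"] for s in studies]
    return sum(w * s["log_rr"] for w, s in zip(weights, studies)) / sum(weights)


def low_risk_only(studies, domains=KEY_DOMAINS):
    """Keep studies judged at low risk of bias in every key domain."""
    return [s for s in studies if all(s["risk_of_bias"][d] == "low" for d in domains)]


all_low = {d: "low" for d in KEY_DOMAINS}
studies = [
    {"name": "Study A", "log_rr": -0.2, "variance": 0.01, "risk_of_bias": dict(all_low)},
    {"name": "Study B", "log_rr": -0.3, "variance": 0.02, "risk_of_bias": dict(all_low)},
    {"name": "Study C", "log_rr": -0.8, "variance": 0.02,
     "risk_of_bias": {**all_low, "allocation concealment": "high"}},
]

pooled_all = pool_fixed(studies)
pooled_low_risk = pool_fixed(low_risk_only(studies))
print(round(pooled_all, 3), round(pooled_low_risk, 3))
```

Here the restricted estimate is closer to the null than the estimate from all studies, the pattern that would prompt caution about the result from the full set.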
Forest plots
Using RevMan, the results of individual studies should be presented graphically in forest plots (Figure 4). The overall effect size is shown as a diamond (individual studies as a square), and the horizontal points of the diamond (horizontal line in an individual study) illustrate the 95% CI. The treatment effect is determined by the location of the square in relation to the vertical middle line that indicates the null hypothesis (no effect); an effect size is considered not statistically significant when the CI crosses the vertical middle line. When appropriate, data from more than 1 trial may be combined in a metaanalysis, and the diamond at the bottom of the graph provides an estimate of effect of these pooled data.
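The reading rule for a forest plot can be expressed numerically. The sketch below computes a 95% CI for a ratio measure from its log estimate and SE, then checks whether the CI crosses the line of no effect, which is 1 for ratio measures (RR, OR) and 0 for difference measures (RD, MD, SMD). The estimates and SE are hypothetical.

```python
import math


def ratio_ci(log_estimate, se, z=1.96):
    """95% CI for a ratio measure, computed on the log scale."""
    return math.exp(log_estimate - z * se), math.exp(log_estimate + z * se)


def crosses_null(lower, upper, null_value=1.0):
    """True when the CI includes the line of no effect."""
    return lower <= null_value <= upper


# Same hypothetical RR of 0.8, estimated precisely and imprecisely.
precise = ratio_ci(math.log(0.8), 0.05)
imprecise = ratio_ci(math.log(0.8), 0.20)
print(precise, crosses_null(*precise))
print(imprecise, crosses_null(*imprecise))
```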
Grading of the evidence
In an effort to make it easier for the end user to understand the quality of the evidence or the “degree of confidence” in the reported results included in the review, we recommend that a rating or grade of the evidence for each major outcome be provided in each review. The GRADE approach31,62,63 now replaces the simplified grading system that was derived by the editors of Evidence-based Rheumatology14.
The GRADE approach specifies 4 levels of quality: high, moderate, low, and very low to quantify the “degree of confidence” in the reported results per outcome (Table 3). Note that this requires a decision for all the studies included in an SoF table (and hence is distinct from the assessment of the risk of bias or methodological strength of the individual studies). The highest quality rating is for a body of evidence based on data from randomized trials without important limitations and the lowest quality rating is for a body of evidence based on case series/case reports.
The quality of the body of evidence involves consideration of 5 factors (Table 4) that allow authors of systematic reviews to make a transparent judgment in how they downgrade the quality rating. With observational studies or downgraded randomized trials, 3 factors permit upgrading to moderate or even high quality (Table 4).
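The logic of downgrading and upgrading can be sketched as a simple tally. The example assumes the usual GRADE starting points (high for randomized trials, low for observational studies) and uses the factor names from the standard GRADE description, because Table 4 is not reproduced here. In practice each factor is a judgment made by the review authors, not a mechanical calculation.

```python
LEVELS = ("very low", "low", "moderate", "high")

# Factor names follow the standard GRADE description (assumed, see lead-in).
DOWNGRADE_FACTORS = {
    "risk of bias", "inconsistency", "indirectness", "imprecision", "publication bias",
}
UPGRADE_FACTORS = {"large effect", "dose-response gradient", "plausible confounding"}


def grade_quality(design, downgrades=None, upgrades=None):
    """Return the quality level for one outcome.

    `downgrades` and `upgrades` map a factor name to the number of levels (1 or 2).
    """
    downgrades = downgrades or {}
    upgrades = upgrades or {}
    unknown = (set(downgrades) - DOWNGRADE_FACTORS) | (set(upgrades) - UPGRADE_FACTORS)
    if unknown:
        raise ValueError(f"unknown factor(s): {sorted(unknown)}")
    start = {"randomized": 3, "observational": 1}[design]
    score = start - sum(downgrades.values()) + sum(upgrades.values())
    return LEVELS[max(0, min(len(LEVELS) - 1, score))]


print(grade_quality("randomized"))
print(grade_quality("randomized", {"risk of bias": 1, "imprecision": 1}))
print(grade_quality("observational", upgrades={"large effect": 1}))
```

Recording which factor caused each change, rather than only the final level, is what makes the judgment transparent to readers of the SoF table.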
A detailed description of the factors that reduce or increase the quality of the evidence is provided in Chapter 12 of the Cochrane Handbook31.
Systematic reviews need to be conducted according to high methodological standards. Designed to accompany the detailed Cochrane Handbook for Systematic Reviews of Interventions5, this report provides guidelines tailored to authors undertaking a review within the Cochrane Musculoskeletal Group scope. These guidelines are consistent with the November 2011 Methodological Standards for the Conduct of Cochrane Intervention Reviews6. These guidelines on developing and performing a systematic review will help improve the quality of reporting and promote high standards of conduct as well as consistency across CMSG reviews.
Acknowledgment
CMSG editors who contributed to the preparation of this manuscript: Isabelle Boutron, Angela Busch, Ernest Choy, Robin Christensen, Rob de Bie, Rhian Goodfellow, Tracey Howe, Anne Lyddiatt, Mário Lenza, Philippe Ravaud, Raphaèle Seror, Beverley Shea, Maria Suarez-Almazor, and Karine Toupin-April.
Footnotes
- Supported through a grant from the Canadian Institutes of Health Research.
- Accepted for publication August 7, 2013.