Abstract
Objective. The goal of the Outcome Measures in Rheumatology (OMERACT) 12 (2014) equity working group was to determine whether and how comprehensibility of patient-reported outcome measures (PROM) should be assessed, to ensure suitability for people with low literacy and differing cultures.
Methods. The English, Dutch, French, and Turkish Health Assessment Questionnaires and English and French Osteoarthritis Knee and Hip Quality of Life questionnaires were evaluated by applying 3 readability formulas: Flesch Reading Ease, Flesch-Kincaid grade level, and Simple Measure of Gobbledygook; and a new tool, the Evaluative Linguistic Framework for Questionnaires, developed to assess text quality of questionnaires. We also considered a study assessing cross-cultural adaptation with/without back-translation and/or expert committee. The results of this preconference work were presented to the equity working group participants to gain their perspectives on the importance of comprehensibility and cross-cultural adaptation for PROM.
Results. Thirty-one OMERACT delegates attended the equity session. Twenty-six participants agreed that PROM should be assessed for comprehensibility and for use of suitable methods (4 abstained, 1 no). Twenty-two participants agreed that cultural equivalency of PROM should be assessed and suitable methods used (7 abstained, 2 no). Special interest group participants identified challenges with cross-cultural adaptation including resources required, and suggested patient involvement for improving translation and adaptation.
Conclusion. Future work will include consensus exercises on what methods are required to ensure PROM are appropriate for people with low literacy and different cultures.
Inequities in health refer to differences in health outcomes that are avoidable and unfair1. Patients’ low literacy and the lack of adequate cross-cultural adaptation of patient-reported outcome measures (PROM) used in clinical trials can contribute to inequities.
At the Outcome Measures in Rheumatology (OMERACT) 12 meeting (2014), the goal of the equity special interest group (SIG) session was to determine whether and how PROM should be assessed for readability and comprehensibility for patients with different literacy levels and cultures.
The new OMERACT Filter 2.0 outlines the importance of contextual factors as variables in trials2. Both the patient’s ability to understand the questions being asked and the appropriateness for the patient’s culture are contextual factors that need consideration. Filter 2.0 includes a checklist for developing core outcome measurement sets. The equity SIG’s longterm goal is to add checklist items that remind developers to consider equity, in terms of the readability of PROM and the potential need for cross-cultural adaptation.
Background
A distinction needs to be made between literacy and health literacy3; this article confines itself to the former. There is no internationally agreed-upon definition of literacy, but most definitions include skills for reading, writing, and numeracy4. Globally, a reported 84% of the population aged 15 years and older can read, write, and understand simple statements5. However, this varies by country, region, and population group. Low literacy is associated with lack of health knowledge and preventive behaviors, increased hospitalizations, and poorer self-management of chronic disease6,7.
Literacy is also an important equity issue for clinical trials. Despite the attention of PROM developers, many PROM include high-level language and complex sentences, which may make understanding and completing them difficult for people with low literacy skills8. Because of poor understanding, people with low literacy skills may be less likely to be recruited into trials9. This presents ethical and equity issues, because all intended users of a health intervention have the right to participate in research10, and their exclusion may lead to a risk of selection bias. In addition, when they are included, they may answer questions less accurately or less completely.
Comprehensibility remains an important consideration when a questionnaire is adapted for use in another culture. Cross-cultural adaptation is a related but separate equity issue that is equally essential. Equivalence between the original and adapted versions of a questionnaire refers to the extent to which an instrument is interpreted similarly in 2 or more cultures11. While translation has been a focus for PROM developers, cultural adaptation has received less attention12. In addition to translation, an adaptation process is required to ensure that items remain equivalent in content when applied in different cultural contexts. This ensures that a construct is measured the same way across cultures, supports fidelity of the culturally adapted tool, and allows for valid comparisons of trial results across countries.
A recent literature review identified 31 guidelines for cross-cultural adaptation, with no consensus on the best method13. A summary of those guidelines is provided in Table 1. Most guidelines recommend that questionnaires be translated, back-translated, and then reviewed by a committee to ensure equivalence to the original; however, empirical data are lacking for these recommendations. The role of the committee is to ensure that each item is functionally equivalent in the different setting, and that the translation will be understandable and elicit the same answers.
Failing to consider the role of readability, comprehensibility, and cultural differences in the development/implementation of PROM may lead to measurement errors. This can affect our ability to accurately evaluate the effect of interventions across all populations with rheumatic diseases, including disadvantaged groups, and may contribute to inequities. Special attention must be given to the equity aspect of PROM, particularly as clinical trials often use these instruments as primary outcome measures.
MATERIALS AND METHODS
Literature review
In preparation for the OMERACT 12 meeting, we searched MEDLINE to identify methods for measuring comprehensibility and cultural adaptation of questionnaires and instruments used in health research. A sensitive search strategy was designed to retrieve systematic reviews describing methods of measuring or validating these 2 properties. The electronic search strategy (see Appendix 1) was developed for OVID MEDLINE (1946–December 11, 2013) by a librarian (TR) and refined after expert review of a selection of sample citations retrieved by the OMERACT equity group. Experts in the field of comprehensibility of written materials and cultural adaptation were also consulted to identify relevant papers.
Comprehensibility and overall quality of questionnaires
Two questionnaires available in at least 2 languages, the Health Assessment Questionnaire (HAQ) and the Osteoarthritis Knee and Hip Quality of Life (OAKHQOL), were considered. The HAQ, developed in English, is widely used and has been adapted into over 60 languages14,15. The HAQ was assessed in English, Dutch, French, Spanish, and Turkish; and the OAKHQOL in English and French. The OAKHQOL is a recently developed disease-specific questionnaire16.
Two methods for assessing the comprehensibility of written text were applied: (1) Readability formulas: We used 3 standard readability formulas: the Flesch Reading Ease (FRE), the Flesch-Kincaid grade level (FK), and the Simple Measure of Gobbledygook (SMOG)17,18,19. These assess text complexity using sentence length and syllables per word (Table 2). To apply the formulas in this study, when a question stem had multiple responses, the text was modified to create complete sentences to allow assessment. (2) Evaluative Linguistic Framework for Questionnaires (ELF-Q): This tool, developed from the ELF20,21, is based upon systemic functional linguistics and provides a more meaningful assessment of the likely comprehensibility of written materials and how they can be improved22. The ELF-Q assesses characteristics related to the overall organizational or generic structure of the text, metadiscourse, headings, rhetorical elements, relationship between writer and reader, technicality of vocabulary, lexical density, context appropriateness, and format. A judgment can then be made about overall text quality.
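The 3 readability formulas named above can be illustrated with a minimal sketch. It uses the commonly published constants for the FRE, FK, and SMOG; Table 2 is not reproduced here, so whether the study used exactly these variants is an assumption. The function names `count_syllables` and `readability` are hypothetical, and the syllable counter is a rough English-only heuristic, not a validated tool.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Rough English syllable count: vowel runs, minus a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Assumption: a final 'e' is silent unless the word ends in 'le' or 'ee'.
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Return FRE, FK, and SMOG scores for a block of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = [count_syllables(w) for w in words]
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(syllables) / len(words)
    # SMOG counts words of 3 or more syllables, scaled to a 30-sentence sample.
    polysyllables = sum(1 for s in syllables if s >= 3)

    return {
        "FRE": 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word,
        "FK": 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59,
        "SMOG": 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291,
    }
```

A call such as `readability("Are you able to dress yourself? Are you able to walk outdoors on flat ground?")` scores items that have already been rewritten as complete sentences, as described above. Because every input to these formulas is a count of sentences, words, or syllables, they cannot reflect text structure, vocabulary familiarity, or cultural context.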
Cultural adaptation
We presented the results of a study, initiated at OMERACT 10 (2010), that compared various methods of cross-cultural adaptation, with/without back-translation and with/without an expert committee, in an experimental design23. The expert committees each had 6 people: translator, linguist, clinician, health education theory expert, patient, and methodologist.
SIG session
Participants were introduced to the concept of readability and comprehensibility. The limitations of standard readability formulas were acknowledged and the ELF-Q was described. Results of the readability assessments and cross-cultural adaptation study were presented and discussed. The participants were asked to discuss the potential for these concepts to be included in the Filter 2.0 checklist for PROM developers.
RESULTS
Literature review
Our MEDLINE search identified 177 review articles. After duplicates were removed, 166 were screened for relevance by title and abstract. We obtained 19 for full-text review and 5 were included in the literature review. Of these, 4 reviews assessed patient questionnaires using readability formulas. Two studies used the Flesch-Kincaid grade level test and 2 studies compared a range of readability formulas including Windows-based software, Reading Calculations, FORCAST, Flesch Reading Ease, and Gunning FOG formulas22,24,25,26. However, no other frameworks to assess the readability and overall quality of the questionnaires were identified. In addition, we identified 94 articles that discussed approaches and methods for cross-cultural adaptation, and 31 different guidelines. No best practices for cross-cultural adaptation of surveys and questionnaires were identified, although some methods were used regularly in the literature27,28,29,30,31.
Comprehensibility and overall quality of questionnaires (Table 3)
The reading levels of all HAQ versions were above the recommended Grade 6 level using the FK and SMOG tests32. The FRE assessed the English HAQ as “fairly easy,” but the French, Spanish, and Dutch versions were “difficult.” The French OAKHQOL was rated as marginally easier than the English version according to the FK and SMOG, but both were “standard” using the FRE.
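The FRE labels quoted above ("fairly easy," "standard," "difficult") correspond to score bands. The sketch below maps a score to a label using the commonly cited Flesch cutoffs; the source does not state which cutoffs it applied, so the band boundaries, the `FRE_BANDS` table, and the `fre_label` function are assumptions for illustration only.

```python
# Lower bound of each band (inclusive) and its label, highest band first.
# Assumption: commonly cited Flesch Reading Ease cutoffs.
FRE_BANDS = [
    (90.0, "very easy"),
    (80.0, "easy"),
    (70.0, "fairly easy"),
    (60.0, "standard"),
    (50.0, "fairly difficult"),
    (30.0, "difficult"),
]


def fre_label(score: float) -> str:
    """Return the descriptive label for a Flesch Reading Ease score."""
    for lower_bound, label in FRE_BANDS:
        if score >= lower_bound:
            return label
    return "very difficult"
```

Under these assumed cutoffs, a questionnaire scoring between 70 and 80 would be labeled "fairly easy" and one scoring between 30 and 50 "difficult." Higher FRE scores indicate easier text, whereas higher FK and SMOG values indicate a higher required school grade.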
All versions of the HAQ and OAKHQOL were considered acceptable according to the ELF-Q; however, minor changes could be made to ensure optimal questionnaire comprehensibility. All versions used vocabulary that was considered difficult or rare according to lists of most commonly used words in the different languages33,34,35,36. Improvements could include explicitly stating the purpose of the questionnaire and simplifying the word choices. In terms of context appropriateness, items in the 2 questionnaires were considered generalizable to respondents in all social strata and social/national groups in society, and they appeared clear and unambiguous. The response options were also clear and unambiguous.
Cultural adaptation
The study results indicated that, among 4074 patients and 15 bilingual people, back-translation had a moderate effect, but expert committees were more effective in ensuring accurate content when adapting a questionnaire. The adaptations made with a back-translation step were not considered better or worse than the others, whereas the adaptations using an expert committee were considered to have better face and content validity. The effects of back-translation and expert committees on other psychometric properties were not significant23.
SIG session
Thirty-one OMERACT 12 delegates from Australia, Europe, and North America attended the equity SIG, including 6 patient research partners.
Participants discussed the challenges of ensuring target patients are considered when assessing PROM for text comprehensibility and cross-cultural adaptation. Distinguishing comprehensibility of PROM and cross-cultural adaptation of PROM as separate but related concepts was considered important. Comprehensibility is an issue for PROM intended for use within 1 culture, but is also an issue for cross-cultural adaptation of PROM for use across different cultures. SIG participants agreed that back-translation does not guarantee accuracy. Challenges discussed included the resources required to complete translation and cross-cultural adaptation successfully, which may be barriers to using the committee approach. Patient involvement in the cross-cultural adaptation of questionnaires was discussed as a way of improving the process.
Twenty-six participants agreed that PROM should be assessed for comprehensibility and that suitable methods should be used (4 abstained, 1 no). Twenty-two participants agreed that the cultural equivalency of PROM in different cultures should be assessed and that suitable methods should be applied (7 abstained, 2 no).
DISCUSSION
For the first time, both comprehensibility and cross-cultural adaptation of PROM have been considered together at OMERACT and this was found to be a fruitful initiative.
The 2 concepts presented in this article, comprehensibility and cultural appropriateness of PROM, are important considerations to ensure equity in trials. Despite their wide use, readability formulas, which only consider text complexity, take no account of important discourse features or nontextual dimensions such as context and cultural differences. They do not measure “top-down” factors involved in reading comprehension such as recognizing the structure and organization of a text, or “bottom-up” factors such as the density of information and appropriateness of the language. Thus, they cannot provide useful information on text comprehensibility and therefore, their utility as assessment tools for PROM is questionable. In contrast, the ELF-Q considers the overall structure and organization of a text, the clarity of function, the language and vocabulary used, as well as the content, layout, and cultural appropriateness. These considerations are well known among linguists for being important in determining a person’s ability to comprehend text; and patient information that has been generated using linguistic considerations included in the ELF has been found by patients to be clearer and more effective in communicating information compared to information that has not21. The ELF-Q could help PROM developers ensure that their instrument is understandable and suitable for lower literacy groups by identifying aspects that could be improved. The ELF-Q could also be used during cross-cultural adaptation to increase the quality of the adaptation.
The readability assessments of the HAQ and OAKHQOL in the other languages demonstrate that the differences between languages make it difficult to use a standard readability formula to compare different versions of the same questionnaire. The readability tests we used are intended to assess English text and may not provide accurate assessments of the non-English text complexity.
Despite the existence of many different guidelines addressing the process of cross-cultural adaptation of questionnaires, there are currently no definitive methods for establishing cultural equivalence other than the ones included in the guidelines (e.g., use of an expert committee and/or a focus group of patients).
Assessing the comprehensibility of questionnaires and culturally adapting them for the intended audience requires the development of separate methodologies. Although only 6 of the 31 participants were patient research partners and none were representative of patients with low literacy, overall, equity SIG participants agreed that the literacy skills of the target population and the comprehensibility and cross-cultural adaptation of PROM are important considerations for PROM developers.
The equity SIG’s longterm goal is to include comprehensibility and cross-cultural adaptation as items in the OMERACT checklist for developing core outcome measurement sets in the new Filter 2.0 handbook. This goal was considered premature for Filter 2.0 but should be considered as an option for developers of PROM in OMERACT core sets. Developers should be encouraged to think through the contextual factors of their setting and target audience, including literacy levels, populations at risk for disadvantage, and/or different cultures.
For the assessment of comprehensibility of questionnaires, we will conduct a consensus exercise on the methods required to ensure appropriateness of instruments for groups at risk for disadvantage, especially those with lower literacy levels.
Future work of the equity SIG will also include an investigation into cross-cultural adaptation methods. This will include a consensus exercise on what constitutes adequate cross-cultural adaptation for OMERACT Filter 2.0.
Acknowledgment
The authors thank the OMERACT 2014 delegates who attended the SIG and contributed to the discussion. We also recognize additional members of the equity special interest working group: Laurence Carton, Kanta Kumar, Richard Osborne, and Jordi Pardo.
APPENDIX 1.
Footnotes
As an OMERACT Fellow, J. Epstein received a bursary from the European League Against Rheumatism and support from OMERACT to attend the meeting; F. Guillemin was supported by La Mission Recherche de la Direction de la recherche, des études, de l’évaluation et des statistiques as part of financial support provided to Institut de Recherche en Santé Publique, in the domain of handicap and loss of autonomy; R. Buchbinder is supported in part by an Australian National Health and Medical Research Council Practitioner Fellowship; conference attendance for A. Lyddiatt was made possible by The Arthritis Society, Canada; J. Barton received support from the American College of Rheumatology, Rheumatology Research Foundation, to attend the meeting; C. Flurey is funded by an Arthritis Research UK Fellowship and received support from OMERACT to attend the meeting; C. Barnabe receives salary support from the Canadian Rheumatology Association and The Arthritis Society Clinician Investigator Award, and is a Canadian Institutes of Health Research New Investigator (Community-Based Primary Health Care); R. Christensen is supported by grants from the Oak Foundation; J.A. Singh is supported by grants from the Agency for Health Quality and Research Center for Education and Research on Therapeutics U19 HS021110, US National Institute of Arthritis, Musculoskeletal and Skin Diseases P50 AR060772 and U34 AR062891, National Institute on Aging U01 AG018947, National Cancer Institute U10 CA149950, the resources and the use of facilities at the VA Medical Center at Birmingham, Alabama, and research contract CE-1304-6631 from the Patient Centered Outcomes Research Institute.