Abstract
The quality-adjusted life-year (QALY) is a construct that integrates the value or preference for a health state over the period of time in that health state. The main use of QALY is in cost-utility analysis, to help make resource allocation decisions when faced with choices. Although the concept of the QALY is appealing, there is ongoing debate regarding its usefulness and the approaches to deriving it. In 2008, OMERACT engaged in an effort to agree on QALY approaches that can be used in rheumatology. Based on a Web questionnaire and a subsequent meeting, rheumatologists questioned whether it was relevant for OMERACT (1) to investigate use of a QALY that represents the patients’ perspective, (2) to explore the validity of the visual analog scale (VAS) to value health, and (3) to understand the validity of mapping health-specific instruments on existing preference instruments. This article discusses the pros and cons of these points in light of current insight from the point of view of health economics and decision-making theory. It also considers the further research agenda toward a QALY approach in rheumatology.
Over the past 20 years, the development of outcome instruments has taken a high profile in medicine, and their application has proven to be informative when evaluating the results of medical innovations, whether they are diagnostic, prognostic, or therapeutic. Classically, outcome instruments assess the impairments or limitations due to disease in different areas of health and are referred to as “health state” instruments. However, the impairments/limitations due to disease do not necessarily reflect the value or preference for the health state in a decision-making context. The distinction between a score of impairments/limitations and a value or preference for the health state is explained using clear examples in the accompanying report of the Outcome Assessment in Rheumatology Clinical Trials (OMERACT) Patient Perspective Virtual Campus1. As will be explained below, a utility is a value or preference assessed under specific theoretical and methodological conditions.
In order to assess the full consequence of decisions on prioritizing research and distributing resources in healthcare, the quality-adjusted life-year (QALY) was proposed as a method to integrate the value of a health state with the time lived in that state2.
This so-called conventional QALY is formulated as:

$$\mathrm{QALY} = \sum_{t=1}^{T} \sum_{s=1}^{S} p_{st}\, V(H_{st})$$

in which $p_{st}$ is the probability that a person will be in health state $H_s$ at time $t$; $V(H_{st})$ is the value (or preference) assigned to the health of a person being in health state $H_s$ at time $t$. $S$ is the number of discrete health states and $T$ the time horizon for decision-making3. Briefly, the QALY integrates value for health and the time in that particular health state. The main strength of the QALY is that $V(H_{st})$ is expressed, independent of the type of disease or intervention, on a scale in which zero represents death and 1 represents perfect health. This uniform scaling allows direct comparison of benefits across programs and interventions and therefore facilitates its use in cost-effectiveness analyses and decision-making.
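As a minimal sketch of how the conventional QALY defined above could be computed, the following example sums the probability-weighted values over health states and time periods. The three health states, their values, and the probabilities are hypothetical; each period is assumed to last one year and no discounting is applied.

```python
def expected_qalys(probabilities, values):
    """Sum p_st * V(H_st) over all periods t and health states s.

    probabilities[t][s]: probability of being in state s during period t.
    values[t][s]: value of state s in period t (0 = death, 1 = perfect health).
    Assumption: each period lasts one year and no discounting is applied.
    """
    total = 0.0
    for p_row, v_row in zip(probabilities, values):
        if abs(sum(p_row) - 1.0) > 1e-9:
            raise ValueError("state probabilities must sum to 1 in each period")
        total += sum(p * v for p, v in zip(p_row, v_row))
    return total


# Hypothetical states: well, moderate disease, dead (values are illustrative).
state_values = [1.0, 0.6, 0.0]

# Hypothetical state probabilities for a 3-year horizon (one row per year).
probabilities = [
    [0.7, 0.3, 0.0],
    [0.5, 0.4, 0.1],
    [0.4, 0.4, 0.2],
]

# The same state values are assumed to apply in every year.
values = [state_values] * len(probabilities)

qalys = expected_qalys(probabilities, values)
print(f"Expected QALYs over the horizon: {qalys:.2f}")
```

Because every state is valued on the same 0 to 1 scale, the same function can be applied to any disease or intervention, which is what makes the results comparable across programs.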
Although QALY were developed to facilitate resource allocation decisions at the societal level, they have also found their way into clinical medicine. At the societal level, the provision of healthcare is challenged by choices that aim to maximize the health of society. At the clinical level, choices on resource allocation are made in order to maximize the health of (groups of) patients or individual patients. In recent years, the use of QALY to combine the harms and benefits of therapeutic interventions for clinical decisions in groups of patients (e.g., guidelines) and individuals (e.g., decision aids) has received increasing attention. However, these approaches have different objectives and often need methodologically different techniques and perspectives4,5,6. Consequently, this article addresses only choices surrounding resource allocation at the group and not at the individual level.
The concepts, applications, and approaches to measuring QALY are heavily debated in the wider literature and are still evolving. In rheumatology, different approaches for measuring QALY have been found to provide very different results1,7,8,9, resulting in changes in incremental cost-utility ratios to the extent that some economic decisions would potentially be reversed10,11. Within the OMERACT Economics Group, the initiative was taken to move towards a consensus on a QALY approach in rheumatology, in order to ensure comparability across studies and disease, especially when performing cost-utility analyses12. In preparation for the first OMERACT Special Interest Group, a survey among eminent health scientists with expertise in QALY, as well as OMERACT members (mainly rheumatologists), was performed to explore the current practice and understanding of the preferred approach to calculating QALY and to help set research priorities. The survey revealed a sharp discrepancy between the opinion of the health scientists and rheumatology clinical researchers. Rheumatologists warned about the complexity of the instruments and favored approaches that are easy to apply and straightforward to interpret. They felt (1) that a QALY from the perspective of the patient might be valuable, specifically when dealing with choices within the same disease; (2) that the potential role of the visual analog scale (VAS) in calculating QALY should be better understood; and (3) that the validity of transformation of existing instruments (such as the Health Assessment Questionnaire) into a QALY should be better explored. Health scientists, in contrast, felt that health states should be valued preferably from the perspective of the society and that more research was needed on the relevance of the underlying theoretical concepts of the preference and utility approach12.
In this article, we report on the considerations and some additional research by the OMERACT Economics QALY subgroup for each of the research issues that were raised around OMERACT 9. At the end of the article, the main results of discussion at OMERACT 10 are summarized.
Before being able to address these issues, however, it is important to explain in more detail the concepts behind the QALY from the perspective of research in health economics and decision-making sciences; and to touch upon some new directions in QALY research that might inspire OMERACT QALY research.
General Economic Concept of Value
When dealing with choices on allocation of scarce resources, the “value or preferences” for different goods or services become important. There is theoretical and empirical support that the true value of something often becomes apparent when persons are confronted with choices between goods or services1. Therefore preference/value research combines elements from decision-making as well as economic theory13. The traditional approach to understanding preferences or values in the context of allocation of resources has been to use data collected on decisions made by individuals in the marketplace. Such data are commonly referred to as revealed preference data. However, revealed preferences for identifying the value of health gains from specific interventions or services are practically nonexistent in healthcare. Many aspects of healthcare are not traded explicitly in markets, have public-good characteristics, and due to universal or private health insurance are often free or heavily subsidized at the point of service14. Further, due to the unique combination of characteristics of many healthcare markets (the supplier induces the choice), it may not be possible to infer consumer preferences or infer value from the revealed preference data that are available.
A technique that can be used to overcome some of the limitations involved in using revealed preference data to assess the value of health gains is to use stated preference data. This involves using the results of what individuals say they would do rather than what they are observed doing. The most common method for assessing the value an individual assigns to a health gain is to confront the individual with a choice and assess what the individual would sacrifice in order to obtain it. It is common to use monetary units to measure this sacrifice in the form of willingness to pay (WTP) studies15. WTP assumes that the greater the quantity or quality of a particular good, the more of their own income individuals would be willing to sacrifice for it. In health, WTP has been used to value interventions or programs associated with health gains16. However, there are a number of concerns with valuing benefits in monetary terms. In many healthcare systems, people have little idea of how much healthcare costs, and rarely have such an understanding from previous experiences. In terms of measurement, individuals often appear to incorporate irrelevant factors into their answers. There are different question formats for measuring WTP17, but evidence suggests that WTP is insensitive to health outcomes. There are also ethical issues in the use of WTP, since responses depend on whether individuals are rich or poor, and moral issues concerning the attachment of a price on life18.
Consequently, health economists have searched for alternative aspects that individuals would sacrifice for health gains. The leading alternative to date is to sacrifice life, which is used in the valuation of preferences for the calculation of the QALY19.
Value or Preferences in Healthcare
Several approaches are available to assess stated value or preference for health based on the willingness to sacrifice life. Preferences are commonly described as if they were the same as utilities. However, strictly speaking, only values elicited from questions presenting a choice with an uncertain outcome are truly considered utilities. The measurement methods of utilities center on the concept of risk. The individual is given a choice of a certain outcome against a risk or sacrifice to obtain a benefit. For example, the individual may be asked to choose between a period of life in a guaranteed health state that is less than full health and an intervention that may provide a superior health state but carries some risk of death. The resulting value, or utility, is on a scale in which 1 = full health and zero = death2,20.
Cardinal or Ordinal Techniques for Preference Assessment
Cardinal measurement of utility assumes that health-related quality of life (HRQOL) can be measured in meaningful and absolute numbers. This allows health states to be presented on an interval scale where the distance between 2 points is meaningful, for example, a health state of 1.0 is twice as good as a health state of 0.5. This contrasts with ordinal techniques, which order or “rank” preferences but do not allow provision of the distance between ranks. It is clear that for use in a QALY, cardinal techniques are preferred. The 3 main cardinal techniques to quantify preferences for health states are the time tradeoff (TTO)21,22,23,24, which asks individuals to sacrifice life-years for better health; the standard gamble (SG)21,25, which asks individuals to risk death for better health; and the rating scale, which asks individuals to simply rate health. SG is the only approach that includes a risk consideration when eliciting preferences and is strictly the only true utility measure. The rating scale, as will be clear, does not fulfil the theoretical paradigm that a true preference elicitation has to include a choice.
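The two choice-based techniques can be sketched in code using their usual textbook scoring rules for chronic health states considered better than death. These rules are an assumption added here for illustration and are not spelled out in this article; the function names and the example responses are hypothetical.

```python
def tto_utility(years_full_health, years_in_state):
    """Time tradeoff: the respondent is indifferent between living
    `years_in_state` years in the health state and a shorter
    `years_full_health` years in full health. The value is their ratio."""
    if years_in_state <= 0 or not 0 <= years_full_health <= years_in_state:
        raise ValueError("need 0 <= years_full_health <= years_in_state, "
                         "with years_in_state > 0")
    return years_full_health / years_in_state


def sg_utility(indifference_probability):
    """Standard gamble: the respondent is indifferent between the health state
    for certain and a gamble giving full health with probability p and death
    with probability 1 - p. With full health = 1 and death = 0, the expected
    utility of the gamble is p * 1 + (1 - p) * 0 = p."""
    if not 0 <= indifference_probability <= 1:
        raise ValueError("probability must lie between 0 and 1")
    return indifference_probability


# Hypothetical respondent: would give up 3 of 10 years to be in full health,
# and would accept at most a 15% risk of death for a cure.
print(tto_utility(years_full_health=7, years_in_state=10))
print(sg_utility(indifference_probability=0.85))
```

Both functions return a value on the scale where 1 is full health and zero is death, but only the standard gamble involves uncertainty, which is why it alone is regarded as a utility in the strict sense.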
OMERACT 9 Issues Towards a QALY Approach
As mentioned, in preparation of OMERACT 9, rheumatologists identified 3 issues where research in rheumatology could contribute to a consensus on approaches for QALY. For each of these issues, the OMERACT Economics QALY subgroup summarized the state of the art and identified strengths and weaknesses that should be taken into consideration when performing future research. These summaries are based on the literature, including research by members of the group.
Role of the rating scale to value health
TTO and SG valuation methods can be criticized for being cognitively difficult for many respondents. The rating scale or VAS has been proposed as an alternative that avoids these problems. A rating scale asks the subject to rate their health by drawing a line on a VAS, where, for example, 100 is the best state of health they can imagine, and 0 is the worst. The value is simply the numerical point on the scale divided by 100. The method can be used to compare outcomes, and the intervals between the placement of the outcomes on the interval scale represent the relative difference in preference for the outcome26. However, the role of VAS measurements for assessing global health as a measure for use in QALY calculations in economic evaluations has been subject to debate27. OMERACT could engage in an effort to study direct comparison (qualitative and quantitative) between methods to elicit QALY that also include a VAS elicitation.
The strengths of the rating scale are clear:
- The rating scale is the most convenient and is readily understood (by subjects asked to complete the task) compared to both other classic techniques of measuring preference21;
- The results using rating scales have also been found to be reproducible25.
On the other hand, limitations are recognized:
- The rating scale does not involve any choice between alternative states or incorporate any effect of lifetime sacrificed or risk into the rating22, and thus does not represent a true value or preference;
- Some suggest that, from the decision-making perspective, the VAS does not add any information beyond the ability to rank health states28,29;
- Having a global approach, it is unclear which domains of health contribute to global rating;
- The scores may be sensitive to the presentation of the question; it was found that even when asking the same question, scores on 2 alternative VAS scales were very different and had only a moderate correlation. The noise surrounding the scores on the rating scale does not follow a particular direction30;
- A rating scale suffers from anchor bias. A study investigating anchor-point bias found that absolute values tend to differ even though relative values are comparable31. VAS scales are also susceptible to measurement biases such as end-aversion bias. End-aversion bias, similar to problems of central tendency on Likert scales, occurs when a respondent is reluctant to value their health states toward the extreme ends of a continuous scale25,32. End-aversion bias has been found predominantly toward the healthy end of scales, where scores may be about 2 times too far away from the end, whereas minimal bias was found at the unhealthy end28,33.
The question remains whether the limitations are unique to the rating scale/VAS or whether they are solved by the classic approaches. Some argue that VAS scales are one of a number of imperfect valuation methods, but the simplicity, feasibility, and reliability of VAS may result in considerable advantages over more complex measures and provide useful alternatives for valuing health with preference-based approaches for cost-effectiveness analysis27.
Whose preferences: patients or society?
As QALY were developed as a tool for societal resource allocation, the initial discussion on the perspective of the preferences was also driven by the perspective of the decision-maker who has to maximize society’s health with limited budgets. Health budgets must be allocated carefully to ensure that the greatest benefit for society is obtained. In the societal perspective all factors that are influenced by an intervention are considered. This will include everyone affected by an intervention, and all the benefits irrespective of the recipient. The societal perspective is felt to be appropriate on the assumption that the best decisions for the public interest are made by those who do not stand to gain or lose from the decision (avoiding self-interest)34. Further, the use of a scarce resource for one setting (e.g., healthcare) precludes its use in another (e.g., climate); therefore, a societal perspective is considered appropriate to value the opportunity cost/benefit of the resource in the setting of budget allocations.
Considerations when using patients’ preferences include:
- The preference of the patients would result in decisions at the expense of objectivity. Patients adapt positively or negatively to the disease and might have forgotten what perfect health means. On the other hand, they might also become selfish in their decision and forget the societal view of health. Although a disabled person may be able to achieve what they consider to be perfect health, in objective terms, a complete lack of disability would be considered by the societal perspective as the preferable state34.
- Expectations of health have been reported to influence QOL. Expectations of health may change, particularly through the course of a chronic disease, and therefore the disparity in a patient’s rating of their health compared to their expectation of “optimal health” may vary over time, resulting in under- or overestimations of the severity of health35,36,37. This phenomenon is also referred to as “reference shift” or “response shift” and may also affect the rating of change over time. Not only expectations but also adaptation is seen as a major driver of change in the absolute reference of health.
Notwithstanding the relevance of the societal perspective, the role of the patient-derived QALY is recognized not only in clinical decisions but increasingly also in resource allocation decisions2,38,39,40. Thus, increasingly, disease-specific preference instruments are developed41,42. It is notable that a review on rheumatologic conditions confirms findings in other diseases that patients’ preferences result in values higher than societal preferences7. OMERACT could discuss the particular settings in which patient preferences are more relevant than societal preferences, and could further study reasons for the discrepancy between patients’ and societal perspective.
Indirect utilities and mapping/cross-walking
For societal decisions, obtaining preferences directly from members of society using TTO, SG, or even rating scales is generally too complicated, time-consuming, and expensive to be performed routinely in long-term studies43. Therefore, indirect utility or preference instruments have been developed. The indirect instruments are based on multidomain health status questionnaires completed by patients. These ratings result in a large number of possible health states. The utility or preference of each health state is obtained through a scoring function derived from direct value assessment of (a selection of all possible) health states by members of the healthy population. Indirect utilities/preferences have the advantage that they can be assessed through self-report questionnaires in trials and can be converted to a societal preference or utility. A number of measures exist for this purpose and have been applied in rheumatology patients, such as the EuroQOL-5D (EQ-5D), Medical Outcome Study Short-Form 6D (SF-6D), and the Health Utility Index (HUI) versions 2 (HUI2) and 3 (HUI3)44,45. They were developed to elicit the societal perspective. These measures were reviewed using the OMERACT filter, and each was found to have some strengths, limitations, and issues to consider in their application to rheumatology patients9. There remains no consensus on the best utility measure for use in economic evaluation.
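The idea of a scoring function that converts questionnaire responses into a preference value can be illustrated with a deliberately simplified sketch. The three domains, the three levels per domain, the additive form, and all decrements below are invented for illustration; they are not the published scoring algorithm of the EQ-5D, SF-6D, or HUI.

```python
from itertools import product

# Hypothetical decrements per domain and level (level 1 = no problems).
# These numbers are invented for illustration; they are NOT a published tariff.
DECREMENTS = {
    "mobility": {1: 0.00, 2: 0.10, 3: 0.30},
    "pain": {1: 0.00, 2: 0.15, 3: 0.40},
    "usual_activities": {1: 0.00, 2: 0.05, 3: 0.20},
}


def score_health_state(responses):
    """Convert a patient's questionnaire responses (domain -> level) into a
    preference value by subtracting each domain's decrement from 1."""
    utility = 1.0
    for domain, levels in DECREMENTS.items():
        utility -= levels[responses[domain]]
    return utility


# Every combination of levels is a distinct health state: 3 domains x 3 levels.
all_states = [
    dict(zip(DECREMENTS, levels))
    for levels in product([1, 2, 3], repeat=len(DECREMENTS))
]
print(len(all_states), "possible health states")

# A patient reports some mobility problems, severe pain, no activity problems.
patient = {"mobility": 2, "pain": 3, "usual_activities": 1}
print(round(score_health_state(patient), 2))
```

In this sketch the patient only completes the questionnaire; the decrements stand in for values that, in a real instrument, would come from direct valuation of a selection of health states by members of the general population.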
Although the instruments above are easy to complete, they are not always included in clinical trials. Moreover, they might be less appropriate to capture the specific aspects of health relevant for the global health state. Therefore, attempts are made to link disease-specific outcome measures to (usually indirect) utility values. This technique is called mapping or cross-walking. Mapping is gaining popularity as it enables health state utility/preference values to be predicted for use in cost per QALY analysis when no preference-based measure has been included in the study. Such mapping will result in a function that helps to convert the health state measure into a preference/utility score45.
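A mapping function of this kind can be sketched as a regression of a preference-based score on a disease-specific score, fitted in a data set where both were collected. The example below assumes the simplest possible form, a straight line fitted by ordinary least squares, and uses invented paired observations of HAQ and EQ-5D scores; published mapping studies use a variety of model forms, and nothing here reproduces any of them.

```python
def fit_linear_mapping(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented paired observations from a hypothetical study in which patients
# completed both a disease-specific measure (HAQ, 0 to 3) and a
# preference-based measure (EQ-5D utility).
haq = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
eq5d = [0.90, 0.80, 0.72, 0.60, 0.49, 0.41]

intercept, slope = fit_linear_mapping(haq, eq5d)


def predict_utility(haq_score):
    """Apply the fitted mapping to a study that collected only the HAQ."""
    return intercept + slope * haq_score


print(f"utility = {intercept:.3f} + ({slope:.3f}) * HAQ")
print(f"predicted utility at HAQ 1.25: {predict_utility(1.25):.3f}")
```

A fitted line can only reflect the domains that the disease-specific measure captures, and it smooths over individual variation, so predicted values should be checked against directly measured preferences where possible.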
Specific issues that need to be understood when using the mapping approach:
- Mapping disease-specific health status measures (such as HAQ, BASFI, etc.) on utilities (such as EQ-5D or SF-6D, etc.) will likely place most weight on the domains of health considered in the health status instrument.
- Several approaches to map health state instruments on preference or utility measures have been proposed, but not directly compared, and it is not clear which method provides the highest internal and external validity46,47.
- Mapped utility scores from the HAQ have been found to underestimate change in patients with inflammatory arthritis (mapping from HAQ to EQ-5D and SF-6D)48 and also underestimate changes over time in patients with knee pain (mapping, for example, the Western Ontario and McMaster Universities Osteoarthritis Index onto EQ-5D)49.
For these reasons, it seems that use of mapping should be better explored by OMERACT to understand the quantitative but also qualitative difference between the mapped and directly measured preferences. At this time, it is recommended that at least one preference-based measure be included in future clinical studies48,50.
Future Directions
Ranking and discrete choice experiments
Preferences in the form of ordinal utilities are typically elicited using ranking and discrete choice experiments51. Rooted in random utility theory, cardinal values can be estimated from ordinal responses using analytical approaches such as a conditional logit model52. While advantageous in terms of cognitive ease and economic theory, ordinal techniques have a limitation: the cardinal preferences are obtained on a latent utility scale, which indicates the relative preference for one state over another but is not anchored on full health and death, as QALY calculations require17. Newer techniques are being developed to overcome this limitation53,54,55.
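The anchoring problem can be made concrete with a small sketch of the choice probabilities implied by a conditional logit model. The latent utilities below are hypothetical, and the example only evaluates the model's probability formula; it does not estimate anything from data.

```python
import math


def choice_probabilities(latent_utilities):
    """Conditional logit: P(choose i) = exp(v_i) / sum_j exp(v_j).

    The maximum is subtracted before exponentiating for numerical stability;
    this does not change the probabilities."""
    largest = max(latent_utilities)
    weights = [math.exp(v - largest) for v in latent_utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical latent utilities for three health states in one choice set.
latent = [1.2, 0.4, -0.5]

# The same utilities shifted by an arbitrary constant.
shifted = [v + 3.0 for v in latent]

p_original = choice_probabilities(latent)
p_shifted = choice_probabilities(shifted)

print([round(p, 3) for p in p_original])
print([round(p, 3) for p in p_shifted])
```

Both sets of latent utilities imply the same choice probabilities, so observed choices can identify only differences between states. Placing the states on a scale where death is zero and full health is 1 therefore requires additional information, which is the limitation the newer techniques aim to address.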
Including patient experiences in the valuation procedure
A novel approach to overcoming the limitation of using preferences of non-informed members of the general population is to describe the experiences of patients in these health states better, including the way they adapt their lives. Early results suggest this new approach yields values that fall in between conventionally obtained patient and societal values56.
Preferences beyond informing resource allocation decisions
As stated initially, this article refers only to decisions regarding resource allocation. However, QALY also have a role in combining benefits and harms from different interventions in a single score. An example of this at a population level is a study comparing rofecoxib to naproxen, where values for different toxicities were combined with benefits from reduced pain57. At an individual level, measuring preferences can also result in improved outcomes58. However, this area of research requires a different methodology and perspective on the issues described above, notably the method of preference elicitation and the source of preferences, all of which require further research.
Discussion Points at OMERACT 10
In the context of decisions, “preferences” for health provide better information than simply assessments of health state. However, there is still no agreed method on how preferences should be assessed, and currently used approaches give different results. Notwithstanding, preference-based instruments are increasingly used in healthcare decisions, since they are the basis for QALY. Along with OMERACT 9, outstanding issues were identified that could benefit from research within the OMERACT community, and would help advance the initiative toward a consensus on a QALY approach. However, there were also some doubts whether this was among OMERACT’s goals, interests, and expertise. During a plenary discussion at the OMERACT 10 Patient Perspective Virtual Campus1 and also later during discussion between OMERACT members, the possible contribution of rheumatology research to the QALY was discussed, with the specific question whether it is worthwhile to concentrate on the patient perspective and disease-specific preferences. Overall conclusions were:
- Research into the QALY is important in rheumatology.
- Issues identified at OMERACT 9 are supported as receiving priority in future OMERACT research.
- Preference and QALY research within OMERACT should test the approaches against the global OMERACT pathway: What to measure (concept, construct, domains) and How to measure (filter of validity). This could be especially helpful when evaluating the different theoretical frameworks that are deemed important in preference research and would help to understand the differences between the societal and patient perspectives. Close collaboration with experts in the field of health economics and decision-making sciences remains important.