TY  - JOUR
T1  - Simulation studies of surrogate endpoint validation using single trial and multitrial statistical approaches.
JF  - The Journal of Rheumatology
JO  - J Rheumatol
SP  - 616
LP  - 619
VL  - 34
IS  - 3
AU  - Marissa Lassere
AU  - Kent Johnson
AU  - Michael Hughes
AU  - Doug Altman
AU  - Marc Buyse
AU  - Sally Galbraith
AU  - George Wells
Y1  - 2007/03/01
UR  - http://www.jrheum.org/content/34/3/616.abstract
N2  - OBJECTIVE: A schema was recently proposed for assessing the levels of evidence for surrogate validity that included 4 domains: Target, Study Design, Statistical Strength, and Penalties. This report examines one component of the schema. It surveys the literature on methods of statistical validation of surrogate markers and compares these methods head-to-head using simulated datasets. METHODS: Simulated datasets (continuous, multivariate normal) were generated to capture 3 possible relationships of surrogate (S) and true (T) outcome (none, weakly positive, strongly positive) each applied to 4 treatment effects (effect on both surrogate and true outcome, effect on neither, effect on surrogate only, and effect on true outcome only). These datasets were analyzed using single and multitrial statistical approaches, and the results were provided to participants for discussion. RESULTS: The multitrial surrogate threshold effect seemed to capture best the requirement that surrogate validation is demonstrated by a treatment-associated change in the surrogate predicting a treatment-associated change in the outcome. CONCLUSION: There was general agreement that neither a single trial nor any of the single trial statistical methods was adequate to establish surrogate validity. These exercises also showed that summary statistics developed specifically to establish surrogate validity, such as the proportion of the effect explained, were problematic. A sizable statistical research agenda remains, which includes investigating the additional advantage obtained with modeling subject-level data compared to modeling with only trial-level data; and developing and testing multitrial statistical approaches robust to settings with only a few trials.
ER  -