Sample size in Group Sequential Trials with survival outcomes in the presence of competing events
The following blog represents my personal perspective and experiences within this context.
I warmly welcome any different viewpoints and constructive feedback.
Please note that certain trial details have been adjusted due to the ongoing application process.
In my role as a senior statistician at the Imperial Clinical Trials Unit, I am currently responsible for designing trials in the critical care setting. This has been truly exciting since there are various methodological and logistical challenges to address within this field. Beyond dealing with rare populations that lead to recruitment challenges, selecting the appropriate primary outcome measure presents a significant challenge in this context. While mortality is the most commonly used primary outcome, it often requires large trials involving thousands of patients.
There are situations where alternative patient-centred outcomes, such as quality of life or functional and cognitive improvement, can be considered. However, mortality remains a significant competing event in such scenarios.
Competing events are mutually exclusive events such as death or adverse events whose occurrence prevents the event of interest from taking place.
Effectively addressing mortality in trials within critical care settings, when it is not the primary outcome, is crucial. This necessitates the use of innovative and efficient trial designs along with statistical analysis methods that can accommodate this consideration.
The challenging question:
I've recently been involved in designing a trial that aims to compare two types of respiratory support within a limited pool of critically ill patients. The primary endpoint of the trial is the "time to liberation from respiratory support", a choice made after extensive discussions with clinicians and input from the public and patient representatives. It is essential to consider death as a significant competing event because if it occurs before liberation, it precludes us from observing and measuring the primary outcome of the study.
Initially, the trial was designed as a two-parallel-group fixed design, incorporating death as a competing risk. Calculating the sample size for time-to-event data in the presence of competing events was relatively straightforward, thanks to various available software tools. I used the "Logrank Tests Accounting for Competing Risks procedure" in PASS 2022 software for this purpose.
However, following the initial design and feedback from the funding committee (NIHR), we were prompted to consider formal stopping rules for futility. Consequently, the study's design was modified to a group sequential design—a well-established adaptive approach in clinical research. This design involves planned interim analyses with predefined criteria for potential trial termination based on evidence of efficacy or futility. Such adjustments, along with decision rules and timing, must be pre-specified. This change in the design necessitated recalculating the sample size for a group-sequential design with a time-to-event outcome in the presence of competing risks.
I found that conventional software, including PASS 2022, lacked a method for calculating the sample size for a group sequential design when competing risks are involved. This prompted me to explore the literature in this area, where I found several valuable resources. For this blog, my primary references were the papers by Baayen et al. (2019) and Genet et al. (2023).
Understanding common competing risk analysis methods:
At the core of sample size calculation lies the research question you are aiming to answer, the choice of the primary estimand and the statistical analysis approach for the trial.
Two main statistical frameworks exist for survival analysis in the presence of competing risks: the cause-specific hazard (CSH) approach and the cumulative incidence (CI), or sub-distribution, approach.
The CSH approach focuses on the cause-specific hazard function and estimating rates. This approach models the rate (or hazard) of a specific event by right-censoring individuals at the time of the competing event. The most popular model for the CSH is the Cox proportional hazard (PH) model, which estimates the treatment effect as the hazard ratio.
In contrast, the CI approach estimates the risk of experiencing a specific event before a certain time point. A popular method for modelling the cumulative incidence function (CIF) is employing the Fine and Gray sub-distribution hazard model. The sub-distribution hazard function maintains a one-to-one relationship with its corresponding CIF, representing the instantaneous occurrence rate of the specific type of event in individuals who have not yet experienced that event. This means we're looking at the event rate in individuals who are currently event-free or have previously encountered a competing event. Consequently, the treatment effect is usually quantified as a sub-distribution hazard ratio.
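For readers who prefer formulas, the two frameworks can be written down as follows. This is a sketch in standard textbook notation that I am introducing here for illustration: T is the time to the first event, D is the type of that event, and k indexes the event of interest (liberation).

```latex
% Cause-specific hazard: event rate among those still event-free at time t
\lambda_k(t) = \lim_{\Delta \to 0}
  \frac{P(t \le T < t + \Delta,\; D = k \mid T \ge t)}{\Delta}

% Cumulative incidence function: depends on ALL cause-specific hazards
% through the overall event-free survival S(u)
F_k(t) = P(T \le t,\; D = k) = \int_0^t S(u)\,\lambda_k(u)\,du,
\qquad
S(u) = \exp\!\Big(-\int_0^u \sum_j \lambda_j(s)\,ds\Big)

% Sub-distribution hazard: the risk set also keeps those who already
% experienced a competing event
\tilde{\lambda}_k(t) = \lim_{\Delta \to 0}
  \frac{P\big(t \le T < t + \Delta,\; D = k \mid T \ge t \ \text{or}\ (T < t,\; D \ne k)\big)}{\Delta}
  = -\frac{d}{dt}\log\big(1 - F_k(t)\big)
```

The last equality is the one-to-one relationship between the sub-distribution hazard and the CIF mentioned above.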
Choosing between these approaches depends on the research question and the clinical context. Each approach has its strengths and limitations, often complementing each other.
In our context, these two methods address distinct clinical questions. If we use the CSH approach, we aim to assess how the treatment affects the chance of a patient being freed from respiratory support immediately following a certain time point, provided they are alive and on respiratory support at that time. On the other hand, the CI approach, facilitated through the Fine and Gray model, investigates how the treatment influences the likelihood of a patient being liberated from respiratory support within a specific number of days while still considering patients who died as potentially at risk for liberation.
Sample size calculation:
For our study, where liberation from respiratory support is considered a positive outcome and death a negative outcome, we chose the CSH model as our primary analysis approach. This choice enables us to describe the treatment mechanism clearly for both the event of interest and the competing event. The CI approach, on the other hand, provides a form of benefit-risk assessment: the cumulative incidence of liberation increases if the liberation rate rises but decreases if the rate of death before liberation rises. Relying solely on the CI in this setting therefore makes it challenging to understand the treatment mechanism behind a difference in cumulative incidence functions between the treatment and control groups, as various treatment mechanisms can yield the same difference. To gain further insights, we also planned a supplementary analysis based on the CI approach, using the Fine and Gray sub-distribution hazard model.
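To make the point about treatment mechanisms concrete, here is a minimal Python sketch. It assumes constant (exponential) cause-specific hazards, and all numerical values (daily hazards of 0.10, 0.13 and 0.03, and a 28-day horizon) are hypothetical and not taken from our trial. It shows that a treatment that speeds up liberation and a treatment that only reduces deaths can produce exactly the same cumulative incidence of liberation.

```python
import math


def cif_liberation(t, haz_lib, haz_death):
    """Cumulative incidence of liberation by time t, assuming constant
    cause-specific hazards for liberation and for death before liberation."""
    total = haz_lib + haz_death
    return haz_lib / total * (1.0 - math.exp(-total * t))


def death_hazard_matching(target, t, haz_lib, lo, hi, iters=100):
    """Bisection for the death hazard in [lo, hi] at which the CIF of
    liberation equals `target`. Relies on the CIF decreasing in haz_death."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if cif_liberation(t, haz_lib, mid) > target:
            lo = mid  # CIF still too high, so the death hazard must be larger
        else:
            hi = mid
    return (lo + hi) / 2.0


DAY = 28  # hypothetical time horizon in days

# Hypothetical control arm: liberation hazard 0.10/day, death hazard 0.03/day
control = cif_liberation(DAY, 0.10, 0.03)

# Mechanism A: treatment speeds up liberation, death hazard unchanged
faster_liberation = cif_liberation(DAY, 0.13, 0.03)

# Mechanism B: liberation hazard unchanged, treatment only lowers the death hazard
matched_death = death_hazard_matching(faster_liberation, DAY, 0.10, 0.0, 0.03)
fewer_deaths = cif_liberation(DAY, 0.10, matched_death)

print(f"control CIF at day {DAY}:            {control:.4f}")
print(f"mechanism A (faster liberation):  {faster_liberation:.4f}")
print(f"mechanism B (death hazard {matched_death:.4f}): {fewer_deaths:.4f}")
```

Under mechanism B the cause-specific hazard ratio for liberation is exactly 1, yet the CIF improves as much as under mechanism A. This is why we report the cause-specific analysis alongside the CI analysis.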
To address early stopping for futility, we opted for non-binding futility boundaries, aiming to present results from both CSH and CI analyses to the IDMC (Independent Data Monitoring Committee) for their consideration.
To calculate the sample size for a CSH model with interim analyses for futility, we followed the guidance of Baayen et al. (2019) in three steps:
1. We calculated the sample size for a fixed design in the presence of competing risks using the "Logrank Tests Accounting for Competing Risks" procedure in PASS 2022. This procedure uses cause-specific model formulas to calculate the sample size for time-to-event outcomes with competing events, assuming that the time to the event of interest and the time to the competing event are independent and exponentially distributed. Suppose we require a total of 'e' events to achieve 90% power at a one-sided type I error rate of 2.5% to detect a hazard ratio of 'θ'; given the recruitment and follow-up times, the study then requires 'n' subjects.
2. We used the gsDesign package in R to create a 3-stage group sequential design with planned interim analyses at 40% and 60% of observed events and non-binding futility boundaries. This step adjusts the sample size obtained in the first step through the 'n.fix' and 'nFixSurv' options of the gsDesign package. For time-to-event outcomes, 'n.fix' is the number of events required for a fixed design, whereas 'nFixSurv' is the corresponding fixed-design number of subjects. In our example, this means setting 'n.fix=e' and 'nFixSurv=n' to obtain the updated number of events and subjects for the 3-stage group sequential design.
3. Finally, we adjusted the sample size from step 2 (denoted as 'n*') to accommodate anticipated dropout rates.
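The arithmetic behind these three steps can be sketched in Python. This is an illustration under stated assumptions, not a reproduction of what PASS or gsDesign compute internally. The event count uses Schoenfeld's approximation for the logrank test, the probability of observing liberation assumes exponential cause-specific hazards with uniform accrual, and every number below is hypothetical. That includes the hazard ratio of 1.5, the hazards, the accrual and follow-up times, and the 5% dropout. In particular, INFLATION is a placeholder for the ratio of maximum group sequential events to fixed-design events, which has to come from group sequential software such as gsDesign.

```python
import math
from statistics import NormalDist


def schoenfeld_events(hazard_ratio, alpha=0.025, power=0.90, alloc=0.5):
    """Events of interest needed by a fixed-design logrank test
    (Schoenfeld's approximation, one-sided alpha, allocation fraction alloc)."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha) + z(power)) ** 2 / (
        alloc * (1 - alloc) * math.log(hazard_ratio) ** 2
    )


def prob_event_of_interest(haz_event, haz_competing, accrual, follow_up):
    """Probability that a subject is observed to have the event of interest,
    assuming exponential cause-specific hazards, uniform accrual over
    `accrual` time units and `follow_up` further units after accrual ends."""
    total = haz_event + haz_competing
    # Average probability of still being event-free at the end of follow-up
    still_event_free = (
        math.exp(-total * follow_up) - math.exp(-total * (accrual + follow_up))
    ) / (total * accrual)
    return haz_event / total * (1.0 - still_event_free)


# ---- Hypothetical design inputs (NOT the values used in our trial) ----
HR = 1.5                 # cause-specific hazard ratio for liberation (>1 is benefit)
HAZ_LIB_CONTROL = 0.10   # daily liberation hazard, control arm
HAZ_DEATH = 0.03         # daily hazard of death before liberation, both arms
ACCRUAL_DAYS = 730
FOLLOW_UP_DAYS = 28
INFLATION = 1.05         # PLACEHOLDER: obtain from gsDesign, not computed here
DROPOUT = 0.05

# Step 1: fixed design, events 'e' and subjects 'n'
e_fixed = schoenfeld_events(HR)
p_control = prob_event_of_interest(HAZ_LIB_CONTROL, HAZ_DEATH, ACCRUAL_DAYS, FOLLOW_UP_DAYS)
p_treated = prob_event_of_interest(HR * HAZ_LIB_CONTROL, HAZ_DEATH, ACCRUAL_DAYS, FOLLOW_UP_DAYS)
p_avg = (p_control + p_treated) / 2  # 1:1 allocation
n_fixed = e_fixed / p_avg

# Step 2: group sequential design, assuming events and subjects scale by the same ratio
e_max = math.ceil(e_fixed * INFLATION)
n_gs = math.ceil(n_fixed * INFLATION)
interim_events = [math.ceil(frac * e_max) for frac in (0.4, 0.6)]

# Step 3: inflate for anticipated dropout (a common convention: divide by 1 - dropout)
n_final = math.ceil(n_gs / (1 - DROPOUT))

print(f"fixed design: {math.ceil(e_fixed)} events, {math.ceil(n_fixed)} subjects")
print(f"group sequential: {e_max} events, interim looks at {interim_events} events")
print(f"subjects before / after dropout adjustment: {n_gs} / {n_final}")
```

A sketch like this is useful mainly as a sanity check on the order of magnitude of the software output. The boundaries and the actual inflation factor still need to be computed with gsDesign.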
In conclusion, my journey into handling competing risks in clinical trials shed light on how statistical methods and research goals work together. When dealing with these complex situations, the first step is to think about what you want to study, the main question you're trying to answer, and then how you're going to analyse the data.
In the context of planning a group sequential design with a time-to-event outcome and the presence of competing risks, it may be advantageous to consider either the CSH or CI approaches as the primary methodology while also integrating the other as a supplementary analysis. This dual-pronged approach, coupled with the incorporation of non-binding boundaries, equips the IDMC with a wealth of valuable information and enhances its flexibility in determining the potential early termination of the trial.
I have found valuable resources available that can help us make important choices, with a few of them highlighted in the references.
Dr Leila Janani
Imperial Clinical Trials Unit (ICTU)
School of Public Health, Imperial College London
References:
Baayen C, Volteau C, Flamant C, Blanche P. Sequential trials in the context of competing risks: Concepts and case study, with R and SAS code. Stat Med. 2019 Aug 30;38(19):3682-3702. doi: 10.1002/sim.8184. Epub 2019 May 17. PMID: 31099906
PASS 2022 Power Analysis and Sample Size Software (2022). NCSS, LLC. Kaysville, Utah, USA, ncss.com/software/pass.
Anderson K (2023). gsDesign: Group Sequential Design. R package version 3.5.0, https://CRAN.R-project.org/package=gsDesign
R Core Team (2023). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/
Latouche A, Allignol A, Beyersmann J, Labopin M, Fine JP. A competing risks analysis should report results on all cause-specific hazards and cumulative incidence functions. J Clin Epidemiol. 2013 Jun;66(6):648-53. doi: 10.1016/j.jclinepi.2012.09.017. Epub 2013 Feb 14. PMID: 23415868
Genet A, Bogner K, Goertz R, Böhme S, Leverkus F. Safety analysis of new medications in clinical trials: a simulation study to assess the differences between cause-specific and subdistribution frameworks in the presence of competing events. BMC Med Res Methodol. 2023 Jul 13;23(1):168. doi: 10.1186/s12874-023-01985-7. PMID: 37442979; PMCID: PMC10339642.
Martens MJ, Logan BR. A group sequential test for treatment effect based on the Fine-Gray model. Biometrics. 2018 Sep;74(3):1006-1013. doi: 10.1111/biom.12871. Epub 2018 Mar 13. PMID: 29534294; PMCID: PMC6146968.
Logan BR, Zhang MJ. The use of group sequential designs with common competing risks tests. Stat Med. 2013 Mar 15;32(6):899-913. doi: 10.1002/sim.5597. Epub 2012 Sep 4. PMID: 22945865; PMCID: PMC3574186.