Adaptive Enrichment in Biomarker-driven Oncology

A practical explainer of adaptive enrichment in biomarker-driven oncology, illustrated with a single-arm Bayesian phase 2 example.
Categories: Trial Design, Biomarkers, Bayesian

Published: March 13, 2026


Note: 19MAR26 Update

R Shiny app available to explore this method: Launch app | GitHub

In early-phase oncology, selecting the right patient population is key to demonstrating a therapy’s efficacy. This is especially true for drugs with a strong mechanism of action linked to a biomarker: if the target population is mis-specified, the efficacy signal can be diluted or missed altogether.

Adaptive enrichment allows you to change the biomarker threshold for enrolment during the study so that you can better target patients who are expected to respond. This means you do not have to commit to a biomarker cut-off at the start of the trial based on potentially limited Phase 1 data, and it can increase the proportion of responders in the final trial population.

This article walks through the operational and statistical mechanics of adaptive enrichment via a worked example of a single-arm Phase 2 trial.

Note: Perspective

Project Optimus has put a spotlight on trial design in early-phase oncology, and more recently the FDA has highlighted Bayesian methods by issuing a draft guidance. A natural next step in biomarker-driven oncology is broader use of advanced designs that improve patient selection.

Patient Selection — The “why”

The core idea is in the name: the biomarker cut-off used to enrol patients, and therefore the related inclusion and exclusion criteria, can be changed during the trial.

This lets you enrol a higher proportion of patients who are expected to respond, while avoiding the need to lock the biomarker cut-off at the start of the trial. That flexibility is particularly useful in a larger Phase 2 study, where you may have only limited Phase 1 or Phase 2a data on the biomarker-response relationship. By using adaptive enrichment, you can reassess that relationship and the corresponding cut-off part-way through the trial. The end result is that you enrol more patients in the “active region” and gain a clearer picture of the biomarker-response relationship and the drug’s efficacy.

Adaptive enrichment trial flow. At each interim the BLRM is fitted and the cutoff is updated if the posterior probability threshold is met; otherwise all-comers enrolment continues. The cutoff can move up or down at each interim.


Adaptive Enrichment — The “what”

Adaptive enrichment uses pre-specified statistical analyses to characterise the biomarker–response relationship within the patient population. Based on the outcome of pre-specified interim analyses, the biomarker threshold can be updated.

As in a cohort review during dose escalation, the statistical analysis aims to recommend whether the biomarker cut-off should increase, decrease, or remain unchanged.

The goal is to answer the following questions:

  • Is there a biomarker level that corresponds to an “active region” (where clinical response is greater)?
  • Does the current biomarker cut-off correspond to this “active region”?
  • If not, what is an updated biomarker cut-off that would better select the active region?

The worked example below is adapted primarily from the continuous-biomarker enrichment framework of Liu, Kairalla and Renfro [1], with later methodological developments described by Tu et al. [2], simplified here to a single-arm Phase 2 setting with a monotone BLRM and fixed interim decision thresholds.

Worked Example — Trial Design

For this worked example, I use a single-arm Phase 2 oncology trial with Objective Response Rate (ORR) [3] as the primary efficacy endpoint.

The trial starts as an all-comers design: all patients are eligible regardless of biomarker expression, with the only requirement being that biomarker expression is collected at baseline for everyone. The total planned enrolment is n = 300, with interim analyses at n = 100 and n = 200. At each interim, the biomarker-response relationship is modelled and the design may adapt by adding a biomarker cut-off to the eligibility criteria. This sample size provides ≥ 85% power under the moderate and strong biomarker (B1) effect scenarios, as shown in the operating characteristics below.

The goal of the interim analysis is to restrict future enrolment to patients with a greater chance of response. Success is defined as achieving an ORR of ≥ 25% in the evaluated population. The final analysis therefore consists of a Go/No-Go determination driven by that target ORR. If no candidate cut-off satisfies the enrichment threshold at either interim, enrolment remains all-comers through to the final analysis.

Note: Single-arm design caveat

As a single-arm study, this design may be particularly relevant in settings with limited treatment options. With no control arm, there is no assessment of treatment-by-biomarker interaction, so this design on its own cannot determine whether the biomarker is predictive or prognostic.

Statistical Methodology — The “how”

Here is how that worked example translates into an operational statistical framework.

First, decide on the endpoint. Are you:

  1. Comparing the difference in ORR between the enriched (“biomarker positive”) vs. non-enriched (“biomarker negative”) populations, or
  2. Assessing the ORR in the biomarker-positive population only?

Option (1) can support a stronger statistical statement at the end of the trial, whereas option (2) may be better suited when you expect to target mostly responders, for example based on biomarker distribution and prevalence.

For this example, I assume option (2), so the hypothesis is framed such that the enriched population must demonstrate an ORR above a prespecified level (“target ORR”) for the treatment to be considered efficacious.

Next, we specify the prior assumptions and the statistical mechanism used to decide whether the design should enrich to a specific region. Bayesian methods are a natural fit for this type of design and interim decision-making. At each interim, a Bayesian Logistic Regression Model (BLRM) is fitted and the posterior probability that the ORR at a candidate biomarker cut-off exceeds 0.25 is calculated. That posterior probability drives the enrichment decision. The BLRM assumes a monotonic relationship between biomarker and response, an important caveat discussed further in Limitations.

For the final Go/No-Go decision, the probability that the mean ORR in the selected region exceeds 0.25 is used.

Mathematically:

\[ \pi(c) = P(Y = 1 \mid B_1 \ge c, \theta) \]

Interim (cutoff selection)

\[ c_k^*=\min\left\{c\in\mathcal{C}:P\big(\pi(c)\ge0.25\mid D_k\big)>\tau_k\right\} \]

Final (selected-region go/no-go)

\[ P\big(\pi(c^*)\ge0.25\mid D_{\mathrm{final}}\big)>0.80 \]

where \(c^*\) is the cutoff carried forward from the last interim.
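To make the interim rule concrete, here is a minimal Python sketch of the cutoff search. It assumes posterior draws of the BLRM parameters (alpha, beta) are already available, and that the biomarker is uniform on [c, 100] above any cutoff, as in the simulation set-up described later in this article. The function names (`mean_orr_above`, `select_cutoff`), the midpoint-rule averaging, and the example draws are illustrative assumptions, not the code behind the results reported here.

```python
import math
import random

def expit(z):
    """Numerically safe inverse-logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def mean_orr_above(alpha, beta, c, n_grid=20):
    """pi(c): mean response probability over B1 in [c, 100].

    Assumes B1 is uniform on [c, 100] and logit(p) = alpha + beta * (B1 - 50) / 50.
    Uses a midpoint rule; at c = 100 it reduces to the probability at B1 = 100.
    """
    width = (100.0 - c) / n_grid
    total = 0.0
    for j in range(n_grid):
        b1 = c + (j + 0.5) * width
        total += expit(alpha + beta * (b1 - 50.0) / 50.0)
    return total / n_grid

def select_cutoff(draws, tau, target=0.25, grid=range(0, 101, 5)):
    """Lowest cutoff c with P(pi(c) >= target | data) > tau, or None if none qualifies."""
    for c in grid:
        hits = sum(mean_orr_above(a, b, c) >= target for a, b in draws)
        if hits / len(draws) > tau:
            return c
    return None

# Hypothetical posterior draws of (alpha, beta), for illustration only.
rng = random.Random(1)
draws = [(rng.gauss(-1.7, 0.25), rng.gauss(1.0, 0.35)) for _ in range(500)]
for stage, tau in [(1, 0.30), (2, 0.50)]:
    print(f"Interim {stage}: tau = {tau:.2f}, selected cutoff = {select_cutoff(draws, tau)}")
```

A returned cutoff of 0 corresponds to all-comers, and `None` means no candidate met the threshold, in which case enrolment stays all-comers. Because raising \(\tau\) can only shrink the set of qualifying cutoffs, the selected cutoff cannot decrease when \(\tau\) is raised on the same set of draws.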

Note

The posterior probability is the output of the Bayesian model, driven by the data observed so far. Starting with a lower threshold \(\tau\) and raising it at later interims reduces the risk of restricting enrolment prematurely while data are sparse.

Bayesian methods require prior assumptions. For this example, we use weakly informative priors so that the observed data largely drive the biomarker-response relationship. Another option is to use data gathered from Phase 1 and Phase 2a and incorporate those data into the prior to share information across trials.
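As a rough check on what "weakly informative" means here, the sketch below translates a normal prior on the logit scale into the response-rate and odds-ratio scales. It assumes the two-parameter logistic model and the Normal(0, 2^2) priors listed in the simulation notes later in this article; the helper names are hypothetical. Because the inverse-logit and exponential transforms are monotone, prior quantiles map across directly.

```python
import math

def expit(z):
    """Inverse-logit."""
    return 1.0 / (1.0 + math.exp(-z))

def prior_interval_orr(prior_sd, z=1.96):
    """Central ~95% prior interval for the response rate at B1 = 50.

    With B1 scaled as (B1 - 50) / 50, alpha is the log-odds of response at B1 = 50,
    so the interval follows from alpha ~ Normal(0, prior_sd^2).
    """
    return expit(-z * prior_sd), expit(z * prior_sd)

def prior_interval_odds_ratio(prior_sd, z=1.96):
    """Central ~95% prior interval for the odds ratio between B1 = 100 and B1 = 0.

    Scaled B1 spans 2 units across the full range, so the odds ratio is exp(2 * beta)
    with beta ~ Normal(0, prior_sd^2).
    """
    return math.exp(-2.0 * z * prior_sd), math.exp(2.0 * z * prior_sd)

for sd in (1.0, 2.0, 5.0):
    lo, hi = prior_interval_orr(sd)
    or_lo, or_hi = prior_interval_odds_ratio(sd)
    print(f"prior sd = {sd}: ORR at B1 = 50 in ({lo:.3f}, {hi:.3f}); "
          f"odds ratio in ({or_lo:.2e}, {or_hi:.2e})")
```

With a prior standard deviation of 2, the interval for the response rate at B1 = 50 runs from roughly 2% to 98%, which is one way to see that the prior does little to constrain the fitted relationship.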

Putting this all together, we end up with:

Target ORR: In this example, we are interested in a biomarker cut-off that selects a patient population with an ORR of 0.25 or greater.

Enrichment threshold: At the first interim, cast a wide net and update the cut-off if the posterior probability that the ORR is 0.25 or greater exceeds 30%. At the second interim, tighten this to a 50% probability. For the final analysis (Go/No-Go), require an 80% probability that the ORR in the selected region is 0.25 or greater. Starting low avoids over-tightening at earlier interims, where data are sparse.

Prior assumptions: Weakly informative priors, so that the observed data largely drive the relationship.

Final Go/No-Go: Determine if any region (all-comers or subgroup) satisfies the target ORR threshold of 0.25 or greater.

BLRM fit (median + 95% credible interval) on an illustrative simulated interim dataset (n = 100). Points are jittered binary outcomes. Shaded region = active region where posterior median ORR ≥ 0.25.


Interim Decision Making

At each interim, the BLRM characterises the biomarker-response relationship and the enrichment threshold \(\tau\) determines whether the biomarker cut-off should be updated. The question is simple: is there enough evidence that a different cut-off would select a patient population with a higher response rate?

Protocol Requirements

To operationalise this design, the analysis methods, pre-specified interims, and decision rules must be included in the study protocol. In addition, simulation work is required to understand the operating characteristics of the design.

In practice, this means simulating the full adaptive process to completion, including the interim analyses and any enrichment decisions, and repeating it many times (for example, 1,000 simulated trials) under each of a range of plausible scenarios. From these replicates, the power, Type I error, and expected biomarker cut-off can be estimated.
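The bookkeeping for such a simulation study can be sketched as follows. This is a skeleton under stated assumptions: `TrialResult`, `summarise`, `run_scenario` and `toy_trial` are hypothetical names, and `toy_trial` is a deliberately simple placeholder (no interims, no BLRM, a crude observed-ORR rule) standing in for a full simulation of the adaptive design. It is not the design evaluated in this article. Each trial is also reduced to a single cutoff, which glosses over a cutoff that changes between interims.

```python
import random
import statistics
from typing import Callable, List, NamedTuple, Optional

class TrialResult(NamedTuple):
    go: bool                 # final Go/No-Go decision
    cutoff: Optional[float]  # selected enrichment cutoff, None if never enriched
    final_orr: float         # observed ORR in the final analysed population

def summarise(results: List[TrialResult]) -> dict:
    """Collapse simulated trials into per-scenario operating characteristics."""
    n = len(results)
    cutoffs = [r.cutoff for r in results if r.cutoff is not None]
    return {
        "go_rate": sum(r.go for r in results) / n,
        "p_enrich": len(cutoffs) / n,
        # Median cutoff is conditional on enrichment having occurred.
        "median_cutoff": statistics.median(cutoffs) if cutoffs else None,
        "mean_final_orr": statistics.fmean([r.final_orr for r in results]),
    }

def run_scenario(simulate_trial: Callable[[random.Random], TrialResult],
                 n_sims: int = 1000, seed: int = 2026) -> dict:
    """Simulate n_sims complete trials under one scenario and summarise them."""
    rng = random.Random(seed)
    return summarise([simulate_trial(rng) for _ in range(n_sims)])

def toy_trial(rng: random.Random) -> TrialResult:
    """Placeholder only: 300 all-comers, flat true ORR of 0.15, no interims,
    Go if the observed ORR is at least 0.30."""
    responders = sum(rng.random() < 0.15 for _ in range(300))
    orr = responders / 300
    return TrialResult(go=orr >= 0.30, cutoff=None, final_orr=orr)

print(run_scenario(toy_trial, n_sims=200))
```

In a real evaluation, `simulate_trial` would generate patients, fit the model at each interim, apply the enrichment rule, and return the final decision. Running it once per scenario with a fixed seed keeps the resulting table reproducible.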

Typical scenarios include null cases used to measure Type I error, along with weak-to-strong effect scenarios. Below are simulated operating characteristics across scenarios:

| Scenario | True ORR, B1 < 50 | True ORR, B1 ≥ 50 | P(Enrich) | Median B1 Cutoff | Go Rate | E[ORR] Final Pop |
|---|---|---|---|---|---|---|
| Null - Inactive (flat ORR = 0.15) | 15.0% | 15.0% | 20.3% | 85.0 | 0.0% | 15.1% |
| Null - Boundary (flat ORR = 0.25) | 25.0% | 25.0% | 50.6% | 65.0 | 34.7% | 26.0% |
| Null - Inverted B1 effect (high ORR in low B1) | 25.0% | 10.0% | 1.4% | 80.0 | 0.0% | 17.4% |
| Active - Global effect (flat ORR = 0.35) | 35.0% | 35.0% | 18.8% | 15.0 | 100.0% | 35.3% |
| Active - Strong B1 effect (step at 50) | 10.0% | 35.0% | 99.9% | 65.0 | 99.6% | 35.4% |
| Active - Moderate B1 effect (step at 50) | 15.0% | 30.0% | 95.9% | 75.0 | 86.5% | 30.3% |
| Active - Weak B1 effect (shallow gradient) | 15.8% | 25.4% | 90.9% | 85.0 | 71.7% | 28.9% |

Warning: Type I error note

Under the boundary null scenario (flat ORR = 0.25 across all biomarker levels, exactly equal to the target), the design has a Go rate of 34.7%. This is the simulated Type I error at the boundary of the null. It is elevated, as is typical of Bayesian Go/No-Go Phase 2 designs: when the true ORR exactly meets the target, a substantial share of Go decisions is expected by design. Under the strict null (flat ORR = 0.15), the Go rate is 0.0%. Confirmatory Phase 3 trials provide the definitive test.

  • Model: Bayesian logistic regression (BLRM) with two parameters: \(\text{logit}(p_i) = \alpha + \beta \cdot B1_{i,\text{scaled}}\), where \(B1_{i,\text{scaled}} = (B1_i - 50)/50\) maps the raw biomarker to \([-1, 1]\). This assumes a monotonic biomarker–response relationship that is linear on the logit scale.

  • Priors: Weakly informative: \(\alpha \sim \text{Normal}(0, 2^2)\), \(\beta \sim \text{Normal}(0, 2^2)\) on the logit scale. This places most prior weight across the full range of plausible response rates and effect directions, while remaining weakly regularising to aid convergence.

  • Posterior approximation: Laplace (normal) approximation to the posterior, combining the MLE Fisher information with the prior precision. Posterior draws (\(K = 4{,}000\)) are sampled from the resulting multivariate normal.

  • Go Rate = probability of a Go decision at Stage 3 across all simulations (power for effect scenarios; Type I error for null scenarios).

  • P(Enrich) = probability that enrolment criteria are restricted to \(B1 \geq \text{cutoff}\) at any interim (Stage 1 or Stage 2).

  • Median B1 cutoff = median selected enrichment threshold, conditional on enrichment occurring. “—” indicates enrichment was rarely or never triggered.

  • E[ORR] Final Pop = expected observed ORR in the final analysed population (all-comers if no enrichment, or \(B1 \geq \text{cutoff}\) if enriched), averaged across simulations.

  • Enrichment thresholds: Stage 1 = 0.30, Stage 2 = 0.50, Stage 3 (final Go/No-Go) = 0.80. The cutoff may increase or decrease at each interim.

  • Cutoff selection strategy: At each interim, candidate cutoffs are evaluated over a grid (\(B1 \in \{0, 5, 10, \ldots, 100\}\)). The lowest cutoff \(c\) satisfying \(P(\pi(c) \geq 0.25 \mid D_k) > \tau_k\), where \(\pi(c)\) is the mean ORR among patients with \(B1 \geq c\) and \(\tau_k\) is the stage threshold, is selected (broadest viable population).

  • Biomarker distribution: \(B1 \sim \text{Uniform}(0, 100)\) in the unselected population. After enrichment at cutoff \(c\), new patients are drawn from \(B1 \sim \text{Uniform}(c, 100)\).
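The model, priors, and posterior approximation listed above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the code used to produce the table: it centres the normal approximation on the posterior mode found by Newton's method, which may differ in detail from an implementation built on the MLE, and the function names and simulated interim dataset are assumptions.

```python
import math
import random

PRIOR_SD = 2.0  # Normal(0, 2^2) priors on alpha and beta, as listed above

def expit(z):
    """Numerically safe inverse-logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_laplace(b1, y, max_iter=50, tol=1e-10):
    """Newton search for the posterior mode; returns (mode, covariance)."""
    x = [(b - 50.0) / 50.0 for b in b1]  # scale B1 from [0, 100] to [-1, 1]
    prec0 = 1.0 / PRIOR_SD ** 2
    alpha = beta = 0.0
    for _ in range(max_iter):
        # Gradient of the log posterior, and information matrix plus prior precision.
        g_a, g_b = -prec0 * alpha, -prec0 * beta
        h_aa, h_ab, h_bb = prec0, 0.0, prec0
        for xi, yi in zip(x, y):
            p = expit(alpha + beta * xi)
            w = p * (1.0 - p)
            g_a += yi - p
            g_b += (yi - p) * xi
            h_aa += w
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        alpha, beta = alpha + step_a, beta + step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    # Covariance = inverse of the matrix from the final Newton iteration
    # (evaluated just before the last, negligible step).
    cov = [[h_bb / det, -h_ab / det], [-h_ab / det, h_aa / det]]
    return (alpha, beta), cov

def sample_posterior(mode, cov, k=4000, seed=0):
    """Draw k samples from the bivariate normal via a 2x2 Cholesky factor."""
    rng = random.Random(seed)
    l11 = math.sqrt(cov[0][0])
    l21 = cov[0][1] / l11
    l22 = math.sqrt(cov[1][1] - l21 ** 2)
    draws = []
    for _ in range(k):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        draws.append((mode[0] + l11 * z1, mode[1] + l21 * z1 + l22 * z2))
    return draws

# Illustrative interim dataset: n = 100, B1 ~ Uniform(0, 100), step in true ORR at 50.
rng = random.Random(42)
b1_demo = [rng.uniform(0.0, 100.0) for _ in range(100)]
y_demo = [1 if rng.random() < (0.35 if b >= 50 else 0.10) else 0 for b in b1_demo]
mode_demo, cov_demo = fit_laplace(b1_demo, y_demo)
draws_demo = sample_posterior(mode_demo, cov_demo)
print("posterior mode (alpha, beta):", mode_demo)
```

With no data, the approximation returns the prior itself (mode at zero, variance 4 for each parameter), which is a convenient sanity check. For small interim samples or near-separated data, a full MCMC fit would be a more cautious choice than the normal approximation.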


Two plausible biomarker–response relationships. Left: threshold-type monotone curve — minimal response below B1 ≈ 65, rising sharply above it (e.g. target amplification). Right: optimal expression window — response peaks at intermediate biomarker levels and drops at high expression (e.g. receptor occupancy or enzymatic saturation). The BLRM handles the left well; the right requires a more flexible model.


Limitations and Extensions

Adaptive enrichment can be applied to many types of trial design and endpoints — randomised trials, trials with multiple biomarkers, time-to-event endpoints, and difference-in-response endpoints (comparative trials).

Statistically, the BLRM is a fairly rigid model that works well for simple biomarker–response relationships — for example, a “classic” monotonic relationship. Flexible models such as splines or Gaussian Processes (GP) may be more appropriate for complex biomarker–response relationships.

Summary

Adaptive enrichment with Bayesian modelling is a powerful tool for biomarker-driven trials where the optimal cut-off is not known upfront. By combining a BLRM with pre-specified posterior probability thresholds that rise from one analysis to the next, the design balances flexibility with pre-specification rigour. The decision criterion tightens as evidence accumulates, while the biomarker cut-off remains free to move in either direction as the biomarker-response relationship becomes clearer.

The single-arm variant presented here is most appropriate in settings where a control arm is impractical, with the understanding that biomarker predictiveness cannot be formally assessed without randomisation.

Thorough evaluation of operating characteristics must be performed across a range of plausible biomarker-response shapes and effect sizes. Those operating characteristics, along with a clear pre-specified plan for any adaptive changes to the trial, are required for regulatory submission. As the FDA’s evolving Bayesian guidance and Project Optimus both signal, designs like this are becoming more mainstream in modern early-phase oncology.


Footnotes

  1. Liu Y, Kairalla JA, Renfro LA. Bayesian adaptive trial design for a continuous biomarker with possibly nonlinear or nonmonotone prognostic or predictive effects. Biometrics 2022; 78(4): 1441-1453.

  2. Tu Y, Liu Y, Mack WJ, Renfro LA. Bayesian adaptive enrichment design for continuous biomarkers. Stat Med 2025; 44(20–22): e70262.

  3. ORR is any Complete Response (CR) or Partial Response (PR) as per RECIST 1.1.